An update from the boss

By David Winter 01/07/2013


It’s about time this blog moved from the abstract to the concrete. Now that you know why we are sequencing the tuatara genome, and have an idea of how we’ll go about doing it, it’s time to meet the boss and hear how the project is going so far.

Neil Gemmell is the leader of the tuatara genome project. Neil is a professor in the Department of Anatomy at Otago University, where he and his lab study the biology of reproduction, from the level of genes all the way up to the consequences of reproductive biology for ecology, evolution, conservation and economics. I asked him a few questions about the project and how it’s going.

How did you end up heading the tuatara genome project?

This happened through a rather fortuitous sequence of events. I was at an international meeting in Santa Cruz, California in 2011, with about 100 other scientists who were proposing to sequence the genomes of 10,000 vertebrates: the Genome10K consortium. An initial list of priority species had been produced, the top 100 if you like, and there at the top was tuatara. I innocently asked who was leading the project, whether they had samples or sequence already, and importantly whether they had held discussions with iwi. It turned out that although a team was willing to commit resources to getting the sequence, none of that crucial work had been undertaken. I offered to find out what would be needed, talked to a variety of people and slowly but surely became more involved in the project as I realised that it could, and indeed should, be undertaken in New Zealand. Through a collaboration with Ngatiwai iwi and the Allan Wilson Centre for Molecular Ecology and Evolution, together with initial support from New Zealand Genomics Limited and Illumina, we are well on our way to producing a draft sequence of tuatara.

Is there any one thing you’d really like to learn from the genome sequence, or are you happy to be surprised?

I’m hoping to be surprised. There are a few things I’m particularly interested in, though. These include why the genome is so big: roughly 5 Gbp, which is about 70% bigger than the human genome. Often such large genomes contain numerous repetitive elements, but the only work undertaken prior to the genome sequencing project suggested tuatara might have relatively few repeats. If the large genome size is not due to repeats, this would be unusual. Another area of considerable interest to me is the genes involved in sex determination. In humans and almost all other mammals, the presence of a Y chromosome containing a gene called Sry is the trigger for male sexual development. In tuatara, sex is governed by the incubation temperature of the eggs; when the temperature is high embryos develop as males, while embryos develop as females when the temperature is lower. Lots of reptiles have this form of sex determination, but we remain rather ignorant of the mechanism. It may be that the tuatara genome, together with comparisons to those of other animals, will enable us to start identifying likely mechanisms of sex determination that we can subsequently test.

 

What stage is the project up to? What’s been done and what’s happening at the moment?

The project breaks into three main steps: sequencing (collecting the raw data), assembly (putting it together, ideally in the right order) and annotation (trying to determine the function of each component of the sequence). The first step is, as you’d imagine, obtaining a vast quantity of raw sequence, and we’ve done plenty of this over the past year. We now have roughly 70-fold coverage of the genome, which means that on average each base pair of the genome has been sequenced 70 times. What we are now trying to do is produce a genome assembly. This involves taking the equivalent of paper confetti, perhaps created from a great work like Darwin’s Origin of Species, and trying to piece together every word, sentence, paragraph, page and chapter in the right order to reproduce the book. Thus far we have managed to rebuild the equivalent of ‘pages’ of the tuatara genome, but we don’t yet fully know how these are ordered. We are now using some clever sequencing technology, called mate-pair jump libraries, to help us figure out how many letters and words lie between two points in our book, which helps us better order, or scaffold, these sequences.
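For readers who like to see the arithmetic, the coverage figure can be sketched in a few lines of Python. This is only an illustration: the 5 Gbp genome size and 70-fold coverage come from the interview, while the function names and the 100 bp read length are assumptions made up for the example, not details of the project's actual data.

```python
import math

# Figures quoted in the interview.
GENOME_SIZE_BP = 5_000_000_000   # roughly 5 Gbp
TARGET_COVERAGE = 70             # roughly 70-fold

# Assumption for illustration only: the interview does not state a read length.
ASSUMED_READ_LENGTH_BP = 100


def mean_coverage(total_sequenced_bases: int, genome_size_bp: int) -> float:
    """Average number of times each base pair has been sequenced."""
    return total_sequenced_bases / genome_size_bp


def bases_needed(coverage: int, genome_size_bp: int) -> int:
    """Total raw bases required to reach a given mean coverage."""
    return coverage * genome_size_bp


def reads_needed(total_bases: int, read_length_bp: int) -> int:
    """Number of fixed-length reads needed to supply that many bases."""
    return math.ceil(total_bases / read_length_bp)


total_bases = bases_needed(TARGET_COVERAGE, GENOME_SIZE_BP)
print("raw bases for 70-fold coverage:", total_bases)
print("reads at the assumed read length:",
      reads_needed(total_bases, ASSUMED_READ_LENGTH_BP))
print("coverage check:", mean_coverage(total_bases, GENOME_SIZE_BP))
```

Coverage is an average, so some stretches of the genome will have been read many more than 70 times and others far fewer, which is part of why assembly remains hard even with plenty of raw data.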

 

How many people are working on the project at the moment, and where do they come from?

We have had a team of four working on collecting the sequence data: Becky Laurie, Rob Day and Aaron Jeffs (NZGL Otago) and Lorraine Berry (NZGL Massey). On the assembly side we have local, national and international collaborators: Kim Rutherford (Otago), Ross Crowhurst (Plant and Food) and Steven Salzberg (Johns Hopkins), who have led the majority of this work together with members of Ross and Steven’s teams. We have also had support from Nicky Nelson (VUW), Scott Edwards (Harvard), Bob Macey (Berkeley) and Pieter de Jong (CHORI), who have contributed resources and know-how to the project thus far, plus numerous other offers of collaboration and assistance. Last, we have had excellent support for this endeavour from Ngatiwai, through the efforts of Clive Stone.

 

What will the next steps be?

The next step is to complete the genome assembly. This is a complex, highly iterative task requiring lots of computing resources and expertise. To speed up the process, we will be establishing a genome assembly challenge, with the view that we might be able to encourage some of the best assembly labs in the world, including Steven Salzberg’s, to have a go at producing the best assembly they can from our data with their favourite programs and algorithms. We’d hope to start that in the next month or two. In addition, we will be looking to obtain data on RNA expression patterns from a variety of tissues to help us with the final task of annotation, as well as starting to explore the genome sequence data in more detail.


One Response to “An update from the boss”

  • Concise and easy to understand. I’d wondered where Gemmell’s team was at, and this interview clarifies the landscape nicely.