Neil Gemmell is the leader of the tuatara genome project. Neil is a professor in the Department of Anatomy at Otago University, where he and his lab study the biology of reproduction from the level of genes all the way up to the consequences of reproductive biology on ecology, evolution, conservation and economics. I asked him a few questions about the project, and how it’s going:
How did you end up heading the tuatara genome project?
This happened through a rather fortuitous sequence of events. I was at an international meeting in Santa Cruz, California in 2011, with about 100 other scientists who were proposing to sequence the genomes of 10,000 vertebrates: the Genome10K consortium. An initial list of priority species had been produced, the top 100 if you like, and there at the top was tuatara. I innocently asked who was leading the project, whether they had samples or sequence already, and importantly whether they had had discussions with iwi. It turned out that although a team were willing to commit resource to getting the sequence, none of that crucial work had been undertaken. I offered to find out what would be needed, talked to a variety of people and slowly but surely became more involved in the project as I realised that it could, and indeed should, be undertaken in New Zealand. Via a collaboration with Ngatiwai iwi and the Allan Wilson Centre for Molecular Ecology and Evolution, together with initial support from New Zealand Genomics Limited, we are well on our way to producing a draft sequence of tuatara.
Is there any one thing you’d really like to learn from the genome sequence, or are you happy to be surprised?
I’m hoping to be surprised. There are a few things I’m particularly interested in though. These include why the genome is so big: roughly 5 Gbp, some 70% bigger than the human genome. Often such large genomes contain numerous repetitive elements, but the only work undertaken prior to the genome sequencing project suggested tuatara might have relatively few repeats. If the large genome size is not due to repeats this would be unusual. Another area of considerable interest to me is the genes involved in sex determination. In humans and almost all other mammals the presence of a Y chromosome containing a gene called Sry is the trigger for male sexual development. In tuatara sex is governed by the incubation temperature of the eggs; when the temperature is high embryos develop as males, while embryos develop as females when the temperature is lower. Lots of reptiles have this form of sex determination, but we remain rather ignorant of the mechanism. It may be that the tuatara genome, together with comparisons to those of other animals, will enable us to start to identify likely mechanisms of sex determination that we can subsequently test.
What stage is the project up to? What’s been done and what’s happening at the moment?
The project breaks into three main steps: sequencing (collecting the raw data), assembly (putting it together, ideally in the right order) and annotation (trying to determine the function of each component of the sequence). The first step is, as you’d imagine, obtaining a vast quantity of raw sequence, and we’ve done plenty of this over the past year. We now have roughly 70-fold coverage of the genome, which means that on average each base pair of the genome has been sequenced 70 times. What we are now trying to do is produce a genome assembly. This involves taking the equivalent of paper confetti, perhaps created from a great work like Darwin’s On the Origin of Species, and trying to piece together every word, sentence, paragraph, page and chapter in the right order to reproduce the book. Thus far we have managed to rebuild the equivalent of ‘pages’ of the tuatara genome, but we don’t yet fully know how these are ordered. We are now using some clever sequencing technology, called mate-pair jump libraries, to help us figure out how many letters and words lie between two points in our book, which in turn helps us better order, or scaffold, these sequences.
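For readers who like to see ideas in code, here is a toy sketch of the two concepts in that answer: coverage and assembly. It is not the project's pipeline. The function names are made up for illustration, the 350 Gbp total is simply 70 times the roughly 5 Gbp genome size quoted above, and the "reads" are scraps of a sentence from Darwin rather than DNA. Real assemblers deal with sequencing errors, repeats and billions of reads, all of which this greedy overlap merge ignores.

```python
def mean_coverage(total_sequenced_bases: int, genome_size: int) -> float:
    """Average number of times each base pair has been sequenced."""
    return total_sequenced_bases / genome_size


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Overlaps shorter than `min_len` are treated as no overlap (returns 0).
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap.

    Returns the remaining pieces once no pair overlaps by at least
    `min_len` characters. A toy stand-in for assembly, nothing more.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        # Join the pair, writing the shared stretch only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Illustrative numbers: 350 Gbp of raw sequence over a 5 Gbp genome.
print(mean_coverage(350_000_000_000, 5_000_000_000))  # 70.0

# Confetti from a sentence, deliberately out of order.
scraps = ["most beautiful", "endless forms", "forms most"]
print(greedy_assemble(scraps))  # ['endless forms most beautiful']
```

If two scraps share no overlap, the function returns them as separate pieces, which is the toy equivalent of having 'pages' whose order is not yet known. That is the gap the mate-pair libraries are meant to bridge.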