What we already know – a tuatara transcriptome

By David Winter 14/08/2013


We are not starting from scratch in our mission to understand the genetics of tuatara. Scientists have been working on these creatures for more than a hundred years, and in that time plenty of researchers have used tuatara DNA to try to understand the world. For the most part, these studies have used DNA sequences as witnesses to evolutionary history, rather than data from which to understand the day-to-day biology of tuatara.

Hilary Miller is one researcher who has taken a genetic approach to understanding how tuatara work. In her postdoctoral research at Victoria University, Hilary sequenced and analysed tuatara MHC genes. These genes play an important role in the immune system of vertebrates, helping their carriers develop immunity to diseases they encounter during their lives. Populations that contain many different variants of each MHC gene are well placed to deal with new diseases, so maintaining MHC diversity is of great interest to conservation biologists and managers. Hilary and her colleagues showed that some tuatara populations have relatively low MHC diversity, and that what variation there is does not contribute much to mating behaviour in tuatara (as opposed to some mammals and birds, which actively seek out mates with different MHC genes).

Up until last year, there were a few hundred tuatara DNA sequences known to science. That included all the sequences from Hilary’s research, a few other studies that focused on the functions of particular genes, and all those studies that used tuatara DNA as a witness to the evolution of reptiles. Then Hilary, working with colleagues at Massey University, sequenced 33,000 more. Hilary’s study was the first to use modern sequencing technologies on tuatara genes. It gave us our first look at what the tuatara genome will be like, and it will be a great help in making sense of the sequences created in the tuatara genome project. So we asked Hilary a few questions about her work.

The sequences you published make up what’s called a “transcriptome” – what does that mean?

A transcriptome is a set of expressed genes in a given cell type.  Every cell contains a copy of the genome in its nucleus, but in any given cell only a subset of the genes in the genome will be active – transcribed into mRNA and then translated into protein.  A transcriptome is built from the mRNA, so it only contains sequences of genes that are expressed, not all the non-coding DNA that makes up a large part of the full genome.

 

[Image: transcription]

Is there a particular reason you used an embryo for the first transcriptome sequence?

We were aiming to find genes involved in sex determination (which in tuatara is regulated by temperature, not by sex chromosomes), so we used an embryo from the approximate stage when sex is determined.  It was also one of the only ways we could get tuatara tissue that was in good enough condition to extract mRNA without sacrificing an adult animal, which obviously we didn’t want to do.

Before your study, what did we know about tuatara genetics? How much more do we know now?

Tuatara had mostly been studied from a population genetics/phylogenetics perspective, so we really only had genetic markers like microsatellites and mitochondrial genes that were useful for those studies. Only a small number of functional genes had previously been isolated from tuatara – there were something like 60 gene sequences from tuatara in the GenBank database, including the immune genes I isolated in my earlier postdoc work. With the transcriptome, we now have about 33,000 gene sequences for tuatara, so we’ve increased the genomic information we have for tuatara 500-fold. We don’t know what all these sequences are, as about half of them don’t match any known genes from other species in GenBank, but for about 15,000 of the sequences we have a pretty good idea of what genes they are from. So this dataset has really improved our knowledge of what the tuatara genome looks like.

One of the things we’ve talked about on this blog is the challenge of building an unknown sequence from short fragments. You used really short reads in your study – did you have much trouble building up your large sequences?

The gene sequences that make up a transcriptome are usually much shorter and easier to assemble than genome sequences, so short reads aren’t so much of a problem, as long as you have enough of them and have read-pairing information.  A bigger problem was verifying that the sequences I’d assembled were correct.  Most transcriptome studies rely on a closely related genome to verify sequences are assembled correctly, and of course for tuatara we don’t have anything closely related enough for comparison.  So we had to do a lot of careful checking of the assemblies to make sure the reads had been put together properly, and compared them as best we could to more distantly related species for which there is a lot of genomic information.

What was the most exciting result to come from the transcriptome study?

Well, we didn’t find many of the sex-determining genes that we’d hoped for, but we did find some genes that could be involved in regulating other genes in response to temperature changes, like heat-shock proteins and a cold-inducible RNA-binding protein. Now that we have these gene sequences, we can begin to study their function, for example by looking at whether there is a difference in their expression when eggs are incubated at different temperatures. There’s a whole raft of studies just waiting to be done using the transcriptome data – from investigating the evolution of various gene families, to looking for genes that might be involved in local adaptation of different tuatara populations. And the transcriptome sequences will hopefully also help with assembling and annotating the full genome sequence.

 


Hilary wrote the Chicken or Egg blog at Sciblogs, in which she detailed her own work on tuatara and other topics in conservation genetics. She now works for Biomatters, a bioinformatics company in Auckland, which is most famous for its popular sequence-managing software, Geneious.