Ladies and gentlemen, we have the gorilla genome. Your reaction to this news is probably determined by what you do for a living. If you write the headlines for major news services you will convince yourself that this result will, in some utterly undefined way, teach us about what it is to be human. Just about everyone else will develop a case of Yet-Another-Genome Syndrome. The gorilla is, by my count, the 51st animal to be added to the full-genome club and the last of the great apes (joining humans, chimps and orangutans). More to the point, the publication of a new genome sequence doesn’t, by itself, tell us all that much. The real achievement in a “genome of x” paper is the creation of a resource that scientists will continue to work from for decades. The analysis that comes with it is really just a first pass.
But there was one very cool result to come from the analysis of the gorilla genome. About 15% of our genes are more closely related to their counterparts in gorillas than they are to the same genes in chimps.
That sounds surprising. People are always going on about how humans and chimps are ninety-nine-point-some-magic-number percent identical, and there are exactly two scientists in the world who think chimps are not our closest relatives (Grehan and Schwartz, 2009 doi: 10.1111/j.1365-2699.2009.02141.x). Have we been wrong? And how can 15% of a genome show one pattern while the rest shows another?
To understand what’s going on, we need to remember where species come from. Species start forming when populations stop sharing genes with each other. When genetic changes in one population can’t filter through to another, those two populations are free to evolve apart from each other, and so can become distinct and take on the various characters that we use to tell species apart. So new species only become different as they evolve apart, but they start off with a more or less random sampling of the genes in the ancestral population from which they descend. If we want to understand what’s going on with the gorilla genome, we need to understand the history of those genes.
In most populations at least some genes come in distinct “flavors” (technically called alleles). So, for instance, we all have a gene called MC1R, but some of us have an MC1R allele that is associated with red hair, and others have alleles that usually lead to dark hair. We inherit our genes from our parents, so each allele has a history that stretches back through time. If we look at modern populations we can use genetic differences between alleles to reconstruct that evolutionary history. Here’s a simplified history of four alleles, in a very small population (if you re-trace the lineages you see they fit the tree to the right):
So, what happens when a population with different alleles starts to diverge into new species:
The genetic lineages will keep on evolving down through the new tree, but now lineages will never cross the barriers to gene flow that are driving speciation. Often, the genetic lineages in the ancestral population will “sort” in such a way that when you trace the genetic lineages within a species back you arrive at a member of that species (not an individual from the ancestral population). In that case, the genetic relationships (which we’ll call “gene trees”) will be the same as the relationships between species (“species trees”):
But population genetic theory tells us we won’t always get such a simple pattern. For recent or repeated and rapid speciation processes there might not be time for the genetic lineages to sort. The gene tree can be different from the species tree:
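How often that mismatch is expected can be sketched with a few lines of Python. This is only an illustration under textbook assumptions (three species, one gene copy sampled from each, constant population size, no gene flow after the split), not the analysis used in the gorilla paper. Under those assumptions the human and chimp lineages have the length of the internal branch, measured in units of 2N generations, to find their common ancestor. If they fail, all three lineages end up in the same ancestral population and any pair is equally likely to join first. The function names and parameter values below are made up for the example.

```python
import math
import random


def discordance_probability(internal_branch_generations, effective_population_size):
    """Chance that a gene tree disagrees with the species tree ((human, chimp), gorilla).

    Assumes a diploid population of constant size, so the branch length in
    coalescent units is generations / (2 * N).
    """
    t = internal_branch_generations / (2.0 * effective_population_size)
    # exp(-t): the human and chimp lineages fail to coalesce on the internal branch.
    # 2/3: two of the three equally likely pairings are then "wrong".
    return (2.0 / 3.0) * math.exp(-t)


def simulate_gene_tree(t_coalescent, rng):
    """Return the pair of species whose gene copies coalesce first at one locus."""
    # Waiting time for two lineages to coalesce is exponential with mean 1
    # when time is measured in coalescent units.
    if rng.expovariate(1.0) < t_coalescent:
        return "human-chimp"
    # Unsorted: all three lineages reach the older ancestral population,
    # where every pair has the same chance of coalescing first.
    return rng.choice(["human-chimp", "human-gorilla", "chimp-gorilla"])


def simulated_discordance(t_coalescent, n_loci=100_000, seed=1):
    """Fraction of simulated loci whose gene tree differs from the species tree."""
    rng = random.Random(seed)
    discordant = sum(
        simulate_gene_tree(t_coalescent, rng) != "human-chimp" for _ in range(n_loci)
    )
    return discordant / n_loci
```

With an internal branch of zero length the function returns 2/3, because the species tree then carries no information about which pair joins first. As the branch gets longer, or the population smaller, the expected share of mismatched genes shrinks towards zero.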
Exactly this process has happened with the gorilla genome. The genetic lineages hadn’t sorted before the human-chimp split, so some of our genes are more closely related to gorilla ones than chimp ones. This phenomenon might tell us something about the evolution of the great apes. The time that it takes for lineages to sort is proportional to the population size of the organisms through which the lineages are evolving. Processes that effectively limit the population size (like natural selection, under which relatively few individuals contribute to the next generation) might leave a pattern in the way lineages have sorted. The authors of the gorilla genome paper use this prediction to search for, and find, areas of the gorilla genome that may have been subject to strong selection after the population went its own way.
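The link between population size and sorting time is easy to check in a toy simulation. The sketch below assumes an idealised Wright-Fisher population (fixed size, random mating, no selection) and uses the number of generations it takes a handful of sampled gene copies to trace back to a single common ancestor as a stand-in for how long lineages take to sort. The function names, sample size and replicate counts are arbitrary choices for the example.

```python
import random


def generations_to_common_ancestor(n_gene_copies, n_sampled, rng):
    """Trace sampled gene copies back in time until they share one ancestor.

    Each generation, every remaining lineage picks its parent copy uniformly
    at random from the n_gene_copies present in the previous generation.
    Lineages that pick the same parent merge.
    """
    lineages = n_sampled
    generations = 0
    while lineages > 1:
        parents = {rng.randrange(n_gene_copies) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations


def mean_sorting_time(n_gene_copies, n_sampled=4, replicates=2000, seed=7):
    """Average generations back to a common ancestor over many replicates."""
    rng = random.Random(seed)
    total = sum(
        generations_to_common_ancestor(n_gene_copies, n_sampled, rng)
        for _ in range(replicates)
    )
    return total / replicates


if __name__ == "__main__":
    for size in (50, 100, 200):
        print(size, mean_sorting_time(size))
```

Doubling the number of gene copies in the population should roughly double the average time, which is the proportionality described above. Anything that shrinks the effective number of contributing copies, selection included, speeds sorting up.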
So-called “incomplete lineage sorting” is a problem for people like me who aim to reconstruct the evolutionary history of species using genetic data. Although we’ve always known this problem existed, we’ve only recently been able to extend population genetics theory to actually infer the history of species from gene trees, even when those gene trees are unsorted. It’s important we have these methods, because it’s actually predicted that most genetic lineages will be unsorted for about 1 million years after speciation starts. Often all we have are unsorted genes, and it’s nice to be able to extract some information from them.
The gorilla genome paper is:
Scally, A., Dutheil, J. Y., Hillier, L. W. et al. (2012) Insights into hominid evolution from the gorilla genome sequence. Nature, 483, 169-175. doi:10.1038/nature10842