Epigenetics and 3-D gene structure

By Grant Jacobs 03/08/2010

DNA methylation controls the binding of proteins that control the 3-D structure of genes.

This is a lightly edited version of an article I wrote as a guest on Alison’s blog over a year ago, looking back a couple of years to show something of what epigenetics was bringing to genome biology. The science has advanced further again since, but I’m nicking it back onto my blog (with Alison’s permission!) as it sets up other articles I would like to write.

Human karyotype. (Source; Wikimedia Commons.)

My article followed one Alison wrote about epigenetics. I’d suggest you read that first, as it will help!

While I’ve simplified quite a bit of the science to make things clearer, it is still a lot to take in. Persevere, though, and you might get a glimpse of what this epigenetics fuss is really all about. (Feel free to ask questions in the comments section.)

I wanted to introduce an aspect of epigenetics that interests me: specifying the use of genes through forming different chromatin loops. In the case I’m going to look at, the structure of the gene depends on which parent the copy of the gene came from.

Humans are diploid: we have two copies of each chromosome, one from each parent, except in males there is usually only one X and one Y chromosome (but two of all the others). Ignoring the sex chromosomes in males, having two of each chromosome also means that we have two copies of each gene. Each of the two genes making up a pair of corresponding genes, one from each parent, is called an allele. The two alleles of a gene make up the genotype of that person for that gene.

For most genes, when the gene is needed, both alleles are expressed and roughly the same amount of the RNA each allele codes for is made. But in some cases, evolution has selected that one of the two alleles should be switched off.

Tortoise shell cats are an example of mosaic X-chromosome inactivation (Image source: Wikimedia Commons.)

Alison described one example of this in her article: dosage compensation in females ‘corrects’ for having twice the number of X chromosome genes needed by switching one copy off. To recap what she was saying: when the ‘extra’ copy of the genes on the ‘second’ X chromosome in females is switched off, the choice of whether the copy from the father (paternal allele) or from the mother (maternal allele) is inactivated is random. Once made, the choice is inherited within each cell line. Because there are many cells, each making a separate random choice of which allele to switch off, most female mammals are mosaics, with a mixture of cells with active paternal X chromosome genes and cells with active maternal X chromosome genes. (I believe rodents and marsupials are exceptions to this rule.)

Tortoise shell cats are one example of this. Whether the black or the orange allele for fur colour is expressed is chosen at random across the cat’s body.
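For readers who like to see an idea as code, here is a minimal Python sketch of mosaic X inactivation. It is a toy under stated assumptions, not a model of real cat development: the function names, the number of founder cells and the idea that the paternal X carries the orange allele are all made up for illustration. Each founder cell picks its active X at random, and every descendant in that patch inherits the choice.

```python
import random

# Hypothetical cat: assume the paternal X carries the orange allele
# and the maternal X carries the black allele.
COLOUR_BY_ACTIVE_X = {"paternal": "orange", "maternal": "black"}


def make_patch(rng, divisions):
    """One founder cell chooses its active X at random; all daughters inherit it."""
    active_x = rng.choice(["paternal", "maternal"])
    # After `divisions` rounds of cell division the patch holds 2**divisions cells,
    # all expressing the same fur colour allele.
    return [COLOUR_BY_ACTIVE_X[active_x]] * (2 ** divisions)


def simulate_coat(n_founders=12, divisions=3, seed=2010):
    """Return a list of patches; each patch is a list of per-cell colours."""
    rng = random.Random(seed)
    return [make_patch(rng, divisions) for _ in range(n_founders)]


if __name__ == "__main__":
    for patch in simulate_coat():
        print(patch[0], "patch of", len(patch), "cells")
```

Every patch is a single colour, but the coat as a whole is a mixture, which is the mosaic pattern described above.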

X chromosome inactivation doesn’t ’care’ which parent the inactivated allele came from.

In allele-specific gene expression, the alleles are ‘imprinted’ with an epigenetic parent-of-origin mark. This mark specifies which parent the allele came from, and that determines how each allele is used.

(Source: University New South Wales, Embryology.)

One region in our genomes that has been studied in detail is the region around the IGF2 (insulin-like growth factor 2) gene. Near the IGF2 gene is the H19 gene, which codes for a non-coding RNA, that is, an RNA that is not translated into a protein but functions as an RNA. For each of these two neighbouring genes only one allele is used, and the active allele comes from a different parent in each case. Normally, only the paternal allele of IGF2 is used and its maternal counterpart is ‘silent’; conversely, only the maternal allele of H19 is used and the paternal H19 allele is silent.

A protein called CTCF binds to regions near these genes called ’imprinting control regions’ or ICRs. Genomic imprinting chemically records the origin of that region of a chromosome, specifying which parent it came from, usually by methylating the DNA.

CTCF binds DNA containing the sequence CCCTC (hence its name, CCCTC-binding factor). If the DNA bases of the binding site are methylated, CTCF cannot bind the DNA. If the binding site has no methyl groups, CTCF can bind it. Thus, DNA methylation can control which binding sites CTCF is able to bind to.
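The rule in this paragraph is simple enough to write down as a few lines of Python. This is only a sketch: the `BindingSite` class and the site names are invented for illustration, and real binding is not an all-or-nothing switch.

```python
from dataclasses import dataclass


@dataclass
class BindingSite:
    name: str          # hypothetical label for a CTCF binding site
    methylated: bool   # True if the DNA bases of the site carry methyl groups


def ctcf_can_bind(site):
    """CTCF binds only sites that have no methyl groups."""
    return not site.methylated


def bound_sites(sites):
    """Names of the sites that CTCF is able to occupy."""
    return [site.name for site in sites if ctcf_can_bind(site)]


# Hypothetical example: two candidate sites, one of them methylated.
sites = [
    BindingSite("site_A", methylated=False),
    BindingSite("site_B", methylated=True),
]
print(bound_sites(sites))
```

Changing which sites are methylated changes the list of bound sites without changing the DNA sequence at all, which is the point of the paragraph above.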

So epigenetic modification of the DNA can control which DNA sites a protein can bind. It turns out that this protein can form chromatin loops that control how genes are used. Thus DNA methylation can control the binding of a protein that controls the structure of a gene, which affects how the gene is used.

When CTCF is bound to DNA, it has two properties: (1) it prevents the spread of histone modifications that mark a gene as inactive (heterochromatin), and (2) it limits the ability of enhancers to encourage (or ‘enhance’) the expression of a gene via the promoter immediately before the gene (I’ll come back to this). This figure from Wei et al.’s review article summarises this graphically:

Fig. 1, Wei et al., Cell Research (2005) 15, 292—300

These two properties are thought to be a result of pairs of DNA-bound CTCF proteins joining together to form loops of chromatin. (Chromatin simply means ‘DNA wrapped around histone proteins’, like the DNA to the left of the words ‘Histone modification’ in the earlier illustration showing the main components of the epigenetic code. Almost all the DNA in the nucleus is packaged into chromatin.)

In gene structure A either enhancer can affect either gene; in structure B the enhancers can only affect the neighbouring gene (Source: Corces Lab web pages, Emory University)

One CTCF molecule bound to one ICR can interact (join together) with another CTCF molecule bound to another ICR, so that the chromatin between the two CTCF molecules forms a loop. It’s a bit like putting a bit of Blu-Tack on a piece of string, and another bit of Blu-Tack somewhere else on the same piece of string, then pushing the two pieces of Blu-Tack together: the string in between the two bits of Blu-Tack will form a loop.

Research scientists have worked out what the ‘loop structure’ of the DNA in the IGF2-H19 region looks like, and it’s pretty complicated! It looks a bit as if we had stuck several pairs of Blu-Tack blobs together along our string (chromosome), then joined those pairs to each other to form a ‘hub’, with the ends of the loops near the middle and several loops coming out from it.

Enhancers are (small) regions of DNA that, when proteins that regulate gene expression (gene regulatory proteins) bind to them, ‘encourage’ the promoter region immediately before a gene to express that gene. Enhancers usually seem able to ‘encourage’ only promoters that are in the same chromatin loop. (There also seem to be exceptions to this rule.) Which loops form therefore limits which genes an enhancer can affect.

Furthermore, different enhancers have DNA sequences that allow different gene regulatory proteins to bind, so different chromatin loops may make genes responsive to different regulatory signals.
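The ‘same loop’ rule can be sketched in Python too. Everything here is hypothetical: positions are arbitrary numbers along a piece of chromosome, a loop is just a pair of anchor positions, and anything not inside a loop is treated as sharing one open domain. The two layouts mimic structures A and B in the earlier Corces Lab figure.

```python
def loop_index(position, loops):
    """Return the index of the loop containing `position`, or None if it is unlooped."""
    for i, (start, end) in enumerate(loops):
        if start < position < end:
            return i
    return None


def reachable_genes(enhancer_pos, promoters, loops):
    """Genes whose promoter lies in the same loop (or same open domain) as the enhancer."""
    home = loop_index(enhancer_pos, loops)
    return sorted(
        gene for gene, pos in promoters.items() if loop_index(pos, loops) == home
    )


# Hypothetical layout: enhancer1, gene1, gene2, enhancer2 along the chromosome.
promoters = {"gene1": 20, "gene2": 80}
enhancers = {"enhancer1": 10, "enhancer2": 90}

structure_a = []                      # no loops: one open domain
structure_b = [(0, 50), (50, 100)]    # anchors at 0, 50 and 100 give two loops

for name, loops in (("A", structure_a), ("B", structure_b)):
    for enhancer, pos in enhancers.items():
        print(name, enhancer, "->", reachable_genes(pos, promoters, loops))
```

With no loops either enhancer can reach either gene; once the two loops form, each enhancer is limited to its neighbour. The sketch ignores the exceptions mentioned above.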

Fig. 5, Kurukuti et al., Proc. Natl. Acad. Sci. USA (2006) 103(28), 10684-10689

Pulling all this together, the maternal and paternal copies of the IGF2-H19 region have different DNA methylation of the ICR regions, so that they present different binding sites for CTCF. This causes the maternal and paternal copies of the IGF2-H19 region to have different loop structures. In the maternal copy, the IGF2 gene is in an inactive loop and the H19 gene is in an active one. The reverse holds for the paternal copy of the IGF2-H19 region, with the IGF2 gene in an active loop and the H19 gene in an inactive one. An older idea of the structure of the maternal copy of the IGF2-H19 region can be seen in this figure to the right (from Kurukuti et al.). (The more recent view is a little messier!)
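As a very rough summary in Python, the whole chain from parent of origin to gene use can be written as a lookup. This is a toy with one assumption that goes beyond the text above: that the ICR is methylated on the paternal copy and unmethylated on the maternal copy. It also collapses the real, messier loop structures into a single yes/no for whether CTCF binds.

```python
# Assumption (not stated in the article text): the ICR is methylated
# on the paternal copy and unmethylated on the maternal copy.
ICR_METHYLATED = {"maternal": False, "paternal": True}


def ctcf_binds_icr(parent):
    """CTCF can bind the ICR only where it is unmethylated."""
    return not ICR_METHYLATED[parent]


def active_gene(parent):
    """Toy rule: the CTCF-bound copy puts H19 in the active loop, the other copy IGF2."""
    return "H19" if ctcf_binds_icr(parent) else "IGF2"


def expression_summary():
    """Which gene is used from each parental copy of the IGF2-H19 region."""
    return {parent: active_gene(parent) for parent in ("maternal", "paternal")}


print(expression_summary())
```

The output matches the pattern described earlier: H19 is used from the maternal copy and IGF2 from the paternal copy.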

So… epigenetics, like the DNA methylation of CTCF binding sites, can control the loop structure of regions of a chromosome, affecting which allele is used and how, and in ways that can depend on which parent a gene came from.

(I’m simplifying this: a lot more proteins and other types of special locations in the DNA are involved in structuring the chromatin/DNA into loops, and there is a lot more to how the genes are regulated, but this gives a starting point in a way that I hope shows that DNA in the nucleus is not just a linear sequence, but is organised in domains by forming different loops that affect how the genes in those loops are used.)

All of this, in turn, relates to responses to hormone signals and to development, but that’s for other articles!


Wei, G., Liu, D., & Liang, C. (2005). Chromatin domain boundaries: insulators and beyond. Cell Research, 15(4), 292-300. DOI: 10.1038/sj.cr.7290298 (open access)

Kurukuti, S. et al. (2006). CTCF binding at the H19 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proceedings of the National Academy of Sciences, 103(28), 10684-10689. DOI: 10.1073/pnas.0600326103 (open access)

Yaniv, M., & Elgin, S. (2008). Chromosomes and expression mechanisms: bringing together the roles of DNA, RNA and proteins. Current Opinion in Genetics & Development, 18(2), 107-108. DOI: 10.1016/j.gde.2008.02.002 (subscription required)

For those wanting more depth, there are many excellent review articles explaining epigenetics. Serious readers should bear in mind that this field is moving very fast: the papers and reviews I cite here are now dated, but will still give an inkling of what is (was!) happening. An older special issue devoted to epigenetics can be found in volume 128, issue 4 of Cell (pages 627-802, 23 February 2007). Most of the review issues tend to cover some aspects at the expense of others; this issue of Cell does better than most at covering the complete range of issues at that time.
