I offer this belated explanation for my non-biologist readers who were, I guess, left a bit confused by my Xmas message… (Anyone who could read the message will want to skip this post.)

If you’re in a flipping great hurry and don’t want the explanation, just the translation, skip to the end of the article. If not, here’s that Xmas message again, below:

Xmas-Card-BLOG

Biologists tend to think of proteins as molecules that ‘do’ things.

Proteins include enzymes, some types of hormones, and the receptors that receive and recognise hormones (hormone receptors).

Some types of proteins (transcription factors) control the rate at which genes are used; others package the DNA in our cells (the protein part of chromatin).

Other proteins make up the molecular trafficking system that moves still more proteins around in the cell.

Proteins have three-dimensional shapes that are exploited in their function (see the figure below).

But what the heck has this to do with my Xmas message?

A protein is a chain of amino acids, and my Xmas message is encoded in a sequence of amino acids.

Each protein has its own unique sequence of amino acids, coded in the DNA for the gene that specifies the protein.

Main protein structure levels (Source: Wikimedia Commons)

There are two short-hand codes or abbreviations for amino acids.

The longer of the two codes is the three-letter code.

You can see three-letter abbreviations for the twenty ‘standard’ amino acids in the schematic representation of a protein chain at the top of the figure to the right.

(These are the 20 common amino acids coded for in the DNA sequence of a gene.)

Each three-letter code is a mnemonic for the amino acid it represents, most often the first three letters of the amino acid’s full name.

Alanine’s three-letter code is Ala. Glycine is Gly. And so on. A few don’t follow this rule. Asp refers to aspartic acid, but asparagine is abbreviated to Asn (likewise Glu for glutamic acid and Gln for glutamine). Isoleucine gets Ile and tryptophan gets Trp.

To store protein sequences on a computer, an alternative single-letter code is used. This uses the first letter of the full name of the amino acid, with variants where this would be ambiguous. Tyrosine is represented by Y, for example, because T is taken for threonine. Tryptophan is shortened to W.

The ‘trick’ in my card is simply to replace the three-letter amino acid codes with their single-letter counterparts. Once you’ve done this, you can read the message.

A table of the twenty ‘standard’ amino acids and their abbreviations can be found at Manchester University. (I’ve chosen this for its simplicity and hence clarity.) Another clear table, slightly better arranged if you want to translate the code for yourself, is at the DNA Data Bank of Japan. If you’d like to do this, better stop reading here, as the translation comes next!

So, the translation…

Trp Ile Ser His Ile Asn Gly : W I S H I N G

Arg Glu Ala Asp Glu Arg Ser : R E A D E R S

Ala : A

His Ala Pro Pro Tyr : H A P P Y

Asp Ala Tyr : D A Y

Ala Met Ile Asp : A M I D

Phe Ala Met Ile Leu Tyr : F A M I L Y

Ala Asn Asp : A N D

Phe Arg Ile Glu Asn Asp Ser : F R I E N D S
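
For readers who would rather let a computer do the swapping, here is a minimal sketch in Python of the same trick. The table holds the standard three-letter and single-letter codes for the twenty ‘standard’ amino acids; the names THREE_TO_ONE, decode and card are my own inventions for illustration, not part of any library.

```python
# Standard three-letter -> single-letter codes for the twenty
# 'standard' amino acids. (The dictionary name is just an illustration.)
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def decode(line: str) -> str:
    """Replace each three-letter code in a line with its single letter."""
    # capitalize() lets 'ALA' or 'ala' be read the same as 'Ala'.
    return "".join(THREE_TO_ONE[code.capitalize()] for code in line.split())


# The card, one word per line, as in the translation above.
card = [
    "Trp Ile Ser His Ile Asn Gly",
    "Arg Glu Ala Asp Glu Arg Ser",
    "Ala",
    "His Ala Pro Pro Tyr",
    "Asp Ala Tyr",
    "Ala Met Ile Asp",
    "Phe Ala Met Ile Leu Tyr",
    "Ala Asn Asp",
    "Phe Arg Ile Glu Asn Asp Ser",
]

message = " ".join(decode(line) for line in card)
print(message)
```

An unrecognised code raises a KeyError rather than being silently skipped, which seemed the safer choice for a lookup this small.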


Other articles on Code for life:

A holiday message to my readers

Loops to tie a knot in proteins?

Beauty in biology — green fluorescent protein

I remember because my DNA was methylated

Christmas tree