Their web page for it is excellent, and gives some more details along with a list of the proteins featured in their alphabet. They’ve also made this great video of their proteinaceous font:
Proteins make up the chemical ‘engines’ and structures in our cells. Enzymes are proteins that make specific chemical reactions we need happen more efficiently. Other proteins help give cells their shape and help organise things within the cell, for example by moving them about.
(Proteins are also involved in reorganising our genes, depending on whether they’re supposed to be used in that type of cell or not. This is a key element of epigenetics. Genes that are not being used are packed away; genes to be used are unpacked. Previously used genes can be primed for reuse.)
You’ll see that scientists don’t know what the proteins for ‘B’ and ‘T’ do. It’s less common not to know the function of a protein whose three-dimensional structure is known. It takes a lot of work to purify a protein and work out its structure, so people tend to look at proteins that seem important for a particular purpose.
It is relatively common for the function of a gene not to be known. When the sequence of a genome is constructed, the locations of the genes within the DNA sequence are identified. One genome sequence shows all of the genes, whether they’ve been studied before or not.
Scientists have made databases of the genomes and proteins we know about that anyone can look at. The main database for three-dimensional structures of molecules is ‘the PDB’, the Protein Data Bank.
If you’re keen you can explore proteins at the PDB website.
You’ll also see that they feature a molecule of the month. Currently it’s the huge protein titin, which has the longest protein-coding sequence of any gene in the human genome (80,780 base pairs). Titin forms elastic fibres in muscles.
The PDB website (and some other websites) has interactive viewers that let you explore the structure of the proteins online, rotating them and zooming in and out.
I wonder if some enterprising souls will create protein alphabets for the languages that don’t use a Roman script – Arabic, Cyrillic, Coptic and so on?
1. But, please, not Comic sans!
2. Molecular biologist, computational biologist, crystallographer, etc…
3. I’m writing tongue-in-cheek here. I doubt this is available as an actual computer font, but it’s a fun alphabet all the same!
4. Science communication types might consider this a good example of outreach, presenting protein structures using something we’re all familiar with.
One thing I don’t like about their page is that it forces all links to open in the same tab, overwriting the page and preventing users from opening a link in a new tab. Just my opinion, but it’s poor practice to take control away from users; better to let users decide what they wish to do.
5. I’ve no idea if this is the case here, but potentially one reason to not know the function of the protein is that the structure of the protein may be from one of the laboratories aiming to find examples of all the shapes (folds) proteins have. These projects are more interested in finding the shapes of proteins, not in whether they have a known function. Once you have the shape (fold) of a protein, you can test if a protein whose shape or structure you do not know is likely to have the same shape as a protein whose shape is known.
This can be done using so-called ‘protein threading’. The chain of amino acids that makes up a protein (the protein sequence) is ‘threaded’ onto a scaffold of each known protein shape, then tested to see whether the sequence can be fitted onto the shape in a way that has the chemical properties of each amino acid fitting well to the shape. These computational biology techniques are a bit of a hobby-horse of mine – if you’re interested in a better explanation, you’re welcome to let me know in the comments below.
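For the curious, here is a toy sketch of the threading idea in Python. It is very much a simplification and not how real threading programs work: I'm assuming each fold can be boiled down to a string of 'B' (buried) and 'E' (exposed) positions, that amino acids are either hydrophobic or not, and that the sequence slides along the fold without gaps. The fold names, the scoring values and the function names are all made up for illustration.

```python
# Toy illustration of protein threading (assumptions throughout, not a real method).
# Each fold is reduced to a string: 'B' = buried position, 'E' = exposed position.

# Simplified, assumed set of hydrophobic amino acids (one-letter codes).
HYDROPHOBIC = set("AVLIMFWC")


def position_score(residue, environment):
    """Return +1 if the residue suits the position, -1 if it does not."""
    is_hydrophobic = residue in HYDROPHOBIC
    if environment == "B":
        # Hydrophobic residues are assumed to prefer buried positions.
        return 1 if is_hydrophobic else -1
    # Everything else is assumed to prefer exposed positions.
    return -1 if is_hydrophobic else 1


def thread(sequence, fold):
    """Slide the sequence along the fold (no gaps).

    Returns (best_score, best_offset), or None if the sequence is
    longer than the fold, which this toy version cannot handle.
    """
    if len(sequence) > len(fold):
        return None
    best = None
    for offset in range(len(fold) - len(sequence) + 1):
        score = sum(
            position_score(residue, fold[offset + i])
            for i, residue in enumerate(sequence)
        )
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def rank_folds(sequence, folds):
    """Thread the sequence onto every fold and rank folds by best score."""
    results = {}
    for name, fold in folds.items():
        result = thread(sequence, fold)
        if result is not None:
            results[name] = result
    return sorted(results.items(), key=lambda item: item[1][0], reverse=True)


if __name__ == "__main__":
    # Hypothetical folds and a hypothetical sequence, for illustration only.
    folds = {"fold_1": "BEBEBEBE", "fold_2": "EEEEBBBB"}
    for name, (score, offset) in rank_folds("LKVEIDAS", folds):
        print(name, score, offset)
```

The fold that scores highest is the best guess for the shape of the sequence. Real methods use far richer descriptions of each position, allow gaps, and check whether the best score is better than you would expect by chance.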
Other articles in Code for life: