A weekend peek at an… ‘unconventional’ research paper.

Investigate Daily, a part of the local conspiracy theory magazine Investigate edited by creationist Ian Wishart, earlier in the week released “Scientists dumbstruck: signs of intelligent design in DNA code”.

Investigate Daily concludes, “the study is groundbreaking in its implications.”

Sounds pretty definitive, right?

A magazine titled Investigate ought to, well, investigate before announcing things as groundbreaking.

More disturbing, for me, is that this is published “science” nominally related to my own field, computational biology. More on that later.

The paper by shCherbak and Makukov, published in Icarus (an astrophysics journal and the official publication of the American Astronomical Society’s Division for Planetary Sciences) and pushed by creationists as supporting ‘intelligent design’, claims to have found a ‘code’ within our DNA that indicates life on earth was designed by aliens.[1]

If you think the claim is far-fetched, the stuff of science fiction or fantasy novels, you’re right.

Looking for biochemical clues for the possibility that life once was present on Mars, as the Mars Rover is doing, is one thing. Using numerology hoping to discover a hidden, secret code within our genetic code left by ‘the ancients’ is quite another.

You don’t need to delve into the ‘science’ the paper offers to realise it isn’t up to anything useful either.

As ‘Diogene’ has pointed out in a comment on another blogger’s take on shCherbak and Makukov’s paper, it rests on a false comparison of two options[2]:

  1. Created by random chance
  2. Created by space aliens

This is set up so that if the first is unlikely, the second “must” be right.

The setting is rigged because these two aren’t all the possibilities. There is at least one more:

  3. Created by a non-random natural process (e.g. evolved)

To declare any one the ‘preferred’ choice they’d have to investigate all three possibilities, then compare what was found. But they don’t: they only look at the first then declare the second as the ‘winner’ without ever looking at the third.[2]

My impression is that Wishart’s forte is (or was) political journalism.

Here it is in a nutshell, using an election as an analogy:

This work is like an election with three candidates where the third is left off the ballot sheets. Obviously the third candidate cannot win, even if they were the favoured candidate. It’s even more unjust: the second is handed victory by a claim of a weak result from the first without inspecting how the second fared.

I’d like to think Wishart can see that’s one rigged election!

A point here is that considering the overall logic, without looking at details of the paper, this isn’t the “groundbreaking” finding Wishart claims it to be.

It suggests, to my reading, that Wishart hasn’t even looked at the paper’s claims before making claims about it himself. I’d have thought that poor journalism.

Either way: time for publication of a retraction or errata by Investigate Daily, then?

Wishart commits other journalistic faux pas. He writes “Bio-scientists are reeling at the implications of new research […]” but nowhere quotes, cites or provides any substantiation of ‘bio-scientists’ doing this. He’s erecting a straw-man. The only ‘reeling’ sound biologists are going to be doing over this paper is at the silliness of it — and why it was ever published.

Understanding the flawed logic above is all you need to realise the paper isn’t going anywhere, but let’s go on to look at how their code is silly all the same, then let’s question how this managed to get published. Readers who don’t want to read biology may want to stop here.


My field is computational biology. Essentially it’s the theoretical biology arm of molecular biology. Investigating the genetic code would be considered to fall within this area. It’s always a bit annoying to find nonsense published in your own field or your field used for silly purposes…

Occasionally eccentric characters attempt to find ‘secret codes’ they claim hint at alien life or supernatural properties within the genetic code by numerology or the like. Typically these are on websites created by their eager authors; a handful make it into journals.[3]

shCherbak and Makukov’s interest, or at least ‘sales pitch’, is in SETI, the search for extra-terrestrial intelligence (more broadly, a search for intelligent extraterrestrial life).

They hope to find within the genetic code signs of a code by aliens embedded within it. As numerologists they hold prime numbers, zero, repeated numbers and so forth as ‘special’.

I’ll try to keep my description of their ‘code’ short, as I’ll wind up covering similar ground to PZ Myers.

Genes code for proteins, molecules that have biochemical roles in our cells. (Genes can also code for RNAs, which can also carry out functions: the paper overlooks this.) Proteins are chains of amino acids. The protein-coding portion of a gene is read as a series of triplets of adjacent bases, each triplet being called a codon. Each codon specifies one amino acid of the protein the gene codes for.
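To make the codon-to-amino-acid mapping concrete, here is a minimal Python sketch. It assumes the standard genetic code, written with DNA bases (T rather than U) and one-letter amino acid symbols. The function name translate and the example sequence are my own, purely for illustration.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acid symbols for the 64 codons of the standard genetic code,
# in the order TTT, TTC, TTA, TTG, TCT, ... GGG ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-base codon to the amino acid it specifies.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read a coding sequence three bases at a time, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# ATG -> M (methionine), GCC -> A (alanine), AAA -> K (lysine), TGA -> stop
print(translate("ATGGCCAAATGA"))  # prints MAK
```

Everything the paper does starts from this 64-entry table; the ‘code’ it describes is layered on top of it.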

shCherbak and Makukov invent a code by dividing the genetic code into subgroups and classifying the amino acid specified by each codon by its ‘nucleon number’: the total number of protons and neutrons (the two larger particles making up the nucleus of an atom) in the molecule, i.e. the sum of the mass numbers of its atoms.

Proteins are a repeated identical peptide unit, with side chains branching from them with a different side chain for each different amino acid. In their code, each amino acid is divided into the peptide unit that is the same in all amino acids and the side chain.

The ‘nucleon number’ of a peptide unit is 74, which they declare, for no reason other than their interest, is special because it is twice the prime, 37.[4] Because peptide units are the same for all amino acids, this hand-waving “guarantees” this prime number will feature in selected examples of any further ‘analysis’ of their code.

The amino acid proline doesn’t fit their nucleon scheme, so they fiddle the books by shifting a hydrogen to bring the ‘nucleon number’ of proline up to 74. (They also declare proline ‘special’ and a ‘key’.)
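The arithmetic behind the 74 is easy to check. The sketch below assumes the shared unit is the free amino acid minus its side chain (H2N-CH-COOH, i.e. C2H4NO2) and counts nucleons using the most common isotope of each element. The names NUCLEONS and nucleon_number are mine, not the paper’s.

```python
# Nucleon (mass) numbers of the most common isotope of each element:
# 1H, 12C, 14N, 16O, 32S. (Assumption: only these isotopes are counted.)
NUCLEONS = {"H": 1, "C": 12, "N": 14, "O": 16, "S": 32}


def nucleon_number(composition):
    """Sum nucleons over a composition given as {element: atom count}."""
    return sum(NUCLEONS[element] * count for element, count in composition.items())


# The unit shared by the standard amino acids: H2N-CH-COOH, i.e. C2H4NO2.
standard_block = {"C": 2, "H": 4, "N": 1, "O": 2}

# Proline's side chain loops back onto the nitrogen, so its shared unit
# has one hydrogen fewer and its side chain is C3H6.
proline_block = {"C": 2, "H": 3, "N": 1, "O": 2}
proline_side_chain = {"C": 3, "H": 6}

print(nucleon_number(standard_block))      # 74 = 2 * 37
print(nucleon_number(proline_block))       # 73
print(nucleon_number(proline_side_chain))  # 42
```

Moving one hydrogen from proline’s side chain to its shared unit turns 73 + 42 into 74 + 41: the total is unchanged, but proline now “fits”.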

They use the code they have invented to turn codons into numbers, then go on to ‘discover’ “patterns” in some arrangements of DNA bases and codons, particularly arithmetic that generates the prime 37. Pure numerology. It’s not surprising they find something doing this; it’d be far more surprising to my mind if they found nothing at all.

As you might imagine, this descends into silliness. They make pointed references to repeated units as being ‘special’. They contrive the prime number 37 to ‘appear’ by adding and subtracting numbers until it does. Given the number of combinations available, and that they baked this prime into their code from the outset, this is hardly interesting or surprising. At one point they even write “this leads to 3² + 4² = 5² – numerical representation of the 243 Egyptian triangle, possibly as a symbol of two-dimensional space.”
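To see why hunting through combinations is bound to turn up multiples of 37, here is a rough Python sketch. The twenty integers are arbitrary numbers I picked for illustration; they are not the values used in the paper, and count_hits is a hypothetical helper. If sums were spread evenly, roughly one in 37 would be divisible by 37 by chance alone.

```python
from itertools import combinations

# Twenty arbitrary integers, chosen only for illustration.
# They are NOT the nucleon numbers used in the paper.
values = [10, 27, 3, 8, 14, 19, 22, 35, 41, 46,
          52, 58, 63, 69, 71, 77, 84, 90, 95, 101]


def count_hits(numbers, sizes, modulus=37):
    """Count subsets of the given sizes whose sum is divisible by modulus."""
    tried = hits = 0
    for size in sizes:
        for subset in combinations(numbers, size):
            tried += 1
            if sum(subset) % modulus == 0:
                hits += 1
    return tried, hits


# Try every pair, triple and quadruple of the twenty numbers.
tried, hits = count_hits(values, sizes=(2, 3, 4))
print(f"{hits} of {tried} small subsets sum to a multiple of 37")
```

With 6,175 pairs, triples and quadruples to choose from, some ‘special’ sums are all but guaranteed, and that is before allowing subtraction or regrouping.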

The only ‘intelligence’ here is the authors’, in their creation: their “code” and their ‘reading’ of it.

They go on to estimate the likelihood of getting their ‘code’ by random chance as 1 in 10¹³ (a P-value of 10⁻¹³), failing to say how this is done, and suggest this means there ‘must’ be an ‘intelligent’ (alien) code present. But they fail to consider patterns in codons that arise for biological reasons.

And there really lies the rub.

The genetic code is not random. No biologist would suggest it is.

Think about it. Things that are evolved cannot be random, essentially by definition. (If they were truly random, they wouldn’t have evolved into a particular thing, they’d just be random…)

There are patterns in the DNA bases in the codons and the amino acids the codon codes for. Some of these relate to how chemically-similar amino acids use related codons and are made in similar ways. These reflect the biochemistry used to make the amino acids and it doesn’t take much to see that they are likely a product of evolution expanding on an earlier repertoire. (It’s how evolution works, building on what went before.) To save a little time, here are the examples PZ Myers listed:

The first DNA base of a codon is related to how the amino acid is made:

  • C, then the amino acid is derived from alpha-ketoglutarate.
  • A, then the amino acid is derived from oxaloacetate.
  • T, then the amino acid is derived from pyruvate.
  • G, then the amino acid is derived in a single step from simple precursors.

The second DNA base of a codon is related to the chemical properties of the amino acid:

  • A, then the amino acid is hydrophilic.
  • T, then the amino acid is hydrophobic.
  • G or C, the amino acid has an intermediate hydrophobicity.
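The second-base pattern can be checked directly against the standard genetic code. This is a small Python sketch under the same assumptions as any textbook codon table (DNA bases, one-letter amino acid symbols, stop codons ignored); the variable names are mine.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in the order TTT, TTC, TTA, TTG, TCT, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Collect the amino acids specified by codons sharing the same second base.
by_second_base = defaultdict(set)
for (first, second, third), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    if aa != "*":  # skip stop codons
        by_second_base[second].add(aa)

for base in BASES:
    print(base, "".join(sorted(by_second_base[base])))
# T FILMV
# C APST
# A DEHKNQY
# G CGRSW
```

Every codon with T in the middle specifies one of phenylalanine, isoleucine, leucine, methionine or valine, all hydrophobic, while the A row is dominated by charged and polar amino acids. That is structure in the code with a biochemical explanation, no numerology required.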

How did this get published?

I guess the last word should be how this got published in a peer-reviewed scientific journal.

Icarus is a journal “devoted to the publication of original contributions in the field of Solar System studies.” (Their emphasis.)

Most journals will reject papers outside of the field they cover. Aside from wanting to stay “on topic”, you can’t properly review things outside of your area of expertise. This paper was pitched as being related to SETI, but the body of the paper is related to biological material (DNA and the genetic code). It would want to be peer-reviewed by biologists, in particular computational biologists or those working on the origins of life. I can’t imagine it has been (they’d reject it).[5]


Other discussions of this paper can be found on reddit (note onxylion’s response), briefly on a biochemist’s blog, along with other science blogs like Sensuous Curmudgeon (a science blog – the name is a wordplay) and PZ Myers’s well-known Pharyngula (named after a developmental stage of an embryo).

1. The full story of the ‘Intelligent Design’ (ID) movement is too long for here. Briefly, it is creationism in drag (a description its proponents protest), stemming from a court ruling in the USA that creationism is not science and therefore cannot be presented as such in science classes. To work around this, a core group of creationists presented a new name in place of creationism: intelligent design. Since then, courts in the USA have ruled that ID is creationism under another name. There are persistent efforts by members of the religious Right in the USA to insert creationist teachings into science classes there. Some have raised related concerns in New Zealand over the new government ‘charter’ school initiative being headed by an MP who has acknowledged he is a creationist.

2. I’ve simplified this to make it easier on non-scientist readers, presenting it as black and white rather than shades of grey. They do offer some sort of attempts at calculating odds in their Appendices. Key is that they calculate odds for finding their code, not a comparison involving the functional relationships known to be in the genetic code like those I gave (from Myers). That’s the key point I’ve highlighted by drawing on Diogene’s logic.

In the end they hand-wave their testing through as ‘their opinion’ (my emphasis):

This result gives probabilities for the specific type of patterns – nucleon equalities and ideogram symmetries. However, testing the hypothesis of an intelligent signal should take into account patterns of other sorts as well, as long as they meet the requirements outlined in Introduction. After analysis of the literature on the genetic code our opinion is still that nucleon and redundancy numbers are the best candidates for “ostensive numerals”. We admit though that there could be other possibilities and that the obtained P-value should be regarded as a very rough approximation (keep in mind simplifications in the test as well). But admittedly, there just cannot be enough candidates for “ostensive numerals” and corresponding logical arrangements to compensate for the small P-value obtained and to raise it close to the significance level.

Thus, in the end their analysis seems to come down to their opinion. (Note also hiding behind ‘ostensive’ – special pleading that the ‘numerals’ will be ‘difficult to understand’.)

3. Writing at Sandwalk, Steve Oberski described shCherbak and Makukov’s paper as a “biochemical Kabbalah”. I’m instead reminded of the book I’ve just read, The Queen’s Conjuror, an account of the life of John Dee and his seemingly relentless search for the codes hidden in ancient texts. In Dee’s time this was based on a mix of astrology and religion and many scholars of his day made sincere efforts of this type. To see this sort of thing today, however, should immediately raise red flags.

4. The senior author apparently has a history of hankering after codes based on the prime number 37 (which he refers to as 037 – as if it were a secret agent à la Bond, James Bond) and codes that are digital base 10 (why not sexagesimal, as for the Babylonians, etc.?).

As Piotr Gasiorowski put it,

shCherbak seems to be obsessed with the beauty and magical properties of the decimal system (which he believes is the system of the human mind and a hallmark of intelligence in general). His publications are mostly about things decimal (and divisibility by 37, and numerological mysticism in general). How he manages to smuggle such stuff past reviewers is a mystery to me (unless the journals in question are nor really peer-reviewed). I suppose the number of digits in first tetrapods was fixed as 5 so that their distant descendants could use their fingers as a portable calculator.

One wit has re-worked this to 42, 42 = 2 * 3 * 7. (Recall the authors divided 74 by 2 to get 37.) Of course, you could do this sort of thing endlessly and indeed that’s pretty much the point.

5. I guess one formal, but I would like to think unlikely, possibility is creationist reviewers.
