
I’d argue (good) computational biology is more than computer science + biology: it is computer science (or statistics, mathematics, etc.) + biology + theoretical science.

Ruminating, mulling over something relatively trivial can be fun…

Sandra Porter has a recent post, “Digital biologists, bioinformaticists, and computational biologists: more thoughts on the question of names”, written partly in response to comments by myself and others on an earlier post of hers on the same subject, a discussion that evolved (read: diverged from the topic) somewhat, and partly in response to a post by “Mike, the mad biologist”.

This short post is intended partly as a reply to this discussion (it’s too long for a blog comment) and partly as a background piece for me to refer back to. It may also offer useful introductory thoughts for some “bench” biologists.

Speak up if you’re familiar with the subject and see things differently; they’re working definitions, ones I’m happy to change if I see a better fit.

To try to keep this brief(ish) I’m just going to update my categories quickly, then offer a few words of commentary.

Computational biologist, bioinformaticist and digital biologist (or bioinformatics analyst) all lie at an intersection of computing (computer science) and biology.

All are biologists in one sense, using different biological backgrounds.

Here are my categories, updated:

Computational biologist: Specialists, focusing on developing and applying theoretical biology

Bioinformaticist: Generalists, developers/advanced users of informatics tools that manipulate biological data

Digital biologist / bioinformatics analyst: Biologists who conduct bioinformatics analyses full-time, but don’t develop software (I prefer the latter term)

Biologist: Experimentalists, who primarily do “bench” work, but may occasionally use bioinformatics tools

These are, of course, generalisations, with all that comes with that.

Let me look in more detail to show how I get to thinking this.

I’ll start with computational biology, as that’s my own patch, and get to Sandra’s consideration of a new term, ‘digital biology’, towards the end. Don’t skip ahead or you’ll miss the reasoning!

As Sandra says, computational biologists are people who know both computer science (and/or statistics, mathematics, etc.) and biology, but to my mind two additional things need to be factored in. (Bear in mind her interest is in considering ‘digital biologist’, so it’s understandable that she doesn’t elaborate on this.)

  1. Computing vs. computer science
  2. Theoretical biology, or specialist knowledge

I’ve elaborated on this in long-hand fashion previously (here and here). I’m going to try to be concise here.

Computer science vs. computing

This is most easily explained by asking whether the learning involved for a project or job is essentially “by rote” from a textbook describing a programming language or software system, or whether it is the underlying science used to construct these.

The former is computing; the latter computer science.

Computing is used in things like developing websites, databases, or data pipelines using existing tools.

Computer science is a collection of theoretical knowledge, techniques and methods used in developing algorithms (beyond the trivial “list of steps” kind).

Theoretical biology

This is not “a basic understanding of biology” as in the job advertisement Sandra gives as her first exhibit. I presume she is being sarcastic when she writes “Okay – so all you need to have to be a computational biologist is a computer science degree and one genetics class.” I’d agree, sarcastically. (To be fair, it’s common for advertisements to set low “bars” with a view to seeing what they get.)

What characterises computational biology (to me) is not a theoretical (read: basic) knowledge of biology; it is a knowledge of theoretical biology, a quite different (and more involved, detailed) thing.

This is a different background than what experimental biologists typically have. It is shared with biologists from some specialist areas, however.

What in the past I’ve usually termed ‘theoretical biology’ might be equally, and in some ways more fairly, termed specialist knowledge.

It may be only biological in its application, being actually based on, say, physics or physical chemistry. The sorts of things I’m referring to are the (bio)physics underlying the nature of proteins (cells, tissues), the theoretical issues behind developing phylogenetics methods, and so on. They’re “deep”, fiddly and specialist.

This knowledge is shared with specialists in these respective experimental areas, e.g. experimental biophysics, etc.

With a few exceptions, most “mainstream” molecular biologists, geneticists, developmental biologists, cell biologists, etc., don’t tap into this sort of thing directly (or not often), but usually rely on others to provide it to them in the form of equipment, techniques, software, collaboration, consulting, etc. Nor do most bioinformaticians or what Sandra refers to as ‘digital biologists’. They rarely need this level of detail. This applies to more than computational biology: you can think of pretty much any specialist equipment, technique, etc., this way.

Computational biologists leverage specialist knowledge from particular areas to generate results or develop tools that can be used by people outside those specialist areas (and in some cases, even by experimental biologists within the same general niche).

Computational biology vs. bioinformatics

Bioinformatics differs from computational biology in being based primarily on informatics. Informatics by its nature does not require a deep biological knowledge. Bioinformaticists tend to be generalists, as only a relatively modest knowledge of biology is needed and the computing skills can be applied across a fairly wide range of biological problems.

By contrast, computational biology is primarily founded on theoretical science; computer science is a toolkit used to implement the theoretical concepts in a practical way. Computational biologists tend to be specialists, as they tap into specialist biological knowledge that takes time to build up.

In reality there is a continuous spectrum between the two, but I feel it’s important to point to the theoretical or specialist knowledge-base as it seems rarely discussed when these topics come up.

‘Digital biologist’ = ‘bioinformatics analyst’?

What the above illustrates is that you need to look at the underpinning knowledge used in the work and at what is done.

In the case of her (proposed) ‘digital biologists’, the underlying knowledge is (I hope) the methods implemented in the software, together with a knowledge of biology of the kind that most experimental biologists have; what they do is conduct bioinformatics analyses.

To me this evokes the name ‘bioinformatics analyst’.

Wrap-up thoughts

Speaking for myself, all the respective terms for the different players in the computing (computer science) + biology sphere seem so widely “misused” that it’s become something of a lottery as to what meaning is really intended for any of the terms.

Sandra’s example advertisement doesn’t fit the bill for me. Neither did one I saw immediately after offering ‘bioinformatics analyst’ as a possible name on her blog: it was an advertisement for a bioinformatics analyst that clearly described wanting someone to develop bioinformatics software (i.e. not an analyst, but a developer).

So… I suspect as a practical matter, given the mess the terms are in, people may just have to pick a name that suits them and get used to explaining it… Sigh. (I’d love to see them more consistently used, but realistically I can’t see the situation changing.)

If Sandra’s crowd like the term ‘digital biologist’, I can’t see it’s worse than any other term.

I understand where she’s coming from, too. It’s a pain when you’re misdescribed, misunderstood, or unable to convey accurately what you do in a neat term to others.


Earlier bioinformatics and related posts on Code for life:

Retrospective: Credits, Dis-credits and Mis-credits

Retrospective–The mythology of bioinformatics

Bioinformatics — computing with biotechnology and molecular biology data

Computational biology: Natural history v. explanatory models

Bibliographies-why can’t research papers self-document what they are?