
(A little indulgence: ruminating on an aspect of the development of science since the late 1980s/early 1990s.)

Steve Caplan has contrasted the experimental biology of his Ph.D. student days with the present-day kit-driven science, comparing his early-1990s efforts of manually labouring over a relatively small number of techniques with today’s students using a much wider range of techniques via kits. He first frets that this might result in them not understanding what the kits really do, but then suggests this might lead to:

‘[…] more time thinking, more time reading, more time figuring out which new assays will be applicable to the research; how to best spend the money to get ‘the most research for the buck.’ There will be less work at the bench, and more thought given to which kits to order and what work to outsource and to whom.’

He concludes that this might be OK, provided students (and researchers!) ‘understand the technical concepts of the science that they carry out.’

When I first read his article, it struck me that something similar to the development of kits had happened in computational biology over the same time frame.

A commenter, Boel, has beaten me to raising the essence of this point over there, but allow me the luxury of putting my own, more extended, take on it here, explaining the parallel I was seeing and adding a few further thoughts.

Today students and researchers can punch in a URL–or pull up the bookmark–to bring up a (hopefully!) relevant web service, drop their sequence into the textbox, then push the button and just read the results – all with little thought to the conceptual[1] nature of the algorithms that are analysing their data, should they choose not to.
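
To give a concrete sense of how little mechanical effort is now involved, the same ‘paste your sequence, push the button’ step can even be scripted in a handful of lines. The following is only an illustrative sketch of mine, not something from Steve’s post: it assumes Biopython is installed and the NCBI BLAST web service is reachable, and the query sequence is an arbitrary placeholder.

```python
# A minimal sketch of the 'paste sequence, push button' workflow done
# programmatically, via Biopython's interface to the NCBI BLAST web service.
# Assumes Biopython is installed and the NCBI servers are reachable;
# the query sequence below is an arbitrary placeholder.
from Bio.Blast import NCBIWWW, NCBIXML

query = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # placeholder sequence

# Submit the query to the service and wait for the results.
result_handle = NCBIWWW.qblast("blastn", "nt", query)

# Parse the returned XML and print the top hits -- none of which requires
# knowing how the BLAST algorithm itself works.
record = NCBIXML.read(result_handle)
for alignment in record.alignments[:5]:
    print(alignment.title, alignment.hsps[0].expect)
```

The point isn’t the particular tool, of course; it’s that the mechanics have become nearly effortless, which is exactly why the conceptual understanding matters.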

Once, web service tools didn’t exist – and it wasn’t that long ago, either. I’ve seen that transition hands-on, having implemented a few web services myself.

During my Ph.D. studies you had to locate the software–sometimes a small mission in itself in the late 1980s–then transfer a copy over to your machine, install it, read the notes that came with the software and the paper describing the method, then experiment with the parameters they described to get what you wanted. Remember, this wasn’t just for computational biologists, but for anyone who wanted to analyse their data.[2]

It struck me that perhaps the change mediated via web-interfaced bioinformatics servers is similar to what experimental kits have done for experimental biology. Both take much of the mechanics of doing the work out of the researcher’s hands.

Steve goes on to extend this to out-sourcing, using DNA sequencing as his example. Out-sourcing computational biology analysis to people like me[3] might be an obvious parallel.

It’s not difficult to argue that there is a parallel set of good and bad points for these computational web services as for the experimental kits Steve talks about. Both can let you ‘get away’ without knowing the underlying details, should you choose to.

The argument over there is that the real science lies in the decisions made (what methods to use, etc.) and that these ought to be informed by a knowledge of what the methods achieve.

Even though you don’t have to put as much effort into making the tools work when using services, you still have to understand what they are doing to your data.

The same applies to all the other tools: the various machinery and instrumentation used.

You need to know, at least at a conceptual level, what these things are doing.

I would add that this implies a need for good documentation. I personally don’t consider software projects done until the documentation is done. For data analysis methods, the documentation has to cover more than the mechanics of how to run the things: what they do with the data – the algorithm (in conceptual terms), the parameters, and so on. All fairly obvious to those delivering the tools, but it has to be made use of by those using them, too.

I would add, also, that there is a balancing act here: how to best spend the money and time. Time is money, in many ways. If a particular task involves considerable background knowledge, is it really best to spend valuable time learning the background in order to decide what method to choose or how to apply it, or should that be out-sourced, letting a specialist take over?[4]

Steve refers to people working in a wider range of techniques, sometimes going (a little) outside their comfort zone. There is a point, I think, at which the thing gets too ‘wide’ – where part of the decision-making is when to locate a collaborator or service. A standard research decision, nothing new there. But is the ease of the tools encouraging people to push wider than they ought to?

Your thoughts are welcome. Right now I’m thinking that people should invest in talking to specialists in the planning stages, to check that the plan is one they can realistically cover themselves – or whether they are taking the right approach at all – and to ensure that ‘gotchas’ don’t catch them out.[5]

Footnotes

[1] I’m not suggesting they need to know the finer points of how they’re implemented, just how they conceptually work.

[2] As many of my readers will know, it’s still somewhat like that behind the scenes for those who make the services, and for those working at the cutting edge – most web services offer the more established methods.

[3] I’m a freelance computational biologist, working as a consultant. I have to admit I prefer to be more deeply involved with the project, given a choice.

[4] A problem I run into sometimes is biologists who have already determined what they consider the appropriate data analysis to be. Sometimes it’s fine, but I find myself asking them to explain the biological problem they wish to have addressed, so that I might see whether the method would in fact give them what they want, or if there are better approaches than what they have suggested.

[5] I’ve written before that an impression I get of (many) grant applications is that they are written such that ‘the project will ‘hire someone with appropriate expertise when the time comes’, which assumes that the data analysis portion of the plan is sound.’ A related issue is that this can backfire, with a ‘rescue effort’ needed.


Other articles on Code for Life:

Developing bioinformatics methods: by who and how

External (bioinformatics) specialists: best on the grant from the onset

Choosing an algorithm — benchmarking bioinformatics

Loops to tie a knot in proteins?

Epigenetics and 3-D gene structure