A report on the issue, published in Nature this May, found that about 90% of the 1,576 researchers surveyed now believe there is a reproducibility crisis in science.
While this rightly tarnishes the public belief in science, it also has serious consequences for governments and philanthropic agencies that fund research, as well as the pharmaceutical and biotechnology sectors. It means they could be wasting billions of dollars on research each year.
One contributing factor is easily identified. It is the high rate of so-called false discoveries in the literature. They are false-positive findings and lead to the erroneous perception that a definitive scientific discovery has been made.
This high rate occurs because the studies that are published often have low statistical power to identify a genuine discovery when it is there, and the effects being sought are often small.
Further, dubious scientific practices boost the chance of finding a statistically significant result, usually defined as a probability of less than one in 20 (p < 0.05). In fact, our probability threshold for accepting a discovery should be more stringent, just as it is for discoveries of new particles in physics.
The English mathematician and father of computing, Charles Babbage, noted the problem in his 1830 book Reflections on the Decline of Science in England, and on Some of Its Causes. He formally divided these practices into “hoaxing, forging, trimming and cooking”.
‘Trimming and cooking’ the data today
In the current jargon, trimming and cooking include failing to report all the data, all the experimental conditions and all the statistics, and reworking the probabilities until they appear significant.
Man prefers to believe what he prefers to be true.
Deep-seated cognitive biases, consciously and unconsciously, drive scientific corner-cutting in the name of discovery.
This includes fiddling the primary hypothesis being tested after knowing the actual results or fiddling the statistical tests, the data or both until a statistically significant result is found. Such practices are common.
Even large randomised controlled clinical trials published in the leading medical journals are affected (see compare-trials.org) – despite research plans being specified and registered before the trial starts.
Researchers rarely stick exactly to the plans (about 15% do). Instead, they commonly remove registered planned outcomes (which are presumably negative) and add unregistered ones (which are presumably positive).
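To see why adding or swapping outcomes matters, consider a rough sketch of the arithmetic. It assumes that every outcome tested has no real effect and that the tests are independent, which is a simplification rather than a description of any particular trial. The function name and the numbers of outcomes are illustrative choices, not taken from the studies mentioned above.

```python
def chance_of_any_false_positive(n_outcomes: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests comes out
    'significant' at threshold alpha when no real effect exists.

    Assumes independent tests and no true effects (a simplification).
    """
    return 1.0 - (1.0 - alpha) ** n_outcomes


# Illustrative numbers of outcomes; these are hypothetical, not survey data.
for k in (1, 5, 10, 20):
    chance = chance_of_any_false_positive(k)
    print(f"{k:>2} outcomes tested: {chance:.0%} chance of at least one spurious result")
```

Under these assumptions, the chance of at least one spurious "significant" result rises from 5% with a single outcome to roughly 23% with five, 40% with ten and 64% with twenty. A researcher free to choose which outcome to report afterwards is therefore far more likely to find something to publish.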
Publish or perish
We do not need to look far to expose the fundamental cause for the problematic practices pervading many of the sciences. The “publish or perish” mantra says it all.
Academic progression is hindered by failure to publish in the journals controlled by peers, while it is enhanced by frequent publication of research findings, which are nearly always positive. Does this sort of competitive selection sound familiar?
It is a form of cultural natural selection: natural, in that it is embedded in the modern culture of science, and selective, in that only survivors progress. The parallels between biological natural selection and selection related to culture have long been accepted. Charles Darwin even described the role of selection in the development of language in The Descent of Man (1871).
Starkly put, the rate of publication varies between scientists. Scientists who publish at a higher rate are preferentially selected for positions and promotions. Such scientists have “children” who establish new laboratories and continue the publication practices of the parent.
Good science suffers
In another study published in May, researchers modelled the intuitive but complex interactions between the pressure and effort to publish new findings and the need to replicate them to nail down true discoveries. It is a well-argued simulation of the operation and culture of modern science.
They also conclude that there is natural selection for bad scientific practice because of incentives that simply reward “publication quantity”:
Scrupulous research on difficult problems may require years of intense work before yielding coherent, publishable results. If shallower work generating more publications is favored, then researchers interested in pursuing complex questions may find themselves without jobs, perhaps to the detriment of the scientific community more broadly.
The authors also reiterate the low power of many studies to find a phenomenon if it were truly there. Despite entreaties to increase statistical power, for example by collecting more observations, it has remained consistently low for the last 50 years.
In some fields, it averages only 20% to 30%. Natural academic selection has favoured publication of a result, rather than generation of new knowledge.
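The link between low power and false discoveries can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a significance threshold of one in 20 and, purely for illustration, that one in ten of the hypotheses being tested is actually true. That one-in-ten figure is a hypothetical input, not a number from the studies discussed here, and the function name is likewise an illustrative choice.

```python
def false_discovery_share(power: float, prior_true: float, alpha: float = 0.05) -> float:
    """Share of statistically significant findings that are false positives.

    power      -- chance a study detects an effect that is really there
    prior_true -- assumed fraction of tested hypotheses that are true
    alpha      -- significance threshold (false-positive rate per test)
    """
    true_positives = power * prior_true
    false_positives = alpha * (1.0 - prior_true)
    return false_positives / (true_positives + false_positives)


PRIOR_TRUE = 0.10  # hypothetical assumption: 1 in 10 tested hypotheses is true

# 20% and 30% are the power levels quoted above; 80% is a conventional target.
for power in (0.2, 0.3, 0.8):
    share = false_discovery_share(power, PRIOR_TRUE)
    print(f"power {power:.0%}: about {share:.0%} of significant findings are false")
```

With these inputs, about 69% of significant findings are false at 20% power and 60% at 30% power, compared with 36% at 80% power. The exact figures depend heavily on the assumed share of true hypotheses, so they are best read as an illustration of the direction of the effect, not an estimate for any field.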
The impact of Darwinian selection among scientists is amplified when government support for science is low, growth in the scientific literature continues unabated, and universities produce an increasing number of PhD graduates in science.
We hold an idealised view that science is rarely fallible, particularly biology and medicine. Yet many fields are filled with publications of low-powered studies, perhaps the majority of which are wrong.
This problem requires action from scientists, their teachers, their institutions and governments. We will not turn natural selection around, but we need to put in place selection pressures for getting the right answer rather than simply getting published.