beyond reasonable doubt: a significant improvement

By John Pickering 15/01/2015

For the second time in a week I have removed the word “significant” from a draft manuscript written by a colleague of mine in clinical medicine. In a significantly p’d I wrote about the myth of significance, that is, about the ubiquitous use of the term “significant” in the medical literature to mean a specific probability of incorrectly rejecting the hypothesis that two things (e.g. two treatments) are the same (you may need to read that twice). What I pointed out was that “significant” does not mean “meaningful.” Here I want to propose an alternative. But first, I need to discuss two major problems with the term.

Where common is not specific

In my experience, the common usage of “significant” to mean “important” is the normal interpretation of the word in the science literature, even by many medically trained people and sometimes by the authors of articles themselves.

The tyranny of p<0.05

When the maths wiz Ronald Fisher talked about significance (in an agricultural journal, not a medical one!) he used one in 20 (p<0.05) as an acceptable error rate in agricultural field trials so that trials did not have to be repeated many times. That p<0.05 has taken on almost magical proportions (‘scuse the pun) in the medical literature is scary and shameful. I don’t want to delve into all that now. If you want to, a starting point may be the Nature article here.
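For anyone who wants to see what “one in 20” looks like in practice, here is a minimal sketch rather than anything definitive. It simulates many trials in which the two groups are truly the same and counts how often p<0.05 turns up anyway. The function name, the group size of 30, the number of trials and the use of a simple two-sided z-test with a known standard deviation of 1 are all my own assumptions for illustration; none of them come from Fisher or from the Nature article.

```python
import random
from statistics import NormalDist, mean


def false_positive_rate(n_trials=2000, n_per_group=30, alpha=0.05, seed=1):
    """Fraction of simulated trials called 'significant' when both groups
    are drawn from the same distribution (hypothetical illustration)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference in means, assuming known sd = 1.
    se = (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(a) - mean(b)) / se
        p = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p value
        if p < alpha:
            hits += 1
    return hits / n_trials


rate = false_positive_rate()
print(f"Share of 'significant' results with no real difference: {rate:.3f}")
```

The share should land somewhere near 0.05, that is, roughly one trial in 20 is declared “significant” even though nothing is going on. Change `alpha` and the share moves with it, which is the point: the threshold is a choice.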

My proposal

I propose that in all scientific literature authors replace the term “significant” with the phrase “beyond reasonable doubt”, and that they only be allowed to publish the article if, in the methods section, they define what p value they choose to represent “beyond reasonable doubt” and defend why they have chosen this value and not another. “Beyond reasonable doubt” is a term used in the New Zealand judicial system, where those charged with a crime are presumed innocent (the null hypothesis) until proven otherwise. Perhaps those of us in science could learn something from our lawyer friends.

Tagged: Beyond reasonable doubt, medical literature, medicine, p value, reasonable doubt, significance testing

0 Responses to “beyond reasonable doubt: a significant improvement”

  • Good post thanks! I agree with your essential arguments (the over-interpretation of the p-value and the use of an arbitrary threshold), but I’m not sure that using ‘beyond reasonable doubt’ will solve the problem. My concern is that, just as ‘significant’ gets conflated with clinical significance, ‘beyond reasonable doubt’ will be conflated with bias and confounding. In other words, if someone reports findings as ‘beyond reasonable doubt’ readers will interpret that as meaning that they are unlikely to be due to bias and confounding as well. My personal preference is that people report a finding as ‘statistically significant’, not just ‘significant’. Of course, I’d much prefer people to place greater emphasis on looking at confidence intervals rather than just p-values anyway 😉

  • Thanks Simon – great points. Perhaps “beyond reasonable doubt” could be linked not only to the statistical analysis but also to the limitations/potential sources of confounding etc. In medicine, at least, it is rare that a study cohort is identical to that encountered during routine clinical work – to my mind this means that we should be much more cautious in interpreting study results as suggestive of benefits if changes in practice are made.

  • Another problem with “beyond reasonable doubt” is that to most people it signifies a verdict reached by a jury of non-experts, chosen at random. The reliability of the evidence may play a part, but in all likelihood the emotional responses of jurors, both to the crime and to the persuasiveness of the legal representatives, will also influence the outcome. At least “statistically significant” has a specific scientific meaning.