On Levels of Evidence

By Jim McVeagh 01/01/2012

Yesterday, John Roughan’s column in the Herald was about the people of Christchurch and their need to know “the worst they might face”. He draws his story from the reluctance of the GNS to make comments about the possibility of further strong earthquakes even though they knew there was good scientific evidence that there would be further large shocks. While I have plenty of sympathy for the people of Christchurch and much admiration for the way they have stood up to things so far, I do not think that Roughan is correct in his desire to see such scientific information released. I do not think it would be helpful to the long-suffering people of Canterbury.

The problem is that Roughan, like most lay people, does not understand the wide range of levels of evidence available to scientists. In our day-to-day lives, we mostly encounter two types of evidence – established (or empirical) fact and testimonials. We either know something is true or we think it is true because someone (whose opinion/knowledge we respect) told us so. Science uses these types of evidence alongside a third – statistical evidence, which itself comes in multiple levels.

Obviously, experimentally verifiable fact is the highest level of evidence available to science. Unfortunately, many lay-people think that when they hear a scientific expert speak, s/he is speaking from this level of evidence. This is rarely so. Most are talking from some level of statistical evidence. This is because we are usually dealing with very complex systems that are difficult to experiment on.

If we are lucky, we can be making our statements based on level A evidence. This is the level of the randomised, double-blinded clinical trial in medicine. Readers of MacDoctor will know that this level of evidence is by no means the same thing as established fact, but it is good evidence, nevertheless. Sadly, this type of controlled experiment can only be easily performed on closed complex systems such as people and animals. Geological systems are much harder to perform experiments upon.

The commonest level of evidence available in most disciplines is Level B evidence. This would be the equivalent of observational and cohort studies in medicine. The system is too large to easily manipulate and it can only be observed and studied. Readers of MacDoctor will also know that this sort of evidence is very prone to confounding effects where it becomes very difficult to prove that correlation is actually causation. In medicine, conclusions made from this sort of data are then tested with Level A studies, often leading to the opposite conclusion. This option is not available to geologists, climatologists, vulcanologists and other disciplines dealing with large, open-ended stochastic systems. These disciplines rely on accumulation of large amounts of Level B evidence from multiple sources to refine their theories.
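The confounding problem with Level B evidence can be sketched in a few lines of code. This is a toy illustration with invented numbers, not real data: a hidden factor (age) drives both coffee consumption and disease, producing a strong correlation between two things that have no causal link to each other.

```python
import random

random.seed(42)

# Invented toy data: age drives BOTH coffee intake and disease risk.
# Coffee and disease are then correlated with no causal link between them.
n = 10_000
age = [random.uniform(20, 80) for _ in range(n)]
coffee = [a / 20 + random.gauss(0, 1) for a in age]    # older -> more coffee
disease = [a / 10 + random.gauss(0, 1) for a in age]   # older -> more disease

def corr(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(f"coffee vs disease correlation: {corr(coffee, disease):.2f}")
```

An observational (Level B) study sees only the strong coffee–disease correlation; it is the Level A trial – randomising away the age difference – that reveals the correlation is not causation.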

The “lowest” level of evidence is again one that a lay-person would be more familiar with, the level of experience and expert opinion. While this type of evidence is indeed extremely useful, it is not the type of evidence that we should be formulating major policies upon. This is often really the only available information for unique disaster situations – not necessarily a bad thing in the initial stages of a disaster, but a real problem when it comes to long-term planning.

The people at GNS would not have concerned themselves much with the public’s right to know. After all, they are scientists, and the scientist’s raison d’être is to impart knowledge. No, the question they would have wrestled with is “does the level of certainty of this information warrant the possible consequences of releasing it?”. Like John Roughan, most people will treat this information like any other “fact” they know and accept it as “truth”. Unfortunately, a 90% certainty of another magnitude 6 quake in the next 6 months does not tell us a lot if the quake’s epicentre is 100 km deep and 50 km out to sea.

It is exactly this sort of uncertainty that makes it so difficult for experts to speak out; especially experts with good credibility, whose word will therefore be easily accepted by the public as gospel. What if releasing this information had many thousands more leaving the city of Christchurch, leading to a complete collapse of Christchurch’s economic structure? What if it then turned out that the quakes either did not occur or were trivial? The consequences here are very different from those of a false tsunami warning, which is merely an inconvenience.

Alternatively, the geologists could try to couch the information in dozens of caveats. This would sound to the general public as though the experts don’t really know anything, or as though they are dismissing the quakes as unlikely (people hear what they want to hear unless the message is emphatic). What if there were then a quake in which there was loss of life? Who would then be “to blame”?

Roughan would like the people of Christchurch to cease being in the limbo of uncertainty. The GNS would like that too. Unfortunately, despite good, educated opinion and evidence, they cannot provide the sort of certainty that Roughan is hoping for. Nature may be susceptible to analysis on a wide scale but is wildly unpredictable in any one locality.
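The contrast between wide-scale predictability and local unpredictability can be illustrated with a toy simulation (the probability and the number of regions are invented for the sketch, not real seismology):

```python
import random

random.seed(0)

# Toy model: each region independently has a 10% chance of a large
# shock in a given window. The aggregate count across many regions is
# tightly predictable; any single region remains a roll of the dice.
p, regions = 0.10, 1000
count = sum(1 for _ in range(regions) if random.random() < p)
print(f"large shocks across {regions} regions: {count} (expected ~{int(p * regions)})")
# For one specific region, the best science can honestly say is
# "probability 0.10" - the individual outcome itself cannot be predicted.
```

The total lands near 100 almost every run, yet that gives the residents of any one region no certainty at all about their own patch of ground.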

That is the only real fact.


Responses to “On Levels of Evidence”

  • Your overall message about ‘levels of evidence’ is more-or-less fine, but a couple of loose suggestions/thoughts:

    “Sadly, this type of controlled experiment can only be easily performed on closed complex systems such as people and animals.”

    Surely what is being tested here are not (truly) closed systems. It seems to me that if they were closed systems it would indeed be trivial; isn’t what controlled trials are trying to do to control for all the ‘other’ things, i.e. not just the patient but all the variables that might affect their disease status – their diets, environmental factors, and so on – ?

    While you do get to what Berryman indicated as their concern – uncertainty over the location of a subsequent magnitude 6+ event – my reading is that you have linked two different things. Doesn’t this have more to do with the output of a complex system than ‘levels of evidence’ – ? A difference here is that they would be placed in the position of trying to predict the outcome of one particular complex system, rather than state what is generally true. Sticking my neck out, an analogy might be predicting the outcome of a treatment that is known to have different outcomes in some patients; while you might be able to confidently say what is typically the case, you would struggle to say what might be true for each individual patient.

  • You are right about “closed” systems. Perhaps I should have said “systems for which external variables can be controlled for to a large extent” – but it does not have a very nice ring to it! 🙂

    I get your point about outcome and levels of evidence, but I would contend that our models for these complex systems are an attempt to predict the outcomes of said systems. All models require input from our evidence base to construct. The weaker the level of evidence, the poorer the model, so outcome and evidence are strongly related.

    If geologists could gather sufficient data to account for all the significant variables, it may be that they could produce a model that made local predictions with sufficient accuracy. I suspect this will not eventuate any time soon.

  • “I would contend that our models for these complex systems are an attempt to predict the outcomes of said systems.”

    I said that! 🙂 (See “trying to predict the outcome of one particular complex system”.)

    With that in mind, I think you’re doing the rather exasperating thing of saying the same as the other person while at the same time not seeing how you’re getting it wrong.

    They’re effectively being asked to predict the outcome of a single case, not to report what’s typical (which is what trials yield). The two are fundamentally different. The problems of modelling a complex system are more than, and separate from, the issue of ‘levels of evidence’. You can know a good deal about the general properties of a system, or its individual components, but still (i.e. inherently) not be able to make definitive statements about single cases, especially if the mechanisms (something trials generally don’t reveal) are many and interact in complex ways that differ in subtle ways.

    This issue with complex systems modelling is very important. It affects biological modelling, too.

  • Excuse my third paragraph – meant to edit the post further – but it is frustrating when people repeat what you’ve said back at you, not realising what they’re missing!

  • Sorry to exasperate you, Grant…. 🙂 I suspect we are actually in agreement but talking from opposite ends of a problem.

    I accept that there is more to modelling than the single issue of how good your data is. But sooner or later it all comes back to the verifiability of your models. At an observational level, your natural outcomes form as much a part of your data as a specific trial. After all, what you are trying to do is feed this information back into your model – to try to gain an understanding of the underlying mechanisms in action – and then alter your model to a more accurate version.

  • On re-reading, I found your article confusing, as your term “level of evidence” appeared to me to switch mid-article from “levels of evidence” for inputs to confidence estimates in outputs – models use both.

    Having said that, I’m not quite convinced that you get what I’m saying about modelling complex systems (!), as your reply goes further afield, writing about feedback to try to derive mechanisms, and about verifiability, neither of which has anything to do with what I was pointing out.

    I was trying to point out that with complex systems you may inherently be unable to model the state of individual elements within a system, because of a mix of the complexity of the system and uncertainty (however slight) – i.e. nothing to do with input “levels of evidence”, but a reflection of the nature of complex systems in and of themselves.