The causal bind

If you hang around social media debates about education for long enough then it is inevitable that you will be drawn into a discussion of evidence.

These arguments follow a familiar path. The role of randomised controlled trials – or other quantitative studies – will be questioned. The analogy of education and medicine will be challenged. On this basis, the contention will be advanced that we need to accept a wide range of different types of evidence. This will, incidentally, allow us to now include evidence for approaches that were previously lacking in support.

Not so different

First of all, the fields of medicine and education are not so different, particularly when you include the arguments around alternative medicine. Granted, the aims of medicine are much less contentious: people get better or they don’t, whereas an educational approach might fail to improve students’ understanding of maths but might be advanced with arguments that it will make them more motivated or develop some other quality.

However, giving a drug to a group of people is similar to giving a group of students a certain type of instruction. The pill will have differential effects on individuals due to body mass, genetics, diet, lifestyle, psychology and many other factors, just as the instruction will affect individuals differently. And so we look for average effects and try to uncover broad principles.
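To make the idea of an average effect concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the function name `simulate_trial`, the test-score scale, and the made-up numbers (a mean benefit of 5 points with a wide spread across individuals). It is not drawn from any real study. It only shows that a randomised comparison can recover a broad average even when the intervention helps some students and harms others.

```python
import random
import statistics


def simulate_trial(n_students=20000, mean_effect=5.0, effect_spread=10.0, seed=0):
    """Simulate one randomised trial with individually varying effects.

    All numbers are hypothetical. Each student has their own baseline score
    and their own personal response to the instruction.
    Returns (estimated_average_effect, share_of_treated_students_harmed).
    """
    rng = random.Random(seed)
    treated_scores, control_scores = [], []
    harmed = 0
    for _ in range(n_students):
        baseline = rng.gauss(50, 15)  # score without the instruction
        individual_effect = rng.gauss(mean_effect, effect_spread)
        if rng.random() < 0.5:  # coin-flip assignment to the treatment group
            treated_scores.append(baseline + individual_effect)
            if individual_effect < 0:
                harmed += 1
        else:
            control_scores.append(baseline)
    estimate = statistics.mean(treated_scores) - statistics.mean(control_scores)
    return estimate, harmed / len(treated_scores)


if __name__ == "__main__":
    estimate, share_harmed = simulate_trial()
    print(f"estimated average effect: {estimate:.2f}")
    print(f"share of treated students with a negative effect: {share_harmed:.2f}")
```

Under these assumptions, a sizeable minority of students do worse with the instruction, yet the difference in group means still lands close to the average benefit built into the simulation. Individual variation does not prevent us from estimating a general effect.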

In fact, alternative medical practitioners gripe against evidence-based medicine on much the same grounds that some people gripe against evidence-based education: it fails to sufficiently recognise individual differences and contexts and it doesn’t address the whole patient. Still, if I were ill I know where I would turn.

And medicine has already rehearsed the arguments of education, seriously and in some detail. Medicine is the tragedy to our farce. Faced with calls for alternatives to randomised trials in medicine, the medical blogger David Colquhoun notes that most scientists are unaware that there is even a debate to be had because the role of randomisation has long been resolved:

“Despite this, there is a body of philosophers who dispute it. And of course it is disputed by almost all practitioners of alternative medicine (because their treatments usually fail the tests).”

The bind

You might be tempted to draw the conclusion that I consider randomised controlled trials as constituting the best kind of evidence with other forms of evidence being somehow less good. But this would be to commit a category error. Types of evidence are neither good nor bad. We need to focus instead on the quality of the inferences that we draw from them.

For instance, if a politician were to claim that, “no children are subject to restraint techniques in government-run care homes,” then just one well-documented case study could refute this. Conversely, a randomised trial would be an impractical and wholly unsuitable way of trying to address the question. The type of evidence needs to suit the question you are asking.

Well-designed randomised trials are particularly good at investigating causal claims: if I do x then y is more likely to happen. Examples might include the hypothesis that giving a patient an antiviral is likely to reduce the length of a bout of flu, or that giving students problem-based learning will improve their higher-order thinking skills.

And this is the bind. If you make causal claims then we should be able to test them with randomised trials. If you really think all children are individual, all teaching interactions are about relationships and are entirely mediated by context then that’s fine but it doesn’t fit with going into a school and encouraging greater use of drama activities or iPads or teaching like a highwayman or whatever. If you are making claims about these teaching approaches then you are doing so because you believe that they will have desirable effects: you are making causal claims whether you state these explicitly or not. The burden is then on you to supply evidence for those claims. And the form of evidence best suited to evaluating causal claims is randomised trials.

Nobody is imposing such trials on you. Nobody is suggesting that all children are the same. It is you who is implying a positive effect and it is this that is testable. Without this evidence, your views are speculative.

Maybe you have evidence from experience, testimonials or case studies. But these aren’t anywhere near as strong for establishing causal links as randomised trials because they are subject to so much potential bias. If the effect is real then a well-run randomised trial will demonstrate it. Let me turn the point around and rephrase it as a question: If the advantages of an approach are so ephemeral that they don’t show up in a trial then can we have any confidence at all in harnessing them in our classrooms?
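The kind of bias at issue can be illustrated with a small simulation sketch. The scenario is entirely hypothetical: I assume an approach with a small true benefit (`TRUE_EFFECT`, set to 2 points) and I assume that more skilled teachers are more likely to volunteer to adopt it. The function names and all of the numbers are invented for illustration. The sketch compares what a testimonial-style comparison of adopters against non-adopters would suggest with what random assignment would suggest.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # hypothetical true benefit of the approach, in test points


def class_outcome(teacher_skill, uses_approach, rng):
    """Average class score: driven mostly by teacher skill, plus noise."""
    benefit = TRUE_EFFECT if uses_approach else 0.0
    return 50 + 10 * teacher_skill + benefit + rng.gauss(0, 5)


def observational_estimate(n_teachers, rng):
    """Compare teachers who chose the approach with those who did not."""
    adopters, others = [], []
    for _ in range(n_teachers):
        skill = rng.gauss(0, 1)
        # Assumption: more skilled teachers are more likely to volunteer.
        adopts = skill + rng.gauss(0, 1) > 0
        (adopters if adopts else others).append(class_outcome(skill, adopts, rng))
    return statistics.mean(adopters) - statistics.mean(others)


def randomised_estimate(n_teachers, rng):
    """Assign the approach by coin flip, independent of teacher skill."""
    treated, control = [], []
    for _ in range(n_teachers):
        skill = rng.gauss(0, 1)
        assigned = rng.random() < 0.5
        (treated if assigned else control).append(class_outcome(skill, assigned, rng))
    return statistics.mean(treated) - statistics.mean(control)


if __name__ == "__main__":
    rng = random.Random(1)
    print(f"true effect:            {TRUE_EFFECT:.1f}")
    print(f"observational estimate: {observational_estimate(20000, rng):.1f}")
    print(f"randomised estimate:    {randomised_estimate(20000, rng):.1f}")
```

In this made-up world the self-selected comparison credits the approach with the skill of the teachers who chose it, so it overstates the benefit several times over, while the randomised comparison stays close to the true value. Randomisation breaks the link between who receives the approach and everything else that affects the outcome.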

Of course, there is more to life than simply running trials. They can be poorly designed – and many are. They need to be replicated. And I would always look for triangulation. Longer correlational studies that triangulate with the findings of smaller trials can give us confidence in the mechanisms we have uncovered. We also need a plausible theoretical mechanism for the causal claim or it becomes hard to interpret data or make suggestions about how we might apply or extend the findings. But there is no escape from the fact that causal claims are testable.


Sleight of hand

The education consultant who sells us an approach to instruction on the basis that it is superior but who, when asked for evidence, seeks to muddy the waters, is performing a sleight of hand. It is the same trick attempted by homeopaths and chiropractors. We have not imposed randomised trials on them. After all, we have not asked them to make causal claims. They have done that all by themselves. So it is they who should supply good evidence to support these claims.


9 Comments on “The causal bind”

  1. Mike says:

    Interesting post.

    I’ve come to the conclusion over the years that the phrase “Studies have shown that…” or “The evidence is that…” can usually be translated as “You are about to be sold a furphy.” It tends to be a method of avoiding any proper debate about the topic in question by invoking The Science (blessings be unto it), however problematic the concept of “science” might be in areas such as education.

    You’re right to compare the various snake-oil merchants in education to the homeopaths and other quacks that Ben Goldacre (among others) writes about so trenchantly. But there is another dimension to this which I think needs to be mentioned:

    “However, giving a drug to a group of people is similar to giving a group of students a certain type of instruction.”

    Yes, but with one big difference.

    Proper RCTs in medicine depend on double-blinding. I simply don’t see how this is possible in education, and this is what makes virtually all “studies” on different teaching methods problematic in my view. A teacher just can’t be made unaware of whether s/he is instituting the new intervention or the “placebo” – it compromises the data from the very start. (That’s even if confirmation bias hasn’t been built into the study from the very start, as it seems to be in most cases.)

    This is why, in my view, diachronic studies of teaching methods which are already in practice at different schools, as part of the school culture, are really the only way of collecting genuinely relevant data on effectiveness. And this, of course, is fraught with other problems.

    • gregashman says:

      You are right that it is difficult (although not always impossible) to blind education trials. This is not fatal in my opinion, it just means that we need to look for larger effects that cannot be accounted for by expectation effects. After all, feeling positive about algebra can only get you so far.

      I am not convinced that the studies you refer to overcome this problem. They also introduce other possible sources of bias and confounds. However, they are important for triangulating experimental evidence.

      • Stan says:

        Psychologists are smart enough to get around this issue. You simply make both groups, control and experimental, experience a common change. For example, both groups could have the lighting changed and a different teacher, and be told that this is the experiment. They must be blind to whether they are in the control group or not. However, one group also gets the additional change being tested. After the experiment all parties get full disclosure; no one is kept in the dark.

        As another example, cited on the jumpmath site, teachers in both control and experimental groups were given additional training, so both may have expectations that they will see improvement. To keep it really blind they would need to be misled into believing that the training was the intervention and be unaware that some were given different training.

        This is time consuming but the alternative is to roll out new methods with no basis for knowing their efficacy. Think of all the hours and dollars spent on learning styles or PBL.

  2. Nick says:

    You hit the nail on the head here. It is not about what can work sometimes but what works more consistently. I can’t tell you how often I see an educationalist in Alberta tweet about the science of climate change against some denialist but then, moments later when discussing other research, tweet the equivalent of “but I’ve seen it snow in my classroom… in July! And that proves you are wrong!”

  3. Ben says:

    Do you think it is at all possible that many of the reformers are basing the push for reform on experience with ineffective teachers and not necessarily ineffective methods?

    An ineffective teacher will be ineffective no matter what method, but when I grow up and become a teacher myself I don’t blame my past teacher, I blame the method in which I was instructed. Therefore I must do something different and become an education reformer. I rationalize this to myself, saying, “this isn’t how I was taught, therefore I must be successful.”

    Do you think this is what might be happening with various ed reformers? In full disclosure I have found myself guilty of thinking like that and am trying to change my ways.

  4. While I agree with your general drift, Greg, I disagree with one point and would emphasise a couple of others.

    I disagree with the popular decoupling of evidence and inference. Data and inference, maybe, but evidence is always evidence *of* something – it is necessarily attached to an inference. The same goes for the popular view that what is valid is not an assessment but the inference that is based on the assessment. We do not set assessments without a clear idea of what inferences we intend to make on the basis of that assessment – we always (or at least should) test *something* – so I don’t think the decoupling is valid.

    First point that I would emphasise is, given the number of different variables in education (rather more than medicine, at least in respect of the way that an intervention is applied), and the clustering of classes and schools, we need a great deal of data in order to randomize effectively. This we do not generally have – and is why I see the only practical way that we will move to a more evidence-based approach to education is through the application of effective forms of digital technology (something that attempts to date to implement edtech have failed lamentably to do) – as this has the potential, not only to provide effective instructional activities and rapid feedback in many areas where mechanical feedback is perfectly adequate, but it also promises to harvest data automatically.

    Second point is to emphasise what you say about the variability and uncertainty of educational aims. This is why we need to revisit the attempt to describe our educational objectives clearly. We spent the last 25 years trying to do this with criterion referencing, which failed because it was poorly conceived and implemented. But the fact that we did something badly is not a reason why we cannot do it well. This is a subject I am currently blogging about at some length.


  5. suecowley says:

    My friend runs a large CTU (clinical trials unit) and there is a big difference between the way that her team recruits subjects for and runs RCTs and the way that an RCT could be done in an educational setting of any kind.

    To say that “giving a drug to a group of people is similar to giving a group of students a certain type of instruction” is incorrect. The ‘group of people’ in a medical RCT will have been carefully and painstakingly recruited with a specific protocol in mind, based on the medical intervention being tested. The recruitment part of the process is (I am told) extremely difficult, and must be carefully designed in order to avoid introducing bias into the RCT from the outset. The patients will often be recruited on the basis of some kind of pre-existing medical condition (e.g. moderate essential hypertension, diabetes) before being put into the randomised groupings. You might find this link useful in understanding more:

    Unless you are recruiting a group of children with a specific, identifiable condition, and working with them in highly controlled conditions (i.e. something similar to a multi-million-pound CTU), then saying the two are directly comparable is to misunderstand how medical RCTs work.

    • gregashman says:

      I don’t think this makes any material difference to my argument. If you want to see if a drug is effective against IBS, for instance, then of course you would select people with IBS, and this is similar, as you point out, to selecting a group of students with a particular reading difficulty. But if you wanted to check the efficacy of a vaccine then you would want to sample from the whole population, just as you would with a general teaching approach. How does the selection of sub-populations affect the suitability of RCTs? Do you think your argument means that they are suitable for medical interventions but not for educational ones? If so, why?

