The truth about teaching methods

There are those who would disagree that there is even such a thing as a specifiable teaching method. And there are those who would deny us the language with which to compare methods. But setting such defensive obfuscation aside, how can we decide which teaching approach is the best one to use in a given situation?

Effect Sizes

For John Hattie, the answer has been to compare effect sizes from different kinds of intervention. An effect size is a way of standardising an effect so that it can be compared across different studies. It is essentially the difference between the mean scores of two comparison groups, divided by the spread of the data (the standard deviation). The problem is that Hattie arguably applies this idea too broadly. Can we really compare an effect size from a before-versus-after study with one from a control-versus-intervention study? If we narrow the student population, e.g. by studying a high-ability group, then we narrow the spread of results and so inflate the effect size. If we use standardised tests designed under psychometric principles then we will typically see less difference between groups, and therefore a smaller effect size – this is a consequence of the way that standardised tests are constructed.
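To make the range-restriction point concrete, here is a minimal sketch (my illustration, not Hattie's, with made-up scores) of the standard effect-size calculation – Cohen's d – showing how the very same 5-point gain produces a much larger effect size once the sample is narrowed to a high-ability band:

```python
import statistics

def cohens_d(treatment, control):
    """Standardised mean difference: (difference in means) / pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t = statistics.variance(treatment)  # sample variance
    var_c = statistics.variance(control)
    pooled_sd = (((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# A fixed 5-point gain measured across a wide-ability group...
control = [50, 55, 60, 65, 70, 75, 80]
treatment = [score + 5 for score in control]
d_full = cohens_d(treatment, control)  # ≈ 0.46

# ...looks far more impressive when the sample is restricted to a
# narrow high-ability band, because the standard deviation shrinks.
narrow_control = [70, 72, 74, 76, 78]
narrow_treatment = [score + 5 for score in narrow_control]
d_restricted = cohens_d(narrow_treatment, narrow_control)  # ≈ 1.58
```

Nothing about the intervention has changed between the two calculations – only the spread of the sample – which is why effect sizes drawn from studies of restricted populations are not directly comparable with those from broad ones.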

In Hattie’s 2009 book, all types of test are treated equally. This means that trials where a new approach has been implemented by enthusiastic teachers and then compared to a do-nothing control group are lumped in with much more rigorous trials. For instance, the worked-example effect has been tested in randomised controlled trials. Is its effect size of 0.57 really comparable to similar effect sizes from poorly controlled trials? I don’t think so. Perhaps the answer is to agree to only look at randomised controlled trials?

Randomized controlled trials

There has certainly been a push for more randomised controlled trials (RCTs) in education. Ben Goldacre and others have made a strong case in the UK, and the Education Endowment Foundation (EEF) is leading the charge, funding a whole range of different studies. However, RCTs are not without their problems.

Firstly, in medicine, RCTs are ‘blinded’. This is where, typically, one group is given a treatment whilst another is given a placebo. The patients and researchers do not know who is in each group. The purpose is to eliminate the placebo effect where simply knowing that you are being treated can lead to favourable outcomes. It is often quite impossible to blind an educational trial; most students will know if they are receiving something new and funky instead of business-as-usual. We therefore have to factor in the possibility of a placebo effect in whatever we find.

But it is also possible to poorly design an RCT by varying a whole bunch of factors at once. I recently wrote about such an RCT evaluating a scale-up of Reading Recovery. In this case, the differences between the control and intervention groups were multiple and it is impossible to tell whether it is the specific Reading Recovery practices that caused the effect.

In my post on the research, I asked if other studies had been conducted on Reading Recovery that were better controlled. One person linked me to this paper where Reading Recovery was compared (amongst other conditions) to a strange version of direct instruction where the students hardly did any reading. If you have access, it is worth reading the full paper, particularly for its description of the Reading Recovery teaching method:

In this example, Dana is reading Nick’s Glasses (Cachemaille, 1982), an 8-page illustrated book about a boy who cannot find his glasses because he is wearing them. The text on page 6 says, “‘Have you looked behind the TV?’ said Peter.”

Dana read, “Have you looked under the….” She hesitated, glanced at the picture (which did not provide the needed information), and searched the line of print. Then she started over, “‘Have you looked behind the TV?’ said Peter.”

At the end of the page, her teacher quickly said, “I like the way you were checking carefully on that page. Show me the tricky part.” Dana pointed to the word behind, saying, “It had a b.” “Yes,” said the teacher, “Under would have made sense. He could have looked under the TV, but that word couldn’t be under. I also like the way you checked the picture, but that didn’t help enough, did it? You were really smart to use the first letter; read it again fast and be sure that it makes sense.”

Dana read the page again fluently, saying, “That’s right.” In this example, the teacher was pointing out to Dana how she effectively used several different sources of information simultaneously to monitor her own reading.

This seems like a poor method. Encouraging students to guess words from the pictures is problematic because it won’t help them to read books that lack pictures. Phonics should be a first resort, not a half-hearted last resort when guessing fails. In this instance, phonics was only employed in relation to the first letter of the word rather than for decoding the whole word.

This makes me even more skeptical that the recent positive result from an RCT was due to the specific Reading Recovery methods.

Process-product research

An overlooked body of research evidence is the wealth of process–product research spanning the 1950s through to the early 1980s. This research investigated correlations between specific teaching behaviours and gains in student knowledge and understanding. You can see why it has been largely replaced by experiments and quasi-experiments in which factors can be systematically varied. However, I think it is still important and highly suggestive of which approaches are the more effective.

Barak Rosenshine looked into this research and derived principles of ‘direct instruction’. I also like Greg Yates’ discussion in his “How Obvious” paper. If you have the time, it is worth reading Thomas Good and Jere Brophy’s summary of the research, whilst mindful of Good’s warning not to see the findings as a checklist or observation tool.


So, it can be hard to evaluate approaches in education. The common methods have obvious flaws. However, I am not a postmodernist. I believe that people are more similar than different in the way that they learn and that we should ultimately be able to find some good general principles on which to base our decisions.

In the medium to long term, we should find ways to encourage knowledge building through properly controlled, randomised trials. The formation of the EEF in the UK is a good sign. As teachers, we may wish to involve ourselves in such research when we undertake Masters and PhD study. University education departments should focus more on this sort of research and less on ideological opinion papers or woefully flawed trials. I am hopeful.

However, we should not overlook what we already know. Despite the flaws, as Rosenshine notes, investigations of educational practice using quite diverse approaches all seem to converge on quite similar findings: the importance of teacher clarity and explicit instruction, the value of academic time-on-task, and the role of practice and testing. There is enough to be going on with.


20 thoughts on “The truth about teaching methods”

  1. Dylan Wiliam says:

    Chapter 3 of my next book—Leadership for teacher learning—explains why meta-analysis is almost useless in education. It should be published by September…

    • Sounds right up my alley! I have been hoping you would further enter the teacher learning space, after reading your 2014 papers. My PhD is on teacher learning and its leadership, so I’ll try to get a copy of your new book before submission.

    • Thank you! I can’t wait to read it and have others read it as well. As a biologist turned educator, the prevalence of meta-studies in education has always bothered me. And I have called John Hattie’s research “findings” meaningless and irresponsible, since they broadly categorize practices based on disparate research models and data sources.

      September can’t come fast enough! 🙂

  2. As all methods have limitations, the best approach is to look for converging evidence from the more rigorous approaches described above. This certainly does not lead to a completely relativist position on teaching methods, as key factors recur across methods. Nor does it imply a free-for-all relativism, as it is still easy to discern differences in rigour which can be taken into account when summarising findings. It does imply that researcher/practitioner judgement is essential when reading quantitative studies – numbers don’t remove the need for it.

  3. Pingback: Teaching Methods | Learning and Teaching Ideas

  4. The high-profile, highly funded, Education Endowment Foundation in England arguably has limitations of ‘not joining the dots’ in the English context.

    As you know, Greg, there was a Government match-funded phonics initiative from 2011 to 2013 and various phonics programmes, sets of books, resources and training were scrutinised according to a strict research-informed ‘core criteria’. A number of phonics programmes, having managed to get through the tendering process and scrutiny and, in effect, benefiting from this publicly funded official initiative, should surely have been the obvious programmes to be objectively researched in their own right on the level of national interest. Various programmes which happened to be included in the match-funded catalogue during that period may well be involved in various EEF projects but I’m unaware of any initiative to look more closely at all the publicly-funded phonics programmes in their entirety. Instead, various concoctions and clones of programmes and practices appear to be taking precedence because they are school-generated. I’m not privy to any of the EEF projects but I have attempted to generate conversations with the EEF regarding this issue (logic based on national accountability and joining the dots) of researching the match-funded phonics programme but this perspective seems to have no place under the terms and conditions of the EEF. I got short shrift.

    You also mention Reading Recovery specifically – and the main reasons for recurring criticism of RR are its in-house research, the lack of fair comparisons with other programmes and practices, and its huge expense, which raises the question of whether, even if it did get good results, the same results could be achieved or bettered at much less expense. And, of course, everyone knows that the multi-cueing teaching principles underpinning RR have long since been discredited, but the programme is so entrenched in ‘establishments’ that change may never be possible. Would it not also be a ‘join the dots’ move to properly compare RR with the known main research-informed phonics programmes? These programmes are also ‘intervention’ programmes, and their authors can easily guide as to how best to use them for intervention purposes. This is needed in all the English-speaking countries, not just England. The contradictions in official guidance and actions beggar belief, and teachers continue to get mixed messages about ‘what to teach’ – and it is the weakest and slowest-to-learn children who are at the greatest risk from this lack of transparency and RR’s entrenchment.

    • The paper that describes the RR teaching methods is from 1994. So it describes RR as it was 21 years ago, not as it is now. I know from my own teaching career that phonics teaching 21 years ago would not look the same as it does now.

      • OK. That’s a fair point. But this was presented to me as a fair test of RR against competing approaches (which it probably isn’t). Would love to see a more recent test pitting up-to-date RR against a different intervention.

  5. We have already called upon your postings, Greg, in our thread below with the theme of what can we believe about research on Reading Recovery – and I’ve just added a post to show the level of funding for RR in Ohio – it is mind-blowing. So, if it is the case that we can use teaching methods which are better for the slower-to-learn children, how does one ever transcend the enormity of such an established programme? Is it possible or are we all just wasting our time trying to get to the bottom of this educationally and scientifically – transparently?

  6. MaggieD says:

    With regard to RR, Greg, you may be interested in this piece in the RRF Newsletter Spring 2007. It has a little bit about their teaching methods.

    At that time they had managed to secure government funding to roll out RR nationally, even though its methods completely contravened the structured, systematic phonics work set out in the Rose Report for all stages of the initial teaching and remediation of reading.

  7. Attempting to compare “teaching methods” is tough for methodological reasons. For one thing, teachers who say and believe they are using a given “teaching method” are actually providing very different instruction. So even with well designed and executed comparison experiments, one finds as much or more variability within methods as between methods. This occurs no matter how carefully the method is defined and how well the teachers are trained in the method.

    What does have operational integrity is the products and protocols that constitute an instructional programme. In comparing the accomplishment of instructional programmes, the instruction is being tested, not the students or the teachers. Products and protocols can be readily modified; people not so much.

    The enterprise of schooling has all of the elements required for Natural Experiments to investigate the black box of instruction. The situation in England that Debbie Hepplewhite refers to is a case in point. The screening check administered to all children at the end of Yr 1, and to re-screened children at the end of Yr 2, constitutes a psychometrically sound dependent variable for comparing instructional programmes. Both the instruction and the measurement have already been paid for, and the “treatment” has already been provided, so the experiment would entail no additional cost or intrusion. All that’s needed is to ask teachers and schools what programme(s) they are using. It’s a bit more complicated than that, but not much.

  8. Dr No says:

    There should be a moratorium on all education research for the next five years, and then we could look back and see what difference that decision has made. It will at least determine the real value of university education departments (if any).

  9. Paul says:

    Interesting suggestion, Dr. No. I bet the world will just keep on revolving. Just like all research, maybe even the whole service industry. Politicians we can also do away with. Plus an additional problem: compare with what? Point being: it’s a completely empty statement except that it conveys your opinion about said departments.

  10. See below for the four-quadrant graphic that I drew up recently. It attempts to illustrate that even when teachers in England all purport to do ‘Systematic Synthetic Phonics’, what this looks like can be very different indeed school to school.

    I think the point we have reached in England is that the level of detail and reality needs to be observed, noted and fully understood. To date, we do not have commonality of professional understanding in reading instruction in our schools – and then ‘intervention’ may still be dominated by either weak phonics, or alternative phonics provision (not necessarily using the main phonics programme) or an intervention based on multi-cueing reading strategies such as Reading Recovery.

    So, we need to know the details of schools’ phonics provision – and it is long overdue that we know the details of the teaching principles underpinning Reading Recovery in England. I keep hearing ‘indications’ that it has ‘changed’. Well then – tell us how it has changed and provide the documentation to show this – and allow people to look at the content of the year-long RR masters course. Then we might understand what is going on. We’re not going to take any RR person’s ‘word for it’ – that is not sufficient. Be specific.

    The Simple View of Schools’ Phonics Provision:

    Click to access Simple%20View%20of%20Schools.pdf

  11. Pamela Snow says:

    “Firstly, in medicine, RCTs are ‘blinded’. This is where, typically, one group is given a treatment whilst another is given a placebo. The patients and researchers do not know who is in each group. The purpose is to eliminate the placebo effect where simply knowing that you are being treated can lead to favourable outcomes. It is often quite impossible to blind an educational trial; most students will know if they are receiving something new and funky instead of business-as-usual. We therefore have to factor in the possibility of a placebo effect in whatever we find”.

    Some points of clarification if I may:

    1. It’s important to note that it’s not always possible to blind in medical RCTs – consider for example a trial in which one study arm is receiving a medical management for a condition, and the other is receiving a surgical management – both patients and researchers know who is who (though it is arguably possible to blind the research assistant who collects self-report follow-up data…but this is fraught). Hence in some such studies, the extent to which blinding was actually achieved is assessed.

    2. Not all medical RCTs compare the “new” intervention with placebo – in fact usual practice is to compare with standard existing treatment – often for good ethical reasons.

    3. A “placebo effect” is not simply nuisance-noise. It is an established mechanism by which humans derive some of the benefit of an intervention. It means “I please” (as opposed to “nocebo” – I don’t please) and typically accounts for some 30% of reported benefit in trials where subjective (self-report) measures are taken. There’s not a placebo effect on a dependent variable such as reading accuracy, but there may be on a related variable such as confidence or self-efficacy.

    4. It is possible to use limited blinding in educational RCTs, e.g. by having Research Assistants collect data without knowing which students/class/teacher/school etc is in which study arm (though of course leaks can occur – see point 1 above). Also, it is best practice for data analysts to be given a file that does not identify which arm is intervention Vs control – to reduce the risk of smoke and mirrors statistical analysis by someone trying to torture data until it admits to an effect.

  12. Pingback: Academic educationalists need to engage with the new education debate or risk being sidelined | Education: the sacred and the profane

  13. Pingback: Hoodwinked by flapdoodle | Filling the pail
