The truth about teaching methods

There are those who would disagree that there is even such a thing as a specifiable teaching method. And there are those who would deny us the language with which to compare methods. But setting such defensive obfuscation aside, how can we decide which teaching approach is the best one to use in a given situation?

Effect Sizes

For John Hattie, the answer has been to compare effect sizes from different kinds of intervention. An effect size is a way of standardising an effect across different studies: it is the difference between the mean scores of two comparison groups divided by the spread of the data (the standard deviation). The problem is that Hattie perhaps applies this idea too generally. Can we really compare an effect size from a before-versus-after study with one from a control-versus-intervention study? If we narrow the student population, e.g. by studying a high-ability group, then we narrow the spread of results and so inflate the effect size. If we use standardised tests designed under psychometric principles then we will typically see less difference between groups and therefore a smaller effect size – this is a consequence of the way that standardised tests are constructed.
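The arithmetic, and the range-restriction problem, can be sketched in a few lines. This is a minimal illustration using invented numbers, not data from any real study; `cohens_d` is my own helper implementing the standard pooled-SD formula:

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardised mean difference: (mean_a - mean_b) / pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return mean_diff / pooled_sd

# The same 5-point raw gain looks far bigger when the spread of scores is narrower.
broad_control = [55, 60, 65, 70, 75, 80, 85, 90]
broad_intervention = [60, 65, 70, 75, 80, 85, 90, 95]  # each score +5
narrow_control = [68, 70, 72, 74]                      # restricted range
narrow_intervention = [73, 75, 77, 79]                 # same +5 gain

print(cohens_d(broad_intervention, broad_control))    # ~0.41
print(cohens_d(narrow_intervention, narrow_control))  # ~1.94
```

The raw improvement is identical in both comparisons; only the spread of the population differs, yet the effect size roughly quintuples.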

In Hattie’s 2009 book, all types of test are treated equally. This means that trials where a new approach has been implemented by enthusiastic teachers and then compared to a do-nothing control group are lumped in with much more rigorous trials. For instance, the worked-example effect has been tested in randomised controlled trials. Is its effect size of 0.57 really comparable to similar effect sizes from poorly controlled trials? I don’t think so. Perhaps the answer is to agree to only look at randomised controlled trials?

Randomized controlled trials

There has certainly been a push for more randomised controlled trials (RCTs) in education. Ben Goldacre and others have made a strong case in the UK, and the Education Endowment Foundation (EEF) is leading the charge, funding a whole range of different studies. However, RCTs are not without their problems.

Firstly, in medicine, RCTs are ‘blinded’. This is where, typically, one group is given a treatment whilst another is given a placebo. The patients and researchers do not know who is in each group. The purpose is to eliminate the placebo effect where simply knowing that you are being treated can lead to favourable outcomes. It is often quite impossible to blind an educational trial; most students will know if they are receiving something new and funky instead of business-as-usual. We therefore have to factor in the possibility of a placebo effect in whatever we find.

But it is also possible to design an RCT poorly by varying a whole bunch of factors at once. I recently wrote about such an RCT evaluating a scale-up of Reading Recovery. In this case, the differences between the control and intervention groups were multiple, and it is impossible to tell whether it was the specific Reading Recovery practices that caused the effect.

In my post on the research, I asked if other studies had been conducted on Reading Recovery that were better controlled. One person linked me to this paper where Reading Recovery was compared (amongst other conditions) to a strange version of direct instruction where the students hardly did any reading. If you have access, it is worth reading the full paper, particularly for its description of the Reading Recovery teaching method:

In this example, Dana is reading Nick’s Glasses (Cachemaille, 1982), an 8-page illustrated book about a boy who cannot find his glasses because he is wearing them. The text on page 6 says, “‘Have you looked behind the TV?’ said Peter.”

Dana read, “Have you looked under the….” She hesitated, glanced at the picture (which did not provide the needed information), and searched the line of print. Then she started over, ” ‘Have you looked behind the TV?’ said Peter.”

At the end of the page, her teacher quickly said, “I like the way you were checking carefully on that page. Show me the tricky part.” Dana pointed to the word behind, saying, “It had a b.” “Yes,” said the teacher, “Under would have made sense. He could have looked under the TV, but that word couldn’t be under. I also like the way you checked the picture, but that didn’t help enough, did it? You were really smart to use the first letter; read it again fast and be sure that it makes sense.”

Dana read the page again fluently, saying, “That’s right.” In this example, the teacher was pointing out to Dana how she effectively used several different sources of information simultaneously to monitor her own reading.

This seems like a poor method. Encouraging students to guess words from the pictures is problematic because it won’t help them to read books that don’t have lots of pictures in them. Phonics should be a first resort not a half-hearted last resort when guessing fails. In this instance, phonics was only employed in relation to the first letter of the word rather than for the decoding of the whole word.

This makes me even more skeptical that the recent positive result from an RCT was due to the specific Reading Recovery methods.

Process-product research

An overlooked body of research evidence is the wealth of process-product research spanning the 1950s through to the early 1980s. This research essentially investigated correlations between specific teaching methods and gains in student knowledge and understanding. You can see why it has been largely replaced by experiments and quasi-experiments in which factors can be systematically varied. However, I think it is still important and highly suggestive of which approaches are the more effective.

Barak Rosenshine looked into this research and derived principles of ‘direct instruction’. I also like Greg Yates’ discussion in his “How Obvious” paper. If you have the time, it is worth reading Thomas Good and Jere Brophy’s summary of the research, whilst mindful of Good’s warning not to see the findings as a checklist or observation tool.


So, it can be hard to evaluate approaches in education. The common methods have obvious flaws. However, I am not a postmodernist. I believe that people are more similar than different in the way that they learn and that we should ultimately be able to find some good general principles on which to base our decisions.

In the medium to long term, we should find ways to encourage knowledge building through properly controlled, randomised trials. The formation of the EEF in the UK is a good sign. As teachers, we may wish to involve ourselves in such research when we undertake Masters and PhD study. University education departments should focus more on this sort of research and less on ideological opinion papers or woefully flawed trials. I am hopeful.

However, we should not overlook what we already know. Despite the flaws, as Rosenshine notes, the results of investigating educational practices using quite diverse approaches all seem to converge on quite similar findings: the importance of teacher clarity and explicit instruction, the value of academic time-on-task, and the role of practice and testing. There is enough to be going on with.





What are your post-apocalypse skills?

So there you are, one of the few, disparate survivors of The Event. Caught like a rabbit in a trap, you are brought before the psychopathic, leather-clad local warlord who is to decide whether you are useful or whether he should just kill you for the fun of it. What do you say? What valuable skills might you trade for your life?

I think that the 21st century skills movement has not been anywhere near ambitious enough. It has not asked the right questions. Yes, we all know that there will be jobs in the 21st century that don’t exist yet, even though we are already in the 21st century. Obviously, all of the jobs that we can think of – accountant, plumber – already exist and so this proves that we can’t possibly name and describe the non-existent ones. In such a case, how can we prepare our students for them?

We also know that it is impossible to teach children the knowledge that they will need in the 21st century. The rate at which cat photos and fake Einstein quotes are being uploaded to the internet means that nobody could possibly memorise them all. So we have to prepare for this brave future. But what of the future after the future? Won’t somebody please think of that?

When the apocalypse comes, there will be some absolutely essential skills that all survivors will need in order to… survive. They will need to be creative, have the capacity to think critically and be able to work in teams. Clearly, the education systems of the past that produced all of the scientists, inventors and artists who contributed to a vast flowering of human knowledge and increased standards of living; these education systems will be insufficient to the task.

Instead, we must engage children in randomly making whatever they feel like and we need to get them to do role-plays and stuff like that.

Of course, there are those who dissent from this vision. Thinkers like E D Hirsch systematically mine culture for that which has endured on the assumption that knowledge that has been valuable in the past is our best guide to what might be of value in the future, both for work and for pleasure. Oh.

As for me, when the time comes and I’m hauled before that warlord, I think I’ve worked out what I’m going to go with. I won’t try and sell him on my ability to collaborate or my creativity. I won’t even mention my proficiency with 21st century technologies. Instead, I think I’ll go with the fact that I can make beer from scratch. That might work.

By Drawings: Hennequin de Bruges; Weaving: Robert Poinçon’s team, in Nicolas Bataille’s factory [Public domain], via Wikimedia Commons



Is Reading Recovery like Stone Soup?

Researchers from the Universities of Delaware and Pennsylvania have written a paper describing a large, multi-site, randomised controlled trial of Reading Recovery. The effect size is impressive: 0.69 when compared to a control group of eligible students. This is above Hattie’s effect size threshold of 0.40 and so suggests that we should pay attention. As a proponent of evidence-based education, you may think it perverse of me to question such a result.

It’s not.

Reading Recovery involves taking students out of normal lessons and giving them a series of 30-minute one-to-one reading lessons with a Reading Recovery trained teacher over a period of 12 to 20 weeks. So the intervention packages together a number of different factors including:

– the specific Reading Recovery techniques

– additional reading instructional time on top of standard classroom reading instruction

– one-to-one tuition

Each of these factors could plausibly impact on a child’s reading progress. For instance, we might expect a series of 30-minute one-to-one reading sessions with an educated adult volunteer to also improve students’ reading performance.

However, the implicit claim is that it is the specific Reading Recovery techniques that are responsible for any effect. Otherwise, why would we spend considerable amounts of money training and hiring Reading Recovery teachers? Indeed, the abstract suggests that, “the intensive training provided to new RR teachers was viewed as critical to successful implementation.”

It would be very easy to test the effect of the actual strategies. A good model is a study carried out by Kroesbergen, Van Luit and Maas on mathematics interventions with struggling maths students. They created three randomised groups. The first were given a ‘constructivist’ maths intervention, the second were given an ‘explicit’ maths intervention and a third control group were given no intervention (at least, during the study). Both interventions were beneficial when compared with the control. This is to be expected – any reasonable intervention is likely to be more effective than no intervention at all. However, the explicit intervention was found to be superior to the constructivist one and so we may assign some of the effect to the different strategies used in the two interventions.

Following this model, a good test of Reading Recovery might be to compare it with the kind of tuition from an educated volunteer that I described above or maybe to compare it with a different one-to-one intervention program. Of course, all programs would need the same amount of instructional time.

However, this is not what seems to happen in Reading Recovery research. Reading Recovery is proprietary and so the consent of the organisation is required in order to use its copyrighted materials in trials. The only trials that seem to take place are those that compare Reading Recovery with no intervention at all, like in the Delaware/Pennsylvania study (I am happy to be proved wrong on this – if you know of any different types of trials then please link in the comments).

This is problematic. The first rule of scientific research is to control variables. Admittedly, some variables are highly unlikely to affect the result and so we might not worry too much about them. However, in this case, multiple variables are changed at once, each of which could plausibly produce an improvement in reading performance.

Hey Google, what is a fair test?


Imagine a trial of a new medicine. It is unlikely that such a trial would be run against no intervention. At the very least, it would be compared with a placebo because of the well-known placebo effect. A more pertinent example might be if a study was done to test a regime of diet, exercise and a patented vitamin pill against no intervention at all and found that the former led to considerably more weight loss. What would we learn from this?

All that we can conclude from the Delaware/Pennsylvania study is that the entire Reading Recovery package – which is expensive to implement – is more effective than standard classroom teaching alone. We don’t know what causes this effect and whether we could gain the same effect without the same expense. Moreover, I would suggest that the principles of Reading Recovery, seemingly validated by such research, have a tendency to wash back into classroom teaching, potentially at the expense of evidence-based approaches. Researchers at Massey University in New Zealand have even claimed that the ‘failure’ of New Zealand’s literacy strategy has largely been as a result of the widespread adoption of Reading Recovery principles.

It reminds me of the folktale of the weary traveller who makes soup out of a stone. He knocks on the door of an old woman and asks for some hot water. She asks him what it’s for. He explains that he intends to make soup out of a stone and that she can have some. After a while, he tastes the soup, “It’s good,” he says, “but it could do with a little bacon.” The old woman gets some. A short time later, he tastes it again, “Mmmm,” he says, “some turnip would just improve it a little.” And so it continues, with the woman fetching one new ingredient after another. Eventually, the traveller serves the soup.

“Delicious,” says the old woman, “who would have thought that you could make such tasty soup out of a stone?”

By Qù F Meltingcardford (Own work) [CC BY-SA 3.0], via Wikimedia Commons


Update: Since writing this, I have become aware that the control group for the I3 study was more complex than ‘no intervention’. Instead, Reading Recovery was compared with a school’s usual intervention for poor readers. This was a mix of things from no intervention at all to small group interventions and so on. However, we are still not comparing like with like and so the original criticism in this post still stands.


Five Common Misconceptions about Learning

There are many books out there that deal with education myths. Daisy Christodoulou’s “Seven Myths about Education” is excellent and I have recently been sent “Urban Myths about Learning and Education” by De Bruyckere, Kirschner and Hulshof which I will review when I have finished reading it.

However, in this necessarily brief post I am going to characterise some common ideas as misconceptions rather than as myths. To me, this is how such views often present themselves; they are plausible, a wide range of people seem to arrive at them independently and yet the available evidence suggests that they are flawed. They feel ‘truthy’ in much the same way that it seems reasonable that something must be pushing the Moon around the Earth.

1. Novices should emulate the behaviour of experts

This misconception has legs. It is a key driver behind inquiry-based programmes in science, mathematics and history. For instance, in an article in The Telegraph, Jo Boaler contrasts the work of a PhD mathematics student with the sort of maths that takes place in classrooms, finding the latter wanting. But why should children who are just embarking on their mathematical journey have the same kind of learning experiences as someone far more expert? Experts have a vast amount of content knowledge that enables them to perform differently. It is easy to underestimate the scale of this. A key finding of cognitive science is that experts and novices benefit from quite different types of instruction.

If you ask a history student to read a source document then at least you are replicating the actual behaviour of experts. This may not be optimal for learning but it could have value as part of a range of strategies – students might enjoy it, perhaps. However, some strategies that are supposed to be based upon the behaviour of experts might not even reflect what experts do. For instance, the use of multiple cues or “searchlights” in reading instruction is meant to reflect experts’ strategies but it is unclear whether expert readers actually use these cues.

2. You understand concepts better if you discover them for yourself

I recently fixed the toilet. It was a frustrating experience because I didn’t know what I was doing. There was lots of cursing and plenty of, ‘what does this bit do?’ I spent at least the next week wondering whether an eruption of soiled water was imminent. Do I now understand toilets better than if a plumber had been alongside me, explaining exactly what to do? Definitely not.

So why do we have this intuitive preference for students figuring things out for themselves? In one seminal study, students were randomly divided into two groups. The first group were explicitly instructed in the fundamental scientific principle of controlling variables. The second group were given investigations to complete in which they had to figure out this concept for themselves. Unsurprisingly, fewer students in the second condition learnt the principle. However, those that did were no better than students from the first group at later evaluating science fair posters. There was no advantage to discovery.

Nowadays, advocates of student discovery tend to promote the less ambitious idea of ‘productive failure’. They concede that explicit instruction is needed but only after a period of open-ended problem solving. Interestingly, given the above discussion, it has been argued that many of the experiments that have been designed to test this idea fail to properly control variables (see the discussion at the end of this paper). A 2014 study by Manu Kapur is the best quality study so far but there is still a problem with it: The students given explicit instruction prior to problem solving then have to spend a whole hour solving a single problem which they already know how to solve.

3. Meta-cognition is a short-cut to expertise

So if simply imitating the behaviour of an expert will not make you an expert, are there other shortcuts available? Clearly, it would be great if we could find a way to develop expertise without students having to learn and practice all of the boring stuff. Perhaps we can teach general strategies which, if our students apply them, can be used in a range of situations. This way, we can teach them ‘how to learn’ and they can apply this to anything they need to learn in the future.

The picture here is actually quite complex. Take the example of reading comprehension strategies. These are general strategies that you can apply to anything that you read in order to help you understand it. Such strategies exist, although they pretty much boil down to one single strategy – asking yourself questions whilst reading. These strategies can also be explicitly taught to students and confer an advantage. However, they tend to provide a one-off boost which further repetition and practice do not seem to increase very much.

The same can be said of critical thinking skills or ‘learning to learn’ skills. Asking yourself questions whilst reading a text is only helpful if you can answer those questions. Similarly with critical thinking; asking who wrote a source is only of any use if you can find the answer and know what this means. The fact that a source was written by a loyalist doesn’t help much if you don’t know anything about the American Revolution.

Similarly, learning to learn – when not presented vaguely – seems to reduce to study skills and, of these, the evidence for self-testing stands out from the rest. It’s worth knowing that this is a good studying strategy but it acts to help consolidate the knowledge base rather than reduce the need for it. And it’s all rather prosaic when you consider the fact that these ideas are often sold as somehow teaching students how to think.

4. Knowledge-based education is really boring

We tend to conjure an image of some kind of nineteenth century classroom where the teacher beats facts into children at the end of a cane. And yet what is being proposed instead?

I cannot think of anything worse than spending hour after lengthy hour repeating reading comprehension exercises. In his recent book, David Perkins makes the case for authentic learning activities in which students – naturally acting like experts – engage in, “Project-based learning in mathematics or science, which, for instance, might ask students to model traffic flow in their neighbourhood or predict water needs in their community over the next twenty years.”


Set against this, a whole-class discussion of the dinosaurs or the possibility of alien life or the Battle of El Alamein or the concept of infinity or whether Macbeth is a misogynistic play; these all seem positively in tune with your average teenager’s interests.

The purpose of education is not to entertain; it is to educate. But if the criticism of knowledge-based education is that it is boring then the critics need to work a bit harder on the alternative.

5. Education must be personalised

Setting aside practical considerations, education should clearly work better if it meets students at their point of need. However, it needs to meet them there and then take them somewhere else. To borrow from Eric Kalenze, education should act like a funnel that prepares diverse students for college, careers and engagement in civic society.

However, it seems as if we have lost this clear mission. If students are pursuing their own interests and are expected to engage in learning only if we make it as easy and as accessible as possible then how are we preparing them for life after school?

Imagine a tour operator running trips to Greece. Of course, the tour operator needs to take account of where people are travelling from so that she can arrange planes and the like. But she still has to get them to Greece. It would be a poor tour operator indeed that told people not to bother going there and to go for a walk around their home town instead.

Students need to be able to read, write and do basic maths. These are functional skills that society demands and, in order to do the first of these, they will need a fair amount of general knowledge. As an advocate of the liberal arts, I would also argue that education must go beyond a merely instrumental role and seek to improve the quality of people’s lives. By taking academic studies further we open up opportunities for future study, careers or cultural interests.

This is the mission.



Does direct instruction in the early years cause teenage criminal behaviour?

I recently read a piece in Psychology Today that argues that early academic training causes long-term harm.

I find such an argument hard to take. The causal mechanism that is suggested is that students who experience direct instruction have less time to discovery-learn social skills. However, this seems highly implausible. Even in DI schools, there is still a great deal of non-DI time. And how could any effect persist into adolescence? It would need to be one of the strongest causal relationships identified in education.

The bulk of the evidence in the article relates to the High/Scope research of David Weikart and colleagues. This jogged a memory. I could remember a piece by Carl Bereiter that criticised the methodology of one of the High/Scope studies.

Further investigation revealed a critique of the High/Scope research by Martin Kozloff, originally posted on the DI Listserve. I reproduce this in full below (with permission):

“There are about half a dozen articles in the series written by Schweinhart and Weikart, in which they claim to compare (and apparently believe that they demonstrate the superiority of) their High/Scope pre-school curriculum with (1) a preschool that used Direct Instruction in reading and language for about an hour a day for one year, and (2) a traditional nursery school. It does not take a heavy background in research methodology to see what is wrong with the Schweinhart and Weikart studies. All it takes is reading their articles and applying a bit of commonsense. (See Weikart, D. [1988]. A perspective in High/Scope’s early education research. Early Child Development and Care, 33, 29-44; Schweinhart, L. & Weikart, D. [1997]. The High/Scope curriculum comparison study. Early Childhood Research Quarterly, 12, 117-143.)

In a nutshell, they claim that: (1) approximately 20 years after the children went to the three preschools; (2) with only about one-third of the original children (now adults) available for further study; (3) with little idea where the other two-thirds were; and (4) with important differences in the remaining samples (notably, far more of the adults who had been in the Direct Instruction preschool had been reared by single, working mothers whose income was about half that of households in the High/Scope group (socializing conditions [e.g., economic disadvantage, inadequate supervision and discipline] that sociological research—e.g., of Marvin Wolfgang and Gerald Patterson—has shown to be predisposing factors to juvenile delinquency); even so, (5) an hour a day of Direct Instruction is said by Schweinhart and Weikart to have caused anti-social behavior in the DI kids one and two decades later.

Let us ignore the fact that instead of reporting the actual rates of antisocial behavior, Schweinhart and Weikart generally report percentages; e.g., that there was allegedly twice as much antisocial behavior among prior DI kids twenty years later. Let us also ignore the fact that these percentage differences actually amount to differences in the activities of only one or two persons. What is most telling, and just plain bizarre, is that these two writers barely entertain the possibility that: (1) a dozen years of school experience; (2) area of residence; (3) family background; (4) the influence of gangs; and (5) differential economic opportunity, had anything to do with adolescent development and adult behavior. But, Weikart and Schweinhart are experts in early childhood. Therefore, they may not know that experiences after preschool affect adolescent development and adult behavior.

If one wanted to use an ad hominem argument, the Schweinhart and Weikart studies certainly provide the opportunity. First, they assert—again and again–that their curriculum is the superior one. Indeed, the number of times they make this claim gives their articles the ring of an advertising campaign for hair care products. Second, the early results of project Follow Through (with, I believe, 9000 children assigned to nine early childhood curricula) showed that disadvantaged children who received Direct Instruction went from the 20th to about the 50th percentile on the Metropolitan Achievement Test. Children who received the High/Scope curriculum did not do as well. They fell from the 20th percentile to the 11th. Maybe that is the animus for Schweinhart and Weikart’s claim. But we won’t speculate.”

Kozloff, M. (2011). DI creates felons, but literate ones. Contribution to the DI Listserve, University of Oregon, 31 December, 2011.