This is the homepage of Greg Ashman, a teacher, blogger and PhD candidate living and working in Australia. Everything that I write reflects my own personal opinion and does not necessarily represent the views of my employer or any other organisation.

Read about my ebook, “Ouroboros” here.

Watch my researchED talks here and here.

Here is a piece I wrote for The Age, a Melbourne newspaper:

Fads aside, the traditional VCE subjects remain the most valuable

Read a couple of articles I have written for The Spectator here:

A teacher tweets

School makes you smarter

Read my articles for the Conversation here:

Ignore the fads

Why students make silly mistakes


Accuracy, opinions and tone

I wrote an article for Impact, the magazine of England’s new Chartered College of Teaching. They declined to print it. They had sent it out to two sets of reviewers who took issue with a number of aspects of what I had written. Given that it was essentially an opinion piece, I expected comments on accuracy: Were my figures correct? Had I correctly explained statistical significance? And so on.

I was surprised to read comments objecting to my tone or simply disagreeing with my opinions. Tone is largely subjective. Anyone reading my piece who is personally invested in metacognition and self-regulation will find the tone confronting. Others will simply find it playful.

When I commented on Twitter about the fact that the reviews took issue with some of my opinions, the College Twitter account said that I was wrong:

This is what forced my hand and made me decide to release the anonymous reviews.

Following this, some defenders of the College engaged in a Pythonesque Twitter thread where they attempted to deny the obvious. As far as I understand it, they claim that the reviews did not take issue with my opinions because they gave good reasons for taking issue with my opinions. Or something. For me, this peaked with the following exchange:

However, what has perhaps been lost in this process is the notion of accuracy. When I was sent the final set of reviews, I was given a couple of days to revise my article for tone and accuracy. I declined, partly due to the fact that I didn’t see the need and partly because of the time I was given.

One reason I didn’t see the need is that there were so few comments relating to accuracy. Indeed, one reviewer seemed to indicate that s/he lacked the knowledge to comment on accuracy and instead suggested my tone was ‘Clarksonesque’. Those who did make comments about accuracy appeared to be wrong.

For instance, one reviewer claimed, “the assertion that the Toolkit indicates that a teacher will get 8 months’ progress is wrong and misleading.” On the 13th April, the Toolkit revised this figure down from 8 months to 7 months, but at the time of writing, and at the time the review was written, it was indeed 8 months. So what was wrong with my claim? It is obviously a nonsense that anyone could expect to get 8 months’ progress from implementing one of these diverse approaches but this is what the Toolkit suggests and this is one of the reasons I am critical of it.

One reviewer noted that my opinion contradicted that of the Education Endowment Foundation (EEF) but didn’t really resolve the issue, despite seemingly showing sympathy for the EEF’s position. The comment seemed more of a caution to the editor than something I could respond to.

There is one important claim about accuracy in the reviews that could be a valid point. After criticising my ‘selective’ use of a Dylan Wiliam quote, a usage that Dylan Wiliam did not object to in his own review, Reviewer four mentions, “…references to statistical significance which are of little relevance in cases where effect sizes are reported with confidence intervals.”

Jim Thornton, professor of obstetrics and gynaecology at Nottingham university and an expert in randomised controlled trials, took up this point in a comment on an earlier post:

“There were no confidence intervals around the effect sizes in the P4C trial, nor in the EEF meta-cognition and self regulation review. In the technical appendix the range of the different effect sizes is reported, which might have confused reviewer 4, but this is quite different from the confidence interval around the effect size. Reviewer 4 is correct that confidence intervals would have negated the need for tests of statistical significance. The problem is they weren’t reported. The very point that Greg was making.”

So I’m not really sure how I could have revised the article for accuracy.
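The distinction Thornton draws, between the range of several effect sizes and a confidence interval around one effect size, can be sketched in a few lines of Python. This is only an illustration: the numbers are hypothetical rather than taken from any EEF report, the function name is my own, and the standard error uses a common large-sample approximation for a standardised mean difference.

```python
import math

def d_confidence_interval(d, n_treat, n_ctrl, z=1.96):
    """Approximate 95% confidence interval for a standardised mean difference d.

    Uses a common large-sample approximation to the standard error of d.
    """
    n_total = n_treat + n_ctrl
    se = math.sqrt(n_total / (n_treat * n_ctrl) + d ** 2 / (2 * n_total))
    return d - z * se, d + z * se

# Hypothetical effect sizes from several different studies (not real data).
effect_sizes = [0.10, 0.22, 0.45]

# The *range* only says how far apart the point estimates are.
range_of_effects = (min(effect_sizes), max(effect_sizes))

# A *confidence interval* says how uncertain one estimate is, given sample size.
low, high = d_confidence_interval(0.22, n_treat=100, n_ctrl=100)

print("Range of effect sizes:", range_of_effects)
print("95% CI for d = 0.22 with 100 per group:", (round(low, 2), round(high, 2)))
```

In this made-up case the interval spans zero even though every point estimate in the range is positive, which is why a range cannot stand in for a confidence interval.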

Notes: You can find my original submission, prior to any amendments I made on the basis of the first three reviews, here. You can find my response to the first three reviews here. The sixth response should read ‘The fact that teaching methods…’ but I’ve left it as submitted for transparency.

Is the Education Endowment Foundation chicken?

The EEF is an independent charity that operates in England and that was founded with the help of £125 million of UK taxpayer money. Its mission is to help improve educational outcomes, particularly for the children of disadvantaged and low income households. To this end, it publishes a Toolkit that aims to summarise the evidence for different types of educational intervention, as well as conducting its own randomised controlled trials (RCTs) in order to generate new evidence. Latterly, the evidence from its RCTs has become incorporated into the toolkit.

There are some fairly major implications for Australia. Evidence for Learning (E4L) is a social venture that has licensed the EEF Toolkit for use in Australia and I strongly suspect that this will be a central plank of the ‘Gonski 2.0’ proposals for Australian education when they are finally published. So there is a lot at stake.

I am not ideologically opposed to the EEF. I strongly support the idea of conducting RCTs, as well as the aim of summarising evidence in a way that teachers can use. However, I am a critical friend and have had cause to become more critical over time. Some RCTs have struck me as a little pointless or have not demonstrated what has been claimed. I have even suggested an improvement to the design of RCTs so that we may make stronger inferences from them.

I have also been critical of the EEF’s attempts to compute an ‘effect size’ for the different interventions in its Toolkit and express these in ‘months of additional progress’ because it is a complete nonsense. Instead, the summaries should be open and contested qualitative accounts.

And I have been particularly critical of the ‘metacognition and self-regulation’ strand of the toolkit which, as I explained in the recent article that the UK’s Chartered College of Teaching declined to print, appears to be a chimera; a monster stitched together from quite disparate things. What does a writing intervention where students are explicitly taught how to plan, draft, edit and revise their writing – also known as ‘teaching English’ as Joe Nutt drily observed on Twitter – have in common with ‘philosophy’ lessons where children discuss whether it is okay to hit a teddy bear? Not much. It’s astonishing that such wildly different interventions would be classed as the same thing. Medical researchers, on whose model of meta-analysis the EEF has built, would never group together such diverse approaches, with such different aims, methods and outcomes, and try to compute an effect size as if they are all the same thing.

I have never had a response to any of these criticisms from the EEF. They don’t owe me one but I do think that a body in receipt of public funds should be open to scrutiny and debate. When the Chartered College announced that the next issue of their Impact magazine would focus on ‘developing effective learners’ and would be edited by Dr Jonathan Sharples of the EEF, I speculated that it would focus on metacognition and self-regulation. Sharples replied that I should ‘submit an abstract’ and this struck me as a potential way of initiating the debate that I was seeking.

When the Chartered College indicated that they would not be printing my article, I offered them a suggestion: Why not print it alongside a rebuttal? It seems that Dylan Wiliam was thinking along the same lines.

I am not afraid of a debate. Who would be? Well, perhaps the EEF are.

Peer review of my article for Impact magazine

Yesterday, I published an article that had been rejected for publication in Impact magazine, the journal of England’s Chartered College of Teaching.

Following publication, and a discussion about this on Twitter, the College made claims about the contents of the review comments. Specifically, they claimed that at no point did reviewers take issue with the opinions in my piece:

I have previously asked the College for permission to publish the review comments and they declined on the basis that they had not sought permission from the reviewers for me to do this. I therefore asked the College if they would seek permission from the reviewers. They replied that this was not something that they would do.

The Chartered College has been set up with five million pounds of UK taxpayer money after failing to raise sufficient cash through a crowd-funding campaign. Many UK teachers have concerns about the College, how it will compare to the defunct and disliked General Teaching Council and whether it will become a mechanism for educationalists to impose their ideas on classroom teachers. In addition, the article that I wrote is critical of the approach of the Education Endowment Foundation (EEF), another body in receipt of large amounts of public money. That is why it is in the public interest to scrutinise the College and the EEF, and that is why I have published details of the process of submitting my article, something I would not do if submitting an article to an academic journal.

Given the public interest argument, the fact that the reviewers are anonymous and cannot be identified in the reviews and the fact that the College have now chosen to make claims about the content of these reviews, I have decided to take the step of publishing them in the link at the end of this post. Paraphrasing them was never going to be satisfactory because of the potential for unconscious bias in how I completed this task.

There are a few points to note. I sent a first draft to Impact and received the first three reviews. I then revised my piece on the basis of the first and second review in order to make it clearer that I was criticising both the use of meta-analysis in general and the way the EEF used meta-analysis. I also fixed some relatively minor points, including one about a missing reference. It is the revised piece that I published yesterday. As requested by the College, I provided them with a detailed, point-by-point response to all of the issues raised in the first three reviews. These were mostly points raised by the first reviewer because they were far more numerous.

The College then sent out my revised piece to four new reviewers.

I will not comment on the reviews in this post because I would like you to make up your own mind. However, there is one potential point of confusion that is worth clearing up. One reviewer wrote that, “the assertion that the Toolkit indicates that a teacher will get 8 months’ progress is wrong and misleading.” If you look at the toolkit today, you will see that the figure is +7 months of additional progress and so you may therefore think that the reviewer’s point is that I got this figure wrong. However, this figure was altered on the 13th April 2018. At the time that I wrote the article, and at the time it was reviewed, this figure was +8 months.

You can read the reviews here.

The article that England’s Chartered College will not print

The following article was submitted to Impact, the trade journal of England’s Chartered College of Teaching. This is the version that was revised in response to the first three peer review reports. You can read about the process here. I will follow up by discussing the peer reviews. Unfortunately, I don’t have permission from the College to print these in full. For now, read the article and see what you think:

If you are a teacher or a school leader and you visit the Education Endowment Foundation’s (EEF) online toolkit, you will notice one ‘strand’ that stands out from the rest. Implement ‘meta-cognition and self-regulation’ in your school and you can expect your students to make an additional eight months of progress. But wait, it doesn’t stop there. Implementation is low cost. And there’s more. The evidence for its effectiveness is even stronger than the evidence supporting the use of feedback. It’s a no-brainer then. Off you go and do it!

What is that, you say? You are not sure what ‘meta-cognition and self-regulation’ is? Perhaps Kevan Collins, Chief Executive of the EEF, can help. According to Collins, ‘Meta-cognition is getting beyond – above the actual thing – to have a better sense of it.’ (Collins, 2017). Does that help?

I was not entirely clear and so I decided to look at the studies that sit behind the EEF figures. What I discovered caused me to question the headline claims. It seems as if, like the mythical chimera, the category has been stitched together from a range of different beasts. Moreover, the outcomes we should expect vary greatly, depending on what kind of approach we select.

The EEF produce a ‘technical appendix’ for each of their toolkit strands and so I consulted the technical appendix prepared for meta-cognition and self-regulation (Education Endowment Foundation, 2016). It lays out two sources of evidence. The first is a range of meta-analyses conducted by different education researchers that seek to draw together and synthesise the findings from many different studies. The second is a set of individual studies, mostly conducted as randomised controlled trials by the EEF itself.

An ‘effect size’ is calculated for each of the meta-analyses and studies and these are combined by the EEF, in a further layer of meta-analysis, to produce an overall effect size which is then used to generate the headline figure of eight months of additional progress. Combining effect sizes through meta-analysis is controversial because the conditions of a study can influence the effect size. For instance, the age of the subjects and whether the outcome measure is designed by the researchers can both influence effect sizes (Wiliam, 2016). In the case of meta-cognition and self-regulation, this issue is compounded by the fact that the outcome measures vary widely from reading to maths to critical thinking.
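To make the idea of combining effect sizes concrete, here is a minimal sketch of one standard textbook approach, inverse-variance (fixed-effect) pooling. It is an assumption for illustration only: the technical appendix may use a different procedure, and the study labels, effect sizes and variances below are invented.

```python
def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate: weight each effect by 1 / variance."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    pooled_variance = 1.0 / total_weight
    return pooled, pooled_variance

# Hypothetical studies: (outcome measured, effect size, variance of the estimate).
studies = [
    ("reading", 0.60, 0.04),
    ("maths", 0.10, 0.01),
    ("critical thinking", 0.30, 0.02),
]

effects = [d for _, d, _ in studies]
variances = [v for _, _, v in studies]
pooled, pooled_variance = pooled_effect(effects, variances)

print("Pooled effect size:", round(pooled, 3))
```

The arithmetic produces a single number whatever goes in. Nothing in the calculation checks whether a reading outcome, a maths outcome and a critical thinking outcome are measuring the same kind of thing.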

Moreover, we might expect effect sizes to be influenced by the quality of the study design and the technical appendix seems to support this conclusion because the effect sizes of the more rigorous EEF randomised controlled trials are generally lower than for the meta-analyses.

Such problems have led Dylan Wiliam, Emeritus Professor of Educational Assessment at the UCL Institute of Education, to conclude that, ‘…right now meta‐analysis is simply not a suitable technique for summarizing the relative effectiveness of different approaches to improving student learning…’ (Wiliam, 2016). It is therefore not clear that meta-analysis is an appropriate way of evaluating educational interventions at all and it certainly calls into question the EEF’s approach of attempting to derive an overall effect size from multiple meta-analyses.

If we focus only on the randomised controlled trials conducted by the EEF, the case for meta-cognition and self-regulation seems weak at best. Of the seven studies, only two appear to have statistically significant results. In three of the other studies, the results are not significant and in two more, significance was not even calculated. This matters because a test of statistical significance tells us how likely we would be to collect this particular set of data if there really was no effect from the intervention. If results are not statistically significant then they could well have arisen by chance.
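The logic of a significance test can be illustrated with a small permutation test. This is a sketch under stated assumptions, not the analysis used in any EEF trial: the scores are invented, the function name is my own, and the test is one-sided. It asks how often a difference at least as large as the observed one would arise if the group labels were shuffled, which is what we would expect if the intervention had no effect.

```python
from itertools import combinations

def permutation_p_value(treatment, control):
    """One-sided exact permutation test on the difference in group means.

    Returns the proportion of all possible relabellings of the pooled scores
    whose mean difference is at least as large as the observed difference.
    """
    pooled = treatment + control
    n_treat = len(treatment)
    n_ctrl = len(control)
    total = sum(pooled)
    observed = sum(treatment) / n_treat - sum(control) / n_ctrl

    at_least_as_large = 0
    relabellings = 0
    for indices in combinations(range(len(pooled)), n_treat):
        treat_sum = sum(pooled[i] for i in indices)
        diff = treat_sum / n_treat - (total - treat_sum) / n_ctrl
        relabellings += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    return at_least_as_large / relabellings

# Hypothetical test scores (not real trial data).
clear_gap = permutation_p_value([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
no_gap = permutation_p_value([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])

print("p-value with a clear gap between groups:", round(clear_gap, 4))
print("p-value with heavily overlapping groups:", round(no_gap, 4))
```

A small p-value means the observed difference would be rare under chance alone. A large one means the data are entirely compatible with no effect.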

Furthermore, the diversity of approaches sitting under the EEF label of meta-cognition and self-regulation is astonishing. In Philosophy for Children, for instance, teachers use stimulus material to initiate class discussions around concepts such as truth. This supposedly has an impact on their maths performance (Gorard et al., 2015) although the way that this is meant to happen seems spookily mysterious and the lack of a test of statistical significance does not fill me with confidence.

In contrast, Improving Writing Quality is an intervention where students are explicitly taught how to plan, draft, edit and revise their writing. This was one of the two EEF studies with a statistically significant result and it was the one with the largest effect size. This is hardly surprising because explicit writing interventions have repeatedly been shown to be effective at improving students’ writing (Torgerson et al., 2014). Moreover, in contrast to Philosophy for Children, the way that it works is highly plausible.

What do these approaches have in common and what do they have in common with a science intervention such as Thinking Doing Talking Science (Hanley et al., 2015) or a growth mindset intervention? True, they all involve students in thinking, but then so does every other educational activity.

One intervention, Let’s Think Secondary Science, has not yet made it into the data pool for meta-cognition and self-regulation and I’m not clear as to why, although it may just be due to the timing of the study. It is based on the Cognitive Acceleration in Science Education projects of the late 1980s and early 1990s, has similarities to Thinking Doing Talking Science, but when tested by the EEF was found to have no effect on learning (Hanley et al., 2016).

It therefore matters greatly what type of intervention we select and what outcomes we are intending to improve by selecting it. By stitching together explicit writing interventions with philosophical discussions, the EEF have created a monster; a chimera that hinders our ability to understand what works best in schools. Teachers and school leaders would be wise to read the underlying studies in the meta-cognition and self-regulation strand and draw their own conclusions. For its part, the EEF should get, ‘beyond – above the actual thing – to have a better sense of it,’ and then break it apart.


Collins K (2017) Sir Kevan Collins on Metacognition. Vimeo. Available at: https://vimeo.com/225229615

Education Endowment Foundation (2016) Technical Appendix: Meta-cognition and self-regulation. Education Endowment Foundation. Available at: https://educationendowmentfoundation.org.uk/public/files/Toolkit/Technical_Appendix/EEF_Technical_Appendix_Meta_Cognition_and_Self_Regulation.pdf

Gorard S, Siddiqui N and Huat See B (2015) Philosophy for Children Evaluation report and Executive summary. Education Endowment Foundation. Available at: https://educationendowmentfoundation.org.uk/public/files/Support/Campaigns/Evaluation_Reports/EEF_Project_Report_PhilosophyForChildren.pdf

Hanley, Slavin & Elliott (2015) Thinking Doing Talking Science Evaluation Report London: EEF

Higgins, S., Hall, E., Baumfield, V., & Moseley, D. (2005). A meta-analysis of the impact of the implementation of thinking skills approaches on pupils. In: Research Evidence in Education Library. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London.

Hanley P, Böhnke JR, Slavin R, Elliott L and Croudace T (2016) Let’s Think Secondary Science Evaluation report and executive summary. Education Endowment Foundation. Available at: https://educationendowmentfoundation.org.uk/public/files/Projects/Evaluation_Reports/Lets_Think_Secondary_Science.pdf

Torgerson D, Torgerson C, Ainsworth H, Buckley HM, Heaps C, Hewitt C, and Mitchell N (2014) Improving Writing Quality: Evaluation Report and Executive Summary. Education Endowment Foundation. Available at: http://educationendowmentfoundation.org.uk/uploads/pdf/EEF_Evaluation_Report_-

Wiliam D (2016) Leadership for teacher learning. Moorabbin, Victoria. Hawker Brownlow Education.

Meet the new boss

Is murder wrong because it is illegal or is murder illegal because it is wrong?

A few years back, I decided to read Pedagogy of the Oppressed by Paulo Freire. I didn’t expect to agree with the book because I knew it was a key text of educational progressivism: I was aware that Freire’s criticism of the ‘banking model’ was used by some to disparage explicit teaching. However, I initially found the book to be abstract and vague. Freire wrote hypnotically about the oppressors and the oppressed, without clearly identifying who these were, while focusing on the need to become ‘fully human’. So when I reached the following sentence, it was as if I had been jolted from a dream:

“However, the restraints imposed by the former oppressed on their oppressors, so that the latter cannot resume their former position, do not constitute oppression.”

Freire goes on to expand on the difference between the new regime and the old but even so, it does look a lot like, ‘meet the new boss: same as the old boss’. And as I have discussed before, Freire talks of repressing the old oppressive power and points to Mao’s cultural revolution as a positive example. We can imagine rules governing behaviour under the new regime and consequences for disobeying these rules.

When a group is engaged in a long battle against injustice, it is a victory when laws or regulations are changed to support their position. But it is not an end. I was challenged recently on Twitter as to whether the idea that Australia is in breach of the UN convention on the rights of refugees is a strong argument against Australia’s position. I don’t think it is, no. A strong argument would consist of explaining why Australia’s behaviour regarding refugees is wrong.

An action may be in violation of a law or some other form of regulation and it may attract a punishment, and yet that does not explain why it is wrong. Some laws are silly. Some are misconceived. It is possible to break a law, receive a sanction and be morally right. This is the logic of civil disobedience. It’s a social Gödel’s theorem: some truths simply cannot be expressed while staying entirely within the formal systems that humans have constructed.

Those who have fought a long battle against a great injustice and who have won that battle and changed the law may say, “I am tired of explaining. You are wrong and the law says you are wrong.” But this is dangerous thinking.

Imagine the young man who is engaged by fascist ideologues or religious extremists. They listen to the young man’s views, take him seriously and invest time and effort in explaining their moral universe to him. Opponents of the fascists or religious extremists take no such trouble. Instead, they tell the young man that he is wrong and his views and actions are unlawful. This may be true, but it is not an argument with much potential to change the young man’s views.

This is not to force a choice. A person who commits an offence should expect to both receive the appropriate punishment for that offence and to come to understand why it was wrong; or at least why others believe it to be wrong. We can have both of these things. They are not mutually exclusive.

And I believe that we should model this in the way we deal with behaviour in school. No, I do not think that a teacher should immediately stop a class to explain in great detail why throwing paper balls is disruptive. Sometimes we need to simply act in the moment and leave any discussions for later. But we should design systems where we explain norms and expectations from the outset and positively reinforce and reexplain those constantly as the contexts vary. When we give a necessary sanction, we should explain why we have done so, even if the student does not agree or we feel that it’s obvious or that we have explained it a thousand times before.

If we explain the need for a sanction then that doesn’t somehow stop the sanction from being ‘punitive’ as some people seem to think. It is still a punishment and it will still be experienced as a punishment. The danger with mincing our words is that it reduces the precision of our language, and when dealing with student behaviour, clarity is essential. Teachers need to know what actions they may take that are aligned with the culture and values of the school. They don’t need to be given some vague, intangible notion of what not to do.

As teachers, we wish students to understand the world they inhabit. Insofar as we have a responsibility to teach them right from wrong, we have a responsibility to explain why some things are right and some things are wrong. Ultimately, our students may reject these arguments and face the social consequences of doing so, but that is their choice. Humans are not computers. Humans cannot be programmed with rules, whether they are the bad rules of the oppressor or the good rules of the enlightened and virtuous.

The inevitability of anti-intellectualism

Peter Lang, a publisher of academic books, has been kind enough to send me an electronic copy of the International Handbook of Progressive Education (here’s a review by John Howlett). It is a quirky mix that is enticing for education anoraks like me. Each chapter is preceded by an abstract in the fashion of an academic paper, and this enables a reader to dip in and out. I will write more as I read more but, for now, I wish to focus on a chapter by Wayne J. Urban of the University of Alabama.

Urban focuses on three critics of progressive education from the 1950s and 1960s: Arthur Bestor, Richard Hofstadter and James Bryant Conant. I knew about Bestor’s writing, having researched him for my new book, but I knew less of the others.

Urban concedes that progressivism is difficult to define and has a ‘tendency to contradiction’. However, he focuses on curriculum as a key feature of progressive education. To Urban, progressive education has been, at least in part, ‘a movement to diversify the school curriculum to allow it to include nonacademic studies and concerns’. This speaks loudly to our contemporary social media debate about progressivism, where philosophies are sometimes reduced to teaching methods. Certain critics of this debate have sought to focus instead on curriculum, but these issues are all connected.

It also makes clear the connection between some odd currents in contemporary progressivism. If we associate progressive education with left-wing politics, then it can be jarring to see progressives call for 21st century skills on the basis that employers value these supposed skills. Equally, it can be strange to see the unqualified promotion of tech companies and their products. These seem like capitalist concerns. But despite enthusiasm for progressivism on the political left, it is an error to associate it with any particular side of politics. It stands on its own. So, when Kenneth Baker, a former Conservative education minister, called for a greater focus on vocational studies, he did so from within a part of the progressivist tradition.

Urban’s three critics of progressivism are therefore critics of the degradation of an academic curriculum. Despite being scholars, they attack education in polemical terms that are not always to Urban’s taste. Bestor critiques the ‘life adjustment’ movement, a fully-formed forerunner of the 21st century skills movement that sought to supplement, and perhaps replace, academic goals with, ‘studies that addressed issues of how one was to live and prosper in a modernizing society’.

Schools of education are to blame for propagating these ideas, and this is something that Urban himself concedes when he largely agrees with these critics. Urban quotes Richard Hofstadter musing on the quality of trainee teachers and wondering, ‘To what extent able students stayed out of teaching because of its poor rewards and to what extent because of the nonsense that figured so prominently in teacher education.’ Having spent his career in education faculties, Urban indicates that he has a few tales to tell himself about doctoral dissertations with titles that appear to be an ‘academic joke’.

Urban muses on the role of Dewey. Dewey never sought to abandon an academic curriculum; it was his followers who did that. Or did they? How much responsibility should be Dewey’s? Urban wonders whether the evolution of schools of education is part of the problem. Despite having only his own reflections to draw upon, Urban is in no doubt that there is a problem.

I would go further and claim that anti-intellectualism is hard-wired into progressive education through its values. It is not some 1930s appendage; a path that could have been avoided. This is because, whatever Dewey’s pragmatism, progressive education is essentially romantic in origin. It asserts the primacy of the child’s nature, his or her surroundings and interests. Few children are naturally academic because, as relatively recent, technical, cultural constructions, academic subjects are fundamentally unnatural. And so antipathy towards them is inevitable.

Can meta-analysis be saved?

My thoughts on meta-analysis have changed over the years. My first encounter was through John Hattie’s 2008 book, Visible Learning, and I was aware of some key objections from the outset because Hattie outlines them himself, before going on to rebut them. The Hattie hypothesis is that even though the studies he uses have different subjects, designs, durations, assessment instruments and so on, this all washes out if you look for effect sizes above the typical effect size of about 0.4 standard deviations. To a teacher new to research, this seemed like a reasonable argument.

In the intervening years, I have realised just how noisy education data is and how much effect sizes depend on particular conditions. If you want a large effect size, run a quasi-experiment with a narrow ability range of very young children and make sure that you design the outcome assessment. On the other hand, if you run a properly controlled randomised controlled trial (RCT) with older teenagers and assess them with a standardised test, you are doing well to get any positive effect at all. You can’t just mush all these different kinds of studies together as if they are equivalent.
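The point about a narrow ability range can be shown with simple arithmetic, because a standardised effect size is a raw gain divided by the spread of scores. The sketch below uses invented scores and a hypothetical helper function. It holds the raw gain fixed and changes only the spread of the sample.

```python
import statistics

def standardised_effect(raw_gain, scores):
    """Raw gain in test points divided by the sample standard deviation."""
    return raw_gain / statistics.stdev(scores)

# Hypothetical baseline scores: same mean, very different spread.
narrow_ability_range = [48, 49, 50, 51, 52]
broad_ability_range = [30, 40, 50, 60, 70]

raw_gain = 2  # the same two-point improvement in both settings

d_narrow = standardised_effect(raw_gain, narrow_ability_range)
d_broad = standardised_effect(raw_gain, broad_ability_range)

print("Effect size, narrow range:", round(d_narrow, 2))
print("Effect size, broad range:", round(d_broad, 2))
```

An identical two-point gain looks ten times larger in the narrow sample than in the broad one, simply because the denominator has shrunk.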

An evolution of Hattie’s approach might be to find typical effect sizes for all of the different kinds of studies, subjects and assessments. We could then modify the effect size threshold that we look for. So 0.4 might be appropriate for quasi-experiments in primary schools, for instance, and 0.1 might be the threshold for RCTs with high school students. We could draw up a table. I don’t see anyone working on this and I think it would be challenging. We would lack enough studies to generate these typical levels in many areas and even once we have categorised study types, a bad study may still generate a larger effect than a good study of the same kind.
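Such a table might look something like the following sketch. The two thresholds are the illustrative figures given above, not established values, and the category labels and function name are hypothetical. The lookup deliberately returns no verdict where no typical level exists, which reflects the problem of lacking enough studies in many areas.

```python
# Illustrative thresholds only: (study design, school phase) -> typical effect size.
TYPICAL_EFFECT = {
    ("quasi-experiment", "primary"): 0.4,
    ("rct", "secondary"): 0.1,
}

def exceeds_typical(design, phase, effect_size):
    """Return True or False if a threshold exists for this kind of study.

    Returns None when no typical level is available for the category.
    """
    threshold = TYPICAL_EFFECT.get((design, phase))
    if threshold is None:
        return None
    return effect_size > threshold

# The same effect size of 0.15 is judged differently depending on the study type.
print(exceeds_typical("rct", "secondary", 0.15))
print(exceeds_typical("quasi-experiment", "primary", 0.15))
print(exceeds_typical("rct", "primary", 0.15))
```

Even a complete table would not address the second objection: within one cell, a bad study can still produce a larger effect than a good one.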

An alternative to the Hattie approach is to be very selective about the studies that you look at by excluding studies that don’t meet set criteria. We might call this the What Works Clearinghouse method. And yet this poses its own problems. You can end up making statements that ‘there is no evidence to support’ something when what you actually mean is ‘there is no evidence that fits our selection criteria’. It therefore leaves out a lot of evidence that may be indicative, if not the final say on the matter. As a consumer of research, I would rather know about these inferior studies alongside their limitations than be led into thinking that no such studies exist.

I also want to know about correlations, single-subject research or any other kind of research that might have a bearing on my decision-making. I was blown away a few years ago to read about research conducted by Kevin Wheldall and colleagues on seating arrangements. Briefly, they changed the usual arrangement of rows or groups in a classroom to the alternative arrangement before changing them back again, all the while monitoring on-task behaviour. It makes quite a compelling case for the value of seating students in rows but it is not a traditional RCT or quasi-experiment. Do we want to filter out all such research and fool ourselves into thinking it’s never been conducted? It is such exclusion/inclusion criteria that may have led to the strangely ahistorical nature of educational research that I once saw Thomas Good, veteran of process-product research, complain about at the ICSEI conference in Cincinnati.

I am fairly clear on what we should be doing to resolve these issues in the medium term. If we want to know whether teaching calculus through dance is better or worse than teaching it on horseback, then we should run an RCT with three conditions: dancing, horseback and the standard approach. This is far superior to comparing the effect size of dancing versus a control with the effect size of horseback versus a control because we know that all the other conditions are the same. There are still dangers and we still need to look at the detail to ensure that we have compared the approaches fairly, but this is something that we should be able to do.

When it comes to summaries of existing research, these must be entirely qualitative. There should be no attempt to compute an overall effect size or ‘months of additional progress’ in the manner of England’s Education Endowment Foundation. It is meaningless, particularly when we might dispute whether different studies represent examples of the same broad strategy or not. Instead, we need what is effectively a literature review of all the relevant evidence, its strengths and weaknesses. This should be open to continual review and discussion and it will be quite an art to make it pithy enough to accurately and historically capture the state of evidence in a particular area. Nevertheless, it is a venture worth embarking upon. There is no valid alternative.