March, this year, will see the publication of the first issue of the new researchED magazine. There are many top contributors (and also me). According to the website:
“Contributors to the first issue include: Carl Hendrick; Professor Robert Davis; Benjamin Riley, Deans for Impact; Sam Freedman; Pedro de Bruyckere; Professor Daniel Willingham; Nick Rose; Daisy Christodoulou; Professor Paul Kirschner; an interview with Professor John Sweller; Greg Ashman; Jennifer Buckingham.”
Which is quite extraordinary.
It looks like the first few issues will be free and will be sent to any address in the world. No membership fees or credit card details are required. So I don’t see any reason why you shouldn’t sign up.
The beauty of researchED is that it is a grassroots, teacher-led movement. That makes it a little edgy and a bit different from your usual education fare. So a researchED magazine is exactly what we need.
As my students would bafflingly say: Get around it!
When I google the term ‘metacognition and self-regulation’ (in quotes), the top hits are from the UK’s Education Endowment Foundation (EEF) and Evidence for Learning, an organisation that uses and promotes EEF materials in Australia.
I’ve long argued that this term is a problem because it seems to contain a bunch of things that are largely unrelated to each other. My interest was piqued when I spoke to someone from the EEF who was quite critical of Nick Gibb, England’s schools minister, and Gibb’s focus on phonics and subject knowledge. Instead, EEF evidence was apparently pointing in the direction of ‘metacognition and self-regulation’.
The EEF use ‘effect size’ as their main measure of an educational intervention; something they then translate into additional months of progress that the intervention will supposedly provide. This is flawed because it’s not appropriate to compare effect sizes for studies with different types of outcome test (experimenter designed versus standardised), different subject matter, different age groups, different cohort types (selective or full ability range) and different study designs.
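To make the comparison problem concrete, here is a minimal sketch. It assumes the effect size in question is the common standardised mean difference (Cohen's d with a pooled standard deviation); the EEF's exact calculation may differ. The function name and all of the scores are invented for illustration and are not drawn from any EEF study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardised mean difference using the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical full-ability-range cohort: scores are widely spread.
full_control = [30, 40, 50, 60, 70]
full_treatment = [35, 45, 55, 65, 75]  # every student gains 5 points

# Hypothetical selective cohort: scores are tightly bunched.
selective_control = [56, 58, 60, 62, 64]
selective_treatment = [61, 63, 65, 67, 69]  # the same 5-point gain

d_full = cohens_d(full_treatment, full_control)
d_selective = cohens_d(selective_treatment, selective_control)

print(f"Full-range cohort: d = {d_full:.2f}")
print(f"Selective cohort:  d = {d_selective:.2f}")
```

With these made-up numbers, an identical five-point gain produces an effect size of roughly 0.32 in the full-range cohort and roughly 1.58 in the selective one, purely because the spread of scores differs. The cohort type alone can move the figure, before the outcome test, subject, age group or study design are even considered.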
You see this effect in the EEF results; its own randomised controlled trials (RCTs) often result in lower effect sizes than studies using weaker methods. Nevertheless, it’s worth looking at the data that goes into the EEF’s measure of metacognition and self-regulation. I’ve summarised the studies, plus one additional study, in the table below:
The studies in italics are ones that the EEF have drawn from a review of the available literature. They typically consist of meta-analyses of a range of studies that are not all RCTs. I have stated the kind of outcome that was assessed, as best I understand it, in brackets.
Below these are the studies carried out by the EEF. The final EEF study listed is for ‘Let’s Think Secondary Science’, an intervention based upon the longstanding Cognitive Acceleration in Science Education (CASE) materials. The EEF have not yet included this result in their overall weighted average for metacognition and self-regulation, but they should. ‘Let’s Think’ is actually very similar to the ‘Thinking Doing Learning Science’ intervention for primary students, which is the subject of the other Hanley et al. study in the table.
For completeness, it’s worth sharing that the two Gorard et al. references reflect the two different outcomes of the (in)famous Philosophy for Children trial and the NIESR references reflect a study involving mindset. Torgerson et al. evaluated an explicit writing intervention.
Given this information, how should we interpret the EEF’s claim that, overall, implementation of a metacognition and self-regulation strategy will result in an additional 8 months’ progress and is “High impact for very low cost, based on extensive evidence”?
Firstly, apart from the explicit writing intervention evaluated by Torgerson et al., school leaders should probably avoid the rest of the EEF interventions. Even the one other clearly statistically significant result, with an effect size of 0.22, maps to far less than 8 additional months’ progress (I make it about 3 months according to the EEF conversion chart).
We may achieve a larger effect if we implement one of the non-EEF interventions. However, it was the problems with the methods used by these kinds of studies that prompted the foundation of the EEF in the first place.
The second, and perhaps most fundamental, point that the table highlights is the disparate nature of the trials and outcomes. What does an explicit writing intervention have in common with a mindset intervention or philosophy for children?
Explicit attempts to teach reading and writing strategies have a long track record and are clearly defined, even if the former may provide a limited, one-off benefit. Would it not aid clarity to give these interventions their own categories? After all, we should not assume that adopting a mindset intervention will lead to the same effect as an explicit writing intervention.
Not only are these programmes qualitatively different to the others in the metacognition and self-regulation basket, the mechanism seems far more obvious. It is entirely plausible that teaching children to plan their writing will improve the structure of their writing and lead to higher writing scores. But how is Philosophy for Children supposed to improve maths?
The EEF seem strangely attached to evaluating interventions with obscure or mysterious mechanisms. For instance, in evaluating a core knowledge programme, they taught children about one domain of knowledge and then gave them a reading test in a different domain. Not even enthusiasts for core knowledge would argue this should work because the whole idea is that reading is enhanced by knowledge of the domain, not by knowledge of a different domain.
I wonder whether researchers at the EEF subscribe to a form of genericism, where academic performance is viewed as the result of a set of general, trainable skills. That might explain the way they design some of their studies, as well as their attachment to the metacognition and self-regulation chimera.
For their part, school leaders need to be aware that this category is deeply misleading, that it isn’t an actual thing and that they need to look under the hood if they want to find anything worth examining.
A couple of years ago, I published an ebook, the central claim of which is that education is caught in something of a strange loop. We are condemned to constantly repeat our mistakes.
One of the tools of this cyclical reproduction is a form of thinking that rationalises away the need for change. Periodically, a research project will generate strong scientific evidence against popular and widely accepted practices. This causes shock in the short term and the scales fall from the eyes of a small number of educators. But time soon brings a rationalisation. Critics, pretending to be philosophers rather than educationalists, zoom out, put on their deep-thinking faces and start asking, “But what does it mean to know something?”
In short, it is claimed that education is like really complicated dude. This is then seen to imply that tools for establishing facts in other domains of knowledge – simpler areas like genetics or quantum physics – are totally inappropriate for education. We need something else. It is naive to present supposedly ‘scientific’ evidence about education. In fact, there is a name for it: “positivism”. Given that evidence can now be dismissed because positivism, we can go on believing in the demonstrably false things that we believed in before.
Instead, the suggestion is that education needs theory. But this is not the kind of testable theory that we see in science. It’s deeper and more philosophical than that. As I was once told on Twitter, “Back up your evidence with theory or admit you have no pedagogy.”
Of course, none of this is explained as simply as I have just explained it. Part of the mystery play involves obscuring obvious meaning by deploying long and complicated words and endless definitional arguments. It is a performance, intended to convince you that the performer must be very clever and therefore right.
It is clear that this particular form of (non)logic is holding education back. It is as if medics were still clinging on to theories of the four humours. So, the only thing to do is call out the flaws in this thinking.
Pasi Sahlberg has recently been appointed to the newly created Gonski Institute at the University of New South Wales where he will be working alongside Adrian Piccoli, a former New South Wales education minister.
It’s not clear exactly how Sahlberg will use this role and I will return to that later. But first, it is worth considering why he was appointed.
Sahlberg is a former director general of the Finnish education system. He has written and lectured extensively about the Finnish approach to education. The Finland connection is key. Reporting of his appointment has drawn heavily on the high performance of Finland on tests run by the Programme for International Student Assessment (PISA).
I think there is a problem here. Australians seem to assume that, because Finland is ranked highly on PISA, Sahlberg must be able to offer advice that will improve our system. This is flawed.
Firstly, Finland is very different to Australia. It has a far more homogeneous society. Children start school later than in Australia, but their language skills are already well developed and up to a third can already read. Those who can’t, face a much easier language to learn than English. This is because Finnish has a ‘transparent orthography’, meaning that letters map reliably to sound with little ambiguity (see Shanahan on these factors).
By contrast, English has a more complicated orthography, with many letter combinations representing more than one sound and many sounds being represented by more than one letter combination. This is why an early grasp of phonics knowledge is so important and why a phonics diagnostic screen for Australian students has been supported by many language experts.
It is worth pointing out that England have adopted such a screen and the early signs of its effectiveness are extremely positive. As measured by PIRLS, another international test, the reading performance of the most vulnerable children has improved since its introduction.
However, Sahlberg thinks we should not bother with such a check:
“I think what the government in Australia could do instead is before thinking about these sorts of things is to make sure every child has enough time to play before they come to school.”
I have never met an educator who is opposed to children playing, and so setting this up in opposition to a five-minute screen is a bit like saying that children should not have the polio vaccination because they should be playing instead. Presumably, Sahlberg is not simply opposed to the screen but to the phonics instruction associated with it; instruction that is often playful and that my own children rather enjoyed.
But this returns us to the issue of Sahlberg’s role. On Twitter, he suggested to me that he does not intend to tell Australia to do what Finland has done:
And yet this is exactly what he seems to be doing with the phonics check, as well as his other pronouncements.
If Sahlberg wishes to make these arguments then he should be upfront about this and expect many of us to disagree. I look forward to taking him up on his suggestion that we meet and discuss these issues.
The fact that Sahlberg draws authority from Finland presents us with a further problem. As seen in the graph at the top of this post, Finland’s performance in PISA has been dropping. New research by Altinok, Angrist and Patrinos aggregated scores across a range of measures, and this seems to show that Finland obtained most of the gains in its performance in the 1980s and 1990s:
It is worth noting that the Finnish system has changed over time and there was more control and a more traditional approach in the years when these gains took place. If we want to achieve similar gains then we might be better to look at these policies rather than the ones Finland has enacted more recently (and certainly not ones they intend to enact in the future).
Ultimately, it is evidence to which we should defer, something that has been largely missing from the reporting so far. If Sahlberg has a case to make then he should do so and present his evidence. Borrowing authority from Finland will not do. If he just wants to listen then that’s fine. But he can’t have it both ways.
John Kenny has also written a post on this topic. It’s well worth a read.
On this blog, I have repeatedly expressed unease at the terms ‘no-excuses’ and ‘zero-tolerance’ when applied to schools. I have never visited a school that describes itself in this way, but I doubt whether either phrase is completely accurate. I am also uneasy about much of the criticism these schools attract because it seems to be a complaint against any kind of strong behaviour policy at all. And I am in favour of strong behaviour policies.
I was therefore interested to read a new critique, based on research funded by the Australian government. The article cautions us against adopting what the author, Dr. Linda Graham, sees as a British model of ‘no excuses’ schools. Graham draws on statements made by Jonathan Porter, a UK ‘no excuses’ advocate. Porter’s statements suggest that such policies enable all teachers to teach, not just the particularly charismatic or experienced ones.
I share Porter’s reasoning when I advocate strong behaviour policies. Students should respect their school as an institution and therefore they should respect any teacher who represents that institution, unless and until that teacher does something to forfeit this respect. There should be no need for a long and tortuous process whereby every individual teacher has to work to earn this respect. Not only is this wasteful of teaching time, it is inequitable. Teachers can be the victims of racism, sexism and homophobia, and experience suggests that tall, middle-aged teachers who have been at a school for a number of years and who hold positions of authority, find it easier to earn respect than small teachers, new teachers, young teachers, teachers with strong accents, casual relief teachers and so on.
Graham challenges this position, arguing that the factor which impacts most upon classroom behaviour is the quality of teaching. If students are forced into compliance, they cannot communicate that the quality of teaching is poor and so ‘no excuses’ offers cover for ineffective or lazy teachers.
I am dubious about this contention because I don’t accept some of the paper’s premises and I don’t agree that it is supported by the research evidence presented.
Firstly, I am deeply sceptical of the idea that poor behaviour is a form of communication. If that were the case, why would humans spend so much time and energy trying to hide their poor behaviour? Poor behaviour derives from personal and situational factors that are complex, and it often seems motivated by the desire to please friends or to avoid challenging or boring tasks or tasks that threaten the ego. Sometimes, it seems to be about asserting a position in the peer hierarchy. None of these motivations are necessarily about sending a signal to a teacher or other authority figure.
Secondly, the author seems to be worried that ‘no excuses’ is the project of a shadowy group of figures who I don’t recognise:
“The most recent strain of the crisis rhetoric appears to be driven largely by teachers identifying as ‘neo traditional’, particularly in England where these teachers are well-connected by social media. The neo traditional teacher favours teacher-centred instruction, deplores inclusion and differentiation, and promotes strict whole-school ‘no excuses’ discipline policies modelled on an extreme interpretation of behaviourism.”
I know a lot of people who favour teacher-centred instruction and it is true that the advocates of ‘no excuses’ tend to fall into this camp. However, I don’t think they identify as ‘neo traditional’ because this oxymoronic term is mostly used pejoratively. Neither do they ‘deplore’ inclusion or differentiation; a strangely emotive term to use in an academic paper, particularly when you realise that this paper was published in the International Journal of Inclusive Education and you consider the likely views of this journal’s readers.
I am aware of some well-reasoned critique of common approaches to inclusion and differentiation – I’ve written some myself – but this is far more nuanced than Graham allows. Does Graham wish to force us into a binary: that we must either accept inclusion and differentiation entirely on her terms or we must ‘deplore’ them?
From there, the paper becomes quite strange. In an attempt to show that ineffective teachers are the major cause of student misbehaviour, we are presented with a series of anecdotes about two teachers who have been observed as part of a research project. Mr Smith is a really bad teacher who manhandles the students and is constantly berating them, often unfairly. Miss Jones, on the other hand, is really good. She sets clear rules and routines, such as getting the children to sit with their hands on their heads, and reinforces these with rewards. She is also really nice and says, ‘bless you,’ when a child sneezes. (I’m just imagining what the reaction would be on Twitter if a ‘no excuses’ school ‘forced’ children to put their hands on their heads, but that’s another matter…)
These observations left me feeling uncomfortable. I am pretty sure that if I were one of these teachers and I read this paper then I would recognise myself from the details given. The way these passages are written, and the way the teachers are described, reminded me that Graham is an academic who has never been a teacher herself. Mr Smith is not described at all sympathetically.
Moreover, I find it odd that Graham builds her case based on anecdotes about just two teachers and I doubt whether such evidence can prove anything much. As it stands, it actually offers some support for the kind of behaviourist approach espoused by ‘no excuses’ advocates. The obvious point to raise is that these two teachers are not working in a ‘no excuses’ school. Perhaps Mr Smith would be a more effective teacher in an environment where rules, routines and consequences were determined at a whole-school level – rather than one where these decisions were his to make – and where he received some training in how to apply these procedures. We don’t know, but he clearly needs help and support. I’m not sure that he needs researchers earnestly documenting his every failure.
Perhaps aware that her own evidence offers support for rules, routines and consequences, and therefore behaviourism, Graham attempts to use a lesson observation rating scale – CLASS – to show that the teachers differ in other aspects of their abilities and not just classroom management. I am unconvinced by this, given the validity and reliability issues faced by classroom observation instruments of this kind.
And so, there we are.
It’s hard to know what to make of it all. Graham clearly wants to put the boot in to ‘no excuses’. The trouble is that rather than brandishing a boot, she appears to be waving an old sock with the words, “there might be something in ‘no excuses’ after all,” stitched into it.
Does your school group students into classes in a given subject based on their level of advancement? I use this rather clumsy phrase because I have learnt that terms some people use – e.g. setting – are not shared by others. If your school doesn’t group students in this way, and you have more than one class running in a subject, then there is still a chance that you end up with group differences. This may be due to the way classes are blocked against each other or just due to random variation.
If this is the case, how do you decide which teacher should teach which group?
I am aware of schools where the head of department somehow ends up taking the most advanced classes and new teachers get the less advanced ones. If there is any attempt at justification, it is usually around the subject knowledge needed to stretch the most advanced students.
But let’s take subject knowledge off the table for a minute*. Let’s assume that any teacher you assign to a class either has the requisite subject knowledge or is willing and able to plug any holes through preparation. The difference between a new teacher and a more experienced one will now be in pedagogical content knowledge, a posh label for knowing the bits kids find hard, having a good analogy to draw upon and so on.
In this situation, if you give a new teacher a less advanced group and an experienced teacher a more advanced group then you would expect, on assessments conducted through the year, that the less advanced group will gain lower scores. How much lower should they be? You don’t know.
If you reverse the situation, then you would expect the gap to be smaller. If the new teacher is teaching the more advanced group and these students actually underperform the less advanced group in a particular area of the curriculum, or in a particular question type, then you can make a pretty strong inference that the experienced teacher is drawing on some pedagogical content knowledge the new teacher lacks. You can then attempt to identify it and make it explicit in your curriculum documentation for the following year.
So new teachers should be given the more advanced classes.
*If you have a teacher who lacks subject knowledge then you have a real problem. However, I still can’t justify assigning this teacher to the least advanced students because these students will find it harder to learn implicitly and will need more explicit teaching.
I woke up this morning to find my Twitter timeline flooded with comments about Ofsted, the English schools’ inspectorate.
Having worked for 13 years in English schools, I went through a number of Ofsted inspections. Some I felt were fair and others less so. For those who don’t work in the system, it can be hard to overstate the effort that schools expend on ensuring a good Ofsted report.
There are two main issues that people are discussing at the moment:
1. Ofsted have abandoned grading individual lesson observations and yet some schools still do this
Ofsted abandoned grading lesson observations for a very good reason. The best evidence we have suggests that such grades are invalid and unreliable. The MET project, conducted in the U.S. by the Gates Foundation, did manage to reach mediocre levels of validity and reliability but this involved observers watching videos rather than being in the room, multiple observations of each teacher by multiple observers and the teachers being unaware of which rubric they were being assessed against. There is absolutely no chance of Ofsted or senior leadership observations getting anywhere near this. An article by Robert Coe is worth reading on the issue.
So what should Ofsted do about schools that are still grading lessons? Is it their responsibility? Should they just shrug and blame the school leadership?
Ofsted clearly created this problem so they need to take some responsibility for cleaning it up. If they started to write things in their reports like, “The school leadership still spend time and effort grading individual lessons even though this practice is not supported by the evidence,” then it would cease pretty much overnight. So this is what they need to do.
2. Ofsted suggest there is no need to prepare for an inspection but lots of schools do anyway
Schools that know they are about to be inspected can do a number of things. They may ask teachers to get their marking up to date. They may collect lots of data. They may ensure that the litter is picked up and that certain students are not around. You can even buy a book from a former inspector on how best to prepare for a visit.
Teachers are likely to put themselves under pressure, crafting long-winded lesson plans they wouldn’t normally bother with and dreaming up whizzy starter activities.
So, what should Ofsted do about this? Should they simply tell schools and teachers that preparation is unnecessary? Is it the fault of schools?
No, it is the fault of the inspection regime. Ever since I can remember, there has been talk of no-notice inspections, and with good reason. If inspections are scheduled unpredictably and you don’t know one is coming, then you can’t prepare. So this is what they need to do.