Differentiation: Good intentions are not enough

The Cambridge-Somerville Youth Study was a ground-breaking piece of research. It followed a mentoring initiative for boys from deprived suburbs of Boston that ran from 1939 to 1945. At the time, this initiative was almost unique in the field of social sciences because it also followed a matched control group of boys who did not receive the mentoring intervention. Through the efforts of researchers, the subjects were traced over a long period of time so that outcomes such as involvement in crime and stability of relationships could be compared.

The study was the subject of a recent Freakonomics podcast that serves as a warning to any of us who wish to intervene positively in the lives of young people. The boys who were given the intervention were overwhelmingly positive about it. And it’s not hard to see why. According to the criminologist, Brandon Welsh:

“The counselors would meet every couple of weeks with the boys, interact with them, help them with homework, take them to the YMCA. During the summer months, some of the treatment-group boys were able to go to summer camps and so were sent out of the city.”

Yet when the data was analysed thirty years later, the boys who received the intervention had fared significantly worse than the control group on a whole range of outcomes related to criminal behaviour, health and work.

Clearly, good intentions are not enough when it comes to social interventions. We cannot fool ourselves into thinking that the worst that can happen is that we have no effect. Sadly, it is possible to do harm. As Denise Gottfredson, professor of criminology and criminal justice at the University of Maryland, states in the podcast, “People just assume that if you do something that sounds good, that it’s going to have positive effects. But it’s actually more complicated than that.” She then goes on to explain that the somewhat counterintuitive result may be due to students in the intervention providing a form of validation for each other’s behaviour.

This is just one example of why there is a clear, ethical imperative to collect quantitative data. We cannot simply rely on our own good intentions or on interventions that are based in theory. It is quite wrong to simply dismiss calls for empirical evidence as ‘positivism’ while continuing to intervene with young people in ways that might actually be harmful.

This is why I am so concerned about particular models of differentiation that have become popular in schools. As I have previously written, Universal Design for Learning (UDL) is such a model that lacks evidence to support a positive effect on academic outcomes. One UDL website promotes the program with spurious images of brains. A partner website has a section on the evidence for UDL. However, this section asks visitors if they have any evidence to support these claims. Seeking confirmatory evidence is basically the opposite of science.

One of the features of these forms of differentiation is that they often allow students to present what they have learnt in a variety of different ways. This may be appropriate if a student with a disability simply cannot complete a certain type of task. However, I am deeply concerned about the practice of offering such alternatives to students who have difficulties with a particular task. If a student habitually avoids writing, for instance, then she is never likely to improve at writing and will be severely disadvantaged as the gap between her and her peers grows ever wider. Instead of planning for alternative ways of demonstrating learning, we should be focusing our efforts on intensive and explicit writing instruction.

I might be quite wrong about this, but how would we know?

I have criticised forms of differentiation in the past and it has resulted in some pretty unpleasant commentary. I have been accused by Dr Linda Graham of disregarding the Australian Disability Discrimination Act, something I strongly dispute. This kind of authoritarian stance can only serve to silence criticism. Yet as an education community, we desperately need to think more critically. With UDL gaining in popularity to the extent of featuring in an Australian Senate report as an example of good practice, we need to ask whether it is possible, just possible, that it does more harm than good.

I was struck by a post by John Kenny who wrote about a presentation he had recently attended. Advised by the presenter that he should let students with Language Processing Disorder present their learning in the form of a video rather than in writing, he questioned this. According to Kenny, “When I raised this concern with the presenter, she simply stated that it was not fair to make students constantly do what they are not good at, that they should be given the chance to shine at things they are good at.”

It is clear that such views come from a place of deep compassion and concern for students; of this there is no dispute. But what if compassion and concern are not enough? After all, they were not enough in the Cambridge-Somerville study. What if, in our rush to apply ideas that we think are sound and that suit our ideological outlook, we are actually doing harm? We will never know if we focus on silencing criticism and refuse to support the kinds of quantitative trials that could answer these questions.

What is the most useless problem solving strategy?

Problem-solving is one of those twenty-first century skills that schools are supposed to be teaching in order to prepare students for jobs that haven’t been invented yet. It is a key component of the ‘critical and creative thinking’ general capability of the Australian Curriculum. Yet problem-solving is a tricky concept. I certainly don’t believe it is a discrete skill like a golf-swing that can be developed through teaching and practice.

David Geary’s distinction between biologically primary and biologically secondary knowledge perhaps offers a useful way of thinking about problem-solving. General problem-solving skills are biologically primary because they have evolved alongside the human mind: we have always had to solve problems in a general sense. Tricot and Sweller suggest that means-end analysis is a good example of such a strategy. This is essentially the tactic of working backwards from the goal in order to assess your progress towards it. We all possess this strategy already and so there is no need to teach it to children.

Biologically secondary knowledge builds upon primary knowledge. Means-end analysis is at the root of maths problem-solving (unless we engineer its absence) but we then complement this by teaching a series of domain specific strategies. For instance, knowing how to factorise a quadratic is really useful if you need to solve a problem involving quadratic equations, yet it’s not much help in solving other kinds of maths problems, let alone the problem of a blocked drain.

One of the difficulties that maths teachers encounter is in enabling students to transfer problem solving procedures from one context to another. For instance, a student may be able to factorise a quadratic equation but may not realise that this is what is required in a particular problem. My approach to this is an explicit one where students begin work on problems that look very similar to each other but then gradually work through problems with varying contexts. At a later stage, different problem types need to be interleaved so that students may learn to spot the deep structure and therefore the type of solution required. All of this proceeds with lots of teacher guidance and demonstration and plenty of formative assessment to keep the trains on the track.

However, what if we could develop more general problem-solving strategies that could shortcut this process? These may be completely general skills like means-end analysis that can be applied to any problem or they may be intermediate strategies that apply to a broad range of problems within a particular domain. There is little evidence available that we can learn general problem-solving strategies but what of these more middling kinds of strategies?

For instance, ‘draw a diagram’ is a reasonably useful strategy to help solve a very broad class of physics problems. It might not help much in other kinds of problems but it has a general applicability in physics. Nevertheless, it is nowhere near as useful as a specific strategy for solving a particular problem. It works as a cognitive aid to enable us to focus on the deep structure.

Some intermediate strategies have potential and they seem similar in nature to reading comprehension strategies. Research suggests that reading comprehension strategies produce a quick boost in reading performance but that they don’t really function as a skill in the sense that repeated practice of these strategies increases their effect. Perhaps ‘draw a diagram’ is a useful heuristic that operates in a similar way.

I can’t see any value in some intermediate strategies. For instance, ‘solve a simpler problem’ only works if you already know the deep structure of the problem because the two problems need to share structure for the simpler one to be useful. Yet if you already knew the deep structure then you would not need a more general problem-solving strategy.

Dr Jennifer Buckingham at researchED Melbourne, 1st July 2017

As you may already know, it was my job to film some of the sessions at the researchED Melbourne event held on the 1st July at Brighton Grammar. These videos will be available via the researchED YouTube channel. So far, I have managed to load my own presentation on to this channel as well as the presentation above by Jen Buckingham.

In her presentation, Jen discusses the evidence for systematic synthetic phonics in the context of the five skill areas that are involved in reading. She also looks at some of the barriers to implementing this evidence. I enjoyed the talk and it was well received by the audience. There was some controversy at the time when screenshots taken by audience members were posted on Twitter. If you were aware of that controversy then you may now place these screenshots in the context of the whole talk.

Fishing for red herrings

Let’s create an abstraction: the northern hemisphere is more industrialised than the southern hemisphere. On the basis of this abstraction we might make predictions. For instance, we might suggest that measurable levels of air pollution are likely to be a more significant problem in the northern hemisphere than the southern hemisphere. We may also predict that this would affect rates of conditions such as asthma. If so, we might be able to draw a link between asthma and air pollution.

You might object to this model on a number of levels. For instance, you may find it trivial and uninteresting, or you may find evidence to refute one or more of its predictions. However, if this discussion were taking place on social media then you might come across some of the following arguments:

1. Disputing the definition

“What do you mean by ‘northern hemisphere’? The Earth is not actually spherical so it can’t have hemispheres. And anyway, the term ‘northern’ can mean different things in different contexts e.g. a ‘northern accent’ in England is something completely unrelated to industrialisation. And what do you mean by ‘industrialised’? The root of ‘industry’ means effort or diligence and this is surely unrelated to geography. Are you just talking about the preponderance of secondary industries? If so, why not state this instead? Unless you can define ‘northern hemisphere’ and ‘industry’ to my satisfaction then I don’t think I, or anyone else, can be clear about what you are claiming.”

2. Disputing the distinction

“Where does the northern hemisphere start and the southern hemisphere end? Hardly anyone lives at the poles. There are plenty of people living near the equator so I simply don’t think that your distinction makes sense. And within what you are calling the ‘northern hemisphere’ lies a diverse range of environments including highly populated cities and sparsely populated wildernesses, with everything in between. It’s a continuum, not a binary. And there are other ways of dividing up the world. What about continents? What about coastal areas versus inland continental zones?”

3. Raising the irrelevant

“And are you aware that the southern hemisphere has more marsupials than the northern hemisphere? And satellite photography demonstrates that trees and oceans cover more of the Earth’s surface than industry. By the way, the first person who attempted to count factories made some errors in his analysis and didn’t include those in the Soviet sphere of influence (let me give you the details of this at some length). Industries change over time – we don’t make many steam engines any more. Your simplistic dichotomy fails to represent every single aspect and detail of the world and its history.”

So what?

Some of these arguments obfuscate by attempting to deny us the use of well-established terms, distinctions or abstractions. That is a significant problem. However, the majority can be answered with a simple, ‘So what?’ My hemisphere model was presented as a way of making predictions about air pollution. You may think the model is trivial but its purpose is clear. Most of the points raised in 1, 2 and 3 are of no consequence to this. Models are not meant to be facsimiles of reality that capture every single detail. If they were, they would be useless as models. Models derive their power by being a relatively simple way of capturing features of a complex thing.

Imagine I am giving directions to a hiker and suggest taking the path next to the field that is full of ‘cows’. It’s not particularly helpful or insightful for a third party to suggest that the cows have different colours and ages and that, even at a genetic level, there is continuous variation between examples of what I am blithely labeling as ‘cows’. “Is a three-legged cow still a cow?” etc…


I have recently decided to disengage from many discussions of this sort: ones that fail the ‘So what?’ test. In certain cases, I have come to view these arguments as disingenuous. A few people on social media have made explicit their attempts to influence and shift opinion by less-than-straightforward means and it has made me view some of these debates as fitting that strategy. In other cases, I really do think that people believe they are displaying intellectual gravitas by suggesting that all cows are not the same. Whatever the case may be, I see no value in constantly addressing irrelevancies. If you ask me to comment on something and I direct you to this piece then that is the reason.

This does not mean that I am banning such discussions or that I have instituted a set of rules with which I intend to police social media. People are free to discuss whatever they want. I am free to engage with those discussions as and when I wish. And it does not mean that I will stop criticising ideas myself.

We need to talk about dual coding

Dual coding seems to be gaining in popularity. It even made its way into a Deans for Impact blogpost as a way of psychologically manipulating teachers to abandon their belief in learning styles. However, I’m not sure we all have the same understanding of what dual coding actually means.

Essentially, dual coding is based upon the idea that our working memory has separate channels for processing verbal and pictorial information. This has a number of key implications, many of which have been incorporated into Mayer’s Cognitive Theory of Multimedia Learning (CTML).

The “Modality Principle” in Mayer’s theory is essentially the same as Cognitive Load Theory’s “Modality Effect”. This is where, for instance, a physics teacher might explain the function of the components of an electric motor while displaying a diagram or simulation of the motor. The diagram can be processed separately, and in parallel, to the verbal information.

I have two concerns about how ‘dual coding’ might be making its way into the wild.

1. The joy of text

According to CTML, text is cognitively demanding because the text symbols must first be processed in the pictorial channel before being transferred to the verbal channel to be processed as virtual sounds. So if your ‘picture’ component actually has a lot of text on it – for example a concept map or an annotated timeline – you may not enjoy the benefits of dual coding because this text would interfere with the verbal explanation.

2. It’s about teaching not studying

As should be clear, the modality principle is about presenting information to students. It’s not about the best way for students to review material. I am not aware of evidence that doodling pictures while rereading your notes is a particularly effective form of studying but if someone has the evidence for that then I’m happy to be wrong.

Has cognitive load theory been debunked?

Has cognitive load theory been debunked? In short, no. However, you might get that impression if you saw a tweet by @Research_Tim and the subsequent discussion.

The tweet references a paper by Naismith and Cavalcanti (2015) that reviewed efforts to directly measure cognitive load. This has been attempted a number of ways including simply asking people how much mental effort they are expending and asking people to complete secondary tasks to see how much working memory capacity is being used. The authors suggest that these measures are not very good.

The findings in this paper lend support to cognitive load theory rather than debunk it. To understand why, we need to be clear about what cognitive load theory is.

Cognitive load theory posits a relatively simple model* of the mind. On the basis of this model, it makes predictions about different instructional procedures (teaching methods). Verifying or falsifying these predictions therefore requires us to run tests where one instructional procedure is compared with another. This process of attempted falsification is important because it causes us to refine or perhaps even set aside our theories.

For instance, cognitive load theory predicts that for relatively complex tasks, such as solving a physics problem or composing a paragraph, novices will learn more by studying a worked example than by problem solving. So you could falsify cognitive load theory by showing the opposite result or a null result. Interestingly, this has already happened. Early in the development of the theory, experiments were run on geometry and physics problems that involved the use of diagrams. The worked examples were no more effective than problem solving. However, by redesigning the worked examples so that relevant information was placed directly on the diagram rather than in a key at the bottom, researchers again found them to be more effective than problem solving.

This is the origin of the ‘split-attention’ effect and it added an interesting component to cognitive load theory with practical significance for the design of worked examples. It also directly linked to the explanatory mechanism in the model – the need to integrate information from two different places needlessly increased cognitive load.

Measuring cognitive load directly is therefore not necessary for the development of the theory (for instance, I haven’t attempted it yet in my own research). Yet it is clearly an avenue worth pursuing because it might shed further light on how cognitive load varies for different tasks and therefore offers the prospect of further refinement.

Unfortunately, as the Naismith and Cavalcanti paper describes, these direct measures of cognitive load are not as valid as we would like, which is unsurprising when you examine the detail of how they are conducted. When they reviewed the literature, they found that the basic idea of cognitive load theory – that higher cognitive load would impede learning – was not always present in the data. However, given that the measures of cognitive load lacked validity, it was hard to draw conclusions from this.

However, they did find that, “Studies reporting greater validity evidence were more likely to report that high CL [cognitive load] impaired learning.” And that is in line with the predictions of cognitive load theory.

So no debunking today. Critics will need to wait for that.

*I have written an FAQ on models that addresses a common issue that people raise when discussing cognitive science

FAQ: Models

Science proceeds by developing models to approximate reality and then testing the predictions of these models through observation or experiment.

Science does not deny the existence of an underlying reality and yet it does not claim to know or describe it. All it claims is the predictive and explanatory power of its various models.

While I have only recently come to explicitly draw this distinction, this is something I think I have implicitly known for a long time due to my physics training. In physics, we are constantly confronted by the incompleteness of our models. The best example of this is wave-particle duality where we have two conflicting models that both accurately predict different aspects of the behaviour of fundamental particles: the model of a wave and the model of a particle. These two models can be reconciled in a mathematical framework that is impossible to put into words by drawing on analogies from the everyday world.

Why is this important? Because there is a tendency for people to dismiss models in cognitive science by pointing out that they are incomplete or that they do not describe every aspect of the working of the mind. This is a misunderstanding.

For instance, in ‘Why don’t students like school’, Dan Willingham posits a simple model of the mind in order to help explain some constraints on learning. This model consists of the environment, the working memory and the long term memory. Some have criticised it on the basis that it is an oversimplification; that the working memory has sub-components – such as the phonological loop – and that the model excludes elements such as the sensory buffers. It may be an oversimplification, but in order to demonstrate this we would need to see how the addition of these elements would change the predictions Willingham makes on the basis of this model. If they don’t change these predictions then they are irrelevant to Willingham’s argument. If they do change these predictions to ones that are less aligned with experiment and observation then we should definitely leave them out. The only case where we should include them is if they change the predictions of the model in a way that is relevant to Willingham’s argument and that represents a superior description of reality. Otherwise, there is value in keeping it simple.

After all, it’s models all the way down.