# Example-problem pairs

Over the past year, I have been thinking a great deal about the implications of the following section from “Cognitive Load Theory” by Sweller, Kalyuga and Ayres*:

“Trafton and Reiser found that for an example to be most effective, it had to be accompanied by a problem to solve. The most efficient method of studying examples and solving problems was to present a worked example and then immediately follow this example by asking the learner to solve a similar problem. This efficient technique was, in fact, identical to the method used by Sweller and Cooper (1985) and followed in many other studies. It was notable that the method of showing students a set of worked examples followed later by a similar set of problems to solve led to the worst learning outcomes.”

It is something that we have discussed as a maths department as we try to improve our planning documents and resources. How best can we utilise this effect when teaching our students new concepts? I admit that my standard practice has often been to present a set of worked examples, and this does not seem to be the most effective strategy.

One approach might be to present an example alongside a similar problem.

I know some people are PowerPoint-phobic but please don’t be put off by that. It could just as easily be presented in a booklet or on an iPad screen. I like PowerPoint slides because I can project them onto a standard whiteboard and then scribble over and annotate them.

The above example is useful for illustrating the idea but it is abstract. I have been wondering exactly how close to the original problem a ‘similar’ problem must be. In the statistics example-problem pair below I have changed the context. In other respects, the problem is very similar to the example. For instance, the information is presented in the same order. Could such an example-problem pair have an impact on transfer? It certainly highlights the deep structure.

My other concern cuts to the heart of cognitive load theory. There are those who posit that we can lower cognitive load too far and that some load is necessary for learning. This led to the introduction of the notion of ‘germane’ cognitive load – productive load – into cognitive load theory. It has caused problems because it makes the theory potentially unfalsifiable. Sweller explains why in an email comment for a post on my old blog:

However, the potential issue remains. We could possibly have activities that are too fully guided and that therefore lead to little learning. Example-problem pairs seem to minimise the amount of struggle and so you might think that they would qualify as such activities. They certainly don’t seem to induce any ‘desirable difficulties’. Yet the evidence suggests that they are effective.

And fully guided strategies such as Engelmann’s Direct Instruction also seem to work. I have heard anecdotal reports that students and teachers sometimes don’t even notice the progress that is made until they conduct a new assessment. How can this be if they are making an effort to learn? Surely there must be a limit to how far we can reduce load?

If we substitute activities with ones that involve students thinking about something other than the targeted learning then they won’t learn much. This is what happens when we ask students to complete wordsearches to learn key words or in Dan Willingham’s famous example of baking biscuits to learn about The Underground Railroad.

But I wonder whether, as educators, we have a systematic bias towards providing too much load. Therefore, anything that we come up with that reduces load seems to work (with novice learners). There might potentially be a lower limit – it seems reasonable to suggest that optimal learning will take place when we have a full but not overloaded working memory. It’s just that we haven’t reached it yet. Even with example-problem pairs, there is still plenty for a novice to think about.

*Sweller and Kalyuga are my PhD supervisors.


## 16 thoughts on “Example-problem pairs”

1. “Yet the evidence suggests that they are effective.”

Is the evidence so unambiguous here? For example, van Merrienboer et al. write, “Learners may only briefly look at the worked-out examples and only consult them when they have difficulties in performing their tasks.” They suggest other tasks to take the place of worked-out examples.

It makes sense to me that there would be a gap between the effectiveness of worked examples in the lab and in the classroom. Though Sweller typically assumes maximum levels of motivation in his work, in the classroom I often find a too-easy task can lead to low levels of motivation.

“I have been wondering exactly how close to the original problem a ‘similar’ problem must be.”

What a great practitioner’s question. I suspect that it’s going to vary for the group and the content. For what it’s worth, I like the first slide much more than the statistics slide. I would predict that kids won’t notice the deep structure of the statistics example because they won’t even read the text (the example makes it too easy to jump straight to the variables) and hence won’t learn to ignore it.

“It has caused problems because it makes the theory potentially unfalsifiable.”

This line hasn’t made much sense to me, and I’ve been thinking a lot about it lately. If it truly renders the theory unfalsifiable in a deep way, presumably Sweller and others would not have introduced it in 1998. And if it also rendered the theory unfalsifiable in a deep, obvious and direct way (as is suggested, no?) then Kirschner and van Merrienboer and so many others wouldn’t continue to embrace germane load.

My best read of this line is that it means something a bit different than “CLT would be unfalsifiable.” Instead, it would mean that it would become much harder to use measurements of net load to indicate reductions of extraneous load. For Sweller and others, it’s very important to be able to interpret reductions of net load as indicative of reductions of extraneous load, but with the presence of germane load it becomes possible to interpret such net reductions as not due to the reduction of extraneous load but the increase of germane load. This allows for sloppy science, unless you carefully control for germane load.

My read has therefore been, “germane load makes it easy for sloppy science to occur, since changes in net load can be sloppily interpreted as either changes in extraneous or germane load.”

Does that interpretation make sense? Or am I off? If I’m off, how did Sweller mess up in 1998, and how are so many others missing the unfalsifiability of CLT today?

• I wrote something sloppy that I need to revise:

“it becomes possible to interpret such net reductions as not due to the reduction of extraneous load but the increase of germane load.”

Sorry. I meant, “It becomes possible to interpret better learning when there’s a reduction of net load as a result of increased germane load instead of as simply a reduction of extraneous load.”

2. Regarding the statistics example I would shoot any student who produced an answer like the one given, without ANY explanation. I know what the formula is, but do they? They certainly don’t need to know in order to stuff the numbers in.

• I take your point but that’s one slide out of a lesson. And they do also need to be able to stuff the numbers in.

• Properly presented work IS sufficient explanation. I hope that’s what you mean @howard. I would never want to see an essay for the solution of a problem like this. It is a slide, and so must be somewhat telegraphic. I do quibble slightly that the concluding sentence is a fragment that strictly doesn’t say anything, how about “So the required confidence interval is”? In the previous line the only addition I would like to see in student work is to precede with “SE = … ” to make clear that the student knows what they are obtaining with this formula. I don’t want to see an essay about the formula, separate explanation of each of its parts or some recitation of its derivation.

• I quite agree. Let’s blame the interactive whiteboard!

3. So the only difference between effectiveness and ineffectiveness when presenting similar problems to worked examples is the number of each? If n worked examples are shown and n similar exercises are given immediately after, then with n=1 effective learning happens, but with n=5 the effect is significantly less? Does this manifest as an observable function of n? Does n=2 give similar results to n=1, or does it immediately become less effective? Is there a point where it doesn’t matter, and say n=5 is not significantly different from n=10? Those things might be informative.

What do we know about non-live worked examples? As a student I always found examples in the text mirroring assigned questions (initially) to be very helpful, and my students always demand this. That is the highly successful model of the Schaum’s Outline series of university learning aids that I often recommend to my students: Fully worked examples, then exercises with answers in the back.

There are benefits to teacher-worked examples that are not evident simply by considering CLT: they provide a model for what work should look like. The primary motivation for this, I confess, as a professor, is that it makes my life easier when evaluating student work. However, there is an objective value in teaching (by example) proper work habits. How an algebra problem is arranged on the page, and the flow of that work, is often critical to students’ later success in the material. Teachers should be modelling correct habits and this should translate into students learning the same. My students suffer under my examples because (a) I have extremely poor penmanship, both on paper and on the blackboard, and (b) I tend to skip steps and sometimes must revisit to fill in gaps, so they don’t always see strict linear presentation of a logical progression of thought. While I have learned, like a very good golf hack, to “score well” over my career, it does my students no favours to pass along my bad habits to them (“do as I say, not as I do”?). These recent years of concern with educational advocacy have led me to reflect on this problem and I am trying to fix these things. Can an old dog learn new tricks? (Depends on the cognitive load and the breed of dog perhaps). I started working on blackboard handwriting – with old-fashioned drills for myself – and was intrigued that a noticeable effect was had after a single 20-minute session. Encouraging.

“Example-problem pairs seem to minimise the amount of struggle and so you might think that they would qualify as such activities. They certainly don’t seem to induce any ‘desirable difficulties’. Yet the evidence suggests that they are effective.”

That may be because what is being learned and measured is not the kind of learning that arises from so-called “productive struggle” (something in me really dislikes that phrase though I acknowledge it does seem to capture an important idea). Perhaps it measures the transfer of a correct pattern of thought and/or work. It may be what I’m pointing to above: work habits. If so it suggests there is a distinct category of learning that isn’t being captured by that experiment. Rather than merely pairing one-on-one examples to exercises like this, what if learning was sequenced so that each pair is followed with an exercise with built-in variation? Could we measure a micro-scale version of “expertise reversal”? What I mean is once a student does one (or a series of) worked exercises to learn the pattern, will they have the, um, expertise, to extend the pattern within some reasonable realm of variation? I think that is a reasonable thing to ask. One could try the variation with and without the worked example and also the intervening exercise to determine the relative effects of each.

Finally, I teach university students, so I wonder if these CLT experiments are germane to my work, as young adult brains are different in certain aspects relative to these questions because of PFC development. I am repeatedly astonished at how many studies of university students’ response to teaching interventions are used to infer things about early years education. Is that not a perilous stretch? Are worked-example-effect studies primarily performed on adult subjects, or young children … or both?

4. From what I know about worked examples, the goal is to focus students on the (domain specific) problem-solving process, rather than on the outcome as such: this helps to learn the procedural knowledge aimed for. What I’m missing here are the in-between versions of worked examples vs original problems, such as goal-free tasks etc. I wonder what would happen if students were given cards with similar tasks/problems in varying degrees of worked-out state, with the assignment to complete all, AND coupling that with metacognitive support (modelling before the card task, metacognitive hints during the card task).
