A new test of ‘productive failure’

Productive failure is a concept that has become popular in recent years, particularly among maths teachers. It derives from a number of studies by Manu Kapur in Singapore that have been disseminated through articles and blog posts aimed at teachers and the general public. Productive failure follows in the footsteps of similar strategies such as invention learning or introducing desirable difficulties.

Advocates of productive failure accept the need for explicit teaching. However, they propose that students should be allowed to struggle a little and attempt to solve problems for themselves before they are explicitly taught a solution method. There are three mechanisms that they propose for why productive failure is beneficial:

There has been some criticism of the experiments that have been used to support productive failure. These studies have not always varied one factor at a time (see questions at the end of this article). When they have been more rigorously designed, these studies don’t always replicate real-world forms of instruction. For instance, in a 2014 study, it seems that students in the control condition were taught a solution method and then asked to solve the same problem in as many ways as possible. It is doubtful that an ordinary maths teacher would use such an approach. And there has been conflicting evidence – studies where the advantages of productive failure have not materialised or appear more nuanced.

A different criticism is theoretical. Cognitive Load Theory has been successful in explaining many learning-related phenomena and generally predicts that, for novice learners, fully guided instruction and the use of worked examples are superior.

This tension therefore makes for an interesting area of research because there are two theories that make different predictions.

And so it was with interest that I read a new paper that tests the predictions of productive failure. The paper is by Likourezos and Kalyuga and it is worth pointing out that Slava Kalyuga is one of my PhD supervisors. The study was carried out with high school maths students who were learning a geometry topic over a six-week period. I have summarised the design in the diagram below:


There were eight regular maths lessons, each split into two 30-minute phases. At the outset, students were randomly allocated to one of three conditions. For the first 30-minute phase of each lesson, they received either fully guided instruction in the form of worked examples, partially guided instruction that included some scaffolds, or unguided problem solving. The second half of each lesson consisted of explicit instruction to the whole group. There were 24 students in each condition and the pre-test showed no significant variation between the groups.
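For readers who want to picture the allocation step, here is a minimal Python sketch of randomly splitting 72 students into three equal groups. It is not taken from the paper: the function name `allocate`, the student IDs and the fixed seed are all illustrative assumptions of mine, and the researchers may well have randomised in a different way.

```python
import random

# Condition labels follow the description above; everything else is hypothetical.
CONDITIONS = ["fully guided", "partially guided", "unguided"]


def allocate(students, conditions=CONDITIONS, seed=None):
    """Randomly split students into equally sized groups, one per condition."""
    if len(students) % len(conditions) != 0:
        raise ValueError("students must divide evenly across the conditions")
    rng = random.Random(seed)   # a local generator, so a seed makes the split repeatable
    shuffled = list(students)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    size = len(shuffled) // len(conditions)
    return {
        condition: shuffled[i * size:(i + 1) * size]
        for i, condition in enumerate(conditions)
    }


# 72 made-up student IDs, which gives 24 per condition as in the study.
students = [f"student_{n:02d}" for n in range(1, 73)]
groups = allocate(students, seed=1)
```

Shuffling the whole list once and then slicing it guarantees equal group sizes, which drawing a condition independently for each student would not.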

The researchers found essentially no difference between conditions in a later post-test. This test consisted of two parts: a component containing questions that were very similar to those used during the teaching, as well as a component that consisted of slightly different kinds of questions – ‘transfer’ questions. There was no effect on either component.

When they analysed the data further, they did find a few statistically significant results. Perhaps surprisingly, students in the fully guided condition produced a greater number of creative (and still correct) solutions on the post-test; solutions that involved pre-empting a more sophisticated method that is often taught to older students. Students in the fully guided condition also showed a significantly greater level of interest and felt they were more likely to be successful than those in the unguided group. Students in the unguided condition reported significantly higher levels of challenge than those in the fully guided group.

If these results have not occurred by chance then it seems to me that there are two viable explanations. The first is that the advantages of productive failure and the advantages of providing worked examples trade off against each other, resulting in no overall effect. Another possibility is that the effect of the explicit teaching component is so large that it washes out the effect of the previous conditions. In this case, the worked examples condition might be redundant because it is effectively repeated in the explicit teaching phase.

Either way, this result is important because it represents a key attempt to replicate previous productive failure findings. The lack of replication should prompt us to pause before making further recommendations for the use of productive failure.


3 thoughts on “A new test of ‘productive failure’”

  1. Chester Draws says:

    I doubt the results occurred by chance.

    One of the risks of unguided learning is that once they have found a technique, students will find it very difficult to move on, since the technique they have is working, so why would they learn another?

    Only those taught explicitly will have the full range of techniques, so can answer more creatively.

    [Lots of my students work out for themselves that to solve problems of the type ax + b = c, you take the b off the c and then divide out the a, and give working that goes x = c – b ÷ a. It’s real discovery learning, since I most certainly don’t teach them it.

    Such students cannot then do problems of the type ax + b = cx + d, but since the technique they have works well, from their point of view, many will continue to try to use numerical techniques, and I get working that goes a – c = e and d – b = f and x = f ÷ e, which is usually done wrong, but works just often enough to give them faith in it.

    I wonder how those that advocate discovery learning prevent this sort of learning — apparently efficient at getting the answer, but useless in the long run.]

  2. Thanks Greg – really helpful to flag this up. We talked about productive failure in our department meeting today. A conjecture: productive failure might produce the backfire effect: where allowing pupils to struggle and get things wrong leads to them also remembering their wrong approaches, as opposed to the right approach, and so being more confused in the longer term. E.g.: https://skepticalscience.com/Debunking-Handbook-Part-2-Familiarity-Backfire-Effect.html
