The sweet spot

I am sitting in the airport at Sydney, waiting for my flight back to Geelong. I’ve just taken part in a panel review of my PhD progress and caught up with my supervisors. I also attended a fascinating talk and discussion led by Jan Plass, who started his career in cognitive load theory and has now branched out into studying various aspects of adaptive learning.

It’s given me the opportunity to think about a few issues related to cognitive load theory (CLT). It has been bothering me for a while that there are certain contradictory views coming out of cognitive science that seem to be bound up with the issue of germane cognitive load. When CLT was first formulated, two types of working memory load were identified: the intrinsic load related to completing a problem or task and the extrinsic load (usually called ‘extraneous’ load in the CLT literature) that is not required for problem solving but that might be generated by distracting information, images and so on. Later, germane load was added.

Germane load is essentially the component of the working memory load that leads to learning. It is a problem for CLT because it makes the theory unfalsifiable: design an experiment that reduces cognitive load and leads to learning and we can say that we have reduced extrinsic load; design one that increases load and is effective and we can say that we have increased germane load. The reverse explanations can also apply; if the reduced load leads to less learning we can say that we eliminated germane load and so on. The results of any experiment can be explained and so CLT ceases to be a scientific theory because it ceases to make predictions that may be proved wrong.

Sweller’s solution is to not explain anything in terms of germane cognitive load. His view is that CLT is not a theory of everything. Even so, CLT provides useful results. Those of you who have been following my Twitter account will no doubt have already clicked on a link to this excellent piece by Sweller that was recently published and that gives an historical background to CLT before going on to explain all of the important findings in simple terms. Germane load is not mentioned.

But if CLT is not a theory of everything then what exactly is it a theory of? I am starting to think that CLT findings are best applied to domains with high intrinsic load; domains such as maths and physics problem solving and the more analytical aspects of the humanities. In such domains, it makes sense as a heuristic to generally try to reduce load. There is so much to pay attention to that we might seek to eliminate any extrinsic load for novice learners whilst also breaking the major tasks down into smaller components to be trained independently: we learn to factorise and solve quadratic equations in isolation before we learn to solve word problems that require the use of quadratic equations (and we learn our times tables before we learn to factorise quadratic equations).

And yet there are other areas where precisely the reverse effect is found. The generation effect is the phenomenon where learning is enhanced by students having to generate a piece of information for themselves rather than by simply reading that information. This increases working memory load. Perhaps it is desirable to make learning a little bit harder? Perhaps this makes it more memorable?

The apparent contradiction might be solved if we consider that worked examples seem to be better than problem solving for complex kinds of problems whereas the generation effect seems to work for tasks that involve memorising single words or sequences of words. Sweller would define this as a difference in ‘element interactivity’. In an algebra or physics problem, each move is dependent upon another; the fifth line depends upon the fourth and so on. The elements of the problem interact. However, words and names are discrete items and so the element interactivity is low, even if the words are complicated or technical.

If we have a model of learning that suggests that we must engage but not overload the working memory in order to optimise learning then the worked example effect can be explained by the fact that it reduces load to manageable proportions. The generation effect occurs because it increases the load of a learning episode that would otherwise possess a very low intrinsic load. If you want students to learn names and dates then generation could be a good strategy.

Similarly, when we become more expert, a series of problem moves becomes ‘chunked’ together as a single item and so this also effectively reduces the element interactivity of a problem, explaining why experts benefit more from problem solving than from worked examples.

There is some experimental evidence for this idea and it’s similar to some of the recent findings regarding the testing effect. A paper by Chen, Kalyuga and Sweller presents a series of experiments that show a worked example effect for high element interactivity and a generation effect for low element interactivity.

This account is now falsifiable, provided that we can all accept the concept of element interactivity. If so, a worked example effect for very low element interactivity or a generation effect for high element interactivity would disprove CLT.
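To make the shape of that prediction concrete, here is a minimal toy sketch in Python. It is not taken from Chen, Kalyuga and Sweller or from any fitted model: the tent-shaped `learning_score` function, the names `best_strategy`, `CAPACITY`, `GENERATION_COST` and `CHUNKING`, and every number are assumptions invented purely for illustration. It only shows that a single "engage but do not overload" rule can favour generation when intrinsic load is low and worked examples (or simply reading the material) when it is high.

```python
# Toy illustration only: every number below is a made-up assumption.
CAPACITY = 1.0          # assumed working memory capacity (arbitrary units)
GENERATION_COST = 0.4   # assumed extra load from generating or problem solving


def learning_score(load: float, capacity: float = CAPACITY) -> float:
    """Tent-shaped score: peaks when load equals capacity, zero at 0 and 2x."""
    return max(0.0, 1.0 - abs(load - capacity) / capacity)


def best_strategy(intrinsic_load: float) -> str:
    """Compare studying a worked example with generating the answer."""
    scores = {
        "worked example": learning_score(intrinsic_load),
        "generation": learning_score(intrinsic_load + GENERATION_COST),
    }
    return max(scores, key=scores.get)


LOW_EI = 0.3    # e.g. a list of names and dates (assumed value)
HIGH_EI = 0.9   # e.g. a multi-step algebra problem for a novice (assumed value)
CHUNKING = 0.4  # assumed factor by which expertise shrinks effective load

for label, load in [
    ("novice, low element interactivity", LOW_EI),
    ("novice, high element interactivity", HIGH_EI),
    ("expert, high element interactivity", HIGH_EI * CHUNKING),
]:
    print(f"{label}: {best_strategy(load)}")
```

With these assumed values the sketch favours generation in the first and third cases and the worked example in the second, which is the pattern the account predicts. Experimental results that reliably ran the other way are what would count against it.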

Do we therefore need the notion of germane cognitive load at all? I’m not sure. We might be able to make do with something simpler: Fill the working memory without over-filling it. Hit the sweet spot. Vygotsky would approve.


15 thoughts on “The sweet spot”

  1. Greg,
    Your discussion about how it is unfalsifiable struck home. For years, I’ve joked that due to this we see that every new unintended and/or unexpected result seems to have led to a new CLT “principle” and that germane can only be “proven” ex post facto. Let’s keep to 2 types and element complexity.

  2. Very interesting. Would be a better model without the germane load part. Also moving the demand for thinking to be an internal part of that which is to be learned is a more powerful idea.

  3. Hi Greg. I enjoy the complexity you are exploring in your thinking here, and that the answer might be ‘it depends’.

    I’m not in this area of expertise so my understanding is wobbly, but I wonder how this also relates to other fields in which increasing sophistication can be reached as knowledge and skills become internalised and automaticity of practice develops.

    For instance, I recently reflected on my own development as a coach; as I have gained knowledge and experience I have been able to automate some things so as to focus my attention on others. Now you have me thinking about the actual processes that may have gone into how I have ‘learned’ these things.


  4. Here’s something from a training session today: Why is the sky blue? Take a very large measuring cylinder full of water and shine an LED torch up through it. White light on the ceiling and from the side the water is pretty colourless. Add a drop of milk. Now the light on the ceiling is more yellowy and the water looks blueish. Give students a diagram showing blue and yellow light leaving the torch and get them to draw the paths. Now give them a diagram showing the same colours leaving the sun and reaching the edge of the atmosphere with one observer looking straight at the sun and the other looking at another part of the sky; get them to complete this diagram too… This works (with these particular students and a bit of discussion); it strikes me as an example of highly scaffolded discovery learning leading to good understanding of why the sky is blue. My experience is that this is more effective than just explaining it, and quite a bit more engaging too.
    In contrast, my experience is that because of the difficulty of getting circuits to behave, it’s better to use whole class interactive teaching to cover the theory of what happens when components are connected in parallel, so that the practical becomes a process of reinforcing the theory by applying it to the complexity of real circuits: extracting the deep structure from the noise.
    Could it be that the mixture of traditional and constructivist approaches that you see in most science classrooms is where the sweet spot lies after all?

  5. Tunya Audain says:

    Half-baked Fads Should Not Be Tolerated

    Imagine eating pancakes, or muffins, with the insides still doughy and yucky.

    Reading Sweller’s article, pg 5, from your link, I see that two decades of schooling saw the problem solving method used, despite Sweller’s cautions, which were treated with hostility or ignored. Who pays for this “damage” to students? Don’t brains get scrambled from some of these fads? It bothers me that so many unproven practices are allowed to prevail without safeguards.

    Story of a Research Program, Sweller

  6. Thank you for this excellent reflective piece. I like how you are thinking about how to merge seemingly contradictory results. It’s good to not see CLT as a theory of everything. I’m not sure if removing germane load and focusing on element interactivity really solves some of the ‘unknowns’ though. Maybe I’m not precise enough in my wording but it does seem that “Similarly, when we become more expert, a series of problem moves becomes ‘chunked’ together as a single item and so this also effectively reduces the element interactivity of a problem, explaining why experts benefit more from problem solving than from worked examples.” sounds a lot like integrating knowledge into schemas, which ‘Germane Load’ would address. I would say that it’s exactly that process, moving from less expert to more expert (when is someone a novice, when not any more?), taking into account prior schemas and knowledge, that needs more clarity and research. The same authors have an even more recent paper in which they describe expertise reversal as a ‘special case’ of element interactivity. But now it seems the discussion is more about *when* there is lower or higher element interactivity. If this is directly influenced by one’s capacity for chunking (and perhaps schema building), is it still possible to explain away any outcome? When, for example, according to our assumptions something must have a high EI and there is an improvement: they must’ve been expert enough to succeed. If there is no improvement with high EI: they weren’t expert enough? With low EI: if improvement, that must’ve been because they were not so expert. With low EI but no improvement, EI still must’ve been too high. In any case, the definition of lower or higher EI seems crucial. I think I’ve seen some studies where I thought “yeah makes sense” but also some where I thought “really?” (wasn’t there one recently with the number of steps as indicator of EI? I did not find that very convincing tbh). Anyway. Thanks.

    • I wish we could use more precise language than “falsify CLT.” Is falsifying the worked example effect equivalent to falsifying CLT? I’d argue that it shouldn’t be. CLT has some core commitments (the role of working memory in learning, the nature and limitations of working memory) and CLT could be falsified if those core commitments were shown false. But the worked example effect is a specific experimental finding of the theory, and CLT can live with it being complicated by new findings.

      After all, Kalyuga basically accommodates findings that problem solving helps with novice learning in his latest, right? I know, I know, Sweller and you don’t find Schwartz and the productive struggle stuff to be well-controlled. (Kalyuga also has a line in that latest piece where he expresses a desire for well-controlled replication studies, but he’s clearly comfortable enough with the findings to propose a major theoretical revision to CLT.)

      So I think it’s imprecise to say “CLT is unfalsifiable” when we mean “it becomes hard to falsify results of CLT.” Which, as far as I can tell, is all that we’re talking about here.

  7. Stan says:

    Greg, There are lots of skills training techniques (skiing is a good example) where, after learning some technique, there is a step where the instructor goes for intentional mental overloading. The idea is at some point to start forcing the subconscious to do the work and stop people overthinking it. So some additional mental exercise is added to force cognitive overload while performing a drill. That seems to be a common technique to speed up the process of making a conscious process subconscious.
    There seems no obvious reason why it wouldn’t work with a mental learning exercise. I am sure the best ski instructors have it down to an art in assessing when to push people a bit more to get them past the conscious thinking stage.

      • Stan says:

        Juggling is another good example. Learning to juggle can start with the simplest case of standing in a doorway and throwing one object from hand to hand in a high arc. While doing this you practice keeping the hands low and in the same place and getting a uniform throw. The doorway helps to track the arc of the object. At some level of competence in this you move on to two objects and then three and beyond. But it is worth moving on before you are perfect at any stage. Having to manage two objects leaves you less attention to spend keeping your hands in position and thinking about the throw. Getting the hand position to work subconsciously is achieved faster by not allowing the brain enough conscious attention to dwell on it. For any individual case there will be an optimum moving on point and perhaps some need to move back again.

        The nice thing about juggling is it would be pretty easy to test how well this worked with various choices on when to switch to more objects. I suspect it would be testable with simple math tasks.

