Evidence for an inquiry-based science program… no, wait 

I am growing increasingly concerned about the way that some randomised controlled trials (RCTs) are being used to give legitimacy to otherwise weak ideas.

I am in favour of conducting RCTs because they have the greatest potential to tease out whether one thing causes another. But this doesn’t mean that we must dismiss all other kinds of evidence. Sometimes RCTs are difficult to conduct or are unethical. We should not forget that we can potentially draw inferences from correlations provided we are suitably cautious. Basic cognitive science can also inform our theory of how different approaches might work.

Furthermore, some RCTs can be pretty badly designed or analysed. The new RCT factories such as the Education Endowment Foundation in England and I3 in the U.S. tend to find positive effects that don’t always stand up to closer scrutiny.

This became apparent in an article about inquiry-based science that Steven Cooke (@SteveTeachPhys) drew my attention to on Twitter. It was written by Wynne Harlen for the October edition of the U.K.’s Association for Science Education’s ‘Science Teacher Education’ magazine. This is a “publication for all concerned with the pre-service education, induction and professional development of science teachers.” Bear that in mind as you read what follows.

The article is fascinating on a number of levels, not least for perpetuating the ‘constructivist teaching fallacy’, i.e. that constructivist learning theory implies a specific set of teaching practices. For instance, the way that ‘understanding’ is defined excludes the possibility of a teacher explaining a concept to a child so that the child understands it:

“Current views of learning lead us to conclude that understanding is created by learners themselves through their mental (and physical) activity. It is not something that can be received ready-made from others; it involves generation rather than acquisition of knowledge.”

We are told that the only alternative to constructivism is ‘behaviourism’ which is characterised by rewards, punishments and rote learning. Harlen favours socio-cultural constructivism where students work in groups (which I can’t reconcile with the notion that understanding cannot be received from others). This naturally implies inquiry-based science teaching:

“To identify what this means in practice, consider what pupils will be doing when learning in this way. Their activities will include: working in groups; exploring and manipulating physical materials; building on their prior experiences and ideas; raising questions; communicating their ideas; listening to the ideas of others; reasoning; and arguing from evidence.”

This is where the RCT makes an appearance.

It’s actually pretty easy to design an RCT that will show a positive effect for inquiry learning. Here’s my recipe:

  1. Randomly assign your subjects to one of two groups
  2. Give the experimental group a set of activities to complete involving marbles rolling on ramps
  3. Give the control group standard explicit instruction on Newton’s laws of motion
  4. Conduct a post-test where students are assessed on their ability to answer questions about experiments with marbles on ramps

You see a lot of this kind of thing in the literature. These studies succeed due to my first principle of educational psychology: students tend to learn the things you teach them and don’t tend to learn the things you don’t teach them.

To their credit, I3 did not do this. They instead used a standardised assessment known as ‘PASS’. PASS has three elements:

  1. Selected Response or Multiple Choice Items (MC): Items assess students’ understanding of important scientific facts, concepts, principles, laws, and theories.
  2. Constructed Response Investigations and Open‐Ended Questions (OE): Students analyze a problem, think critically, conduct a secondary analysis, and apply learning. They construct explanations using evidence.
  3. Hands‐on Performance Tasks (PT): Investigations identifying a problem to solve. Students use equipment to perform investigations; make observations; generate, organize, and analyze data; communicate understandings; and apply learning.

Note that it is mainly the first element – MC – that tests whether students know and understand science.

I3 randomised schools into one of two conditions. The experimental group received a package known as ‘LASER’, which is an inquiry-based science programme consisting of curriculum materials and support. The control group did not get LASER. I would have expected some sort of Hawthorne effect, where schools that knew they were part of an intensive intervention would perform better on the PASS test. But they did not. There was no statistically significant difference between the two groups of schools. So you might think that this was the end of it.

But no.

I3 decided to slice and dice their data. This is a risky practice. Let’s set aside the debate about p-values for a moment and assume that we are happy to use them as our test of whether something is significant. A significance threshold of p=0.05 means that, if there really is no effect, we could expect about one false positive result for every twenty analyses we do. So if we slice and dice our data 20 ways then we might well expect to find something that passes the test of statistical significance.
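To put rough numbers on this, here is a minimal sketch in Python. It assumes the analyses are independent of one another, which subgroup analyses of the same data set usually are not, so treat the output as a ballpark figure rather than an exact one. The function names are my own and not taken from any report.

```python
def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Average number of false positives across n_tests when no real effect exists."""
    return alpha * n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests.

    Assumes the tests are independent (an illustrative simplification).
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    alpha = 0.05
    for n_tests in (1, 5, 20):
        expected = expected_false_positives(alpha, n_tests)
        at_least_one = familywise_error_rate(alpha, n_tests)
        print(f"{n_tests:>2} analyses: expect {expected:.2f} false positives, "
              f"P(at least one) = {at_least_one:.2f}")
```

Under these assumptions, twenty analyses give roughly a 64% chance of at least one spurious ‘significant’ result even when nothing is going on.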

This doesn’t necessarily prohibit the slicing and dicing of data but it does mean that you have to apply a much stricter test that takes account of the number of ways that you have chopped it up.
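One widely used stricter test is the Bonferroni correction, which divides the significance threshold by the number of comparisons made. The sketch below illustrates the idea only: the p-values are invented for the example and do not come from any real study, and I am not claiming that this is the particular correction any given evaluation applied.

```python
from typing import List


def bonferroni_significant(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag which p-values remain significant after a Bonferroni correction.

    Each p-value is compared against alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # Hypothetical p-values for 20 subgroup comparisons (made up for illustration).
    # Three of them fall below the ordinary 0.05 threshold.
    hypothetical_p_values = [0.012, 0.030, 0.044] + [0.50] * 17

    uncorrected = [p <= 0.05 for p in hypothetical_p_values]
    corrected = bonferroni_significant(hypothetical_p_values)

    print("Significant without correction:", sum(uncorrected))
    print("Significant after Bonferroni:  ", sum(corrected))
```

With 20 comparisons the corrected threshold is 0.05 / 20 = 0.0025, so results that look significant at the ordinary threshold can easily fail the stricter one. Bonferroni is conservative; less severe corrections exist, but the principle is the same.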

When I3 sliced the data they found three outcomes that were statistically significant using an ordinary test of significance. These were English Language Learners who performed better than the control on the OE and PT tasks and students with a disability who outperformed the control on the PT task only. However, once they applied the more stringent tests that take into account the slicing and dicing, the significance of these results disappeared.

Nevertheless, I3 claim this as an important finding. They cite the What Works Clearinghouse guidelines to support an argument that the more stringent kinds of analyses are not required for an ‘exploratory’ study, which seems like a sleight of hand to me. At the very least, when quoting this data people need to make clear the exploratory – and thus provisional – nature of the findings.

With a little searching, I found a separate executive summary document that goes much further than the final report, chopping the data up into separate states and making strong claims about performance in non-science subjects such as maths and reading. This seems to be the document that Harlen quotes from:

“In 2010 the U.S. Department of Education awarded the SSEC a five-year Investing in Innovation (i3) validation grant to evaluate the LASER model’s efficacy in systemically transforming science education. “LASER i3” refers to the resulting longitudinal study of the LASER model, which unequivocally demonstrates that inquiry-based science improves student achievement not only in science but also in reading and math. LASER plays a critical role in bolstering student learning, especially among underserved populations including children who are economically disadvantaged, require special education, or are English language learners.” [my emphasis]

I haven’t analysed the separate state level data but given what we have seen from the overall data, we need to treat it with great caution. I don’t see how the claim that this RCT ‘unequivocally demonstrates’ anything can be justified by looking at the overall data. We certainly should not be using it to support possibly fallacious claims about constructivist teaching practices.

I now think I understand where some of the problems in teacher education originate.

7 Comments on “Evidence for an inquiry-based science program… no, wait ”

  1. Mike says:

    …It’s actually pretty easy to design an RCT that will show a positive effect for inquiry learning. Here’s my recipe:

    1. Randomly assign your subjects to one of two groups
    2. Give the experimental group a set of activities to complete involving marbles rolling on ramps
    3. Give the control group standard explicit instruction on Newton’s laws of motion
    4. Conduct a post-test where students are assessed on their ability to answer questions about experiments with marbles on ramps…

    Greg, you’ve left one out (although this does tie in with the Hawthorne Effect that you mention):

    5.Ensure that the experimental group is taught by an enthusiastic teacher/s who is peachy-keen on the new intervention, while the control group is taught by a teacher/s who’s been dragooned unwillingly into the whole thing and forced, by the parameters of the “experiment”, to teach in an artificially stilted and ineffectual way…because naturally the designers of the study are not interested in the messy area of a mix of styles, they want a straight either-or so that they can push their particular little barrel for all the speaking circuit invitations it’s worth.

    Not that I’m cynical or anything. 🙂

    Let me say though, I do very much appreciate that there are those like yourself (and a few others) around who are actually prepared to do the statistical legwork of exposing the essential phoniness of all this. It’s amazing how even intelligent people these days will lose all proper scepticism and critical faculty when some intellectual snake oil salesman says “evidence has shown”. You’re lucky if you even get the relevant footnote on the PowerPoint presentation these days, and of course no-one’s likely to check it.

  2. The pattern of research fraud in the ranks of progressivists was set with the work of Ellsworth Collings, student of ‘Project Method’ William Heard Kilpatrick, successor of John Dewey, about 1920.

  3. […] one originally proposed. I’ve been looking into these kinds of trials a little recently (eg here and here) and there are a lot of potential […]

  4. Gary says:

    Great spot and investigative work. Seriously bad data dredging going on here.

  5. Iain Murphy says:

    “Current views of learning lead us to conclude that understanding is created by learners themselves through their mental (and physical) activity. It is not something that can be received ready-made from others; it involves generation rather than acquisition of knowledge.”

    I’m not sure how your conclusion of “the way that ‘understanding’ is defined excludes the possibility of a teacher explaining a concept to a child so that the child understand it:” can be made from that statement. Yes, a teacher can explain a concept and a good teacher will look for understanding but it is still the learner that is creating the understanding within their context. This is why some students can’t apply an idea beyond the examples given.

    Unless I’m not understanding the whole explicit teaching model.

