Selection on the dependent variable
Posted: June 14, 2017
Imagine you are striving to make a mark in the world of business. Now, let me sell you a book. In this book, I describe the visits that I have made to all the funkiest, most fashionable and most profitable companies on the planet. What’s more, as I’ve visited each of them, I’ve made a little list of their attributes. And I am prepared to share with you exactly what characteristics these companies have in common.
Are you sold?
You shouldn’t be, because I’ve performed a sleight of hand. I have implied that the characteristics I have identified are somehow related to the success of these companies, but we have no way of knowing this.
Imagine, for instance, that I find my top companies splash out on lavish Christmas parties and I weave that into a narrative about how they make their staff feel valued. What we don’t know is whether less successful companies also do this and, if they do, whether they tend to do less or perhaps even more of it.
This is why a comparison group is so important. For my notional book, I should visit both successful and unsuccessful companies.
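To make the point concrete, here is a toy simulation in Python. Everything in it is invented for illustration: the 80% party rate, the normally distributed profit and the names `simulate` and `party_share` are my assumptions, not data from any real company. Parties are assigned independently of profit, so by construction they have nothing to do with success.

```python
import random


def party_share(companies):
    """Fraction of the given companies that throw a lavish party."""
    return sum(c["party"] for c in companies) / len(companies)


def simulate(n=10_000, party_rate=0.8, seed=0):
    """Build n hypothetical companies whose parties are unrelated to profit."""
    rng = random.Random(seed)
    companies = [
        # Profit and party are drawn independently of each other.
        {"profit": rng.gauss(0, 1), "party": rng.random() < party_rate}
        for _ in range(n)
    ]
    companies.sort(key=lambda c: c["profit"], reverse=True)
    top = companies[: n // 10]   # the 10% I would visit for my book
    rest = companies[n // 10:]   # the comparison group I ignored
    return party_share(top), party_share(rest)


top_share, rest_share = simulate()
print(f"top companies with lavish parties:   {top_share:.0%}")
print(f"other companies with lavish parties: {rest_share:.0%}")
```

If I only look at the top companies, I see that most of them throw lavish parties and I have my narrative. The second number is the one that spoils it: the other companies throw parties at about the same rate.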
And even then, the best we can discover is a correlation. We might find that Christmas parties correlate with more successful companies but the parties might not cause the success. The correlation might be because successful companies can afford better parties. Or it might be that a greater number of creative people make companies more successful whilst at the same time agitating for better parties.
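The "successful companies can afford better parties" story can also be sketched as a toy simulation. Again, the numbers are assumptions chosen for illustration (the 0.5 coefficient, the noise and the function names are mine). Profit drives the party budget and the party never feeds back into profit, yet the two still correlate.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_reverse_causation(n=5_000, seed=1):
    """Profit determines party budget; parties have no effect on profit."""
    rng = random.Random(seed)
    profit = [rng.gauss(0, 1) for _ in range(n)]
    # Richer companies spend more on the party, plus some random noise.
    party_budget = [0.5 * p + rng.gauss(0, 1) for p in profit]
    return pearson(profit, party_budget)


r = simulate_reverse_causation()
print(f"correlation between profit and party budget: {r:.2f}")
```

The correlation comes out clearly positive even though cancelling every party in this made-up world would not change a single company's profit.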
This is the crux of what any fool on the internet can tell you: correlation is not causation. However, I reckon correlation is a lot better than a mere description, and a description is all you can ever achieve without a comparison group.
In education, it would not be top-performing companies that we would select for study but maybe some subgroup of students or teachers or schools. And we might not analyse their characteristics in a straightforward way. Perhaps we would grab a Gauloise before completing a poststructuralist discourse analysis or something groovy like that.
But the effect is the same. Without a comparison group, it is just an elaborate description.