A blog on statistics, methods, philosophy of science, and open science. Understanding 20% of statistics will improve 80% of your inferences.

Saturday, November 11, 2017

The Statisticians' Fallacy

If I ever make a follow-up to my current MOOC, I will call it ‘Improving Your Statistical Questions’. The more I learn about how people use statistics, the more I believe the main problem is not how people interpret the numbers they get from statistical tests. The real issue is which statistical questions researchers ask of their data.

Our statistics education turns a blind eye to teaching people how to ask a good question. After a brief explanation of what a mean is, and a pit-stop at the normal distribution, we race through as many tests as we can fit in the number of weeks we are teaching. We are training students to perform tests, but not to ask questions.

There are many reasons for this lack of attention to training people how to ask a good question. But here I want to focus on one reason, which I’ve dubbed the Statisticians' Fallacy: statisticians who tell you ‘what you really want to know’, instead of explaining how to ask one specific kind of question of your data.

Let me provide some examples of the Statisticians' Fallacy. In the next quotes, pay attention to the use of the word ‘want’. Cohen (1994) in his ‘The earth is round (p < .05)’ writes:


Colquhoun (2017) writes:


Or we can look at Cumming (2013):


Or Bayarri, Benjamin, Berger, and Sellke (2016):


Now, you might have noticed that these four statements by statisticians of ‘what we want’ are all different. One says 'we want' to know the posterior probability that our hypothesis is true, another says 'we want' to know the false positive report probability, yet another says 'we want' effect sizes and their confidence intervals, and yet another says 'we want' the strength of evidence in the data.
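To make the difference between these four questions concrete, here is a minimal Python sketch that answers each of them from the same summary of a study: one estimate and its standard error. It is only an illustration under assumptions of my own choosing, not the method of any of the authors cited above. The function name `four_questions`, the normal approximation, the prior probability `prior_h1`, the assumed `power`, and the point alternative `alt_effect` are all hypothetical choices, and different choices will give different numbers.

```python
from statistics import NormalDist


def four_questions(estimate, se, prior_h1=0.5, alpha=0.05, power=0.80, alt_effect=0.5):
    """Answer four different questions from one estimate and its standard error.

    Uses a normal approximation throughout. prior_h1, power, and alt_effect
    are illustrative assumptions, not values taken from any cited paper.
    """
    # Effect size question: how large is the effect, and how precise is the estimate?
    z_crit = NormalDist().inv_cdf(0.975)
    ci_95 = (estimate - z_crit * se, estimate + z_crit * se)

    # False positive question: among significant results, what share are false alarms?
    # This depends on the assumed prior probability of H1 and the assumed power.
    false_pos = alpha * (1 - prior_h1)
    true_pos = power * prior_h1
    fprp = false_pos / (false_pos + true_pos)

    # Evidence question: how much better does a point alternative predict
    # the observed estimate than the null hypothesis of no effect?
    likelihood_ratio = (NormalDist(alt_effect, se).pdf(estimate)
                        / NormalDist(0.0, se).pdf(estimate))

    # Posterior question: how probable is H1 after seeing the data, given the prior?
    prior_odds = prior_h1 / (1 - prior_h1)
    posterior_odds = likelihood_ratio * prior_odds
    posterior_h1 = posterior_odds / (1 + posterior_odds)

    return {
        "ci_95": ci_95,
        "fprp": fprp,
        "likelihood_ratio": likelihood_ratio,
        "posterior_h1": posterior_h1,
    }


# Hypothetical study: an estimated difference of 0.5 with a standard error of 0.25.
answers = four_questions(estimate=0.5, se=0.25)
for question, answer in answers.items():
    print(question, answer)
```

The same data produce four different numbers, because each number answers a different question. Which of them is useful depends on what you want to know.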

Now you might want to know all these things, you might want to know some of these things, and you might want to know yet other things. I have no clue what you want to know (and after teaching thousands of researchers over the last 5 years, I’m pretty sure you often don't really have a clue what you want either - you've never been trained to thoroughly ask this question). But what I think I know is that statisticians don’t know what you want to know. They might think some questions are interesting enough to ask. They might argue that certain questions follow logically from a specific philosophy of science. But the idea that there is always a single thing ‘we want’ is not true. If it were, statisticians would not have been criticizing what other statisticians say ‘we want’ for the last 50 years. Telling people 'what you want to know' instead of teaching people to ask themselves what they want to know will just get us another two decades of mindless statistics.

I am not writing this to stop statisticians from criticizing each other (I like to focus on easier goals in my life, such as world peace). But after reading many statements like the ones I’ve cited above, I have distilled my main take-home message onto a bathroom tile:



There are many, often complementary, questions you can ask of your data, or when performing lines of research. Now I am not going to tell you what you want. But what I want is that we stop teaching researchers there is only a single thing they want to know. There is no room for the Statisticians' Fallacy in our education. I do not think it is useful to tell researchers what they want to know. But I think it’s a good idea to teach them about all the possible questions they can ask.


Further Reading:
Thanks to Carol Nickerson who, after reading this blog post, pointed me to David Hand's Deconstructing Statistical Questions, which is an excellent article on the same topic - highly recommended.