TIL: Gorilla Hypotheses

A hypothesis can be a liability.

Vincent Warmerdam koaning.io
2021-09-26

You may have seen the popular YouTube video about the “attention experiment”. The point of the video is that attention is selective: by focussing on one specific thing, you may miss something that’s completely obvious. If you’ve never seen the video, please watch it now.

I recently read an enjoyable paper that demonstrates a similar effect in the realm of statistics. It’s the “A hypothesis is a liability” paper by Itai Yanai and Martin Lercher.

The authors created a dataset containing gender, BMI and steps taken per day. They then asked students to analyse the dataset. The students in the first group were asked to consider three specific hypotheses. They were also asked if there was anything else they could conclude from the dataset. In the second, “hypothesis-free,” group, students were simply asked: What do you conclude from the dataset?

The Gorilla in the Room

The thing is, this dataset was completely artificial. Here’s a screenshot from the paper.

The dataset didn’t really resemble BMI data; it mainly resembled a gorilla. It also seems that students in the “hypothesis-free” group were more likely to discover this fact. While the difference between the groups wasn’t statistically significant, and the authors admit further study is needed, they do suggest that hypotheses can be distracting.
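The same effect is easy to reproduce on a toy scale. What follows is a minimal sketch and not something taken from the paper: the `ring` data and the `pearson` helper are made up for illustration. If the only question you ask is "are x and y correlated?", the answer is a flat no, even though the points form an obvious shape that a single scatter plot would reveal.

```python
import math

def pearson(xs, ys):
    """Pearson correlation, written out by hand to avoid dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ring(n=200):
    """Hypothetical dataset: n points evenly spaced on a unit circle."""
    angles = [2 * math.pi * i / n for i in range(n)]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]

xs, ys = ring()

# The "hypothesis" view: test for a linear relationship and nothing else.
r = pearson(xs, ys)

# The "just look at it" view: every point sits at the same distance
# from the origin, which a scatter plot shows immediately.
radii = [math.hypot(x, y) for x, y in zip(xs, ys)]

print(f"correlation: {r:.3f}")
print(f"radius range: {min(radii):.3f} to {max(radii):.3f}")
```

The correlation comes out at essentially zero while the radius barely varies, so the summary statistic and the picture tell very different stories.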

Conclusion

I think hypotheses are a useful tool. They can add formality to decision making, which tends to prevent folks from claiming success prematurely. But they can cause premature tunnel vision too. I’ve seen it happen in data teams, and this paper demonstrates what may go wrong with a humorous example.