Here’s how to get larger correlation coefficients

Here’s a link to a blog post by Andrew Gelman that deserves to be read more widely. It reports on a paper in a field remote from what I do, but the issue is correlation coefficients, a staple in much of what we do. Apparently the authors of the paper thought that a correlation coefficient of 0.02 was not enough to get published, and resorted to binning the data. Binning data is not in itself a bad thing; it can be quite useful for graphs, for instance. However, they then calculated and reported the correlation of the binned data. Not so miraculously, the correlation coefficient increases: averaging within each bin removes most of the unexplained variance, so the bin means line up far more neatly than the individual observations ever did.
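The effect is easy to reproduce with a small simulation. What follows is a minimal sketch in Python with NumPy, not the paper's data or method: the observations are synthetic, generated so that the raw correlation is roughly 0.02, and the helper name binned_correlation, the sample size, the seed and the choice of ten equal-count bins are all assumptions made for illustration.

```python
import numpy as np


def binned_correlation(x, y, n_bins):
    """Correlate the bin means of x and y.

    Observations are sorted by x and split into n_bins groups of
    (nearly) equal size; each group is replaced by its mean.
    """
    order = np.argsort(x)
    x_groups = np.array_split(x[order], n_bins)
    y_groups = np.array_split(y[order], n_bins)
    x_means = np.array([g.mean() for g in x_groups])
    y_means = np.array([g.mean() for g in y_groups])
    return np.corrcoef(x_means, y_means)[0, 1]


# Synthetic data: a very weak linear relationship buried in noise.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = 0.02 * x + rng.normal(size=n)

r_raw = np.corrcoef(x, y)[0, 1]                  # correlation of individual observations
r_binned = binned_correlation(x, y, n_bins=10)   # correlation of ten bin means

print(f"raw correlation:    {r_raw:.3f}")
print(f"binned correlation: {r_binned:.3f}")
```

The relationship between x and y is exactly the same in both calculations. The binned figure only describes how well ten averages line up, and says nothing about how well x predicts y for any individual observation.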

Maybe I really should change the framing of my statistics course to focus on how to lie and cheat with statistics: I guess the students will learn just as much about good statistics this way.
