Here’s how to get larger correlation coefficients

Here’s a link to a blog post by Andrew Gelman that deserves to be read more widely. It reports on a paper in a field remote from what I’m doing, but the issue is correlation coefficients, a staple in much of what we do. Apparently the authors of the paper thought that a correlation coefficient of 0.02 was not enough to get published, and resorted to binning the data. Binning data is not in itself a bad thing; it can be quite useful for graphs, for instance. However, they then calculated and reported the correlation of the binned data. Not so miraculously, the correlation coefficient increases: averaging within bins removes most of the unexplained variance, while the trend across the bin means remains.
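To see the effect for yourself, here is a minimal simulation sketch. It is not the analysis from the paper: the data are synthetic, the sample size, the slope of 0.02, and the choice of ten equal-count bins are all assumptions made for illustration, and the helper names `pearson` and `bin_means` are made up for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def bin_means(xs, ys, n_bins):
    """Sort the pairs by x, cut them into equal-count bins, and return
    the per-bin means of x and y. Any leftover points that do not fill
    a whole bin are dropped."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    x_means, y_means = [], []
    for i in range(n_bins):
        chunk = pairs[i * size:(i + 1) * size]
        x_means.append(sum(p[0] for p in chunk) / size)
        y_means.append(sum(p[1] for p in chunk) / size)
    return x_means, y_means


# Synthetic data (assumed): y depends very weakly on x, so the
# correlation in the raw data is about 0.02.
rng = random.Random(0)
n = 100_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.02 * a + rng.gauss(0, 1) for a in x]

r_raw = pearson(x, y)

# The same data, reduced to ten bin means before correlating.
x_binned, y_binned = bin_means(x, y, n_bins=10)
r_binned = pearson(x_binned, y_binned)

print(f"raw data:  r = {r_raw:.3f}")
print(f"bin means: r = {r_binned:.3f}")
```

The relationship between x and y is identical in both calculations; only the noise has been averaged away in the second one. With fewer bins (more points per bin) the binned coefficient typically climbs even higher, which says nothing about how well x predicts an individual y.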

Maybe I really should change the framing of my statistics course to focus on how to lie and cheat with statistics: I guess the students will learn just as much about good statistics this way.
