An Ode to Low R²

It’s the time of the year when many of us do our share of grading. In my case, it’s quantitative projects, and every time I’m impressed by how much the students learn. One thing that sometimes annoys me is how many of them (MA students) insist on interpreting R² in absolute terms (rather than using it to compare similar models, for instance). That’s something they seem to learn in their BA course:

[in this simple model with three predictor variables], we only explain 3% of the variance; it’s a ‘bad’ model.

I paraphrased, of course. But I have started to like low R² values: they are a testament to the complexity of humans and their social world. They are a testament to the fact that we are not machines, and that we live in a world where quantitative analysis is about tendencies. Just imagine a world in which, knowing only your age and gender, I could perfectly predict your political preferences… So there you have it: low R² values are great!
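To make the point about tendencies concrete, here is a minimal sketch with simulated data (nothing here comes from the students' projects). The function name `simulate_low_r2`, the true slope of 0.2, the sample size and the seed are all assumptions chosen for illustration. The outcome is mostly noise, so R² should come out low, yet the regression should still recover the underlying tendency quite well.

```python
import numpy as np


def simulate_low_r2(n=10_000, true_slope=0.2, seed=0):
    """Fit a simple regression on noisy simulated data (illustrative values only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)            # a standardized predictor
    noise = rng.normal(size=n)        # everything the model does not capture
    y = true_slope * x + noise        # a real but weak tendency

    # np.polyfit returns coefficients from highest degree down: slope, then intercept.
    slope, intercept = np.polyfit(x, y, 1)

    # R^2 = 1 - (residual variance / total variance)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - residuals.var() / y.var()
    return slope, r2


slope, r2 = simulate_low_r2()
print(f"estimated slope: {slope:.3f}, R^2: {r2:.3f}")
```

In this setup the share of variance explained is small by construction, because the noise has far more variance than the systematic part. That says nothing about whether the estimated slope is wrong; it only says that individuals vary a lot around the tendency.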

It’s called research for a reason!

All credit to Gary King for this one. In a forthcoming piece of advice to grad students, we find this gem:

It will require rewriting, recasting your argument, reconceptualizing your theory, recollecting your evidence, remeasuring your variables, or reanalyzing your data. You’ll have to revise more than you want and you thought possible. But try not to get discouraged; they call it research, not search, for a reason!