Cronbach’s Alpha with Zero-Inflated Data

Cronbach’s alpha is a common measure of the internal consistency of scales. For a scale I recently constructed, I got an excellent alpha and started wondering to what extent the many zeros in my data were the cause. A sizeable proportion of the respondents answered “no” to all the questions, and I wanted to know how much this, rather than a good choice of questions, drives the alpha.

What I needed was a baseline, which I simulated in R (code).


I start with a random draw and then gradually replace the values with zeros (any constant value would do). We can see that many zeros (x-axis) are required to drive the alpha (y-axis). If more than about half the values are zeros, we should probably be a bit more careful in interpreting alphas directly.
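To make the idea concrete, here is a minimal sketch of such a baseline simulation. It is written in Python and is not the R code linked above; the sample size, the number of items, the binary 0/1 draws, and the function names `cronbach_alpha` and `simulate` are all assumptions made for illustration. It also assumes that "replacing values with zeros" means setting whole respondents to zero on every item, mirroring respondents who answer "no" to all questions.

```python
import random
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    item_vars = [statistics.variance([r[i] for r in rows]) for i in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def simulate(n=500, k=6, steps=10, seed=1):
    """Return (share of all-zero respondents, alpha) pairs for shares 0.0 to 0.9."""
    rng = random.Random(seed)
    # Assumption: independent binary items, so the starting alpha is close to zero.
    data = [[rng.randint(0, 1) for _ in range(k)] for _ in range(n)]
    results = []
    for step in range(steps):
        n_zero = n * step // steps
        # Assumption: zero out whole respondents, not single cells.
        zeroed = [[0] * k for _ in range(n_zero)] + data[n_zero:]
        results.append((step / steps, cronbach_alpha(zeroed)))
    return results


if __name__ == "__main__":
    for share, alpha in simulate():
        print(f"{share:.1f}  {alpha:6.3f}")
```

Because the items are unrelated by construction, any alpha above the starting value in this sketch comes from the all-zero respondents alone. The exact shape of the curve depends on the assumed item distribution and number of items, so it should not be read as a reproduction of the plot.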

A quick conversation with William Revelle confirmed that I’m not looking at what he calls “lumpy data”, and that factor analysis was indeed the correct reaction. Given the way Cronbach’s alpha reacts to zero-inflation, factor analyses may be a necessary addition to the alpha when more than half the values are zeros.

Cronbach, Lee J. 1951. “Coefficient alpha and the internal structure of tests.” Psychometrika 16 (3): 297–334. doi:10.1007/BF02310555.

Revelle, William. 2013. psych: Procedures for Psychological, Psychometric, and Personality Research. Evanston, Illinois.
