Calculating Polarization

How can we quantify the polarization of a party system, or the polarization of opinions? Polarization exists when the population is divided in its opinions. If we measure these opinions on an ordered scale (as is commonplace), we’re looking for peaks at two non-adjacent positions. An ideal type would be 50% for an issue, and 50% against it.

The opposite ideal type can help us formulate what we mean by polarization. If all positions are equally popular, we cannot really speak of polarization, but such a uniform distribution is not the logical opposite either. The opposite of polarization is agreement: everyone has the same position on an issue.

To quantify polarization, we can work backwards from Cees van der Eijk’s (2001) measure of agreement by inverting it. I’ve written up a few functions to do this in R.
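To make "inverting" concrete, here is a minimal sketch. It assumes an agreement score that runs from -1 (complete polarization) through 0 (all positions equally popular) to 1 (complete agreement), as in Van der Eijk (2001). The name invert_agreement is made up for this illustration, and the linear rescaling to the range 0 to 1 is an assumption, not necessarily the exact formula used in my functions.

```r
# Hypothetical helper for illustration only (not part of any package):
# rescale an agreement score a in [-1, 1] to a polarization score in [0, 1].
invert_agreement <- function(a) {
  stopifnot(all(a >= -1 & a <= 1))  # agreement is assumed to lie in [-1, 1]
  (1 - a) / 2                       # flip the sign, then shift and shrink
}

# Complete agreement, a uniform distribution, complete polarization
invert_agreement(c(1, 0, -1))
```

Under this rescaling, complete agreement maps to 0, complete polarization to 1, and the uniform distribution lands in the middle, which fits the point above that equally popular positions are not the opposite of polarization.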

Van der Eijk, C. 2001. “Measuring agreement in ordered rating scales.” Quality and Quantity 35(3): 325-341.

Bootstrapping in R

Bootstrapping uses resampling to assign measures of accuracy, such as standard errors, to an estimate, and it is easy to do in R. When I first used it, it took me a while to figure out that the function being bootstrapped needs a second argument for subscripting the data, so here is how to do it.

First, we’ll need the boot package from CRAN. Here’s an example using polarization from my agrmt package. After installing and loading the packages, we define a function. For the point estimate of polarization, I use polarization(collapse(POSIT, pos=w)). POSIT is the variable of interest: I am interested in the polarization in this positional variable. To use this as a function for bootstrapping, a second argument is necessary. boot uses it to pass in the indices of each resample, and the data are subscripted with it: p <- function(x,y) {polarization(collapse(x[y], pos=w))}.

The call to boot is simple: boot(data, statistic, R), where statistic is the function and R the number of replications, as long as the function takes the data and the indices as its two arguments. So we would use mean(x[y]) rather than mean(x) in the function.
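To see why the function needs that second argument, it can help to do a few draws by hand. The following is a simplified picture in base R of what resampling with indices looks like; it is not boot’s actual internals, and the data and the name stat are made up for illustration.

```r
set.seed(1)                        # make the resampling reproducible
x <- c(3, 5, 5, 6, 8, 9, 11, 14)   # made-up data for illustration

# The statistic takes the data and a vector of indices
stat <- function(x, y) mean(x[y])

# One bootstrap draw by hand: sample the indices with replacement
idx <- sample(seq_along(x), replace = TRUE)
stat(x, idx)

# With the indices in their original order we get the plain estimate back
stat(x, seq_along(x)) == mean(x)

# Repeating the draw many times gives a bootstrap distribution
draws <- replicate(500, stat(x, sample(seq_along(x), replace = TRUE)))
sd(draws)                          # rough bootstrap standard error
```

A function written as mean(x) would ignore the indices and return the same value on every draw, which is why the subscript x[y] matters.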

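library(boot) # provides boot()
library(agrmt) # provides polarization() and collapse()
w <- c(-1,-0.5,0,0.5,1) # positions at which data could occur, used by polarization function: the pos=w argument
p <- function(x,y) {polarization(collapse(x[y], pos=w))} # statistic: x is the data, y the resampled indices
collapsed <- c(0,2,2,0,0) # collapsed data (frequencies of d at the positions in w); *not* used
d <- c(0,0,-.5,-.5) # raw data; used; normally a variable
z <- boot(d,p,500) # bootstrapping: data=d, statistic=p, 500 draws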

To come back to my example, calculating standard errors is simple: sd(as.numeric(boot(POSIT,p,500)$t)). This runs 500 draws of the function p defined above and takes the standard deviation of the 500 bootstrap estimates, which is the bootstrap standard error.

Here’s an example using the mean: p <- function(x,y) {mean(x[y], na.rm=TRUE)}
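Put together as a runnable sketch, the mean example might look like this. The vector v is made up for illustration, and the function is called p_mean here (rather than p) only to keep it apart from the polarization function above.

```r
library(boot)

set.seed(42)                                  # reproducible draws
v <- c(2, 4, 4, 5, NA, 7, 9, 10, 12, 15)      # made-up variable with a missing value

# Statistic with the data as first and the indices as second argument
p_mean <- function(x, y) {mean(x[y], na.rm = TRUE)}

b <- boot(v, p_mean, 500)   # data, statistic, number of draws
b$t0                        # point estimate on the original data
dim(b$t)                    # one row per draw, one column per statistic
sd(as.numeric(b$t))         # bootstrap standard error of the mean
```

The estimates from the individual draws are stored in the t element of the result, which is why the standard error is taken over b$t.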

Using functions may appear cumbersome at first sight, but it pays off once you use sapply to calculate many standard errors at once. For instance, sapply(1980:2010, function(x) sd(as.numeric(boot(POSIT[YEAR == x],p,500)$t))) will get the standard errors for each of the 31 years from 1980 to 2010.
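Here is a small self-contained version of the same idea. YEAR and POSIT are simulated stand-ins for the real variables, the statistic is the mean rather than polarization so that only the boot package is needed, and I use 200 draws to keep it quick; all of these are choices made for illustration.

```r
library(boot)

set.seed(2010)                                 # reproducible simulation
# Simulated stand-ins for the real variables: 20 respondents per year
YEAR  <- rep(1980:2010, each = 20)
POSIT <- sample(c(-1, -0.5, 0, 0.5, 1), length(YEAR), replace = TRUE)

# Statistic with data and indices as its two arguments
p_mean <- function(x, y) {mean(x[y], na.rm = TRUE)}

years <- 1980:2010
# One bootstrap standard error per year
se <- sapply(years, function(yr) sd(as.numeric(boot(POSIT[YEAR == yr], p_mean, 200)$t)))
names(se) <- years                             # label each value with its year

length(se)                                     # number of years covered
head(se)
```

Naming the result with the years makes it easy to look up a single value later, for example se["1995"].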