Bootstrapping in R

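Bootstrapping resamples the data with replacement to estimate measures of accuracy such as standard errors and confidence intervals, and it is easy to do in R. When I first used it, it took me a while to figure out the double subscripts needed (the function you bootstrap has to take both the data and a vector of indices), so here is how to do it.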
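To make the idea concrete, here is a minimal hand-rolled sketch in base R, without the boot package. The data vector and the names stat and reps are made up for illustration; the point is only to show what the index vector does.

```r
# Hand-rolled bootstrap of the mean, base R only (illustrative data).
set.seed(1)
x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7)

# The statistic takes the data and a vector of indices.
stat <- function(data, idx) mean(data[idx])

# Each replicate draws n indices with replacement and re-evaluates the statistic.
reps <- replicate(1000, stat(x, sample(seq_along(x), replace = TRUE)))

stat(x, seq_along(x))  # point estimate on the original data
sd(reps)               # bootstrap standard error
```

The boot package does essentially this resampling for you, which is why the function you hand it needs the second argument.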

First, we’ll need the boot package from CRAN. Here’s an example using the polarization function from my agrmt package. After installing and loading both packages, we define a function. In this case, I use polarization(collapse(POSIT, pos=w)) to calculate the point estimate of polarization. POSIT is the positional variable of interest, and w lists the positions at which values can occur. To use this as a function for bootstrapping, it needs a second argument, which boot fills with the indices of the resampled observations: p <- function(x,y) {polarization(collapse(x[y], pos=w))}.

The call to boot is simple: boot(data, statistic, R), where R is the number of replications, as long as the function subsets the data by its second argument. So we would use mean(x[y]) rather than mean(x) in the function.
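As a small self-contained sketch of such a call, here is the mean bootstrapped on simulated data. The data and the name mean_stat are assumptions for illustration only; boot(), $t0 and $t come from the boot package.

```r
library(boot)  # a recommended package, included with standard R installations

set.seed(42)
x <- rnorm(50, mean = 10, sd = 2)  # simulated data, for illustration only

# First argument: the data; second argument: the indices chosen by boot().
mean_stat <- function(data, idx) mean(data[idx])

b <- boot(data = x, statistic = mean_stat, R = 500)

b$t0      # statistic on the original data, same as mean(x)
dim(b$t)  # one row per replicate, one column per statistic
```

The replicates end up in b$t, which is what the standard error is computed from later on.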

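library(boot) # provides boot()
library(agrmt) # provides polarization() and collapse()
w <- c(-1,-0.5,0,0.5,1) # positions at which data could occur, used by polarization function: the pos=w argument
p <- function(x,y) {polarization(collapse(x[y], pos=w))} # statistic: x = data, y = indices drawn by boot
d_collapsed <- c(0,2,2,0,0) # collapsed data (frequencies of d at the positions in w); *not* used
d <- c(0,0,-.5,-.5) # raw data; used; normally a variable
z <- boot(d,p,500) # bootstrapping: data=d, function=p, 500 draws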

To come back to my example, calculating standard errors is simple: sd(as.numeric(boot(POSIT,p,500)$t)). So we run 500 draws of the function p defined above, and take the standard deviation of the 500 bootstrap estimates stored in $t, which is the bootstrap standard error.
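As a sanity check, the same recipe can be tried on a statistic whose standard error is known. The sketch below uses simulated data and the mean, not the polarization example, and compares the bootstrap standard error with the textbook formula sd(x)/sqrt(n). The names mean_stat, se_boot and se_formula are made up for illustration.

```r
library(boot)

set.seed(123)
x <- rnorm(200, mean = 0, sd = 1)  # simulated data, for illustration only

mean_stat <- function(data, idx) mean(data[idx])

b <- boot(x, mean_stat, R = 2000)

se_boot    <- sd(as.numeric(b$t))      # bootstrap standard error
se_formula <- sd(x) / sqrt(length(x))  # analytic standard error of the mean

c(bootstrap = se_boot, formula = se_formula)
```

The two numbers should be close but not identical, since the bootstrap estimate depends on the random draws.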

Here’s an example using the mean: p <- function(x,y) {mean(x[y], na.rm=TRUE)}

Using functions may appear cumbersome at first sight, but it pays off once you calculate many standard errors at once, for instance with sapply. sapply(1980:2010, function(x) sd(as.numeric(boot(POSIT[YEAR == x],p,500)$t))) will get the standard errors for each of the 31 years from 1980 to 2010.
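Here is a runnable sketch of the per-year call. POSIT and YEAR are simulated stand-ins for the real variables, and p is the mean version of the function from above, so the example does not depend on agrmt.

```r
library(boot)

set.seed(2024)
# Hypothetical data: 40 simulated responses per year on a five-point positional scale.
YEAR  <- rep(1980:2010, each = 40)
POSIT <- sample(c(-1, -0.5, 0, 0.5, 1), length(YEAR), replace = TRUE)

# Statistic with the data and the index vector as arguments.
p <- function(x, y) mean(x[y], na.rm = TRUE)

years <- 1980:2010
se_by_year <- sapply(years, function(yr) {
  sd(as.numeric(boot(POSIT[YEAR == yr], p, 500)$t))
})
names(se_by_year) <- years  # label each standard error with its year

head(se_by_year)
```

Naming the result by year makes it easy to look up a single year afterwards, for example se_by_year["1995"].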
