# Bootstrapping in R

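Bootstrapping uses resampling with replacement to attach measures of accuracy, such as standard errors, to an estimate, and it is easy to do in R. When I first used it, it took me a while to figure out the double subscripts needed (the function you bootstrap takes the data and a vector of indices), so here is how to do it.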

First, we’ll need the boot package from CRAN. Here’s an example using polarization from my agrmt package. After installing the packages, we define a function. In this case, I use `polarization(collapse(POSIT, pos=w))` to calculate the point estimate of polarization. POSIT is the variable of interest: I am interested in the polarization in this positional variable, and `w` holds the positions at which data could occur. To use this as a function for bootstrapping, a second subscript is necessary: `p <- function(x,y) {polarization(collapse(x[y], pos=w))}`.

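The call to `boot` is simple: `boot(data, function, replications)` (the arguments are named `data`, `statistic`, and `R`), as long as you have a function with double subscripts. The first argument of that function is the data and the second is the vector of resampled indices that `boot` supplies, so we would use `mean(x[y])` rather than `mean(x)` in the function.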
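To see why the second subscript matters, here is a minimal sketch in base R of the resampling that `boot` automates. It is an illustration under assumptions, not how the package is implemented internally: the data in `x`, the seed, and the names `stat`, `draws`, and `n_draws` are made up for this example.

```r
# Manual sketch of index-based resampling (base R only).
set.seed(42)                       # arbitrary seed, only for reproducibility
x <- c(2, 4, 4, 5, 7, 9)           # made-up data for illustration

stat <- function(x, y) mean(x[y])  # x = data, y = vector of indices

n_draws <- 500
draws <- numeric(n_draws)
for (i in seq_len(n_draws)) {
  idx <- sample(seq_along(x), replace = TRUE)  # indices drawn with replacement
  draws[i] <- stat(x, idx)                     # statistic on the resample
}

stat(x, seq_along(x))  # with the original indices this is just mean(x)
sd(draws)              # bootstrap standard error of the mean
```

A function written as `mean(x)` would ignore the indices and return the same value on every draw, which is why the subscript `x[y]` is needed.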

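```
library(boot)   # boot()
library(agrmt)  # polarization(), collapse()

w <- c(-1, -0.5, 0, 0.5, 1)  # positions at which data could occur, used by polarization function: the pos=w argument
c <- c(0, 2, 2, 0, 0)        # collapsed data (counts of d at each position in w); *not* used
d <- c(0, 0, -.5, -.5)       # raw data; used; normally a variable
p <- function(x, y) {polarization(collapse(x[y], pos=w))}  # function with double subscripts, as defined above
z <- boot(d, p, 500)         # bootstrapping: data=d, function=p, 500 draws
```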

To come back to my example, calculating standard errors is simple: `sd(as.numeric(boot(POSIT,p,500)$t))`. This runs 500 draws of the function `p` defined above and takes the standard deviation of the bootstrap replicates stored in `$t`, which is the bootstrap standard error.

Here’s an example using the mean: `p <- function(x,y) {mean(x[y], na.rm=TRUE)}`
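A complete run with the mean might look like the following sketch. The data in `d` and the seed are made up for illustration; `boot` comes from the boot package, and `t0` and `t` are components of the object it returns.

```r
library(boot)

set.seed(1)  # arbitrary seed, only for reproducibility
# Made-up data with one missing value, for illustration only
d <- c(0, 0, -0.5, -0.5, NA, 1, 0.5, 0, 0.5, -1)

p <- function(x, y) {mean(x[y], na.rm = TRUE)}

z <- boot(d, p, R = 500)  # 500 bootstrap draws of the mean

z$t0                 # statistic on the original data, i.e. mean(d, na.rm = TRUE)
dim(z$t)             # replicates are stored as a 500 x 1 matrix
sd(as.numeric(z$t))  # bootstrap standard error of the mean
```

The replicates come back as a one-column matrix, which is why the standard error is computed on `as.numeric(z$t)`.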

Using functions may appear cumbersome at first sight, but once you use `sapply` to calculate many standard errors at once, for instance, it becomes much easier. `sapply(1980:2010, function(x) sd(as.numeric(boot(POSIT[YEAR == x],p,500)$t)))` will get the standard errors for each of the 31 years from 1980 to 2010.
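Here is a sketch of the per-year calculation on simulated data, using the mean as the statistic so that it runs with only the boot package. `YEAR` and `POSIT` below are simulated stand-ins for the real variables, and the 40 observations per year are an arbitrary assumption.

```r
library(boot)

set.seed(2)  # arbitrary seed, only for reproducibility
years <- 1980:2010

# Simulated stand-ins for the real variables: 40 observations per year
YEAR  <- rep(years, each = 40)
POSIT <- sample(c(-1, -0.5, 0, 0.5, 1), length(YEAR), replace = TRUE)

p <- function(x, y) {mean(x[y], na.rm = TRUE)}

# One bootstrap standard error per year, each based on 500 draws
se <- sapply(years, function(yr) {
  sd(as.numeric(boot(POSIT[YEAR == yr], p, R = 500)$t))
})
names(se) <- years  # label each standard error with its year

length(se)          # one value per year
head(round(se, 3))
```

Naming the result by year makes it easier to match each standard error to its point estimate later on.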
