# Custom Tables of Descriptive Statistics in R

Here’s how we can quite easily and flexibly create tables of descriptive statistics in R. Of course, we can simply use `summary(variable_name)`, but this is not what you’d include in a manuscript — so not what you want when compiling a document in knitr/Rmarkdown.

First, we identify the variables we want to summarize, because our dataset often includes many more variables than we need in the table:

`vars <- c("variable_1", "variable_2", "variable_3")`

Note that these are the variable names in quotes. Second, we use `lapply()` to calculate whatever summary statistic we want. This is where the flexibility kicks in: if you have ever tried to include an interpolated median in such a table, you'll be glad to hear that in R it is just as easy as the mean. Here’s an example with the mean, minimum, maximum, and median:

`v_mean <- lapply(dataset[vars], mean, na.rm=TRUE)`
`v_min <- lapply(dataset[vars], min, na.rm=TRUE)`
`v_max <- lapply(dataset[vars], max, na.rm=TRUE)`
`v_med <- lapply(dataset[vars], median, na.rm=TRUE)`
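
Because `lapply()` accepts any function, a custom statistic slots into the same pattern. Here is a minimal sketch, assuming a toy data frame in place of the real `dataset` and a hypothetical helper `iqr_na()` (neither is part of the original example):

```r
# Toy data standing in for the real dataset (an assumption for illustration)
dataset <- data.frame(
  variable_1 = c(1, 2, 3, 4, NA),
  variable_2 = c(10, 20, 30, 40, 50),
  variable_3 = c(2.5, 3.5, NA, 4.5, 5.5)
)
vars <- c("variable_1", "variable_2", "variable_3")

# Hypothetical custom statistic: interquartile range, ignoring missing values
iqr_na <- function(x) IQR(x, na.rm = TRUE)

# Same pattern as for mean(), min(), max(), and median()
v_iqr <- lapply(dataset[vars], iqr_na)
str(v_iqr)
```

Any function that takes a vector and returns a single number can be swapped in for `iqr_na()` here.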

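Too many digits? We can use `round()` to get rid of them. There’s actually a `digits` argument in the `kable()` function we’ll use in a minute that in principle allows rounding at the very end, but unfortunately it often fails on me, possibly because `lapply()` returns lists rather than numeric vectors. So we convert to numeric and round by hand; the same call works for the other statistics: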

`v_mean <- round(as.numeric(v_mean), 2)`
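
The `as.numeric()` step is needed because `lapply()` returns a list. One alternative, sketched here with toy data (the data frame is an assumption, not the original dataset), is `sapply()`, which simplifies the result to a named numeric vector that can be rounded directly:

```r
# Toy data standing in for the real dataset (an assumption for illustration)
dataset <- data.frame(
  variable_1 = c(1.234, 2.345, NA),
  variable_2 = c(10.5, 20.25, 30.125)
)
vars <- c("variable_1", "variable_2")

# sapply() returns a named numeric vector, so round() applies without conversion
v_mean <- round(sapply(dataset[vars], mean, na.rm = TRUE), 2)
v_mean
```

Whether to prefer this over `lapply()` plus `as.numeric()` is a matter of taste; the variable names are kept on the vector, which can be handy when checking the row order later.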

Now we only need to bring the different summary statistics together:

`v_tab <- cbind(mean=v_mean, min=v_min, max=v_max, median=v_med)`

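We give the rows readable labels, in the same order as `vars`:

`rownames(v_tab) <- c("Variable 1", "A description of variable 2", "Variable 3")`

Finally, we use `kable()` from the knitr package to generate a decent table:

`library(knitr)`
`kable(v_tab)`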
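
Putting the steps together, here is one possible end-to-end sketch. It assumes the built-in `mtcars` data as a stand-in for the real dataset, uses `sapply()` instead of `lapply()` so that the table is a plain numeric matrix, and uses row labels made up for illustration:

```r
# Built-in mtcars data used as a stand-in for the real dataset (an assumption)
dataset <- mtcars
vars <- c("mpg", "hp", "wt")

# sapply() returns numeric vectors, so cbind() yields a numeric matrix
v_tab <- cbind(
  mean   = sapply(dataset[vars], mean,   na.rm = TRUE),
  min    = sapply(dataset[vars], min,    na.rm = TRUE),
  max    = sapply(dataset[vars], max,    na.rm = TRUE),
  median = sapply(dataset[vars], median, na.rm = TRUE)
)

# Round the whole table in one go
v_tab <- round(v_tab, 2)

# Illustrative labels, in the same order as vars
rownames(v_tab) <- c("Miles per gallon", "Horsepower", "Weight (1000 lbs)")

# kable() lives in the knitr package; skip rendering if it is not installed
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(v_tab))
}
```

In a knitr/Rmarkdown chunk, the final `print()` is not needed; calling `kable(v_tab)` as the last expression of the chunk is enough.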