# Calculating Standard Deviations on Specific Columns/Variables in R

When calculating the mean across a set of variables (or columns) in R, we have rowMeans() and colMeans() at our disposal. What do we do if we want to calculate, say, the standard deviation? There are a couple of packages offering such a function, but there is no need, because we have apply().

Let’s start with creating some data, a matrix with 3 columns full of random numbers.

```
M <- matrix(rnorm(30), ncol=3)
```

This gives us something like this:
```
            [,1]        [,2]        [,3]
 [1,] -0.3533716 -1.12408752  0.09979301
 [2,]  0.6099991 -0.48712761  0.22566861
 [3,] -0.9374809 -1.10497004 -0.26493616
 [4,] -0.5243967 -0.66074559  0.16858864
 [5,]  0.2094733 -0.45156576 -0.27735151
 [6,]  0.6800691  1.82395926 -0.18114150
 [7,]  0.1862829  0.43073422  0.14464538
 [8,] -1.0130029 -1.52320349 -1.74322076
 [9,]  1.1886103  0.09653443 -1.95614608
[10,] -0.9953963 -1.15683775  1.61106346
```
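Because `rnorm()` draws new random numbers on every call, the values will differ from run to run. A minimal sketch of how to make the matrix reproducible with `set.seed()`; the seed value 42 and the names `M1` and `M2` are arbitrary choices for illustration, not part of the original example:

```r
# Fix the random number generator state before drawing.
# The seed value 42 is arbitrary; any integer works.
set.seed(42)
M1 <- matrix(rnorm(30), ncol = 3)

# Resetting the same seed reproduces exactly the same draws.
set.seed(42)
M2 <- matrix(rnorm(30), ncol = 3)

identical(M1, M2)  # checks that both matrices hold the same values
dim(M1)            # number of rows and columns
```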

Now comes apply(). Its second argument is the margin: 1 applies the specified function (here: sd(), but we can use any function we want) to each row, that is, across the columns of that row; 2 applies it to each column instead.

`apply(M, 1, sd)`

This gives us the standard deviations for each row:

```
 [1] 0.6187682 0.5566979 0.4446021 0.4447124 0.3426177 1.0058659 0.1545623
 [8] 0.3745954 1.5966433 1.5535429
```

We can quickly check whether these numbers are correct:

`sd(c(-0.3533716, -1.12408752, 0.09979301))`

`[1] 0.6187682`
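To see how the margin argument of `apply()` changes the result, here is a small sketch on a fixed matrix whose standard deviations are easy to verify by hand. The matrix `X` and its values are made up for illustration:

```r
# A small fixed matrix (made-up values). matrix() fills column by column,
# so column 1 is 1, 2, 3 and column 2 is 2, 4, 6.
X <- matrix(c(1, 2, 3,
              2, 4, 6), ncol = 2)

# Margin 1: sd() of each row, giving one value per row (3 values).
row_sd <- apply(X, 1, sd)

# Margin 2: sd() of each column, giving one value per column (2 values).
col_sd <- apply(X, 2, sd)

row_sd
col_sd
```

The length of the result tells us which margin was used: it matches the number of rows for margin 1 and the number of columns for margin 2.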

Of course we can choose the variables or columns we want, for example with `apply(M[,2:3], 1, sd)`, or by combining the columns with `cbind()` first.
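A short sketch of both ways of selecting columns, here for the non-adjacent columns 1 and 3. The seed, the data frame `df`, and the variable names `a`, `b`, and `c` are hypothetical and only serve the illustration:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
M <- matrix(rnorm(30), ncol = 3)

# Option 1: index the columns directly.
sd_index <- apply(M[, c(1, 3)], 1, sd)

# Option 2: bind the columns together with cbind() first.
sd_cbind <- apply(cbind(M[, 1], M[, 3]), 1, sd)

all.equal(sd_index, sd_cbind)  # both approaches give the same values

# With a data frame, variables can be selected by name
# (the names a, b, c are hypothetical).
df <- data.frame(a = M[, 1], b = M[, 2], c = M[, 3])
sd_df <- apply(df[, c("a", "c")], 1, sd)
```

Indexing is shorter when the columns already live in one matrix; `cbind()` is handy when the variables are stored as separate vectors.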
