# R: empty cells in weighted cross-tabs across multiple variables

I’m not even sure how to succinctly describe the problem, but here’s what worked for me. Well, I have two sets of variables and want to run a cross-tabulation. I also want to weigh the frequencies and then calculate the sum of them, and there are some empty (blank) cells to add to the mix. Three small problems in one; R to the rescue.

The two series of variables are as follows: attributes, each with 6 categories to indicate frequencies, alas grouped. So for attribute 1, variable Q5_1 indicates 0 occurrences, 1 to 5 occurrences, 6 to 10 etc. There are also sectors to identify subgroups, using a series of dummy variables to identify the sector (Q6_1, Q6_2, … Q6_10). So basically I want to run table(Q5_{1:6}, Q6_{1:10}), turning the categorical variables into approximative frequency counts.

First, I `attach()` my data; the `get(paste(...))` code seems to like this by a mile.

Second, I create an empty matrix that I will subsequently fill with the (approximated) frequencies: `tbl <- matrix(data=NA, nrow=6,ncol=10)`.

Third, I cycle through each pair of variables: 1:6 sectors (`sectvar`), and 1:10 attributes (`attrvar`).
```for(attrvar in 1:6) { for(sectvar in 1:13) { ```

Here I create a simple cross-tabulation for the current pair of variables. `get(paste(...))` does all the work.
`raw <- table(get(paste("Q5_",sectvar,sep="")), get(paste("Q6_",attrvar,sep="")))`

Since I want to weigh the counts so as to approximate the actual frequencies from the categorical counts, I run into problems if there are empty cells in the previous step. `table()` simply leaves them out in the result. That’s usually fine, but problematic because of the weights. So I have to add the zeros back in. Here’s one way to do this: create an empty vector with as many zeros as the variable (attrvar) has: 6. (The package agrmt has a helper function for similar cases.)
`raw2 <- c(0,0,0,0,0,0)`

Next I replace all the zeros with the actual values from the variables `raw` if they exist. If they do not exist, we keep the zero.
`for(i in 1:6){raw2[as.numeric(dimnames(raw)[])[i]] <- raw[i]}`

Now we have a complete frequency vector and I can apply my weights.
`wei <- raw2 * c(0, 2.5, 5.5, 10.5, 15.5, 0)`

and then sum up to approximate the actual count:
``` tbl[sectvar, attrvar] <- sum(wei) } }```

`print(tbl)` now gives me the cross-tabulation with approximate counts.

This site uses Akismet to reduce spam. Learn how your comment data is processed.