R Code: Swiss Postcode to Canton

This week I had a data set with Swiss postcodes (Postleitzahl; PLZ), and wanted to use this information to create a variable for the respondent’s canton. It turned out to be slightly less trivial than I thought.

Finding a list of postcodes with the canton indicated wasn’t difficult. I quickly realized, though, that the relationship between Swiss postcodes and cantons isn’t straightforward: 1000:1197 are in the canton of Vaud, 1200:1258 in the canton of Geneva, then back to Vaud, etc. Once we get to 1290, it becomes obvious that it’s even more complicated than that: the same postcode is used for municipalities in two different cantons.

Since the relationship between postcode and canton is quite messy, I decided to simply use a lookup table; I was looking for a quick solution, not necessarily an elegant one.

It turns out, there are just 12 postcodes not clearly assignable to a single canton. I looked more closely, and in most cases, the situation on the ground is a town or village in one canton, with a hamlet across the cantonal border. I assigned these to the larger settlement.

The R code is available on GitHub as a simple function that can be loaded using source(), but essentially it’s a large table and a loop with a single line to match postcode and canton. All the ambiguous cases are clearly identified, making it easy to filter them out (e.g. into an “NA” category).
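To illustrate the lookup-table idea, here is a minimal sketch; it is not the code from the gist. The table is a five-row excerpt chosen for illustration, the names plz_table and plz_lookup are hypothetical, and assigning 1290 to Geneva is my assumption about which settlement is the larger one. It uses the vectorised match() where the original uses a loop.

```r
# Illustrative excerpt only; the real table covers every postcode in use.
plz_table <- data.frame(
  plz       = c(1000, 1200, 1290, 3000, 8000),
  canton    = c("VD", "GE", "GE", "BE", "ZH"),   # 1290 -> "GE" is an assumption
  ambiguous = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the canton for each postcode in a vector.
# Postcodes missing from the table come back as NA; with
# drop_ambiguous = TRUE the flagged postcodes are set to NA as well.
plz_lookup <- function(plz, drop_ambiguous = FALSE) {
  idx <- match(plz, plz_table$plz)      # row of each postcode, NA if absent
  canton <- plz_table$canton[idx]
  if (drop_ambiguous) {
    canton[plz_table$ambiguous[idx] %in% TRUE] <- NA
  }
  canton
}

plz_lookup(c(8000, 1290, 9999))
plz_lookup(c(8000, 1290, 9999), drop_ambiguous = TRUE)
```

Keeping the ambiguity flag as a separate column means the decision to keep or drop those cases can be made at analysis time rather than being baked into the table.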

Here are two really simple tools I use in conjunction with the code posted previously.

The first one is a simple wrapper for the converters I presented above. It converts cantonal identity numbers to abbreviated labels and vice versa.

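convid <- function(ID) {
  # A canton's ID number is its position in this vector (1 = "ZH", ..., 26 = "JU")
  clabels <- c("ZH", "BE", "LU", "UR", "SZ", "OW", "NW", "GL", "ZG", "FR", "SO", "BS", "BL", "SH", "AR", "AI", "SG", "GR", "AG", "TG", "TI", "VD", "VS", "NE", "GE", "JU")
  # Numbers become labels, labels become numbers. A plain if/else is used because
  # ifelse() with a single test value would return only the first element of a vector.
  id <- if (is.numeric(ID)) clabels[ID] else match(ID, clabels)
  return(id)
}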

The second one checks whether a number is a valid Swiss postcode, as in whether it is a postcode actually in use.

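plzvalid <- function(PLZ) {
  # TRUE if plzcanton() finds a canton for PLZ, i.e. the postcode is in use
  return(!is.na(plzcanton(PLZ)))
}
# usage, for a vector plz of 1008 postcodes (gives the positions of the invalid ones):
# which(!sapply(1:1008, function(x) plzvalid(plz[x])))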
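As a rough sketch of how the check can be used to find rows that need a second look, the example below runs it over a column of made-up survey data. The plzcanton() defined here is only a stand-in with four postcodes so that the example runs on its own; I assume the real function from the gist returns NA for postcodes that are not in use.

```r
# Stand-in for the gist's plzcanton(): four postcodes only, NA for the rest.
plzcanton <- function(PLZ) {
  known <- c("1000" = "VD", "1200" = "GE", "3000" = "BE", "8000" = "ZH")
  unname(known[as.character(PLZ)])
}

plzvalid <- function(PLZ) {
  !is.na(plzcanton(PLZ))
}

# Made-up data; 9999 is not in the stand-in table.
survey <- data.frame(id = 1:4, plz = c(8000, 3000, 9999, 1200))

# Check each postcode in turn and keep the rows that fail.
valid <- vapply(survey$plz, plzvalid, logical(1))
survey[!valid, ]
```

vapply() is used in place of sapply() so that the result is guaranteed to be a logical vector with one entry per row.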

All the code is also available as a Gist. I haven’t finished my code to convert postcodes to local language, though, as I ended up doing something else…

There were minor issues with Appenzell postcodes. For 16 postcodes only a probabilistic assignment is possible, and this is handled by siding with the (typically much) larger municipality. Updated versions:

Convert Swiss postcodes to cantons: https://gist.github.com/druedin/6690720

Two simple helper functions to go with: https://gist.github.com/druedin/8758265

Published 28 September 2013