This week I had a data set with Swiss postcodes (Postleitzahl; PLZ), and wanted to use this information to create a variable for the respondent’s canton. It turned out to be slightly less trivial than I thought.
Finding a list of postcodes with the canton indicated wasn’t difficult. I quickly realized, though, that the relationship between Swiss postcodes and cantons isn’t straightforward: 1000:1197 are in the canton of Vaud, 1200:1258 in the canton of Geneva, then back to Vaud, etc. Once we get to 1290, it becomes obvious that it’s even more complicated than that: the same postcode is used for municipalities in two different cantons.
Since the relationship between postcode and canton is quite messy, I decided to simply use a lookup table; I was looking for a quick solution, not necessarily an elegant one.
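To make the lookup-table idea concrete, here is a minimal sketch in R. The table is a tiny illustrative excerpt (four well-known postcodes), and the names plz_table, respondent_plz and respondent_canton are made up for this example; they are not taken from the actual code.

```r
# Tiny illustrative excerpt; a real table would cover every Swiss postcode.
plz_table <- data.frame(
  plz    = c(1000, 1200, 3000, 8000),
  canton = c("VD", "GE", "BE", "ZH"),
  stringsAsFactors = FALSE
)

# Hypothetical respondent postcodes, including one value (9999)
# that is missing from the excerpt above.
respondent_plz <- c(8000, 1200, 9999, 1000)

# match() gives the row of each postcode in the table (NA if absent),
# so indexing the canton column by it yields one canton per respondent.
respondent_canton <- plz_table$canton[match(respondent_plz, plz_table$plz)]
respondent_canton
```

Postcodes that do not appear in the table simply come back as NA, which makes data-entry errors easy to spot.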
It turns out there are just 12 postcodes that cannot be clearly assigned to a single canton. I looked more closely, and in most cases the situation on the ground is a town or village in one canton, with a hamlet across the cantonal border. I assigned these to the canton of the larger settlement.
The R code is available on GitHub as a simple function that can be loaded using source(), but essentially it’s a large table and a loop with a single line to match postcode and canton. All the ambiguous cases are clearly identified, making it easy to filter them out (e.g. into an “NA” category).
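As a rough sketch of how flagged ambiguous cases could be filtered out, assume the table carries an extra logical column. Everything here is hypothetical: the names plz_lookup, plz_to_canton and drop_ambiguous, the column layout, and the canton chosen for 1290 are assumptions for illustration, not the code on GitHub. The sketch also uses a vectorised match() instead of a loop.

```r
# Illustrative excerpt only; the layout and the canton shown for 1290
# are assumptions, not the published table.
plz_lookup <- data.frame(
  plz       = c(1000, 1200, 1290),
  canton    = c("VD", "GE", "GE"),
  ambiguous = c(FALSE, FALSE, TRUE),  # TRUE: postcode spans two cantons
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one canton per postcode; with
# drop_ambiguous = TRUE, flagged postcodes are set to NA instead.
plz_to_canton <- function(plz, drop_ambiguous = FALSE) {
  idx <- match(plz, plz_lookup$plz)   # row of each postcode, NA if unknown
  canton <- plz_lookup$canton[idx]
  if (drop_ambiguous) {
    # %in% TRUE treats unknown postcodes (NA flag) as not ambiguous
    canton[plz_lookup$ambiguous[idx] %in% TRUE] <- NA
  }
  canton
}

plz_to_canton(c(1000, 1290, 1200))
plz_to_canton(c(1000, 1290, 1200), drop_ambiguous = TRUE)
```

Keeping the flag in the table rather than hard-coding NA leaves the choice to the analysis: the default keeps the assignment to the larger settlement, while drop_ambiguous = TRUE excludes those respondents.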