A Modified Shapefile for Plotting Swiss Cantons

The GADM database of Global Administrative Areas is the place to go for shapefiles of administrative boundaries. The data are freely available for academic use and other non-commercial use. Unfortunately, the boundaries of the Swiss cantons are a bit buggy. Version 2 has fixed the presence of a 27th canton — part of Lake Geneva, but it includes the entire lake of any lake shared with a neighbouring country as Swiss territory.

The licence of the GADM files is never made explicit, and while it says “Please contribute in whatever way you can by sending us a message to point out errors, or even better, to send an improved file for a country of your interest,” there is no way to send improved files on their website. I never got replies to my messages, so consider this post as my sending an improved file.

The files are on Github. I have modified cantonal boundaries so that they match the national boundaries: only the Swiss part of lakes is included as Swiss territory. I’m not sure the files other than the .shp, .shx, .dbf, .prj are actually needed.

Here’s how to draw a map in R. First we load two libraries:

library(sp)
library(rgdal)

Then we import the shapefile:

x <- readOGR(dsn = ".", layer = "ch-cantons")

The dsn argument takes the location of the shapefile (dot for current directory); the layer argument takes the name of the shapefile without extensions.

And now we draw the map:

data <- 1:26 # values to plot
x$data <- data # add to SpatialPolygonsDataFrame
col <- rev(heat.colors(26)) # define colours
spplot(x, "data", col.regions=col, main="", colorkey=TRUE)

The cantons are as follows: c("AG","AR", "AI","BL", "BS", "BE", "FR", "GE", "GL", "GR", "JU", "LU", "NE", "NW", "OW", "SG", "SH", "SZ", "SO", "TG", "TI", "UR", "VS", "VD", "ZH", "ZG").

26map

Mapping Ethnic Representation Scores in R

Here’s a demonstration of how easy it is to map with the R package rworldmap by Andy South. I map the ethnic representation scores in my 2009 JLS article, available from my Dataverse. I used the tab-delimited file, which contains the country name, Q-scores, R-scores, and a binary indicator of minority presence. I searched&replaced the tabs with commas, and added a new column for the ISO3 country codes.

After that, it’s just a few lines in R:

library(rworldmap)
dta <- read.csv("Representation.csv") # these are the data described above.

> head(dta) gives:
Country ISO3 QScore RScore Present
1 Afghanistan AFG 0.803 0.697 1
2 Albania ALB 0.980 0.510 1
3 Algeria DZA NA NA 1
4 Andorra AND 0.980 0.000 0
5 Angola AGO NA NA 1
6 Antigua & Barbuda ATG 0.910 0.000 0

Next we have to identify the countries. joinCode specifies that I used ISO3, nameJoinColumn specifies the variable with the country abbreviations:

jcd <- joinCountryData2Map(dta, joinCode="ISO3", nameJoinColumn="ISO3")

Next a line from the package vignette that makes the plot use the space available.
par(mai=c(0,0,0.2,0),xaxs="i",yaxs="i")

Now, while mapCountryData(jcd, nameColumnToPlot="QScore") would suffice to draw a map, I used some of the options available (e.g. a blue ocean, light grey for missing data), and drew the legend separately for a little extra control:

mapParams <- mapCountryData(jcd, nameColumnToPlot="QScore", addLegend=FALSE, mapTitle="Ethnic Representation Scores", oceanCol="light blue", missingCountryCol="light grey")
do.call(addMapLegend, c(mapParams, legendWidth=0.5, legendMar = 4))

The title is a bit off, but other than that, I’m pretty happy for a first cut with so little coding.

ers-map
Ruedin, Didier. 2009. ‘Ethnic Group Representation in a Cross-National Comparison.’ The Journal of Legislative Studies 15 (4): 335–54. doi:10.1080/13572330903302448.

Maps in R

It isn’t difficult to draw maps in R, and in fact there are several options available. Here’s how to draw maps using the package sp.

I use a map of Swiss cantons to lament on the shapefile available from fantastic GADM. Let’s have a look at the population for each canton first, however, available here. I copied the table into a spreadsheet to select just the cantons and replace the canton names with their abbreviations. I often find this kind of data handling easier outside of R.

p <- read.table(file="clipboard")
rownames(p) <- p$V1
p$V1 <- NULL
p <- data.frame(t(p))

Here I import the data via the clipboard; use the names of the cantons as row names, and remove the first column that held the names. I also transpose the data and convert it to a data frame. Usually this kind of preparation isn’t necessary, as I have the data already in R (from whatever analysis I’m doing).

library(sp)
con <- url("http://biogeo.ucdavis.edu/data/gadm2/R/CHE_adm1.RData")
print(load(con))
close(con)

Here I load the package, and get the shapefile from GADM. You can find the URL on the GADM website. This code is essential.

target <- c("AG","AR", "AI","BL", "BS", "BE", "FR", "GE", "GL", "GR", "JU", "LU", "NE", "NW", "OW", "SG", "SH", "SZ", "SO", "TG", "TI", "UR", "VS", "VD", "ZG", "ZH")

This is the order of the cantons in the shapefile. It seems to be in alphabetical order when the names of the cantons are spelled out. This is different from the idiosyncratic way the data are reported on the website used. With such a small number of units (26 cantons), I could enter the data manually, but that’s prone to errors.

data <- NA # create target
data <- sapply(target, function(x) {data[1] <- as.numeric(p[x])})

Here I use the fantastic sapply function to allocate the values in p to their right place.

gadm$data <- data
col <- rev(heat.colors(26))
spplot(gadm, "data", col.regions=col, main="Population (in thousands)")

This is the final code, and we’re back to essential code in the final line.

cantonsPop

If you want to follow this example, but struggle to get the data in, try this instead:

p <- structure(list(ZH = 1425.5, BE = 1001.3, LU = 390.3, UR = 35.9,
SZ = 151.4, OW = 36.5, NW = 41.9, GL = 39.6, ZG = 118.1,
FR = 297.6, SO = 261.4, BS = 189.3, BL = 278.7, SH = 78.8,
AR = 53.7, AI = 15.8, SG = 491.7, GR = 195, AG = 636.4, TG = 260.3,
TI = 346.5, VD = 749.4, VS = 327, NE = 176.4, GE = 469.4,
JU = 71.7), .Names = c("ZH", "BE", "LU", "UR", "SZ", "OW",
"NW", "GL", "ZG", "FR", "SO", "BS", "BL", "SH", "AR", "AI", "SG",
"GR", "AG", "TG", "TI", "VD", "VS", "NE", "GE", "JU"), row.names = "V2", class = "data.frame")

And I nearly forgot the lamenting: while version 1 of the GADM shapefile included 27 areas, one of which was the Swiss part of Lake Geneva, version 2 of the GADM shapefile includes the entire Lake Constance as part of the canton of Thurgau — including the parts that are German. Let’s hope their quicker to fix this time around (I reported it).