It is not difficult to write a function in R to calculate the mode (as a measure of central tendency). The function I posted a while ago did not work with missing values. Here’s a tweak that does work with NA in the data.
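Mode = function(x) {
  # candidate values: the distinct non-missing entries of x
  ux = na.omit(unique(x))
  # count each candidate (NA positions are ignored) and return the most frequent
  return(ux[which.max(tabulate(match(x, ux)))])
}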
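As a quick check, here is a small usage sketch. The example vectors are made up for illustration, and the function is repeated so the snippet runs on its own. Note that with ties, which.max picks the first maximum, so the value that appears first in the data wins.

```r
Mode = function(x) {
  ux = na.omit(unique(x))
  return(ux[which.max(tabulate(match(x, ux)))])
}

# made-up example data, not from the original post
nums = c(1, 2, 2, NA, 3, NA, NA)
Mode(nums)   # 2: NA is the most frequent entry, but it is ignored

words = c("b", "a", NA, "a", "b")
Mode(words)  # "b": a tie between "a" and "b", the first value seen wins
```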
Plotting Connected Lines with Missing Values
When we plot data with missing values, R does not connect them. This is probably the correct behaviour, but what if we really want to gloss over missing data points?
plot(variable.name[country=="UK"], type="b")
gives me something like the following. I used type="b", since type="l" will give an empty plot whenever no two neighbouring points are both observed – generally not very useful.
What if we simply leave out the missing values?
plot(na.omit(variable.name[country=="UK"]), type="b")
kind of works, but we lose the correct spacing on the x-axis, because the remaining points are plotted at consecutive positions:
So what we can do is the following. In a first step we identify the points for which we have data. Next we plot only these, at their original positions. In contrast to the above method, the spacing on the x-axis remains intact.
uk <- variable.name[country=="UK"]  # values for the UK only
miss <- !is.na(uk)  # TRUE where a value is present (despite the name)
plot(which(miss), uk[miss], col="red", type="b", lwd=2)
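To see the idea in isolation, here is a minimal sketch on made-up data. The vector y stands in for variable.name[country=="UK"]; its values are invented for illustration.

```r
# made-up series with one observation every third position
y <- c(4.1, NA, NA, 4.6, NA, NA, 5.0, NA, NA, 5.3)

present <- !is.na(y)      # TRUE where we have data
x.pos <- which(present)   # original positions of the observed points

# lines connect the observed points, x-axis spacing stays intact
plot(x.pos, y[present], type="b", col="red", lwd=2, xlab="Index", ylab="y")
```

Compare this with plot(na.omit(y), type="b"), which would draw the same four points at positions 1 to 4.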
It is important to include an xlim argument if we add multiple lines on the same plot. Typically I draw the axes separately, as this gives me more control over them, especially the labels on the x-axis.
uk <- variable.name[country=="UK"]
miss <- !is.na(uk)
plot(which(miss), uk[miss], col="red", type="b", lwd=2, axes=FALSE, xlim=c(1,16))
axis(2)
axis(1,at=c(1,6,11,16), labels=c("1995","2000","2005","2010"))
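A second series can then be added with lines(). The sketch below uses made-up data for two hypothetical series, uk and fr, covering 16 years. Because xlim and ylim are fixed in the first call, the second series fits even though its observed points lie at different positions.

```r
# made-up data: 16 yearly values (1995-2010) with gaps
uk <- c(2.0, NA, NA, 2.4, NA, 2.5, NA, NA, 3.1, NA, NA, 3.0, NA, NA, NA, 3.6)
fr <- c(NA, 1.5, NA, NA, 1.9, NA, NA, 2.2, NA, NA, 2.8, NA, NA, 2.6, NA, NA)

uk.ok <- !is.na(uk)
fr.ok <- !is.na(fr)

# common y-range so that neither series is clipped
y.range <- range(uk, fr, na.rm=TRUE)

plot(which(uk.ok), uk[uk.ok], col="red", type="b", lwd=2,
     axes=FALSE, xlim=c(1,16), ylim=y.range, xlab="Year", ylab="Value")
lines(which(fr.ok), fr[fr.ok], col="blue", type="b", lwd=2)
axis(2)
axis(1, at=c(1,6,11,16), labels=c("1995","2000","2005","2010"))
```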