Using Splines to Fill Holes (Interpolation)

Following a comment I recently had on using moving averages, I wanted to share more widely how we can use splines to fill holes in (time series) data. Contrary to the moving-average based approach I posted earlier, using splines we can keep the observed values and just interpolate where there are missing values. What is more, in many cases splines are more appropriate, and actually surprisingly easy to use.

First we need some data with missing values (shamelessly nicked from Kirk, although modified to protect the original should this be necessary):
splines1

z <- c(-1.1484, -1.3842, -1.5985, -1.0626, -1.3413, -1.2341, -1.1269, NA, -0.7411, -0.7840, -0.6125, -0.8912, NA, NA, -1.1912, -1.7271, -1.0841, -0.9555, -0.9555, -0.6554, -0.4196, -0.5268, -0.3767, -0.2695, 0.2019, NA, NA, NA, 1.1880, 0.9736, 0.7807, 0.5878, 1.3594, 0.6306, 1.3809, NA, NA, NA, NA, NA, 1.5738, 1.6595, 1.1665, 0.9950, 1.7238, 1.1022, 1.1451, NA, 0.7807, NA, NA, NA, NA, NA, NA, NA, 0.8450, 0.5449, 0.2662, 0.8021, 0.4806, 0.1376, 0.5449, 0.2019, 0.4592)

First, we’re looking at the data.

plot(z, type="l", bty="n")

To identify the location of the missing values, we can use the following code (it also keeps things slightly simpler below):

miss <- which(is.na(z))

The core of the approach presented here is the splines function, part of R. It defines a function, here I called it a().

a <- splinefun(z)

We can now use this function to get individual points using splines to interpolate the curve. For instance, here I plot the values for each integer. Note that this plots on top of the observed values, but easily fills the holes. 65 is the length of the data vector.

points(1:65, a(1:65), col=2)

We can highlight the missing values (now interpolated) by plotting them in a different colour:

points(miss, a(miss), col=5)

And here’s how to identify a single point: the value for y at x=14.

points(14, a(14), col=3)

This figure has not much value except for demonstrating how splines can be used. The code here could be quite easily changed to replace missing values with interpolated ones.

splines2

To finish off, here’s a demonstration that the splines function is quite capable of dealing with fine-graded interpolations:

points(seq(1,65,.5), a(seq(1,65,.5)), col=3)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s