In an earlier assessment of PSPP as a replacement for SPSS I mentioned some of the reasons I think PSPP cannot yet fulfil its ambition of being such a replacement. I’m happy to report that PSPP now does logistic regressions, but the point of this post is to highlight one of its strengths in a practical application: PSPP is super fast.
I frequently use an old, underpowered netbook, and usually that’s enough computing power for basic analyses (most mobile phones these days are more powerful). A few days ago, I wanted to run a very simple analysis on the longitudinal WVS data. We’re looking at a 500 MB SPSS file here, and all I wanted to do was calculate a new variable, and then get the mean by country and year. Really basic stuff, except that I only have 1 GB of RAM available (I did say underpowered).
What happened next: I gave up on R loading the data file (the RData file provided by the WVS is 1.3 GB) because it took too long. Opening PSPP is a breeze on this machine (and easily beats opening SPSS on the brand new Windows machine I’m provided with; I cannot imagine how long SPSS would take on this machine). The analysis itself wasn’t actually fast: each of the steps took around 5 minutes (calculating the new variable, sorting/splitting by country and year, frequencies statistics to get the mean). That’s around as long as I waited for R to load the data before giving up. Moreover, PSPP played nicely and did not lock up the computer, so I could actually do some other work at the same time; R can be more demanding.
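
For anyone curious what the computation itself looks like, here is a minimal sketch in Python. It is not the PSPP run described above, just an illustration of the same two steps: derive a new variable, then take its mean by country and year. The column names (`country`, `year`, `v1`, `v2`), the sample rows, and the formula for the new variable are all made up for the example and are not actual WVS variables. The sketch reads the rows one at a time and keeps only a running sum and count per group, which is one way to stay within a small amount of RAM.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the column names and values are invented for
# illustration and are not actual WVS variables or data.
SAMPLE = """country,year,v1,v2
A,1990,1,3
A,1990,2,4
A,1995,3,5
B,1990,4,
B,1990,2,2
"""


def group_means(rows):
    """Stream rows once, keeping a running sum and count per (country, year)."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        # Skip cases where either input value is missing
        if row["v1"] == "" or row["v2"] == "":
            continue
        # The derived variable (an assumed formula: the average of v1 and v2)
        new_var = (float(row["v1"]) + float(row["v2"])) / 2
        acc = totals[(row["country"], int(row["year"]))]
        acc[0] += new_var
        acc[1] += 1
    # Turn each (sum, count) pair into a mean, ordered by country and year
    return {key: s / n for key, (s, n) in sorted(totals.items())}


means = group_means(csv.DictReader(io.StringIO(SAMPLE)))
for (country, year), mean in means.items():
    print(country, year, round(mean, 2))
```

With a real export you would pass a reader over the file instead of the in-memory sample; memory use then grows with the number of country-year groups rather than with the number of cases.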