PSPP is sometimes touted as a replacement for SPSS (including by its creators). Well, it isn’t (this is often the case with open source alternatives; the ambition and reality do not quite match). By stating plainly that PSPP is not a replacement for SPSS, I don’t mean to dismiss PSPP.
First off, PSPP is under active development, and getting hold of the latest version can be a bit difficult. For Windows, this site often has the most up-to-date version; for Linux/Debian you’ll need to be on an “unstable” release or compile your own (which I doubt many will want to do given that we’re looking at an SPSS replacement, not R or Octave).
Second, recent releases cover many basic functions needed for an introductory statistics course. The GUI frequently lags a bit behind the underlying capability, so some functions will only be available using SYNTAX. Oddly enough, the PSPP team copy the SPSS interface quite well, including things that could readily be improved (e.g. why do we have tabs for the “Data View” and the “Variable View”, but a separate window for the results or syntax? Why mix the two?).
So PSPP can readily do tables, ANOVA, linear and logistic regressions, and recoding variables. Unfortunately, and this is why PSPP is not even a replacement for basic SPSS users, there are bits and pieces missing even in the basic functions. On the positive side, PSPP has a cleaner interface than SPSS; on the negative side, some features are just not there. Unless users follow a course designed specifically with PSPP in mind, they will frequently hit a wall. The same is the case for SYNTAX. Users will be able to run SPSS syntax with no problem, as long as PSPP has the commands implemented. But when they use code from the many websites helping SPSS users, PSPP users will again frequently hit a wall.
What do I mean by bits and pieces missing? Let’s take a linear regression. It’s there, the familiar box with arrows to choose variables. Now I may want some multi-collinearity statistics, too. Ah, sorry, doesn’t exist yet. So I can build a model, but do not even have one of the most basic means to check whether it is any good. For this reason I am not surprised that there are not many introductions to statistics using PSPP… it’s just not there yet.
One thing I missed a lot is that PSPP does not remember the last input. So if I run a regression and want to add another variable, I’ll have to start from scratch in PSPP, entering each variable. Graphing is lacking or very poor.
With the advancements in RStudio, R Commander, etc., I sometimes wonder whether PSPP is just advancing too slowly. Having said all this, I wanted to end on a positive note. PSPP has got quite stable in recent releases; it’s got a price tag hard to beat and moral superiority with being truly open source. And finally, it is fast, much faster than SPSS!
Some of what you say is valid, other bits are just plain wrong. Let’s deal with them case by case:
1. Like all GNU software it is easy to find and download: http://ftp.gnu.org
2. Yes, the GUI does not feature all the abilities of the program. Is that important? After all, it is a scientific program – not a computer game – if the user can’t operate a computer except by point and click, then they’re unlikely to understand concepts such as linear regression, degrees of freedom etc.
3. There are many tutorials based on PSPP – some of them commercial offerings – just search the web!
4. Yes. “bits and pieces” are missing. But this is where you come in! Instead of whinging, pull your finger out and send the developers a patch. After all that’s what open source is all about.
Thanks for checking in!
(1) I guess I have a quite different user in mind when I think of SPSS. It’s not usually the kind of person who is going to compile their software; I’m pretty certain they would have no idea what to do with a file ending in .tar.gz once they browsed the GNU FTP server…
(2) Is it important for the GUI to feature all the abilities? For many of the SPSS users I meet it is. Yes, it is a scientific program, but not all scientists (let alone students) are computer geeks. It’s not just about ability (I guess they could all learn the commands), but also about preference. I always thought one of the main draws of SPSS was that it looked familiar (a bit like Excel) and offered a point and click interface. If I’m going to type code anyway, why not go for something like R?
Many students who get introduced to SPSS at university are covered by a site licence that allows them to install a copy on their laptop. If this is not the case, or if they change to another university without a site licence, PSPP looks like a promising alternative. What I tried to show in my post is that a student doing an introductory statistics course using SPSS is probably not able to replicate everything on PSPP; in fact I think they’re very quickly going to hit a wall.
(3) OK, my original post was inaccurate; I was thinking of book-length introductions, but alas I was wrong here, too.
(4) The classic answer, do it yourself. Well, first I wasn’t exactly whinging (it’s not my claim that PSPP is a replacement for SPSS); second I’m not exactly proficient in C.
I very much like it when free software, its uses and its implications are discussed. Also keep in mind that, by its nature, it has elements that can confuse users when compared with typical proprietary software.
PSPP is free software, and is offered as a replacement for SPSS. Keep in mind that it is currently beta software and therefore there are many features not available in its interface or from the command line.
However, what PSPP can do now, it does very well. There are no limits on the amount of data that can be used, it is much faster at performing calculations than SPSS, and it offers almost perfect compatibility with the proprietary SPSS file formats (it supports the *.sys, *.sps, *.por, *.sav, and *.zsav file formats). This means you can easily create and share databases between SPSS and PSPP without losing information (note that the role column has no function in PSPP). This is a great achievement, one that other software, such as R, uses to read and write SPSS data.
Moreover, it is not difficult to get the latest version of PSPP (official or snapshot builds) without compiling the software. For GNU/Linux distributions, simply search the repositories (or application store) for the latest official version; for the latest snapshots, just add a PPA, COPR, AUR, or your favorite custom distribution repository. For Windows, it can be downloaded from the following link: http://sourceforge.net/projects/pspp4windows/files/?source=navbar (the latest development version is marked “release candidate”), and for OS X, search in: http://lavergne.gotdns.org/projects/pspp/ .
PSPP has a limited set of developers, but it’s always good to help the community, for example by indicating which basic analyses needed for a basic statistics course are missing. Developers are interested in this information to prioritize their work.
There are different types of users and different ways to use the software. In my case, I use PSPP to create my databases and analyze data with the available functions. For the remainder, R is always available.
Cheers
Notes:
1. Using separate windows for syntax and the output view is a design decision. In the future PSPP will support advanced functions like reading scripts or SPSS output files (*.spv).
2. I invite you to join the PSPP community: http://savannah.gnu.org/projects/pspp/. For example, you can suggest that the GUI remember the last entry, or report any failure to read SPSS files (any bug fixes benefit a lot of software).
Thanks for your comments. You are right, what PSPP does it does very well (and don’t forget its speed!).
This review seems to be highly negative – that is ok – if that is an honest opinion you are entitled to state it.
But it seems to me that you are simply LOOKING for negative aspects and OVERLOOKING the others.
Firstly, you complain about all the things that PSPP doesn’t do but which SPSS does (like missing commands etc).
Then you complain that PSPP is too much like SPSS!!! Make up your mind!!
Then you overstate the negative aspects. For example phrases like “hitting a brick wall” are gross exaggerations.
A student might have issues, but they can be worked around. For pedagogical purposes surely that is a good thing?
We like to teach our students to think outside the square – if the first attempt doesn’t work then try something else.
Finally you talk about some lacking feature in linear regression. Regression is not a topic for the statistical novice. By the time they have grasped the concepts involved there, a student is likely to be proficient in R – or even writing their own program.
By all means criticise – but take a clear position, rather than just looking for everything that is bad.
Thanks for your comments, and apologies for coming across so negatively!
When I wrote this post, I was imagining what would happen if a student of mine used PSPP rather than SPSS in the MA stats course I teach. (We use SPSS and R/RStudio in parallel. Nobody is forced to use SPSS, although almost all of them strongly prefer it because it’s what they learned first. I regularly check whether we could switch to PSPP or JASP.) This student would not be able to follow the course, as many things aren’t implemented in PSPP yet. Of course I could design a course that worked perfectly in PSPP (others have done it), but I actually believe graphing is essential, and I do think VIFs have some value. (We always calculate a couple of them using plain linear regressions, which PSPP can do; this helps students understand VIF, but it’s not convenient.)
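For readers wondering what calculating a VIF with plain linear regressions amounts to, here is a minimal sketch. It is written in Python rather than PSPP syntax, the function names are my own, and the data are made up purely for illustration. The idea is to regress each predictor on all the other predictors, take the R² of that auxiliary regression, and compute 1 / (1 - R²).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y, predictors):
    """R-squared of an OLS regression of y on the predictors plus an intercept."""
    n = len(y)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xty = [sum(rows[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[p] * row[p] for p in range(k)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - fitted[i]) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def vif(predictors):
    """One VIF per predictor: 1 / (1 - R^2) from regressing it on the others."""
    result = []
    for j, target in enumerate(predictors):
        others = [p for i, p in enumerate(predictors) if i != j]
        result.append(1.0 / (1.0 - r_squared(target, others)))
    return result


# Made-up example: two predictors that correlate at r = 0.8,
# so both VIFs are about 1 / (1 - 0.64), i.e. roughly 2.78.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
print(vif([x1, x2]))
```

In PSPP the same thing means running one extra regression per predictor and reading R² off the output, which is exactly why it is instructive but not convenient.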
No, I don’t think PSPP is bad, but it’s not yet a replacement for SPSS.
P.S. It probably makes a big difference what discipline you’re in. For instance, psychology and sociology emphasize quite different aspects of statistical analysis.
I wonder about your thoughts on the latest versions of PSPP.
I haven’t looked at version 1.01 in detail yet, but it’s definitely getting closer to being useful as an alternative to SPSS. We can do multiple linear regression analysis (OLS), and binary logistic models. We still have very limited graphical capabilities, and there is, for instance, no VIF, a very simple regression diagnostic (I know it’s easy to calculate, but that’s not the point). GLM would be nice, to do stuff like Poisson regressions, and so would multiple imputation (SPSS has a rather intuitive way to show the different imputed datasets). What’s missing completely is Bayesian analysis. SPSS has started adding it, and of course we find it in the proclaimed SPSS alternatives like JASP and jamovi.
You forgot the most important aspect of comparing SPSS to PSPP: the difference in price.
PSPP is currently $0.00.
SPSS is between $1250.00 and $8290.00 (PER YEAR) for a single user license.
You had mentioned Bayesian analysis. That’s not available on the “cheaper” licenses. For that you’re looking to pay at least the price of a used car, every year.
It doesn’t seem beneficial to completely ignore cost in comparing two pieces of software. That’s like saying a Honda Accord is a poor replacement for a Bugatti Veyron.
Yes, PSPP is of course completely free as the post mentions: “it’s got a price tag hard to beat and moral superiority with being truly open source”. Now I don’t think the comparison to cars is appropriate. On the one hand, there’s probably quite a difference between a Honda and a Bugatti (but obviously I’m not an expert on these). On the other hand, I haven’t seen a Honda advertised as a replacement for a Bugatti. Sure PSPP aims to be a replacement for SPSS, but in my view it isn’t yet. Let’s be clear, I’m not here to slam PSPP. It’s an excellent piece of software that I regularly use, but that doesn’t make it a replacement for SPSS.