I have recently explored open-source approaches to computer-assisted qualitative data analysis (CAQDA). As is common with open-source software, there are several options available, but as is also often the case, few of them can keep up with the commercial packages, and some have been abandoned.

Here I want to highlight just three options.

RQDA is built on top of R, which is perhaps not the most obvious choice but can have advantages. The documentation is steadily improving, making it more apparent that RQDA has the main features we’ve come to expect from CAQDA software. I find it a bit fiddly because of the many windows that tend to be opened, especially when working on a small screen.

Colloquium is Java-based, which makes it run almost everywhere. It offers a rather basic feature set, and tags can only be assigned to lines (which also implies that lines are the unit of analysis). Where it shines, though, is how it enables working in two languages in parallel.

CATMA is web-based, but runs without Flash, so it should run pretty much anywhere. It offers basic manual and automatic coding, but there’s one feature we really should care about: CATMA does TEI. This means that CATMA offers a standardized XML export that should remain usable in the future and that facilitates sharing the documents as well as the accompanying coding. That’s quite exciting.

What I find difficult to judge at the moment is whether TEI will be adopted by CAQDA software. Atlas.ti does some XML, but as far as I know it’s not TEI. And would TEI be more useful to future researchers than a SQLite database like the one RQDA produces?
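To make the SQLite side of that question a bit more concrete: such a file can be read with nothing but a standard library, which is part of its appeal for later reuse. The sketch below builds a tiny in-memory database and lists each coded segment with its code. The schema (tables `source`, `code`, `coding` and their columns) and the interview text are made up for illustration and are not RQDA's actual layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT, content TEXT);
        CREATE TABLE code   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE coding (code_id INTEGER, source_id INTEGER,
                             seg_start INTEGER, seg_end INTEGER);
    """)
    conn.execute(
        "INSERT INTO source VALUES (?, ?, ?)",
        (1, "interview_01", "I trust my neighbours. I rarely vote."),
    )
    conn.executemany(
        "INSERT INTO code VALUES (?, ?)",
        [(1, "trust"), (2, "participation")],
    )
    # Each coding marks a character range [seg_start, seg_end) in a source.
    conn.executemany(
        "INSERT INTO coding VALUES (?, ?, ?, ?)",
        [(1, 1, 0, 22), (2, 1, 23, 37)],
    )
    return conn


def coded_segments(conn):
    """Return (code name, source name, coded text) for every coding."""
    rows = conn.execute("""
        SELECT code.name, source.name, source.content,
               coding.seg_start, coding.seg_end
        FROM coding
        JOIN code   ON code.id = coding.code_id
        JOIN source ON source.id = coding.source_id
        ORDER BY coding.source_id, coding.seg_start
    """).fetchall()
    return [(code, src, text[start:end]) for code, src, text, start, end in rows]


if __name__ == "__main__":
    for code, src, segment in coded_segments(build_demo_db()):
        print(f"{src} [{code}]: {segment}")
```

A future researcher would still need to know what the tables mean, so the open question is less about whether the file can be opened and more about how well the structure is documented.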

Qualitative Studies are often not quite as small N

Qualitative studies are often described as small N studies because the number of respondents is small. I argue that this is the wrong perspective: what we really have in qualitative data, say interviews, is lots of data points clustered within individuals. Rather than focusing on the number of respondents, we should probably focus on the number of relevant statements (i.e., statements about our quantity of interest), and report this number along with the number of respondents. When computer-assisted qualitative data analysis (CAQDA) is used, I guess that number is the number of tags applied to passages about our quantity of interest. Seen this way, many qualitative studies are no longer small N studies, but we’re still faced with unstructured, messy data that may be difficult to analyse, and of course we don’t have independent observations, so generalization remains a challenge.
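As a minimal sketch of what reporting both numbers could look like, assume the coded passages have been exported as (respondent, tag) pairs. The data and the names `codings` and `summarize` are made up for illustration.

```python
from collections import Counter

# Hypothetical export: one (respondent, tag) pair per coded passage.
codings = [
    ("R1", "trust"), ("R1", "trust"), ("R1", "family"),
    ("R2", "trust"),
    ("R3", "family"), ("R3", "trust"), ("R3", "trust"), ("R3", "trust"),
]


def summarize(codings, relevant_tags):
    """Count respondents and relevant statements, overall and per respondent."""
    per_respondent = Counter(
        respondent for respondent, tag in codings if tag in relevant_tags
    )
    return {
        "respondents": len({respondent for respondent, _ in codings}),
        "relevant_statements": sum(per_respondent.values()),
        "per_respondent": dict(per_respondent),
    }


summary = summarize(codings, {"trust"})
print(summary)
```

The per-respondent counts matter as much as the total: they show how unevenly the statements are clustered, which is exactly why the observations cannot be treated as independent.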