The role of source language in Wordscores etc.

My paper on the role of source language in the automatic coding of political texts (Wordscores, dictionary coding) is now available online. I make use of Swiss party manifestos to examine the impact of source language on party positions derived from the manifestos: does it matter if a French or German manifesto is used? The conclusion is that both stemming and particularly stop words are important to obtain comparable results for Wordscores, while the keyword-based dictionary approach is not affected by language differences. Replication material is available on my Dataverse.

MIPEX and Naturalization Policies

In a recent working paper Thomas Huddleston and Maarten Peter Vink demonstrate that the different dimensions covered by the MIPEX indicators all tend to correlate strongly with naturalization policies. A country tough on naturalization tends to be tough on other aspects of immigration and integration policies.

Although it did not refer to this debate directly, my 2011 working paper on the reliability of the MIPEX as a scale fully supports this finding. In that paper I show that all MIPEX indicators combined form a reliable scale, but I also highlight redundancies. These findings laid the groundwork for my recent post on remastering MIPEX indicators depending on the research question.

MIPEX Remastered

The MIPEX (Migrant Integration Policy Index) is a relatively widely used index. I have demonstrated empirically that it can be used as a scale, but have voiced some concerns about the weak theoretical foundation.

The number of countries covered by the MIPEX is increasing, and there are 148 indicators available. In an attempt to make the most of these data, I have picked the parts of the MIPEX that most closely fit the typology developed in Koopmans et al. (2005).

To get a better handle on developments over time, I use the SOM extension of MIPEX. Here is how the situation in Austria and the Netherlands has changed over time.


We can discuss the labels, but there are clear differences between countries, and citizenship regimes are clearly dynamic. This means that, yes, citizenship regimes are worth investigating, but country dummies will fail to provide an accurate picture.

Koopmans, Ruud, Paul Statham, Marco Giugni, and Florence Passy. 2005. Contested Citizenship: Immigration and Cultural Diversity in Europe. Minneapolis: University of Minnesota Press.

Ruedin, Didier. 2011. “The reliability of MIPEX indicators as scales.” SOM Working Paper 3: 1–19.

Immigrant Diversity and Social Cohesion

Immigration to Western European countries is nothing new. Arguably, the diversity of immigrants has increased in recent years. Inevitably, this leads to a more diverse population in most European countries, and this diversity is viewed by some with scepticism. The fear is that increased (ethnic) diversity due to immigration threatens social cohesion. However, despite similar demographic developments across Western European countries, reactions to increased diversity have differed considerably from country to country. The reason for this can be found in historical legacies and the development of the welfare state, an institution that is inclusive by design.

The concept of social cohesion is broad, to say the least. A simple definition can be derived from shared values and feelings of togetherness in society. Depending on political colour, one aspect or the other tends to be highlighted. A minimalistic definition thus insists on individuals feeling part of society and trusting each other, with other groups and individuals accepted as full members of society.

In the context of immigration, five indicators can be considered: generalized trust, naturalization rates, confidence in key institutions, early leavers, and voter turnout.

Measuring Descriptive Representation

In the last two weeks I had several conversations on how best to measure descriptive representation (i.e. the numerical representation of groups). I treated this in my recent monograph, and also in a conference paper in 2011. In my view, there are three important points: (1) What’s best depends on your research question. (2) It is important to include both the population and the representatives. (3) I recommend two measures. For the representation of a single group (e.g. a specific minority group, or all minorities combined as opposed to the majority population), I use the ratio Ri / Pi. For the situation at the national level, I prefer the Rose index (1 – 0.5 * Σ|Ri – Pi|, summed over all groups i) over the Gallagher index, although, following recent simulations I have undertaken, less strongly than previously. Ri stands for the proportion of group i among the representatives, Pi for its proportion in the population.
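The measures mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Rose index is taken to sum the absolute differences over all groups, the Gallagher index is written in its usual least-squares form on proportions (sqrt(0.5 * Σ(Ri – Pi)²)), and the function names and example proportions are hypothetical, not taken from the monograph or the conference paper.

```python
import math


def representation_ratio(r_i, p_i):
    """Ri / Pi for a single group: 1 means proportional representation,
    values below 1 under-representation, values above 1 over-representation."""
    if p_i <= 0:
        raise ValueError("population proportion must be positive")
    return r_i / p_i


def rose_index(representatives, population):
    """Rose index on proportions: 1 - 0.5 * sum(|Ri - Pi|).
    Assumes both lists cover the same groups in the same order;
    1 means perfectly proportional representation."""
    if len(representatives) != len(population):
        raise ValueError("both lists must cover the same groups")
    return 1 - 0.5 * sum(abs(r - p) for r, p in zip(representatives, population))


def gallagher_index(representatives, population):
    """Gallagher (least squares) index on proportions:
    sqrt(0.5 * sum((Ri - Pi)^2)). Unlike the Rose index, this measures
    disproportionality, so 0 means perfectly proportional representation."""
    if len(representatives) != len(population):
        raise ValueError("both lists must cover the same groups")
    return math.sqrt(0.5 * sum((r - p) ** 2 for r, p in zip(representatives, population)))


# Hypothetical example: majority and one minority group.
representatives = [0.9, 0.1]  # Ri: proportions among the representatives
population = [0.8, 0.2]       # Pi: proportions in the population

minority_ratio = representation_ratio(representatives[1], population[1])
rose = rose_index(representatives, population)
gallagher = gallagher_index(representatives, population)
```

In this made-up case the minority holds half the share of seats its population share would suggest, which the single-group ratio shows directly, while the two national-level indices summarize the deviation across both groups in one number.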