An Overview of Recent Correspondence Tests

In a recent IZA working paper, Stijn Baert offers a long list of correspondence tests: field experiments where equivalent CVs are sent to employers to capture discrimination in hiring. What’s quite exciting about this list is that it covers all kinds of characteristics, from nationality to gender, from religion to sexual orientation. What’s also great is the promise to keep this list up-to-date on his website. At the same time, the register does not describe the inclusion criteria in great detail. I was surprised not to find some of the studies Eva Zschirnt and I included in our meta-analysis on the list, despite our making all the material available on Dataverse. Was this an oversight (the title of the working paper does include an “almost”), or was this due to the inclusion criteria? What I found really disappointing was the misguided focus on p-values to identify the ‘treatment effect’. All in all, a useful list for those interested in hiring discrimination more generally.

Did I just find these “missing” papers in the meta-analysis on hiring discrimination?

When Eva Zschirnt and I were working on the meta-analysis on ethnic discrimination in hiring, I also ran one of these tests for publication bias (included in the supplementary material S12). According to the test, there are a couple of studies “missing”, and we left this as a puzzle. Here’s what I wrote at the time: “Given that studies report discrimination against minority groups rather consistently, we suspect that a study finding no difference between the minority and majority population, or even one that indicates positive discrimination in favour of the minority groups would actually be easier to publish.” (emphasis in original).
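To illustrate the logic of such tests: an Egger-type regression checks whether smaller studies report systematically larger effects, which shows up as funnel-plot asymmetry. This is a generic sketch on simulated data, not necessarily the exact test reported in S12.

```r
# Egger-type regression for funnel-plot asymmetry, on simulated data.
# Standardized effects are regressed on precision; an intercept far
# from zero suggests small-study effects (possible publication bias).
set.seed(42)
k  = 40
se = runif(k, 0.05, 0.5)            # simulated standard errors
es = rnorm(k, mean = 0.4, sd = se)  # simulated effect sizes
egger = lm(I(es / se) ~ I(1 / se))
summary(egger)$coefficients["(Intercept)", "Estimate"]
```

With unbiased simulated data as here, the intercept should be close to zero; “missing” studies in one corner of the funnel would pull it away from zero.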

We were actually quite confident that we had not missed many studies. One way to resolve the puzzle is to dismiss the assumptions behind the tests for publication bias. Perhaps a soft target, but who are we to say that there are no missing studies?

Here’s another explanation that didn’t occur to me at the time, and nobody we asked about it explicitly came up with it. It’s just a guess, and will remain one. David Neumark suggested a correction for what he calls the “Heckman critique” in 2012. We were aware of this, but I did not connect the dots until reading David Neumark and Judith Rich’s 2016 NBER working paper where they apply this correction to 9 existing correspondence tests. They find that the level of discrimination is often over-estimated without the correction: “For the labor market studies, in contrast, the evidence is less robust; in about half of cases covered in these studies, the estimated effect of discrimination either falls to near zero or becomes statistically insignificant.”

This means that the “Heckman critique” seems justified, and at least in the labour market some of the field experiments seem to overstate the degree of discrimination. Assuming that this is not unique to the papers they could re-examine, the distribution of effect sizes in the meta-analysis would be a bit different and include more studies towards the no-discrimination end. I can imagine that in this case, the test for publication bias would no longer suggest “missing” studies. Put differently, these “missing” studies were not missing, but reported biased estimates.

The unfortunate bit is that we cannot find out, because the correction provided by David Neumark has data requirements that not all existing studies can meet. But at least I have a potential explanation for that puzzle: bias of a different kind than publication bias and the so-called file-drawer problem.

Neumark, David. 2012. “Detecting Discrimination in Audit and Correspondence Studies.” Journal of Human Resources 47 (4): 1128–57.

Neumark, David, and Judith Rich. 2016. “Do Field Experiments on Labor and Housing Markets Overstate Discrimination? Re-Examination of the Evidence.” NBER Working Papers w22278 (May).

Zschirnt, Eva, and Didier Ruedin. 2016. “Ethnic Discrimination in Hiring Decisions: A Meta-Analysis of Correspondence Tests 1990–2015.” Journal of Ethnic and Migration Studies 42 (7): 1115–34. doi:10.1080/1369183X.2015.1133279.

Are Low-Skilled Minorities Discriminated More?

Today a colleague asked me whether our recent meta-analysis drew any inferences on whether low-skilled minorities face more discrimination than highly-skilled minorities. It does so only at the margins — mostly in the supplementary material (S13). And to be precise, with the data at hand, we can’t say anything about the skills of the applicants; we’re talking about the skills levels required for the job in question.

What about the average call-back ratios by skills-level of the job? The data are available on Dataverse: doi:10.7910/DVN/ZU8H79.

First we load the data file.

disc = read.csv("meta-clean.csv", header=TRUE, sep=",", fileEncoding="UTF8")

Then we simply average across skills levels (using aggregate). For the meta-analytic regression analysis, refer to the supplementary material. Here we only look at the “subgroup” level, and store the averages in a variable called x.

# "ratio" is assumed here to be the name of the call-back ratio column in meta-clean.csv
x = aggregate(disc$ratio[disc$global=="subgroup"], by=list(Global=disc$global[disc$global=="subgroup"], Skills=disc$skills[disc$global=="subgroup"]), mean, na.rm=TRUE)

Since I want a figure, I’m sorting the result, and I don’t include the call-back rate for studies where the skills level was not indicated. Then I add the labels.

p = sort(x[2:4,3])
names(p) = c("high skills", "mixed skills", "low skills")

Finally, here’s the figure. I specify ylim to include zero so as not to suggest bigger differences than there are.

barplot(p, ylim=c(0,2.2), bty="n", ylab="Average Call-Back Ratio")

The difference between “high” and “low” is statistically significant in a t-test (p=0.002).
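For illustration, here is how such a t-test can be run. Since the full data are on Dataverse, this sketch uses a mock data frame in place of disc; the column names “skills” and “ratio” are assumptions about the data file.

```r
# Mock data frame standing in for disc (the real call-back ratios are
# on Dataverse); "skills" and "ratio" are assumed column names.
set.seed(1)
mock = data.frame(
  skills = rep(c("high", "low"), each = 20),
  ratio  = c(rnorm(20, mean = 1.4, sd = 0.3),   # high-skills jobs
             rnorm(20, mean = 1.9, sd = 0.3)))  # low-skills jobs
# Welch two-sample t-test comparing the two skills levels
res = t.test(ratio ~ skills, data = mock)
res$p.value
```

The formula interface (ratio ~ skills) compares the two groups directly; R’s default is the Welch test, which does not assume equal variances.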

Also on Figshare.

I also looked at the ISCO-88 codes. Now, the level of detail included in the different studies varies greatly, and the data file includes text rather than numbers, because some cells include non-numeric characters. After struggling a bit with as.numeric on factors, I chose a different approach using our good friend sapply.
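The pitfall with as.numeric on factors, in a minimal example:

```r
# as.numeric() on a factor returns the internal level codes,
# not the values the labels represent.
f = factor(c("10", "2", "2"))
as.numeric(f)                # 1 2 2  -- level indices, not the values
as.numeric(as.character(f))  # 10 2 2 -- the values one actually wants
```

Going via as.character first is the standard workaround; the sapply approach below does the same conversion row by row.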

I create a new variable for the 1-digit ISCO-88 codes. There are 781 rows. For each row, I convert what’s there into a character string (in case it isn’t already), then use substr to cut the first character, and then turn this into numbers.

disc$isco88_1 = sapply(1:781, function(x) as.numeric(substr(as.character(disc$isco88[x]), 1, 1)))

We can again run aggregate to average across occupation levels.

# again, "ratio" is assumed to be the name of the call-back ratio column
aggregate(disc$ratio[disc$global=="subgroup"], by=list(Global=disc$global[disc$global=="subgroup"], ISCO88=disc$isco88_1[disc$global=="subgroup"]), mean, na.rm=TRUE)

ISCO88 x
2 1.629796
4 1.422143
5 2.142449

I am not including all the output, because there are too few cases for some of the levels:

ISCO-88 level   1   2   3   4   5   7   8   9
N               3  68   8  36  62   7  11  12
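Counts like these come from a simple call to table(); the short vector below is a mock stand-in for disc$isco88_1, since the full data are on Dataverse.

```r
# table() counts occurrences per 1-digit ISCO-88 code; NAs (studies
# without a code) are dropped by default. Mock vector for illustration.
isco88_1 = c(2, 2, 2, 4, 5, 5, 7, NA)
table(isco88_1)
# isco88_1
# 2 4 5 7
# 3 1 2 1
```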

Zschirnt, Eva, and Didier Ruedin. 2016. “Ethnic Discrimination in Hiring Decisions: A Meta-Analysis of Correspondence Tests 1990–2015.” Journal of Ethnic and Migration Studies 42 (7): 1115–34. doi:10.1080/1369183X.2015.1133279.

Ethnic discrimination in hiring decisions: a meta-analysis of correspondence tests 1990–2015

Eva Zschirnt and I have undertaken a meta-analysis of correspondence tests in OECD countries between 1990 and 2015. It is now available on the website of JEMS. We cover 738 correspondence tests in 43 separate studies. In addition to summarizing research findings, we focus on groups of specific tests to ascertain the robustness of findings, emphasizing (lack of) differences across countries, gender, and economic contexts. Discrimination against ethnic minority and immigrant candidates remains commonplace across time and contexts.