How well do correspondence tests measure discrimination?

Correspondence tests are a useful field experiment to measure discrimination in the formal labour market. They are also known as CV experiments: researchers send two equivalent applications to an employer, differing only in the characteristic of interest (gender and ethnicity are common choices). If only the majority or male candidate is invited for a job interview, we probably have a case of discrimination. Once we aggregate across many employers, we can be fairly confident that we have captured discrimination.
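To make the aggregation step concrete, here is a minimal sketch of how paired outcomes might be summarised. The class and function names, and the ten made-up employer outcomes, are assumptions for illustration; they are not taken from any of the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class PairedTest:
    """Outcome of one employer receiving two equivalent applications."""
    majority_callback: bool
    minority_callback: bool


def summarise(tests):
    """Aggregate paired outcomes into simple discrimination measures."""
    n = len(tests)
    maj = sum(t.majority_callback for t in tests)
    mino = sum(t.minority_callback for t in tests)
    # Employers who invited only one of the two candidates.
    only_maj = sum(t.majority_callback and not t.minority_callback for t in tests)
    only_min = sum(t.minority_callback and not t.majority_callback for t in tests)
    return {
        "majority_rate": maj / n,
        "minority_rate": mino / n,
        # Callbacks for the majority per callback for the minority.
        "callback_ratio": maj / mino if mino else float("inf"),
        # Share of employers favouring the majority minus share favouring the minority.
        "net_discrimination": (only_maj - only_min) / n,
    }


# Hypothetical data: 10 employers (illustration only).
example = (
    [PairedTest(True, True)] * 3      # both invited
    + [PairedTest(True, False)] * 2   # only majority invited
    + [PairedTest(False, True)] * 1   # only minority invited
    + [PairedTest(False, False)] * 4  # neither invited
)
print(summarise(example))
```

A single employer inviting only the majority candidate could be chance; the summary only becomes informative once many employers are pooled.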

Most studies stop there, declining any invitation to an interview to reduce the burden on employers. The hiring process, however, does not end there. Lincoln Quillian and his team have now compiled a list of studies that went further. They find that the first stage of screening is far from the end of discrimination, and the job interview can increase overall discrimination substantially. Correspondence tests focusing on the first stage will capture only some of the discrimination. Interestingly, the discrimination at the job interview stage appears unrelated to discrimination at the first screening of applications.
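A rough sketch of why a second stage matters: if discrimination occurs both at the callback and at the interview, the two ratios multiply. All rates below are hypothetical numbers chosen for illustration, not estimates from Quillian et al., and the function name is my own.

```python
def stage_ratios(callback_maj, callback_min, offer_maj, offer_min):
    """Return (callback ratio, overall job-offer ratio).

    offer_maj / offer_min are the probabilities of a job offer
    conditional on having been invited to an interview.
    """
    callback_ratio = callback_maj / callback_min
    # Probability of an offer = P(callback) * P(offer | callback).
    overall_ratio = (callback_maj * offer_maj) / (callback_min * offer_min)
    return callback_ratio, overall_ratio


# Hypothetical rates (assumptions for illustration only).
first_stage, overall = stage_ratios(
    callback_maj=0.20, callback_min=0.15,
    offer_maj=0.30, offer_min=0.20,
)
print(f"callback ratio: {first_stage:.2f}, job-offer ratio: {overall:.2f}")
```

With these made-up numbers, a study stopping at the callback would report a noticeably smaller ratio than one following applicants through to the job offer.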

Quillian, L., Lee, J., & Oliver, M. (2018). Meta-Analysis of Field Experiments Shows Significantly More Racial Discrimination in Job Offers than in Callbacks. Northwestern Working Paper Series, 18(28). Retrieved from

Zschirnt, E., & Ruedin, D. (2016). Ethnic discrimination in hiring decisions: A meta-analysis of correspondence tests 1990–2015. Journal of Ethnic and Migration Studies, 42(7), 1115–1134.

Image: CC-by Richard Eriksson.

Ethnic discrimination in hiring: UK edition

The BBC reports on a large correspondence test in the UK carried out by the excellent GEMM project. It’s good to see this reach a wider audience; it’s sad to see the results from our meta-analysis confirmed once again.

British citizens from ethnic minority backgrounds have to send, on average, 60% more job applications to get a positive response from employers compared to their white counterparts

What I really like about this short report by the BBC is that the essentials are covered. Yes, we see discrimination, but no, it’s not so bad that none of the minority applicants would ever succeed. They also start the piece with an example of someone changing their name on the CV as a strategy to counter expected (or experienced) discrimination. And they highlight that discrimination has not declined despite policy changes, and indeed that discrimination affects native citizens who happen to have a ‘foreign’ name: they pay for an action of their parents or grandparents.
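The “60% more job applications” figure quoted above can be translated into relative callback rates with a little arithmetic. This is a sketch under a simplifying assumption of my own (each application has the same, independent chance of a positive response, so the expected number of applications per response is one over the callback rate); the 24% majority rate is a made-up number for illustration.

```python
def relative_callback_rate(extra_applications_share):
    """Minority callback rate as a fraction of the majority rate.

    If minority applicants need (1 + x) times as many applications per
    positive response, and applications per response = 1 / rate, then
    rate_minority / rate_majority = 1 / (1 + x).
    """
    return 1 / (1 + extra_applications_share)


fraction = relative_callback_rate(0.60)  # roughly 0.625
hypothetical_majority_rate = 0.24        # assumption, not from the report
implied_minority_rate = hypothetical_majority_rate * fraction

print(f"minority rate is {fraction:.1%} of the majority rate")
print(f"e.g. {hypothetical_majority_rate:.0%} vs {implied_minority_rate:.0%}")
```

In other words, needing 60% more applications corresponds to a callback rate about three eighths lower, which is substantial but far from zero.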

Are employers in Britain discriminating against ethnic minorities?, GEMM project: PDF of report

Zschirnt, Eva, and Didier Ruedin. 2016. ‘Ethnic Discrimination in Hiring Decisions: A Meta-Analysis of Correspondence Tests 1990–2015’. Journal of Ethnic and Migration Studies 42 (7): 1115–34.

Discrimination not declining

A new meta-analysis draws on correspondence tests in the US to show that levels of ethnic discrimination in hiring do not seem to have changed much since 1989. This persistence in racial discrimination is bad news, and indeed Eva Zschirnt and I showed the same result across OECD countries a year ago. While policies have changed, especially in the European Union, looking at the ‘average’ from correspondence tests suggests that they may not have been effective.

Correspondence tests are widely accepted as a means to identify the existence of ethnic discrimination in the labour market, and as field experiments they are in a relatively good position to make the causal claims we typically want to make. It turns out that most correspondence tests have not paid sufficient attention to heterogeneity, which, as David Neumark and Judith Rich demonstrate, means that they likely over-estimate the degree of discrimination. Unfortunately, most older studies did not vary the groups in a way that would allow this to be fixed post hoc. If we throw these out of the meta-analysis, we probably no longer have enough studies to make claims about changes over time.

Meta-analyses are no doubt an important tool of science, but there’s always a delicate balance to be struck: are the experiments included really comparable? Here we’re looking at field experiments in different countries, different labour markets, different jobs, and different ethnic groups. We can control for these factors in the meta-analysis, but with the limited number of studies we have, this might not be sufficient to silence critics. With correspondence tests, we only cover entry-level jobs, and despite much more fine-grained studies going into the field recently, we don’t have a tool to really identify why discrimination takes place.

Neumark, David, and Judith Rich. 2016. ‘Do Field Experiments on Labor and Housing Markets Overstate Discrimination? Re-Examination of the Evidence’. NBER Working Papers w22278 (May).

Quillian, Lincoln, Devah Pager, Ole Hexel, and Arnfinn H. Midtbøen. 2017. ‘Meta-Analysis of Field Experiments Shows No Change in Racial Discrimination in Hiring over Time’. Proceedings of the National Academy of Sciences, September, 201706255. doi:10.1073/pnas.1706255114.

Zschirnt, Eva, and Didier Ruedin. 2016. ‘Ethnic Discrimination in Hiring Decisions: A Meta-Analysis of Correspondence Tests 1990–2015’. Journal of Ethnic and Migration Studies 42 (7): 1115–34. doi:10.1080/1369183X.2015.1133279.

Image: CC-by CharlotWest

An Overview of Recent Correspondence Tests

In a recent IZA working paper, Stijn Baert offers a long list of correspondence tests: field experiments where equivalent CVs are sent to employers to capture discrimination in hiring. What’s quite exciting about this list is that it covers all kinds of characteristics, from nationality to gender, from religion to sexual orientation. What’s also great is the promise to keep this list up-to-date on his website. At the same time, the register does not describe the inclusion criteria in great detail. I was surprised not to find some of the studies Eva Zschirnt and I included in our meta-analysis on the list, despite our making all the material available on Dataverse. Was this an oversight (the title of the working paper does include an “almost”), or was this due to the inclusion criteria? What I found really disappointing was the misguided focus on p-values to identify the ‘treatment effect’. All in all, it is a useful list for those interested in hiring discrimination more generally.

Did I just find these “missing” papers in the meta-analysis on hiring discrimination?

When Eva Zschirnt and I were working on the meta-analysis on ethnic discrimination in hiring, I also ran one of these tests for publication bias (included in the supplementary material S12). According to the test, there are a couple of studies “missing”, and we left this as a puzzle. Here’s what I wrote at the time: “Given that studies report discrimination against minority groups rather consistently, we suspect that a study finding no difference between the minority and majority population, or even one that indicates positive discrimination in favour of the minority groups would actually be easier to publish.” (emphasis in original).
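For readers unfamiliar with such tests: the post does not say which test was used, so the following is only an illustration of one common approach, an Egger-type regression. It regresses the standardised effect (effect divided by its standard error) on precision (one over the standard error); an intercept far from zero hints at funnel-plot asymmetry, which is often read as “missing” studies. The effect sizes and standard errors are made up, and the sketch leaves out the significance test on the intercept.

```python
def egger_regression(effects, std_errors):
    """Ordinary least squares of (effect / se) on (1 / se).

    Returns (intercept, slope). The intercept is the asymmetry measure;
    the slope estimates the underlying effect.
    """
    y = [e / s for e, s in zip(effects, std_errors)]
    x = [1 / s for s in std_errors]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical studies: small studies (large se) report larger effects.
effects = [0.2, 0.3, 0.5, 0.8]
std_errors = [0.05, 0.1, 0.2, 0.4]
intercept, slope = egger_regression(effects, std_errors)
print(f"intercept: {intercept:.2f}, slope: {slope:.2f}")
```

If all studies reported the same effect regardless of their size, the intercept would be zero; the clearly positive intercept in this made-up example is the kind of pattern such a test flags.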

We were actually quite confident that we had not missed many studies. One way out is to dismiss the assumptions behind the tests for publication bias. Perhaps a soft target, but who are we to say that there are no missing studies?

Here’s another explanation that didn’t occur to me at the time, and nobody we asked about it explicitly came up with it. It’s just a guess, and will remain one. In 2012, David Neumark suggested a correction for what he calls the “Heckman critique”. We were aware of this, but I did not connect the dots until reading David Neumark and Judith Rich’s 2016 NBER working paper, where they apply this correction to 9 existing correspondence tests. They find that the level of discrimination is often over-estimated without the correction: “For the labor market studies, in contrast, the evidence is less robust; in about half of cases covered in these studies, the estimated effect of discrimination either falls to near zero or becomes statistically insignificant.”

This means that the “Heckman critique” seems justified, and at least in the labour market some of the field experiments seem to overstate the degree of discrimination. Assuming that this is not unique to the papers they could re-examine, the distribution of effect sizes in the meta-analysis would be a bit different and include more studies towards the no-discrimination end. I can imagine that in this case, the test for publication bias would no longer suggest “missing” studies. Put differently, these “missing” studies were not missing, but reported biased estimates.

The unfortunate bit is that we cannot find out, because the correction provided by David Neumark has data requirements that not all existing studies can meet. But at least I have a potential explanation for that puzzle: bias of a different kind than publication bias and the so-called file-drawer problem.

Neumark, D. (2012). ‘Detecting discrimination in audit and correspondence studies’, Journal of Human Resources, 47(4), pp. 1128–1157.

Neumark, David, and Judith Rich. 2016. “Do Field Experiments on Labor and Housing Markets Overstate Discrimination? Re-Examination of the Evidence.” NBER Working Papers w22278 (May).

Zschirnt, Eva, and Didier Ruedin. 2016. “Ethnic Discrimination in Hiring Decisions: A Meta-Analysis of Correspondence Tests 1990–2015.” Journal of Ethnic and Migration Studies 42 (7): 1115–34. doi:10.1080/1369183X.2015.1133279.