Understanding p-hacking through Yahtzee?

P-values are hard enough to understand (they appear ‘magically’ on the screen), so how can we best communicate the problem of p-hacking? How about using Yahtzee as an analogy to explain the intuition of p-hacking?

In Yahtzee, players roll five dice to make predetermined combinations (e.g. three of a kind, full house). They are allowed three rolls per turn, and can lock dice between rolls. Important for the analogy, players decide which combination they want to use for their round after the three rolls. (“I threw these dice, let’s see what combination fits best…”) This is what adds an element of strategy to the game, and players can optimize their expected (average) points.

Compare this with pre-registration (according to Wikipedia, this is actually a variant of the Yahtzee variant Yatzy — or is Yahtzee a variant of Yatzy? Whatever.). This means players choose a predetermined combination before throwing their dice. (“Now I’m going to try a full house. Let’s see if the dice play along…”)

If the implications are not clear enough, we can play a couple of rounds to see which way we get higher scores. Clearly, the Yahtzee way leads to (significantly?) more points, and a much smaller likelihood of ending up with 0 points because we failed to get, say, that full house we announced before throwing the dice. Sadly, though, p-values are designed for the forced Yatzy variant.
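To make the analogy concrete, here is a minimal simulation sketch in R. It assumes a hypothetical study with five independent outcome variables and no true effect at all; the sample size, the number of outcomes, and the use of a t-test are my assumptions for illustration. The “Yatzy” researcher announces one outcome in advance, the “Yahtzee” researcher reports whichever outcome happens to give the smallest p-value.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

n_sims <- 2000      # number of simulated studies (assumption)
n_obs <- 50         # observations per group in each study (assumption)
n_outcomes <- 5     # outcomes measured per study (assumption)

one_study <- function() {
  # No true effect: both groups are drawn from the same distribution
  p <- replicate(n_outcomes, t.test(rnorm(n_obs), rnorm(n_obs))$p.value)
  # Pre-registered: the first outcome was announced in advance
  # Hacked: pick the smallest p-value after seeing all of them
  c(preregistered = p[1], hacked = min(p))
}

p_values <- replicate(n_sims, one_study())        # 2 x n_sims matrix
false_positive_rate <- rowMeans(p_values < 0.05)  # share of "significant" studies
print(round(false_positive_rate, 3))
```

With independent outcomes, the pre-registered false positive rate should stay near the nominal 5%, while picking the best of five should land somewhere near 1 - 0.95^5, or roughly 23%.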

Image: cc-by by Joe King


It’s nice to see IMISCOE keep growing (now 39 member institutes), with the biggest ever conference just finished. More importantly, the conference is increasingly well attended, and we no longer have to struggle to find decent quantitative panels or economists attending. That’s an encouraging sign.

Our IMISCOE research group on brain waste in the labour market had another successful high-level panel, and I’ve seen excellent work on discrimination in the labour market and immigrant integration. Perhaps it’s time to drop the E (for Europe) in IMISCOE…?

Should I pay my interview respondents?

As an interdisciplinary institute, we have recently discussed whether (and under what circumstances) we should pay interview respondents and participants in our studies. Here are a few things I have compiled for this purpose. At this point, I really would like to thank the participants of the Rencontre Scientifique SFM on 28 June 2016 for the discussion and comments.

General Ethical Principles
General ethical principles apply, and the following table reviews a list of general ethical principles taken from the SFM Ethics Guidelines in view of the question of paying interview respondents. The focus is on interview participants in a wider sense to include focus groups, expert interviews, laypersons with expert knowledge, and informants.

Principle | Impact of Payment | Evaluation
no harm to subjects and researcher | not affected by payment | neutral
potential benefits to subjects | payment clearly as a benefit to subjects | positive
informed consent should normally be obtained: participants should be aware of nature of research and their involvement; participants have a right to withdraw consent at any time without giving any reason | payment creates incentives to participate when there is no consent; so payment needs to be separated from completion | negative
accordance to relevant law and legislation | payment is legal, but may be taxable | neutral
researchers must respect the rights, dignity, and interests of participants, including assurances of confidentiality and anonymity | payment does not affect these obligations on part of the researcher | neutral
research involving children should obtain consent from both the parents and the children, consistent with their capacity | payment does not affect this | neutral
reduce likelihood that research experience is disturbing to participants and others | payment does not change this, but may compensate for inadvertent violation of this principle | positive
avoid actions that may have deleterious consequences for researchers who come after or undermine reputation of the discipline | payment may increase the expectation that other researchers pay (which can undermine future researchers to carry out the same kind of research), payment as such is unlikely to undermine the reputation of the discipline, but may enhance it | mixed
ensure that funders appreciate the ethical obligations of researchers | not affected, if there is an ethical obligation to pay, this needs to be communicated to the funder – the fact that the project will be more expensive than a competitor’s is no excuse not to follow research ethics | neutral

Disciplinary Traditions in Experiments
In experiments, economists almost always pay their participants. Economists worry that without payment there is no real incentive to follow the instructions. With payment there is a possible problem with satisficing.

In psychology, they hardly ever do. Psychologists worry that with payment inherent preferences and motivations are overruled. Without payment there is a potential problem with participants trying to please the researchers.

General Considerations

Participants and their contributions should be respected.

Observation | Impact of Payment | Evaluation
Respondents in professional capacity are already paid | unclear | payment not necessary
Respondents in professional capacity may be ordered to participate | unclear | unclear
People like to talk (about themselves) | unnecessary commodification | payment not necessary
No obligation to participate, but payments can be interpreted as coercion [1] (Boddy et al. 2010), people from poorer background may be more susceptible to this kind of influence | undermines informed consent, especially for some parts of society | negative
Payment can reduce non-response bias, precisely because payment incites participation (Boddy et al. 2010; Grady 2011). | increases participation rates, reduces non-response bias | positive
Payment often facilitates recruitment (Grady 2011) | facilitates recruitment | positive
Is the influence undue: likely to distort judgement of risks and benefits? Likely to affect giving consent? Payment should never trump freely given informed consent. [2] | may affect judgement of individuals | negative
Reimbursing expenses is different from paying for time, skills, and expertise. The latter are subject to employment law, thus taxable income. [3] | depends on the form of payment | mixed
Payment may lead to fictional accounts [4] [5] | invalid responses | negative
Paying creates an obligation that can blur boundaries and undermine trust. | undermining trust | negative
May skew samples (Grady 2011) | skew samples | negative
Not paying may introduce bias by excluding participants, e.g. poor who cannot afford to participate, despite having something important to say (Thompson 1996) | reduce coverage bias | positive

Boddy et al. (2010) suggest: (1) create guidelines on when and how payments are made, (2) ensure that payment is justified, (3) ensure that those who withdraw are still paid, (4) carefully consider cases where consent may be given only because of the payment.

Specific Considerations

In the social sciences it is uncommon to pay interview participants. This, however, does not constitute an ethical statement. Direct costs incurred by participants (e.g. travel expenses) should always be reimbursed.

Sometimes it is necessary to pay to ensure participation (e.g. interviews with prostitutes or taxi drivers (e.g. Gambetta and Hamill 2005), some online survey panels). A similar case is paying for survey respondents, where payment may be necessary to get access.

There are some ideal-typical cases of participants, but in reality the boundaries are fluid: (1) Focus groups, (2) experts in their professional capacity, (3) lay persons with expert knowledge, and (4) informants.

(1) Focus Groups

Participants in focus groups are normally paid. Participants are generally paid a flat amount (depending on the circumstances) that covers travel plus a symbolic sum. Focus group participants need to prepare for the focus group, and they are asked to come to the venue the researchers specify at a given time (in the other cases considered here, it is the researcher who travels and adjusts his or her schedule). The participants are asked to follow the design and have less scope to deviate from it. Payment also creates an informal obligation to turn up, and often motivates participants who would otherwise not participate in such an endeavour despite supporting the research project: the specificities of a focus group (time, scheduling, and preparations) make it relatively costly to the participants.

(2) Experts in their Professional Capacity

Experts in their professional capacity are not normally paid. There is no reason to. Experts in their professional capacity are in a way paid to participate, but potential direct expenditures (e.g. travel) should be reimbursed if their employer does not cover them. Experts are free to tell the researchers what they want and have more scope to deviate from the questionnaire than participants in a focus group.

(3) Lay Persons with Expert Knowledge

Lay persons with expert knowledge are not normally paid. Lay persons with expert knowledge are like experts in that they are consulted as witnesses and for providing a synthesis: describing the situation of a group, for instance, not (just) that of the person in question. They may be active in associations, often on a voluntary basis. At the same time, given their position and expert knowledge, they are frequently consulted by researchers. They are free, however, to decline participation in a way experts in their professional capacity may not be. They often agree to participate because they have a message to share, and their contribution should be sufficiently recognized. As with other participants, direct costs (e.g. travel) should always be reimbursed. Lay persons with expert knowledge are free to tell the researchers what they want and have more scope to deviate from the questionnaire than participants in a focus group.

(4) Informants

Informants are not normally paid. Informants are free to decline participation. However, informants can be in a precarious situation (e.g. undocumented migrants). Researchers frequently provide symbolic gestures like vouchers in these situations, but in order not to create pressure to participate, these gestures should probably not be mentioned during recruitment: informants should not participate only or largely because of the rewards. Informants are free to tell the researchers what they want and have more scope to deviate from the questionnaire than participants in a focus group, and contrary to experts and lay persons with expert knowledge, no synthesis is expected on part of the informant.

Should sensitive topics mean more payment? – There is no reason why this should be the case; if anything, larger payments can lead to distorted answers.

Should those with higher income receive more payment? – No, payments are allowances and not payments to cover (potential) loss of earnings. The situation is different where access to a group of informants is contingent on payment (e.g. prostitutes).


The decision whether to pay has to be determined on a case-by-case basis

We now recommend that researchers consult the specific considerations above when deciding whether to pay participants.

1 payment is not coercion (no harm threatened), but may be undue inducement/influence (an offer one cannot refuse, controlling or irresistible influence, strong enough to compel participation against interest) (Grady 2011)

2 money may affect risk assessments and consent (Grady 2011)

3 Free prize draws are frequently used in marketing (not subject to employment laws, not subject to lottery laws). The nature of prizes, cash equivalents, and notification of winners must be clear (Boddy et al. 2010).

4 Film-makers do not pay participants, it’s regarded a “privilege” to tell one’s story, although travel expenses are generally reimbursed; usually there is no money in documentaries; they worry that if you pay, people will tell you what they think you want to hear; experts should be credited; most experts are happy to contribute; when filming with really poor populations, the crew often make a gesture after filming, e.g. food, clothes. Film-makers are worried that paying in advance (or agreeing to pay) leads to fictional accounts.

5 Journalists typically don’t pay. They worry that it affects what people say.

Boddy, J., T. Neumann, S. Jennings, V. Morrow, P. Alderson, R. Rees, and W. Gibson. 2010. The Research Ethics Guidebook: A Resource for Social Scientists. http://www.ethicsguidebook.ac.uk/.

Gambetta, D., and H. Hamill. 2005. Streetwise: How Taxi Drivers Establish Their Customers’ Trustworthiness. New York: Russell Sage Foundation.

Grady, Christine. 2011. “Ethical and Practical Considerations of Paying Research Participants.” Department of Clinical Bioethics Clinical Center/NIH. https://www.niehs.nih.gov/research/resources/assets/docs/ethical_and_practical_considerations_of_paying_research_participants_508.pdf.

Ruedin, Didier. 2016. “Ethics Guidelines SFM.” SFM University of Neuchâtel.

Thompson, Sonia. 1996. “Paying Respondents and Informants.” Social Research Update 14: 1–5.

The Magic of (Kernel Density) Plots

Today I was looking at some data I gathered on how different groups react to a certain stimulus. A classic case for the aggregate function in R:

aggregate(reaction_var, by=list(group=group_var), median, na.rm=TRUE)
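If you want to try this without my data, here is a small sketch with made-up values. The variable names match the call above, but the data are simulated and purely an assumption; the second call is one way to get mean and median per group in one go.

```r
set.seed(2016)  # arbitrary seed for reproducibility
# Made-up data (assumption): a 0-100 reaction score and a 0/1 group indicator
group_var <- rbinom(200, 1, 0.5)
reaction_var <- round(runif(200, 0, 100))
reaction_var[sample(200, 10)] <- NA   # a few missing values

# The call from above: median reaction per group, ignoring missing values
aggregate(reaction_var, by = list(group = group_var), median, na.rm = TRUE)

# Mean and median per group in a single call
by_group <- aggregate(reaction_var, by = list(group = group_var),
                      FUN = function(x) c(mean = mean(x, na.rm = TRUE),
                                          median = median(x, na.rm = TRUE)))
by_group
```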

I looked at the mean, median, and interpolated medians, but it was hard to make out whether there were real differences between the groups. That’s the moment I do what I tell my students to do all the time: graph, plot, … (and wonder why this time I thought I wouldn’t have to plot everything anyway)

Here’s the magic of the kernel densities that helped me see what’s going on.

plot(density(reaction_var, na.rm=TRUE, bw=8), main="", lty=2, ylim=c(0, 0.032), xlim=c(0,100), bty="n")
lines(density(reaction_var[group_var==1], na.rm=TRUE, bw=8), col="blue")
lines(density(reaction_var[group_var==0], na.rm=TRUE, bw=8), col="red")
abline(v=50, lty=3)

Here I only look at one particular stimulus, and first plot the kernel density for everyone (no square brackets). I chose a dashed line so that the aggregate is less dominant in the plot (lty=2); after all, I’m interested in the group differences (if there are any). I set the ylim in a second round, because the kernel density for the red group would otherwise be cut off. I also set the xlim, because the range of my variable is only 0 to 100: because of the bandwidth, kernel density plots never quite end at logical end points. I set the bandwidth of the kernel density to 8 so that it is exactly the same across the groups. The last argument (bty) gets rid of the box R puts around the plot by default.

I then add the kernel densities for the two groups of interest (square brackets to identify the group in question), each with a particular colour. Finally I added the median value for reference.
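For anyone who wants to play with the plot, here is a self-contained sketch with simulated stand-ins for my data (the values, group sizes, and group difference are assumptions). It stores the density objects first, which shows how far the curves extend beyond the data and lets R work out the ylim, so the second round is not needed.

```r
set.seed(1)  # arbitrary seed for reproducibility
# Simulated stand-ins for the data (assumption): scores from 0 to 100, two groups
group_var <- rbinom(400, 1, 0.5)
reaction_var <- rnorm(400, mean = 55 - 8 * group_var, sd = 20)
reaction_var <- pmin(pmax(round(reaction_var), 0), 100)  # clamp to 0-100

# Same bandwidth for every curve so that the shapes are comparable
d_all  <- density(reaction_var, bw = 8)
d_blue <- density(reaction_var[group_var == 1], bw = 8)
d_red  <- density(reaction_var[group_var == 0], bw = 8)

# By default the curve extends cut * bw (3 * 8 = 24) beyond the data range
range(d_all$x)

# ylim taken from the highest of the three curves
plot(d_all, main = "", lty = 2, xlim = c(0, 100),
     ylim = c(0, max(d_all$y, d_blue$y, d_red$y)), bty = "n")
lines(d_blue, col = "blue")
lines(d_red, col = "red")
abline(v = median(reaction_var), lty = 3)  # overall median for reference
```

The from and to arguments of density() can restrict where the curve is evaluated, but the estimate itself still smooths across the boundary, so I prefer to just set xlim.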


Well, what is going on? All three lines (combined, and each group separately) have roughly the same median value. The mean is lower for the blue group, but the interpolated median values are almost exactly the same as the median values. Difference or no real difference? I know that the old “textbook” rule that the difference between the median and mean indicates skew often fails, so definitely a case for plotting. And we can see that the central tendency is not telling us much in this case, it’s mostly about the tail.
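A numerical companion to the plot is to look at quantiles in the tails rather than only at the centre. The sketch below uses made-up scores (an assumption, not my data): both groups are centred on 50, but the blue group has a stretched lower tail.

```r
set.seed(7)  # arbitrary seed for reproducibility
n <- 500
z <- rnorm(n)
# Made-up scores (assumption): same centre, blue with a longer lower tail
red  <- 50 + 10 * rnorm(n)
blue <- 50 + 10 * ifelse(z < 0, 2.5 * z, z)

# Mean plus the 10%, 50% (median), and 90% quantiles
summarise <- function(v) c(mean = mean(v), quantile(v, probs = c(0.1, 0.5, 0.9)))
res <- sapply(list(red = red, blue = blue), summarise)
round(res, 1)
```

In a setup like this, the medians should come out close to each other, while the mean and especially the 10% quantile are clearly lower for the blue group: the difference sits in the tail.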

von Hippel, P. “Mean, Median, and Skew: Correcting a Textbook Rule.” Journal of Statistics Education 13, no. 2 (2005).

Same Explanatory Variables, Multiple Dependent Variables in R

I needed to run variations of the same regression model: the same explanatory variables with multiple dependent variables. In R, we can do this with a simple for() loop and assign().

First I specify the dependent variables:

dv <- c("dv1", "dv2", "dv3")

Then I create a for() loop to cycle through the different dependent variables:

for(i in 1:length(dv)){

Within this loop, I need to create an object to hold the models. I need a separate object for each model, so I create one with paste(). For the first dependent variable, this will be model1; for the second dependent variable model2, and so on.

model <- paste("model",i, sep="")

With this object to hold the model in place, I can run the model: the ith dependent variable is used. It is stored in an object called m.

m <- lm(as.formula(paste(dv[i],"~ ev1 + ev2")), data=mydata)

Now, I assign the model m to the model object created above: model1 for the first dependent variable, etc. That’s also the end of the for() loop.

assign(model, m)
}

We can now look at the results:

summary(model1); summary(model2); summary(model3)

or, more practical for comparing models (mtable() comes from the memisc package):

mtable(model1, model2, model3)
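Put together, the steps above look as follows. This is a sketch with simulated data (the data frame and its coefficients are assumptions; the variable names follow the post). The second variant is an alternative I find tidier: it keeps the models in a named list instead of creating separate objects with assign().

```r
set.seed(123)  # arbitrary seed for reproducibility
# Simulated data (assumption); column names follow the post
mydata <- data.frame(ev1 = rnorm(100), ev2 = rnorm(100))
mydata$dv1 <- 1 + 0.5 * mydata$ev1 + rnorm(100)
mydata$dv2 <- 2 - 0.3 * mydata$ev2 + rnorm(100)
mydata$dv3 <- rnorm(100)

dv <- c("dv1", "dv2", "dv3")

# Variant 1: the loop from the post, creating model1, model2, model3
for (i in seq_along(dv)) {
  m <- lm(as.formula(paste(dv[i], "~ ev1 + ev2")), data = mydata)
  assign(paste0("model", i), m)
}

# Variant 2: one named list holding all models
models <- lapply(dv, function(y) {
  lm(reformulate(c("ev1", "ev2"), response = y), data = mydata)
})
names(models) <- dv

# Coefficients side by side, one column per dependent variable
sapply(models, coef)
```

With the list, lapply(models, summary) gives all summaries at once, and adding a fourth dependent variable only means extending dv.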