In this video, we set the reference category in a regression model in R. When a regression model includes a categorical variable, R picks one category as the reference (also called the base category), against which the other categories are compared; by default this is the first level of the factor. To change the reference, we can use relevel().
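As a minimal sketch of the idea (not necessarily the example used in the video), the code below uses R's built-in mtcars data and treats the number of cylinders as a categorical variable. The choice of dataset, the model formula, and the object names cars, fit_default and fit_releveled are assumptions made for illustration.

```r
# Built-in dataset; treat the number of cylinders as a categorical variable.
cars <- mtcars
cars$cyl <- factor(cars$cyl)          # levels: "4", "6", "8"

# Default: the first level ("4") is the reference category.
fit_default <- lm(mpg ~ cyl, data = cars)
names(coef(fit_default))              # "(Intercept)" "cyl6" "cyl8"

# Make "8" the reference category instead.
cars$cyl <- relevel(cars$cyl, ref = "8")
fit_releveled <- lm(mpg ~ cyl, data = cars)
names(coef(fit_releveled))            # "(Intercept)" "cyl4" "cyl6"
```

The fitted values are the same in both models; only the category that the coefficients are compared against changes.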
Why is there only a single coefficient for a categorical variable?
In this video, we solve the mystery of why a categorical variable sometimes gets only one coefficient in the results of a multiple regression model. When things work as expected, a variable with 4 categories produces 3 coefficients (lines) in the results, because one category is kept as the reference (base). If the software treats the categorical variable as continuous, for example because its categories are coded as numbers, we get only one coefficient: a single slope. The solution is to tell the software that the variable is categorical; in R we can use as.factor(), or store the categories as strings rather than numbers.
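The difference can be seen in a small sketch. The data frame d, its columns region and y, and all values in it are made up for illustration; region is coded 1 to 4, but the codes are only labels for four categories.

```r
# Hypothetical data: `region` is coded 1-4, but the codes are just category labels.
d <- data.frame(
  region = rep(1:4, times = 5),
  y      = c(10, 14, 9, 20, 11, 15, 8, 21, 9, 13,
             10, 19, 10, 14, 9, 20, 12, 15, 8, 22)
)

# Numeric predictor: R fits a single slope for `region`.
fit_numeric <- lm(y ~ region, data = d)
length(coef(fit_numeric))   # 2: intercept + one slope

# Factor predictor: one coefficient per non-reference category.
d$region_f <- as.factor(d$region)
fit_factor <- lm(y ~ region_f, data = d)
length(coef(fit_factor))    # 4: intercept + 3 category coefficients
```

Storing the categories as character strings in the data frame would have the same effect here, since lm() converts character predictors to factors when it builds the model.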