Interpreting regression coefficients

Interpreting coefficients in linear regression equations can be unintuitive. Fortunately, there is a relatively straightforward, systematic way to figure out what each coefficient means.

A simple example

Let’s start with a simple linear regression equation predicting blood pressure based on a single binary variable for gender.1 In this case, $Y_i$ is the blood pressure of the $i$th subject. $X_{1i}$ is the gender of the $i$th subject (1=male, 0=female). The regression equation would then be:

$E(Y_i) = \beta_0 + \beta_1X_{1i}$

In words, this is saying that the expected value of the blood pressure for the $i$th subject is given by the right-hand side of the equation.

The question is: what do the $\beta$s (coefficients) mean?

I like to make a table:

| $X_{1}$ | $E(Y_i \mid X_{1})$ |
| --- | --- |
| 0 (female) | $\beta_0 + \beta_1 X_{1} = \beta_0 + \beta_1(0) = \beta_0$ |
| 1 (male) | $\beta_0 + \beta_1 X_{1} = \beta_0 + \beta_1(1) = \beta_0 + \beta_1$ |

The table has one row for each value of $X_1$, the only predictor variable in this regression equation, as specified in the left column. The right column is the expected value for blood pressure $Y_i$ given the specified value for gender ($X_1$). You get this simply by plugging the specified value of $X_1$ into the regression equation and removing any $\beta$s that are multiplied by 0.

Let’s look at a version of the table where the right column only includes the simplified answer:

| $X_{1}$ | $E(Y_i \mid X_{1})$ |
| --- | --- |
| 0 (female) | $\beta_0$ |
| 1 (male) | $\beta_0 + \beta_1$ |

Looking at the row for $X_1 = 0$, the interpretation for $\beta_0$ is straightforward: it’s the expected value for blood pressure ($Y_i$) among women.

What’s $\beta_1$? We know what $\beta_0 + \beta_1$ is just by reading the table: it’s $E(Y_i \vert X_1=1)$, or the expected value for blood pressure among men. And we know what $\beta_0$ is. So we can subtract: $(\beta_0 + \beta_1) - \beta_0 = \beta_1$.

The interpretation of $\beta_1$ is just this subtraction in words: it’s the difference in the expected value for blood pressure between men and women. In other words, it’s the mean difference in blood pressure attributable to gender.2
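This correspondence is easy to check numerically. The sketch below fits the one-predictor model by ordinary least squares on made-up blood pressure values (the numbers are hypothetical, chosen only to illustrate the algebra) and confirms that $\beta_0$ equals the mean among women and $\beta_1$ equals the male-minus-female mean difference.

```python
import numpy as np

# Hypothetical data: blood pressure for 3 women (X1 = 0) and 3 men (X1 = 1).
y = np.array([120.0, 118.0, 122.0,   # women
              130.0, 128.0, 132.0])  # men
x1 = np.array([0, 0, 0, 1, 1, 1])

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(x1, dtype=float), x1])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]

print(b0)  # ≈ 120.0: the mean among women
print(b1)  # ≈ 10.0: the mean difference, men minus women

# The same quantities computed directly from group means:
print(y[x1 == 0].mean())                      # mean among women
print(y[x1 == 1].mean() - y[x1 == 0].mean())  # mean difference
```

With a single binary predictor, the fitted regression reproduces the two group means exactly, so the coefficient estimates match the direct group-mean calculations.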

A slightly more complex example

Let’s add another binary variable $X_2$ for ever smoking to the regression equation:

$E(Y_i) = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i}$

And the corresponding table:

| $X_{1}$ | $X_{2}$ | $E(Y_i \mid X_{1}, X_{2})$ |
| --- | --- | --- |
| 0 (female) | 0 (never smoker) | $\beta_0$ |
| 1 (male) | 0 (never smoker) | $\beta_0 + \beta_1$ |
| 0 (female) | 1 (ever smoker) | $\beta_0 + \beta_2$ |
| 1 (male) | 1 (ever smoker) | $\beta_0 + \beta_1 + \beta_2$ |

And the interpretations of the $\beta$s:

• $\beta_0$ is the mean blood pressure for female never-smokers.
• $\beta_1$ is the mean difference in blood pressure comparing males to females, controlling for smoking.
  • This comes from subtracting the first two rows: $(\beta_0 + \beta_1) - \beta_0$.
  • You get the same answer from the last two rows: $(\beta_0 + \beta_1 + \beta_2) - (\beta_0 + \beta_2)$.
• $\beta_2$ is the mean difference in blood pressure comparing ever smokers to never smokers, controlling for gender.
  • This comes from subtracting the first row from the third row, or the second row from the fourth row.
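A quick numerical check of the two-predictor table. The cell means below are hypothetical and chosen to be exactly additive (no interaction), so the row subtractions in the table hold exactly.

```python
import numpy as np

# One observation per (X1, X2) cell, in table order:
# (0,0), (1,0), (0,1), (1,1)
x1 = np.array([0, 1, 0, 1])  # gender
x2 = np.array([0, 0, 1, 1])  # ever smoker
y = np.array([120.0, 130.0, 125.0, 135.0])  # exactly 120 + 10*x1 + 5*x2

X = np.column_stack([np.ones(4), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

print(b0)  # ≈ 120.0: female never-smokers
print(b1)  # ≈ 10.0: males vs. females, at either smoking level
print(b2)  # ≈ 5.0: ever vs. never smokers, for either gender
```

Because the data contain no interaction, subtracting row 2 from row 1 and row 4 from row 3 both recover the same $\beta_1$, just as the table predicts.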

An interaction term

Let $X_{3i}$ be the interaction between gender and smoking: $(X_{1i})(X_{2i})$. Here’s how the different combinations of $X_1$ and $X_2$ relate to the values of $X_3$:

| $X_1$ | $X_2$ | $X_3$ |
| --- | --- | --- |
| 0 | 0 | 0 |
| 1 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 1 | 1 |

Here’s the full regression equation:

$E(Y_i) = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + \beta_3X_{3i}$

And the corresponding table:

| $X_{1}$ | $X_{2}$ | $X_{3}$ (gender × smoking) | $E(Y_i \mid X_{1}, X_{2}, X_{3})$ |
| --- | --- | --- | --- |
| 0 (female) | 0 (never smoker) | 0 | $\beta_0$ |
| 1 (male) | 0 (never smoker) | 0 | $\beta_0 + \beta_1$ |
| 0 (female) | 1 (ever smoker) | 0 | $\beta_0 + \beta_2$ |
| 1 (male) | 1 (ever smoker) | 1 | $\beta_0 + \beta_1 + \beta_2 + \beta_3$ |

Note that the $X_3$ column is based entirely on the $X_1$ and $X_2$ columns. That’s why there aren’t more rows than for the previous example.

And the interpretations of the $\beta$s, noting where they differ from the previous example:

• $\beta_0$ is the mean blood pressure for female never-smokers.
• $\beta_1$ is the mean difference in blood pressure comparing males to females, controlling for smoking and the gender/smoking interaction.
• $\beta_2$ is the mean difference in blood pressure comparing ever smokers to never smokers among women.3
• $\beta_2 + \beta_3$ is the mean difference in blood pressure comparing ever smokers to never smokers among men.4
• $\beta_3$ gets a little confusing. Using the same subtraction technique: $(\beta_2 + \beta_3) - \beta_2 = \beta_3$. Conveniently, the previous two bullets tell us what $\beta_2 + \beta_3$ is and what $\beta_2$ is.

So if we translate $(\beta_2 + \beta_3) - \beta_2$ into words, we get [the mean difference in blood pressure comparing ever smokers to never smokers among men] minus [the mean difference in blood pressure comparing ever smokers to never smokers among women].

In other words, $\beta_3$ is the difference in mean differences.
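The difference-in-differences reading of $\beta_3$ can also be verified numerically. The cell means below are hypothetical, constructed so the smoking effect is 5 among women but 13 among men; $\beta_3$ should then come out to $13 - 5 = 8$.

```python
import numpy as np

# One observation per cell, in table order: (0,0), (1,0), (0,1), (1,1)
x1 = np.array([0, 1, 0, 1])  # gender
x2 = np.array([0, 0, 1, 1])  # ever smoker
x3 = x1 * x2                 # interaction term
y = np.array([120.0, 130.0, 125.0, 143.0])

X = np.column_stack([np.ones(4), x1, x2, x3])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# beta_3 = (smoking effect among men) - (smoking effect among women)
print(b3)                                  # ≈ 8.0
print((143.0 - 130.0) - (125.0 - 120.0))   # 8.0, computed directly
```

With four cells and four parameters, the model is saturated, so the fitted coefficients reproduce the cell means exactly and the difference-in-differences identity holds without approximation.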

Closing thoughts

This method can be extended to arbitrarily complex regression equations. It is especially helpful for unusual parameterizations, such as models with an interaction but no main effect for one of the interaction terms (a parameterization that does make sense in some cases). It also works for logistic regression, where the left-hand side is the log odds rather than the expected value, so differences in $\beta$s become log odds ratios.

I think this works well for me because it takes the complicated, unintuitive process of interpreting regression coefficients and provides a system and structure for parsing the model.

I’ve found that as long as I’m fastidious about setting up the table for each combination of parameters and very literal about interpreting each parameter based on the table, I’m always able to come up with a verbal interpretation for the $\beta$s given enough time.

Credit

Dr. Larry Magder teaches a very similar concept in his clustered data analysis class at UMB, which is the inspiration for this post. However, any mistakes are mine and mine alone.

1. This probably doesn’t make sense scientifically, but it serves the purpose of the example.
2. The expected value is the mean, so the difference in expected values is the difference in means, or mean difference.
3. Note that you no longer get the same answer when you subtract row 3 - row 1 and row 4 - row 2. This is because of the interaction.
4. This is what you get when you subtract row 4 - row 2.