Tutorial 3: Marginal Effects

In Tutorial 1 we computed a basic average marginal effect for age. This tutorial goes deeper. We will learn how to compute marginal effects for continuous and categorical variables, how to evaluate them at different points, and how to use them with interaction terms.

What you will learn

  • Average Marginal Effect (AME), Marginal Effect at the Mean (MEM), and Marginal Effect at Representative values (MER)

  • Discrete contrasts for categorical variables

  • Discrete change for dummy variables

  • Subgroup marginal effects via atexog

Setup

We continue with the same fitted model and Margins object from the previous tutorials:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from smmargins import Margins

rng = np.random.default_rng(7)
N = 5_000
df = pd.DataFrame({
    "age":    rng.normal(45, 12, N).clip(18, 90),
    "income": rng.lognormal(10.5, 0.4, N),
    "educ":   rng.choice(["hs", "college", "grad"], N, p=[0.4, 0.4, 0.2]),
    "female": rng.integers(0, 2, N),
})
eta = (-4.0 + 0.05 * df["age"] + 0.00001 * df["income"]
       + 0.8 * (df["educ"] == "college") + 1.4 * (df["educ"] == "grad")
       + 0.3 * df["female"] - 0.0004 * df["age"] * df["female"])
df["voted"] = (rng.uniform(0, 1, N) < 1 / (1 + np.exp(-eta))).astype(int)

fit = smf.logit("voted ~ age + income + C(educ) + female + age:female", data=df).fit(disp=False)
M = Margins(fit)

Continuous variables: AME, MEM, and MER

Average Marginal Effect (AME)

The AME is the default. It computes the marginal effect at each observation using that observation’s actual covariate values, then averages:

M.dydx("age").summary()
          dy/dx   std err           z          P>|z|  [95% Conf. Interval]
dage   0.010118  0.000505   20.037431   2.598390e-89   0.009128   0.011108

The AME answers: “What is the average effect of a one-year age increase across our sample?”
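The averaging step can be sketched by hand. The example below is a minimal illustration with hypothetical coefficients and ages (not the fitted model above) and is not a description of how smmargins works internally. It relies only on the fact that, for a logit model, the derivative of the predicted probability with respect to a covariate is the coefficient times p(1 - p).

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and ages, chosen only for illustration.
b0, b_age = -2.0, 0.05
ages = [22, 35, 47, 58, 71]

# For a logit model, dP/d(age) at one observation is b_age * p * (1 - p).
effects = []
for age in ages:
    p = logistic(b0 + b_age * age)
    effects.append(b_age * p * (1 - p))

# The AME is the plain average of the per-observation effects.
ame = sum(effects) / len(effects)
print(f"AME of age: {ame:.4f}")
```

Because p(1 - p) never exceeds 0.25, the AME of a logit covariate can never exceed a quarter of its coefficient in absolute value.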

Marginal Effect at the Mean (MEM)

The MEM sets all covariates to their means (or reference levels) and computes the marginal effect at that single point:

M.dydx("age", at="mean").summary()
                     dy/dx   std err           z          P>|z|  [95% Conf. Interval]
dage (at means)   0.011345  0.000633   17.920877   8.104021e-72   0.010104   0.012585

The MEM answers: “What is the effect of age for an average individual?”
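The AME and MEM generally differ because the logistic function is nonlinear: the effect at the average covariate profile is not the average of the individual effects. A minimal sketch with hypothetical coefficients and a single covariate (an assumption made to keep the example short) shows the two calculations side by side:

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def dp_dage(age, b0=-2.0, b_age=0.05):
    """Slope of the predicted probability with respect to age.
    The default coefficients are hypothetical."""
    p = logistic(b0 + b_age * age)
    return b_age * p * (1 - p)

ages = [22, 35, 47, 58, 71]
mean_age = sum(ages) / len(ages)

mem = dp_dage(mean_age)                          # effect at the mean profile
ame = sum(dp_dage(a) for a in ages) / len(ages)  # mean of per-row effects

print(f"MEM: {mem:.4f}  AME: {ame:.4f}")
```

In this toy data the mean age sits near the steepest part of the curve, so the MEM comes out larger than the AME, the same ordering as in the two tables above.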

Marginal Effect at Representative values (MER)

The MER lets you choose specific values. Here we compute the marginal effect of age at ages 25, 45, and 65:

M.dydx("age", atexog={"age": [25, 45, 65]}).summary()
                   dy/dx   std err           z          P>|z|  [95% Conf. Interval]
dage | age=25   0.006943  0.000196   35.427388  6.468298e-275   0.006559   0.007328
dage | age=45   0.010715  0.000597   17.942831   5.460079e-72   0.009545   0.011886
dage | age=65   0.011380  0.000561   20.299690   1.293986e-91   0.010281   0.012479

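The MER answers: “How does the effect of age differ at ages 25, 45, and 65?” Notice that the effect is smallest at age 25 and grows with age, leveling off between 45 and 65. In a logit model the marginal effect is proportional to p(1 − p), so it is small where the predicted probability is far from 0.5 and largest near 0.5. The pattern here suggests that predicted probabilities move closer to 0.5 as age rises.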
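To see why the effect changes along the age range, it helps to evaluate the logit slope at a few ages directly. The sketch below uses hypothetical coefficients and a model with age as the only covariate, so the numbers will not match the table; it only illustrates that the slope tracks p(1 - p).

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for a one-covariate logit model.
b0, b_age = -3.0, 0.05

slopes = {}
for age in (25, 45, 65):
    p = logistic(b0 + b_age * age)
    slopes[age] = b_age * p * (1 - p)   # slope is largest when p is near 0.5
    print(f"age={age}: p={p:.3f}, dP/dage={slopes[age]:.4f}")
```

With these values the predicted probability climbs toward 0.5 as age increases, so the slope rises from age 25 to age 65 and the increase slows down near the top.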

Categorical variables: discrete contrasts

For categorical variables like educ, dydx() computes discrete contrasts between levels. By default it compares each level to the reference level. Under the treatment coding that C(educ) uses, the reference is the first level in alphabetical order, which is "college" in our case:

M.dydx("educ").summary()
                        contrast   std err           z          P>|z|  [95% Conf. Interval]
educ: grad vs college   0.140983  0.018278    7.713321   1.225856e-14   0.105159   0.176807
educ: hs vs college    -0.147471  0.013992  -10.539750   5.664943e-26  -0.174895  -0.120047

Each row is the difference in predicted probability between two education levels, averaged across the sample. For example, individuals with a graduate degree have a predicted probability of voting that is about 14.1 percentage points higher than those with a college degree.
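An averaged contrast of this kind is commonly computed with counterfactual predictions: assign every observation to one level, predict, assign every observation to the other level, predict again, and average the difference. The sketch below follows that recipe with hypothetical coefficients and data. Whether smmargins uses exactly this procedure is an assumption, not something this tutorial states.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients; "college" is the reference level (offset 0).
b0, b_age = -2.0, 0.04
b_educ = {"college": 0.0, "grad": 0.6, "hs": -0.6}
ages = [25, 40, 55, 70]

def avg_prob(level):
    """Average predicted probability if every row had this educ level."""
    probs = [logistic(b0 + b_age * a + b_educ[level]) for a in ages]
    return sum(probs) / len(probs)

# Contrast of each non-reference level against the reference.
contrast = {lvl: avg_prob(lvl) - avg_prob("college") for lvl in ("grad", "hs")}
for lvl, diff in contrast.items():
    print(f"educ: {lvl} vs college  {diff:+.4f}")
```

Only the education level changes between the two predictions; every other covariate keeps its observed value, which is what makes the result an average over the sample.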

You can change the reference level:

M.dydx("educ", reference="hs").summary()
                      contrast   std err           z          P>|z|  [95% Conf. Interval]
educ: college vs hs   0.147471  0.013992   10.539750   5.664943e-26   0.120047   0.174895
educ: grad vs hs      0.288454  0.017694   16.302114   9.533652e-60   0.253774   0.323134
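The two tables are consistent with each other: changing the reference level only re-expresses the same differences. A quick arithmetic check using the point estimates printed above:

```python
# Contrasts against "college", copied from the first education table.
grad_vs_college = 0.140983
hs_vs_college = -0.147471

# Re-expressing them against "hs" needs only a sign flip and a subtraction.
college_vs_hs = -hs_vs_college
grad_vs_hs = grad_vs_college - hs_vs_college

print(f"college vs hs: {college_vs_hs:.6f}")  # college vs hs: 0.147471
print(f"grad vs hs:    {grad_vs_hs:.6f}")     # grad vs hs:    0.288454
```

The point estimates convert this way, but the standard errors do not in general. The standard error of grad vs hs depends on the covariance between the two original contrasts, which is why it has to be recomputed rather than derived from the first table.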

Binary variables: discrete change

For binary (dummy) variables like female, dydx() computes the discrete change: the difference in predicted probability when moving from 0 to 1:

M.dydx("female").summary()
                 contrast   std err           z          P>|z|  [95% Conf. Interval]
female: 1 vs 0   0.062486  0.012681    4.927392   8.333442e-07   0.037631   0.087341

Being female is associated with a 6.2 percentage point increase in the predicted probability of voting, averaged across the sample. This is a discrete change, not a derivative.
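The distinction matters because a derivative measures the slope at a point, while a discrete change measures the full jump from 0 to 1. The sketch below compares the two for a binary covariate, using hypothetical coefficients and ages that are not taken from the fitted model:

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and ages, for illustration only.
b0, b_age, b_female = -2.0, 0.04, 0.8
ages = [20, 25, 30]

discrete, derivative = [], []
for age in ages:
    p0 = logistic(b0 + b_age * age)              # prediction with female = 0
    p1 = logistic(b0 + b_age * age + b_female)   # prediction with female = 1
    discrete.append(p1 - p0)                     # actual 0 -> 1 change
    derivative.append(b_female * p0 * (1 - p0))  # slope evaluated at female = 0

avg_discrete = sum(discrete) / len(discrete)
avg_derivative = sum(derivative) / len(derivative)
print(f"discrete change: {avg_discrete:.4f}")
print(f"derivative:      {avg_derivative:.4f}")
```

In this example all predictions lie at or below 0.5, where the logistic curve bends upward, so the slope at female = 0 understates the jump to female = 1. For a 0/1 variable the discrete change is the quantity that corresponds to an actual difference between two groups.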

Subgroup marginal effects with interactions

Our model includes an interaction between age and female. We can compute the marginal effect of age separately for men and women using atexog:

M.dydx("age", atexog={"female": [0, 1]}).summary()
                     dy/dx   std err           z          P>|z|  [95% Conf. Interval]
dage | female=0   0.009905  0.000704   14.070242   5.787136e-45   0.008525   0.011285
dage | female=1   0.010332  0.000724   14.267606   3.483339e-46   0.008913   0.011752

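The marginal effect of age is slightly larger for women (0.0103) than for men (0.0099). The difference is small, and the two confidence intervals overlap. The interaction term lets the age coefficient itself differ by gender. In a logit model the effects on the probability scale would also differ somewhat without it, because they depend on each group's predicted probability.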
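With an interaction in the model, the slope with respect to age has two moving parts: the combined coefficient (age plus age:female times female) and the p(1 - p) factor. The sketch below uses the data-generating coefficients from Setup rather than the fitted estimates, and drops income and educ for brevity, so it is an illustration under those assumptions and will not reproduce the table.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Data-generating values from Setup (not the fitted estimates);
# income and educ are omitted to keep the example short.
b0, b_age, b_female, b_int = -4.0, 0.05, 0.3, -0.0004
ages = [30, 45, 60]

def age_slope(age, female):
    """dP/d(age) for a logit model with an age:female interaction."""
    z = b0 + b_age * age + b_female * female + b_int * age * female
    p = logistic(z)
    return (b_age + b_int * female) * p * (1 - p)

avg_men = sum(age_slope(a, 0) for a in ages) / len(ages)
avg_women = sum(age_slope(a, 1) for a in ages) / len(ages)
print(f"female=0: {avg_men:.4f}")
print(f"female=1: {avg_women:.4f}")
```

Here the interaction coefficient is negative, yet the average slope is larger for women: the positive female main effect moves their predicted probabilities closer to 0.5, and that outweighs the slightly smaller age coefficient. The sign of an interaction coefficient alone does not tell you which group has the larger marginal effect.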

You can also use atexog with categorical contrasts. Here we compute the education contrasts separately by gender:

M.dydx("educ", atexog={"female": [0, 1]}).summary()
                                   contrast   std err           z          P>|z|  [95% Conf. Interval]
educ: grad vs college | female=0   0.139279  0.018217    7.645405   2.082886e-14   0.103574   0.174985
educ: hs vs college | female=0    -0.139961  0.013430  -10.421393   1.980307e-25  -0.166284  -0.113639
educ: grad vs college | female=1   0.142747  0.018426    7.746865   9.418929e-15   0.106632   0.178862
educ: hs vs college | female=1    -0.154996  0.014784  -10.484040   1.022786e-25  -0.183972  -0.126020

Summary table: when to use each type

Type   Code                                    Best for
AME    M.dydx("age")                           Reporting a single main effect
MEM    M.dydx("age", at="mean")                Comparing effects at a reference point
MER    M.dydx("age", atexog={"age": [...]})    Showing how effects vary

Recap

In this tutorial we covered:

  1. AME: average marginal effect across the sample

  2. MEM: marginal effect at the mean covariate profile

  3. MER: marginal effect at user-specified values

  4. Discrete contrasts for categorical variables like educ

  5. Discrete change for binary variables like female

  6. Subgroup effects using atexog to condition on interaction partners

Next steps