Introduction

smmargins is a small module that fills in the marginal-effects gaps in StatsModels: adjusted predictions and marginal effects at user-specified covariate profiles, with delta-method standard errors, for any fitted model that exposes params, cov_params(), and a predict(params, exog) method.

The design target is Stata’s margins command: the same statistics, the same parameter names where they translate, and the same answers to the precision both tools agree on.
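The delta-method standard error mentioned above can be illustrated without any libraries. The following is a minimal sketch, not smmargins' implementation: it assumes a logit model, uses made-up values for the coefficients and their covariance, and approximates the Jacobian by central finite differences. The helper names (`average_prediction`, `fd_jacobian`, `delta_se`) are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def average_prediction(params, exog):
    """Average predicted probability of a logit model over the rows of exog."""
    probs = [expit(sum(b * x for b, x in zip(params, row))) for row in exog]
    return sum(probs) / len(probs)


def fd_jacobian(stat, params, exog, eps=1e-6):
    """Central finite-difference gradient of a scalar statistic w.r.t. params."""
    grad = []
    for j in range(len(params)):
        up = list(params)
        dn = list(params)
        up[j] += eps
        dn[j] -= eps
        grad.append((stat(up, exog) - stat(dn, exog)) / (2 * eps))
    return grad


def delta_se(grad, cov):
    """Delta-method standard error: sqrt(g' V g)."""
    k = len(grad)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))
    return math.sqrt(var)


# Illustrative values only (assumed, not taken from any fitted model).
params = [-0.5, 0.8]                      # intercept, slope
cov = [[0.04, -0.01], [-0.01, 0.09]]      # stand-in for cov_params()
exog = [[1.0, -1.0], [1.0, 0.0], [1.0, 0.5], [1.0, 2.0]]

aap = average_prediction(params, exog)
grad = fd_jacobian(average_prediction, params, exog)
se = delta_se(grad, cov)
print(f"AAP = {aap:.4f}, delta-method SE = {se:.4f}")
```

The same recipe applies to any statistic that is a smooth function of the parameters: only `average_prediction` changes.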

Why another margins module?

StatsModels ships Results.get_margeff, but it is limited:

only marginal effects, not adjusted predictions;
atexog is keyed by column index, not variable name;
no at(...) profiles, no representative-value contrasts;
no joint covariance across statistics, so you cannot form contrasts like a difference-in-differences without re-deriving the delta method by hand;
no support for difference-in-differences on the response scale (the Ai & Norton 2003 issue).

smmargins provides:

predict() — adjusted predictions (AAP / APM / APR), with at= and name-keyed atexog=;
dydx() — marginal effects (AME / MEM / MER), continuous and discrete, including elasticities (eyex / dyex / eydx);
did() — 2x2 difference-in-differences on the response scale, with the joint covariance baked in;
contrast() — exact linear combinations of any result, reusing the joint covariance;
Multi-outcome support for MNLogit and OrderedModel.
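The elasticity names follow Stata's `margins` options. As a rough sketch of how the three variants relate to a plain derivative at a single point (how smmargins averages them over observations is not shown here, and the `elasticities` helper is hypothetical):

```python
def elasticities(dydx, x, y):
    """Pointwise elasticity variants derived from a derivative dy/dx."""
    return {
        "dydx": dydx,          # change in y per unit change in x
        "eyex": dydx * x / y,  # d ln(y) / d ln(x): percent change per percent change
        "dyex": dydx * x,      # d y / d ln(x): change in y per proportional change in x
        "eydx": dydx / y,      # d ln(y) / d x: proportional change in y per unit of x
    }


# Worked check on y = 2 * x**1.5, whose eyex elasticity is 1.5 everywhere.
x = 4.0
y = 2 * x ** 1.5          # 16.0
dydx = 3 * x ** 0.5       # analytic derivative, 6.0
result = elasticities(dydx, x, y)
print(result)
```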

Multi-outcome models

smmargins supports statsmodels.discrete.discrete_model.MNLogit (multinomial logit) and statsmodels.miscmodels.ordinal_model.OrderedModel (ordered logit/probit). For these models every statistic returns one value per outcome class (K values in place of the usual scalar), with full joint covariance across both rows and classes.

# M is a Margins instance wrapping a fitted MNLogit or OrderedModel result
ame = M.dydx("x1")          # AME of x1 on each class probability; K rows
ame.summary()               # long-format DataFrame with `outcome` column

# Subset to specific outcomes
M.predict(outcome=1)               # only class 1
M.predict(outcome="versicolor")    # by label, if labeled
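One consequence of the per-class layout is that the K marginal effects of a covariate sum to zero, because the class probabilities always sum to one. The sketch below demonstrates this with a hand-written three-class multinomial logit; the coefficients and data are made up, and `softmax_probs` and `class_effects` are hypothetical helpers rather than part of smmargins.

```python
import math


def softmax_probs(x, coefs):
    """Class probabilities for a multinomial logit; class 0 is the reference."""
    scores = [0.0] + [a + b * x for a, b in coefs]
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_effects(x, coefs, eps=1e-6):
    """Central finite-difference derivative of each class probability w.r.t. x."""
    up = softmax_probs(x + eps, coefs)
    dn = softmax_probs(x - eps, coefs)
    return [(u - d) / (2 * eps) for u, d in zip(up, dn)]


coefs = [(0.2, 0.9), (-0.4, -0.5)]        # (intercept, slope) for classes 1 and 2
xs = [-1.0, 0.0, 0.7, 1.5]                # illustrative covariate values
K = len(coefs) + 1

# Average marginal effect of x on each class probability.
per_obs = [class_effects(x, coefs) for x in xs]
ame = [sum(row[k] for row in per_obs) / len(xs) for k in range(K)]
print("AME per class:", ame)
print("sum over classes:", sum(ame))      # zero up to floating-point error
```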

Difference-in-differences

Two small additions turn the module into a full DiD estimator:

contrast() forms any linear combination of the estimates directly on the already-computed joint covariance.
did() sets up the 2x2 grid and returns a DiDResult bundling the four cell predictions, the two simple effects, and the DiD — all sharing the same joint covariance.

For multi-outcome models, did() returns a DiDResult in which every field carries the K-outcome axis. The DiD contains K estimates that sum to exactly zero, because the class probabilities sum to one in every cell.
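To see why the joint covariance matters, consider the scalar case. The DiD is a linear combination of the four cell predictions, so its variance is c'Vc, where V is the 4x4 covariance of the cells. The sketch below uses made-up cell values and a made-up covariance matrix; `linear_contrast` is a hypothetical helper that illustrates the arithmetic and is not the library's contrast() method.

```python
import math


def linear_contrast(estimates, vcov, weights):
    """Return (c'theta, sqrt(c'Vc)) for a linear combination of estimates."""
    n = len(estimates)
    value = sum(w * e for w, e in zip(weights, estimates))
    var = sum(weights[i] * vcov[i][j] * weights[j]
              for i in range(n) for j in range(n))
    return value, math.sqrt(var)


# Cell predictions in the order (A,0), (A,1), (B,0), (B,1). Assumed values.
cells = [0.30, 0.42, 0.35, 0.60]
vcov = [
    [4e-4, 2e-4, 1e-4, 1e-4],
    [2e-4, 5e-4, 1e-4, 1e-4],
    [1e-4, 1e-4, 4e-4, 2e-4],
    [1e-4, 1e-4, 2e-4, 6e-4],
]

# DiD = (B1 - B0) - (A1 - A0)
weights = [1.0, -1.0, -1.0, 1.0]
did, se = linear_contrast(cells, vcov, weights)

# Treating the cells as independent (diagonal only) gives a different answer.
naive_se = math.sqrt(sum(w * w * vcov[i][i] for i, w in enumerate(weights)))
print(f"DiD = {did:.3f}, SE = {se:.4f}, SE ignoring covariances = {naive_se:.4f}")
```

With positively correlated cells, as here, dropping the off-diagonal terms overstates the standard error; with other correlation patterns it can understate it.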

Installation

pip install smmargins

Requires Python ≥3.9. Dependencies (numpy, pandas, statsmodels, scipy, patsy) are installed automatically.

Quickstart

import statsmodels.formula.api as smf
from smmargins import Margins

fit = smf.logit(
    "voted ~ age + income + C(educ) + female + age:female",
    data=df,
).fit()
M = Margins(fit)

# Adjusted predictions
M.predict()                                    # AAP
M.predict(at="mean")                           # APM (margins, atmeans)
M.predict(atexog={"age": [25, 45, 65]})        # APR

# Marginal effects on the response (probability) scale
M.dydx("age")                                  # AME
M.dydx("age", at="mean")                       # MEM
M.dydx("age", atexog={"female": [0, 1]})       # MER, by sex
M.dydx("educ", reference="college")            # discrete contrasts

# Difference-in-differences on the response scale
res = M.did("group", "preexist_Y",
            group_levels=["A", "B"], condition_levels=[0, 1])
print(res)                                     # cells, simple effects, DiD

Each call returns a MarginsResult with .estimate, .se, .vcov, .ci_lower, .ci_upper, .pvalue, plus .summary() returning a pandas.DataFrame. Pass use_t=True to the Margins constructor for t-distribution inference (uses results.df_resid).
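For orientation, the inference fields are related to the estimate and its standard error in the usual Wald fashion. The sketch below assumes normal-based inference (presumably what applies when use_t is not set) and a 95% level; `wald_summary` is a hypothetical helper, and the numbers are illustrative.

```python
from statistics import NormalDist


def wald_summary(estimate, se, level=0.95):
    """Normal-based confidence interval and two-sided p-value."""
    std_normal = NormalDist()
    z = estimate / se
    crit = std_normal.inv_cdf(0.5 + level / 2)     # about 1.96 for level=0.95
    return {
        "estimate": estimate,
        "se": se,
        "ci_lower": estimate - crit * se,
        "ci_upper": estimate + crit * se,
        "pvalue": 2 * (1 - std_normal.cdf(abs(z))),
    }


summary = wald_summary(0.13, 0.05)
print(summary)
```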

Why patsy

When the formula is y ~ x1 + I(x1**2) + x1:x2 + C(group) and we want the marginal effect of x1, we cannot just nudge one column of the design matrix — x1 enters three columns. What we can nudge is the x1 column of the original data frame, then ask patsy to rebuild the design matrix using the stored DesignInfo:

patsy.dmatrix(design_info, perturbed_frame, return_type="matrix")

That preserves polynomial terms, interactions, splines (bs(x, df=4)), and categorical contrasts automatically. It is also the right abstraction for “hold age=45” or “set group='b'” — you mutate the data frame, not the design matrix.
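The idea of perturbing the raw variable and rebuilding can be sketched without patsy. In the example below, `design_row` is a hand-written stand-in for what patsy does with the stored DesignInfo, for the simplified formula y ~ x1 + I(x1**2) + x1:x2; the coefficients are made up and the prediction is on the linear scale.

```python
def design_row(row):
    """Stand-in for patsy: columns for Intercept, x1, I(x1**2), x1:x2."""
    return [1.0, row["x1"], row["x1"] ** 2, row["x1"] * row["x2"]]


def linear_prediction(params, row):
    """x'b, with the design row rebuilt from the raw variables."""
    return sum(b * v for b, v in zip(params, design_row(row)))


def dydx_x1(params, row, eps=1e-6):
    """Nudge the raw x1, rebuild the design row, and difference the predictions."""
    up = dict(row, x1=row["x1"] + eps)
    dn = dict(row, x1=row["x1"] - eps)
    return (linear_prediction(params, up) - linear_prediction(params, dn)) / (2 * eps)


params = [1.0, 0.5, -0.2, 0.3]            # assumed coefficients
row = {"x1": 2.0, "x2": 4.0}

effect = dydx_x1(params, row)
# Analytic check: b1 + 2*b2*x1 + b3*x2
analytic = params[1] + 2 * params[2] * row["x1"] + params[3] * row["x2"]
print(effect, analytic)
```

Because all three x1-dependent columns are regenerated from the perturbed value, the numerical derivative agrees with the analytic one.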

Formula vs. raw exog mode

Margins supports models fit without formulas (sm.OLS(y, X).fit()). In this raw mode, variable names are taken from model.exog_names.

Warning

In raw mode, Margins cannot know about relationships between columns of the design matrix. If you manually included an interaction column (e.g. X["x1_x2"] = X["x1"] * X["x2"]), perturbing x1 for a marginal effect will not automatically update x1_x2, and the marginal effect will be wrong.

If your model has interactions or transformations, fit it with a formula so Margins can rebuild the design matrix correctly.
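The size of the error is easy to quantify in a linear model. The sketch below uses made-up coefficients and a single row with a manually built x1_x2 column; `effect_stale` mimics perturbing only the x1 column, while `effect_rebuilt` also refreshes the interaction. Both helpers are hypothetical and serve only to illustrate the pitfall.

```python
params = {"const": 1.0, "x1": 0.5, "x2": -0.4, "x1_x2": 0.3}   # assumed values
row = {"const": 1.0, "x1": 2.0, "x2": 4.0, "x1_x2": 2.0 * 4.0}


def predict(params, row):
    """Linear prediction from named design columns."""
    return sum(params[name] * row[name] for name in params)


def effect_stale(params, row, eps=1e-6):
    """Perturb x1 only; the x1_x2 column keeps its old value."""
    up = dict(row, x1=row["x1"] + eps)
    dn = dict(row, x1=row["x1"] - eps)
    return (predict(params, up) - predict(params, dn)) / (2 * eps)


def effect_rebuilt(params, row, eps=1e-6):
    """Perturb x1 and recompute the interaction column from it."""
    def with_x1(value):
        return dict(row, x1=value, x1_x2=value * row["x2"])
    up = with_x1(row["x1"] + eps)
    dn = with_x1(row["x1"] - eps)
    return (predict(params, up) - predict(params, dn)) / (2 * eps)


stale = effect_stale(params, row)         # recovers only the x1 coefficient
rebuilt = effect_rebuilt(params, row)     # x1 coefficient plus interaction * x2
print(f"stale = {stale:.3f}, rebuilt = {rebuilt:.3f}")
```

The gap between the two equals the interaction coefficient times x2, which is exactly the term the stale column hides.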

Where to next

Mathematical motivation — delta method, statistic schema, analytic vs FD Jacobian.
Demos — full Williams-style and DiD walkthroughs.
API reference — reference documentation for every public class and method.
