# How to verify smmargins results against R marginaleffects ## Prerequisites - {doc}`Tutorial: First steps with smmargins ` — basic model fitting and margins - R installed locally with the `marginaleffects`, `readr`, and `sandwich` packages ## Problem statement You want to confirm that `smmargins` produces results consistent with the de-facto cross-language reference, R's `marginaleffects` package. You need to know which cases are covered, what tolerances are expected, and how to regenerate reference outputs. ## Minimal working solution The test suite in `tests/comparison/` contains six parity cases. These tests auto-skip if the reference CSVs are missing, so the rest of the suite still runs without R. ```python import pathlib import numpy as np import pandas as pd import statsmodels.formula.api as smf import statsmodels.api as sm from numpy.testing import assert_allclose from smmargins import Margins HERE = pathlib.Path("tests/comparison") def _read_r_csv(path): return pd.read_csv(path) def _match_label(ame_labels, r_term): labels = np.array(ame_labels) for prefix in ("d", "e"): stripped = np.array([lab[len(prefix):] if lab.startswith(prefix) else lab for lab in labels]) mask = stripped == r_term if np.any(mask): return np.where(mask)[0][0] return None # Load shared fixture data df = pd.read_csv(HERE / "data" / "data_seed42.csv") # Case 1: Logit AME with HC3 fit = smf.logit("y_logit ~ x1 + x2 + C(grp)", data=df).fit(disp=False) M = Margins(fit, cov_type="HC3") ame = M.dydx(["x1", "x2"]) ref = _read_r_csv(HERE / "r_logit_ame_hc3.csv") for _, row in ref.iterrows(): if row["term"] == "grp": continue idx = _match_label(ame.labels, row["term"]) assert_allclose(ame.estimate[idx], row["estimate"], atol=1e-6) assert_allclose(ame.se[idx], row["std.error"], atol=5e-5) print("Case 1 (Logit AME HC3): PASS") ``` ## Variations ### Case 2: Poisson AME with HC3 ```python df = pd.read_csv(HERE / "data" / "data_seed42.csv") fit = smf.glm("count ~ x1 + x2", data=df, family=sm.families.Poisson()).fit() M = Margins(fit, cov_type="HC3") ame = M.dydx(["x1", "x2"]) ref = _read_r_csv(HERE / "r_poisson_ame.csv") for _, row in ref.iterrows(): idx = _match_label(ame.labels, row["term"]) assert_allclose(ame.estimate[idx], row["estimate"], atol=1e-6) assert_allclose(ame.se[idx], row["std.error"], atol=5e-4) print("Case 2 (Poisson AME HC3): PASS") ``` ### Case 3: OLS pairs bootstrap ```python df = pd.read_csv(HERE / "data" / "data_seed12345.csv") fit = smf.ols("y_ols ~ x1 + x2", data=df).fit() M = Margins(fit) ame = M.dydx(["x1", "x2"], vce="bootstrap", n_boot=1000, boot_seed=42) ref = _read_r_csv(HERE / "r_ols_bootstrap.csv") for _, row in ref.iterrows(): idx = _match_label(ame.labels, row["term"]) assert_allclose(ame.estimate[idx], row["estimate"], atol=1e-10) r_se = row["std.error"] if pd.notna(row.get("std.error")) else (row["conf.high"] - row["conf.low"]) / (2 * 1.96) mc_tol = 5 * max(ame.se[idx], r_se) / np.sqrt(1000) assert abs(ame.se[idx] - r_se) < mc_tol print("Case 3 (OLS bootstrap): PASS") ``` ### Regenerating reference outputs ```bash cd tests/comparison && Rscript generate_r.R ``` ## Tolerance summary | Case | Estimate tol | SE tol | |------|-------------|--------| | Logit AME with HC3 | 1e-6 | 5e-5 | | Poisson AME with HC3 | 1e-6 | 5e-4 | | OLS pairs bootstrap (n=1000) | 1e-10 | Monte Carlo (5 sigma) | | OLS with HC1, polynomial + interaction | 1e-6 | 1e-6 | | Logit cluster-robust AME | 1e-5 | 1e-4 | | Logit AME at representative x2 values | 1e-5 | 1e-4 | > ⚠️ **Trade-off:** The R parity tests are the strongest validation of the package's numerical correctness, but they require a working R installation. They auto-skip when reference files are missing, so CI pipelines without R still pass. The bootstrap SE tolerance is Monte Carlo — it scales with the bootstrap SE itself, not an absolute threshold. ## When to use this Run the parity test suite whenever you modify the core inference machinery (Jacobian computation, delta method, VCE workers) or upgrade statsmodels. It is also useful when porting analyses from R to Python — the same data and model specification should match to the tolerances above. ## When NOT to use this > ⚠️ **Trade-off:** Do not treat these tests as a substitute for understanding your model — they verify numerical parity, not substantive correctness. The factor-contrast handling differs between `smmargins` and R `marginaleffects` (see the `grp` skip in Case 1), so categorical variable comparisons may not align one-for-one. ## See also - `tests/comparison/generate_r.R` — the R script that generates reference CSVs - `tests/comparison/test_r.py` — the full pytest parity suite - {doc}`Mathematical motivation ` — the mathematical foundation being verified