Statistical Methods for the Social Sciences
Leonard Wainstein
PhD, 2021
Hazlett, Chad J
This dissertation is a collection of three articles on distinct topics in statistical methodology that share a relevance to social science research. The first article (Chapter 1) introduces Targeted Function Balancing (TFB), a covariate balancing framework for estimating the causal effect of a binary treatment on an outcome. TFB first regresses the outcome on observed covariates, and then selects weights that balance functions of the covariates that are probabilistically near the resulting regression function. This yields balance in both the predicted values of the regression function and the covariates, with the regression function's estimated variance determining how much balance in the covariates is sufficient. The second article (Chapter 2) introduces tools for assessing the sensitivity, to unobserved confounding, of two weighted estimators of the causal effect of a treatment on an outcome: (a) a weighted difference in means and (b) a weighted regression of the outcome on the treatment and observed covariates. The article argues that these tools are more intuitive than existing sensitivity tools involving weights. They also make no distributional assumptions about the observed data or the unobserved confounding, apply with very general weights (e.g., propensity score, matching, or covariate mean balancing), and can address bias from misspecification in the observed data. The third article (Chapter 3) is a pedagogical piece on working with grouped data and on deciding between “fixed effects” (FE) models with specialized (e.g., cluster-robust) standard errors and “multilevel models” (MLMs) employing “random effects”. This article reviews the claims made in published works regarding this choice, then clarifies how these approaches work and compare by showing that: (i) random effects in MLMs are simply “regularized” fixed effects; (ii) unmodified MLMs are consequently susceptible to bias, but there is a longstanding remedy; and (iii) “default” MLM standard errors rely on narrow assumptions that can lead to undercoverage in many settings. The article describes how to debias an MLM's coefficient estimates and how to estimate their standard errors more flexibly. After an MLM is adjusted accordingly, the point estimate and standard error for the target coefficient are exactly equal to those of the analogous FE model with cluster-robust standard errors.
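Claim (i) above, that random effects are “regularized” fixed effects, can be illustrated with a small numerical sketch. This is a simplified illustration and not the dissertation's own derivation: it assumes an intercept-only model with the overall mean fixed at zero, simulated data, and a hypothetical penalty value of 2.0. In an actual MLM the penalty would correspond to the ratio of the residual variance to the random-effect variance, both estimated from the data. The names `group_effects`, `penalty`, and `group_sizes` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical grouped data: five groups of unequal size, intercept-only model.
group_sizes = [3, 5, 8, 12, 20]
groups = np.repeat(np.arange(len(group_sizes)), group_sizes)
true_effects = rng.normal(0.0, 1.0, size=len(group_sizes))
y = true_effects[groups] + rng.normal(0.0, 1.0, size=groups.size)

# Group-indicator (dummy) design matrix, one column per group.
Z = np.eye(len(group_sizes))[groups]


def group_effects(y, Z, penalty):
    """Least-squares group intercepts with a ridge penalty on their size."""
    k = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + penalty * np.eye(k), Z.T @ y)


# Assumed penalty value; an MLM would estimate the implied ratio from the data.
penalty = 2.0

fixed = group_effects(y, Z, penalty=0.0)       # unpenalized: plain group means
shrunk = group_effects(y, Z, penalty=penalty)  # penalized: pulled toward zero

# Closed form of the penalized estimate: each group mean is scaled by
# n_j / (n_j + penalty), so smaller groups are shrunk more strongly.
n = Z.sum(axis=0)
closed_form = fixed * n / (n + penalty)
```

With no penalty the estimates are the ordinary group means, as in a fixed effects model; with a positive penalty each estimate is the same group mean shrunk toward zero, and the shrinkage is strongest for the smallest groups.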