Causal Inference and Large Language Models from the Causal Invariance Framework

Emily Frances Wong
MS, 2023
Advisor: Hongjing Lu
Statistics serves as the grammar of all science, and central to the goal of science is understanding cause-effect relationships. Scientists rely on research methodology and statistical tools to uncover causal relationships, and engineers rely on statistical methods to build artificial assistants for daily life. Yet neither statistical learning nor next-word prediction (the training objective of large language models) is consistent with rational causal learning and reasoning in humans. The present thesis examines the fundamental goals and assumptions of dominant statistical methods and discusses their implications for statistical inference and for commonsense reasoning in artificial general intelligence (AGI). The first section introduces and evaluates a causal alternative to logistic regression, one that estimates the causal power (under the causal invariance framework) of treatments among covariates. Causal invariance is the assumption that the influence of a candidate cause (elemental or conjunctive) is independent of background causes, with the goal of acquiring knowledge that is usable, in the minimal sense of being able to generalize from a learning context to an application context. The second and final section investigates current benchmark tasks used to evaluate causal reasoning in large language models (e.g., GPT-3, GPT-4) and introduces a stricter test informed by the psychological literature on human causal cognition under the causal invariance framework.
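As background for the first section, the causal power referenced above has a standard formulation in the psychological literature on causal invariance (Cheng, 1997); it is sketched here in that literature's standard notation rather than the thesis's own. Under the causal invariance (noisy-OR) assumption, a generative candidate cause c with causal power q_c and the aggregate background b with power q_b produce the effect e independently, so the observable probabilities satisfy

\[
P(e \mid c) \;=\; q_c + q_b - q_c\, q_b, \qquad P(e \mid \neg c) \;=\; q_b,
\]

which solves to the causal power estimate

\[
q_c \;=\; \frac{P(e \mid c) - P(e \mid \neg c)}{1 - P(e \mid \neg c)}.
\]

Because q_c describes an influence defined to be independent of background causes, the same estimate is expected to carry from a learning context to an application context with different background causes, which is the sense of usable knowledge described above.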