Learning from Social Data: Non-exchangeable Priors in the Social Sciences

Marika Danielle Csapo
MS, 2016
Zhou, Qing
Bayesian analyses are often critiqued on the basis of dubious exchangeability claims regarding the data. Not only must the observed data be exchangeable, but the prior “data” must be as well, and the observed data must also be exchangeable with the prior data, an assumption that practitioners rarely justify. Yet social scientists often use social data (observed human behaviors that rely on human judgment) to make inferences. Social priors shared by the researcher are, therefore, non-exchangeable with social data. One common defense offered by Bayesian practitioners is that as long as the observed data contain some component of new information, repeated observation-updating cycles will eventually produce a highly informative posterior distribution. Frequentist statistics has power analysis, a way of estimating how much data is needed to obtain desirable properties from an estimator. Here I develop a model that parameterizes the degree of non-exchangeability between the observed data and the prior data and offers a standard way to calculate either how many observations are needed to achieve a parameterized definition of an “informative” Bayes estimate in a single iteration of updating, or how many updating iterations are needed given a fixed number of observations n at each iteration. I illustrate the phenomenon with a combination of real and model-synthesized data showing how New York police officers who make stops learn from social data: convictions generated by jury trials in the U.S. justice system.
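The kind of calculation described above can be illustrated with a toy sketch. It is not the model developed in the thesis, which the abstract does not specify. It assumes a Beta-Binomial setting in which each observation contributes only a fraction `w` of new information (a hypothetical stand-in for the degree of non-exchangeability, with `w = 1` meaning fully exchangeable), and it treats a posterior as “informative” once its standard deviation falls to `target_sd` or below. The function names, the discounting scheme, and the default values are all assumptions made for illustration.

```python
import math


def posterior_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    total = a + b
    return math.sqrt(a * b / (total ** 2 * (total + 1)))


def observations_needed(w, target_sd, p=0.5, a0=1.0, b0=1.0, max_n=1_000_000):
    """Smallest n whose discounted update gives posterior sd <= target_sd.

    ASSUMPTION (toy model, not the thesis's): each of the n observations
    counts as only w effective observations, and the observed success
    rate is taken to be exactly p.
    """
    if not 0 < w <= 1:
        raise ValueError("w must lie in (0, 1]")
    for n in range(max_n + 1):
        a = a0 + w * n * p          # discounted successes added to the prior
        b = b0 + w * n * (1 - p)    # discounted failures added to the prior
        if posterior_sd(a, b) <= target_sd:
            return n
    raise RuntimeError("target_sd not reached within max_n observations")


def updates_needed(n_per_update, w, target_sd, **kwargs):
    """Number of updating iterations with n_per_update observations each.

    Under this conjugate toy model, sequential updates are equivalent to a
    single batch update, so the count is a ceiling division.
    """
    total = observations_needed(w, target_sd, **kwargs)
    return math.ceil(total / n_per_update)


if __name__ == "__main__":
    print(observations_needed(w=1.0, target_sd=0.04))
    print(observations_needed(w=0.5, target_sd=0.04))
    print(updates_needed(n_per_update=50, w=0.5, target_sd=0.04))
```

With a uniform Beta(1, 1) prior and `target_sd = 0.04`, this sketch requires 154 observations when `w = 1` and 307 when `w = 0.5`, or 7 updates of 50 observations each in the latter case. Under these assumptions, halving the share of new information roughly doubles the data required.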
2016