Machine Learning Methods of Protest Detection

Anton Sobolev
MS, 2019
Hazlett, Chad
This thesis develops a model that estimates the number of participants in a protest event. My central assumption is that people engaged in the same collective action should follow similar trajectories during the event, whereas people who are elsewhere should not exhibit the same trajectory patterns. I develop a Convolutional Neural Network (CNN) model with two input layers. The first layer contains data on the trajectories of easily identified subgroups of protesters and easily identified subgroups of nonprotesters. The second layer indicates whether a specific part of each trajectory was not observed directly but was instead imputed. The resulting classifier identifies citizens who share the same trajectory during the protest with an accuracy of 94 percent. The proposed algorithm significantly outperforms three alternatives: Logistic Regression, Random Forest, and Artificial Neural Network. The advantage of the CNN in this setting is that the architecture applies feature-recognition filters to all segments of the trajectory and is therefore robust to shifts between trajectories. Such shifts exist because protesters do not all appear at the same time or in the same place. The resulting estimates are validated against estimates of protest events from the national press.
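The shift-robustness argument can be illustrated with a toy example. The sketch below is not the thesis model. It is a minimal, pure-Python illustration that assumes a one-dimensional movement signal per person, a second channel marking imputed time steps, and a single hand-set filter in place of learned weights. All names (`conv1d_two_channel`, `global_max_pool`, the toy trajectories) are hypothetical.

```python
from typing import List, Sequence


def conv1d_two_channel(
    traj: Sequence[float],
    mask: Sequence[float],
    w_traj: Sequence[float],
    w_mask: Sequence[float],
    bias: float = 0.0,
) -> List[float]:
    """Slide one filter over every window of a two-channel sequence.

    traj: movement signal per time step (assumed representation).
    mask: 1.0 where the value was imputed, 0.0 where it was observed.
    """
    k = len(w_traj)
    out = []
    for start in range(len(traj) - k + 1):
        s = bias
        for j in range(k):
            s += w_traj[j] * traj[start + j] + w_mask[j] * mask[start + j]
        out.append(max(0.0, s))  # ReLU activation
    return out


def global_max_pool(activations: Sequence[float]) -> float:
    """Keep the strongest filter response, wherever it occurred."""
    return max(activations)


# Hand-set filter (an assumption; a real CNN would learn these weights).
W_TRAJ = [1.0, 2.0, 1.0]     # responds to a 1-2-1 movement pattern
W_MASK = [-1.0, -1.0, -1.0]  # discounts windows that rely on imputed data

# The same pattern appears early in one trajectory and late in another.
early = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0]
observed = [0.0] * 8
partly_imputed = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

score_early = global_max_pool(conv1d_two_channel(early, observed, W_TRAJ, W_MASK))
score_late = global_max_pool(conv1d_two_channel(late, observed, W_TRAJ, W_MASK))
score_imputed = global_max_pool(
    conv1d_two_channel(early, partly_imputed, W_TRAJ, W_MASK)
)
```

Because the filter is applied to every window and only the maximum response is kept, `score_early` and `score_late` are identical even though the pattern occurs at different times. The mask channel lowers the response when the matching segment was imputed rather than observed, which is one plausible way a second input layer could inform the classifier.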