Estimating Privacy Leakage of Machine Learning Models

Ryan O'Dell
MS, 2023
Cheng, Guang
A membership inference attack is a method of determining whether particular records were used to train a machine learning model. Previous analysis has characterized the worst-case vulnerability to membership inference by instantiating the attack algorithm as the Bayes optimal classifier. We extend these findings by developing practical estimators of the worst-case vulnerability for a sub-class of membership inference problems; these estimators are easy to compute and do not require computationally expensive privacy auditing techniques. Extensive simulation studies on real-world data sets show that privacy auditing techniques, such as shadow modeling, can be replaced with the proposed worst-case estimators. Furthermore, we examine the notion of disparity in membership inference: some subgroups of the population are easier to identify in the training data set than others. We use a framework to quantify the degree of disparity and demonstrate that several real-world models exhibit disparity in membership inference. We argue that average metrics of attack accuracy, commonly used in the privacy auditing literature, do not reliably convey the difference in privacy risks across different subgroups of the population.