Synthetic Data Generation for Fraud Detection

Jonathan Shan
MASDS, 2023
WU, YINGNIAN
This paper applies several synthetic data generation techniques to create synthetic fraud data, for buy now, pay later (BNPL) financial institutions, that mimics the statistical properties of real data. We use both statistical and deep learning methods to accomplish this task and compare the qualities of each framework. We evaluate the efficacy of our approaches by using the generated data to augment the training sets of a fraud detection model and analyzing the effects on validation results. Our results show that including synthetic data in existing datasets can improve the accuracy of fraud detection systems.