Fraud Detection in Vehicle Insurance Claims using Machine Learning

Ziyang Zhang
MASDS, 2024
WU, YINGNIAN
Insurance fraud imposes a significant financial burden on the industry, and fraudulent vehicle insurance claims are a major contributor. This study applies machine learning techniques to detect fraudulent vehicle insurance claims. Six models (Logistic Regression, Random Forest, Gaussian Naive Bayes, Decision Tree, XGBoost, and Gradient Boosting classifiers) are evaluated on an imbalanced dataset. To address the class imbalance, oversampling techniques such as SMOTE, Borderline-SMOTE, and ADASYN are employed. Performance is assessed using metrics such as F1 score, recall, and AUC. The results indicate that the XGBoost and Gradient Boosting models deliver the best overall performance, effectively balancing precision and recall. The Gaussian Naive Bayes model exhibits exceptionally high recall, making it suitable when the priority is to minimize missed fraud cases.
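As an illustration of the oversampling step and of the recall and F1 metrics named in the abstract, the following is a minimal sketch in plain Python, not the implementation used in the thesis. It shows the core SMOTE idea (a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours) and computes precision, recall, and F1 from predicted labels. The function names smote and precision_recall_f1, the toy data, and the parameter defaults are all hypothetical choices made for this example; in practice a library such as imbalanced-learn, which provides SMOTE, BorderlineSMOTE, and ADASYN classes, would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    minority: list of numeric feature vectors from the minority (fraud) class.
    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy usage: three fraud samples with two made-up features each.
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_points = smote(fraud, n_new=4, k=2)

# Toy labels: 1 = fraud, 0 = legitimate.
scores = precision_recall_f1([1, 1, 1, 0, 0, 0, 0, 0],
                             [1, 1, 0, 1, 0, 0, 0, 0])
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to compute recall, F1, and AUC.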