Application of Machine Learning and Resampling Techniques to Credit Card Fraud Detection
Keywords: Machine learning, Fraud detection, Random forest, Resampling techniques, XGBoost, TensorFlow, Deep neural network
Applying machine learning algorithms to the detection of fraudulent credit card transactions is challenging because the datasets are highly imbalanced and financial data is confidential. Legitimate transactions make up the vast majority of each dataset, so a weak model whose predictions on fraudulent transactions are faulty can still reach 99% accuracy and be assessed as high-performing. To build optimal models, this research used four techniques to sample the datasets: the baseline train-test split, the class-weighted hyperparameter approach, undersampling, and oversampling. Three machine learning algorithms were used to develop the models: Random Forest, XGBoost, and a TensorFlow Deep Neural Network (DNN). We observe that the DNN is more efficient than the other two algorithms at modelling the undersampled dataset, while overall all three algorithms performed better with oversampling than with undersampling. However, Random Forest performed better than the other algorithms in the baseline approach. Compared with some existing state-of-the-art works, we achieved improved performance using real-world datasets.
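The accuracy problem and the resampling ideas described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data (990 legitimate and 10 fraudulent rows), the helper names, the use of plain random undersampling and oversampling, and the "balanced" class-weight heuristic. The abstract does not state which resampling variants or weight values the authors used.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive (fraud) rows that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


def random_undersample(X, y, rng):
    """Randomly drop rows until every class has the minority-class count."""
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, rng):
    """Randomly duplicate rows until every class has the majority-class count."""
    counts = Counter(y)
    target = max(counts.values())
    keep = list(range(len(y)))
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.choices(idx, k=target - n))
    return [X[i] for i in keep], [y[i] for i in keep]


def balanced_class_weights(y):
    """Common heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    return {label: len(y) / (len(counts) * n) for label, n in counts.items()}


rng = random.Random(0)

# Hypothetical toy data: 0 = legitimate, 1 = fraudulent.
y = [0] * 990 + [1] * 10
X = [[rng.random()] for _ in y]

# A "model" that always predicts legitimate looks accurate but finds no fraud.
always_legit = [0] * len(y)
print("accuracy:", accuracy(y, always_legit))
print("recall on fraud:", recall(y, always_legit))

X_under, y_under = random_undersample(X, y, rng)
X_over, y_over = random_oversample(X, y, rng)
print("undersampled counts:", Counter(y_under))
print("oversampled counts:", Counter(y_over))
print("class weights:", balanced_class_weights(y))
```

In practice, resampling is normally applied to the training split only, so that the test split keeps the original class ratio and metrics such as recall remain meaningful.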
Copyright (c) 2022 Chinedu L. Udeze, Idongesit E. Eteng, Ayei E. Ibor
This work is licensed under a Creative Commons Attribution 4.0 International License.
The Journal of the Nigerian Society of Physical Sciences (JNSPS) is published under the Creative Commons Attribution 4.0 (CC BY 4.0) license. This license was developed to facilitate open access: it allows articles to be freely downloaded, re-used, and redistributed without restriction, as long as the original work is correctly cited. More specifically, anyone may copy, distribute, or reuse these articles; create extracts, abstracts, and other revised versions, adaptations, or derivative works of or from an article; and mine the article, even for commercial purposes, as long as they credit the author(s).