A Machine Learning Based Approach for the Fraud Detection in Imbalanced Credit Card Transaction Dataset

International Journal of Electronics and Communication Engineering
© 2024 by SSRG - IJECE Journal
Volume 11 Issue 8
Year of Publication : 2024
Authors : Rinku, Ashutosh Kumar Dubey, Sushil Kumar Narang, Neha Kishore
How to Cite?

Rinku, Ashutosh Kumar Dubey, Sushil Kumar Narang, Neha Kishore, "A Machine Learning Based Approach for the Fraud Detection in Imbalanced Credit Card Transaction Dataset," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 8, pp. 244-259, 2024. https://doi.org/10.14445/23488549/IJECE-V11I8P124

Abstract:

In this study, a comprehensive evaluation of machine learning models was conducted to detect fraudulent transactions in a highly imbalanced credit card dataset. A range of algorithms was evaluated, including Logistic Regression (LR), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), Decision Tree (DT), Random Forest (RF), AdaBoost, Gradient Boosting (GB), Multi-Layer Perceptron (MLP), and Gaussian Naïve Bayes (GNB), each chosen to address the distinct challenges posed by the dataset's skew. Two oversampling techniques, the Synthetic Minority Over-Sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN) sampling, were applied during preprocessing to correct the class imbalance, followed by feature selection through Linear Discriminant Analysis (LDA) to improve model training. The experimental results showed that the ensemble methods, particularly RF, outperformed the other models, offering high accuracy and specificity, as evidenced by an accuracy of 0.9995 using ADASYN with an 80:20 training-test split. These methods handled the imbalanced nature of the dataset effectively while maintaining high predictive reliability. The study demonstrates the efficacy of ensemble machine learning approaches for detecting fraud in datasets characterized by class imbalance. Oversampling combined with ensemble models provides a robust framework for identifying fraudulent activity and thereby reduces the risk associated with such transactions.
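
The oversampling step named in the abstract can be illustrated with a minimal, standard-library sketch of SMOTE-style interpolation. It is not the authors' implementation: the function name `smote`, the toy data, and the parameter values are assumptions made for illustration only. Each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbors. ADASYN follows the same interpolation idea but generates more samples around minority points that are surrounded by majority-class neighbors.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbors than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy data (made up for illustration): 8 legitimate rows, 3 fraudulent rows.
legit = [[float(i), float(i % 3)] for i in range(8)]
fraud = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic fraud rows to match the majority class size.
new_fraud = smote(fraud, n_new=len(legit) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
print(len(legit), len(balanced_fraud))
```

In practice, a common precaution is to oversample only the training portion of a split such as 80:20, so that the held-out test set keeps the original class ratio.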

Keywords:

Fraud detection, Class imbalance, Ensemble learning, Oversampling techniques, Machine learning algorithms.
