Improving Classification of Fraudulent Sales
International Journal of Computer Science and Engineering
© 2018 by SSRG - IJCSE Journal
Volume 5 Issue 12
Year of Publication: 2018
Authors: Barry E. King
How to Cite?
Barry E. King, "Improving Classification of Fraudulent Sales," SSRG International Journal of Computer Science and Engineering, vol. 5, no. 12, pp. 16-17, 2018. Crossref, https://doi.org/10.14445/23488387/IJCSE-V5I12P104
Abstract:
This article presents an improved solution to classifying fraudulent sales. On a dataset of more than fifteen thousand cases, eight percent of which were fraudulent, an original k-nearest neighbor solution yielded a misclassification rate of 0.058. An improved solution using a boosted C5.0 algorithm yielded a misclassification rate of 0.038. The solution was then expanded to recognize that false positives (classifying a fraudulent sale as clean) were five times as costly as false negatives (classifying a clean sale as fraudulent). This expanded solution had a misclassification rate of 0.058 but lowered the misclassification cost by twenty-one percent.
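The distinction between misclassification rate and misclassification cost can be illustrated with a minimal Python sketch. It is not the article's code. The function names and the error counts are hypothetical and invented for illustration; only the five-to-one cost ratio comes from the abstract. The sketch shows how a classifier with a higher error rate can still carry a lower total cost when the more expensive kind of error becomes rarer.

```python
# Illustrative sketch only: counts below are invented, not the article's results.

def misclassification_rate(total_cases, fraud_called_clean, clean_called_fraud):
    """Share of all cases that were assigned the wrong class."""
    return (fraud_called_clean + clean_called_fraud) / total_cases


def misclassification_cost(fraud_called_clean, clean_called_fraud, cost_ratio=5.0):
    """Total cost in units of one 'clean sale called fraudulent' error.

    cost_ratio is how many times more costly it is to call a fraudulent
    sale clean (five, per the abstract).
    """
    return cost_ratio * fraud_called_clean + clean_called_fraud


# Hypothetical classifier A: fewer errors overall, but more missed fraud.
rate_a = misclassification_rate(1000, fraud_called_clean=30, clean_called_fraud=10)
cost_a = misclassification_cost(fraud_called_clean=30, clean_called_fraud=10)

# Hypothetical classifier B: more errors overall, but less missed fraud.
rate_b = misclassification_rate(1000, fraud_called_clean=12, clean_called_fraud=48)
cost_b = misclassification_cost(fraud_called_clean=12, clean_called_fraud=48)

print(f"A: rate={rate_a:.3f}, cost={cost_a:.0f}")
print(f"B: rate={rate_b:.3f}, cost={cost_b:.0f}")
```

In this invented example, classifier B misclassifies more cases than A yet has the lower cost, which is the same kind of trade-off the abstract reports for the expanded solution.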
Keywords:
binary classification, machine learning, k-nearest neighbor, C5.0 algorithm
References:
[1] Murillo, J. P. (2016). Predicting fraudulent sales. [Online] https://rpubs.com/jpmurillo/fraudulentsales.
[2] Lantz, B. (2015). Machine Learning with R, 2nd edition. Birmingham, UK: Packt.