Improving Classification of Fraudulent Sales

Barry E. King

doi:10.14445/23488387/IJCSE-V5I12P104

International Journal of Computer Science and Engineering

Volume 5 Issue 12

Year of Publication : 2018

Authors : Barry E. King

How to Cite?

Barry E. King, "Improving Classification of Fraudulent Sales," SSRG International Journal of Computer Science and Engineering, vol. 5, no. 12, pp. 16-17, 2018. Crossref, https://doi.org/10.14445/23488387/IJCSE-V5I12P104

Abstract:

This article presents an improved solution to classifying fraudulent sales. An original k-nearest neighbor solution, applied to a dataset of more than fifteen thousand cases of which eight percent were fraudulent, yielded a misclassification rate of 0.058. An improved solution using a boosted C5.0 algorithm yielded a misclassification rate of 0.038. The solution was then expanded to recognize that false positives (classifying a fraudulent sale as clean) were five times as costly as false negatives (classifying a clean sale as fraudulent). This expanded solution had a higher misclassification rate, 0.058, but lowered the misclassification cost by twenty-one percent.
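
The distinction between misclassification rate and misclassification cost can be illustrated with a minimal Python sketch. It is not the article's code or data: the class, the function names, and every count below are hypothetical, chosen only to show how a classifier with a higher error rate can still have a lower total cost when a fraudulent sale classified as clean is weighted five times as heavily as a clean sale classified as fraudulent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary fraud classifier.

    All values used in this sketch are hypothetical, not taken from the article.
    """
    fraud_as_fraud: int  # fraudulent sale correctly flagged
    fraud_as_clean: int  # fraudulent sale missed (the costlier error)
    clean_as_fraud: int  # clean sale wrongly flagged
    clean_as_clean: int  # clean sale correctly passed

    @property
    def total(self) -> int:
        return (self.fraud_as_fraud + self.fraud_as_clean
                + self.clean_as_fraud + self.clean_as_clean)


def misclassification_rate(c: ConfusionCounts) -> float:
    """Share of all cases that were classified incorrectly."""
    return (c.fraud_as_clean + c.clean_as_fraud) / c.total


def misclassification_cost(c: ConfusionCounts,
                           missed_fraud_cost: float = 5.0,
                           false_alarm_cost: float = 1.0) -> float:
    """Total cost with a 5:1 weighting, as assumed from the abstract."""
    return (c.fraud_as_clean * missed_fraud_cost
            + c.clean_as_fraud * false_alarm_cost)


# Hypothetical results on 1,000 sales, 8% of them fraudulent.
accuracy_tuned = ConfusionCounts(fraud_as_fraud=50, fraud_as_clean=30,
                                 clean_as_fraud=10, clean_as_clean=910)
cost_tuned = ConfusionCounts(fraud_as_fraud=70, fraud_as_clean=10,
                             clean_as_fraud=50, clean_as_clean=870)

for name, counts in [("accuracy-tuned", accuracy_tuned),
                     ("cost-tuned", cost_tuned)]:
    print(name,
          "rate:", misclassification_rate(counts),
          "cost:", misclassification_cost(counts))
```

In this toy case the cost-tuned classifier makes more errors overall but misses fewer fraudulent sales, so its weighted cost is lower. This is the same kind of trade-off the abstract reports, though the figures here are illustrative only.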

Keywords:

binary classification, machine learning, k-nearest neighbor, C5.0 algorithm

References:

[1] Murillo, J. P. (2016). Predicting fraudulent sales. [Online] https://rpubs.com/jpmurillo/fraudulentsales.
[2] Lantz, B. (2015). Machine Learning with R, 2nd edition. Birmingham, UK: Packt.
