Unsupervised Learning for Real-Time Data Anomaly Detection: A Comprehensive Approach

International Journal of Computer Science and Engineering
© 2024 by SSRG - IJCSE Journal
Volume 11 Issue 10
Year of Publication : 2024
Authors : Pankaj Gupta, Prasanta Tripathy

How to Cite?

Pankaj Gupta, Prasanta Tripathy, "Unsupervised Learning for Real-Time Data Anomaly Detection: A Comprehensive Approach," SSRG International Journal of Computer Science and Engineering, vol. 11, no. 10, pp. 1-11, 2024. Crossref, https://doi.org/10.14445/23488387/IJCSE-V11I10P101

Abstract:

Real-time anomaly detection is used in financial services, healthcare, cybersecurity, and industrial IoT to detect fraud, cyberattacks, damaged machinery, and other significant issues. Traditional supervised learning methods depend on labelled data and often struggle to adapt to previously unseen anomalies. Unsupervised learning is more adaptable: it can discover irregularities in real time without pre-labelled samples. This study reviews the unsupervised learning approaches used to detect point, contextual, and collective anomalies, and assesses their suitability for real-time anomaly detection. Clustering methods such as K-means and DBSCAN flag points that fit poorly into any cluster; Principal Component Analysis and Autoencoders reduce the data to a simpler representation in which unusual patterns stand out; Isolation Forest and Local Outlier Factor identify anomalies by how isolated or sparsely surrounded a point is; and the One-Class Support Vector Machine learns a boundary around normal data and flags points that fall outside it. The study also examines hybrid models that combine these strategies to improve detection. It discusses the challenges of real-time anomaly detection, including concept drift and the need for efficient, scalable algorithms that can handle large volumes of high-velocity data, with emphasis on data stream management, scalability, and real-time data processing. Research on financial fraud, cybersecurity, and industrial IoT applications illustrates how these strategies work in practice. The article concludes by examining the limitations of unsupervised learning methods and suggesting directions for future research, including developing adaptive learning models and strengthening them with reinforcement learning. It also notes that real-time anomaly detection raises ethical issues, including privacy and monitoring, which call for responsible deployment.
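To make the idea of label-free, real-time point-anomaly detection concrete, the following is a minimal sketch, not a method taken from the article. It assumes a single numeric stream and swaps in a deliberately simple technique, a sliding-window z-score, in place of the reviewed algorithms. The class name `SlidingWindowDetector` and the parameters `window_size`, `threshold`, and `min_samples` are hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import mean, stdev


class SlidingWindowDetector:
    """Illustrative detector: flags values far from the recent window's mean.

    Hypothetical helper for this example only; it is not an algorithm
    described in the article.
    """

    def __init__(self, window_size=50, threshold=3.0, min_samples=10):
        # A bounded deque drops the oldest value automatically once full.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold
        self.min_samples = min_samples

    def update(self, value):
        """Score one incoming value against the window, then store it."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mu = mean(self.window)
            sigma = stdev(self.window)
            # A zero spread gives no scale to compare against, so skip scoring.
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # Every value is stored, so the baseline follows gradual drift;
        # the cost is that a spike temporarily widens the window's spread.
        self.window.append(value)
        return is_anomaly


# Synthetic stream: readings near 10 with one injected spike at index 11.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1, 9.9, 10.0, 10.2, 25.0, 10.1]
detector = SlidingWindowDetector(window_size=20, threshold=3.0, min_samples=10)
flags = [detector.update(x) for x in stream]
print([i for i, flagged in enumerate(flags) if flagged])  # → [11]
```

No labels are used at any point: the notion of "normal" comes only from the recent window, which is also what lets the baseline shift as the stream changes. Storing every value, including flagged ones, is a design choice that favours adapting to drift over robustness to repeated spikes; a production system would likely use one of the methods the article reviews instead.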

Keywords:

Clustering methods, Dimensionality reduction, Density-based methods, Real-time anomaly detection, Unsupervised learning.
