Implementation of Six Single Classifiers and Feature Selection for Performance Enhancement in Anomaly-Based Intrusion Detection

International Journal of Electronics and Communication Engineering
© 2024 by SSRG - IJECE Journal
Volume 11 Issue 3
Year of Publication : 2024
Authors : Abdisalam A. Mohamed, Ibraheem Shayea, Fadi Al-Turjman
How to Cite?

Abdisalam A. Mohamed, Ibraheem Shayea, Fadi Al-Turjman, "Implementation of Six Single Classifiers and Feature Selection for Performance Enhancement in Anomaly-Based Intrusion Detection," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 3, pp. 195-208, 2024. Crossref, https://doi.org/10.14445/23488549/IJECE-V11I3P118

Abstract:

Attacks against information systems have increased sharply in recent years, and cyberattacks are becoming harder to detect with conventional antivirus software and firewalls. Various security systems have been deployed to protect information systems; Network Intrusion Detection Systems (NIDS) are among the most widely used in the networking industry. An IDS can be anomaly-based or signature-based. Signature-based NIDSs are effective against known attacks but ineffective against zero-day attacks. For detecting novel attack techniques, anomaly-based IDSs have proven more useful than signature-based ones. This study uses six machine learning algorithms to detect network intrusions. The CSE-CIC-IDS2018 dataset is used to train and test the algorithms. The dataset is cleaned of defects, and important features are selected using the Random Forest Regressor algorithm. A sample of the dataset, restricted to the selected key features, is then used to train and evaluate six machine learning algorithms: Gradient Boosting, AdaBoost, ID3, K-Nearest Neighbors (KNN), Multilayer Perceptron (MLP), and Random Forest. Within a short period of time, the algorithms achieved the following F1-scores: Gradient Boosting (0.95), AdaBoost (0.94), KNN (0.93), ID3 (0.93), Random Forest (0.93), and MLP (0.78).
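For readers unfamiliar with the reported metric, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation for a single positive class. The function name `f1_score_binary` and the toy labels are hypothetical and are not taken from the paper, and the abstract does not state whether the reported scores are per-class or averaged across classes.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) treating `positive` as the attack class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: attacks correctly flagged as attacks.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: benign flows flagged as attacks.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: attacks that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (1 = attack, 0 = benign), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = f1_score_binary(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 penalizes both missed attacks and false alarms, it is often preferred over plain accuracy on imbalanced traffic data, where benign flows typically far outnumber attacks.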

Keywords:

AdaBoost, CSE-CIC-IDS2018, Machine Learning, MLP, Network Intrusion Detection, Random Forest.
