Implementation of Six Single Classifiers and Feature Selection for Performance Enhancement in Anomaly-Based Intrusion Detection

Abdisalam A. Mohamed; Ibraheem Shayea; Fadi Al-Turjman

doi:10.14445/23488549/IJECE-V11I3P118

International Journal of Electronics and Communication Engineering

Volume 11 Issue 3

Year of Publication : 2024


How to Cite?

Abdisalam A. Mohamed, Ibraheem Shayea, Fadi Al-Turjman, "Implementation of Six Single Classifiers and Feature Selection for Performance Enhancement in Anomaly-Based Intrusion Detection," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 3, pp. 195-208, 2024. Crossref, https://doi.org/10.14445/23488549/IJECE-V11I3P118

Abstract:

Attacks against information systems have increased sharply in recent years, and cyberattacks are becoming harder for conventional antivirus software and firewalls to detect. Various security systems have been deployed to protect information systems; Network Intrusion Detection Systems (NIDS) are among the most widely used in the networking industry. An IDS can be anomaly-based or signature-based. Signature-based NIDSs are effective against known attacks but ineffective against zero-day attacks. For detecting novel attack techniques, anomaly-based IDS has proven more useful than signature-based IDS. This study uses six machine learning algorithms to detect network intrusion incidents. The CSE-CIC-IDS2018 dataset is used to train and test the algorithms. The dataset is cleaned of defects, and important features are selected using the Random Forest Regressor algorithm. A sample of the dataset with the selected key features is applied to six machine learning algorithms (Gradient Boosting, AdaBoost, ID3, KNN, MLP, and Random Forest). Within a short period of time, the algorithms achieved the following F1-scores: Gradient Boosting (0.95), AdaBoost (0.94), K-Nearest Neighbors (0.93), ID3 (0.93), Random Forest (0.93), and MLP (0.78).
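For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The following is a minimal sketch in plain Python, not the authors' evaluation code. The function name and the toy labels are hypothetical. The abstract does not state whether the reported scores are per-class or averaged across classes, so the sketch covers only the single-class (binary) case.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score of the `positive` class for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # With no true positives, precision and recall are zero or undefined;
    # returning 0.0 is a common convention and is an assumption here.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = attack, 0 = benign), not taken from CSE-CIC-IDS2018.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Here one attack is missed and one benign flow is flagged, so precision and recall are both 3/4. Because the metric ignores true negatives, it is less easily inflated than accuracy when benign flows greatly outnumber attack flows.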

Keywords:

AdaBoost, CSE-CIC-IDS2018, Machine Learning, MLP, Network Intrusion Detection, Random Forest.

