Multi-Dimensional Machine Intelligence Technique on High Computational Data for Bigdata Analytics

International Journal of Electrical and Electronics Engineering
© 2024 by SSRG - IJEEE Journal
Volume 11 Issue 6
Year of Publication : 2024
Authors : K. Kishore Raju, Ch.S.V.V.S.N. Murty, Suresh Kumar Kanaparthi, Amdewar Godavari, Kayam Saikumar
How to Cite?

K. Kishore Raju, Ch.S.V.V.S.N. Murty, Suresh Kumar Kanaparthi, Amdewar Godavari, Kayam Saikumar, "Multi-Dimensional Machine Intelligence Technique on High Computational Data for Bigdata Analytics," SSRG International Journal of Electrical and Electronics Engineering, vol. 11,  no. 6, pp. 91-100, 2024. Crossref, https://doi.org/10.14445/23488379/IJEEE-V11I6P110

Abstract:

In today's digital environment, vast amounts of data are generated across diverse sectors such as healthcare, content creation, the internet, and business. Machine Learning (ML) algorithms are pivotal in analyzing this data to uncover insights that support decision-making. However, not all features within these datasets are relevant for constructing robust ML models; some are insignificant or have minimal impact on prediction outcomes. Filtering out these irrelevant features reduces the computational burden on ML algorithms. Using the freely available MNIST dataset, this study explores the application of t-SNE, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA) alongside several prominent ML classifiers, namely Naive Bayes, Support Vector Machine (SVM), and K-Nearest Neighbors (K-NN). Experimental outcomes illustrate the effectiveness of ML algorithms in this context. Furthermore, the experiments demonstrate that combining PCA with ML algorithms leads to improved outcomes, particularly on high-dimensional datasets. Performance measures of 98.34% Accuracy, 98.76% Sensitivity, 98.45% Recall, and 98.65% Throughput were attained, a notable improvement.
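The pipeline described in the abstract (reduce dimensionality with PCA, then train classic classifiers) can be sketched as follows. This is an illustrative example, not the authors' exact implementation: it uses scikit-learn's small built-in digits dataset as a stand-in for MNIST, and the choice of 20 principal components is an assumption for demonstration.

```python
# Sketch: PCA-based dimensionality reduction followed by the three
# classifiers named in the abstract (Naive Bayes, SVM, K-NN).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# 1797 8x8 digit images flattened to 64 features (MNIST stand-in).
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit PCA on the training split only, reducing 64 features to 20
# components (an assumed setting, not taken from the paper).
pca = PCA(n_components=20).fit(X_tr)
X_tr_p, X_te_p = pca.transform(X_tr), pca.transform(X_te)

# Train each classifier on the reduced features and report accuracy.
for name, clf in [("Naive Bayes", GaussianNB()),
                  ("SVM", SVC()),
                  ("K-NN", KNeighborsClassifier(n_neighbors=5))]:
    clf.fit(X_tr_p, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te_p)))
```

Fitting PCA on the training split and reusing the same transform on the test split avoids leaking test information into the reduction step; swapping `PCA` for `sklearn.discriminant_analysis.LinearDiscriminantAnalysis` yields the LDA variant of the same pipeline.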

Keywords:

Dimensionality reduction, KNN, ML, NB, PCA, LDA, t-SNE, SVM.

References:

[1] Adiwijaya et al., “Dimensionality Reduction Using Principal Component Analysis for Cancer Detection based on Microarray Data Classification,” Journal of Computer Science, vol. 14, no. 11, pp. 1521-1530, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[2] Gustavo Eloi de Rodrigues, Wilson Marcílio, and Danilo Eler, “Data Classification: Dimensionality Reduction Using Combined and Non-Combined Multidimensional,” 2018 7th Brazilian Conference on Intelligent Systems (BRACIS), Sao Paulo, Brazil, pp. 402-407, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[3] Pitoyo Hartono, “Classification and Dimensional Reduction Using Restricted Radial Basis Function Networks,” Natural Computing Applications, vol. 30, pp. 905-915, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[4] Md. Golam Saroware et al., “Performance Evaluation of Feature Extraction and Dimensionality Reduction Techniques on Various Machine Learning Classifiers,” 2019 IEEE 9th International Conference on Advanced Computing (IACC), Tiruchirappalli, India, pp. 19-24, 2019.
[CrossRef] [Google Scholar] [Publisher Link] 
[5] Hany Yan, and Hu Tianyu, “Unsupervised Dimensionality Reduction for High-Dimensional Data Classification,” Machine Learning Research, vol. 2, no. 4, pp. 125-132, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[6] G. Thippa Reddy et al., “Analysis of Dimensionality Reduction Techniques on Big Data,” IEEE Access, vol. 8, pp. 54776-54788, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[7] Areej Alsaafin, and Ashraf Elnagar, “A Minimal Subset of Features Using Feature Selection for Handwritten Digit Recognition,” Journal of Intelligent Learning Systems and Applications, vol. 9, no. 4, pp. 55-68, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[8] Abbas Mardani et al., “A Multi-Stage Method to Predict Carbon Dioxide Emissions Using Dimensionality Reduction, Clustering, and Machine Learning Techniques,” Journal of Cleaner Production, vol. 275, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[9] Rizgar R. Zebari et al., “A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction,” Journal of Applied Science and Technology Trends, vol. 1, no. 1, pp. 56-70, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[10] Drishti Beohar, and Akhtar Rasool, “Handwritten Digit Recognition of MNIST Dataset Using Deep Learning State-of-the-Art Artificial Neural Network (ANN) and Convolutional Neural Network (CNN),” 2021 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, pp. 542-548, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[11] Tausifa Jan Saleem, and Mohammad Ahsan Chishti, “Assessing the Efficacy of Machine Learning Techniques for Handwritten Digit Recognition,” International Journal of Computing and Digital Systems, vol. 9, no. 2, pp. 299-308, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[12] Cheng-Lung Huang, and Jian-Fan Dun, “A Distributed PSO–SVM Hybrid System with Feature Selection and Parameter Optimization,” Applied Soft Computing, vol. 8, no. 4, pp. 1381-1391, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[13] Y. Lecun et al., “Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[CrossRef] [Google Scholar] [Publisher Link]
[14] M. Ramakrishna Murty, J.V.R. Murthy, and Prasad Reddy P.V.G.D., “Text Document Classification Based on a Least Square Support Vector Machines with Singular Value Decomposition,” International Journal of Computer Application (IJCA), vol. 27, no. 7, pp. 21-26, 2011.
[CrossRef] [Google Scholar] [Publisher Link]
[15] Lingyun Wang et al., “Short-Term Power Load Forecasting Model Based on t-SNE Dimension Reduction Visualization Analysis, VMD and LSSVM Improved with Chaotic Sparrow Search Algorithm Optimization,” Journal of Electrical Engineering & Technology, vol. 17, pp. 2675-2691, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[16] I.T. Jolliffe, “Generalizations and Adaptations of Principal Component Analysis,” Principal Component Analysis, Springer, New York, pp. 223-234, 1986.
[CrossRef] [Google Scholar] [Publisher Link]
[17] Olivier Chapelle et al., “Choosing Multiple Parameters for Support Vector Machines,” Machine Learning, vol. 46, pp. 131-159, 2002.
[CrossRef] [Google Scholar] [Publisher Link]
[18] T. Cover, and P. Hart, “Nearest Neighbor Pattern Classification,” IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.
[CrossRef] [Google Scholar] [Publisher Link]
[19] L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[Google Scholar]
[20] D. Padmaja Usharani et al., “Classification of High-Dimensionality Data Using Machine Learning Techniques,” Intelligent System Design, vol. 494, pp. 227-237, 2022.
[CrossRef] [Google Scholar] [Publisher Link]