Augmenting Association Rule Mining in Apriori Algorithm using Cuckoo Search with Opposition Parameters-Based Learning

International Journal of Computer Science and Engineering
© 2024 by SSRG - IJCSE Journal
Volume 11 Issue 9
Year of Publication: 2024
Authors: N. Bhanu Prakash, E. Kesavulu Reddy

How to Cite?

N. Bhanu Prakash, E. Kesavulu Reddy, "Augmenting Association Rule Mining in Apriori Algorithm using Cuckoo Search with Opposition Parameters-Based Learning," SSRG International Journal of Computer Science and Engineering, vol. 11, no. 9, pp. 26-38, 2024. Crossref, https://doi.org/10.14445/23488387/IJCSE-V11I9P104

Abstract:

Data mining extracts hidden patterns from large datasets, and the extracted information supports better decisions and, in turn, better business outcomes. Frequent itemset mining (FIM) is a core technique within association rule mining, and the Apriori algorithm is one of the most widely used algorithms for discovering frequent itemsets and association rules. Applications include market basket analysis, educational course selection, stock management, and medical data analysis. On large datasets, however, the computational burden of the Apriori algorithm grows exponentially, so execution in a parallel, distributed environment can improve performance. This paper presents an improved approach that integrates the Apriori algorithm with the Cuckoo Search algorithm using opposition parameters-based learning (CS-OPBL). The Cuckoo Search mechanism with opposition-based learning efficiently prunes the transactions and the items in each transaction. The processing time of the approach is greatly reduced when it is executed in a Spark in-memory distributed environment. Experimental results show that the proposed CS-OPBL-based method outperforms the competing algorithms; for example, at a minimum support threshold of 0.75% on the retail dataset, its processing time is only about 5.8% of that of the state-of-the-art method.
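
To make the baseline concrete, the following is a minimal single-machine sketch of Apriori with simple transaction and item pruning. It is an illustration under stated assumptions, not the CS-OPBL method of the paper: the pruning here is the classic rule-based kind (drop items that no longer appear in any frequent itemset, then drop transactions too short to hold a k-itemset), no Cuckoo Search or opposition-based learning is involved, and nothing runs on Spark. The function name `apriori`, the `baskets` data, and the 0.6 threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is a fraction of the total number of transactions.
    Illustrative sketch only; not the authors' CS-OPBL algorithm.
    """
    n = len(transactions)
    db = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones.
    counts = Counter(item for t in db for item in t)
    frequent = {frozenset([i]): c for i, c in counts.items()
                if c / n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Item pruning: keep only items that occur in some frequent
        # (k-1)-itemset; no other item can be part of a frequent k-itemset.
        alive = frozenset().union(*frequent)
        db = [t & alive for t in db]
        # Transaction pruning: a transaction with fewer than k items
        # cannot contain any k-itemset.
        db = [t for t in db if len(t) >= k]

        # Candidate generation: join pairs of (k-1)-itemsets and keep a
        # candidate only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in frequent for s in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting over the pruned database.
        counts = Counter()
        for t in db:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        # Support is still measured against the original transaction count n.
        frequent = {c: cnt for c, cnt in counts.items()
                    if cnt / n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data for a quick trial run.
baskets = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "cola"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "cola"],
]
for itemset, count in sorted(apriori(baskets, 0.6).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The pruning steps shrink the database between passes without changing any support count, which is the kind of saving the abstract attributes, in a more elaborate form, to the search-guided pruning of transactions and items.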

Keywords:

Data Mining, Frequent Itemset Mining (FIM), Association Rule Mining, Apriori Algorithm, Cuckoo Search, Spark.
