An Efficient CoRXGB Approach to Estimate Effort of Scrum Projects

International Journal of Electronics and Communication Engineering
© 2025 by SSRG - IJECE Journal
Volume 12 Issue 1
Year of Publication: 2025
Authors: Shivali Chopra, Arun Malik
How to Cite?

Shivali Chopra, Arun Malik, "An Efficient CoRXGB Approach to Estimate Effort of Scrum Projects," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 1, pp. 44-71, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I1P104

Abstract:

In agile software development, Story Point Estimation (SPE) is at the core of project planning, resource planning, and project timeline management. Over the last ten years, many researchers have proposed methods for estimating story points or tasks in agile projects. Traditional approaches such as expert judgment, planning poker, and analogy have been widely applied but are criticized for their inherent subjectivity, vulnerability to bias, and inability to handle the intrinsic complexity of user stories. These weaknesses lead to inaccurate estimates, misaligned stakeholder expectations, and suboptimal sprint outcomes. Research has recently shifted toward machine learning-based and deep learning-based approaches, which aim to provide more systematic, data-driven estimation models. However, these approaches also struggle to fully capture the nuances of the multifaceted nature of user stories. This paper proposes a new hybrid model for software effort estimation, named CoRXGB. It synergistically combines CNN, RNN, and XGBoost to draw on the strengths of each: the CNN for extracting contextual and textual features, the Bi-LSTM for extracting sequential and temporal relations, and XGBoost for its strength in classification. A key novelty of the approach is its hyperparameter optimization strategy, which integrates Bayesian Optimization with the Learning Gain Matrix. This strategy systematically analyzes and optimizes the performance gains obtained from different hyperparameter configurations, thereby removing inefficiencies traditionally associated with the tuning process and enabling better-informed, selective adjustments toward high performance.
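As a rough illustration of the data flow described above, the following toy sketch extracts convolutional features and bidirectional recurrent features from one user story and concatenates them into the vector a boosted-tree classifier such as XGBoost would receive. Everything here is an assumption made for illustration: the abstract gives no layer sizes, embeddings, or wiring, the weights are random rather than trained, a plain tanh recurrence stands in for the LSTM cell, and the classifier itself is omitted. All function and variable names are hypothetical.

```python
import random

import math

random.seed(0)
EMB, HID, WIDTH, N_KERNELS = 4, 3, 2, 5  # toy sizes, chosen arbitrarily


def rand_matrix(rows, cols):
    """Random weights in [-1, 1]; a trained model would learn these."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def conv_max_pool(seq, kernels):
    """Slide each kernel over the token vectors, apply ReLU, max-pool over time."""
    feats = []
    for kernel in kernels:  # kernel: WIDTH rows of EMB weights
        best = 0.0  # ReLU outputs are never negative
        for i in range(len(seq) - len(kernel) + 1):
            act = sum(
                w * x
                for j, row in enumerate(kernel)
                for w, x in zip(row, seq[i + j])
            )
            best = max(best, act)
        feats.append(best)
    return feats


def rnn_last_state(seq, w_in, w_rec):
    """Plain tanh recurrence (simplified stand-in for an LSTM); returns final state."""
    h = [0.0] * len(w_rec)
    for x in seq:
        h = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[u], x))
                + sum(wr * hj for wr, hj in zip(w_rec[u], h))
            )
            for u in range(len(w_rec))
        ]
    return h


def extract_features(seq, params):
    """Concatenate convolutional, forward, and backward recurrent features."""
    conv = conv_max_pool(seq, params["kernels"])
    fwd = rnn_last_state(seq, *params["fwd"])
    bwd = rnn_last_state(seq[::-1], *params["bwd"])  # read the story right to left
    return conv + fwd + bwd


params = {
    "kernels": [rand_matrix(WIDTH, EMB) for _ in range(N_KERNELS)],
    "fwd": (rand_matrix(HID, EMB), rand_matrix(HID, HID)),
    "bwd": (rand_matrix(HID, EMB), rand_matrix(HID, HID)),
}
story = rand_matrix(6, EMB)  # a six-token user story as random embeddings
features = extract_features(story, params)
print(len(features))  # N_KERNELS + 2 * HID values per story
```

In a pipeline of this shape, one such feature vector per user story would form a row of the training matrix for the final classifier, with the story point class as the label.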
The resulting CoRXGB model was evaluated extensively on a wide array of data sourced from different Agile projects, comprising more than 23,000 user stories. The results showed a significant improvement in accuracy after hyperparameter tuning, with Appcelerator Studio rising from 82.47% to 90.55% and Aptana Studio rising from 82.57% to 90.82%, reflecting an increase of 6.44%. Across datasets, CoRXGB outperformed traditional classifiers such as Logistic Regression, Support Vector Classifier, and K-Nearest Neighbors, as well as advanced models such as RNN-CNN and DEEP-SE. These results underpin the effectiveness of the CoRXGB model in story point estimation. It not only outperforms the baselines by substantial margins in precision, recall, and F1-score but also holds considerable promise for improving Agile project planning toward more reliable and efficient software development practices.
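For readers who want to check the per-project figures, the short sketch below computes the absolute gain (in percentage points) and the relative gain for the two projects named in the abstract. Only the four accuracy values come from the abstract; the variable names are hypothetical. The abstract does not state how the 6.44% figure is derived, and it may summarise a wider set of datasets than these two.

```python
# Accuracy before and after hyperparameter tuning, as quoted in the abstract (%).
accuracy = {
    "Appcelerator Studio": (82.47, 90.55),
    "Aptana Studio": (82.57, 90.82),
}

gains = {}
for project, (before, after) in accuracy.items():
    absolute = round(after - before, 2)                    # percentage points
    relative = round((after - before) / before * 100, 2)   # percent of baseline
    gains[project] = (absolute, relative)
    print(f"{project}: +{absolute} points, +{relative}% relative")
```

Both projects gain a little over eight percentage points, which corresponds to a relative improvement of just under ten percent over the untuned baseline.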

Keywords:

Agile, CNN, Deep Learning, Effort Estimation, Machine Learning, RNN, Scrum, XGB.
