Dynamic Indian Sign Language Recognition Based on Enhanced LSTM with Custom Attention Mechanism
International Journal of Electronics and Communication Engineering
© 2024 by SSRG - IJECE Journal
Volume 11 Issue 2
Year of Publication: 2024
Authors: Jay M. Joshi, Dhaval U. Patel
How to Cite?
Jay M. Joshi, Dhaval U. Patel, "Dynamic Indian Sign Language Recognition Based on Enhanced LSTM with Custom Attention Mechanism," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 2, pp. 60-68, 2024. Crossref, https://doi.org/10.14445/23488549/IJECE-V11I2P107
Abstract:
In this paper, the authors develop a system based on an enhanced Long Short-Term Memory (LSTM) model with a custom Attention Mechanism (AM), designed for real-time recognition of dynamic Indian Sign Language (ISL). A large custom dataset of 59 signs was used, with features extracted by the MediaPipe framework. Three models were trained on this dataset: an LSTM-Dense model with the custom AM, an LSTM-Dense model with a traditional AM, and an LSTM-Dense model without an AM. The proposed LSTM-Dense model with the custom AM outperformed the other two, achieving a maximum accuracy of 96.08% in predicting the sign. Compared with existing ISL recognition methods, the proposed model surpassed all of them in accuracy, even with a large number of signs. In addition, 5-fold cross-validation of the proposed model yielded an accuracy of 93%, supporting the robustness of the results. The results show that the proposed recognition system can recognize ISL gestures effectively and robustly.
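As a rough illustration of what an attention mechanism over recurrent outputs does, the sketch below pools a sequence of per-frame hidden states into a single context vector using softmax weights. It is a generic dot-product attention pooling written in plain Python, not the custom mechanism proposed in the paper. The function names, the `query` vector (which would be learned in a trained model), and the toy values are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool T hidden-state vectors into one context vector.

    hidden_states: list of T vectors (lists of floats, each of length D),
                   standing in for per-frame LSTM outputs (assumption).
    query:         vector of length D; hypothetical stand-in for a learned
                   attention parameter.
    Returns (context, weights), where weights sum to 1 over the T frames.
    """
    # One relevance score per frame: dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy input: 3 frames, 2-dimensional hidden states (made-up values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# A zero query scores every frame equally, so the pooling reduces to a mean.
context, weights = attention_pool(states, [0.0, 0.0])
```

In a classifier of the kind the abstract describes, a context vector like this would typically feed the dense layers that predict the sign. Frames with higher scores contribute more to the prediction.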
Keywords:
Indian Sign Language recognition, Computer vision, Gesture recognition, Pattern recognition, Deep Learning.