The English to Telugu CLIR for NER using Bidirectional Encoding and Unidirectional Decoding using Random Sampling and Beam Search

International Journal of Electronics and Communication Engineering
© 2024 by SSRG - IJECE Journal
Volume 11 Issue 5
Year of Publication : 2024
Authors : B. N. V. Narasimha Raju, K. V. V. Satyanarayana, M. S. V. S. Bhadri Raju
How to Cite?

B. N. V. Narasimha Raju, K. V. V. Satyanarayana, M. S. V. S. Bhadri Raju, "The English to Telugu CLIR for NER using Bidirectional Encoding and Unidirectional Decoding using Random Sampling and Beam Search," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 5, pp. 170-178, 2024. Crossref, https://doi.org/10.14445/23488549/IJECE-V11I5P117

Abstract:

Neural Machine Translation (NMT) systems and the availability of a wide variety of linguistic resources have greatly improved Cross-Lingual Information Retrieval (CLIR) capabilities. NMT performs well when translating English queries into Indian languages, and it relies on a parallel corpus for training. This study focuses on translating English queries into Telugu. Because Telugu-language content is scarce, a sizable parallel corpus is difficult to build. Consequently, NMT encounters problems with Out-Of-Vocabulary (OOV) words and with Named Entity Recognition (NER). Byte Pair Encoding (BPE) addresses the OOV problem by breaking rare words into subwords that can be translated. Named-entity problems nevertheless remain. To recognize named entities effectively, the system can be trained in both the forward and reverse directions through bidirectional encoding, so NER issues can be addressed with Bidirectional Long Short-Term Memory (BiLSTM) encoding. Random sampling and beam search decoding with a unidirectional LSTM are used to improve the translated output sequence. Together, BPE, BiLSTM encoding, and unidirectional LSTM decoding with random sampling and beam search help resolve the OOV and NER problems and improve the output sequences generated by the NMT system. The approach is evaluated using the Bilingual Evaluation Understudy (BLEU) score and other metrics such as accuracy, perplexity, and cross-entropy; the evaluation shows that the translation quality of NMT with bidirectional encoding and unidirectional decoding using random sampling and beam search surpasses that of regular LSTM encoder-decoder models.
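
The two decoding strategies named in the abstract can be contrasted with a minimal Python sketch. This is not the authors' implementation: the lookup table TOY_MODEL, the tokens "a" and "b", and the function names are hypothetical stand-ins for the next-token probabilities that a trained unidirectional LSTM decoder would produce. Random sampling draws each next token in proportion to its probability, while beam search keeps the beam_width highest-scoring partial sequences at every step.

```python
import math
import random

# Hypothetical toy "decoder": next-token probabilities conditioned only on the
# previous token. A real system would compute these from the LSTM state.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"b": 0.7, "</s>": 0.3},
    "b": {"a": 0.2, "</s>": 0.8},
}


def next_probs(prefix):
    """Return the next-token distribution for a partial sequence."""
    return TOY_MODEL[prefix[-1]]


def random_sampling(max_len=10, rng=None):
    """Sample one sequence, drawing each token according to its probability."""
    rng = rng or random.Random()
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        probs = next_probs(seq)
        tokens = list(probs)
        weights = [probs[t] for t in tokens]
        seq.append(rng.choices(tokens, weights=weights, k=1)[0])
    return seq


def beam_search(beam_width=2, max_len=10):
    """Return (log_probability, sequence) of the best hypothesis found."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len - 1):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":
                # Finished hypotheses are carried over unchanged.
                candidates.append((score, seq))
                continue
            for token, p in next_probs(seq).items():
                candidates.append((score + math.log(p), seq + [token]))
        # Keep only the beam_width highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == "</s>" for _, seq in beams):
            break
    return beams[0]


if __name__ == "__main__":
    print("sampled:", random_sampling(rng=random.Random(0)))
    score, best = beam_search(beam_width=2)
    print("beam   :", best, "probability =", round(math.exp(score), 3))
```

Scores are accumulated as log-probabilities so that long sequences do not underflow. In this sketch no length normalization is applied, which a practical system would likely need when comparing hypotheses of different lengths.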

Keywords:

Cross lingual information retrieval, Machine translation, BiLSTM, Random sampling, Beam search.
