Enhancing Question Answering with a Multidirectional Transformer: Insights from Squad 2.0

International Journal of Electronics and Communication Engineering
© 2024 by SSRG - IJECE Journal
Volume 11 Issue 4
Year of Publication: 2024
Authors: R. Rejimoan, B. Gnanapriya, J.S. Jayasudha
How to Cite?

R. Rejimoan, B. Gnanapriya, J.S. Jayasudha, "Enhancing Question Answering with a Multidirectional Transformer: Insights from Squad 2.0," SSRG International Journal of Electronics and Communication Engineering, vol. 11, no. 4, pp. 133-148, 2024. Crossref, https://doi.org/10.14445/23488549/IJECE-V11I4P114

Abstract:

Natural Language Processing (NLP), a field at the intersection of linguistics and artificial intelligence, aims to equip machines with the ability to understand, interpret, and generate human-like text. Machine Reading Comprehension (MRC), a vital subset of NLP, involves training a model to understand a given context passage and answer questions about it, approximating human comprehension. Leveraging the SQuAD 2.0 dataset, a standard MRC benchmark, the proposed methodology couples a Multidirectional Transformer architecture with BERT, a pre-trained language representation model, to sharpen the model's grasp of contextual nuance. Tokenization breaks the raw text into smaller units for effective analysis, and the architecture incorporates embedding techniques, sub-string search mechanisms, and data generators to build a comprehensive representation of the input. Masked softmax and permutation techniques applied during training improve the model's robustness, particularly in handling long-range dependencies and paraphrased expressions of the same information. The model achieves an accuracy of 94.00%, an Exact Match of 48.4%, and an F1 score of 60.9882%. Visualizations further confirm the model's comprehension, showing predictions aligned with the actual answers. In essence, this paper presents a comprehensive approach to MRC within the NLP domain, applying these techniques to achieve promising results on SQuAD 2.0.
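To make two concrete pieces of this pipeline tangible, the sketches below illustrate (i) a masked softmax over answer-span logits, which confines probability mass to valid context positions, and (ii) the token-overlap F1 used in SQuAD-style evaluation. These are minimal NumPy and plain-Python sketches written for this summary, not the authors' implementation; the array shapes, variable names, and example values are assumptions.

    import numpy as np

    def masked_softmax(logits, mask):
        # Softmax restricted to valid positions: question and padding
        # tokens (mask == 0) are pushed to -1e9 so they receive
        # effectively zero probability after exponentiation.
        masked = np.where(mask == 1, logits, -1e9)
        exp = np.exp(masked - masked.max())   # subtract max for numerical stability
        return exp / exp.sum()

    # Hypothetical example: a 6-token input where only the last three
    # tokens belong to the context and may begin an answer span.
    logits = np.array([2.0, 1.5, 0.3, 0.9, 2.2, 1.1])
    mask   = np.array([0,   0,   0,   1,   1,   1])
    print(masked_softmax(logits, mask))       # mass only on positions 3-5

The Exact Match and F1 metrics reported above are conventionally computed in SQuAD evaluation by comparing predicted and gold answer strings at the token level; a minimal version of the token-overlap F1 follows.

    def squad_f1(prediction, gold):
        # Token-overlap F1 in the style of the official SQuAD evaluation
        # (answer normalization such as lowercasing and punctuation
        # stripping is omitted here for brevity).
        pred_tokens, gold_tokens = prediction.split(), gold.split()
        common = sum(min(pred_tokens.count(t), gold_tokens.count(t))
                     for t in set(pred_tokens))
        if common == 0:
            return 0.0
        precision = common / len(pred_tokens)
        recall = common / len(gold_tokens)
        return 2 * precision * recall / (precision + recall)

    print(squad_f1("the eiffel tower", "eiffel tower"))   # 0.8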

Keywords:

Machine Reading Comprehension, Question Answering, Natural Language Processing, BERT, SQuAD, Embedding.
