Research Article | Open Access
Volume 13 | Issue 4 | Year 2026 | Article Id. IJECE-V13I4P129 | DOI: https://doi.org/10.14445/23488549/IJECE-V13I4P129

A Transformer–Ensemble Hybrid Framework for Emotion Classification in Low-Resource Kannada Poetry
Smita Girish, Kamalraj R
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 25 Jan 2026 | 25 Feb 2026 | 26 Mar 2026 | 30 Apr 2026 |
Citation:
Smita Girish, Kamalraj R, "A Transformer–Ensemble Hybrid Framework for Emotion Classification in Low-Resource Kannada Poetry," International Journal of Electronics and Communication Engineering, vol. 13, no. 4, pp. 371-383, 2026. Crossref, https://doi.org/10.14445/23488549/IJECE-V13I4P129
Abstract
Emotion classification of literary text remains a challenging NLP task due to metaphorical expressions, implicit affective cues, and semantic ambiguity, particularly in low-resource and morphologically rich languages. Kannada, a primary Dravidian language spoken by over 45 million people, presents additional complexity in poetry, where emotions are often conveyed indirectly through symbolism. Despite this, computational analysis of Kannada poetic texts remains limited. This paper proposes a MuRIL-based Transformer–Ensemble framework for classifying 490 manually annotated Kannada short poems into nine emotion categories: Joy, Peace, Wonder, Courage, Compassion, Anger, Melancholy, Fear, and Disgust. Baseline experiments using traditional machine learning models reveal modest performance, with Support Vector Machine (SVM) achieving 45% accuracy, Naïve Bayes 50%, and Random Forest 55%, indicating their inability to capture contextual and implicit emotional semantics in poetic language. To overcome these limitations, the proposed approach integrates MuRIL contextual embeddings with an ensemble of SVM, Random Forest, and Naïve Bayes classifiers using hard and soft voting strategies. Class imbalance is addressed through a training-only augmentation method, Adaptive Minority Expansion and Overfitting Control (AMEOC), employing synonym replacement, paraphrasing, and back-translation while maintaining semantic integrity. Experimental results show that the proposed framework achieves an overall accuracy of 79% and an F1-score of 0.78, significantly outperforming baseline classifiers, TF-IDF-based models, and a MuRIL-only classifier. The performance progression from classical models to contextual transformer-based ensemble learning demonstrates the effectiveness of ensemble fusion and controlled augmentation for interpreting subtle, metaphor-rich emotions in Kannada poetry.
The proposed framework contributes to computational literary analysis and has potential applications in digital humanities, literary retrieval, and cultural analytics for low-resource Indian languages.
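The fusion step described above can be sketched as follows. This is a minimal illustration of the hard- and soft-voting strategies over the paper's nine emotion classes, assuming the MuRIL embedding extraction and the three base classifiers (SVM, Random Forest, Naïve Bayes) have already been trained upstream; the probability vectors below are made-up stand-ins for their outputs on a single poem, not results from the paper.

```python
import numpy as np

# The nine emotion classes named in the abstract, in a fixed index order.
EMOTIONS = ["Joy", "Peace", "Wonder", "Courage", "Compassion",
            "Anger", "Melancholy", "Fear", "Disgust"]

def soft_vote(prob_rows):
    """Average the class-probability vectors from several classifiers
    and return the index of the highest mean probability."""
    mean = np.mean(prob_rows, axis=0)
    return int(np.argmax(mean))

def hard_vote(predicted_labels):
    """Majority vote over predicted class indices; ties fall to the
    lowest class index, as np.bincount/argmax resolves them."""
    counts = np.bincount(predicted_labels, minlength=len(EMOTIONS))
    return int(np.argmax(counts))

# Hypothetical per-class probabilities from SVM, Random Forest, Naive Bayes
# for one poem (illustrative values only; each row sums to 1).
svm = np.array([0.10, 0.05, 0.05, 0.05, 0.05, 0.05, 0.50, 0.10, 0.05])
rf  = np.array([0.05, 0.10, 0.05, 0.05, 0.05, 0.05, 0.45, 0.15, 0.05])
nb  = np.array([0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.40, 0.25, 0.05])

print(EMOTIONS[soft_vote([svm, rf, nb])])          # Melancholy
print(EMOTIONS[hard_vote([6, 6, 7])])              # Melancholy (2 of 3 votes)
```

Soft voting retains each classifier's confidence before the argmax, which is why it can differ from hard voting when one model is highly confident and the others disagree weakly; the abstract reports experiments with both strategies.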
Keywords
Kannada Poetry, SVM, Naïve Bayes, Random Forest, Emotion Classification, MuRIL, Transformer Ensemble, Low-Resource NLP, Data Augmentation, Voting Classifier.