A Neural based Bidirectional MT System to Investigate the Performance of the Low Resource Language pair English-Nepali

Amit Kumar Roy
Bipul Syam Purkayastha
Chanambam Sveta Devi
Saptarshi Paul

Abstract

Machine Translation (MT) was formerly dominated by Statistical Machine Translation (SMT) and Rule-based Machine Translation (RBMT), which have recently been losing ground to newer approaches such as Neural Machine Translation (NMT). Although NMT performs well for resource-rich languages, SMT is preferable for low-resource languages such as Nepali. Nepali, for its part, has unique linguistic attributes, properties and scripts. This paper presents a bidirectional SMT system for Nepali-English, a low-resource language pair. The system is built using a parallel text corpus of more than 17,000 sentences and the open-source Moses toolkit. The automatic evaluation metrics BLEU, F-Score and METEOR are used to assess the effectiveness of the MT system. The system achieved scores of 21.13, 53.32 and 38.29 for English-Nepali and 22.26, 57.52 and 27.81 for Nepali-English, respectively.
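The abstract reports an F-Score but does not say how it was computed. As a rough illustration only, the sketch below computes a sentence-level F1 as the harmonic mean of unigram precision and recall between a system output and a reference translation. The function name `unigram_f_score`, the whitespace tokenization, and the clipped-overlap matching are assumptions made for this example, not the authors' evaluation procedure.

```python
from collections import Counter


def unigram_f_score(candidate: str, reference: str) -> float:
    """Harmonic mean of unigram precision and recall (F1), in [0, 1].

    Illustrative assumption: tokens are split on whitespace, and each
    reference token can be matched at most once (clipped overlap).
    """
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)  # share of output tokens that match
    recall = overlap / len(ref_tokens)      # share of reference tokens covered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    hyp = "the cat sat on mat"
    ref = "the cat sat on the mat"
    # Scaled to 0-100, the range in which MT scores are usually reported.
    print(round(100 * unigram_f_score(hyp, ref), 2))
```

A corpus-level score would typically aggregate the match counts over all sentence pairs before taking the harmonic mean, rather than averaging per-sentence values.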

Article Details

How to Cite
Roy, A., Purkayastha, B. S., Devi, C. S., & Paul, S. (2024). A Neural based Bidirectional MT System to Investigate the Performance of the Low Resource Language pair English-Nepali. INFOCOMP Journal of Computer Science, 23(1). Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/3361
Section
Machine Learning and Computational Intelligence

References

Acharya, P. and Bal, B. K. A comparative study of SMT and NMT: Case study of English-Nepali language pair. In SLTU, pages 90–93, 2018.

Bal, B. K. Structure of Nepali grammar. PAN Localization, Madan Puraskar Pustakalaya, Kathmandu, Nepal, pages 332–396, 2004.

Banerjee, S. and Lavie, A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 65–72, 2005.

Bista, S., Keshari, B., Bhatta, J., and Parajuli, K. Dobhase: Online English to Nepali machine translation system. In Proceedings of the 26th Annual Conference of the Linguistic Society of Nepal, 2005.

Chaudhary, B. K., Bal, B. K., and Baidar, R. Efforts towards developing a Tamang Nepali machine translation system. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 281–286, 2020.

Federico, M., Bertoldi, N., and Cettolo, M. IRSTLM: An open source toolkit for handling large scale language models. In Ninth Annual Conference of the International Speech Communication Association, 2008.

Goutte, C. and Gaussier, E. A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In European Conference on Information Retrieval, pages 345–359. Springer, 2005.

Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., et al. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180. Association for Computational Linguistics, 2007.

Koehn, P. and Knowles, R. Six challenges for neural machine translation. arXiv preprint arXiv:1706.03872, 2017.

Laskar, S. R., Pakray, P., and Bandyopadhyay, S. Neural machine translation: Hindi-Nepali. In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), pages 202–207, 2019.

Och, F. J. and Ney, H. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51, 2003.

Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, 2002.

Paul, A. and Purkayastha, B. S. English to Nepali statistical machine translation system. In Proceedings of the International Conference on Computing and Communication Systems: I3CS 2016, NEHU, Shillong, India, pages 423–431. Springer, 2018.

Shrestha, H. K. Rule based machine translation system in the context of Nepali text to English text. Department of Computer Science, University of Oklahoma, 2005.

www.bible.com. RSV Bible: Revised standard version: Youversion. Last accessed 23 October 2023.

Zhao, Z. The machine translation model. In 2022 5th International Conference on Humanities Education and Social Sciences (ICHESS 2022), pages 2153–2160. Atlantis Press, 2022.