ENGLISH TO BENGALI NEURAL MACHINE TRANSLATION SYSTEM FOR THE AVIATION DOMAIN

Saptarshi Paul

Abstract

Machine translation systems for Indian languages such as Bengali are widely available. Classical machine translation systems involving Bengali exist for the tourism, agriculture, medical and other domains. The performance of these systems is constrained by the linguistic knowledge used to develop their rules. In the recent past, systems using neural machine translation (NMT) have achieved notable results, and well-known organizations such as Google and Microsoft have started using NMT models. In this paper, we explore the design and implementation of a translation system for a domain not previously explored for Bengali: the aviation domain. The system is implemented using a neural machine translation model. To implement it, we used an English to Bengali parallel corpus for the aviation domain, developed specifically for this work. The corpus is unique in that it includes a large number of aviation-specific out-of-vocabulary (OOV) words and phraseologies. We used the previously developed aviation preprocessing tool, E-dictionary and transliteration tool to create the corpus and the system. The trained model generates a machine-translated output file in Bengali. We then apply the aviation phraseology converter and the transliteration tool to this output to obtain a post-processed output. The two versions of the output are compared using n-gram BLEU scores. The results demonstrate that the NMT output with post-processing performs better than the output without it.
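
The comparison of the raw and post-processed outputs by n-gram BLEU can be illustrated with a minimal sketch. This is not the scoring script used in the paper: it assumes a sentence-level score against a single reference, whitespace tokenization, uniform weights up to 4-grams and no smoothing. The function names and the placeholder tokens (`t1` to `t6`, `OOV`) are hypothetical and stand in for Bengali words.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # unsmoothed: any zero precision gives a zero score
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Hypothetical data: the raw output leaves one aviation term untranslated,
# and post-processing is assumed to replace it with the reference term.
reference = "t1 t2 t3 t4 t5 t6"
raw_output = "t1 t2 t3 t4 t5 OOV"
post_processed_output = "t1 t2 t3 t4 t5 t6"

raw_score = bleu(raw_output, reference)
post_score = bleu(post_processed_output, reference)
print(f"raw: {raw_score:.4f}  post-processed: {post_score:.4f}")
```

In this toy case a single unresolved term lowers all four n-gram precisions at once, which is why correcting OOV terms and phraseology after translation can raise the score noticeably. A corpus-level evaluation would aggregate the n-gram counts over all sentences before computing the precisions.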

Article Details

How to Cite
Paul, S. (2020). ENGLISH TO BENGALI NEURAL MACHINE TRANSLATION SYSTEM FOR THE AVIATION DOMAIN. INFOCOMP Journal of Computer Science, 19(2), 78-97. Retrieved from http://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1007
Section
Machine Learning and Computational Intelligence