ENGLISH TO BENGALI NEURAL MACHINE TRANSLATION SYSTEM FOR THE AVIATION DOMAIN

Published: Dec 1, 2020

Abstract

Machine translation systems for Indian languages such as Bengali are common. Classical machine translation systems involving Bengali are available for the tourism, agriculture, medical and other domains. The performance of these systems is constrained by the linguistic knowledge used to develop their rules. In the recent past, notable results have been achieved by systems using neural machine translation (NMT). Well-known organizations like Google and Microsoft have started using NMT models. In this paper, we explore the design and implementation of a translation system for a previously unexplored domain in Bengali, the aviation domain. It is implemented using a neural machine translation model. To implement it, we used an English-to-Bengali parallel corpus for the aviation domain, developed specifically for this implementation. The corpus is unique in that it includes a large number of aviation-specific out-of-vocabulary (OOV) words and phraseologies. We used the previously developed aviation preprocessing tool, E-dictionary and transliteration tool to create the corpus and the system. Ultimately, we obtain a model that generates our machine-translated output file in Bengali. We then apply the aviation phraseology converter and transliteration tool to this output to obtain a post-processed output. The two versions of the output are compared using n-gram BLEU scores. The results demonstrate that the NMT output with post-processing achieves better scores.
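The n-gram BLEU comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: it is a plain single-reference, sentence-level BLEU (clipped n-gram precision with a brevity penalty, no smoothing), and the function names and the English placeholder tokens are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU with a single reference (illustrative only)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_avg)


# Hypothetical placeholder sentences (English tokens stand in for Bengali output).
reference = "cleared for takeoff runway two seven".split()
raw_output = "clear for takeoff runway two seven".split()
post_processed = "cleared for takeoff runway two seven".split()

raw_score = bleu(raw_output, reference, max_n=2)
post_score = bleu(post_processed, reference, max_n=2)
```

In this toy case the post-processed sentence matches the reference exactly, so it scores higher than the raw output; in practice, a corpus-level implementation from an established toolkit would normally be used.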

How to Cite

ENGLISH TO BENGALI NEURAL MACHINE TRANSLATION SYSTEM FOR THE AVIATION DOMAIN. (2020). INFOCOMP Journal of Computer Science, 19(2), 78–97. Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1007

Issue

Vol. 19 No. 2 (2020): December 2020

Section

Machine Learning and Computational Intelligence

Upon receipt of accepted manuscripts, authors will be invited to complete a copyright license to publish the paper. At least the corresponding author must send the signed copyright form for publication. It is a condition of publication that authors grant an exclusive license to the INFOCOMP Journal of Computer Science. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently, and it also allows the article to be disseminated as widely as possible. In assigning the copyright license, authors may use their own material in other publications, provided that the INFOCOMP Journal of Computer Science is acknowledged as the original place of publication.
