Translating the Language of Aviation. The Development and Detailed analysis of the English-Bengali Aviation Corpus for Machine translation

Saptarshi Paul

Abstract

Recent corpus-based transliteration and translation approaches, such as SMT and NMT models, depend entirely on the parallel corpus. It is the corpus that ultimately decides the Translation Accuracy (TA) of the model. With the regular and common domains largely exhausted, current corpus research covers domains ranging from medicine to aeroscience. The work becomes more interesting when Indian languages are taken up, especially in domains with a technical character such as aeronautics and aviation. With corpora for technical domains in English-Indian language pairs, such as English-Bengali, now emerging, the automatic analysis of such corpora is an aspect that researchers are taking up. Such analysis also helps developers and researchers further improve the quality of the corpus and set new benchmarks for the development of future corpora. This paper deals with the need for, the development of, and a detailed analysis of a bilingual aviation corpus for the English-Bengali language pair.

Article Details

How to Cite
Paul, S. (2022). Translating the Language of Aviation. The Development and Detailed analysis of the English-Bengali Aviation Corpus for Machine translation. INFOCOMP Journal of Computer Science, 21(1). Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1966
Section
Machine Learning and Computational Intelligence