Gradient Boost algorithms for Modelling Malayalam Poem Syllable Duration
Main Article Content
Abstract
Emulating natural speech has been a top priority ever since research began in the
area of Natural Language Processing (NLP). Text To Speech Synthesis (TTS) consists of several stages,
which include Text Normalization, Syllabification and Unit Selection, Duration Analysis Modelling,
and Prosody Analysis Modelling. Proper syllabification was required earlier when rule-based concatenative synthesis was used as the main method to synthesize speech. Now statistical parametric speech
synthesis is the state of the art. Supervised and unsupervised machine learning frameworks can be used
to model different aspects of speech, such as duration and prosody. The proposed work uses the classical
poem construct Vruta (meter) to identify the features that determine syllable duration. Nineteen features are
extracted from the orthographic representation of the poem according to the Vruta definition. Kakali, Keka,
and Manjari are the Vrutas considered. The contextual features of the syllables and acoustic properties such as the origin of the syllable are also considered when building the feature set. The proposed work
employs Gradient Boost algorithms for modelling the duration of Malayalam poem syllables. All the
models give superior values for the coefficient of determination (R²) compared to other major models.
The simple Gradient Boost Machine (GBM) achieves an R² of 90.723. Similarly, XGBoost gives
90.726, LightBoost yields 90.693, and CatBoost delivers 90.819. The models also exhibit lower values
for the Statistical Error Indicators (SEI): MAE, RMSE, and MAPE.
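To make the modelling and evaluation steps named in the abstract concrete, the following is a minimal sketch of gradient boosting for regression together with the four reported indicators (R², MAE, RMSE, MAPE). It is not the authors' implementation, which relies on GBM, XGBoost, LightBoost, and CatBoost. The sketch assumes squared-error loss, decision stumps as weak learners, a single made-up numeric feature instead of the nineteen Vruta-based features, and synthetic durations in milliseconds. All function names (`fit_stump`, `fit_gbm`, `predict`, `metrics`) are hypothetical.

```python
import math

def fit_stump(xs, residuals):
    """Pick the threshold on a single feature that minimises squared error.
    Assumes xs contains at least two distinct values."""
    best = None
    for t in sorted(set(xs))[:-1]:  # drop the largest value so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # (threshold, left_mean, right_mean)

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting with squared loss: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)          # initial prediction: mean duration
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)

def metrics(y_true, y_pred):
    """Coefficient of determination and the three error indicators (MAPE in percent)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(a - b) for a, b in zip(y_true, y_pred)) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / n,
    }

# Synthetic, purely illustrative data: one feature value per syllable and a
# made-up duration in milliseconds (not taken from the paper).
xs = list(range(20))
ys = [80 + 5 * x + (3 if x % 2 else -3) for x in xs]

model = fit_gbm(xs, ys)
scores = metrics(ys, [predict(model, x) for x in xs])
print(scores)
```

The scores here are computed on the training data only, so they say nothing about generalisation. A real evaluation would use held-out syllables, and the abstract's R² values appear to be on a percentage scale, whereas `metrics` returns R² as a fraction.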
Article Details
Upon receipt of accepted manuscripts, authors will be invited to complete a copyright license to publish the paper. At least the corresponding author must send the signed copyright form for publication. It is a condition of publication that authors grant an exclusive license to the INFOCOMP Journal of Computer Science. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently, and it also allows the article to be disseminated as widely as possible. In assigning the copyright license, authors may use their own material in other publications, provided that the INFOCOMP Journal of Computer Science is acknowledged as the original place of publication.