Bhaavana: A Novel and Comprehensive Hindi Poetry Classifier Based on Emotions

Main Article Content

Kaushika Pal
JatinderKumar Saini


Emotions are the essence of humanity, and they lead to various sensations in human beings. In traditional Indian literature, these complex emotions are represented through the notion of ‘Rasa’ (‘रस’, meaning emotion). For the current research, five such ‘Rasa’, namely ‘Hasya’ (‘हास्य’, comic), ‘Karuna’ (‘करुणा’, compassion), ‘Shanta’ (‘शांत’, calmness), ‘Shringar’ (‘श्रृंगार’, romance) and ‘Veera’ (‘साहस’, courage), have been used to design a classifier called ‘Bhaavana’ (‘भावना’, emotion) for Hindi poetry. Technically, this is a Natural Language Processing (NLP) quinary (i.e., five-category) classification task, and we make use of various sub-tasks including Pre-processing, Tokenization, Stemming, Bag-of-Words (BOW), Feature Extraction, and Part-Of-Speech (POS) tagging. Three types of linguistic features, namely Lexical features (LEX), Syntactic features comprising Part-of-Speech (POS) (i.e., LEX+POS), and Emotion-Specific Features (ESF), have been deployed towards the aim of designing an automatic Hindi Poetry Classifier. A corpus of more than 800 poems covering these 5 emotions and comprising more than 1,000,00 words has been processed to obtain a lexical feature set comprising more than 73,000 unique unigrams. Additionally, Highest Rank Features (HRF) have been identified and experimented with in combination with LEX, LEX+POS, and ESF. The Machine Learning (ML) algorithms used are Gaussian Naïve Bayes (GNB), Multinomial Naïve Bayes (MNB), Neural Network (NN), and Support Vector Machine (SVM), and experimentation results with LEX, LEX+HRF, LEX+POS, LEX+POS+HRF, and ESF+HRF for each ML algorithm are presented. These results are further fortified by the use of Frequency Distribution (FD), Term Frequency (TF), and Term Frequency-Inverse Document Frequency (TF-IDF) during the experimentation. It is concluded that LEX+HRF is the best feature set, FD is the best weighting method, and MNB is the best algorithm. These are respectively followed by ESF+HRF and LEX+POS+HRF. Averaged over k-fold cross-validation, the best performance is 71.09%. K-fold cross-validation experiments show that ESF+HRF is a more stable feature set, giving stable results across various
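
The three weighting methods named in the abstract (FD, TF, TF-IDF) and the Multinomial Naïve Bayes algorithm can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy corpus, the transliterated tokens, the function names, the Laplace smoothing constant `alpha`, and the `log(N / df)` form of the inverse document frequency are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of (tokens, rasa label) pairs; not data from the paper.
POEMS = [
    (["hansi", "mazaak", "hansi"], "Hasya"),
    (["mazaak", "khushi", "hansi"], "Hasya"),
    (["yuddh", "saahas", "veer"], "Veera"),
    (["veer", "yuddh", "yuddh"], "Veera"),
]

def frequency_distribution(tokens):
    """FD: raw count of each unigram in one poem."""
    return Counter(tokens)

def term_frequency(tokens):
    """TF: unigram count normalised by the poem's length."""
    n = len(tokens)
    return {t: c / n for t, c in Counter(tokens).items()}

def tf_idf(tokens, corpus):
    """TF-IDF with idf = log(N / df); assumes the poem itself is in `corpus`."""
    n_docs = len(corpus)
    weights = {}
    for term, tf in term_frequency(tokens).items():
        df = sum(1 for doc in corpus if term in doc)  # poems containing the term
        weights[term] = tf * math.log(n_docs / df)
    return weights

def train_mnb(poems, alpha=1.0):
    """Multinomial Naive Bayes over FD counts with Laplace smoothing (assumed)."""
    vocab = {t for tokens, _ in poems for t in tokens}
    docs_per_class = Counter(label for _, label in poems)
    counts_per_class = defaultdict(Counter)
    for tokens, label in poems:
        counts_per_class[label].update(frequency_distribution(tokens))
    model = {}
    for label, counts in counts_per_class.items():
        total = sum(counts.values()) + alpha * len(vocab)
        log_prior = math.log(docs_per_class[label] / len(poems))
        log_like = {t: math.log((counts[t] + alpha) / total) for t in vocab}
        model[label] = (log_prior, log_like)
    return model

def predict_mnb(model, tokens):
    """Return the label with the highest log-posterior; unseen tokens are skipped."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)

corpus = [tokens for tokens, _ in POEMS]
model = train_mnb(POEMS)
print(predict_mnb(model, ["yuddh", "veer"]))  # prints "Veera"
```

In this sketch the classifier is trained on FD counts, the weighting the abstract reports as best; swapping in `term_frequency` or `tf_idf` weights would require a model that accepts fractional counts.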


Article Details

How to Cite
Pal, K., & Saini, J. (2024). Bhaavana: A Novel and Comprehensive Hindi Poetry Classifier Based on Emotions. INFOCOMP Journal of Computer Science, 23(1). Retrieved from
Machine Learning and Computational Intelligence

