Music Genre Classification Using Timbral Feature Fusion on i-vector Framework

Rajeev Rajan
Harishanker G.
Athirasree C.A
Haritha S.M.


A method for automatic music genre classification based on the fusion of high-level and low-level timbral descriptors is proposed. The high-level features, namely i-vectors, are computed from a mel-frequency cepstral coefficient (MFCC) and Gaussian mixture model (GMM) framework. The low-level timbral descriptors, namely MFCC, modified group delay features (MODGDF), and a timbral feature set, are also computed from the audio files. Initially, the experiment is performed using i-vectors alone. Later, the low-level timbral features are appended to the high-level i-vector features to form a high-dimensional feature vector (55 dimensions). Support vector machine (SVM) and deep neural network (DNN) based classifiers are employed for the experiment. The performance is evaluated on five genres of the GTZAN dataset. With high-level i-vector features alone, the baseline SVM and DNN-based classifiers report average classification accuracies of 79.30% and 80.67%, respectively. A further improvement of 9% in performance is observed when the low-level timbral descriptors are fused with the i-vectors in both the SVM and DNN frameworks. The results demonstrate the potential of timbral feature fusion in the music genre classification task.
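
The fusion step described in the abstract, appending low-level descriptors to each clip's i-vector, can be illustrated with a minimal sketch. The abstract does not give the implementation, so everything here is an assumption: the function names `zscore_columns` and `fuse_features` are hypothetical, the per-column standardization is an added choice that the abstract does not mention, and the toy dimensions (4 + 3 + 2) are purely illustrative and do not reflect how the paper's 55 dimensions are split.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list-of-rows matrix (zero mean, unit variance)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid division by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def fuse_features(ivectors, *low_level_blocks):
    """Concatenate per-clip i-vectors with per-clip low-level descriptors.

    Each block is a list with one feature row per audio clip. Standardizing
    each block first is an assumption of this sketch, not a step confirmed
    by the abstract.
    """
    blocks = [ivectors, *low_level_blocks]
    n_clips = len(ivectors)
    if any(len(block) != n_clips for block in blocks):
        raise ValueError("every feature block needs one row per clip")
    normalized = [zscore_columns(block) for block in blocks]
    # Early fusion: join the rows of all blocks clip by clip.
    return [[x for block in normalized for x in block[i]] for i in range(n_clips)]


# Toy data for three clips; values and dimensions are made up for illustration.
ivectors = [[0.1, 0.4, -0.2, 0.3], [0.5, -0.1, 0.0, 0.2], [-0.3, 0.2, 0.1, -0.4]]
mfcc_stats = [[12.0, -3.1, 0.8], [10.5, -2.7, 1.1], [11.2, -3.5, 0.4]]
modgdf_stats = [[0.02, 0.15], [0.05, 0.11], [0.03, 0.19]]

fused = fuse_features(ivectors, mfcc_stats, modgdf_stats)
print(len(fused), len(fused[0]))
```

The fused rows would then be passed to a classifier such as an SVM or a DNN, as in the experiments reported above. Standardizing before concatenation keeps descriptors with large numeric ranges from dominating distance-based or margin-based classifiers, which is why it is included in this sketch.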

Article Details

How to Cite
Rajan, R., Harishanker G., Athirasree C.A, & Haritha S.M. (2021). Music Genre Classification Using Timbral Feature Fusion on i-vector Framework. INFOCOMP Journal of Computer Science, 20(2). Retrieved from
Machine Learning and Computational Intelligence