Author(s):
Panda, Renato; Rocha, Bruno; Paiva, Rui Pedro
Date: 2013
Persistent ID: https://hdl.handle.net/10316/95164
Origin: Estudo Geral - Universidade de Coimbra
Project/scholarship: info:eu-repo/grantAgreement/FCT/5876-PPCDTI/102185/PT
Subject(s): music emotion recognition; machine learning; regression; standard audio features; melodic features
Description
We propose an approach to the dimensional music emotion recognition (MER) problem, combining both standard and melodic audio features. We use the dataset proposed by Yang, which consists of 189 audio clips. From the audio data, 458 standard features and 98 melodic features were extracted. We experimented with several supervised learning and feature selection strategies to evaluate the proposed approach. Employing only standard audio features, the best attained performance was 63.2% and 35.2% for arousal and valence prediction, respectively (R² statistic). Combining standard audio with melodic features, results improved to 67.4% and 40.6% for arousal and valence, respectively. To the best of our knowledge, these are the best results attained so far with this dataset.
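To illustrate the kind of pipeline the abstract describes (supervised regression over extracted audio features, with feature selection, evaluated via the R² statistic), here is a minimal Python sketch. The feature matrix, the annotations, the SVR regressor, and the RFE selector are illustrative assumptions, not the paper's exact method or data; only the dataset dimensions (189 clips, 458 standard + 98 melodic features) come from the abstract.

```python
# Hedged sketch: feature selection + supervised regression scored with R^2.
# Regressor (SVR) and selector (RFE) are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVR
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Placeholder data matching the abstract's dimensions:
# 189 clips, 458 standard + 98 melodic features, continuous arousal labels.
X = rng.standard_normal((189, 458 + 98))
y_arousal = rng.standard_normal(189)

model = make_pipeline(
    StandardScaler(),                                    # scale features
    RFE(SVR(kernel="linear"), n_features_to_select=50),  # feature selection
    SVR(kernel="rbf"),                                   # regression step
)

# Cross-validated R^2, the evaluation metric reported in the abstract.
scores = cross_val_score(model, X, y_arousal, cv=10, scoring="r2")
print(f"mean R^2: {scores.mean():.3f}")
```

The same pipeline would be fit separately for valence; with real annotations instead of placeholder noise, the cross-validated R² scores would be directly comparable to the figures reported above.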
This work was supported by the MOODetector project (PTDC/EIA-EIA/102185/2008), financed by the Fundação para a Ciência e a Tecnologia (FCT) and the Programa Operacional Temático Factores de Competitividade (COMPETE) - Portugal.