Author(s):
Panda, Renato ; Rocha, Bruno ; Paiva, Rui Pedro
Date: 2013
Persistent Identifier: https://hdl.handle.net/10316/95164
Source: Estudo Geral - Universidade de Coimbra
Project/grant:
info:eu-repo/grantAgreement/FCT/5876-PPCDTI/102185/PT
Subject(s): music emotion recognition; machine learning; regression; standard audio features; melodic features
Description
We propose an approach to the dimensional music emotion recognition (MER) problem that combines standard and melodic audio features. We use the dataset proposed by Yang, which consists of 189 audio clips. From the audio data, 458 standard features and 98 melodic features were extracted. We experimented with several supervised learning and feature selection strategies to evaluate the proposed approach. Employing only standard audio features, the best attained performance was 63.2% for arousal and 35.2% for valence prediction (R² statistics). Combining standard audio with melodic features, results improved to 67.4% and 40.6% for arousal and valence, respectively. To the best of our knowledge, these are the best results attained so far on this dataset.
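
To make the described pipeline concrete, below is a minimal Python sketch using scikit-learn. The SVR regressor, the RFE feature selector, the 10-fold cross-validation setup, and the randomly generated placeholder data are illustrative assumptions rather than the paper's exact configuration; the work experimented with several supervised learning and feature selection strategies. Only the dimensions (189 clips, 458 standard plus 98 melodic features) and the R² evaluation metric come from the abstract.

import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)

# Placeholder data standing in for the Yang dataset:
# 189 clips, 458 standard + 98 melodic features, one arousal value per clip.
X_standard = rng.standard_normal((189, 458))
X_melodic = rng.standard_normal((189, 98))
X = np.hstack([X_standard, X_melodic])   # combined feature set (556 dims)
y_arousal = rng.uniform(-1.0, 1.0, 189)  # dimensional annotations (assumed range)

# Standardize features, select a subset, then fit a regressor.
# RFE with a linear SVR ranks features by weight; an RBF SVR does the
# final prediction. Both model choices are assumptions for illustration.
model = make_pipeline(
    StandardScaler(),
    RFE(estimator=SVR(kernel="linear"), n_features_to_select=100, step=0.1),
    SVR(kernel="rbf", C=1.0),
)

# Report mean R² over 10-fold cross-validation, matching the metric
# reported in the abstract.
scores = cross_val_score(
    model, X, y_arousal, scoring="r2",
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
)
print(f"Arousal R^2: {scores.mean():.3f} +/- {scores.std():.3f}")

On real features, the same pipeline would be run separately for arousal and valence, and the feature selection would typically be tuned per target, since the abstract reports markedly different performance for the two dimensions.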
This work was supported by the MOODetector project (PTDC/EIA-EIA/102185/2008), financed by the Fundação para a Ciência e a Tecnologia (FCT) and the Programa Operacional Temático Factores de Competitividade (COMPETE) - Portugal.