Publication

The CNG corpus of European Portuguese children's speech

Bibliographic details
Abstract:	Speech recognisers trained with adults' speech do not work well with children's speech because of the inherent acoustic and linguistic differences in the speech of these two populations. To develop speech-driven applications capable of successfully recognising children's speech, a sufficient amount of children's speech is needed for training acoustic models from scratch or for adapting acoustic models trained with adults' speech. However, the availability of suitable children's speech corpora is still limited, especially in the case of less-spoken languages. This paper describes the design, collection, transcription and annotation of a 21-hour corpus of prompted European Portuguese children's speech collected from 510 children aged 3-10. Before the development of this corpus, European Portuguese children's speech data had not been available at all for parts of this age range.
Main authors:	Dias, José Miguel de Oliveira Monteiro Sales
Subject:	automatic speech recognition; children's speech; corpus; European Portuguese; prompted speech
Year:	2013
Country:	Portugal
Document type:	book chapter
Access type:	embargoed access
Associated institution:	ISCTE
Language:	English
Source:	Repositório ISCTE
