Publication

Prosodic Classification of Discourse Markers

Bibliographic details
Abstract: The first contribution of this study is the description of the prosodic behavior of discourse markers present in two speech corpora of European Portuguese (EP) in different domains (university lectures and map-task dialogues). The second contribution is a multiclass classification to verify, given their prosodic features, which words in both corpora are classified as discourse markers, which are disfluencies, and which correspond to words that are neither markers nor disfluencies (chunks). Our goal is to automatically predict discourse markers and include them in rich transcripts, along with other structural metadata events (e.g., disfluencies and punctuation marks) that are already encompassed in the language models of our in-house speech recognizer. Results show that the automatic classification of discourse markers is better for the lectures corpus (87%) than for the dialogue corpus (84%). Nonetheless, in both corpora, discourse markers are more easily confused with chunks than with disfluencies.
Main author: Cabarrão, Vera
Other authors: Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio
Subject: Discourse markers; Prosody; Lectures; Dialogues; Structural Metadata Events
Year: 2015
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade de Lisboa
Language: English
Source: Repositório da Universidade de Lisboa
_version_ 1865920789847474176
author Cabarrão, Vera
author2 Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
author2_role author
author
author
author
author
author
author_facet Cabarrão, Vera
Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
author_role author
contributor_name_str_mv Repositório Científico de Acesso Aberto da ULisboa
country_str PT
creators_json_str [{"Person.name":"Cabarrão, Vera"},{"Person.name":"Moniz, Helena"},{"Person.name":"Ferreira, Jaime"},{"Person.name":"Batista, Fernando"},{"Person.name":"Trancoso, Isabel"},{"Person.name":"Mata, Ana Isabel"},{"Person.name":"Curto, Sérgio"}]
datacite.contributors.contributor.contributorName.fl_str_mv Repositório Científico de Acesso Aberto da ULisboa
datacite.creators.creator.creatorName.fl_str_mv Cabarrão, Vera
Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
datacite.date.Accepted.fl_str_mv 2015-01-01T00:00:00Z
datacite.date.available.fl_str_mv 2018-01-28T15:07:13Z
datacite.date.embargoed.fl_str_mv 2018-01-28T15:07:13Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_abf2
datacite.subjects.subject.fl_str_mv Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
datacite.titles.title.fl_str_mv Prosodic Classification of Discourse Markers
dc.contributor.none.fl_str_mv Repositório Científico de Acesso Aberto da ULisboa
dc.creator.none.fl_str_mv Cabarrão, Vera
Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
dc.date.Accepted.fl_str_mv 2015-01-01T00:00:00Z
dc.date.available.fl_str_mv 2018-01-28T15:07:13Z
dc.date.embargoed.fl_str_mv 2018-01-28T15:07:13Z
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://hdl.handle.net/10451/31083
dc.language.none.fl_str_mv eng
dc.publisher.none.fl_str_mv International Phonetic Association
dc.rights.none.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.subject.none.fl_str_mv Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
dc.title.fl_str_mv Prosodic Classification of Discourse Markers
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_6501
dirty 0
eu_rights_str_mv openAccess
format article
fulltext.url.fl_str_mv https://repositorio.ulisboa.pt/bitstreams/6c51043f-0500-4b70-b814-1aec30c158c1/download
funding.funder.alternateName_str_mv FCT
EC
funding.funder.identifier_str_mv http://doi.org/10.13039/501100001871
http://doi.org/10.13039/501100008530
funding.funder.name_str_mv Fundação para a Ciência e a Tecnologia
European Commission
funding.name_str_mv FP7
id ul_451f35871fa4b6097ca22efbcb71192c
identifier.url.fl_str_mv http://hdl.handle.net/10451/31083
instacron_str ul
institution Universidade de Lisboa
instname_str Universidade de Lisboa
language eng
network_acronym_str ul
network_name_str Repositório da Universidade de Lisboa
oai_identifier_str oai:repositorio.ulisboa.pt:10451/31083
organization_str_mv urn:organizationAcronym:ul
person_str_mv Cabarrão, Vera
Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
publishDate 2015
publisher.none.fl_str_mv International Phonetic Association
reponame_str Repositório da Universidade de Lisboa
repository_id_str urn:repositoryAcronym:ul
service_str_mv urn:repositoryAcronym:ul
spelling eng
International Phonetic Association
pt_PT
application/pdf
pt_PT
Prosodic Classification of Discourse Markers
Cabarrão, Vera
Moniz, Helena
Ferreira, Jaime
Batista, Fernando
Trancoso, Isabel
Mata, Ana Isabel
Curto, Sérgio
HostingInstitution
Organizational
Repositório Científico de Acesso Aberto da ULisboa
e-mail
mailto:repositorio@reitoria.ulisboa.pt
repositorio@reitoria.ulisboa.pt
2018-01-28T15:07:13Z
2015
2015-01-01T00:00:00Z
Handle
http://hdl.handle.net/10451/31083
http://purl.org/coar/access_right/c_abf2
open access
Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
191026 bytes
Fundação para a Ciência e a Tecnologia
ESTRATÉGIAS DE FEEDBACK E DE ENTRAINMENT EM DIÁLOGO E SUA APLICAÇÃO EM SISTEMAS AUTOMÁTICOS
Crossref Funder ID
http://doi.org/10.13039/501100001871
European Commission
Spoken Dialogue Analytics
FP7
Crossref Funder ID
http://doi.org/10.13039/501100008530
literature
http://purl.org/coar/resource_type/c_6501
journal article
http://purl.org/coar/access_right/c_abf2
application/pdf
fulltext
https://repositorio.ulisboa.pt/bitstreams/6c51043f-0500-4b70-b814-1aec30c158c1/download
International Congress of Phonetic Sciences (ICPhS 2015)
Glasgow
spellingShingle Prosodic Classification of Discourse Markers
Cabarrão, Vera
Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
status SINGLETON
subject.fl_str_mv Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
title Prosodic Classification of Discourse Markers
title_full Prosodic Classification of Discourse Markers
title_fullStr Prosodic Classification of Discourse Markers
title_full_unstemmed Prosodic Classification of Discourse Markers
title_short Prosodic Classification of Discourse Markers
title_sort Prosodic Classification of Discourse Markers
topic Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
topic_facet Discourse markers
Prosódia
Lectures
Dialogues
Structural Metadata Events
url http://hdl.handle.net/10451/31083
visible 1