Publication
Prosodic Classification of Discourse Markers
| Abstract: | The first contribution of this study is the description of the prosodic behavior of discourse markers present in two speech corpora of European Portuguese (EP) in different domains (university lectures, and map-task dialogues). The second contribution is a multiclass classification to verify, given their prosodic features, which words in both corpora are classified as discourse markers, which are disfluencies, and which correspond to words that are neither markers nor disfluencies (chunks). Our goal is to automatically predict discourse markers and include them in rich transcripts, along with other structural metadata events (e.g., disfluencies and punctuation marks) that are already encompassed in the language models of our in-house speech recognizer. Results show that the automatic classification of discourse markers is better for the lectures corpus (87%) than for the dialogue corpus (84%). Nonetheless, in both corpora, discourse markers are more easily confused with chunks than with disfluencies. |
|---|---|
| Main authors: | Cabarrão, Vera |
| Other authors: | Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| Subject: | Discourse markers; Prosody; Lectures; Dialogues; Structural Metadata Events |
| Year: | 2015 |
| Country: | Portugal |
| Document type: | article |
| Access type: | open access |
| Associated institution: | Universidade de Lisboa |
| Language: | English |
| Source: | Repositório da Universidade de Lisboa |
| _version_ | 1865920789847474176 |
|---|---|
| author | Cabarrão, Vera |
| author2 | Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| author2_role | author author author author author author |
| author_facet | Cabarrão, Vera; Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| author_role | author |
| contributor_name_str_mv | Repositório Científico de Acesso Aberto da ULisboa |
| country_str | PT |
| creators_json_str | [{"Person.name":"Cabarrão, Vera"},{"Person.name":"Moniz, Helena"},{"Person.name":"Ferreira, Jaime"},{"Person.name":"Batista, Fernando"},{"Person.name":"Trancoso, Isabel"},{"Person.name":"Mata, Ana Isabel"},{"Person.name":"Curto, Sérgio"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Repositório Científico de Acesso Aberto da ULisboa |
| datacite.creators.creator.creatorName.fl_str_mv | Cabarrão, Vera; Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| datacite.date.Accepted.fl_str_mv | 2015-01-01T00:00:00Z |
| datacite.date.available.fl_str_mv | 2018-01-28T15:07:13Z |
| datacite.date.embargoed.fl_str_mv | 2018-01-28T15:07:13Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| datacite.subjects.subject.fl_str_mv | Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| datacite.titles.title.fl_str_mv | Prosodic Classification of Discourse Markers |
| dc.contributor.none.fl_str_mv | Repositório Científico de Acesso Aberto da ULisboa |
| dc.creator.none.fl_str_mv | Cabarrão, Vera; Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| dc.date.Accepted.fl_str_mv | 2015-01-01T00:00:00Z |
| dc.date.available.fl_str_mv | 2018-01-28T15:07:13Z |
| dc.date.embargoed.fl_str_mv | 2018-01-28T15:07:13Z |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/10451/31083 |
| dc.language.none.fl_str_mv | eng |
| dc.publisher.none.fl_str_mv | International Phonetic Association |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.subject.none.fl_str_mv | Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| dc.title.fl_str_mv | Prosodic Classification of Discourse Markers |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_6501 |
| dirty | 0 |
| eu_rights_str_mv | openAccess |
| format | article |
| fulltext.url.fl_str_mv | https://repositorio.ulisboa.pt/bitstreams/6c51043f-0500-4b70-b814-1aec30c158c1/download |
| funding.funder.alternateName_str_mv | FCT; EC |
| funding.funder.identifier_str_mv | http://doi.org/10.13039/501100001871; http://doi.org/10.13039/501100008530 |
| funding.funder.name_str_mv | Fundação para a Ciência e a Tecnologia; European Commission |
| funding.name_str_mv | FP7 |
| id | ul_451f35871fa4b6097ca22efbcb71192c |
| identifier.url.fl_str_mv | http://hdl.handle.net/10451/31083 |
| instacron_str | ul |
| institution | Universidade de Lisboa |
| instname_str | Universidade de Lisboa |
| language | eng |
| network_acronym_str | ul |
| network_name_str | Repositório da Universidade de Lisboa |
| oai_identifier_str | oai:repositorio.ulisboa.pt:10451/31083 |
| organization_str_mv | urn:organizationAcronym:ul |
| person_str_mv | Cabarrão, Vera; Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio |
| publishDate | 2015 |
| publisher.none.fl_str_mv | International Phonetic Association |
| reponame_str | Repositório da Universidade de Lisboa |
| repository_id_str | urn:repositoryAcronym:ul |
| service_str_mv | urn:repositoryAcronym:ul |
| spelling | eng; International Phonetic Association; pt_PT; The first contribution of this study is the description of the prosodic behavior of discourse markers present in two speech corpora of European Portuguese (EP) in different domains (university lectures, and map-task dialogues). The second contribution is a multiclass classification to verify, given their prosodic features, which words in both corpora are classified as discourse markers, which are disfluencies, and which correspond to words that are neither markers nor disfluencies (chunks). Our goal is to automatically predict discourse markers and include them in rich transcripts, along with other structural metadata events (e.g., disfluencies and punctuation marks) that are already encompassed in the language models of our in-house speech recognizer. Results show that the automatic classification of discourse markers is better for the lectures corpus (87%) than for the dialogue corpus (84%). Nonetheless, in both corpora, discourse markers are more easily confused with chunks than with disfluencies.; application/pdf; pt_PT; Prosodic Classification of Discourse Markers; Cabarrão, Vera; Moniz, Helena; Ferreira, Jaime; Batista, Fernando; Trancoso, Isabel; Mata, Ana Isabel; Curto, Sérgio; HostingInstitution; Organizational; Repositório Científico de Acesso Aberto da ULisboa; e-mail; mailto:repositorio@reitoria.ulisboa.pt; repositorio@reitoria.ulisboa.pt; 2018-01-28T15:07:13Z; 2015; 2015-01-01T00:00:00Z; Handle; http://hdl.handle.net/10451/31083; http://purl.org/coar/access_right/c_abf2; open access; Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events; 191026 bytes; Fundação para a Ciência e a Tecnologia; ESTRATÉGIAS DE FEEDBACK E DE ENTRAINMENT EM DIÁLOGO E SUA APLICAÇÃO EM SISTEMAS AUTOMÁTICOS; Crossref Funder ID; http://doi.org/10.13039/501100001871; European Commission; Spoken Dialogue Analytics; FP7; Crossref Funder ID; http://doi.org/10.13039/501100008530; literature; http://purl.org/coar/resource_type/c_6501; journal article; http://purl.org/coar/access_right/c_abf2; application/pdf; fulltext; https://repositorio.ulisboa.pt/bitstreams/6c51043f-0500-4b70-b814-1aec30c158c1/download; International Congress of Phonetic Sciences (ICPhS 2015); Glasgow |
| spellingShingle | Prosodic Classification of Discourse Markers; Cabarrão, Vera; Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| status | SINGLETON |
| subject.fl_str_mv | Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| title | Prosodic Classification of Discourse Markers |
| title_full | Prosodic Classification of Discourse Markers |
| title_fullStr | Prosodic Classification of Discourse Markers |
| title_full_unstemmed | Prosodic Classification of Discourse Markers |
| title_short | Prosodic Classification of Discourse Markers |
| title_sort | Prosodic Classification of Discourse Markers |
| topic | Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| topic_facet | Discourse markers; Prosódia; Lectures; Dialogues; Structural Metadata Events |
| url | http://hdl.handle.net/10451/31083 |
| visible | 1 |