61 documents found, page 1 of 7

ACE-2005-PT: corpus for event extraction in Portuguese

Cunha, Luís Filipe; Silvano, Maria da Purificação; Campos, Ricardo; Jorge, Alípio

Event extraction is an NLP task that commonly involves identifying the central word (trigger) for an event and its associated arguments in text. ACE-2005 is widely recognised as the standard corpus in this field. While other corpora, like PropBank, primarily focus on annotating predicate-argument structure, ACE-2005 provides comprehensive information about the overall event structure and semantics. However, its...


BATS-PT: assessing Portuguese masked language models in lexico-semantic analogy...

Oliveira, Hugo Gonçalo; Rodrigues, Ricardo; Ferreira, Bruno; Silvano, Maria da Purificação; Carvalho, Sara

This paper presents BATS-PT, the manual translation of the lexicographic portion of the Bigger Analogy Test Set (BATS) to European Portuguese. BATS-PT covers ten types of lexicosemantic analogies and can be used for assessing word embeddings and language models. Following this, the dataset is showcased while assessing two pretrained language models for Portuguese, BERTimbau and Albertina, in two tasks: analogy ...


TELP - Text Extraction with Linguistic Patterns

Cordeiro, João; Silvano, Maria da Purificação; Leal, António; Pais, Sebastião

Linguistic studies in under-resourced languages pose additional challenges at various levels, including the automatic collection of examples, cases, and corpora construction. Several sophisticated applications, such as GATE (Cunningham, 2002), can be configured/adjusted/programmed by experts to automatically collect examples from the Web in any language. However, these applications are too complex and intricate...


Untangling a web of temporal relations in news articles

Silvano, Maria da Purificação; Amorim, Evelin; Leal, António; Cantante, Inês; Jorge, Alípio; Campos, Ricardo; Yu, Nana

Temporal reasoning has been the focus of several studies during the past years, both in linguistics and computational studies. Although advances on this topic are undeniable, there are still improvements to be made and new avenues to pursue. One relevant problem concerns the temporal ordering of the events, particularly asserting and representing how events are temporally related and how the story told in the n...


Text2Story Lusa: a dataset for narrative analysis in European Portuguese news a...

Nunes, Sérgio Sobral; Jorge, Alípio; Amorim, Evelin; Sousa, Hugo; Leal, António; Silvano, Maria da Purificação; Cantante, Inês; Campos, Ricardo

Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the...


ISO 24617-8 applied: insights from multilingual discourse relations annotation ...

Tomaszewska, Aleksandra; Silvano, Maria da Purificação; Leal, António; Amorim, Evelin

The main objective of this study is to contribute to multilingual discourse research by employing ISO-24617 Part 8 (Semantic Relations in Discourse, Core Annotation Schema - DR-core) for annotating discourse relations. Centering around a parallel discourse relations corpus that includes English, Polish, and European Portuguese, we initiate one of the few ISO-based comparative analyses through a multilingual cor...


MultiLexBATS: multilingual dataset of lexical semantic relations

Gromann, Dagmar; Silvano, Maria da Purificação

Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has either focused on analysing lexical semantic relations in word embeddings or probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it a...


Multilinguality and LLOD: a survey across linguistic description levels

Gromann, Dagmar; Silvano, Maria da Purificação

Limited accessibility to language resources and technologies represents a challenge for the analysis, preservation, and documentation of natural languages other than English. Linguistic Linked (Open) Data (LLOD) holds the promise to ease the creation, linking, and reuse of multilingual linguistic data across distributed and heterogeneous resources. However, individual language resources and technologies accommo...


Como-gerund clauses in European Portuguese: figuring out the riddle

Leal, António; Lobo, Maria; Silvano, Maria da Purificação

Previous literature on the typology of gerund clauses in Portuguese has overlooked a peculiar type of clauses which are always introduced by como ('as') and display an array of characteristics that set them apart from all other gerund clauses (and from other, somehow similar, constructions in different languages). In this paper, we provide an in-depth syntactic and semantic characterisation of these como-gerund...


Overview of the CLEF-2024 CheckThat! Lab Task 3 on persuasion techniques

Piskorski, Jakub; Jorge, Alípio; Silvano, Maria da Purificação; Guimarães, Nuno; Pacheco, Ana Filipa; Yu, Nana

We present an overview of CheckThat! Lab's 2024 Task 3, which focuses on detecting 23 persuasion techniques at the text-span level in online media. The task covers five languages, namely, Arabic, Bulgarian, English, Portuguese, and Slovene, and highly-debated topics in the media, e.g., the Israeli-Palestinian conflict, the Russia-Ukraine war, climate change, COVID-19, abortion, etc. A total of 23 teams registere...

