3 documents found, page 1 of 1

Development and evaluation of a NER model in the domain of cultural analysis an...

Sotelo Docío, Susana; Gamallo, Pablo; Iriarte, Álvaro

Named Entity Recognition (NER) is an essential task in information extraction where entities in a text are identified and classified. One of the primary challenges addressed by NER systems is the difficulty of generalizing what was learned to different types of corpora beyond the training data. This problem is magnified by the fact that most of the training corpora used are journalistic and therefore need...

Date: 2023   |   Origin: Linguamática

Towards a morphological analyzer for the Umbundu language

Simões, Alberto; Sacanene, Bernardo; Iriarte, Álvaro; Almeida, José João; Macedo, Joaquim

In this document we present the first developments of an Umbundu dictionary for jSpell, a morphological analyzer. We first comment on the morphology of the Umbundu language, then discuss the structure of jSpell dictionaries and their environment. Finally, we describe the bootstrap process for the Umbundu dictionary and report some final experiments on its coverage.

Date: 2021   |   Origin: CiencIPCA

Procura-PALavras (P-PAL): A web-based interface for a new european portuguese l...

Soares, Ana Paula; Iriarte, Álvaro; Almeida, José João; Simões, Alberto; Costa, Ana; Machado, João; França, Patrícia; Comesaña, Montserrat

In this article, we present Procura-PALavras (P-PAL), a Web-based interface for a new European Portuguese (EP) lexical database. Based on a contemporary printed corpus of over 227 million words, P-PAL provides a broad range of word attributes and statistics, including several measures of word frequency (e.g., raw counts, per-million word frequency, logarithmic Zipf scale), morpho-syntactic information (e.g., pa...

Date: 2021   |   Origin: CiencIPCA
