Author(s):
Querido, Andreia ; Carvalho, Rita ; Rodrigues, João ; Garcia, Marcos ; Silva, João ; Correia, Catarina ; Rendeiro, Nuno ; Valadas Pereira, Rita ; Campos, Marisa ; Branco, António
Date: 2017
Origin: Revista da Associação Portuguesa de Linguística
Subject(s): distributional semantics; data sets; evaluation; Portuguese
Description:
This paper describes a collection of publicly available data sets for Portuguese that are suitable for evaluating distributional semantics models on lexical similarity tasks and conceptual categorization tasks. The data sets were adapted from English gold-standard test sets, so that any Portuguese distributional semantics model can be evaluated and its results compared with the mainstream results that have been obtained for this language. We also present an online service that showcases some of the functionalities of distributional semantics models.