Publication

CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media

Bibliographic details
Abstract: This paper describes the participation of the CeDRI team in two eRisk 2021 tasks: Task 1, Early Detection of Signs of Pathological Gambling, and Task 2, Early Detection of Signs of Self-Harm. The main difference between them is that the first is a “test only” challenge, for which no training data is supplied, whereas the second provides labeled data that can be used for training. Both tasks were addressed with the same algorithms, using a custom training set for Task 1 and the provided data for Task 2. The algorithms were a Tf-Idf vectorizer with a Logistic Regression layer, a Word2Vec vectorizer with an LSTM, and a Word2Vec vectorizer with a CNN. All vectorizers and neural networks were trained solely on the training data. As expected, the algorithms did not achieve state-of-the-art results, but the experience allowed the team to reflect on several aspects related to the importance of proper dataset preparation and processing. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Main authors: Lopes, Rui Pedro
Subject: Early risk detection; Tf-Idf; Word2Vec; Recursive neural networks; Dataset heuristics; DL4J
Year: 2021
Country: Portugal
Document type: conference paper
Access type: open access
Associated institution: Instituto Politécnico de Bragança
Language: English
Source: Biblioteca Digital do IPB
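
The abstract describes a Tf-Idf vectorizer feeding a Logistic Regression layer, with the vectorizer fitted only on the training data. The following is a minimal sketch of that general idea in plain Python, not the author's implementation (the record lists DL4J as a keyword, which suggests a Java toolchain). The toy posts, labels, function names, learning rate, and epoch count are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized training docs only."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vocab = {tok: i for i, tok in enumerate(sorted(df))}
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    return vocab, idf


def transform(doc, vocab, idf):
    """Map one tokenized doc to an L2-normalized Tf-Idf vector; unseen tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = count * idf[tok]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # gradient of log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


# Invented toy posts (hypothetical data): 1 = at-risk wording, 0 = neutral.
train_docs = [
    "lost money again at the casino last night".split(),
    "placing another bet to win it back".split(),
    "went hiking with friends this weekend".split(),
    "cooking dinner and watching a movie".split(),
]
labels = [1, 1, 0, 0]

vocab, idf = fit_tfidf(train_docs)  # fitted solely on the training set
X_train = [transform(d, vocab, idf) for d in train_docs]
w, b = train_logreg(X_train, labels)

p_risk = predict_proba(transform("another bet at the casino".split(), vocab, idf), w, b)
p_neutral = predict_proba(transform("watching a movie with friends".split(), vocab, idf), w, b)
print(f"risk-like post: {p_risk:.2f}, neutral post: {p_neutral:.2f}")
```

Fitting the vocabulary and IDF weights on the training documents alone means that words appearing only in test posts contribute nothing to the feature vector, which is one reason the choice of training set matters so much in a "test only" setting such as Task 1.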
_version_ 1867173270976987136
author Lopes, Rui Pedro
author_facet Lopes, Rui Pedro
author_role author
contributor_name_str_mv Biblioteca Digital do IPB
country_str PT
creators_json_txt [{\"Person.name\":\"Lopes, Rui Pedro\",\"Person.identifier.orcid\":\"0000-0002-9170-5078\"}]
datacite.contributors.contributor.contributorName.fl_str_mv Biblioteca Digital do IPB
datacite.creators.creator.creatorName.fl_str_mv Lopes, Rui Pedro
datacite.date.Accepted.fl_str_mv 2021-01-01T00:00:00Z
datacite.date.available.fl_str_mv 2021-10-15T10:28:46Z
datacite.date.embargoed.fl_str_mv 2021-10-15T10:28:46Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_abf2
datacite.subjects.subject.fl_str_mv Early risk detection
Tf-Idf
Word2Vec
Recursive neural networks
Dataset heuristics
DL4J
datacite.titles.title.fl_str_mv CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media
dc.contributor.none.fl_str_mv Biblioteca Digital do IPB
dc.creator.none.fl_str_mv Lopes, Rui Pedro
dc.date.Accepted.fl_str_mv 2021-01-01T00:00:00Z
dc.date.available.fl_str_mv 2021-10-15T10:28:46Z
dc.date.embargoed.fl_str_mv 2021-10-15T10:28:46Z
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://hdl.handle.net/10198/24037
dc.language.none.fl_str_mv eng
dc.publisher.none.fl_str_mv CEUR Workshop Proceedings
dc.rights.cclincense.fl_str_mv http://creativecommons.org/licenses/by/4.0/
dc.rights.none.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.subject.none.fl_str_mv Early risk detection
Tf-Idf
Word2Vec
Recursive neural networks
Dataset heuristics
DL4J
dc.title.fl_str_mv CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_5794
dirty 0
eu_rights_str_mv openAccess
format conferencePaper
fulltext.url.fl_str_mv https://bibliotecadigital.ipb.pt/bitstreams/8cf6b603-2460-44a3-b80e-1e3b1b852e23/download
id ipb_64d504cd51fcf6aab522e223eaec602a
identifier.url.fl_str_mv http://hdl.handle.net/10198/24037
instacron_str ipb
institution Instituto Politécnico de Bragança
instname_str Instituto Politécnico de Bragança
language eng
network_acronym_str ipb
network_name_str Biblioteca Digital do IPB
oai_identifier_str oai:bibliotecadigital.ipb.pt:10198/24037
organization_str_mv urn:organizationAcronym:ipb
person_str_mv Lopes, Rui Pedro
Lopes, Rui Pedro
https://www.ciencia-id.pt/8E14-54E4-4DB5
8E14-54E4-4DB5
http://orcid.org/0000-0002-9170-5078
0000-0002-9170-5078
publishDate 2021
publisher.none.fl_str_mv CEUR Workshop Proceedings
reponame_str Biblioteca Digital do IPB
repository_id_str urn:repositoryAcronym:ipb
service_str_mv urn:repositoryAcronym:ipb
spelling eng
CEUR Workshop Proceedings
application/pdf
Personal: Lopes, Rui Pedro
DSpace: http://dspace.org/items/e1e64423-0ec8-46ee-be96-33205c7c98a9
Ciência ID: https://www.ciencia-id.pt 8E14-54E4-4DB5
ORCID: http://orcid.org 0000-0002-9170-5078
HostingInstitution (Organizational): Biblioteca Digital do IPB
e-mail: mailto:dspace@ipb.pt
ISSN (IsPartOf): 1613-0073
2021-10-15T10:28:46Z
2021
2021-01-01T00:00:00Z
Handle: http://hdl.handle.net/10198/24037
http://purl.org/coar/access_right/c_abf2 (open access)
1743510 bytes
other research product
http://purl.org/coar/resource_type/c_5794 (conference paper)
http://creativecommons.org/licenses/by/4.0/
fulltext: https://bibliotecadigital.ipb.pt/bitstreams/8cf6b603-2460-44a3-b80e-1e3b1b852e23/download
981991
Bucharest, Romania
spellingShingle CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media
Lopes, Rui Pedro
Early risk detection
Tf-Idf
Word2Vec
Recursive neural networks
Dataset heuristics
DL4J
status SINGLETON
subject.fl_str_mv Early risk detection
Tf-Idf
Word2Vec
Recursive neural networks
Dataset heuristics
DL4J
title CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media
topic Early risk detection
Tf-Idf
Word2Vec
Recursive neural networks
Dataset heuristics
DL4J
url http://hdl.handle.net/10198/24037
visible 1