Publication
CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media
| Abstract: | This paper describes the participation of the CeDRI team in the eRisk 2021 tasks, specifically Task 1: Early Detection of Signs of Pathological Gambling and Task 2: Early Detection of Signs of Self-Harm. The main difference between the two is that the first is a “test only” challenge, where no training data is supplied, whereas the second has labeled data available for training. Both tasks were addressed with the same algorithms, using a custom training set for Task 1 and the provided data for Task 2. The algorithms were a Tf-Idf vectorizer with a Logistic Regression layer, a Word2Vec vectorizer with an LSTM, and a Word2Vec vectorizer with a CNN. All vectorizers and neural networks were trained solely on the training data. As expected, the algorithms did not achieve state-of-the-art results, but the experience prompted reflection on several aspects related to the importance of proper dataset preparation and processing. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
|---|---|
| Main authors: | Lopes, Rui Pedro |
| Subject: | Early risk detection; Tf-Idf; Word2Vec; Recursive neural networks; Dataset heuristics; DL4J |
| Year: | 2021 |
| Country: | Portugal |
| Document type: | conference paper |
| Access type: | open access |
| Associated institution: | Instituto Politécnico de Bragança |
| Language: | English |
| Source: | Biblioteca Digital do IPB |
| _version_ | 1867173270976987136 |
|---|---|
| author | Lopes, Rui Pedro |
| author_facet | Lopes, Rui Pedro |
| author_role | author |
| contributor_name_str_mv | Biblioteca Digital do IPB |
| country_str | PT |
| creators_json_txt | [{\"Person.name\":\"Lopes, Rui Pedro\",\"Person.identifier.orcid\":\"0000-0002-9170-5078\"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Biblioteca Digital do IPB |
| datacite.creators.creator.creatorName.fl_str_mv | Lopes, Rui Pedro |
| datacite.date.Accepted.fl_str_mv | 2021-01-01T00:00:00Z |
| datacite.date.available.fl_str_mv | 2021-10-15T10:28:46Z |
| datacite.date.embargoed.fl_str_mv | 2021-10-15T10:28:46Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| datacite.subjects.subject.fl_str_mv | Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| datacite.titles.title.fl_str_mv | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| dc.contributor.none.fl_str_mv | Biblioteca Digital do IPB |
| dc.creator.none.fl_str_mv | Lopes, Rui Pedro |
| dc.date.Accepted.fl_str_mv | 2021-01-01T00:00:00Z |
| dc.date.available.fl_str_mv | 2021-10-15T10:28:46Z |
| dc.date.embargoed.fl_str_mv | 2021-10-15T10:28:46Z |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/10198/24037 |
| dc.language.none.fl_str_mv | eng |
| dc.publisher.none.fl_str_mv | CEUR Workshop Proceedings |
| dc.rights.cclincense.fl_str_mv | http://creativecommons.org/licenses/by/4.0/ |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.subject.none.fl_str_mv | Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| dc.title.fl_str_mv | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_5794 |
| description | This paper describes the participation of the CeDRI team in the eRisk 2021 tasks, specifically Task 1: Early Detection of Signs of Pathological Gambling and Task 2: Early Detection of Signs of Self-Harm. The main difference between the two is that the first is a “test only” challenge, where no training data is supplied, whereas the second has labeled data available for training. Both tasks were addressed with the same algorithms, using a custom training set for Task 1 and the provided data for Task 2. The algorithms were a Tf-Idf vectorizer with a Logistic Regression layer, a Word2Vec vectorizer with an LSTM, and a Word2Vec vectorizer with a CNN. All vectorizers and neural networks were trained solely on the training data. As expected, the algorithms did not achieve state-of-the-art results, but the experience prompted reflection on several aspects related to the importance of proper dataset preparation and processing. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
| dirty | 0 |
| eu_rights_str_mv | openAccess |
| format | conferencePaper |
| fulltext.url.fl_str_mv | https://bibliotecadigital.ipb.pt/bitstreams/8cf6b603-2460-44a3-b80e-1e3b1b852e23/download |
| id | ipb_64d504cd51fcf6aab522e223eaec602a |
| identifier.url.fl_str_mv | http://hdl.handle.net/10198/24037 |
| instacron_str | ipb |
| institution | Instituto Politécnico de Bragança |
| instname_str | Instituto Politécnico de Bragança |
| language | eng |
| network_acronym_str | ipb |
| network_name_str | Biblioteca Digital do IPB |
| oai_identifier_str | oai:bibliotecadigital.ipb.pt:10198/24037 |
| organization_str_mv | urn:organizationAcronym:ipb |
| person_str_mv | Lopes, Rui Pedro Lopes, Rui Pedro https://www.ciencia-id.pt/8E14-54E4-4DB5 8E14-54E4-4DB5 http://orcid.org/0000-0002-9170-5078 0000-0002-9170-5078 |
| publishDate | 2021 |
| publisher.none.fl_str_mv | CEUR Workshop Proceedings |
| reponame_str | Biblioteca Digital do IPB |
| repository_id_str | urn:repositoryAcronym:ipb |
| service_str_mv | urn:repositoryAcronym:ipb |
| spelling | eng; CEUR Workshop Proceedings; pt_PT; This paper describes the participation of the CeDRI team in the eRisk 2021 tasks, specifically Task 1: Early Detection of Signs of Pathological Gambling and Task 2: Early Detection of Signs of Self-Harm. The main difference between the two is that the first is a “test only” challenge, where no training data is supplied, whereas the second has labeled data available for training. Both tasks were addressed with the same algorithms, using a custom training set for Task 1 and the provided data for Task 2. The algorithms were a Tf-Idf vectorizer with a Logistic Regression layer, a Word2Vec vectorizer with an LSTM, and a Word2Vec vectorizer with a CNN. All vectorizers and neural networks were trained solely on the training data. As expected, the algorithms did not achieve state-of-the-art results, but the experience prompted reflection on several aspects related to the importance of proper dataset preparation and processing. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).; application/pdf; pt_PT; CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media; Personal; Lopes, Rui Pedro; DSpace; http://dspace.org/items/e1e64423-0ec8-46ee-be96-33205c7c98a9; DSpace; http://dspace.org/items/e1e64423-0ec8-46ee-be96-33205c7c98a9; Lopes; Rui Pedro; Ciência ID; https://www.ciencia-id.pt; 8E14-54E4-4DB5; ORCID; http://orcid.org; 0000-0002-9170-5078; HostingInstitution; Organizational; Biblioteca Digital do IPB; e-mail; mailto:dspace@ipb.pt; dspace@ipb.pt; ISSN; IsPartOf; 1613-0073; 2021-10-15T10:28:46Z; 2021; 2021-01-01T00:00:00Z; Handle; http://hdl.handle.net/10198/24037; http://purl.org/coar/access_right/c_abf2; open access; Early risk detection; Tf-Idf; Word2Vec; Recursive neural networks; Dataset heuristics; DL4J; 1743510 bytes; other research product; http://purl.org/coar/resource_type/c_5794; conference paper; 2021; http://creativecommons.org/licenses/by/4.0/; http://purl.org/coar/access_right/c_abf2; application/pdf; fulltext; https://bibliotecadigital.ipb.pt/bitstreams/8cf6b603-2460-44a3-b80e-1e3b1b852e23/download; CEUR Workshop Proceedings; 981991; Bucharest, Romania |
| spellingShingle | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media Lopes, Rui Pedro Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| status | SINGLETON |
| subject.fl_str_mv | Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| title | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| title_full | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| title_fullStr | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| title_full_unstemmed | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| title_short | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| title_sort | CeDRI at eRisk 2021: a naive approach to early detection of psychological disorders in social media |
| topic | Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| topic_facet | Early risk detection Tf-Idf Word2Vec Recursive neural networks Dataset heuristics DL4J |
| url | http://hdl.handle.net/10198/24037 |
| visible | 1 |