Publication

Automated cleansing and harmonization of international trade data


Bibliographic details
Abstract: Large volumes of data are becoming increasingly available and can be very valuable for the analysis of different phenomena. These data can originate from multiple sources and be recorded in diverse formats, so they require preliminary scrutiny before they can be used in scientific analyses. This crucial first phase of filtering and cleansing is usually cumbersome and time-consuming, but automated routines can be developed to help researchers. A routine written in R is presented here to screen, harmonize and aggregate international trade data representing the trade flows between countries for specific products; the data cover monthly flows over at least 15 years for most countries. The R script implementing the routine is provided and can be easily adapted to other datasets with similar issues.
• A step-by-step procedure for cleansing and harmonizing international trade data, using the R programming language, is presented.
• Automated routines are very effective in obtaining robust, filtered data inputs for scientific models.
• Spatial and temporal patterns of worldwide trade relations can be explored to enhance our understanding of various associated phenomena.
Main author: Oliveira, Sandra
Other authors: Capinha, César; Rocha, Jorge
Subject: Automated screening; Data harmonization; Time-series analysis; R software
Year: 2021
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade de Lisboa
Language: English
Source: Repositório da Universidade de Lisboa
_version_ 1866810900707540992
contributor_name_str_mv Repositório Científico de Acesso Aberto da ULisboa
country_str PT
creators_json_txt [{"Person.name":"Oliveira, Sandra","Person.identifier.orcid":"0000-0002-6253-4353"},{"Person.name":"Capinha, César","Person.identifier.orcid":"0000-0002-0666-9755"},{"Person.name":"Rocha, Jorge","Person.identifier.orcid":"0000-0002-7228-6330"}]
datacite.date.Accepted.fl_str_mv 2021-01-01T00:00:00Z
datacite.date.available.fl_str_mv 2021-11-15T16:54:19Z
datacite.date.embargoed.fl_str_mv 2021-11-15T16:54:19Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://hdl.handle.net/10451/50118
dc.language.none.fl_str_mv eng
dc.publisher.none.fl_str_mv Elsevier
dc.rights.cclincense.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_6501
dirty 0
eu_rights_str_mv openAccess
format article
fulltext.url.fl_str_mv https://repositorio.ulisboa.pt/bitstreams/92f44ec1-8c3c-48cf-a1eb-c55fc7555b0b/download
id ul_feb3241df3bbc8cf166c4048cf76ca8a
instacron_str ul
institution Universidade de Lisboa
instname_str Universidade de Lisboa
language eng
network_acronym_str ul
network_name_str Repositório da Universidade de Lisboa
oai_identifier_str oai:repositorio.ulisboa.pt:10451/50118
organization_str_mv urn:organizationAcronym:ul
person_str_mv Oliveira, Sandra
https://www.ciencia-id.pt/8A16-4976-FD63
http://orcid.org/0000-0002-6253-4353
Capinha, César
https://www.ciencia-id.pt/7714-2A88-CDE3
http://orcid.org/0000-0002-0666-9755
Rocha, Jorge
https://www.ciencia-id.pt/EC15-76DC-9B96
http://orcid.org/0000-0002-7228-6330
publishDate 2021
publisher.none.fl_str_mv Elsevier
reponame_str Repositório da Universidade de Lisboa
repository_id_str urn:repositoryAcronym:ul
service_str_mv urn:repositoryAcronym:ul
spelling Journal: MethodsX 8, 101567; ISSN (is part of): 2215-0161; DOI (is part of): 10.1016/j.mex.2021.101567; file: application/pdf, 400839 bytes
Oliveira, Sandra: DSpace http://dspace.org/items/d30eb4c5-8ef1-426b-8e80-baa646b30f0e; Researcher ID AAK-5051-2020; Scopus Author ID 17435272900
Capinha, César: DSpace http://dspace.org/items/4c666e7e-4ba8-4a41-8064-d26b3b9fc0f8; Researcher ID K-6439-2017; Scopus Author ID 32867555000
Rocha, Jorge: DSpace http://dspace.org/items/9c7dabc1-d6c6-4636-9293-6babe2ba64c9; Researcher ID F-3185-2017; Scopus Author ID 56428061000
Hosting institution: Repositório Científico de Acesso Aberto da ULisboa; e-mail: repositorio@reitoria.ulisboa.pt
status SINGLETON
visible 1