Publication

Machine learning methods to detect money laundering in the Bitcoin blockchain in the presence of label scarcity

Bibliographic details
Abstract: Every year, criminals launder billions of dollars acquired from serious felonies (e.g. terrorism, drug smuggling, or human trafficking), harming countless people and economies. Cryptocurrencies, in particular, have developed as a haven for money laundering activity. Machine Learning can be used to detect these illicit patterns. However, labels are so scarce that traditional supervised algorithms are inapplicable. This research addresses money laundering detection assuming minimal access to labels. The results show that existing state-of-the-art solutions using unsupervised anomaly detection methods are inadequate to detect the illicit patterns in a real Bitcoin transaction dataset. The proposed active learning solution, however, is capable of matching the performance of a fully supervised baseline by using just 5% of the labels. This solution mimics a typical real-life situation in which a limited number of labels can be acquired through manual annotation by experts.
Main authors: Lorenz, Joana Susan
Subject: Anti-money laundering; Applied machine learning; Supervised learning by classification; Anomaly detection; Active learning
Year: 2021
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL