Publication

Anomaly detection in cryptocurrency transactions with active learning

Bibliographic details
Abstract: The cryptocurrency market has experienced tremendous growth in recent years, providing numerous opportunities for investors, innovators, and technological advancements. However, this rapid expansion has also led to an increase in fraudulent activities. As cryptocurrencies have become more popular and widely used, dishonest individuals have taken advantage of the decentralized and anonymous nature of these digital assets for their own illegal purposes. Consequently, there is a pressing need for effective fraud detection mechanisms to safeguard the integrity of the crypto market. This work aims to tackle the problem of insufficient labels, which poses a challenge for training high-performance supervised classifiers. The main idea is to reduce the need to manually label an enormous volume of transactions, which can be an arduous and time-consuming task for companies with budget constraints. Therefore, this work assumes that the labels of some unlabeled data points are more informative and relevant than others for the supervised model to learn from. Additionally, this work tries to develop a robust fraud detection approach to deal with the ever-evolving nature of criminal behavior, since criminals are constantly seeking innovative methods to commit fraud. By addressing a cold start scenario, where there are no initial labeled transactions, this work investigates the feasibility of combining unsupervised Anomaly Detection algorithms and Active Learning techniques to create an iterative process for acquiring labeled transactions. The primary aim is to explore the capabilities of Anomaly Detection in selecting a subset of data that maximizes the supervised models’ learning potential. The results of this study demonstrated that Anomaly Detection algorithms exhibited subpar performance in identifying relevant patterns associated with cryptocurrency fraud in unlabeled data.
The findings emphasize that Anomaly Detection algorithms are only necessary for addressing cold start scenarios. Therefore, it was concluded that switching as soon as possible to more sophisticated Active Learning strategies would lead to superior results and improved performance from a supervised Machine Learning model. Remarkably, Random Forest achieved an optimal F1-score performance of 83% with as few as 700 labels.
Main authors: Cunha, Leandro Lopes
Subject: Active learning; Anomaly detection; Cryptocurrencies; Fraud detection; Aprendizagem ativa; Criptomoedas; Deteção de anomalias; Deteção de fraude
Year: 2024
Country: Portugal
Document type: master's thesis
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho
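
The iterative labeling process summarized in the abstract (anomaly scores for the cold start, then a switch to Active Learning once labels exist) can be illustrated with a minimal sketch. This is not the thesis's implementation: the z-score distance used as the anomaly score, the nearest-centroid margin used for uncertainty sampling, the rule of switching as soon as both classes have a label, and every name here (`anomaly_scores`, `acquire_labels`, `oracle`, `budget`, `batch`) are assumptions chosen to keep the example self-contained in standard-library Python. The synthetic data is invented for illustration only.

```python
import math
import random


def anomaly_scores(X):
    """Unsupervised score: distance from the feature means, in z-score units."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)]
    return [math.sqrt(sum(((row[j] - means[j]) / stds[j]) ** 2 for j in range(d)))
            for row in X]


def centroid(rows):
    """Per-feature mean of a non-empty list of points."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def margin(x, c0, c1):
    """Difference between distances to the two class centroids; small = uncertain."""
    return abs(math.dist(x, c0) - math.dist(x, c1))


def acquire_labels(X, oracle, budget, batch=5):
    """Query `oracle` for up to `budget` labels, `batch` points per round."""
    labeled = {}  # index -> label (0 = legitimate, 1 = fraud)
    scores = anomaly_scores(X)
    while len(labeled) < budget:
        pool = [i for i in range(len(X)) if i not in labeled]
        if not pool:
            break
        if len(set(labeled.values())) < 2:
            # Cold start (assumed rule): no usable classifier yet,
            # so query the most anomalous unlabeled points first.
            pool.sort(key=lambda i: -scores[i])
        else:
            # Active Learning step: uncertainty sampling with a
            # nearest-centroid stand-in for the supervised model.
            c0 = centroid([X[i] for i, y in labeled.items() if y == 0])
            c1 = centroid([X[i] for i, y in labeled.items() if y == 1])
            pool.sort(key=lambda i: margin(X[i], c0, c1))
        for i in pool[:min(batch, budget - len(labeled))]:
            labeled[i] = oracle(i)
    return labeled


# Synthetic, illustrative data: 200 "legitimate" and 10 "fraud" points in 2-D.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
X += [[random.gauss(4, 0.5), random.gauss(4, 0.5)] for _ in range(10)]
y = [0] * 200 + [1] * 10

labeled = acquire_labels(X, oracle=lambda i: y[i], budget=30)
```

In practice the `oracle` would be a human analyst, and the nearest-centroid stand-in would be replaced by the supervised model being trained (the abstract reports results for Random Forest).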