Publication
Anomaly detection in cryptocurrency transactions with active learning
| Abstract: | The cryptocurrency market has grown tremendously in recent years, creating numerous opportunities for investors, innovators, and technological advancement. However, this rapid expansion has also led to an increase in fraudulent activity. As cryptocurrencies have become more popular and widely used, dishonest individuals have exploited the decentralized and anonymous nature of these digital assets for illegal purposes. Consequently, there is a pressing need for effective fraud detection mechanisms that safeguard the integrity of the crypto market. This work tackles the problem of insufficient labels, which makes it difficult to train high-performance supervised classifiers. The main idea is to reduce the need to manually label an enormous volume of transactions, a task that is arduous, time-consuming, and costly for companies with constrained budgets. To that end, this work assumes that some unlabeled data points, once labeled, are more informative and relevant for the supervised model than others. Additionally, this work seeks a fraud detection approach that is robust to the ever-evolving nature of criminal behavior, since criminals constantly look for new ways to commit fraud. Starting from a cold start scenario, in which no labeled transactions are initially available, this work investigates whether unsupervised Anomaly Detection algorithms and Active Learning techniques can be combined into an iterative process for acquiring labeled transactions. The primary aim is to explore how well Anomaly Detection can select a subset of data that maximizes the supervised models’ learning potential. The results showed that Anomaly Detection algorithms performed poorly at identifying relevant patterns associated with cryptocurrency fraud in unlabeled data. The findings emphasize that Anomaly Detection algorithms are only necessary for addressing cold start scenarios. Therefore, it was concluded that switching as soon as possible to more sophisticated Active Learning strategies leads to superior results and improved performance from a supervised Machine Learning model. Remarkably, Random Forest achieved an optimal F1-score of 83% with as few as 700 labels. |
|---|---|
| Main authors: | Cunha, Leandro Lopes |
| Subject: | Active learning; Anomaly detection; Cryptocurrencies; Fraud detection |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |
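
The iterative labeling process described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the synthetic two-feature "transactions", the toy z-score anomaly scorer (standing in for a real unsupervised Anomaly Detection algorithm), the nearest-centroid classifier (standing in for a supervised model such as Random Forest), and all function names. The loop queries the most anomalous points while labels for both classes are still missing (cold start), then switches to uncertainty sampling as soon as a classifier can be fitted.

```python
import math
import random
import statistics

def make_transactions(n=600, fraud_rate=0.1, seed=0):
    """Synthetic (features, label) pairs; label 1 = fraud. Illustrative only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        is_fraud = rng.random() < fraud_rate
        center = 2.0 if is_fraud else 0.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, int(is_fraud)))
    return data

def anomaly_scores(points):
    """Toy unsupervised score: sum of absolute per-feature z-scores."""
    cols = list(zip(*points))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return [sum(abs(v - m) / s for v, m, s in zip(p, means, stds)) for p in points]

def fit_centroids(points, labels):
    """Toy supervised model: one centroid per observed class."""
    centroids = {}
    for cls in set(labels):
        members = [p for p, y in zip(points, labels) if y == cls]
        centroids[cls] = tuple(statistics.fmean(c) for c in zip(*members))
    return centroids

def margin(centroids, point):
    """Small margin = point is nearly equidistant from both classes (uncertain)."""
    return abs(math.dist(point, centroids[0]) - math.dist(point, centroids[1]))

def active_learning(data, batch=10, rounds=20):
    """Return {index: label} acquired over the query rounds."""
    points = [p for p, _ in data]
    oracle = [y for _, y in data]  # stands in for the human annotator
    scores = anomaly_scores(points)
    labeled = {}
    for _ in range(rounds):
        pool = [i for i in range(len(points)) if i not in labeled]
        if not pool:
            break
        if len(set(labeled.values())) < 2:
            # Cold start: no usable classifier yet, query the most anomalous points.
            pool.sort(key=lambda i: scores[i], reverse=True)
        else:
            # Both classes seen: switch to uncertainty sampling.
            idx = list(labeled)
            centroids = fit_centroids([points[i] for i in idx],
                                      [labeled[i] for i in idx])
            pool.sort(key=lambda i: margin(centroids, points[i]))
        for i in pool[:batch]:
            labeled[i] = oracle[i]
    return labeled

if __name__ == "__main__":
    data = make_transactions()
    labeled = active_learning(data)
    print(len(labeled), "labels acquired,", sum(labeled.values()), "of them fraud")
```

The `batch` and `rounds` values bound the labeling budget (here at most 200 queries). Swapping in a real anomaly detector or classifier would only require replacing `anomaly_scores`, `fit_centroids`, and `margin`.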