Publication

Advertisement Click Fraud Detection and Prevention: A machine learning approach

Bibliographic details
Abstract:	Click fraud poses a significant challenge to digital advertising, causing substantial financial losses and undermining advertiser trust. This study explores the potential of machine learning approaches for detecting such malicious conduct in Google Ads. We model and compare five algorithms: support vector machines, random forest, k-nearest neighbours, gradient tree boosting (GTB), and XGBoost. These models were chosen for their proven efficacy in fraud detection, and the work follows the CRISP-DM methodology, which provides a structured process for machine learning projects. Our analysis revealed that tree-based models, particularly GTB and XGBoost, outperformed the others in accuracy, recall, and AUC, making them highly effective at identifying fraudulent clicks. The study confirms that machine learning algorithms trained on pre-classified data can accurately classify and detect fraudulent activity, and that they improve the understanding of fraud characteristics. Additionally, we identified key patterns and characteristics associated with non-genuine clicks, such as primary click actions and click frequency per IP address and user ID. This research bridges the gap between academic theory and practical application, providing actionable insights for marketing agencies to combat click fraud effectively. Collaboration with a marketing agency for this study ensures that the outcomes are directly beneficial, enhancing the overall integrity and performance of digital advertising efforts.
Main authors:	Santo, Camilla Alves do Espírito
Subject:	Click fraud; machine learning; advertising; detection; ads; SDG 8 - Decent work and economic growth; SDG 16 - Peace, justice and strong institutions
Year:	2024
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Universidade Nova de Lisboa
Language:	English
Source:	Repositório Institucional da UNL
