Publication

The Powers of the Covariance Matrix: A Novel Approach for Customer Churn Prediction in the Telecom Sector


Bibliographic details
Abstract: Customer churn prediction remains a key challenge for telecom providers in saturated markets. This paper introduces a novel feature engineering technique for churn prediction based on fractional powers of covariance matrices, which reveal structural relationships among customer features. Specifically, we transform tabular customer data into covariance matrices, apply fractional power transformations (α ∈ [-4,4]), and use the resulting matrices as input features to a Convolutional Neural Network (CNN). The optimal power is selected based on class-wise recall, computed on the training samples only. To address the limitation of having only a single instance per customer, and to test how well our approach generalizes under class imbalance without synthetic oversampling, we propose a novel training pipeline that employs homogeneous class-based subsets and encodes individual test instances through outer product representations. On the IBM Telco dataset (7,043 customers, 27% churn rate), our method achieved 90% recall for churners (the minority class), exhibiting state-of-the-art performance. The model also achieved 81% recall for non-churners, indicating balanced performance. The empirical success of our approach may be explained by the consistent separation observed between the engineered features of churners and non-churners in their natural ambient space: the space of symmetric positive-definite (SPD) matrices, endowed with the Affine-Invariant Riemannian Metric (AIRM). This structural separability suggests that our matrix power features offer high identifiability, which is essential for the consistency and generalization of supervised learning models. Our work highlights the benefits of combining deep learning with structure-informed representations of tabular data.
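The two matrix operations named in the abstract, a fractional power of a covariance matrix and the AIRM distance between SPD matrices, can be illustrated with a minimal NumPy sketch. This is not the dissertation's implementation. The function names, the ridge term added to keep the covariance strictly positive-definite, the eigenvalue floor `eps`, the random toy data, and the 0.5-step grid over α are all assumptions made for illustration. The sketch uses the standard definitions A^α = V diag(λ^α) Vᵀ and d(A, B) = ‖log(A^(-1/2) B A^(-1/2))‖_F.

```python
import numpy as np


def fractional_power_spd(mat, alpha, eps=1e-8):
    """Return mat**alpha for a symmetric positive-definite matrix.

    Uses the eigendecomposition mat = V diag(w) V^T, so that
    mat**alpha = V diag(w**alpha) V^T. The floor `eps` is an assumed
    safeguard so that negative powers stay finite.
    """
    sym = (mat + mat.T) / 2.0           # remove tiny asymmetries
    w, v = np.linalg.eigh(sym)          # real eigenvalues, orthonormal V
    w = np.clip(w, eps, None)           # keep eigenvalues strictly positive
    return (v * w**alpha) @ v.T         # scales column j of V by w_j**alpha


def airm_distance(a, b):
    """Affine-invariant Riemannian distance between two SPD matrices."""
    a_inv_sqrt = fractional_power_spd(a, -0.5)
    m = a_inv_sqrt @ b @ a_inv_sqrt
    w = np.linalg.eigvalsh((m + m.T) / 2.0)
    return float(np.sqrt(np.sum(np.log(w) ** 2)))


# Toy data standing in for a subset of customers (rows) and features (columns).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 4))

# Sample covariance plus a small ridge (assumption) to guarantee SPD.
cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(4)

# Hypothetical grid over the range alpha in [-4, 4] quoted in the abstract.
alphas = np.linspace(-4.0, 4.0, 17)
power_features = {float(a): fractional_power_spd(cov, a) for a in alphas}
```

Each entry of `power_features` is a 4×4 SPD matrix that could be passed to a downstream model as a single-channel image. How the dissertation builds the class-based subsets, selects α, and encodes test instances is not specified in this record and is not reproduced here.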
Main authors: Rodrigues, Catarina Ferreira
Subject: Customer Churn Prediction; Convolutional Neural Networks; Fractional Matrix Powers; Incremental Learning; Structural Consistency; Telecom Sector; SDG 8 - Decent work and economic growth; SDG 9 - Industry, innovation and infrastructure; SDG 12 - Responsible production and consumption
Year: 2025
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL