Author(s):
Rodríguez-García, María Inmaculada ; Carrasco-García, María Gema ; Cubillas Fernández, Paloma Rocío ; Ribeiro, Conceição ; Cardoso, Pedro ; Turias, Ignacio J.
Date: 2025
Persistent Identifier: http://hdl.handle.net/10400.1/27954
Source: Sapientia - Universidade do Algarve
Subject(s): Deep learning; Autoencoders; Air quality forecasting; NO2; SO2; PM10; Concentration forecasting
Description
This study aims to evaluate and compare the performance of Autoencoders (AEs) and Sparse Autoencoders (SAEs) in forecasting the next-hour concentration levels of various air pollutants—specifically NO2(t + 1), PM10(t + 1), and SO2(t + 1)—in the Bay of Algeciras, a highly complex region located in southern Spain. Hourly data related to air quality, meteorological conditions, and maritime traffic were collected from 2017 to 2019 across multiple monitoring stations distributed throughout the bay, enabling the analysis of diverse forecasting scenarios. The output variable was segmented into four distinct, non-overlapping quartiles (Q1–Q4) to capture different concentration ranges. AE models demonstrated greater accuracy in predicting moderate pollution levels (Q2 and Q3), whereas SAE models achieved comparable performance at the lower and upper extremes (Q1 and Q4). The results suggest that stacking AE layers with varying degrees of sparsity—culminating in a supervised output layer—can enhance the model’s ability to forecast pollutant concentration indices across all quartiles. Notably, Q4 predictions, representing peak concentrations, benefited from more complex SAE architectures, likely due to the increased difficulty associated with modelling extreme values.
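The quartile segmentation of the output variable described above can be illustrated with a minimal sketch. The record does not say how the study computed the quartile boundaries, so the sketch assumes empirical quartiles of the target series. The function name `quartile_labels` and the synthetic series `no2_next_hour` are hypothetical and are not data or code from the study.

```python
import random
from bisect import bisect_left
from statistics import quantiles


def quartile_labels(values):
    """Assign each value to one of four non-overlapping bins, 1..4 (Q1..Q4)."""
    # Three cut points that split the sample into four groups
    # (assumption: empirical quartiles of the target series).
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_left places a value equal to a cut point in the lower bin,
    # so every value lands in exactly one quartile.
    labels = [bisect_left(cuts, v) + 1 for v in values]
    return labels, cuts


random.seed(0)
# Synthetic stand-in for an hourly NO2(t + 1) target; not data from the study.
no2_next_hour = [random.lognormvariate(3.0, 0.5) for _ in range(1000)]

labels, cuts = quartile_labels(no2_next_hour)
counts = {q: labels.count(q) for q in (1, 2, 3, 4)}
print("cut points:", [round(c, 2) for c in cuts])
print("samples per quartile:", counts)
```

With labels like these, a separate model (AE or SAE) could be evaluated per quartile, which is the kind of per-range comparison the abstract reports.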