Publication

Downscaling soil moisture to sub-km resolutions with simple machine learning ensembles

Bibliographic Details
Summary:	Soil moisture is a key factor that influences the productivity and energy balance of ecosystems and biomes. Global soil moisture measurements have coarse native resolutions of 36 km and infrequent revisits of around three days. However, these limitations do not apply to many variables connected to soil moisture, such as land surface temperature and evapotranspiration. For this reason, many previous studies have aimed to discern the relationships between these higher-resolution variables and soil moisture to produce downscaled soil moisture products. In this study, we test four ensemble machine learning models for this downscaling task. These models use a dataset of over 1,000 sites across the US to predict soil moisture at sub-km scales. We find that all models, particularly one with a very simple structure, can outperform Soil Moisture Active Passive (SMAP) measurements on a cross-fold analysis of the 1,000+ sites. This model has an average ubRMSE of 0.058 vs SMAP's 0.065 and an average R of 0.638 vs SMAP's 0.562. Not all ensembles are beneficial, with some architectures performing better with different training weights than with ensemble averaging. However, some ensembles capture more of the land surface characteristics than their individual members. Lastly, although general improvements over SMAP are observed, the models appear to have difficulty achieving them consistently in cropland regions with high clay and low sand content.
Main Authors:	Poehls, Jeran
Other Authors:	Alonso, Lazaro; Koirala, Sujan; Reichstein, Markus; Carvalhais, Nuno
Subject:	Downscaling; Ensemble; Remote sensing; SMAP; Soil moisture; Water Science and Technology; SDG 15 - Life on Land
Year:	2025
Country:	Portugal
Document type:	article
Access type:	open access
Associated institution:	Universidade Nova de Lisboa
Language:	English
Origin:	Repositório Institucional da UNL
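
The summary reports two skill metrics, ubRMSE and R, and refers to ensemble averaging. As a minimal sketch of how such quantities are conventionally computed (the record does not describe the authors' actual code, so the function names and the toy numbers below are hypothetical and for illustration only), one might write:

```python
import math


def ubrmse(pred, obs):
    """Unbiased RMSE: the RMSE left after removing the mean bias (pred - obs)."""
    n = len(pred)
    bias = sum(p - o for p, o in zip(pred, obs)) / n
    return math.sqrt(sum((p - o - bias) ** 2 for p, o in zip(pred, obs)) / n)


def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def ensemble_mean(member_preds):
    """Average the members' predictions element by element (equal weights)."""
    return [sum(vals) / len(vals) for vals in zip(*member_preds)]


# Hypothetical soil moisture series at one site (not data from the study).
observed = [0.10, 0.20, 0.30, 0.25]
members = [
    [0.15, 0.22, 0.33, 0.31],  # hypothetical ensemble member 1
    [0.11, 0.26, 0.35, 0.25],  # hypothetical ensemble member 2
]
combined = ensemble_mean(members)
print("ubRMSE:", ubrmse(combined, observed))
print("R:", pearson_r(combined, observed))
```

Because ubRMSE subtracts the mean bias before squaring, a prediction that is offset from the observations by a constant scores zero, which is why it is usually reported alongside a correlation measure such as R.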
