Robust Rainfall Gap-Filling in Coastal Arid Regions Using Ensemble Fusion Models

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

  • Badar Al-Jahwari, Sultan Qaboos University (Author)
  • Ghazi Al-Rawas, Sultan Qaboos University (Author)
  • Mohammad Reza Nikoo, Sultan Qaboos University (Author)
  • Talal Etri, Sultan Qaboos University (Author)
  • Jens Grundmann, Chair of Hydrology (Author)

Abstract

In arid regions, limited rainfall data availability, missing values, and short historical records significantly affect hydrological modeling studies and climate change assessments. Many hydrological applications therefore require advanced techniques to obtain a complete data series. This study applies multiple machine learning techniques to fill gaps in daily rainfall data for 88 rainfall stations in the Al-Batinah region of Oman, covering the period from 1993 to 2024. The models applied are Multiple Linear Regression (MLR), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Regression (SVR), and Gradient-Boosting Trees (GBT). The methodology uses two approaches: a non-clustering approach, in which rainfall stations are not clustered, and a clustering approach, in which the optimal number of clusters is determined using K-means clustering. Each target station uses data from the nearby rainfall stations within a 50 km radius that have the highest correlation coefficients. A novel Ensemble Fusion Model, including the RF Fusion Model (RF) and the Multi-Model Super Ensemble Fusion Model (MMSE), is applied to improve the efficacy of the individual predictive models. The estimation approaches are further enhanced and evaluated using Bayesian optimization of hyperparameters, dataset imputation with Multiple Imputation by Chained Equations (MICE), and Leave-One-Year-Out (LOYO) cross-validation. The results show that the GBT model performs best in both the clustering and non-clustering approaches. A further benefit of applying Ensemble Fusion Models to rainfall gap-filling is that the coefficient of determination (R²) for the clustering and non-clustering approaches increases to 22.5% and 22.2%, respectively.
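As a rough illustration of two steps named in the abstract, the sketch below selects predictor stations within a 50 km radius ranked by correlation with the target, and generates Leave-One-Year-Out splits. It is a minimal sketch under stated assumptions, not the authors' implementation: the station names, coordinates, rainfall values, the `top_k` cutoff, the use of great-circle (haversine) distance, and the use of Pearson correlation are all illustrative choices that the abstract does not specify.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance metric)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def pearson(x, y):
    """Pearson correlation over days where both series are present (None = gap)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def select_neighbors(target, stations, series, radius_km=50.0, top_k=3):
    """Return up to top_k stations within radius_km, highest correlation first."""
    t_lat, t_lon = stations[target]
    scored = []
    for name, (lat, lon) in stations.items():
        if name == target:
            continue
        if haversine_km(t_lat, t_lon, lat, lon) > radius_km:
            continue
        r = pearson(series[target], series[name])
        if not math.isnan(r):
            scored.append((r, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

def loyo_splits(years):
    """Yield (held_out_year, train_indices, test_indices), one split per year."""
    for held_out in sorted(set(years)):
        train = [i for i, y in enumerate(years) if y != held_out]
        test = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train, test

# Hypothetical stations (lat, lon) and daily rainfall in mm; None marks a gap.
stations = {"A": (23.70, 57.50), "B": (23.80, 57.60),
            "C": (23.60, 57.40), "D": (24.50, 56.50)}
series = {"A": [0, 5, None, 10, 0, 2], "B": [0, 4, 1, 9, 0, 3],
          "C": [1, 0, 3, 2, 4, 0], "D": [0, 5, 0, 10, 0, 2]}
years = [1993, 1993, 1994, 1994, 1995, 1995]

neighbors = select_neighbors("A", stations, series, top_k=2)
splits = list(loyo_splits(years))
print(neighbors)
print([held_out for held_out, _, _ in splits])
```

In this toy data, station D tracks the target perfectly but lies outside the radius, so it is excluded before correlation is considered. Holding out whole years rather than random days keeps each test fold temporally separate from the training data.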

Details

Original language: English
Article number: 1
Journal: Hydrology
Volume: 13
Issue number: 1
Early online date: 20 Dec 2025
Publication status: Published - Jan 2026
Peer-reviewed: Yes

External IDs

unpaywall 10.3390/hydrology13010001
Scopus 105028503531

Keywords

  • Bayesian optimization, Multiple Imputation by Chained Equations (MICE), climate change, ensemble fusion models, hydrological modeling, rainfall gap-filling