Data-driven modelling of hydraulic-head time series: results and lessons learned from the 2022 Groundwater Time Series Modelling Challenge

Raoul A. Collenteur; Ezra Haaf; Mark Bakker; Tanja Liesch; Andreas Wunsch; Jenny Soonthornrangsan; Jeremy White; Nick Martin; Rui Hugman; Ed De Sousa; Didier Vanden Berghe; Xinyang Fan; Tim J. Peterson; Jānis Bikše; Antoine Di Ciacca; Xinyue Wang; Yang Zheng; Maximilian Nölscher; Julian Koch; Raphael Schneider; Nikolas Benavides Höglund; Sivarama Krishna Reddy Chidepudi; Abel Henriot; Nicolas Massei; Abderrahim Jardani; Max Gustav Rudolph; Amir Rouhani; J. Jaime Gómez-Hernández; Seifeddine Jomaa; Anna Pölz; Tim Franken; Morteza Behbooei; Jimmy Lin; Rojin Meysami

doi:10.5194/hess-28-5193-2024

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Raoul A. Collenteur, Eawag - das Wasserforschungsinstitut des ETH-Bereichs (Author)
Ezra Haaf, Chalmers University of Technology (Author)
Mark Bakker, Technische Universität Delft (Author)
Tanja Liesch, Karlsruher Institut für Technologie (Author)
Andreas Wunsch, Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (Author)
Jenny Soonthornrangsan, Technische Universität Delft (Author)
Jeremy White, INTERA Incorporated (Author)
Nick Martin, Southwest Research Institute (Author)
Rui Hugman, INTERA Incorporated (Author)
Ed De Sousa, INTERA Incorporated (Author)
Didier Vanden Berghe, Ginger BURGEAP Lyon (Author)
Xinyang Fan, Friedrich-Alexander-Universität Erlangen-Nürnberg, Universität Bern (Author)
Tim J. Peterson, Monash University (Author)
Jānis Bikše, University of Latvia (Author)
Antoine Di Ciacca, Lincoln University (Author)
Xinyue Wang, Brown University (Author)
Yang Zheng, Brown University (Author)
Maximilian Nölscher, Bundesanstalt für Geowissenschaften und Rohstoffe (BGR) (Author)
Julian Koch, Geological Survey of Denmark and Greenland (Author)
Raphael Schneider, Geological Survey of Denmark and Greenland (Author)
Nikolas Benavides Höglund, Lund University (Author)
Sivarama Krishna Reddy Chidepudi, Centre national de la recherche scientifique (CNRS), Bureau de Recherches Géologiques et Minières (BRGM) (Author)
Abel Henriot, Bureau de Recherches Géologiques et Minières (BRGM) (Author)
Nicolas Massei, Centre national de la recherche scientifique (CNRS) (Author)
Abderrahim Jardani, Centre national de la recherche scientifique (CNRS) (Author)
Max Gustav Rudolph, Professur für Grundwassersysteme (Author)
Amir Rouhani, Helmholtz-Zentrum für Umweltforschung (UFZ) (Author)
J. Jaime Gómez-Hernández, Polytechnic University of Valencia (Author)
Seifeddine Jomaa, Helmholtz-Zentrum für Umweltforschung (UFZ) (Author)
Anna Pölz, Technische Universität Wien, Interuniversitäre Kooperationszentrum Wasser und Gesundheit (Author)
Tim Franken, Sumaqua (Author)
Morteza Behbooei, University of Waterloo (Author)
Jimmy Lin, University of Waterloo (Author)
Rojin Meysami, University of Waterloo (Author)

Abstract

This paper presents the results of the 2022 Groundwater Time Series Modelling Challenge, where 15 teams from different institutes applied various data-driven models to simulate hydraulic-head time series at four monitoring wells. Three of the wells were located in Europe and one was located in the USA in different hydrogeological settings in temperate, continental, or subarctic climates. Participants were provided with approximately 15 years of measured heads at (almost) regular time intervals and daily measurements of weather data starting some 10 years prior to the first head measurements and extending around 5 years after the last head measurement. The participants were asked to simulate the measured heads (the calibration period), to provide a prediction for around 5 years after the last measurement (the validation period for which weather data were provided but not head measurements), and to include an uncertainty estimate. Three different groups of models were identified among the submissions: lumped-parameter models (three teams), machine learning models (four teams), and deep learning models (eight teams). Lumped-parameter models apply relatively simple response functions with few parameters, while the artificial intelligence approaches used models of varying complexity, generally with more parameters and more input, including input engineered from the provided data (e.g. multi-day averages). The models were evaluated on their performance in simulating the heads in the calibration period and in predicting the heads in the validation period. Different metrics were used to assess performance, including metrics for average relative fit, average absolute fit, fit of extreme (high or low) heads, and the coverage of the uncertainty interval. For all wells, reasonable performance was obtained by at least one team from each of the three groups.
However, the performance was not consistent across submissions within each group, which implies that the application of each method to individual sites requires significant effort and experience. In particular, estimates of the uncertainty interval varied widely between teams, although some teams submitted confidence intervals rather than prediction intervals. There was not one team, let alone one method, that performed best for all wells and all performance metrics. Four of the main takeaways from the model comparison are as follows: (1) Lumped-parameter models generally performed as well as artificial intelligence models, which means they capture the fundamental behaviour of the system with only a few parameters. (2) Artificial intelligence models were able to simulate extremes beyond the observed conditions, which is contrary to some persistent beliefs about these methods. (3) No overfitting was observed in any of the models, including in the models with many parameters, as performance in the validation period was generally only a bit lower than in the calibration period, which is evidence of appropriate application of the different models. (4) The presented simulations are the combined results of the applied method and the choices made by the modeller(s), which was especially visible in the performance range of the deep learning methods; underperformance does not necessarily reflect deficiencies of any of the models. In conclusion, the challenge was a successful initiative to compare different models and learn from each other. Future challenges are needed to investigate, for example, the performance of models in more variable climatic settings, to simulate head series with significant gaps, or to estimate the effect of drought periods.
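
The abstract lists the kinds of metrics used but does not name them. As a minimal sketch, assuming the Nash-Sutcliffe efficiency for average relative fit, the mean absolute error for average absolute fit, and the fraction of observations falling inside the submitted interval for coverage, the evaluation of one simulated head series might look as follows. All function names and the sample values are hypothetical and are not taken from the challenge data.

```python
from statistics import fmean


def nse(obs, sim):
    """Nash-Sutcliffe efficiency (assumed relative-fit metric).

    1.0 is a perfect fit; 0.0 is no better than the mean of the observations.
    """
    mean_obs = fmean(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mae(obs, sim):
    """Mean absolute error in head units (assumed absolute-fit metric)."""
    return fmean([abs(o - s) for o, s in zip(obs, sim)])


def coverage(obs, lower, upper):
    """Fraction of observations inside the interval [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


# Hypothetical heads in metres; not data from the challenge.
obs = [10.0, 10.5, 11.0, 10.2, 9.8]
sim = [10.1, 10.4, 10.8, 10.3, 9.9]
lower = [s - 0.15 for s in sim]  # illustrative interval of +/- 0.15 m
upper = [s + 0.15 for s in sim]

print(f"NSE      = {nse(obs, sim):.3f}")
print(f"MAE      = {mae(obs, sim):.3f} m")
print(f"coverage = {coverage(obs, lower, upper):.2f}")
```

A coverage well below the nominal level of the interval would be one sign that a confidence interval was submitted where a prediction interval was intended, which is the distinction the abstract draws between teams.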

Details

Original language	English
Pages (from - to)	5193-5208
Number of pages	16
Journal	Hydrology and Earth System Sciences
Volume	28
Issue number	23
Publication status	Published - 4 Dec 2024
Peer-review status	Yes

External IDs

ORCID	/0000-0002-5201-2586/work/174432362
