Laboratory for Climatology and Remote Sensing

Cite as:

Vorndran, M.; Schütz, A.; Bendix, J. & Thies, B. (2022): Current training and validation weaknesses in classification-based radiation fog nowcast using machine learning algorithms. Artificial Intelligence for the Earth Systems 1(2), e210006.

Resource Description


Title:	Current training and validation weaknesses in classification-based radiation fog nowcast using machine learning algorithms
FOR816dw ID:	481
Publication Date:	2022-05-18
License and Usage Rights:
Resource Owner(s):
Individual:	Michaela Vorndran
Contact:	email: michaela.schuetz <at> geo.uni-marburg.de Germany
Individual:	Adrian Schütz
Contact:	email: webmaster <at> lcrs.de
Individual:	Joerg Bendix
Contact:	email: webmaster <at> lcrs.de
Individual:	Boris Thies
Contact:	email: thies <at> Staff.Uni-Marburg.DE 35032 Marburg Germany
Abstract:
	Fog formation, dissipation, and duration are still predicted with large inaccuracies. To address these deficiencies, machine learning (ML) algorithms are increasingly used in nowcasting in addition to numerical fog forecasts because of their computational speed and their ability to learn the nonlinear interactions between the variables. Although ML models are a powerful tool, they require precise training and thorough evaluation to prevent misinterpretation of the scores. In addition, a fog dataset’s temporal order and the autocorrelation of the variables must be considered. This study therefore demonstrates classification-based, ML-related pitfalls in fog forecasting using an XGBoost fog forecasting model. By also using two baseline models that simulate guessing and persistence behavior, we establish two independent evaluation thresholds that make the ML model’s performance easier to grade. We show that, despite high validation scores, the model could still fail in operational application. If persistence behavior is simulated, commonly used scores are insufficient to measure the performance. This is demonstrated through a separate analysis of fog formation and dissipation, because these are crucial for a good fog forecast. We also show that commonly used blockwise and leave-many-out cross-validation methods might inflate the validation scores and are therefore less suitable than a temporally ordered expanding window split. The presented approach provides an evaluation score that closely mimics not only the performance on the training and test dataset but also the operational model’s fog forecasting abilities.
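
The two ideas the abstract relies on, a temporally ordered expanding window split and a persistence baseline scored separately on fog formation and dissipation, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the equal-sized fold layout, the one-step lead time, and the toy label series are all assumptions made for illustration, and no XGBoost model is involved.

```python
from typing import Iterator, List, Sequence, Tuple


def expanding_window_splits(n_samples: int, n_splits: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges in temporal order.

    Every training window starts at index 0 and grows with each fold;
    the test block always lies strictly after the training window.
    (Hypothetical helper: equal-sized folds are an assumption.)
    """
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last test block absorbs any remainder samples.
        test_end = fold * (k + 1) if k < n_splits else n_samples
        yield range(0, train_end), range(train_end, test_end)


def persistence_forecast(labels: Sequence[int], lead: int = 1) -> List[int]:
    """Persistence baseline: forecast for time t is the observation at t - lead.

    The returned list is aligned with labels[lead:].
    """
    return list(labels[:-lead])


def accuracy(truth: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of time steps where forecast and observation agree."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def transition_accuracy(labels: Sequence[int], pred: Sequence[int], lead: int = 1) -> float:
    """Accuracy restricted to steps where the fog state changed
    between issue time (t - lead) and valid time (t),
    i.e. fog formation or dissipation."""
    truth = labels[lead:]
    previous = labels[:-lead]
    changed = [i for i in range(len(truth)) if truth[i] != previous[i]]
    if not changed:
        return float("nan")
    return sum(truth[i] == pred[i] for i in changed) / len(changed)


# Toy series (assumed data): 1 = fog, 0 = no fog.
labels = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
pred = persistence_forecast(labels)

overall = accuracy(labels[1:], pred)
on_transitions = transition_accuracy(labels, pred)

for train, test in expanding_window_splits(len(labels), n_splits=3):
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
print(f"persistence accuracy: {overall:.2f}, on transitions: {on_transitions:.2f}")
```

On this toy series the persistence baseline gets 9 of 11 steps right overall yet scores 0 on the two transitions, which is the kind of gap that a separate formation and dissipation analysis is meant to expose.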
Keywords:
	fog forecasting | station data | Machine learning | Model evaluation | Decision Trees | Classification | Nowcasting | XGBoost

Literature type specific fields:
ARTICLE
Journal:	Artificial Intelligence for the Earth Systems
Volume:	1
Issue:	2
Page Range:	e210006
Publisher:	American Meteorological Society

Metadata Provider:
Individual:	Michaela Vorndran
Contact:	email: michaela.schuetz <at> geo.uni-marburg.de Germany
Online Distribution:
Download File:	http://www.lcrs.de/publications.do?citid=481
