Abstract:
This study investigates the potential of the random forests ensemble classification and regression technique to improve rainfall rate assignment during day, night and twilight (resulting in 24-hour precipitation estimates) based on cloud physical properties retrieved from Meteosat Second Generation (MSG) Spinning Enhanced Visible and InfraRed Imager (SEVIRI) data.
Random forests (RF) models combine several characteristics that make them well suited for application in precipitation remote sensing. One key advantage is their ability to capture non-linear relationships between predictors and response, which becomes important when dealing with complex non-linear events like precipitation. Due to the deficiencies of existing optical rainfall retrievals, the focus of this study is on assigning rainfall rates to precipitating cloud areas in connection with extra-tropical cyclones in mid-latitudes, including both convective and advective-stratiform precipitating cloud areas. Hence, the rainfall rates are assigned to rain areas previously identified and classified according to the precipitation formation processes. As predictor variables, water vapor-IR differences and IR cloud top temperature are used to incorporate information on cloud top height. ΔT8.7–10.8 and ΔT10.8–12.1 are considered to supply information about the cloud phase. Furthermore, spectral SEVIRI channels (VIS0.6, VIS0.8, NIR1.6) and cloud properties (cloud effective radius, cloud optical thickness) are used to include information about the cloud water path during daytime, while suitable combinations of temperature differences (ΔT3.9–10.8, ΔT3.9–7.3) are considered during night-time.
The rainfall rate retrieval technique is developed in three steps. First, an extensive tuning study is carried out to customise each of the RF models. The daytime, night-time and twilight precipitation events have to be treated separately because the information content about the cloud properties differs between the times of day. Secondly, the RF models are trained using the optimum values for the number of trees and the number of randomly chosen predictor variables found in the tuning study. Finally, the trained RF models are used to predict rainfall rates for an independent validation data set, and the results are validated against co-located rainfall rates observed by a ground radar network. To train and validate the models, the radar-based RADOLAN RW product from the German Weather Service (DWD) is used, which provides area-wide gauge-adjusted hourly precipitation information.
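The three steps (tuning, training, validation on independent data) can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestRegressor` as a stand-in for whichever RF implementation is actually used; the function name `tune_and_validate`, the grid values, the 70/30 split and the synthetic data are all hypothetical and serve only to show the workflow. In practice one such model would be tuned per time of day (day, night, twilight), each with its own predictor set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_and_validate(X, y, tree_grid=(50, 100), mtry_grid=(1, 2, 3), seed=0):
    """Tune, train and validate one RF model (illustrative only)."""
    # Hold out an independent validation set that the tuning never sees.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # Step 1: tuning study over the number of trees (n_estimators) and the
    # number of randomly chosen predictors tried at each split (max_features).
    search = GridSearchCV(
        RandomForestRegressor(random_state=seed),
        param_grid={
            "n_estimators": list(tree_grid),
            "max_features": list(mtry_grid),
        },
        scoring="r2",
        cv=3,
    )
    search.fit(X_train, y_train)

    # Step 2: with the default refit=True, GridSearchCV retrains the best
    # configuration on the full training set.
    model = search.best_estimator_

    # Step 3: predict the held-out data and score it, here as the squared
    # Pearson correlation between reference and predicted values.
    predicted = model.predict(X_val)
    r = np.corrcoef(y_val, predicted)[0, 1]
    return search.best_params_, r ** 2


# Synthetic stand-in data: four predictors and a non-negative, non-linear
# "rainfall rate" response. These are NOT SEVIRI or RADOLAN values.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = np.maximum(0.0, 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300))

best_params, rsq = tune_and_validate(X, y)
```

Keeping the validation data out of the grid search is the design point here: the reported score then reflects performance on data that influenced neither the tuning nor the training.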
Regarding the overall performance, as indicated by the coefficient of determination (Rsq), hourly rainfall rates already show good agreement between the satellite-based and radar-based observations, with Rsq = 0.5 (day and night) and Rsq = 0.48 (twilight). Higher temporal aggregation leads to better agreement: Rsq rises to 0.78 (day), 0.77 (night) and 0.75 (twilight) for 8-h intervals. Comparing day, night and twilight performance shows that daytime precipitation is generally predicted best by the model. Twilight and night-time predictions are generally less accurate, but only by a small margin. This may be due to the smaller number of predictor variables during twilight and night-time conditions as well as less favourable radiative transfer conditions for obtaining the cloud parameters during these periods.
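The comparison at different temporal aggregation levels can be sketched as follows. This is a minimal illustration, assuming that Rsq is computed as the squared Pearson correlation and that aggregation means summing consecutive, non-overlapping blocks of hourly values; the helper names `aggregate` and `rsq` are hypothetical.

```python
def aggregate(series, window):
    """Sum consecutive, non-overlapping blocks of `window` hourly values.

    A trailing incomplete block is dropped.
    """
    usable = len(series) - len(series) % window
    return [sum(series[i:i + window]) for i in range(0, usable, window)]


def rsq(observed, estimated):
    """Squared Pearson correlation between two equally long series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    return cov * cov / (var_obs * var_est)


def rsq_at_interval(observed_hourly, estimated_hourly, hours):
    """Rsq after aggregating both hourly series to `hours`-long intervals."""
    return rsq(aggregate(observed_hourly, hours), aggregate(estimated_hourly, hours))
```

Because the squared correlation is invariant to a constant scale factor, using block means instead of block sums would give the same Rsq.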
Nevertheless, the results show that the newly developed method can assign rainfall rates with good accuracy even on an hourly basis. Furthermore, the rainfall rates can be assigned during day, night and twilight conditions, which enables the estimation of rainfall rates 24 h a day.