Publications
Guio Blanco, C.M.; Brito Gómez, V.M.; Crespo, P. & Ließ, M. (2018): Spatial prediction of soil water retention in a Páramo landscape: Methodological insight into machine learning using random forest. Geoderma 316, 100-114.
DOI: 10.1016/j.geoderma.2017.12.002
Abstract:
Soils of Páramo ecosystems regulate the water supply to many Andean populations. In spite of being a necessary input to distributed hydrological models, regionalized soil water retention data from these areas are currently not available. The investigated catchment of the Quinuas River has a size of about 90 km² and comprises parts of the Cajas National Park in southern Ecuador. It is dominated by soils with high organic carbon contents, which display characteristics of volcanic influence. Besides providing spatial predictions of soil water retention at the catchment scale, the study presents a detailed methodological insight to model setup and validation of the underlying machine learning approach with random forest. The developed models performed well predicting volumetric water contents between 0.55 and 0.9 cm³ cm⁻³. Among the predictors derived from a digital elevation model and a Landsat image, altitude and several vegetation indices provided the most information content. The regionalized maps show particularly low water retention values in the lower Quinuas valley, which go along with high prediction uncertainties. Due to the small size of the dataset, mineral soils could not be separated from organic soils, leading to a high prediction uncertainty in the lower part of the valley, where the soils are influenced by anthropogenic land use.
Keywords: Páramo | random forest | water retention | validation | parameter tuning
Ließ, M.; Schmidt, J. & Glaser, B. (2016): Improving the spatial prediction of soil organic carbon stocks in a complex tropical mountain landscape by methodological specifications in machine learning approaches. PLOS ONE 11(4), 1-22.
DOI: 10.1371/journal.pone.0153673
Abstract:
Tropical forests are significant carbon sinks and their soils’ carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms (random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines) were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and a satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms, including the model tuning and predictor selection, were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged from 0.2 to 17.7 kg m⁻², displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models’ predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors by their individual performance was outperformed by the two procedures which accounted for predictor interaction.
Keywords: soil organic carbon | digital soil mapping
Ließ, M. (2015): Sampling for regression-based digital soil mapping: Closing the gap between statistical desires and operational applicability. Spatial Statistics 13, 106-122.
DOI: 10.1016/j.spasta.2015.06.002
Abstract:
With respect to sampling for regression-based digital soil mapping (DSM), the overriding aim is to ensure that the spatial variability of the soil is well captured without introducing any bias, while the design remains feasible according to operational constraints such as accessibility, manpower and cost. Representativeness of the sample concerning the population to be sampled needs to be guaranteed in any regression-based modelling approach. Four selected sampling designs were adapted to show that basically any design may be optimised to represent the n-dimensional predictor space of a particular area, while selecting points is only permitted from a small accessible sub-area or from outside the area. Sampling efficiency may be evaluated based on the representation of the predictor space. However, not only each predictor’s probability function but also the interaction between predictors may have to be considered to select a representative sample. Instead of sampling a previously unsampled area with limited accessibility, the four sampling designs may also be used to subsample an existing dataset and, thereby, optimise a suboptimal dataset based on the predictor space of the area which shall be mapped by DSM.
Keywords: sampling design | digital soil mapping | regression
Hitziger, M. & Ließ, M. (2014): Comparison of Three Supervised Learning Methods for Digital Soil Mapping: Application to a Complex Terrain in the Ecuadorian Andes. Applied and Environmental Soil Science 2014, 10 pages.
DOI: 10.1155/2014/809495
Abstract:
A digital soil mapping approach is applied to a complex, mountainous terrain in the Ecuadorian Andes. Relief features are derived from a digital elevation model and used as predictors for topsoil texture classes sand, silt, and clay. The performance of three statistical learning methods is compared: linear regression, random forest, and stochastic gradient boosting of regression trees. In linear regression, a stepwise backward variable selection procedure is applied and overfitting is controlled by minimizing Mallows’ Cp. For random forest and boosting, the effect of predictor selection and tuning procedures is assessed. One hundred repetitions of a 5-fold cross-validation of the selected modelling procedures are employed for validation, uncertainty assessment, and method comparison. Absolute assessment of model performance is achieved by comparing the prediction error of the selected method with that of the mean. Boosting performs best, providing predictions that are reliably better than the mean. The median reduction of the root mean square error is around 5%. Elevation is the most important predictor. All models clearly distinguish ridges and slopes. The predicted texture patterns are interpreted as the result of catena sequences (eluviation of fine particles on slope shoulders) and landslides (mixing up mineral soil horizons on slopes).
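The validation scheme in this abstract (repeated k-fold cross-validation, with the model's error compared against simply predicting the mean) can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single predictor, a plain least-squares line in place of the paper's learning methods, and hypothetical function names. Any data passed in would be illustrative only.

```python
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def fit_line(x, y):
    """Ordinary least-squares line (stand-in for the paper's models)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def repeated_kfold_rmse_reduction(x, y, k=5, repeats=100, seed=0):
    """Median relative RMSE reduction of the model versus the training mean."""
    rng = random.Random(seed)
    n = len(x)
    reductions = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random partition per repetition
        folds = [idx[i::k] for i in range(k)]
        model_pred = [0.0] * n
        mean_pred = [0.0] * n
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
            train_mean = statistics.fmean(y[i] for i in train)
            for i in fold:                    # predict only on held-out points
                model_pred[i] = slope * x[i] + intercept
                mean_pred[i] = train_mean
        reductions.append(1.0 - rmse(y, model_pred) / rmse(y, mean_pred))
    return statistics.median(reductions)
```

A positive return value means the model beats the mean in at least half of the repetitions; a value near zero means the predictor carries little information.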
Keywords: soil texture | digital soil map
Ließ, M.; Hitziger, M. & Huwe, B. (2014): The Sloping Mire Soil-Landscape of Southern Ecuador: Influence of Predictor Resolution and Model Tuning on Random Forest Predictions. Applied and Environmental Soil Science 2014(603132), 10 pages.
DOI: 10.1155/2014/603132
Abstract:
The sloping mire landscape of the investigation area, in the southern Andes of Ecuador, is dominated by stagnic soils with thick organic layers. The recursive partitioning algorithm Random Forest was used to predict the spatial water stagnation pattern and the thickness of the organic layer from terrain attributes. Terrain smoothing from 10 to 30 m raster resolution was applied in order to obtain the best possible model. For the same purpose, several model tuning parameters were tested and a prepredictor selection with the R-package Boruta was applied. Model versions were evaluated and compared by 100 repetitions of the calculation of the residual mean square error of a five-fold cross-validation. Position specific density functions of the predicted soil parameters were then used to display prediction uncertainty. Prepredictor selection and tuning of the Random Forest algorithm in some cases resulted in an improved model performance. We therefore recommend testing prepredictor selection and tuning to make sure that the best possible model is chosen. This needs particular emphasis in complex tropical mountain soil-landscapes which provide a real challenge to any soil mapping approach but where Random Forest has proven to be successful due to the testing of model tuning and prepredictor selection.
Keywords: regionalization | digital soil map | organic layer | stagnic properties
Ließ, M.; Glaser, B. & Huwe, B. (2012): Making use of the World Reference Base diagnostic horizons for the systematic description of the soil continuum - Application to the tropical mountain soil-landscape of southern Ecuador. CATENA 97, 20-30.
DOI: 10.1016/j.catena.2012.05.002
Abstract:
The World Reference Base for Soil Resources (WRB) (FAO, IUSS Working Group WRB, 2007) at present does not acknowledge the spatial soil continuum, but provides a sound basis to do so. Using methods from statistical learning theory to develop digital soil maps is much more efficient and precise when regionalising soil diagnostic properties instead of complex entities such as the soil units assigned by the WRB. Particularly in providing spatial soil information displayed in digital soil maps, any aggregation of this spatial soil information to soil units means a loss of information.
The soil landscape can be systematically described in its spatial continuum simply by the vertical order and extent of the WRB diagnostic horizons. The diagnostic horizons are related in their thickness to a standard depth and listed from top to bottom in order of appearance.
Typical diagnostic horizon thickness and occurrence probability were predicted from terrain parameters by classification and regression trees (CART), throughout the research area in southern Ecuador. The two disadvantages of CART, abrupt prediction class boundaries and dependence on the dataset, were addressed by hundredfold model runs on different data subsets, leading to a range of possible predictions. Prediction uncertainty was included in the digital soil maps by calculating these predictions' means and standard deviations as well as by horizon occurrence probability prediction. Model performance was evaluated by means of hundredfold external cross validation.
Terrain parameters were found to have a strong influence on diagnostic topsoil properties. However, no influence on the vertical profile differentiation was observed. Hence predicting horizon thickness and subsoil diagnostic properties was difficult. The systematic description of the soil continuum of this particular soil-landscape resulted in histic and stagnic soil parts dominating the first 100 cm of the soil column for most of the area.
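The idea of hundredfold model runs on different data subsets, summarised by the mean and standard deviation of the predictions, can be sketched as below. This is only an illustration under stated assumptions: a one-split regression stump on a single predictor stands in for CART, the subset fraction of 0.8 is an arbitrary choice, and all function names are hypothetical.

```python
import random
import statistics


def fit_stump(x, y):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean).

    Assumes x contains at least two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue                          # no split between identical values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    return best[1:]


def predict_stump(stump, x_new):
    threshold, left_mean, right_mean = stump
    return left_mean if x_new <= threshold else right_mean


def ensemble_prediction(x, y, x_new, runs=100, fraction=0.8, seed=0):
    """Mean and standard deviation of predictions over random data subsets."""
    rng = random.Random(seed)
    n = len(x)
    size = max(2, int(fraction * n))
    preds = []
    for _ in range(runs):
        subset = rng.sample(range(n), size)   # a different subset for every run
        stump = fit_stump([x[i] for i in subset], [y[i] for i in subset])
        preds.append(predict_stump(stump, x_new))
    return statistics.fmean(preds), statistics.stdev(preds)
```

Averaging over many runs softens the abrupt class boundary of a single tree, and the standard deviation gives a per-location measure of how strongly the prediction depends on the data subset.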
Ließ, M.; Glaser, B. & Huwe, B. (2012): Uncertainty in the spatial prediction of soil texture - Comparison of regression tree and Random Forest models. Geoderma 170, 70-79.
Ließ, M. & Huwe, B. (2011): Uncertainty in soil regionalisation and its influence on slope stability estimation. Presented at the Italian Workshop on Landslides, Naples, Italy, 30.09.2011.
Ließ, M.; Glaser, B. & Huwe, B. (2011): Soil-Landscape Modelling - Reference Soil Group Probability Prediction in Southern Ecuador. In: E. Burcu Özkaraova Güngör (eds.): Principles, Application and Assessment in Soil Science (1 1), INTECH, http://www.intechopen.com/books/show/title/principles-application-and-assessment-in-soil-science, 241-256.
Ließ, M. (2011): Soil-landscape modelling in an Andean mountain forest region in southern Ecuador. PhD thesis, University of Bayreuth.
DOI: http://opus.ub.uni-bayreuth.de/frontdoor.php?source_opus=907
Abstract:
Soil-landscapes are diverse and complex due to the interaction of pedogenetic, geomorphological and hydrological processes. The resulting soil profile reflects the balance of these processes in its properties. Early conceptual models have by now evolved into quantitative soil-landscape models including soil variation and its unpredictability as a key soil attribute. Soils in the Andean mountain rainforest area of southern Ecuador are influenced by hillslope processes and landslides in particular. The lack of knowledge on the distribution of soils, and especially of the physical soil properties needed to understand slope failure, led to the study of this particular soil-landscape by means of statistical models relating soil to terrain attributes, i.e. predictive soil mapping.
A sampling design comprising 24 terrain classes was developed for soil investigation in mountainous areas to obtain a representative dataset for statistical modelling. The soils were investigated by 56 profiles and 315 auger points. The Reference Soil Groups (RSGs) Histosol, Stagnosol, Umbrisol, Cambisol, Leptosol and Regosol were identified according to the World Reference Base for Soil Resources (WRB). Classification tree models and a probability scheme based on WRB hierarchy were applied to include RSG prediction uncertainty in a digital soil map. Histosol probability depended on hydrological parameters; highest Stagnosol probability was found on slopes < 40° and above 2146 m a.s.l.
Poor model performance, probably due to the prediction of complex categories (RSGs) and WRB inconsistency (absolute and relative value criteria), led to the proposal of "incomplete soil classification" by relating the thickness of the WRB's diagnostic horizons as percentage to the upper 100 soil centimetres, including the organic layer. Typical diagnostic horizons histic, humic, umbric, stagnic and cambic were regionalised in their thickness and occurrence probability by classification and regression trees (CART). Prediction uncertainty was addressed with hundredfold model runs based on different random Jackknife partitions of the dataset. Whether the first mineral soil horizon displays stagnic properties or not likely depends on physical soil properties in addition to terrain parameters. Incomplete soil classification resulted in histic and stagnic soil parts dominating the first 100 cm of the soil volume for most of the research area.
While soil profiles and auger points were described in their horizon composition, thickness, Munsell colour and soil texture by finger method (FAO, 2006), soil cohesion, bulk density and texture by pipette and laser were analysed in soil profiles only. Texture results by pipette compared to laser method, showed the expected shift to higher silt and lower clay contents. Linear regression equations were adapted. Pedotransfer functions to predict physical soil properties from the bigger auger dataset analysed by field texture method only, could not be developed, because field texture analysis did not provide satisfying results. It was therefore not possible to correct its results with the more precise laboratory data.
Comparing CART and Random Forest (RF) in their model performance to predict topsoil texture and bulk density as well as mineral soil thickness by hundredfold model runs with random Jackknife partitions, RF proved more powerful. Altitude a.s.l. was the most important predictor for all three soil parameters. Increasing sand/clay ratios with increasing altitude, on steep slopes and with overland flow distance to the channel network are caused by shallow subsurface flow removing clay particles downslope. Deeper soil layers are not influenced by the same process and therefore showed different texture properties.
Terrain parameters could only explain the spatial distribution of topsoil properties to a limited extent; subsoil properties could not be predicted at all. Other parameters that likely influence soil properties within the investigation area are parent material and landslides. Strong evidence was found that topsoil horizons did not form from the bedrock underlying the soil profile. Parent material changes within short distance and often within one soil profile. Landslides have a strong influence on soil-landscape formation in shifting soil and rock material.
Soil mechanical and hydrological properties in addition to terrain steepness were hypothesized to be the major factors in causing soil slides. Thus, the factor of safety (FS) was calculated as the soil shear ratio that is necessary to maintain the critical state equilibrium on a potential sliding surface. The depth of the failure plane was assumed at the lower boundary of the stagnic soil layer or complete soil depth, depending on soils being stagnic or non-stagnic. The FS was determined in dependence of soil wetness referring to 0.001, 0.01, 0.1 and 3 mm/h net rainfall rate. Sites with a FS ≥ 1 at 3 mm/h (complete saturation) were classified as unconditionally stable, sites with a FS < 1 at 0.001 mm/h as unconditionally unstable. The latter coincided quite well with landslide scars from a recent aerial photograph.
Keywords: Ecuador | tropical montane forest | CART | GIS | soil-landscape modeling
Ließ, M.; Glaser, B. & Huwe, B. (2011): Functional soil-landscape modelling to estimate slope stability in a steep Andean mountain forest region. Geomorphology 132, 287-299.
DOI: 10.1016/j.geomorph.2011.05.015
Abstract:
Landslides are a common phenomenon within the Ecuadorian Andes and have an impact on soil-landscape formation. Landslide susceptibility was determined in a steep mountain forest region in Southern Ecuador. Soil mechanical and hydrological properties in addition to terrain steepness were hypothesised to be the major factors in causing soil slides. Hence, the factor of safety (FS) was calculated as the soil shear ratio that is necessary to maintain the critical state equilibrium on a potential sliding surface. Regression tree (RT) and Random Forest (RF) models were compared in their predictive force to regionalise the depth of the failure plane and soil bulk density based on terrain parameters. The depth of the failure plane was assumed at the lower boundary of the stagnic soil layer or soil depth respectively, depending on soils being stagnic or non-stagnic. FS was determined in dependence of soil wetness referring to 0.001, 0.01, 0.1 and 3 mm h⁻¹ net rainfall rates. Sites with FS ≥ 1 at 3 mm h⁻¹ (complete saturation) were classified as unconditionally stable; sites with FS < 1 at 0.001 mm h⁻¹ as unconditionally unstable. Bulk density and the depth of the failure plane were regionalised with RF which performed better than RT. Terrain parameters explained the spatial distribution of soil bulk density and the depth of the failure plane only to a relatively small extent which is reasonable due to frequent translocation of soil material by landslides. Nevertheless, their prediction uncertainty still allowed for a reasonable prediction of unconditionally unstable sites.
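The stability classes named in this abstract follow from two threshold checks on the factor of safety, which can be written out as a short sketch. The abstract does not give the FS equation, so the function below takes already computed FS values as input. The function name and the label for sites that fall between the two classes are assumptions made for illustration.

```python
def classify_site(fs_at_3_mm_h, fs_at_0_001_mm_h):
    """Classify a site from its factor of safety at the wettest and driest rates.

    fs_at_3_mm_h:     FS at 3 mm/h net rainfall (complete saturation)
    fs_at_0_001_mm_h: FS at 0.001 mm/h net rainfall
    """
    if fs_at_3_mm_h >= 1.0:
        # Stable even when fully saturated.
        return "unconditionally stable"
    if fs_at_0_001_mm_h < 1.0:
        # Unstable even under the driest condition considered.
        return "unconditionally unstable"
    # Hypothetical label: stability depends on soil wetness.
    return "stability depends on wetness"
```

For example, a site with FS 0.8 at 3 mm/h and FS 1.3 at 0.001 mm/h falls into neither unconditional class.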
Ließ, M.; Glaser, B. & Huwe, B. (2009): Digital Soil Mapping in Southern Ecuador. Erdkunde 63, 309-319.
Abstract:
Soil landscape modelling is based on understanding the spatial distribution patterns of soil characteristics. A model relating the soil's properties to its position within the landscape is used to predict soil properties in other similar landscape positions. To develop soil landscape models, the interaction of geographic information technology, advanced statistics and soil science is needed. The focus of this work is to predict the distribution of the different soil types in a tropical mountain forest area in southern Ecuador from relief and hydrological parameters using a classification tree model (CART) for soil regionalisation. Soils were sampled along transects from ridges towards side valley creeks using a sampling design with 24 relief units. Major soil types of the research area are Histosols associated with Stagnosols, Cambisols and Regosols. Umbrisols and Leptosols are present to a lesser degree. Stagnosols gain importance with increasing altitude and with decreasing slope angle. Umbrisols are to be found only on slopes <30°. Cambisol occurrence might be related to landslides. The CART model was established by a data set of 315 auger sampling points. Bedrock and relief curvature had no influence on model development. Applying the CART model to the research area, Histosols and Stagnosols were identified as dominant soil types. Model prediction left out Cambisols and overestimated Umbrisols, but showed a realistic prediction for Histosols, Stagnosols and Leptosols.
Keywords: Ecuador | tropical montane forest | CART | GIS | soil-landscape modeling