C12 Climate indicators on the local scale for past, present and future, and platform data management.
PI(s) for this project:
Dr. Felix Matt
Prof. Dr. Jörg Bendix
Prof. Dr. Katja Trachte
Predicting future climate change is inherently difficult, especially in ecosystems as complex as the Andean mountain rain forest, the dry forest and the Paramo. The common tools for simulating global climate change are global circulation models (GCMs). Because of their coarse resolution, they cannot capture the atmospheric processes that shape the local climate. For this reason, a dynamical downscaling approach will be used to develop a Climatic Indicator System with high spatial and temporal resolution (hrCIS), from which ecologically relevant climate change indicators for the ecosystems of South Ecuador will be derived. A local limited-area model (LAM) will be used to generate (i) a highly resolved gridded climatology for the present day (hrCISpr) based on reanalysis data and (ii) a highly resolved gridded climatology for the projected future (hrCISpf) based on the new Representative Concentration Pathways (RCP) scenario data. The present-day LAM output will be validated against in-situ measurements and satellite-derived products to ensure that the model is accurate enough for the simulations of the projected future. Changes in climate indicators such as air temperature and precipitation regime will be described on the basis of a statistical analysis of both climatologies.
The proper storage, curation and accessibility of environmental data are of crucial importance for global change research, particularly for monitoring purposes. This proposal will provide an adequate data management system for the Platform for Biodiversity and Ecosystem Monitoring and Research. This will be achieved by extending the web-based information management system FOR816DW (a data warehouse for collaborative ecological research units) with features such as automatic upload interfaces, a workbench for integrative analysis and a user-defined alert system, which will facilitate environmental monitoring for scientists as well as stakeholders. Besides the development of these innovations, a main objective is the transfer of knowledge and information (know-how, source code and collection data) to our partners in Ecuador. For this purpose, and to bring together the existing data sources, we cooperate with university and non-university parties in the joint establishment of a data access platform for environmental data of the region. This will include considerations on long-term accessibility, which is envisaged through a data transfer to the planned German national data infrastructure GFBio.
Fig. 1: Embedding of the proposed project data base (light red) in the data flow of the platform. New features (Web services, XML exchange, raster data base, WCS, workbench and alert system) will be added to the FOR816DW technology (described in the WPs of PI Bendix).
This project aims to:
- extend the FOR816DW towards a monitoring database
- serve as a project database for all German subprograms
- consider interoperability with the German initiative GFBio
- develop interoperability with the databases of the Ecuadorian non-university and university partners
- guarantee access to the monitoring data.
Figure 1 shows the structure of data management on the platform. All non-university partners (ETAPA, NCI, Gestion Zamora and FORAGUA) already operate databases with different levels of complexity and standardization. This project operates the database for the German subprograms, while the UTPL operates the project database for the SENESCYT program based on the FOR816DW technology. They will additionally develop an access platform to facilitate proper interoperability between the partner databases and external stakeholders such as MAE. This project and UTPL conduct training workshops for all partners to establish common specifications. The local partners guarantee continued use after the end of the funding period. The perspective for German research lies in the possibilities for future integrative global change research based on data mining technology and change analyses over longer time periods, which have hitherto hardly been possible in the Andean biodiversity hotspot.
The following tasks are executed by C12:
Administration, user support and curation
- server maintenance and data backup
- online help and annual workshops to support users in data preparation and system usage
- data and metadata curation
- evaluation of technical requirements of the Ecuadorian project partners and their data bases for the interlink development.
Web Services and exchange standards
- Enhancement of the interoperability of the data management system with other infrastructures
- connection of automatic measurement stations
- frequent observation data documented in collection templates
- transport of metadata and data to other data repositories (e.g. GFBio and the Ecuadorian data access platform).
- Integration of automated XML based data upload and access.
- Development of specific methods and protocols in cooperation with the providers and consumers
- fostering the Ecological Metadata Language (EML, http://knb.ecoinformatics.org/software/eml/) as the metadata standard.
- metadata mapping to further metadata standards (GFBio portal) by means of the Extensible Stylesheet Language Transformations (XSLT).
- Definition of data sets, survey templates (prepared Excel sheets) for the import of manually collected field data as well as standards in measurement and collection design in cooperation with the users.
Data processing and alert system
The data management system provides a workbench to define complex analyses and run them regularly on the latest measurement and collection data. The results can be scanned automatically for threshold exceedances or patterns, and alerts are sent to the users by e-mail. An example is the automatic analysis of temperature time series with regard to thermal extremes. The incoming data will be processed automatically, and an alert will be issued if the temperature leaves a defined range (here defined as a deviation of more than ± 1 standard deviation). Temperature data could also be coupled to water quality data so that an alert is sent when both parameters show a significant trend during a defined time interval.
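The thermal-extreme check described above can be illustrated with a minimal Python sketch. It is not the project's implementation (which is planned in R); it assumes that the ± 1 standard deviation range is centred on the mean of a reference series, and the function names, the station label and the sample values are hypothetical.

```python
from statistics import mean, stdev


def find_thermal_extremes(baseline, incoming, n_std=1.0):
    """Return (index, value) pairs of incoming readings that fall outside
    mean(baseline) +/- n_std * stdev(baseline).

    Assumption: the alert range is centred on the mean of a reference
    (baseline) series; the source text only states "+/- 1 standard deviation".
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    lower = mu - n_std * sigma
    upper = mu + n_std * sigma
    return [(i, t) for i, t in enumerate(incoming) if t < lower or t > upper]


def format_alert(extremes, station="hypothetical-station"):
    """Build the text of an alert message, or return None if nothing was flagged."""
    if not extremes:
        return None
    lines = [f"Thermal extremes detected at {station}:"]
    for index, value in extremes:
        lines.append(f"  reading #{index}: {value:.1f} degC")
    return "\n".join(lines)


# Hypothetical air temperature readings in degrees Celsius.
baseline = [14.0, 15.0, 16.0, 15.0, 14.0, 16.0]
incoming = [15.2, 17.1, 13.0, 15.5]

message = format_alert(find_thermal_extremes(baseline, incoming))
if message is not None:
    print(message)  # a real system would send this text by e-mail instead
```

A scheduler would call such a check each time new data arrive; coupling with a second parameter such as water quality would add a further condition before the message is built.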
The implementation will be designed as a client-server architecture integrated into the FOR816DW system. For the server-side calculations, the widely used statistical software R (http://www.r-project.org/) will be used. It will be technically accessible via the web service package Rserve (http://www.rforge.net/Rserve/index.html). Basic user access will be provided within the website by integrating the module R-Node (http://www.squirelove.net/r-node/doku.php) as a browser-based command line interface. Stored scripts will be run by a server-side scheduler and can additionally be executed manually via a virtual toolbox in the browser. This allows scientists and stakeholders alike to execute complex monitoring analyses.
Spatial raster database
For analyses of spatial raster data within the web-based data management system, the data must be stored so that they are accessible to the server application. Recent approaches store the binary raster data within a database and allow analysis operations inside the database system with excellent performance. Such a database system (XXL DB, http://xxl.googlecode.com/) is being developed at the faculty of informatics at the University of Marburg by the working group of Prof. Seeger and will be applied within the national GFBio data management system. In a first step, this work package will implement XXL DB as the storage unit for raster data within the data management system.
Then a user interface will be developed to define spatial queries. These queries will be sent to the database, and the results will be presented in the browser or delivered as raster files. The communication of the request and response will be handled by a web service. This web service will be defined and developed in accordance with the OGC Web Coverage Service (WCS) standards (www.opengeospatial.org/standards/wcs). The resulting architecture may be regarded as a first prototype of a WCS for the XXL DB.
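To illustrate what a spatial query sent to such a web service could look like, the following Python sketch assembles a WCS 2.0 GetCoverage request in key-value-pair form. It is only a sketch under stated assumptions: the endpoint URL, the coverage identifier and the axis labels (Lat, Long) are hypothetical, and the actual interface of the planned service is not specified in this project description.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getcoverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Assemble a WCS 2.0 GetCoverage request URL (KVP encoding).

    subsets maps an axis label to a (low, high) pair. The axis labels
    depend on the coverage offered by the server and are assumptions here.
    """
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
    ]
    # One "subset" parameter per axis trims the coverage to a spatial window.
    for axis, (low, high) in subsets.items():
        params.append(("subset", f"{axis}({low},{high})"))
    params.append(("format", fmt))
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage name; the window is purely illustrative.
url = build_getcoverage_url(
    "https://example.org/wcs",
    "air_temperature",
    {"Lat": (-4.1, -3.9), "Long": (-79.2, -79.0)},
)
print(url)

# Decode the query string again to inspect the spatial window that was requested.
print(parse_qs(urlsplit(url).query)["subset"])
```

The user interface would build such a request from the query defined in the browser, and the service would answer with a raster file in the requested format.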
Knowledge transfer and software release
We aim to transfer both knowledge and responsibility to the Ecuadorian partners. We support the implementation of the current release of the FOR816DW system at the UTPL/Loja, both as the project database for the SENESCYT bundle and as a means to safeguard existing data of the biological department that are currently stored in unorganized files. The support includes the initial setup of the system (remote online support) and an on-site training workshop for the local data manager and data providers. This will be the starting point for a collaborative development of the system as an open source project organized within the project hosting of Google Code (http://code.google.com). The final goal is to replicate the monitoring data stock and server architecture on a hardware system at the UTPL/Loja to guarantee robust and permanent access to all data independently of the German project.
Safeguarding the long-term data access
Safeguarding access to research data beyond the end of funding is an important objective of the project. The guidelines of the Reference Model for an Open Archival Information System (OAIS, http://public.ccsds.org/publications/archive/650x0b1.pdf) will be taken into account. Ideally, all data sets are properly prepared for migration from the project domain to the public domain, but experience has shown that a curation process is needed at this boundary. The main work in this package will be the preparation and transfer of the gathered data to the GFBio long-term data infrastructure.