Data Management for Environmental Informatics - An Irish Research Perspective

Peter Mooney and Adam C. Winstanley, National Center for Geocomputation, Ireland


The Environmental Research Center (ERC) within the Irish Environmental Protection Agency (EPA) has developed a large-scale computer-based environmental data management system (EDMS). This EDMS manages data produced primarily by university researchers funded by the EPA and other environmental institutions. All research project data and metadata must be supplied to the EDMS, where they are archived, transformed to standard formats, and distributed to other stakeholders and users for further research. The full paper will describe how issues of standardisation, interoperability, and metadata supply are being overcome. It will also address how this EDMS will become a building block of larger international spatial data infrastructures (SDI), such as the European SDI proposed by the INSPIRE Directive.

Details

Environmental Informatics involves the delivery of quality-assured, properly documented (using metadata), and easily accessible datasets (as products of environmental monitoring, survey, aggregation, modelling, etc.). Delivery of these products leads to the provision of information and knowledge-based services to end-users. The heterogeneity of data providers in this environmental monitoring and research domain, coupled with the diversity of data needed to satisfy a wide community of users, means that interoperability for data integration and data exchange is difficult to achieve. However, by allowing data providers and stakeholders to retain a high degree of autonomy while adopting common documentation and data practices, access to and use of their individual research outputs are maximised. This work has encountered interoperability issues at several different working levels:

1. Measurement/monitoring level - consistency of observational methods, measurement accuracy, limits of detection;
2. Models and data analysis level - robustness of statistical procedures used in analysis, data analysis algorithms;
3. Metadata level - data attributes described, standards used, types of descriptions.
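As an illustration of how the three levels might be made explicit in a single metadata record, the following is a minimal sketch in Java. The class name `DatasetMetadata` and all field names are assumptions chosen for this example; they are not taken from the EDMS schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical metadata record; names are illustrative, not the EDMS schema. */
class DatasetMetadata {
    // Assumed required fields, grouped by the three levels listed above.
    static final String[] REQUIRED = {
        "observationMethod", "limitOfDetection",    // measurement/monitoring level
        "analysisProcedure",                        // models and data analysis level
        "metadataStandard", "attributeDescription"  // metadata level
    };

    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Stores one field value and returns this record so calls can be chained. */
    DatasetMetadata set(String name, String value) {
        fields.put(name, value);
        return this;
    }

    /** Returns the required fields that are absent or blank, in declaration order. */
    List<String> missingFields() {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = fields.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** True when every assumed required field has a non-blank value. */
    boolean isComplete() {
        return missingFields().isEmpty();
    }

    public static void main(String[] args) {
        DatasetMetadata record = new DatasetMetadata()
            .set("observationMethod", "grab sample")   // sample value only
            .set("limitOfDetection", "0.01 mg/l");     // sample value only
        System.out.println("Missing: " + record.missingFields());
    }
}
```

A record of this kind lets a provider see which level of documentation is still missing, rather than receiving a single pass or fail result.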

To address these issues the EDMS incorporates a number of important components:

1. A fully J2EE-compliant front-end developed using standalone Java data processing programs, Java Servlets, and Java Server Pages (JSP);

2. All metadata is stored in a MySQL database, fully separating the metadata from its visual presentation. The metadata can be displayed using XSLT for sophisticated layouts, or using JSP or Java Servlets for simple table-based display;

3. Researchers can upload datasets using an HTTP or FTP interface to their private project area within the EDMS;

4. Deviation towards "light-metadata" (Taylor and Jones 2003) or "short-cut metadata" (Nerbert 2004) is strongly discouraged. These are approaches in which a type of 'minimum metadata' is implemented to encourage data providers to support the generation of metadata. Using either a web-based metadata submission interface or the standalone Java application, users are encouraged to provide detailed metadata descriptions before they are permitted to upload data to the EDMS;

5. Data providers are encouraged to supply metadata and dataset resources by the regular publication of EDMS usage statistics: the number of hits on the metadata web server, the volume of data downloaded, the most popular downloads, etc.
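The separation of stored metadata from its presentation, described in component 2, can be sketched with the standard Java XSLT API (`javax.xml.transform`). This is a minimal illustration only: it assumes the metadata has already been read from the database and serialised as XML, and the element names (`metadata`, `field`), the stylesheet, and the class name `MetadataRenderer` are hypothetical rather than the EDMS implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical renderer: applies an XSLT stylesheet to a metadata XML document. */
class MetadataRenderer {
    // Assumed stylesheet: turns each <field name="..."> element into one table row.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/metadata'>"
      + "<table><xsl:for-each select='field'>"
      + "<tr><td><xsl:value-of select='@name'/></td>"
      + "<td><xsl:value-of select='.'/></td></tr>"
      + "</xsl:for-each></table>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms the metadata XML into an HTML table using the stylesheet above. */
    static String toHtml(String metadataXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(metadataXml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new IllegalStateException("Metadata transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample document with illustrative values only.
        String xml = "<metadata>"
                   + "<field name='title'>River water quality</field>"
                   + "<field name='standard'>ISO 19115</field>"
                   + "</metadata>";
        System.out.println(toHtml(xml));
    }
}
```

Because the stylesheet is held apart from the data, the same stored record could be given a different layout by swapping the stylesheet, without touching the database.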

Barriers to the free exchange of information are by no means limited to these technical causes, nor are they always amenable to technical solutions. Many problems can be ascribed to institutional and human factors, often reflecting commitment to existing local or narrowly thematic information systems (Wyatt et al. 2004). Another important aspect going forward is ensuring that non-IT management understands the potential value held in metadata. The full paper will also describe how the interoperability guidelines of the OGC (Open Geospatial Consortium) are being implemented in this EDMS.

REFERENCES
RDM Working Group. "INSPIRE: Reference Data and Metadata Position Paper" (RDM PP v4-3 en). EUROSTAT, October 2002.
Nerbert, D. (Editor). "Developing Spatial Data Infrastructures: The SDI Cookbook", Version 2.0. Global Spatial Data Infrastructure (GSDI), January 2004.
Wyatt, B.K., Briggs, D.J., and Ryder, P. "Building a European Information Capacity for Environment and Security". GMES Action Plan (2002 - 2003). European Commission, May 2004.


Keywords: Environmental Data, Interoperability, Metadata, Data Exchange