Metadata for Scientific Data in China: an Overview

Hou Yanfei

Authors: Qing Li, Yanfei Hou

The term metadata is used differently in different communities. In the field of scientific data, metadata commonly refers to any formal schema that summarizes data content, context, structure, inter-relationships, and provenance, and it applies to any data entity arising from scientific research activities, including observations, experiments, simulations, models, and higher-order assemblies.

Metadata for scientific data has been studied in China since the mid-1990s. This paper first introduces representative projects on the development and implementation of metadata for scientific data in China. It then presents a qualitative and quantitative analysis of journal articles on metadata for scientific data that were published in China during 1996-2005 and are collected in “CNKI”. From this analysis the authors draw the following conclusions: (1) The importance of metadata has been recognized in many domains, including geoscience, biology, medicine, ecology, agriculture and others. (2) There are two main types of metadata schemas for scientific data: one is developed to describe datasets, and the other is designed to support the exchange and integration of data from a wide variety of databases, acting as a top-level structure for dataset collections that contain independent dataset objects. (3) Although many metadata schemas have been developed, most of them are not used effectively or to their full potential, for many reasons. (4) The topics of metadata interoperability, metadata quality, and metadata schemas designed to support data exchange and integration are drawing increasing attention.

Keywords: metadata, scientific data