Common Metadata for Climate Modelling Digital Repositories

 
Objectives and Principles
Last Modified: Thursday, Mar 06/2008 13:59

METAFOR’s objectives are to define a Common Information Model (the CIM) to describe models and data; to use the CIM to develop and deploy a “one-stop-shop” infrastructure that allows models and data to be shared; and to develop tools that translate between existing standards and the CIM and that perform useful tasks with CIM-described models and data. What distinguishes the CIM from other standards is that it incorporates those standards rather than replacing them. This requires a large coordination effort that can only be achieved at the European (and wider) level.

Two guiding principles for this activity will be to maintain an appropriate “separation of concerns” and to support the governance of the CIM and its component information classes. Adhering to these two principles should avoid irreversible mixing of classes with different heritages. The CIM will allow some classes to continue to evolve under external governance, while maintaining strict version control within the CIM. In practice this will mean either identifying existing information models (such as those discussed in “state-of-the-art” above) or developing new CIM information models, each of which will be aimed at a different part of the model and data descriptive hierarchy.

METAFOR does not plan to develop a completely new standard. Because “phasing out” existing standards is impractical (not to mention unpopular), a new standard would be unlikely to gain the momentum required to bring about long-term benefits. Rather, METAFOR aims to integrate existing standards, filling in the gaps as required, so that existing repositories using potentially different “native” metadata formats can interoperate via the CIM.

Existing metadata standards are likely to have some missing concepts, some overlapping concepts, and some idiosyncratic concepts not required by the wider community. Even so, integrating them through the CIM is still a very powerful model. Consider, for example, the following scenario:

Dataset d is described by CF metadata in an existing repository. That repository is made CIM-compatible. User A, who knows nothing about CF, browses the repository and finds d because the relevant parts of the CF metadata have been extracted into the CIM. User A can now use CIM tools on d. User B, who knows a lot about CF, also finds d (in exactly the same way). User B downloads d and runs some tools on it which know about CF but not necessarily about the CIM; those tools are outside the scope of METAFOR.

This type of scenario could occur with any one of the existing standards that the CIM will incorporate.
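
The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not METAFOR's design: the CIM field names (`quantity`, `units`, `processing`), the `CimRecord` class, and the `cf_to_cim` and `search` helpers are all hypothetical. Only the CF attribute names (`standard_name`, `units`, `cell_methods`) come from the CF conventions.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from CF attribute names to illustrative CIM field names.
CF_TO_CIM = {
    "standard_name": "quantity",
    "units": "units",
    "cell_methods": "processing",
}


@dataclass
class CimRecord:
    """Illustrative CIM-style record; not an actual CIM class."""
    dataset_id: str
    fields: dict = field(default_factory=dict)           # concepts shared via the CIM
    native_format: str = "CF"                            # heritage of the metadata
    native_metadata: dict = field(default_factory=dict)  # kept intact, not replaced


def cf_to_cim(dataset_id, cf_attrs):
    """Extract the mapped CF attributes and keep the native metadata alongside."""
    extracted = {CF_TO_CIM[k]: v for k, v in cf_attrs.items() if k in CF_TO_CIM}
    return CimRecord(dataset_id, extracted, "CF", dict(cf_attrs))


def search(records, **criteria):
    """Find records by CIM fields only; no knowledge of CF is needed."""
    return [r for r in records
            if all(r.fields.get(k) == v for k, v in criteria.items())]


cf_attrs = {
    "standard_name": "air_temperature",
    "units": "K",
    "comment": "repository-specific note",  # idiosyncratic, not mapped
}
record = cf_to_cim("d", cf_attrs)

# User A and User B both find d through the same CIM-level search.
hits = search([record], quantity="air_temperature")

# User B can still hand the untouched CF attributes to CF-aware tools.
cf_for_user_b = hits[0].native_metadata
```

The design point is that the native CF metadata is carried along unchanged, so the record incorporates the existing standard rather than replacing it.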

Metadata capture should happen at all stages of the production and use of simulation data. Thus, the CIM will associate data descriptions with descriptions of the models and processes employed to generate that data. Model descriptions will include information about the component model source code, the parameterisation of component models, the way they couple with other component models and with forcing data (their “composition”), the grids upon which their data is discretised, the way they are deployed and run on computing resources, any post-processing that is applied to the data after it is generated (and potentially after it is archived), as well as a researcher’s intent in running the simulation. This association is currently missing from data archives. Keeping such an association is good scientific practice; it allows results to be reproduced if required, more informed comparisons between datasets to be performed, and more efficient searching for particular types of data.

Agreement on a common standard to encompass the entire modelling process is very difficult, since the community is so broad and developments advance faster than standards can keep pace. But building the CIM on a smaller set of standards, and providing a clear governance policy, will enable it to retain flexibility and react quickly to the needs of the community.
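
As a rough illustration of the association between data descriptions and model descriptions, the sketch below links a dataset to the simulation that produced it. Every class, field, and value here is hypothetical and chosen only to mirror the items listed above; none of it is taken from the CIM itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ComponentModel:
    """Hypothetical description of one component model."""
    name: str
    source_code: str                                   # e.g. a version tag
    parameters: Dict[str, float] = field(default_factory=dict)
    grid: str = ""                                     # grid the data is discretised on


@dataclass
class Simulation:
    """Hypothetical description of how the components were composed and run."""
    components: List[ComponentModel]
    couplings: List[Tuple[str, str]]                   # (source, target) component names
    forcing_data: List[str]
    deployment: str                                    # computing resource used
    intent: str                                        # researcher's purpose


@dataclass
class DataDescription:
    """Hypothetical data description that keeps a link to its provenance."""
    dataset_id: str
    produced_by: Simulation
    post_processing: List[str] = field(default_factory=list)


def datasets_using(datasets, component_name):
    """Return ids of datasets whose simulation included the named component."""
    return [d.dataset_id for d in datasets
            if any(c.name == component_name for c in d.produced_by.components)]


# Illustrative values only.
ocean = ComponentModel("ocean", "v1.0", {"mixing_coefficient": 0.1}, "grid-A")
atmos = ComponentModel("atmosphere", "v2.3", {}, "grid-B")
run = Simulation(
    components=[ocean, atmos],
    couplings=[("ocean", "atmosphere")],
    forcing_data=["example forcing file"],
    deployment="example cluster",
    intent="example sensitivity study",
)
d = DataDescription("d", run, ["monthly mean"])

matching = datasets_using([d], "ocean")
```

With the link in place, a search such as `datasets_using` can select data by the properties of the model that generated it, which is the kind of more efficient searching the text describes.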