Common Metadata for Climate Modelling Digital Repositories

 
Meetings
Date and place of Year 2 meeting
Last Modified: Wednesday, Jan 06/2010 15:33

The Month 24 all-partners meeting will be held in Paris on 9th, 10th and 11th February 2010.

Directions will be added to this page soon.

 Further information about the year 2 meeting can be found here.  

 
Metafor Questionnaire meeting - 14th-15th May 2009
Last Modified: Friday, 29 May 2009

Present: Laurent, Sebastien, Marie-Pierre, Sarah, Lois, Gerry, Mark M, Frank, Charlotte, Eric, Rupert, Bryan.

Summary of meeting:

(a) Things we clarified/agreed on (including why?, how?, who?)

-          Rules (244) for flattening the CV mindmaps (reviewed and amended)

-           Software CV mindmaps rewriting procedure for questionnaire (flattening according to 244 rules) [---> Marie-Pierre]

-           Flat mindmaps validation process [--->Rupert]

-           Maintaining 2 versions of the CV mindmaps (raw + flat) [--->Marie-Pierre]

-           Work flow from raw CV mindmaps to Questionnaire (development diagrams) [---> Lois]

-           Time line for Questionnaire development (from "Alpha phase" to "Production phase" and "after Questionnaire")

-           Info about Coupling in the Questionnaire (list of predefined fields + "get_data" procedure for the others)

-           Consistency between Bryan's UML for Questionnaire and CIM (will arise naturally)

-           Capturing References in the Questionnaire

-          Golden rule – don’t use different names for the same thing!! All have to agree to use the same vocabulary for talking about things.

-          The root is the climate model, e.g. HadGEM3, and is therefore the eldest parent. The root is the unique name of the thing you use to produce the data. The root can differ between experiments, but each root name should be unique.

-          Inherit common properties (answers to common concepts) until you need to change them.
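
The inheritance rule agreed above can be illustrated with a minimal sketch. It assumes a simple tree of components in which each node stores only the answers that were set on it directly; the class name, the method name and the "language" concept are hypothetical and are not taken from the CIM or the questionnaire code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    """A node in a model's component tree (hypothetical structure)."""
    name: str
    parent: Optional["Component"] = None
    answers: dict = field(default_factory=dict)  # answers set on this node only

    def resolve(self, concept: str):
        """Return the nearest answer for a concept, walking up to the root."""
        node = self
        while node is not None:
            if concept in node.answers:
                return node.answers[concept]
            node = node.parent
        raise KeyError(concept)


# The root is the climate model itself; the answer values are made up.
root = Component("HadGEM3", answers={"reusable": True, "language": "Fortran"})
atmosphere = Component("Atmosphere", parent=root)
radiation = Component("Radiation", parent=atmosphere, answers={"reusable": False})
```

Here atmosphere has no answers of its own, so it inherits everything from the root, while radiation overrides "reusable" and still inherits "language".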

 

       (b) Things we identified as needing more effort (including things that still need to be clarified or thought about)

-          Provide Activity (CMIP5Experiments) instances [---> Sarah + Charlotte]

-          Conformance between Activity and Software (CMIP5 requirements) [---> Sebastien]

     o    Identify "mandatory" elements in CMIP5 requirements

     o   Find how to ask question about conformance in the Questionnaire

-           Common concepts! Some progress was made during the meeting (e.g. "reusable"), but more work is needed.  [---> Bryan + Marie-Pierre]

-           Dynamical core of Atmosphere and ESG CV ingestion [---> Bryan + Marie-Pierre]

-           Genealogy and model provenance.

-           Validation process for hand-coded xml documents (e.g. the ones coming from the Met Office), i.e. the documents bypassing the "CV mindmap to questionnaire" processing/testing line.

-          Being able to report on the process of producing the questionnaire

-          How the different mindmaps will link together.

-          What content should go into the “add new experiments” page? [---> Sebastien]

-          How the questionnaire will be approved by the CMIP5 panel. 1 page document detailing how the questionnaire data will be used. [---> Eric]

-          Need to come up with a way of versioning and creating an activity diagram for the development process [---> Lois + Mark M.]

(c) Things to be done "after Questionnaire"

-          Linking variables with components (questionnaire post-processing) - Provide list of "controversial" variables (i.e. no CF-name)  [---> Gerry]

-          Produce an educational page (with metafor logo on) to allow the controlled vocabulary mind maps to be used for educating students.

More detailed notes from the meeting can be found in the document repository here

 
METAFOR Year 1 meeting - List of Actions, Decisions and Issues
Last Modified: Thursday, 19 February 2009

METAFOR Year 1 meeting

 

Cosener’s House, Abingdon Feb. 9-11 2009

 

List of Actions, Decisions and Issues

 1. Actions: 

·        CMIP5 questionnaire:

o   Develop the CMIP5 questionnaire list of requirements and constraints – initial version from mtg minutes to go on wiki (Eric)

o   Develop a sequence diagram/flowchart/storyline for the users’ path through the questionnaire – Laurent and Sebastien, based on the initial version by Bryan (see http://metaforclimate.eu/trac/wiki/CMIP5/Storyline). Arrange a telco with the USA to discuss the CMIP5 questionnaire storyline; Dean Williams from PCMDI will need to take part.

o   Sarah is to create an ingestion diagram, showing where the metadata can be ingested from and when (netcdf files, questionnaire etc)

o   Talk with Bob Drach regarding ESG Publisher – Bryan

o   Charlotte and Marie-Pierre to liaise with each other and other scientists (e.g. IPSL and GFDL) to fill in the gaps in the controlled vocabulary structure.

o   Arrange a dfn and telco session with Sylvia for her to go through the controlled vocabulary mind map with us, taking into account the results of the Dy-Core workshop – Sarah, Marie-Pierre

o   Put the mind maps on the Metafor site to allow visitors to interact with them – Sarah + BADC support

o   Marie-Pierre, Lois, Charlotte and Eric are to decide on a way to govern the mind maps for the controlled vocabulary, filling in the controlled vocabulary, rules for merging the maps etc. - due 20th Feb

o   Work on associated diagnostic variables – Bryan

o   Lois is to email Allyn with already built lists of controlled vocabulary – Lois 

o   Time line:

§  Database open no earlier than July 1st

 

§  ESG CV needs by July 1st (not too disruptive when no instances are there)

 

§  Population of database, Q3 2009

 

§  CV list ready by mid-April 2009

 

§  List of questionnaire requirements ready by end of Y1 mtg

§  So questionnaire ready by July 1st  

 

·        International scene and CMIP5 (including STAB recommendations):

o   We need to seek the endorsement of WGCM (and the CMIP panel), which may in fact already be implied, that METAFOR will lead (and assume primary responsibility) for obtaining experiment metadata from the modelling groups – Eric/Sarah

o   We are to better structure our communication with the CMIP5 panel and our colleagues in the USA. We should build a closer relationship between PCMDI and Metafor and encourage the dissemination and uptake of the CIM by the climate effects and climate modelling communities – All (Sarah/Eric to lead)

o   Metadata governance: Metafor should take the lead in developing the governance for the CIM. This could be by analogy with the CF committee that governs the controlled vocabulary – Eric/Karl

·        Working with Curator/ESG and related projects (includes dissemination actions):

o   Develop a schedule of regular teleconferences between Metafor and Curator to discuss priorities, issues etc. – Sarah

o   Finalise the MoU between Metafor and GENESI-DR – Bryan and Luigi

o   Investigate the other EC projects to determine if any have areas of common interest – Sarah

o   Produce a document on Metafor’s relationship to GEOS – Bryan

o   Do a survey of what other projects are around and what we could get from them to target our effort for cross project interactions. This should include identifying datasets and the interfaces to datasets – Sarah

 

·        CIM development:

o   Bryan is to give a copy of his old UML about CF to Allyn – Bryan

o   Build CIM instances – All

o   Take Marie-Pierre’s controlled vocabulary mind map and ingest it into the CIM – Allyn

o   Allyn and Bryan are to discuss with software package people how to translate controlled vocabulary lists into xml schema (blocker)

o   Allyn is to clean up the CIM with regards to the change property and add attributes to the document type – e.g. parent.

o   Allyn is to sort out the coupler concept and break up the simulation class to get rid of file component.

o   Phil to organise a telco on the subject of grids with Sylvia and Balaji

o   Sylvia is to email the Metafor list with the UML of the Curator grids package.

o   Present slides from the quality package at future telco - Mark

 

·        WP4/5/6 issues:

o   Allyn and Bryan to discuss the technology to use for portal searches – Allyn and Bryan

o   Demonstration of the Metafor portal to be given via telco and dfn – BADC

o   Work plans for WP 4, 5 and 6 should be decided on – due by next telco, Thurs 19th Feb

o   Arrange monthly telco to discuss WP 4, 5 and 6 – Sarah/Eric to ensure this happens

·        Management and dissemination:

o   Give the STAB access to the full Metafor site – Sarah

o   Eric to discuss with Metafor EU liaison about interactions with EU policy makers to be made by IS-ENES – Eric

o   Develop a video of  Metafor activities – Sarah

o   Timeline for the year 1 report:

§  Sarah to put template for year 1 report and “Form C” (for finances) onto the metafor site – due Feb 28th

§  Eric and Sarah to update metafor-admin list and find out about year 1 review meeting

§  WP leaders to coordinate production of 3-4 pages per WP

§  WP Leaders to provide technical reports to Sarah and Eric – due Mar 15th

§  Partners to provide form C (giving financial situation up to end Feb) to Sarah and Eric – due Mar 15th

§  Final report assembled – Eric and Sarah – due Mar 31st.

o   Sarah and Allyn  to coordinate about restructuring the wiki and website to make it easier to find things. They’re also to formulate a working procedure for tracking issues using the wiki and trac system, and inform the list of this procedure.

o   Everyone is to review their tickets, and create new ones as issues arise.

o   Sarah  is to mail the list with instructions on what to do if the website or Trac breaks

o   Everything on the website bar the EU-related documents (proposal, deliverables, reports,…) should be made public – Sarah

  

2. Decisions and information: 

·        CMIP5 metadata:

o   CMIP5 will provide Metafor’s real-world practical example.

o   Who governs changing the structure of the CIM in response to CMIP5? We do.

o   Metafor would like to recommend mandatory minimum content for CMIP5 metadata documents.

o   The CIM tells us how to structure the information we collect from the questionnaire. The questionnaire structure tells us how to collect that information. We need fully fleshed out sequence diagrams/storylines to inform the structure of the questionnaire.

o   We need to ingest metadata from the netCDF data files as well as the questionnaire.

o   We would like PCMDI to not allow datasets to be published until the metadata is complete. (This is still under negotiation, as is the definition of “complete”)

o   Links to other documents should not be included in the documents themselves, instead all documents should have unique identifiers that are linked by a third party register.

o   We need one coherent voice when speaking to the modelling groups about metadata. Hence the questionnaire can be branded CMIP5 (or another name) instead of Metafor without any problems.

o   The CMIP5 panel will govern CMIP5 controlled vocabulary.

o   For the CMIP5 questionnaire, there will be controlled vocabulary for the model name, which will be unique. Model name will provide a starting point for finding out how the model is configured.

o   CMIP5 data is expected to come in to the data centres no earlier than July 1st 2009. The data centres will need enough metadata to catalogue the incoming data, and will require the rest to publish the datasets.

o   For the list of diagnostic and prognostic variables of interest to scientists, we only need to concentrate on the ones in the CMIP5 output variables list.

o   For every model component we should ask about the output and input variables. It was pointed out that the Met Office (and other modelling groups) can produce this information in their own structure, but it will take work to translate that to the CIM.

o   We can discuss post-processing in the same way as discussing coupling. These topics will have a different set of questions in the questionnaire.

o   We only need to record coupling information at the level required for CMIP5.

o   For CMIP5 the change property is not needed as we only need to say that the modeller is basing a new instance on an old instance. Hence for CMIP5, change property should not be used.

o   Gridspec netcdf files will be submitted separately from the data files. Charles Doutriaux from PCMDI is writing code to harvest gridspec information from netcdf files into xml.

o   We can populate some of the grids package from gridspec netcdf files.

o   The BADC and WDCC will work on building empty questionnaires in parallel, one using GeoNetwork and the other building tools from scratch using Django. This will be dependent on the questionnaire sequence diagram/storyline. Once they’re built, all project members should test them.

  

·        Controlled vocabulary:

o   Modellers will maintain the constraints on the controlled vocabulary, not us.

o   Regarding controlled vocabulary, we need to distinguish control of Names and Values from Structure. Structure determines how you compose your vocabulary.

o   In Metafor, there is internally governed controlled vocabulary and externally governed vocabulary.

o   Freemind is a useful communication tool for interacting with other scientists.

o   The question posed to scientists when developing the controlled vocabulary mindmap was “What do you need to know to differentiate many models when looking at their output?”

o   The modelling group is expected to run the same model for all experiments in a given focus area, though this may not happen in practice.

o   Users of the questionnaire should be allowed to copy the information given for a different experiment and overwrite it to save as a new experiment, capturing the changes from the previous one without having to re-enter the information that hasn’t changed.

o   The CMIP5 controlled vocabulary should use the same characteristic/coefficient structure as proposed by Marie-Pierre. It’s possible for the controlled vocabulary to be hierarchical and we want this to be the case.

o   Some controlled vocabulary may be component-independent, e.g. Numerical component.

o   We need to get copies of all the existing (and soon-to-be-created) CMIP5 controlled vocabulary lists and ingest them into the CIM.

o   The controller of the mind map is to maintain the map’s versioning and comments.

o   In the short term, we will reconcile conflicting scientific information in the controlled vocabulary by making an executive decision ourselves. In the long term, this will be done by the (yet to be formed) standards committee.

o   Interact more with the climate modelling communities to get more potential user opinions on the CIM and the controlled vocabulary. People in charge of coordination:

§  Atmosphere: Marie-Pierre (to interview: Michel Dequé, Fred Hourdin, Marco Giorgetta, Met Office person, Bruce Wyman GFDL, Gary Strand NCAR cf. CCSM.ucar.edu/working_groups)

§  Ocean: Eric (to interview: Gurvan Madec, Steve Griffies, Met Office person, Helmut Haak MPI, NCAR)

§  Sea ice: Eric (to interview: LIM person LLN, David Salas, Andrew+Mike Winton GFDL, Helmut Haak)

§  Land surface: Marie-Pierre (to interview: Herve Douville, Jan Polcher, Sergey Malyshev GFDL, NCAR)

§  Atmosphere Chemistry: Lois + Charlotte (to interview: Met Office Graham + Fiona; GFDL Larry Horowitz, NCAR Gary Strand + Laurence Buja, Mainz people, KNMI Peter VVH, Vincent-Henry Puech)

§  Land ice: Marie-Pierre (to interview: Herve Douville, IPSL Geerhart Krinner)

§  Ocean bio geochemistry: Eric (to interview: Lauren Bopp, John Dunne, Helmut Haak)

§  River routing: Marie-Pierre (to interview: Stefan Hagemann, Bertrand Decharme, Jan Polcher)

 

·        CMIP5 questionnaire:

o    Requirements:

§  The questionnaire should be as flexible as possible, to make it easy for the users to use. It should also allow users to tell us things in a different order from our proposed question flow.

§  We never want to ask a user the same question twice. The questionnaire should only be changed/updated by us in such a way that new information can be added without users having to update the information they have already entered.

§  There should be restrictions on what can be entered as responses to the questionnaire. We should use drop down lists of controlled vocabulary where possible, with extra functionality to make it easy for users to navigate through long lists of controlled vocabulary. Free text feedback should be available to inform us if the drop down list is incomplete.

§  We should start with simple questions and get more complicated (progressive disclosure)

§  We need to capture Curator/ESG controlled vocabulary and keep compatibility with it.

§  Metafor can and will spend a couple of weeks correcting and quality controlling the metadata from the questionnaire. The questionnaire software should be able to flag questions with missing answers and guide the user to where those questions are.

§  The questionnaire needs to be all about CMIP5 and only about CMIP5. It has to include a way to collect the extra information the modellers may want to tell us that we haven’t asked about.

§  It should ask for more information than was provided for CMIP3.

§  It should satisfy the CMIP5 use cases and any information we ask for should produce CIM instances for the use cases we came up with.

§  It should allow the user to save a response, but not validate it, to allow the information to be added in more than one session (re-edit/save/update/increment functionality)

§  Users should be able to edit the questionnaire after submission to correct mistakes in the submitted metadata.

o   We shouldn’t rule out having a faceted browse questionnaire, as it allows a lot of flexibility (at the cost of having the questions as triples).

o   We will put the questionnaire together and then decide what questions should be mandatory.

o   In the cases of ensembles, the questions about the model used should be filled in once, while the changes in forcing can be filled in once for each run of the ensemble.
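
The requirements above on controlled vocabulary responses, saving without validating, and flagging missing answers can be sketched as follows. This is an illustration only, not the questionnaire software: the function name, the question ids and the vocabulary values are all hypothetical. A draft is stored as a plain dictionary, so it can be saved with gaps, and the check is run separately to guide the user to what still needs attention.

```python
def check_response(questions, response):
    """Return (missing, invalid) question ids for a saved draft.

    questions: dict mapping question id -> list of allowed values,
               or None when free text is accepted.
    response:  dict mapping question id -> the user's answer.
    """
    missing, invalid = [], []
    for qid, allowed in questions.items():
        answer = response.get(qid)
        if answer is None or answer == "":
            # Unanswered: flag it so the user can be guided to the question.
            missing.append(qid)
        elif allowed is not None and answer not in allowed:
            # Answered, but not with a term from the controlled vocabulary.
            invalid.append(qid)
    return missing, invalid


# Hypothetical questions and vocabulary, for illustration only.
questions = {
    "ocean_grid": ["regular", "tripolar", "other"],
    "sea_ice_scheme": ["thermodynamic", "dynamic", "other"],
    "extra_comments": None,  # free text, e.g. feedback that a list is incomplete
}
draft = {"ocean_grid": "hexagonal", "extra_comments": "hexagonal grid not listed"}
missing, invalid = check_response(questions, draft)
```

For this draft, "sea_ice_scheme" is reported as missing and "ocean_grid" as invalid, while the free-text comment is accepted as given.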

 

·        CIM requirements:

o   A future objective is to have the CIM applicable to other areas of science, e.g. forecasting, reanalysis etc.

o   With regards to the CIM, the following requirements were identified (in collaboration with the climate modelling community):

§  Only describe configured models

§  Change property should be able to add a new component

§  Model properties should only be prognostic variables

§  Detailed sequencing of the model is not needed

§  Metadata from the netCDF files should be preserved and exposed in the CIM

o   It’s important to get the controlled vocabulary lists urgently, but it’s not so urgent to get the controlled vocabulary into the CIM.

o   The change property is now to be considered as a holding place as we’re not going to implement it for the moment.

o   A coupler binds input and output data streams of components either to each other or to a physical file or parameter list. The data stream will have a type. The coupling connection should always be specified at the level of the parent of the two child components being coupled.

o   The key question for the use cases for Metafor is “what do people want to search on?”
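
The coupler decision above (a typed data stream, recorded at the level of the parent of the two coupled components) might be sketched like this. The classes and the couple function are hypothetical and do not reflect the CIM UML; file and parameter-list endpoints are left out, and "parent" is interpreted here as the closest component that contains both ends, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare components by identity, not by field values
class Component:
    name: str
    parent: Optional["Component"] = None
    couplings: List["Coupling"] = field(default_factory=list)


@dataclass
class Coupling:
    source: Component   # component producing the data stream
    target: Component   # component consuming the data stream
    stream_type: str    # the type carried by the data stream


def ancestors(component):
    """Return the component followed by its parents, up to the root."""
    chain = []
    while component is not None:
        chain.append(component)
        component = component.parent
    return chain


def couple(source, target, stream_type):
    """Record a coupling on the closest component containing both ends."""
    source_chain = ancestors(source)
    owner = next((a for a in ancestors(target) if a in source_chain), None)
    if owner is None:
        raise ValueError("components do not belong to the same model")
    coupling = Coupling(source, target, stream_type)
    owner.couplings.append(coupling)
    return coupling


model = Component("model")
atmosphere = Component("atmosphere", parent=model)
ocean = Component("ocean", parent=model)
link = couple(atmosphere, ocean, "heat_flux")
```

The coupling between atmosphere and ocean ends up stored on model, their common parent, rather than on either child.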

  

·        The end users of the Metafor project are:

o   Climate modellers

o   Climate effects community – professional data users who use climate data (e.g. hydrologists, ecologists) and will use the model outputs

o   Climate impact community – policy makers and resource managers

o   For the second and third group, the usage will depend on the tools we create and the ease of navigation. We expect that the scientists will see different metadata than the policy makers. The policy makers will be dealt with by IS-ENES to a large extent.

o   The interactions with these groups will affect the structure of the metadata. Metafor should be aware of this, but not try to solve it.

 

·        Management and dissemination:

o   We agreed to use the joomla site for storing admin documents, meeting presentations etc, and use the wiki and trac for all working documents.

o   All issues need to be turned into tickets and reviewed regularly.

 

3. Issues:

·        How deep does the CIM need to go for climate modelling/ being field-specific as opposed to being generalised to include other fields?

·        Change structure (resolved Wed)

·        CF information model

·        Ingestion from data to CIM (THREDDS etc and ESG Publisher)

·        Access control (EGEE versus ESG/NDG)

·        Distributed database model

·        Controlled vocabularies and UML (discussed during meeting but on-going)

·        The BADC have a prototype Metafor portal website, but need instances of the CIM to build the infrastructure around.

·        What do we include in the controlled vocabulary with regards to external dictionaries (e.g. CF standard names?)

·        How much detail do we need to ask for in the questionnaire?

·        Should there be mandatory questions in the questionnaire?

·        When Metafor is finished, we need to have some governance of the CIM in place. How do we arrange this?

·        How can Metafor and Curator touch base regularly to discuss priorities etc?

·        We need to interact more closely with modellers from now on to communicate about controlled vocabulary, train them about the CIM and get their feedback.

·        How should we interact with ESA and where should we put our resources for interaction?

·        Is there a universal structure for controlled vocabulary we could define and work on, or is it just a flat list with the structure associated with a given activity (e.g. CMIP5)?

·        Regarding Marie-Pierre’s work on translating controlled vocabulary into the CIM: discussion needs to continue regarding the fact that characteristic/coefficient/property are all very similar from the UML point of view.

·        We need a procedure to map the information we get from the questionnaire onto the CIM.

·        How do we make links between the model data and ancillary files?

·        How do we ask about coupling information in the questionnaire?

·        How do we create the repository that links CIM documents together?

·        How do we use the quality package? (The structure of the package is detailed in Mark’s presentation)

 

Next telco: Thurs 19th February, 9.30 BT/10:30 CET

 

Notes by Eric Guilyardi and Sarah Callaghan, 17th February 2009

 

 

 
FullMoon Workshop
Last Modified: Thursday, 29 January 2009

Friday 13th February 10:30 -15:30 GMT
Rutherford Appleton Laboratory

Simon Cox has offered to run an informal workshop with a small group of Metafor folk about the capabilities and requirements of the FullMoon framework for the auto-generation of xml schema from (enterprise architect) UML.

Participants are requested to register with the doodle poll that has been set up for the workshop: here

Remote attendance will be possible through the DFN.

 

 
How to get to Coseners House
Created: Thursday, 15 January 2009

Directions to get to Coseners House can be found at the STFC Coseners House webpage

The Cosener's House,
Abingdon,
Oxfordshire
OX14 3JD

Telephone: +44 1235 523198
Fax: +44 1235 534160

Online map
 