WP4/URIstructure - The Metafor URI structure(s)

Status of this doc: V0.2 (in progress)

This page covers two issues: URI structures for metadata and, for lack of anywhere else to put it, what we know about plans for URIs in the context of CMIP5.

Metadata URIs

We clearly need a URI structure on which to build metafor (or any other) services. Our original proposal anticipated using OAI-PMH to link things together, but times have moved on and it would be better to use Atom feeds.

From a server point of view, then, we should expect to expose something that looks like

hostname+path/search
hostname+path/entry/repository_id__schema__local_id.format1
hostname+path/feeds/activities.format2
hostname+path/feeds/models.format2
hostname+path/feeds/dataObjects.format2

where

  • hostname+path is obviously where the server is exposing these services
  • search exposes a search interface
  • entry exposes individual metafor documents (in a variety of formats, see below)
    • note that unique identifiers allow these documents to be moved around and served from multiple portals. This is done by using a URI structure for the entry of the form repository_id + schema + local_id + format. Potentially we could also overload the URI with information about the package in which the document lives (but it shouldn't be necessary). Using schema in this URI allows us to support multiple metadata schemas in the same portal without having to introspect the contents (yes, I do know this breaks REST, so we may want to revisit it later, but it will be useful in an evolving development situation like the one we find ourselves in).
  • feeds exposes an Atom feed of summaries and links to individual entries, so that changes and additions to the repositories are exposed for harvest (via a RESTful HTTP GET) into whatever portals need the information (and into the individual feed readers of scientists and developers).
  • activities, models, dataObjects are collections of entries of the same type (we might well have collections of grid descriptions too, and maybe other things if we proliferate our top level packages).
  • format1 could be
    • xml, in which case you would get a raw xml document conforming to the metafor schema
    • atom, in which case you would get an xml document either enclosed or pointed to via an atom entry document (and we'd need to describe how to use the atom elements appropriately), and
    • html, in which case you would get an entry styled by the server for human interaction
  • format2 could be
    • atom, in which case you'd get a proper atom feed, or
    • html, in which case you'd get a rendered feed, showing changes to a particular collection, styled by the server.
  • (We could even allow both format1 and format2 to support a .iso extension for a vanilla iso version of the document and collection metadata.)
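
The entry naming scheme and the feed harvest described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behaviour of any existing server: the function and class names are hypothetical, the example host, repository id, schema name and local id are invented, and it assumes that repository_id and schema never contain the "__" separator (local_id may).

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple

ATOM_NS = "{http://www.w3.org/2005/Atom}"   # namespace of the Atom syndication format
ENTRY_FORMATS = {"xml", "atom", "html"}     # "format1" in the text above
SEP = "__"                                  # separator used in entry names


class EntryId(NamedTuple):
    """Hypothetical container for the parts of an entry URI."""
    repository_id: str
    schema: str
    local_id: str
    fmt: str


def build_entry_uri(base: str, entry: EntryId) -> str:
    """Build hostname+path/entry/repository_id__schema__local_id.format1."""
    if entry.fmt not in ENTRY_FORMATS:
        raise ValueError("unsupported entry format: %r" % entry.fmt)
    name = SEP.join([entry.repository_id, entry.schema, entry.local_id])
    return "%s/entry/%s.%s" % (base.rstrip("/"), name, entry.fmt)


def parse_entry_uri(uri: str) -> EntryId:
    """Recover the identifier parts from the last path segment of an entry URI."""
    last_segment = uri.rsplit("/", 1)[-1]
    name, dot, fmt = last_segment.rpartition(".")
    if not dot or fmt not in ENTRY_FORMATS:
        raise ValueError("missing or unsupported format in %r" % uri)
    # Assumption: only local_id may itself contain the separator.
    parts = name.split(SEP, 2)
    if len(parts) != 3:
        raise ValueError("expected repository_id__schema__local_id in %r" % uri)
    return EntryId(parts[0], parts[1], parts[2], fmt)


def entry_links(feed_xml: str) -> List[str]:
    """List the link hrefs of every entry in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    links = []
    for entry in root.iter(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            href = link.get("href")
            if href:
                links.append(href)
    return links
```

A harvesting portal would fetch hostname+path/feeds/models.atom with a plain HTTP GET, pass the body to entry_links, and use parse_entry_uri on each link to decide (from the schema part) how to handle the document without opening it.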

A prototype example

To show how this works, BADC has built an example server, exposing NumSim-formatted experiment and model records, with a very simple search facility (full-text search). (The initial version does not yet include feeds.)

This would meet some interim goals for ndg, metafor, and ceda.

Note that in the long term (six months) this portal should begin to be as functional as the ESG portal, and we need to consider how we might swap documents.

File URI structures