tidy the CIM (as per the Y1 meeting)

This page provides documentation for ticket:215.

(note: while I'm working I'll tag action items in bold)

changes to think about:

  1. As a point of style, all <<reference>> attributes should use their native datatypes
  1. Hans wonders how to reference external documents (i.e. how to record a relationship between different CIM Instances). He started by adding a new "cimrecord_id" attribute. All <<documents>> should already have an "id" attribute. I think that the <<reference>> stereotype can be used to refer to external documents. That stereotype translates to xlink:href, which uses XLink to cross-reference documents; once an external document is located, XPath can be used to refer to a specific element (if you are referring to something other than the root document element). This is also the way that things are handled in GML. Double-check this solution with Hans.
  1. Sort out Change Properties (stop using them and instead use Parent/Child relationships for <<documents>>)
    • "Change" is a misleading name - it makes people think of version control. Consider renaming it to something like "Perturbation".
    • At Y1 meeting, folks wanted Change to have an "id" attribute - it gets this anyway as a specialisation of Property.
    • added another ChangePropertyType of "new" to represent adding new metadata to the target of a change
    • with the addition of a "new" ChangePropertyType, I need to ensure that Change in the APPCIM can cope with adding nested elements within the value of a Change. So I created a new UML class called PropertyValue. Once we migrate to FullMoon, I will ensure that PropertyValue gets turned into an xs:union of appropriate XML types to allow that.
    • For CMIP5, Change is only relevant for ensembles - we can deal w/ ensembles w/out resorting to the ChangeProperty simply by having simulations which have "parent" simulations
    • So for now, I will defer the finalisation of the ChangeProperty.
      • I have added a new stereotype <<unused>> which indicates to the concim2appcim program that it shouldn't bother transforming elements with that stereotype (when we migrate to GML, I can just leave the stereotype blank; FullMoon ignores unknown elements). I have tagged both Change and ChangeProperty with <<unused>>
      • I modified concim2appcim.xsl accordingly
      • I have raised a new ticket #234 to sort out Change properly sometime in the future
    • Added a new optional attribute to the Document class named "parent" of type Document with the <<reference>> stereotype; this will ensure that a CIM document can be based on another CIM document.
  1. Conformance needs a bit of a re-think
    • A model can conform to an experiment by using certain data - that data can come from either the data package or the software package
      • Created an abstract type of Data called "DataStream"
      • Changed Conformance to have a single <<reference>> to a DataStream (in practice the DataStream will either come from a software::component or an activity::simulation).
  1. Incorporate Marie-Pierre's suggestions to the software package
    • Added "lag" to Coupling & Connection classes
      • (lag is of type Timing which ensures it has a "rate" and "units")
      • Rupert would like to see a more formal way of specifying sequencing - I agree but I'm not sure how best to do this.
    • playing around with methods, characteristics, and coefficients; is it worth re-factoring things b/c of the similarity w/ ComponentProperties? It would really help to be able to formally restrict specialisations instead of just extending them; it looks like FullMoon is almost there.
    • wound up adding Method, Characteristic, and Coefficient which are all specialisations of NumericalProperty; ModelComponent includes an arbitrary number of each of these
    • On reflection, these are quite similar to Prognostics, so I initially made these four sibling classes (which inherited from Property), then rationalised them down to two classes. Two instead of one b/c (from Sophie's email):

They are structurally different.

The "what" is much simpler: a cf-convention name + unit

The "how" is not included in the cf-convention and needs to be described by
+nature ["characteristic", "method" or "coefficient"]
+value [numerical value, if any]
+minBackgroundValue [numerical value, if any]
+represented [boolean]
+reference [publication]
+description [free-text]
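
The two-class split described in the email could be sketched roughly as follows. This is only an illustration in Python, not the actual UML or any generated schema: the class names `WhatProperty` and `HowProperty` and the `Nature` enum are hypothetical, the example values are made up, and only the attribute names are taken from the email.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Nature(Enum):
    """The three kinds of "how" listed in the email."""
    CHARACTERISTIC = "characteristic"
    METHOD = "method"
    COEFFICIENT = "coefficient"


@dataclass
class WhatProperty:
    """Hypothetical name: the "what" is a CF-convention name plus a unit."""
    cf_name: str
    unit: str


@dataclass
class HowProperty:
    """Hypothetical name: the "how", which the CF convention does not cover."""
    nature: Nature
    represented: bool
    value: Optional[float] = None                 # numerical value, if any
    min_background_value: Optional[float] = None  # numerical value, if any
    reference: Optional[str] = None               # publication
    description: str = ""                         # free text


# Illustrative instances only; the values are not taken from any real model.
what = WhatProperty(cf_name="air_temperature", unit="K")
how = HowProperty(nature=Nature("coefficient"), represented=True, value=0.5)
```

Keeping the two as separate classes means the "what" stays a plain name/unit pair, while the optional fields only appear on the "how".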

  1. Incorporate Lois's suggestions about "technical properties" to the software package
    • cleaned up deployment a bit as per Lois's email...
      • chose not to use generalisations for the class structure - Machines and Compilers are different things
      • still unsure about the "date used" attribute for the Compiler; why is that needed? Don't we already have date information for activities?
    • also not sure about Parallelisation, b/c there is a distinction between domain decomposition (how to partition the grids across different machines) and parallelisation in the sense of the number of processors to assign to each "deployment unit" (from Sebastien's email):
      > Hi,
      > I am fine with what was said under this thread, except for the
      > "parallelisation".
      > The number of processes used to run a particular component model
      > should probably be an attribute of "deployment".
      We must also include the distinction between MPI and OpenMP processes to be
      complete. The product of those two gives the total number of processes
      except if you wanna test overbooking. Also having a machineArchitecture
      attribute for the machine may seem natural (perhaps it's equivalent to
      the machineSystem thingie), like x86_64, ia64, Power6, SX9 ....
      > But I would consider that the detailed description of the
      > parallelisation belongs to the grid package. I remember discussing this
      > with Balaji and we concluded that the parallelisation could be
      > expressed by describing different "tiles" for one grid (am I using
      > the right wording here?). In all cases, I would say this is not
      > urgently needed and can be added later, when required.
      Interesting. But in that case we will have to keep in mind that a model
      component (let's say atmosphere) can have several levels of
      parallelization within one execution, leading to multiple domain
      decompositions: one domain decomposition for the dynamics, another one for
      the dissipation, another one for the physics, and another one for the
      vertical domain decomposition with OpenMP, due to load balancing
      requirements between processes. And those domain decompositions will
      mainly depend on model subcomponents from the software package.

      > regards,
      > Sophie
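
To make the process-count arithmetic in the exchange above concrete, here is a small Python sketch. The `Deployment` class, its field names, and the `cores_available` parameter are hypothetical and not part of the CIM. The only thing taken from the email is that the MPI and OpenMP counts multiply to give the total. Treating "overbooking" as requesting more processes than there are cores is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical record of how one component is run on a machine."""
    component: str
    machine_architecture: str  # e.g. "x86_64", "ia64", "Power6", "SX9"
    mpi_processes: int
    openmp_threads: int        # OpenMP threads per MPI process

    def total_processes(self) -> int:
        # Product of the two levels of parallelism.
        return self.mpi_processes * self.openmp_threads

    def is_overbooked(self, cores_available: int) -> bool:
        # Assumed meaning of overbooking: more processes than cores.
        return self.total_processes() > cores_available


# Illustrative numbers only.
atmosphere = Deployment("atmosphere", "x86_64", mpi_processes=32, openmp_threads=4)
print(atmosphere.total_processes())   # 128
print(atmosphere.is_overbooked(128))  # False
```

This deliberately leaves out domain decomposition, which the email suggests belongs to the grid package rather than to the deployment.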

ticket:215 has been closed to coincide with the version-1.0 release. Any outstanding issues will be raised as new tickets.