wiki:tickets/519

This page documents the process of translating from METAFOR Software Component Controlled Vocabulary (SCCV) to ESG OWL.

The SCCV is currently maintained as a set of mindmap files. Mindmaps are used because they provide a convenient visual way to view and update the SCCV. Icons, fonts, notes and colours are used in the mindmaps to indicate the type of information.

A set of rules has been defined to which a mindmap must adhere to be a valid SCCV (see ticket 244). An associated validator has been written which checks that these rules are followed.

If a mindmap is valid SCCV then it can be translated (again see ticket 244) into SCCV xml. The format of this xml is defined in ticket [ref]. This xml is designed to be in a form that can be easily translated and/or ingested. The METAFOR Questionnaire reads SCCV xml directly. It is SCCV xml that will be translated into ESG OWL.

Sylvia has provided a sample of the type of structure required by ESG. Most of the related emails should be archived in the METAFOR mailing list. I have written an associated translator (in XSL) which converts SCCV xml into ESG OWL. This translator is available at http://metaforclimate.eu/trac/browser/controlled_vocabularies/trunk/Software/xsl/xml2ESGOWL.xsl and can be run using any XSLT processor (e.g. xsltproc from libxslt):

    xsltproc xsl/xml2ESGOWL.xsl realm.xml > realm.owl

The SCCV xml to ESG OWL translation is straightforward apart from one issue. ESG necessarily use software component names and software component property names as part of their (OWL) xml elements. For example ...

[insert example here]

However, xml element names may only contain certain characters, see [ref], whereas software component names and their property names can contain any characters. For example ...

[insert example here]
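
As a rough illustration of the naming restriction, the following Python sketch checks candidate names against an ASCII-only simplification of the XML 1.0 Name production. The full production also admits many non-ASCII characters, and the colon, which is left out here because it is reserved for namespace prefixes. The candidate names are hypothetical and are not taken from the SCCV.

```python
import re

# ASCII-only simplification of the XML 1.0 "Name" production (an assumption
# made for brevity): a letter or underscore first, then letters, digits,
# underscore, full stop or hyphen. The colon is deliberately excluded.
XML_NAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ascii_xml_name(name: str) -> bool:
    """Return True if `name` could be used unchanged as an xml element name."""
    return XML_NAME_ASCII.fullmatch(name) is not None


# Hypothetical component names, for illustration only.
for candidate in ["SeaIce", "Sea Ice", "Flux (upward)"]:
    print(candidate, is_ascii_xml_name(candidate))
```

Only the first candidate passes; the space and the parentheses in the other two are not permitted in an element name.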

Therefore, the following translation mapping has been defined so that valid ESG OWL is produced from METAFOR SCCV. The mapping below was agreed with Sylvia, Luca and Julien.

Source character(s)                                                Result
.-_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ  unmodified
[space]                                                            _
)                                                                  _RPAREN_
(                                                                  _LPAREN_
'                                                                  _APOS_
>                                                                  _GT_
/                                                                  _SLASH_
+                                                                  _PLUS_
,                                                                  _COMMA_
*                                                                  _STAR_
&                                                                  _AND_
[any other character]                                              translator aborts


Note, I've only defined the characters that are used in the current SCCV. The use of any other character currently causes the translator to throw an error and would require a new mapping to be specified.
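
The mapping above can be sketched in a few lines of Python. This is only an illustration of the rules as listed, not the XSL translator itself. The function name `to_esg_name` and the sample input are hypothetical, and raising `ValueError` stands in for the translator aborting.

```python
# Characters that pass through unchanged, as listed in the mapping.
UNMODIFIED = set(
    ".-_0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)

# Characters that are replaced by a readable token.
REPLACEMENTS = {
    " ": "_",
    ")": "_RPAREN_",
    "(": "_LPAREN_",
    "'": "_APOS_",
    ">": "_GT_",
    "/": "_SLASH_",
    "+": "_PLUS_",
    ",": "_COMMA_",
    "*": "_STAR_",
    "&": "_AND_",
}


def to_esg_name(name: str) -> str:
    """Apply the mapping character by character (hypothetical helper)."""
    parts = []
    for ch in name:
        if ch in UNMODIFIED:
            parts.append(ch)
        elif ch in REPLACEMENTS:
            parts.append(REPLACEMENTS[ch])
        else:
            # Mirrors the "translator aborts" rule for unmapped characters.
            raise ValueError(f"no mapping defined for {ch!r} in {name!r}")
    return "".join(parts)


# Hypothetical input, not taken from the SCCV.
print(to_esg_name("Sea Ice (thickness)"))  # Sea_Ice__LPAREN_thickness_RPAREN_
```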

Also note, this translation is uni-directional, as one cannot necessarily determine a source string from a result string. This is a conscious decision to favour readability of output over the ability for bi-directional translation.
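
A short Python sketch shows why the translation cannot be reversed in general: distinct source strings can produce the same result. Only a subset of the mapping is reproduced here, and the names are hypothetical.

```python
# Subset of the mapping, enough to show collisions (illustrative only).
# Characters not listed are left unchanged by str.translate.
SUBSET = str.maketrans({" ": "_", "(": "_LPAREN_", ")": "_RPAREN_"})


def translate(name: str) -> str:
    """Apply the subset mapping to a name (hypothetical helper)."""
    return name.translate(SUBSET)


# A space and an underscore both become "_".
same_underscore = translate("deep convection") == translate("deep_convection")

# A literal "_LPAREN_" in the source is indistinguishable from a mapped "(".
same_paren = translate("flux(up)") == translate("flux_LPAREN_up_RPAREN_")

print(same_underscore, same_paren)  # True True
```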