A Transformation from XML to RDF via XSLT
Introduction
The Resource Description Framework (RDF) was developed as a data model for embedding information in a schematic document structure to make it more machine-readable.
In practice, however, data is often available only in XML format. We therefore developed a generic XSLT transformation that can be applied to any XML document to convert it into an RDF-conformant structure. This work was motivated by needs in semantic astronomy encountered in the AstroGrid-D project. There, for example, the monitoring of robotic telescopes, compute resources and individual jobs is based on XML, whereas Stellaris, the information service of AstroGrid-D, uses RDF.
RDF describes data through a hierarchical structure of resources, which are represented by Uniform Resource Identifiers (URIs). Resources can be composed of other resources in analogy to the real-world objects they describe. For example, a telescope can be composed of a camera, which is composed of a filter wheel, which is composed of filters, etc. This concept makes RDF an interesting choice for metadata management in heterogeneous software environments, where automated interaction between different components is desirable.
However, data is usually not provided in RDF, and developing an individual solution for each data format is often not affordable.
This is where our generic XML-to-RDF transformation helps.
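The telescope example from the introduction can be written down as RDF-style triples. The following Python sketch is only an illustration of the data model, not part of the stylesheet; the base URI and the `hasPart` property are hypothetical names chosen for this example.

```python
# Hypothetical base URI and property, chosen only for illustration.
BASE = "http://example.org/telescope"
HAS_PART = "http://example.org/terms#hasPart"


def compose(parts):
    """Return (subject, predicate, object) triples that link each
    resource to the part nested directly inside it."""
    triples = []
    subject = BASE
    for part in parts:
        obj = f"{subject}/{part}"  # the URI of a part extends its parent's URI
        triples.append((subject, HAS_PART, obj))
        subject = obj
    return triples


triples = compose(["camera", "filterwheel", "filter"])
for s, p, o in triples:
    # Print one statement per line in an N-Triples-like notation.
    print(f"<{s}> <{p}> <{o}> .")
```

Every resource has its own URI, so a software component can refer to the filter wheel directly without first navigating through the telescope.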
Design Goals
There are different ways to represent XML in RDF. Earlier solutions are listed in the version history below. The latest transformation achieves the following design goals:
- avoidance of blank nodes,
- one-to-one mapping for bidirectional extension,
- independence of XML schema.
Blank nodes are subjects without a name. Access to them is therefore more difficult, and some operations, such as the direct replacement of a node, cannot be performed. Giving every node a name avoids these complications.
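One way to give every node a name is to build its URI from the element's position in the XML hierarchy. The Python sketch below illustrates that idea under stated assumptions: the base URI, the `tag_index` naming scheme and the sample element names are hypothetical and are not taken from the stylesheet, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

BASE_URI = "http://example.org/doc"  # hypothetical base URI


def element_uris(xml_text):
    """Return one URI per element, built from its path in the tree."""
    root = ET.fromstring(xml_text)
    uris = []

    def walk(elem, uri):
        uris.append(uri)
        counts = {}  # number of children seen so far, per tag name
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            # Same-named siblings are told apart by a running index.
            walk(child, f"{uri}/{child.tag}_{counts[child.tag]}")

    walk(root, f"{BASE_URI}/{root.tag}")
    return uris


# Hypothetical sample document, not real RTML.
sample = "<Telescope><Camera><Filter/><Filter/></Camera></Telescope>"
uris = element_uris(sample)
for uri in uris:
    print(uri)
```

Because each URI is unique, a single node such as the second filter can be addressed or replaced directly, which is not possible with an unnamed blank node.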
A one-to-one mapping is necessary for the inverse transformation. A bidirectional transformation can be important, e.g. in a robotic telescope network where information about scheduled observations is stored in RDF but rescheduling requires the original RTML observation request. For this reason, AstroGrid-D also stored the RTML observation requests alongside the RDF. The inverse transformation could make this additional service unnecessary. A unique reconstruction of the original XML requires, for example, that the distinction between attributes and elements be preserved. As shown below, this is accomplished by transforming attributes and elements differently.
The last point makes the transformation independent of the underlying XML schema, so that the structure of the RDF is completely determined by the XML. It requires that the order of elements is preserved.
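The two requirements above, keeping attributes distinct from elements and preserving element order, can be illustrated with a small Python sketch. It flattens an XML tree into records rather than RDF, and the record layout, the path notation and the sample documents are assumptions made for this example; text following a child element is ignored for brevity.

```python
import xml.etree.ElementTree as ET


def to_records(elem, path="", position=1):
    """Flatten an element into (kind, path, value) records that keep
    the node kind and the position among its siblings."""
    here = f"{path}/{elem.tag}[{position}]"
    records = [("element", here, (elem.text or "").strip())]
    for name, value in elem.attrib.items():
        # Attributes get their own kind so they cannot be confused
        # with child elements of the same name.
        records.append(("attribute", f"{here}/@{name}", value))
    for index, child in enumerate(elem, start=1):
        records.extend(to_records(child, here, index))
    return records


# Hypothetical documents: b stores the name as an element instead of
# an attribute, c swaps the order of the two filters.
a = '<Camera name="cam1"><Filter>V</Filter><Filter>R</Filter></Camera>'
b = '<Camera><name>cam1</name><Filter>V</Filter><Filter>R</Filter></Camera>'
c = '<Camera name="cam1"><Filter>R</Filter><Filter>V</Filter></Camera>'

records_a = to_records(ET.fromstring(a))
records_b = to_records(ET.fromstring(b))
records_c = to_records(ET.fromstring(c))
print(records_a != records_b, records_a != records_c)
```

The three documents yield three different record lists. If the node kind or the position were dropped, at least two of them would collapse into the same output and the original XML could no longer be reconstructed uniquely.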
Transformation / Conversion
The transformation is accomplished via XSLT. The latest stylesheet (xml2rdf3.xsl) and some earlier versions are listed below. As an example, we show the transformation of a reduced description of the robotic telescope STELLA-I from its XML dialect RTML (STELLA-I.rtml) into RDF (STELLA-I_3.rdf, STELLA-I_3.ttl). The description of STELLA-I is shown below.
<?xml version="1.0" encoding="UTF-8"?>
The transformation is executed with an XSLT processor like xsltproc as follows:
xsltproc xml2rdf3.xsl STELLA-I.rtml > STELLA-I_3.rdf
The resulting RDF has the structure shown in the graphic below. The graphic was produced with the RDF visualization tool RDF Gravity.
Applications
This transformation is used to convert XML into RDF format.
For example, in AstroGrid-D this transformation is used for monitoring with the information service Stellaris. More precisely, it is used to convert:
- RTML metadata of robotic telescopes
- Monitoring & Discovery System (MDS) information of the Globus Toolkit
- Usage Records created from Audit logging of the Globus Toolkit
Version History
The table below contains the different versions of the transformation. The STELLA-I.rtml was slightly different for older versions. A graphical overview can be found here.
| Release | Version/Date | Changes | Graph | RDF/XML |
| --- | --- | --- | --- | --- |
| xml2rdf3.xsl | 3.0 / 2009-05-19 | rdf:value for every text, no attribute triples, order predicates, comments as triples | STELLA-I_3.png | STELLA-I_3.rdf |
| xml2rdf25.xsl | 2.5 / 2009-05-19 | added BaseURI variable, keep comments as comments | STELLA-I_2.5.png | STELLA-I_2.5.rdf |
| xml2rdf24.xsl | 2.4 / 2008-09-30 | no rdf:type information used (simpler); attributes are distinguished from elements by an additional xs:attribute triple | STELLA-I_2.4.png | |
| xml2rdf23.xsl | 2.3 / 2008-09-25 | distinction of elements from attributes by an rdf:type xsl:element | | |
| xml2rdf22.xsl | 2.2 / 2008-09-23 | distinction of attributes from elements by an rdf:type xsl:attribute | STELLA-I_2.2.png | |
| xml2rdf21.xsl | 2.1 / 2008-03-14 | resources have an rdf:type information | STELLA-I_2.1.png | |
| xml2rdf2.xsl | 2.0 / 2007-11-05 | blank nodes are replaced by URIs constructed from the hierarchy of XML element | STELLA-I_2.0.png | STELLA-I_2.0.rdf |
| xml2rdf1.xsl | 1.0 / 2007-03-26 | elements and attributes become literals connected by blank nodes similar to the Java tool OwlMap. | STELLA-I_1.0.png | |
References
- Breitling, F., 2009: "A standard transformation from XML to RDF via XSLT", Astronomical Notes, Volume 330, Issue 7, 755-760, http://arxiv.org/abs/0906.2291
- Grid Integration of Robotic Telescopes (talk and contribution to the proceedings of Hot-wiring the Transient Universe: A Joint VOEvent & HTN Workshop, 4-7 June 2007)
- Providing Remote Access to Robotic Telescopes by Adopting Grid Technology (Proceedings of the German e-Science Conference 2007, 2-5 May)
- Grid Integration of Robotic Telescopes (Workshop on Scientific Instruments and Sensors on the Grid, ICTP Trieste, 23-28 April 2007)
Contact
Frank Breitling
fbreitling (at) aip.de
http://www.aip.de/People/fbreitling/



