Draft section on abbreviated form and flattening
author Dave Reynolds <dave@epimorphics.com>
Mon, 04 Mar 2013 11:45:33 +0000
changeset 333 a88422d311a8
parent 332 a4ced54ed13d
child 334 fed08911c41c
data-cube/index.html
respec/respec3/bibref/biblio.js
--- a/data-cube/index.html	Sun Mar 03 22:05:53 2013 +0000
+++ b/data-cube/index.html	Mon Mar 04 11:45:33 2013 +0000
@@ -18,6 +18,10 @@
 .spare-table td + td { border-left: black 1px solid; padding-left: 1em; padding-right: 1em; }
 .spare-table th + th { border-left: black 1px solid; }
 dl.vocab_reference dt { margin-top: 1em; }
+.bordered-table { border: black 1px solid;}
+.bordered-table th { border-bottom: black 1px solid;}
+.bordered-table td { padding-right: 1em;}
+
   </style>
 </head>
 
@@ -755,7 +759,7 @@
     appropriate user interfaces. It can also be useful in the publication chain to enable
     synthesis of appropriate URIs for observations.</li>
   <li>By default the values of all of the components will be attached to each individual observation,
-    a so called <em>flattened</em> representation.
+    a so-called <em><a>flattened</a></em> representation.
     This allows such observations to stand alone, so that a SPARQL query to retrieve the observation
     can immediately locate the attributes which enable the observation to be interpreted. However,
     it is also permissible to attach attributes to the
@@ -1459,27 +1463,167 @@
 <section id="normalize">
 <h2>Abbreviation and normalization</h2>
 
-<p>In normal form then the observations which make up a Data Cube have property values
-for each of the required dimensions, attributes and measures as declared in the associated data structure
-definition. This form for a Data Cube is termed <em><dfn>flattened</dfn></em>. It is a convenient format
-for querying data and makes it possible to write uniform queries which extract 
-sets of observations, including from across multiple cubes. However, the verbosity of a
-fully flattened representation incurs overheads in transmission and storage of Data Cubes
-which may be problematic in some applications. 
+<p>In normal form, the <code><a>qb:Observation</a></code>s which
+make up a Data Cube have property values for each of the required
+dimensions, attributes and measures as declared in the associated data
+structure definition. This form for a Data Cube is
+termed <em><dfn>flattened</dfn></em>. It is a convenient format for
+querying data and makes it possible to write uniform queries which
+extract sets of observations, including from across multiple
+cubes. However, the verbosity of a fully flattened representation
+incurs overheads in transmission and storage of Data Cubes which may
+be problematic in some settings.
 </p>
 
-<p>To address this the Data Cube vocabulary supports a notion of an <em><dfn>abbreviated</dfn></em> format 
-in which component properties may be <em><dfn>attached</dfn></em> to other levels in the Data Cube.
-Specifically they may be attached to 
-a <code><a>qb:DataSet</a></code>, <code><a>qb:Slice</a></code> or <code><a>qb:MeasureProperty</a></code>.
-In those cases the attached property is taken to be applied to all the <code><a>qb:Observation</a></code> 
-instances associated with that attachment point. See <a href="attachment-example">example 4</a>
-in which the unit of measure is declared as to be attached to the whole data set and need not
-be repeated for every observation.</p>
+<p>To address this the Data Cube vocabulary supports a notion of
+an <em><dfn>abbreviated</dfn></em> format in which component
+properties may be <em><dfn>attached</dfn></em> to other levels in the
+Data Cube.  Specifically they may be attached to
+a <code><a>qb:DataSet</a></code>, <code><a>qb:Slice</a></code>
+or <code><a>qb:MeasureProperty</a></code>.  In those cases the
+attached property is taken to be applied to all
+the <code><a>qb:Observation</a></code> instances associated with that
+attachment point. For illustration
+see <a href="attachment-example">example 4</a> in which the unit of
+measure is declared to be attached to the whole data set and need
+not be repeated for every observation.</p>
 
-@@@@@@@@
+<p>We define these notions by means of a transformation algorithm
+  which can normalize an abbreviated Data Cube to a flattened
+  representation. We express this transformation using the SPARQL 1.1
+  Update language [[!RDF-SPARQL-UPDATE]]. Use of this notation does not imply that
+  the transformation must be implemented this way. Information
+  exchanges using Data Cube may retain data in abbreviated form and
+  use other techniques such as query rewriting to ease access, may
+  implement the normalization algorithm by other means, or may handle
+  all data in flattened form, or use any mix of these.</p>
 
-<p>@@TODO</p>
+<section id="normalize-algorithm">
+<h3>Normalization algorithm</h3>
+
+<p>The normalization algorithm comprises two sets of SPARQL Update
+  operations which should be applied in turn to a Dataset in which the
+  default graph contains the Data Cube graph to be normalized.</p>
+
+<p>The first update operation performs selective type and property closure
+  operations. These serve two purposes. They ensure
+  that <code>rdf:type</code> assertions on instances 
+  of <code><a>qb:Observation</a></code> and <code><a>qb:Slice</a></code>
+  may be omitted in an abbreviated Data Cube. They also simplify 
+  the second set of update operations by expanding
+  the sub-properties of <code><a>qb:componentProperty</a></code>
+  (specifically <code><a>qb:dimension</a></code>,  <code><a>qb:measure</a></code>
+  and <code><a>qb:attribute</a></code>).</p>
+
+<table class="bordered-table">
+  <thead>
+   <tr><th>Phase 1: Type and property closure</th></tr>
+  </thead>
+  <tbody><tr><td>
+<pre>
+PREFIX rdf:            &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#>
+PREFIX qb:             &lt;http://purl.org/linked-data/cube#>
+
+INSERT {
+    ?o rdf:type qb:Observation .
+} WHERE {
+    [] qb:observation ?o .
+};
+
+INSERT {
+    ?s rdf:type qb:Slice .
+} WHERE {
+    [] qb:slice ?s.
+};
+
+INSERT {
+    ?cs qb:componentProperty ?p .
+    ?p  rdf:type qb:DimensionProperty .
+} WHERE {
+    ?cs qb:dimension ?p .
+};
+
+INSERT {
+    ?cs qb:componentProperty ?p .
+    ?p  rdf:type qb:MeasureProperty .
+} WHERE {
+    ?cs qb:measure ?p .
+};
+
+INSERT {
+    ?cs qb:componentProperty ?p .
+    ?p  rdf:type qb:AttributeProperty .
+} WHERE {
+    ?cs qb:attribute ?p .
+}
+</pre>
+  </td></tr></tbody>
+</table>
+
+
+<p>These closure operations are implied by the RDFS semantics of the
+  Data Cube vocabulary. Data Cube processors may apply full RDFS
+  closure in place of the update operation defined here.</p>
+
+<p>The second update operation checks the components of the data
+  structure definition of the data set for declared attachment levels.
+  For each of the possible attachment levels
+  (<code><a>qb:DataSet</a></code>, <code><a>qb:Slice</a></code>
+  and <code><a>qb:MeasureProperty</a></code>) it looks for occurrences
+  of that component to be flattened down to the corresponding
+  observations. </p>
+
+<table class="bordered-table">
+  <thead>
+   <tr><th>Phase 2: Flatten attachment levels</th></tr>
+  </thead>
+  <tbody><tr><td>
+<pre>
+PREFIX qb:             &lt;http://purl.org/linked-data/cube#>
+
+# Dataset attachments
+INSERT {
+    ?obs  ?comp ?value
+} WHERE {
+    ?spec    qb:componentProperty ?comp ;
+             qb:componentAttachment qb:DataSet .
+    ?dataset qb:structure [qb:component ?spec];
+             ?comp ?value .
+    ?obs     qb:dataSet ?dataset.
+};
+
+# Slice attachments
+INSERT {
+    ?obs  ?comp ?value
+} WHERE {
+    ?spec    qb:componentProperty ?comp;
+             qb:componentAttachment qb:Slice .
+    ?dataset qb:structure [qb:component ?spec];
+             qb:slice ?slice .
+    ?slice ?comp ?value;
+           qb:observation ?obs .
+};
+
+# Measure property attachments
+INSERT {
+    ?obs  ?comp ?value
+} WHERE {
+    ?spec  qb:componentProperty ?comp ;
+           qb:componentAttachment qb:MeasureProperty .
+    ?dataset qb:structure [qb:component ?spec] .
+    ?comp    a qb:AttributeProperty .
+    ?measure a qb:MeasureProperty;
+             ?comp ?value .
+    ?obs     qb:dataSet ?dataset;
+             ?measure [] .
+}
+</pre>
+  </td></tr></tbody>
+</table>
+
+
+</section>
+
 </section>
 
 <section id="wf">
--- a/respec/respec3/bibref/biblio.js	Sun Mar 03 22:05:53 2013 +0000
+++ b/respec/respec3/bibref/biblio.js	Mon Mar 04 11:45:33 2013 +0000
@@ -3358,13 +3358,14 @@
     },
     "RDF-SPARQL-UPDATE": {
         "authors": [
-            "S. Schenk",
-            "P. Gearon"
-        ],
-        "href": "http://www.w3.org/TR/2010/WD-sparql11-update-20100126/",
-        "title": "SPARQL 1.1 Update.",
-        "date": "W3C Working Draft",
-        "status": "26 January 2010",
+            "Paul Gearon",
+	    "Alexandre Passant",
+ 	    "Axel Polleres"
+        ],
+        "href": "http://www.w3.org/TR/2012/PR-sparql11-update-20121108/",
+        "title": "SPARQL 1.1 Update",
+        "date": "W3C Proposed Recommendation",
+        "status": "8 November 2012",
         "publisher": "W3C"
     },
     "RDF-SPARQL-XMLRES": {
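
As a reading aid for the "Dataset attachments" update in Phase 2 of the data-cube/index.html hunk, the following is a minimal Python sketch of the same join over plain tuples. It is not part of the changeset and not a SPARQL implementation: triples are modelled as `(subject, predicate, object)` strings, the function name `flatten_dataset_attachments` and all `ex:` identifiers are hypothetical, and the input is assumed to have already been through Phase 1, so `qb:componentProperty` is present on each component specification.

```python
# Illustrative sketch only. Triples are (subject, predicate, object) tuples;
# the ex: names are hypothetical and do not come from the changeset.

def flatten_dataset_attachments(triples):
    """Copy values attached at qb:DataSet level down to each observation."""
    graph = set(triples)

    def objects(s, p):
        return {o for (s2, p2, o) in graph if s2 == s and p2 == p}

    def subjects(p, o):
        return {s for (s, p2, o2) in graph if p2 == p and o2 == o}

    inferred = set()
    # ?spec qb:componentAttachment qb:DataSet
    for spec in subjects("qb:componentAttachment", "qb:DataSet"):
        # ?spec qb:componentProperty ?comp
        for comp in objects(spec, "qb:componentProperty"):
            # ?dataset qb:structure [qb:component ?spec]
            for dsd in subjects("qb:component", spec):
                for dataset in subjects("qb:structure", dsd):
                    # ?dataset ?comp ?value
                    for value in objects(dataset, comp):
                        # ?obs qb:dataSet ?dataset
                        for obs in subjects("qb:dataSet", dataset):
                            inferred.add((obs, comp, value))
    return graph | inferred


# Hypothetical abbreviated cube: the unit is stated once on the data set.
example = {
    ("ex:dsd", "qb:component", "ex:unitSpec"),
    ("ex:unitSpec", "qb:componentProperty", "ex:unit"),
    ("ex:unitSpec", "qb:componentAttachment", "qb:DataSet"),
    ("ex:ds", "qb:structure", "ex:dsd"),
    ("ex:ds", "ex:unit", "ex:years"),
    ("ex:o1", "qb:dataSet", "ex:ds"),
    ("ex:o2", "qb:dataSet", "ex:ds"),
}
flat = flatten_dataset_attachments(example)
```

After the call, `flat` contains the original triples plus one `ex:unit` triple per observation. The slice and measure-property rules follow the same pattern with a different join path.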