Addressing issue-33, allowing generalized use of slices
author Dave Reynolds <dave@epimorphics.com>
Sat, 02 Mar 2013 16:35:16 +0000
changeset 326 cee7ea45c0f4
parent 325 7de93a2d7f83
child 327 1c3a135ec7bb
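As a rough illustration of the generalized use this changeset permits, the Turtle sketch below shows a slice that groups observations without declaring a slice key, following the "latest available observations" example given in the revised Slices section. The names eg:sliceLatest, eg:o1 and eg:o2 and the eg: namespace URI are hypothetical and chosen only for illustration; eg:dataset-le2 is the data set used in the document's own slice example.

```turtle
@prefix qb:   <http://purl.org/linked-data/cube#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix eg:   <http://example.org/ns#> .   # hypothetical namespace

# A slice used purely as a grouping: no qb:sliceStructure is given,
# so no dimension values are declared as fixed for this slice.
eg:sliceLatest a qb:Slice ;
    rdfs:label "Latest available observations"@en ;
    qb:observation eg:o1, eg:o2 .          # hypothetical observations

# The data set still lists the slice in the usual way.
eg:dataset-le2 qb:slice eg:sliceLatest .
```

Under the previous wording a slice was expected to fix one or more dimension values; the sketch assumes only the relaxed reading described in the diff below.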
data-cube/index.html
--- a/data-cube/index.html	Fri Mar 01 17:05:24 2013 +0000
+++ b/data-cube/index.html	Sat Mar 02 16:35:16 2013 +0000
@@ -177,9 +177,9 @@
 the <strong>Content-Oriented Guidelines</strong> (COGs), a set of
 cross-domain concepts, code lists, and categories that support
 interoperability and comparability between datasets by providing a
-shared terminology between SDMX implementers. RDF versions of these
+shared terminology between SDMX implementers [[COG]]. RDF versions of these
 terms are available separately for use along with the Data Cube
-vocabulary.
+vocabulary; see <a href="#dsd-cog">Content oriented guidelines</a>.
 </p>
 </section>
 
@@ -331,9 +331,9 @@
 to refer to such slices in which the single free dimension is time as <em>Time
 Series</em> and to refer slices along non-time dimensions as <em>Sections</em>.
 Within the Data Cube vocabulary we allow arbitrary dimensionality
-slices and do not give different names to particular types of slice but
-extension vocabularies, such as SDMX-RDF, can easily add such
-concept labels.</p>
+slices and do not give different names to particular types of
+  slice. Such subclasses of slice could be added in extension vocabularies.</p>
+
 </section>
 
 </section>
@@ -607,28 +607,31 @@
 </p>
 </section>
 
-<section id="dsd-cog">
+<section id="dsd-cog" class="informative">
 <h3>Content oriented guidelines</h3>
 
-<p>@@TODO fix this section</p>
-
 <p>The SDMX standard includes a set of <em>content oriented guidelines</em> (COG) [[COG]]
- which define a
-   set of common statistical concepts and associated code lists that are intended to be 
-   reusable across data sets. As part of the data cube work we have created RDF analogues
-  to the COG. These include:</p>
-  <ul>
-    <li><code>sdmx-concept</code>: SKOS Concepts for each COG defined concept;</li>
-    <li><code>sdmx-code</code>: SKOS Concepts and ConceptSchemes for each COG defined code list;</li>
-    <li><code>sdmx-dimension</code>: component properties corresponding to each COG concept that can be used as a dimension;</li>
-    <li><code>sdmx-attribute</code>: component properties corresponding to each COG concept that can be used as an attribute;</li>
-    <li><code>sdmx-measure</code>: component properties corresponding to each COG concept that can be used as a measure.</li>
+ which define a  set of common statistical concepts and associated code lists that are intended to be 
+   reusable across data sets. A <a href="https://code.google.com/p/publishing-statistical-data/">community group</a> has developed, and
+   maintains, RDF encodings of these guidelines. These comprise:
+
+<table id="namespaces">
+  <thead><tr><th>Prefix</th><th>Namespace</th><th>Description</th></tr></thead>
+  <tbody>
+    <tr><td><code>sdmx-concept</code></td><td><a href="http://purl.org/linked-data/sdmx/2009/concept#">http://purl.org/linked-data/sdmx/2009/concept#</a></td><td>SKOS Concepts for each COG defined concept</td>
+    <tr><td><code>sdmx-code</code></td><td><a href="http://purl.org/linked-data/sdmx/2009/code#">http://purl.org/linked-data/sdmx/2009/code#</a></td><td>SKOS Concepts and ConceptSchemes for each COG defined code list</td>
+    <tr><td><code>sdmx-dimension</code></td><td><a href="http://purl.org/linked-data/sdmx/2009/dimension#">http://purl.org/linked-data/sdmx/2009/dimension#</a></td><td>component properties corresponding to each COG concept that can be used as a dimension</td>
+    <tr><td><code>sdmx-attribute</code></td><td><a href="http://purl.org/linked-data/sdmx/2009/attribute#">http://purl.org/linked-data/sdmx/2009/attribute#</a></td><td>component properties corresponding to each COG concept that can be used as an attribute</td>
+    <tr><td><code>sdmx-measure</code></td><td><a href="http://purl.org/linked-data/sdmx/2009/measure#">http://purl.org/linked-data/sdmx/2009/measure#</a></td><td>component properties corresponding to each COG concept that can be used as a measure</td>
   </ul>
+  </tbody>
+</table>
 
-<p>The data cube vocabulary is standalone and it is not mandatory to use the SDMX COG-derived
-   terms. However, when the concepts being expressed do match a COG concept it is recommended
-   that publishers should reuse the corresponding components and/or concept URIs to simplify comparisons
-  across data sets. Given this background we will reuse the relevant COG components in our worked example.</p>
+<p>These resources are provided as a convenience and do not form part
+  of the RDF Data Cube standard at this time. However, they are used
+  by a number of existing RDF Data Cube publications and so we will
+  reference them within our worked examples.</p>
+
 </section>
 
 
@@ -645,7 +648,7 @@
   to represent the time period itself it would be convenient to use the data.gov.uk reference
   time service and to declare this within the data structure definition.</p>
 
-<pre>
+<pre class="example">
   eg:refPeriod  a rdf:Property, qb:DimensionProperty;
       rdfs:label "reference period"@en;
       rdfs:subPropertyOf sdmx-dimension:refPeriod;
@@ -656,7 +659,7 @@
 we can use for this, and again we can customize the range of the component. In this case
   we can use the Ordnance Survey Administrative Geography Ontology [[OS-GEO]].</p>
 
-<pre>
+<pre class="example">
   eg:refArea  a rdf:Property, qb:DimensionProperty;
       rdfs:label "reference area"@en;
       rdfs:subPropertyOf sdmx-dimension:refArea;
@@ -671,7 +674,7 @@
   the topic being observed using metadata). However, it can aid readability and processing
   of the RDF data sets to use a specific measure corresponding to the phenomenon being observed.</p>
   
-<pre>
+<pre class="example">
   eg:lifeExpectancy  a rdf:Property, qb:MeasureProperty;
       rdfs:label "life expectancy"@en;
       rdfs:subPropertyOf sdmx-measure:obsValue;
@@ -730,7 +733,7 @@
 
 <p>So the structure of our example data set (and other similar datasets) can be declared by:</p>
 
-<pre>
+<pre class="example">
   eg:dsd-le a qb:DataStructureDefinition;
       # The dimensions
       qb:component [qb:dimension eg:refArea;         qb:order 1];
@@ -783,7 +786,7 @@
 
 <p>For example, if we have a set of shipment data containing unit count and total weight for each
   shipment then we might have a data structure definition such as:</p>
-<pre>
+<pre class="example">
 eg:dsd1 a qb:DataStructureDefinition;
     rdfs:comment "shipments by time (multiple measures approach)"@en;
     qb:component 
@@ -792,7 +795,7 @@
         [ qb:measure    eg-measure:weight; ] . </pre>
         
 <p>This would correspond to individual observations such as:</p>
-<pre>
+<pre class="example">
 eg:dataset1 a qb:DataSet;
     qb:structure eg:dsd1 .
     
@@ -834,7 +837,7 @@
   Thus, qb:measureType is a “magic” dimension property with an implicit code list.</p>
 
 <p>The data structure definition for our above example, using this representation approach, would then be:</p>
-<pre>
+<pre class="example">
 eg:dsd2 a qb:DataStructureDefinition;
     rdfs:comment "shipments by time (measure dimension approach)"@en;
     qb:component 
@@ -844,7 +847,7 @@
         [ qb:dimension  qb:measureType; ] . </pre>
         
 <p>This would correspond to individual observations such as:</p>
-<pre>
+<pre class="example">
 eg:dataset2 a qb:DataSet;
     qb:structure eg:dsd2 .
     
@@ -927,7 +930,7 @@
 
 <p>Thus for our running example we might expect to have:</p>
 
-<pre>
+<pre class="example">
   eg:dataset-le1 a qb:DataSet;
       rdfs:label "Life expectancy"@en;
       rdfs:comment "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
@@ -972,7 +975,7 @@
   original Data Structure Declaration we see that we declared the unit of measure to be
   attached at the data set level. So the corrected example is:</p>
 
-<pre>
+<pre class="example">
   eg:dataset-le1 a qb:DataSet;
       rdfs:label "Life expectancy"@en;
       rdfs:comment "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
@@ -1019,20 +1022,35 @@
 <section id="slices">
 <h2>Slices</h2>
 
-<p>Slices allow us to group subsets of observations together. This not intended
-  to represent arbitrary selections from the observations but uniform slices
-  through the cube in which one or more of the dimension values are fixed.</p>
+<p>Slices allow us to group subsets of observations together.</p>
   
 <p>Slices may be used for a number of reasons:</p>
 <ul>
   <li>to guide consuming applications in how to present the data (e.g. to organize
       data as a set of time series);</li>
-  <li>to provide an identity (URI) for the slice to enable to be annotated or externally referenced;</li>
+  <li>to provide an identity (URI) for the slice to enable it to be annotated or externally referenced;</li>
   <li>to reduce the verbosity of the data set by only stating each fixed dimensional value once.</li>
 </ul>  
 
+<div class="note"> This section has been modified to
+  address <a href="http://www.w3.org/2011/gld/track/issues/33">ISSUE-33</a></div>
+
+<p>Commonly in statistical publishing, slices are created by
+  fixing one or more dimension values. 
+  In particular, a common practice is for each slice to fix all dimensions
+  except time so that a slice represents a time series of
+  observations. However, the Data Cube vocabulary does not impose a
+  restriction that slices be only used in this manner. It is
+  permissible for a slice to be used to group observations together
+  for other reasons and for such slices to thus not have an associated
+  <em>slice key</em> (see below). For example, slices might be used to group
+  all of the latest available observations together for ease of
+  access. Extension vocabularies may introduce specialist subclasses of
+  slice for particular purposes, including ones which require the presence
+  of a slice key.</p>
+
 <p>To illustrate the use of slices let us group the sample data set into geographic series.
- That will enable us to refer to e.g. "male life expectancy observations for 2004-6" 
+ That will enable us to refer to e.g. "male life expectancy observations for 2004-2006" 
  and guide applications to present a comparative chart across regions. </p>
 
 <p>We first define the structure of the slices we want by associating a "slice key" which the
@@ -1040,7 +1058,7 @@
    lists the component properties (which must be dimensions) which will be fixed in the
    slice. The key is attached to the DSD using <code>qb:sliceKey</code>. For example: </p>
    
-<pre>
+<pre class="example">
   eg:sliceByRegion a qb:SliceKey;
       rdfs:label "slice by region"@en;
       rdfs:comment "Slice by grouping regions together, fixing sex and time values"@en;
@@ -1061,7 +1079,7 @@
   of <code>qb:sliceStructure</code>. Data sets indicate
   the slices they contain by means of <code>qb:slice</code>. Thus in our example we would have:</p>
 
-<pre>
+<pre class="example">
   eg:dataset-le2 a qb:DataSet;
       rdfs:label "Life expectancy"@en;
       rdfs:comment "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
@@ -1109,7 +1127,7 @@
 definition and search for slice definitions. If it is desired, this redundancy can be reduced
 by declaring different attachment levels for the dimensions. For example:
 </p>
-<pre>
+<pre class="example">
   eg:dsd-le-slice3 a qb:DataStructureDefinition;
       qb:component 
           [qb:dimension eg:refArea;         qb:order 1];
@@ -1187,7 +1205,7 @@
 <p>We illustrate this with an example drawn from the translation of the SDMX COG
   code list for gender, as used already in our worked example. The relevant subset of this code list is:</p>
 
-<pre>
+<pre class="example">
 sdmx-code:sex a skos:ConceptScheme;
     skos:prefLabel "Code list for Sex (SEX) - codelist scheme"@en;
     rdfs:label "Code list for Sex (SEX) - codelist scheme"@en;
@@ -1236,7 +1254,7 @@
 
 <p>This code list can then be associated with a coded property, such as a dimension:</p>
 
-<pre>
+<pre class="example">
   eg:sex a sdmx:DimensionProperty, sdmx:CodedProperty;
       qb:codeList sdmx-code:sex ;
       rdfs:range sdmx-code:Sex .
@@ -1286,7 +1304,7 @@
 
 <p>Thus our sample dataset might be marked up by:</p>
 
-<pre>
+<pre class="example">
   eg:dataset1 a qb:DataSet;
       rdfs:label "Life expectancy"@en;
       rdfs:comment "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
@@ -1311,7 +1329,7 @@
 The organization should be represented as an instance of <code>foaf:Agent</code>, or
 some more specific subclass such as <code>org:Organization</code> [[ORG]].</p>
 
-<pre>
+<pre class="example">
 eg:dataset1 a qb:DataSet;
     dc:publisher &lt;http://example.com/meta#organization> .
     
@@ -1790,13 +1808,6 @@
   </div>
 
   <div class='issue'>
-    <h3><a href="http://www.w3.org/2011/gld/track/issues/30">Issue-30</a>:
-      Declaring relations between cubes</h3>
-    <p>Consider extending the vocabulary to support declaring
-      relations between data cubes (or between measures within a cube).</p>
-  </div>
-
-  <div class='issue'>
     <h3><a href="http://www.w3.org/2011/gld/track/issues/31">Issue-31</a>:
       Supporting aggregation for other than SKOS hierarchies</h3>
     <p>The Data Cube vocabulary allows hierarchical code lists to be
@@ -1806,20 +1817,6 @@
   </div>
 
   <div class='issue'>
-    <h3><a href="http://www.w3.org/2011/gld/track/issues/32">Issue-32</a>:
-      Relationship to ISO 19156 - Observations & Measurements</h3>
-    <p>
-      One use case for the Data Cube vocabulary is for the publication
-      of observational, sensor network and forecast data
-      sets. Existing standards for such publication include OGC
-      Observations & Measurements (ISO 19156). There are multiple ways
-      that Data Cube can be mapped to the logical model of O&M. 
-      Consider making an explicit statement of the ways in which Data
-      Cube can be related to O&M as guidance for users seeking to
-      work with both specifications.</p>
-  </div>
-
-  <div class='issue'>
     <h3><a href="http://www.w3.org/2011/gld/track/issues/33">Issue-33</a>:
       Collections of observations and well-formedness of slices</h3>
     <p>
@@ -1830,16 +1827,6 @@
       of <code>qb:Slice</code> or through an additional collection mechanism.</p>
   </div>
 
-  <div class='issue'>
-    <h3><a href="http://www.w3.org/2011/gld/track/issues/34">Issue-34</a>:
-      Clarify or drop qb:subslice</h3>
-    <p>
-      Use of <code>qb:subslice</code> in abbreviated datasets can result
-      in ambiguity. Consider
-      clarifying or deprecating <code>qb:subslice</code>.</p>
-  </div>
-
-
 </section>
 
 <section id="acknowledgements" class="appendix">