Editorial changes for Last Call comments from Ulrich Atz including a complete example
author Dave Reynolds <dave@epimorphics.com>
Sat, 13 Apr 2013 16:19:00 +0100
changeset 457 1a50a7e89497
parent 456 2d325bab0d34
child 458 fe45eeecd19d
data-cube/index.html
respec/gld-bib.js
--- a/data-cube/index.html	Fri Apr 12 16:52:16 2013 +0100
+++ b/data-cube/index.html	Sat Apr 13 16:19:00 2013 +0100
@@ -178,8 +178,7 @@
 (web) addressable. This allows publishers and third parties to annotate
 and link to this data; for example a report can reference the specific
 figures it is based on allowing for fine grained provenance trace-back.</li>
-  <li>Data can be flexibly combined across datasets and between
-statistical and non-statistical sets (for example <em>find all
+  <li>Data can be flexibly combined across datasets sets (for example <em>find all
 Religious schools in census areas with high values for National
 Indicators pertaining to religious tolerance</em>). The statistical
 data becomes an integral part of the broader web of linked data.</li>
@@ -226,6 +225,9 @@
 <p>The RDF Data Cube vocabulary builds upon the core of the
   the SDMX 2.0 Information Model [[SDMX20]].</p>
 
+<p>Readers may find the SDMX User Guide [[SDMX-GUIDE]] useful
+  background.</p>
+
 <p>A key component of the SDMX standards package are
 the <strong>Content-Oriented Guidelines</strong> (COGs), a set of
 cross-domain concepts, code lists, and categories that support
@@ -315,6 +317,35 @@
 <section id="data-cubes" class="informative">
 <h2>Data cubes</h2>
 
+<section id="cubes-model" class="informative">
+<h3>Data Sets</h3>
+
+<p>A DataSet is a collection of statistical data that corresponds to a
+  defined structure. The data in a data set can be roughly described as belonging to one of the following kinds:</p>
+
+<dl>
+  <dt>Observations</dt>
+  <dd>This is the actual data, the measured values. In a statistical table, the observations 
+       would be the values in the table cells.</dd>
+
+  <dt>Organizational structure</dt>
+  <dd>To locate an observation within the hypercube, one has at least to know the value of each 
+      dimension at which the observation is located, so these values must be specified for each observation. 
+      Datasets can have additional organizational structure in the form of <em>slices</em> 
+    as described in <a href="#slices">section 7.2</a>.
+
+  <dt>Structural metadata</dt>
+  <dd>Having located an observation, we need certain metadata in order to be able to interpret it. 
+    What is the unit of measurement? Is it a normal value or a series break? 
+    Is the value measured or estimated? These metadata are provided as <em>attributes</em> and can 
+    be attached to individual observations, or to higher levels.</dd>
+
+  <dt>Reference metadata</dt>
+  <dd>This is metadata that describes the dataset as a whole, such as categorization of the 
+       dataset, its publisher, and a SPARQL endpoint where it can be accessed. 
+      External metadata is described in <a href="#metadata">section 9</a>.</dd>
+</dl>
+</section>
 
 <section id="cubes-model" class="informative">
 <h3>The cube model - dimensions, attributes, measures</h3>
@@ -362,8 +393,9 @@
 dimensions and be able to refer to all observations with those
 dimension values as a single entity. We call such a selection a <em>slice</em>
 through the cube. For example, given a data set on regional performance
-indicators then we might group all the observations about a given indicator
-and a given region into a slice, each slice would then represent a time series of observed values.</p>
+indicators then we might group together all the observations about a given indicator
+and a given region. Each such group would be a slice representing a time
+  series of observed values.</p>
 
 <p>A data publisher may identify slices through the data for various
 purposes. They can be a useful grouping to which metadata might be attached, for example to note a
@@ -512,6 +544,9 @@
 fixed for each slice. Such slices then show the variation in life expectancy across the 
   different regions, i.e. corresponding to the columns in the above tabular layout.</p>
 
+<p>A complete encoding of this data as a Data Cube, including such a
+  slice structure, is shown in <a href="#full-example">Appendix C</a>.</p>
+
 </section>
 
 </section>
@@ -636,7 +671,7 @@
 
 
 <section id="dsd-example" class="informative">
-<h3>Example</h3>
+<h3>Example dimensions and measure</h3>
 
 <p>Turning to our example data set then we can see there are three dimensions to represent
    - time period, region (unitary authority) and sex. There is a single
@@ -909,34 +944,6 @@
 <section id="datasets">
 <h2>Expressing data sets</h2>
 
-<p>A DataSet is a collection of statistical data that corresponds to a given data structure definition. 
-The data in a data set can be roughly described as belonging to one of the following kinds:</p>
-
-<dl>
-  <dt>Observations</dt>
-  <dd>This is the actual data, the measured values. In a statistical table, the observations 
-       would be the values in the table cells.</dd>
-
-  <dt>Organizational structure</dt>
-  <dd>To locate an observation within the hypercube, one has at least to know the value of each 
-      dimension at which the observation is located, so these values must be specified for each observation. 
-      Datasets can have additional organizational structure in the form of <em>slices</em> 
-    as described earlier in <a href="#slices">section 7.2</a>.
-
-  <dt>Structural metadata</dt>
-  <dd>Having located an observation, we need certain metadata in order to be able to interpret it. 
-    What is the unit of measurement? Is it a normal value or a series break? 
-    Is the value measured or estimated? These metadata are provided as <em>attributes</em> and can 
-    be attached to individual observations, or to higher levels as defined by the ComponentSpecification
-    described earlier.</dd>
-
-  <dt>Reference metadata</dt>
-  <dd>This is metadata that describes the dataset as a whole, such as categorization of the 
-       dataset, its publisher, and a SPARQL endpoint where it can be accessed. 
-      External metadata is described in <a href="#metadata">section 9</a>.</dd>
-</dl>
-
-
 <section id="dataset-basic">
 <h3>Data sets and observations</h3>
 
@@ -1074,9 +1081,9 @@
       
   eg:dsd-le-slice1 a qb:DataStructureDefinition;
       qb:component 
-          [ qb:dimension eg:refArea;         qb:order 1 ];
-          [ qb:dimension eg:refPeriod;       qb:order 2 ];
-          [ qb:dimension sdmx-dimension:sex; qb:order 3 ];
+          [ qb:dimension eg:refArea;         qb:order 1 ],
+          [ qb:dimension eg:refPeriod;       qb:order 2 ],
+          [ qb:dimension sdmx-dimension:sex; qb:order 3 ],
           [ qb:measure eg:lifeExpectancy];
           [qb:attribute sdmx-attribute:unitMeasure; qb:componentAttachment qb:DataSet; ] ;
       qb:sliceKey eg:sliceByRegion .
@@ -1100,7 +1107,7 @@
       qb:sliceStructure  eg:sliceByRegion ;
       eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
       sdmx-dimension:sex         sdmx-code:sex-M ;
-      qb:observation eg:o1b, eg:o2b; eg:o3b, ... .
+      qb:observation eg:o1b, eg:o2b, eg:o3b, ... .
 
   eg:o1b a qb:Observation;
       qb:dataSet  eg:dataset-le2 ;
@@ -1157,7 +1164,7 @@
       qb:sliceStructure  eg:sliceByRegion ;
       eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
       sdmx-dimension:sex         sdmx-code:sex-M ;
-      qb:observation eg:o1c, eg:o2c; eg:o3c, ... .
+      qb:observation eg:o1c, eg:o2c, eg:o3c, ... .
 
   eg:o1c a qb:Observation;
       qb:dataSet  eg:dataset-le3 ;
@@ -1382,8 +1389,8 @@
 statistics for the non-leaf concepts in the hierarchy. The Data Cube vocabulary itself imposes
 no constraints on how such aggregation is done. Indeed in statistical applications the
 appropriate statistical corrections to make to aggregated values may be non-trivial and dependent on 
-the data and precise analysis methodology. Even in simple, non-statistical, applications such
-as OLAP a number of different aggregation operators are commonly used.
+the data and precise analysis methodology. Similarly in other
+  applications such as OLAP a number of different aggregation operators are commonly used.
 </p>
 
 <p>Vocabulary terms to represent the aggregation operations employed within a given dataset, and how one dataset 
@@ -2684,6 +2691,319 @@
 
 </section>
 
+<section id="full-example" class="appendix">
+
+<h2>Complete example Data Cube</h2>
+
+<p>This is a complete Data Cube encoding of the running example
+   introduced in <a href="#example">section 5.4</a>. 
+   It uses the abbreviated format so that it can be concisely
+   presented. It passes all the integrity checks (when the
+   declaration of <code>sdmx-dimension:sex</code> is included 
+   from <a href="http://purl.org/linked-data/sdmx/2009/dimension">http://purl.org/linked-data/sdmx/2009/dimension</a>)
+   and so is a well-formed abbreviated Data Cube.
+   </p>
+
+<pre>
+@prefix rdf:      &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+@prefix rdfs:     &lt;http://www.w3.org/2000/01/rdf-schema#> .
+@prefix owl:      &lt;http://www.w3.org/2002/07/owl#> .
+@prefix xsd:      &lt;http://www.w3.org/2001/XMLSchema#> .
+@prefix skos:     &lt;http://www.w3.org/2004/02/skos/core#> .
+@prefix void:     &lt;http://rdfs.org/ns/void#> .
+@prefix dct:      &lt;http://purl.org/dc/terms/> .
+@prefix foaf:     &lt;http://xmlns.com/foaf/0.1/> .
+@prefix org:      &lt;http://www.w3.org/ns/org#> .
+@prefix admingeo: &lt;http://data.ordnancesurvey.co.uk/ontology/admingeo/> .
+@prefix interval: &lt;http://reference.data.gov.uk/def/intervals/> .
+
+@prefix qb:       &lt;http://purl.org/linked-data/cube#> .
+
+@prefix sdmx-concept:    &lt;http://purl.org/linked-data/sdmx/2009/concept#> .
+@prefix sdmx-dimension:  &lt;http://purl.org/linked-data/sdmx/2009/dimension#> .
+@prefix sdmx-attribute:  &lt;http://purl.org/linked-data/sdmx/2009/attribute#> .
+@prefix sdmx-measure:    &lt;http://purl.org/linked-data/sdmx/2009/measure#> .
+@prefix sdmx-metadata:   &lt;http://purl.org/linked-data/sdmx/2009/metadata#> .
+@prefix sdmx-code:       &lt;http://purl.org/linked-data/sdmx/2009/code#> .
+@prefix sdmx-subject:    &lt;http://purl.org/linked-data/sdmx/2009/subject#> .
+
+@prefix ex-geo:   &lt;http://example.org/geo#> .
+@prefix eg:       &lt;http://example.org/ns#> .
+
+# -- Data Set --------------------------------------------
+
+eg:dataset-le3 a qb:DataSet;
+    dct:title       "Life expectancy"@en;
+    rdfs:label      "Life expectancy"@en;
+    rdfs:comment    "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
+    dct:description "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
+    dct:publisher   eg:organization ;
+    dct:issued      "2010-08-11"^^xsd:date;
+    dct:subject
+        sdmx-subject:3.2 ,      # regional and small area statistics
+        sdmx-subject:1.4 ,      # Health
+        ex-geo:wales;           # Wales
+    qb:structure eg:dsd-le3 ;  
+    sdmx-attribute:unitMeasure &lt;http://dbpedia.org/resource/Year> ;
+    qb:slice eg:slice1, eg:slice2, eg:slice3, eg:slice4, eg:slice5, eg:slice6 ;
+    .
+
+eg:organization a org:Organization, foaf:Agent;
+    rdfs:label "Example org"@en .    
+        
+# -- Data structure definition ----------------------------
+
+eg:dsd-le3 a qb:DataStructureDefinition;
+    qb:component 
+    # The dimensions
+        [ qb:dimension eg:refArea;         qb:order 1 ],
+        [ qb:dimension eg:refPeriod;       qb:order 2; qb:componentAttachment qb:Slice ],
+        [ qb:dimension sdmx-dimension:sex; qb:order 3; qb:componentAttachment qb:Slice ];
+        
+    # The measure(s)
+    qb:component [ qb:measure eg:lifeExpectancy];
+    
+    # The attributes
+    qb:component [ qb:attribute sdmx-attribute:unitMeasure; 
+                   qb:componentRequired "true"^^xsd:boolean;
+                   qb:componentAttachment qb:DataSet; ] ;
+    
+    # slices
+    qb:sliceKey eg:sliceByRegion ;
+    .
+    
+eg:sliceByRegion a qb:SliceKey;
+    rdfs:label "slice by region"@en;
+    rdfs:comment "Slice by grouping regions together, fixing sex and time values"@en;
+    qb:componentProperty eg:refPeriod, sdmx-dimension:sex ;
+    .
+                   
+# -- Dimensions and measures  ----------------------------
+
+eg:refPeriod  a rdf:Property, qb:DimensionProperty;
+    rdfs:label "reference period"@en;
+    rdfs:subPropertyOf sdmx-dimension:refPeriod;
+    rdfs:range interval:Interval;
+    qb:concept sdmx-concept:refPeriod ;
+    .
+
+
+eg:refArea  a rdf:Property, qb:DimensionProperty;
+    rdfs:label "reference area"@en;
+    rdfs:subPropertyOf sdmx-dimension:refArea;
+    rdfs:range admingeo:UnitaryAuthority;
+    qb:concept sdmx-concept:refArea ;
+    .
+
+eg:lifeExpectancy  a rdf:Property, qb:MeasureProperty;
+    rdfs:label "life expectancy"@en;
+    rdfs:subPropertyOf sdmx-measure:obsValue;
+    rdfs:range xsd:decimal ;
+    .
+    
+# -- Observations -----------------------------------------
+
+# Column 1
+    
+eg:slice1 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-M ;
+    qb:observation eg:o11, eg:o12, eg:o13, eg:o14 ;
+    .
+
+eg:o11 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          76.7 ;
+    .
+    
+eg:o12 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          78.7 ;
+    .
+
+eg:o13 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          76.6 ;
+    .
+
+eg:o14 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          75.5 ;
+    .
+
+# Column 2
+    
+eg:slice2 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-F ;
+    qb:observation eg:o21, eg:o22, eg:o23, eg:o24 ;
+    .
+
+eg:o21 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          80.7 ;
+    .
+    
+eg:o22 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          83.3 ;
+    .
+
+eg:o23 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          81.3 ;
+    .
+
+eg:o24 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          79.1 ;
+    .
+
+# Column 3
+    
+eg:slice3 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-M ;
+    qb:observation eg:o31, eg:o32, eg:o33, eg:o34 ;
+    .
+
+eg:o31 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          77.1 ;
+    .
+    
+eg:o32 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          78.6 ;
+    .
+
+eg:o33 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          76.5 ;
+    .
+
+eg:o34 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          75.5 ;
+    .
+
+# Column 4
+    
+eg:slice4 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-F ;
+    qb:observation eg:o41, eg:o42, eg:o43, eg:o44 ;
+    .
+
+eg:o41 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          80.9 ;
+    .
+    
+eg:o42 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          83.7 ;
+    .
+
+eg:o43 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          81.5 ;
+    .
+
+eg:o44 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          79.4 ;
+    .
+
+# Column 5
+    
+eg:slice5 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-M ;
+    qb:observation eg:o51, eg:o52, eg:o53, eg:o54 ;
+    .
+
+eg:o51 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          77.0 ;
+    .
+    
+eg:o52 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          78.7 ;
+    .
+
+eg:o53 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          76.6 ;
+    .
+
+eg:o54 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          74.9 ;
+    .
+
+# Column 6
+    
+eg:slice6 a qb:Slice;
+    qb:sliceStructure  eg:sliceByRegion ;
+    eg:refPeriod               &lt;http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y> ;
+    sdmx-dimension:sex         sdmx-code:sex-F ;
+    qb:observation eg:o61, eg:o62, eg:o63, eg:o64 ;
+    .
+
+eg:o61 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:newport_00pr ;                  
+    eg:lifeExpectancy          81.5 ;
+    .
+    
+eg:o62 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:cardiff_00pt ;                  
+    eg:lifeExpectancy          83.4 ;
+    .
+
+eg:o63 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:monmouthshire_00pp ;                  
+    eg:lifeExpectancy          81.7 ;
+    .
+
+eg:o64 a qb:Observation;
+    qb:dataSet  eg:dataset-le3 ;
+    eg:refArea                 ex-geo:merthyr_tdfil_00ph ;
+    eg:lifeExpectancy          79.6 ;
+    .
+</pre>
+   
+</section>
+
 <section id="references_section" class="appendix">
 </section>
 
--- a/respec/gld-bib.js	Fri Apr 12 16:52:16 2013 +0100
+++ b/respec/gld-bib.js	Sat Apr 13 16:19:00 2013 +0100
@@ -31,6 +31,7 @@
         "status": "W3C Proposed Recommendation",
         "publisher": "W3C"
     },
+    "SDMX-GUIDE": "SDMX User Guide, Version 2009.1, January 2009.  Statistical Data and Metadata Exchange Initiative. URL: <a href=\"http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf\">http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf</a>",
     "SPARQL-UPDATE-11": {
         "authors": [
             "Paul Gearon",