--- a/data-cube/index.html Fri Apr 12 16:52:16 2013 +0100
+++ b/data-cube/index.html Sat Apr 13 16:19:00 2013 +0100
@@ -178,8 +178,7 @@
(web) addressable. This allows publishers and third parties to annotate
and link to this data; for example a report can reference the specific
figures it is based on allowing for fine grained provenance trace-back.</li>
- <li>Data can be flexibly combined across datasets and between
-statistical and non-statistical sets (for example <em>find all
+ <li>Data can be flexibly combined across datasets (for example <em>find all
Religious schools in census areas with high values for National
Indicators pertaining to religious tolerance</em>). The statistical
data becomes an integral part of the broader web of linked data.</li>
@@ -226,6 +225,9 @@
<p>The RDF Data Cube vocabulary builds upon the core of the
the SDMX 2.0 Information Model [[SDMX20]].</p>
+<p>Readers may find the SDMX User Guide [[SDMX-GUIDE]] useful
+ background.</p>
+
<p>A key component of the SDMX standards package are
the <strong>Content-Oriented Guidelines</strong> (COGs), a set of
cross-domain concepts, code lists, and categories that support
@@ -315,6 +317,35 @@
<section id="data-cubes" class="informative">
<h2>Data cubes</h2>
+<section id="cubes-model" class="informative">
+<h3>Data Sets</h3>
+
+<p>A DataSet is a collection of statistical data that corresponds to a
+ defined structure. The data in a data set can be roughly described as belonging to one of the following kinds:</p>
+
+<dl>
+ <dt>Observations</dt>
+ <dd>This is the actual data, the measured values. In a statistical table, the observations
+ would be the values in the table cells.</dd>
+
+ <dt>Organizational structure</dt>
+ <dd>To locate an observation within the hypercube, one must know at least the value of each
+ dimension at which the observation is located, so these values must be specified for each observation.
+ Datasets can have additional organizational structure in the form of <em>slices</em>
+ as described in <a href="#slices">section 7.2</a>.</dd>
+
+ <dt>Structural metadata</dt>
+ <dd>Having located an observation, we need certain metadata in order to be able to interpret it.
+ What is the unit of measurement? Is it a normal value or a series break?
+ Is the value measured or estimated? These metadata are provided as <em>attributes</em> and can
+ be attached to individual observations, or to higher levels.</dd>
+
+ <dt>Reference metadata</dt>
+ <dd>This is metadata that describes the dataset as a whole, such as categorization of the
+ dataset, its publisher, and a SPARQL endpoint where it can be accessed.
+ Such metadata is described in <a href="#metadata">section 9</a>.</dd>
+</dl>
+</section>
<section id="cubes-model" class="informative">
<h3>The cube model - dimensions, attributes, measures</h3>
@@ -362,8 +393,9 @@
dimensions and be able to refer to all observations with those
dimension values as a single entity. We call such a selection a <em>slice</em>
through the cube. For example, given a data set on regional performance
-indicators then we might group all the observations about a given indicator
-and a given region into a slice, each slice would then represent a time series of observed values.</p>
+indicators then we might group together all the observations about a given indicator
+and a given region. Each such group would be a slice representing a time
+ series of observed values.</p>
<p>A data publisher may identify slices through the data for various
purposes. They can be a useful grouping to which metadata might be attached, for example to note a
@@ -512,6 +544,9 @@
fixed for each slice. Such slices then show the variation in life expectancy across the
different regions, i.e. corresponding to the columns in the above tabular layout.</p>
+<p>A complete encoding of this data as a Data Cube, including such a
+ slice structure, is shown in <a href="#full-example">Appendix C</a>.</p>
+
</section>
</section>
@@ -636,7 +671,7 @@
<section id="dsd-example" class="informative">
-<h3>Example</h3>
+<h3>Example dimensions and measure</h3>
<p>Turning to our example data set then we can see there are three dimensions to represent
- time period, region (unitary authority) and sex. There is a single
@@ -909,34 +944,6 @@
<section id="datasets">
<h2>Expressing data sets</h2>
-<p>A DataSet is a collection of statistical data that corresponds to a given data structure definition.
-The data in a data set can be roughly described as belonging to one of the following kinds:</p>
-
-<dl>
- <dt>Observations</dt>
- <dd>This is the actual data, the measured values. In a statistical table, the observations
- would be the values in the table cells.</dd>
-
- <dt>Organizational structure</dt>
- <dd>To locate an observation within the hypercube, one has at least to know the value of each
- dimension at which the observation is located, so these values must be specified for each observation.
- Datasets can have additional organizational structure in the form of <em>slices</em>
- as described earlier in <a href="#slices">section 7.2</a>.
-
- <dt>Structural metadata</dt>
- <dd>Having located an observation, we need certain metadata in order to be able to interpret it.
- What is the unit of measurement? Is it a normal value or a series break?
- Is the value measured or estimated? These metadata are provided as <em>attributes</em> and can
- be attached to individual observations, or to higher levels as defined by the ComponentSpecification
- described earlier.</dd>
-
- <dt>Reference metadata</dt>
- <dd>This is metadata that describes the dataset as a whole, such as categorization of the
- dataset, its publisher, and a SPARQL endpoint where it can be accessed.
- External metadata is described in <a href="#metadata">section 9</a>.</dd>
-</dl>
-
-
<section id="dataset-basic">
<h3>Data sets and observations</h3>
@@ -1074,9 +1081,9 @@
eg:dsd-le-slice1 a qb:DataStructureDefinition;
qb:component
- [ qb:dimension eg:refArea; qb:order 1 ];
- [ qb:dimension eg:refPeriod; qb:order 2 ];
- [ qb:dimension sdmx-dimension:sex; qb:order 3 ];
+ [ qb:dimension eg:refArea; qb:order 1 ],
+ [ qb:dimension eg:refPeriod; qb:order 2 ],
+ [ qb:dimension sdmx-dimension:sex; qb:order 3 ],
- [ qb:measure eg:lifeExpectancy];
+ [ qb:measure eg:lifeExpectancy],
[qb:attribute sdmx-attribute:unitMeasure; qb:componentAttachment qb:DataSet; ] ;
qb:sliceKey eg:sliceByRegion .
@@ -1100,7 +1107,7 @@
qb:sliceStructure eg:sliceByRegion ;
eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
sdmx-dimension:sex sdmx-code:sex-M ;
- qb:observation eg:o1b, eg:o2b; eg:o3b, ... .
+ qb:observation eg:o1b, eg:o2b, eg:o3b, ... .
eg:o1b a qb:Observation;
qb:dataSet eg:dataset-le2 ;
@@ -1157,7 +1164,7 @@
qb:sliceStructure eg:sliceByRegion ;
eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
sdmx-dimension:sex sdmx-code:sex-M ;
- qb:observation eg:o1c, eg:o2c; eg:o3c, ... .
+ qb:observation eg:o1c, eg:o2c, eg:o3c, ... .
eg:o1c a qb:Observation;
qb:dataSet eg:dataset-le3 ;
@@ -1382,8 +1389,8 @@
statistics for the non-leaf concepts in the hierarchy. The Data Cube vocabulary itself imposes
no constraints on how such aggregation is done. Indeed in statistical applications the
appropriate statistical corrections to make to aggregated values may be non-trivial and dependent on
-the data and precise analysis methodology. Even in simple, non-statistical, applications such
-as OLAP a number of different aggregation operators are commonly used.
+the data and precise analysis methodology. Similarly, in other
+ applications such as OLAP, a number of different aggregation operators are commonly used.
</p>
<p>Vocabulary terms to represent the aggregation operations employed within a given dataset, and how one dataset
@@ -2684,6 +2691,319 @@
</section>
+<section id="full-example" class="appendix">
+
+<h2>Complete example Data Cube</h2>
+
+<p>This is a complete Data Cube encoding of the running example
+ introduced in <a href="#example">section 5.4</a>.
+ It uses the abbreviated format so that it can be concisely
+ presented. It passes all the integrity checks (when the
+ declaration of <code>sdmx-dimension:sex</code> is included
+ from <a href="http://purl.org/linked-data/sdmx/2009/dimension">http://purl.org/linked-data/sdmx/2009/dimension</a>)
+ and so is a well-formed abbreviated Data Cube.
+ </p>
+
+<pre>
+@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
+@prefix owl: <http://www.w3.org/2002/07/owl#> .
+@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
+@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
+@prefix void: <http://rdfs.org/ns/void#> .
+@prefix dct: <http://purl.org/dc/terms/> .
+@prefix foaf: <http://xmlns.com/foaf/0.1/> .
+@prefix org: <http://www.w3.org/ns/org#> .
+@prefix admingeo: <http://data.ordnancesurvey.co.uk/ontology/admingeo/> .
+@prefix interval: <http://reference.data.gov.uk/def/intervals/> .
+
+@prefix qb: <http://purl.org/linked-data/cube#> .
+
+@prefix sdmx-concept: <http://purl.org/linked-data/sdmx/2009/concept#> .
+@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
+@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
+@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
+@prefix sdmx-metadata: <http://purl.org/linked-data/sdmx/2009/metadata#> .
+@prefix sdmx-code: <http://purl.org/linked-data/sdmx/2009/code#> .
+@prefix sdmx-subject: <http://purl.org/linked-data/sdmx/2009/subject#> .
+
+@prefix ex-geo: <http://example.org/geo#> .
+@prefix eg: <http://example.org/ns#> .
+
+# -- Data Set --------------------------------------------
+
+eg:dataset-le3 a qb:DataSet;
+ dct:title "Life expectancy"@en;
+ rdfs:label "Life expectancy"@en;
+ rdfs:comment "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
+ dct:description "Life expectancy within Welsh Unitary authorities - extracted from Stats Wales"@en;
+ dct:publisher eg:organization ;
+ dct:issued "2010-08-11"^^xsd:date;
+ dct:subject
+ sdmx-subject:3.2 , # regional and small area statistics
+ sdmx-subject:1.4 , # Health
+ ex-geo:wales; # Wales
+ qb:structure eg:dsd-le3 ;
+ sdmx-attribute:unitMeasure <http://dbpedia.org/resource/Year> ;
+ qb:slice eg:slice1, eg:slice2, eg:slice3, eg:slice4, eg:slice5, eg:slice6 ;
+ .
+
+eg:organization a org:Organization, foaf:Agent;
+ rdfs:label "Example org"@en .
+
+# -- Data structure definition ----------------------------
+
+eg:dsd-le3 a qb:DataStructureDefinition;
+ qb:component
+ # The dimensions
+ [ qb:dimension eg:refArea; qb:order 1 ],
+ [ qb:dimension eg:refPeriod; qb:order 2; qb:componentAttachment qb:Slice ],
+ [ qb:dimension sdmx-dimension:sex; qb:order 3; qb:componentAttachment qb:Slice ];
+
+ # The measure(s)
+ qb:component [ qb:measure eg:lifeExpectancy];
+
+ # The attributes
+ qb:component [ qb:attribute sdmx-attribute:unitMeasure;
+ qb:componentRequired "true"^^xsd:boolean;
+ qb:componentAttachment qb:DataSet; ] ;
+
+ # slices
+ qb:sliceKey eg:sliceByRegion ;
+ .
+
+eg:sliceByRegion a qb:SliceKey;
+ rdfs:label "slice by region"@en;
+ rdfs:comment "Slice by grouping regions together, fixing sex and time values"@en;
+ qb:componentProperty eg:refPeriod, sdmx-dimension:sex ;
+ .
+
+# -- Dimensions and measures ----------------------------
+
+eg:refPeriod a rdf:Property, qb:DimensionProperty;
+ rdfs:label "reference period"@en;
+ rdfs:subPropertyOf sdmx-dimension:refPeriod;
+ rdfs:range interval:Interval;
+ qb:concept sdmx-concept:refPeriod ;
+ .
+
+
+eg:refArea a rdf:Property, qb:DimensionProperty;
+ rdfs:label "reference area"@en;
+ rdfs:subPropertyOf sdmx-dimension:refArea;
+ rdfs:range admingeo:UnitaryAuthority;
+ qb:concept sdmx-concept:refArea ;
+ .
+
+eg:lifeExpectancy a rdf:Property, qb:MeasureProperty;
+ rdfs:label "life expectancy"@en;
+ rdfs:subPropertyOf sdmx-measure:obsValue;
+ rdfs:range xsd:decimal ;
+ .
+
+# -- Observations -----------------------------------------
+
+# Column 1
+
+eg:slice1 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-M ;
+ qb:observation eg:o11, eg:o12, eg:o13, eg:o14 ;
+ .
+
+eg:o11 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 76.7 ;
+ .
+
+eg:o12 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 78.7 ;
+ .
+
+eg:o13 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 76.6 ;
+ .
+
+eg:o14 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 75.5 ;
+ .
+
+# Column 2
+
+eg:slice2 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2004-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-F ;
+ qb:observation eg:o21, eg:o22, eg:o23, eg:o24 ;
+ .
+
+eg:o21 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 80.7 ;
+ .
+
+eg:o22 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 83.3 ;
+ .
+
+eg:o23 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 81.3 ;
+ .
+
+eg:o24 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 79.1 ;
+ .
+
+# Column 3
+
+eg:slice3 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-M ;
+ qb:observation eg:o31, eg:o32, eg:o33, eg:o34 ;
+ .
+
+eg:o31 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 77.1 ;
+ .
+
+eg:o32 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 78.6 ;
+ .
+
+eg:o33 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 76.5 ;
+ .
+
+eg:o34 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 75.5 ;
+ .
+
+# Column 4
+
+eg:slice4 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2005-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-F ;
+ qb:observation eg:o41, eg:o42, eg:o43, eg:o44 ;
+ .
+
+eg:o41 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 80.9 ;
+ .
+
+eg:o42 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 83.7 ;
+ .
+
+eg:o43 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 81.5 ;
+ .
+
+eg:o44 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 79.4 ;
+ .
+
+# Column 5
+
+eg:slice5 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-M ;
+ qb:observation eg:o51, eg:o52, eg:o53, eg:o54 ;
+ .
+
+eg:o51 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 77.0 ;
+ .
+
+eg:o52 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 78.7 ;
+ .
+
+eg:o53 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 76.6 ;
+ .
+
+eg:o54 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 74.9 ;
+ .
+
+# Column 6
+
+eg:slice6 a qb:Slice;
+ qb:sliceStructure eg:sliceByRegion ;
+ eg:refPeriod <http://reference.data.gov.uk/id/gregorian-interval/2006-01-01T00:00:00/P3Y> ;
+ sdmx-dimension:sex sdmx-code:sex-F ;
+ qb:observation eg:o61, eg:o62, eg:o63, eg:o64 ;
+ .
+
+eg:o61 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:newport_00pr ;
+ eg:lifeExpectancy 81.5 ;
+ .
+
+eg:o62 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:cardiff_00pt ;
+ eg:lifeExpectancy 83.4 ;
+ .
+
+eg:o63 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:monmouthshire_00pp ;
+ eg:lifeExpectancy 81.7 ;
+ .
+
+eg:o64 a qb:Observation;
+ qb:dataSet eg:dataset-le3 ;
+ eg:refArea ex-geo:merthyr_tdfil_00ph ;
+ eg:lifeExpectancy 79.6 ;
+ .
+</pre>
+
+</section>
+
<section id="references_section" class="appendix">
</section>