--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/data-cube-ucr/data-cube-ucr-20130227/index.html Wed Feb 27 23:52:03 2013 +0100
@@ -0,0 +1,1739 @@
+<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
+<html lang="en-US">
+<head>
+<meta http-equiv="content-type" content="text/html; charset=UTF-8">
+<title>Use Cases and Requirements for the Data Cube Vocabulary</title>
+
+
+<script src="respec-ref.js"></script>
+<script src="respec-config.js"></script>
+<link rel="stylesheet" type="text/css" href="local-style.css">
+<style>/*****************************************************************
+ * ReSpec 3 CSS
+ * Robin Berjon - http://berjon.com/
+ *****************************************************************/
+
+/* --- INLINES --- */
+em.rfc2119 {
+ text-transform: lowercase;
+ font-variant: small-caps;
+ font-style: normal;
+ color: #900;
+}
+
+h1 acronym, h2 acronym, h3 acronym, h4 acronym, h5 acronym, h6 acronym, a acronym,
+h1 abbr, h2 abbr, h3 abbr, h4 abbr, h5 abbr, h6 abbr, a abbr {
+ border: none;
+}
+
+dfn {
+ font-weight: bold;
+}
+
+a.internalDFN {
+ color: inherit;
+ border-bottom: 1px solid #99c;
+ text-decoration: none;
+}
+
+a.externalDFN {
+ color: inherit;
+ border-bottom: 1px dotted #ccc;
+ text-decoration: none;
+}
+
+a.bibref {
+ text-decoration: none;
+}
+
+cite .bibref {
+ font-style: normal;
+}
+
+code {
+ color: #ff4500;
+}
+
+
+/* --- --- */
+ol.algorithm { counter-reset:numsection; list-style-type: none; }
+ol.algorithm li { margin: 0.5em 0; }
+ol.algorithm li:before { font-weight: bold; counter-increment: numsection; content: counters(numsection, ".") ") "; }
+
+/* --- TOC --- */
+.toc a, .tof a {
+ text-decoration: none;
+}
+
+a .secno, a .figno {
+ color: #000;
+}
+
+ul.tof, ol.tof {
+ list-style: none outside none;
+}
+
+.caption {
+ margin-top: 0.5em;
+ font-style: italic;
+}
+
+/* --- TABLE --- */
+table.simple {
+ border-spacing: 0;
+ border-collapse: collapse;
+ border-bottom: 3px solid #005a9c;
+}
+
+.simple th {
+ background: #005a9c;
+ color: #fff;
+ padding: 3px 5px;
+ text-align: left;
+}
+
+.simple th[scope="row"] {
+ background: inherit;
+ color: inherit;
+ border-top: 1px solid #ddd;
+}
+
+.simple td {
+ padding: 3px 10px;
+ border-top: 1px solid #ddd;
+}
+
+.simple tr:nth-child(even) {
+ background: #f0f6ff;
+}
+
+/* --- DL --- */
+.section dd > p:first-child {
+ margin-top: 0;
+}
+
+.section dd > p:last-child {
+ margin-bottom: 0;
+}
+
+.section dd {
+ margin-bottom: 1em;
+}
+
+.section dl.attrs dd, .section dl.eldef dd {
+ margin-bottom: 0;
+}
+</style><link rel="stylesheet" href="https://www.w3.org/StyleSheets/TR/W3C-NOTE"><!--[if lt IE 9]><script src='https://www.w3.org/2008/site/js/html5shiv.js'></script><![endif]--></head>
+
+<body><div class="head">
+ <p>
+
+ <a href="http://www.w3.org/"><img width="72" height="48" src="https://www.w3.org/Icons/w3c_home" alt="W3C"></a>
+
+ </p>
+ <h1 class="title" id="title">Use Cases and Requirements for the Data Cube Vocabulary</h1>
+
+ <h2 id="w3c-note-27-february-2013"><abbr title="World Wide Web Consortium">W3C</abbr> Note 27 February 2013</h2>
+ <dl>
+
+ <dt>This version:</dt>
+ <dd><a href="http://www.w3.org/TR/2013/NOTE-data-cube-ucr-20130227/">http://www.w3.org/TR/2013/NOTE-data-cube-ucr-20130227/</a></dd>
+ <dt>Latest published version:</dt>
+ <dd><a href="http://www.w3.org/TR/data-cube-ucr/">http://www.w3.org/TR/data-cube-ucr/</a></dd>
+
+
+ <dt>Latest editor's draft:</dt>
+ <dd><a href="http://dvcs.w3.org/hg/gld/raw-file/default/data-cube-ucr/data-cube-ucr-20120222/">http://dvcs.w3.org/hg/gld/raw-file/default/data-cube-ucr/data-cube-ucr-20120222/</a></dd>
+
+
+
+
+
+ <dt>Previous version:</dt>
+ <dd><a href=""></a></dd>
+
+
+ <dt>Editors:</dt>
+ <dd><a href="http://www.aifb.kit.edu/web/Benedikt_K%C3%A4mpgen/en">Benedikt Kämpgen</a>, <a href="http://www.fzi.de/index.php/en">FZI Karlsruhe</a></dd>
+<dd><a href="http://richard.cyganiak.de/">Richard Cyganiak</a>, <a href="http://www.deri.ie/">DERI, NUI Galway</a></dd>
+
+
+ </dl>
+
+
+
+
+
+ <p class="copyright">
+ <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> ©
+ 2013
+
+ <a href="http://www.w3.org/"><abbr title="World Wide Web Consortium">W3C</abbr></a><sup>®</sup>
+ (<a href="http://www.csail.mit.edu/"><abbr title="Massachusetts Institute of Technology">MIT</abbr></a>,
+ <a href="http://www.ercim.eu/"><abbr title="European Research Consortium for Informatics and Mathematics">ERCIM</abbr></a>,
+ <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved.
+ <abbr title="World Wide Web Consortium">W3C</abbr> <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>,
+ <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and
+ <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.
+ </p>
+
+
+ <hr>
+</div>
+
+ <section id="abstract" class="introductory"><h2>Abstract</h2>
+ <p>Many national, regional and local governments, as well as other
+ organisations in- and outside of the public sector, collect numeric
+ data and aggregate this data into statistics. There is a need to
+ publish these statistics in a standardised, machine-readable way on
+ the web, so that they can be freely integrated and reused in consuming
+ applications.</p>
+ <p>
+ In this document, the <a href="http://www.w3.org/2011/gld/"><abbr title="World Wide Web Consortium">W3C</abbr>
+ Government Linked Data Working Group</a> presents use cases and
+ requirements supporting a recommendation of the RDF Data Cube
+ Vocabulary [<cite><a href="#ref-QB-2013">QB-2013</a></cite>]. The
+ group obtained use cases from existing deployments of and experiences
+ with an earlier version of the data cube vocabulary [<cite><a href="#ref-QB-2010">QB-2010</a></cite>]. The group also describes a set of
+ requirements derived from the use cases and to be considered in the
+ recommendation.
+ </p>
+ </section><section id="sotd" class="introductory"><h2>Status of This Document</h2>
+
+
+
+ <p>
+ <em>This section describes the status of this document at the time of its publication. Other
+ documents may supersede this document. A list of current <abbr title="World Wide Web Consortium">W3C</abbr> publications and the latest revision
+ of this technical report can be found in the <a href="http://www.w3.org/TR/"><abbr title="World Wide Web Consortium">W3C</abbr> technical reports
+ index</a> at http://www.w3.org/TR/.</em>
+ </p>
+
+ <p>
+ This document is an editorial update to an Editor's Draft of the "Use
+ Cases and Requirements for the Data Cube Vocabulary" developed by the
+ <a href="http://www.w3.org/2011/gld/"><abbr title="World Wide Web Consortium">W3C</abbr> Government Linked Data
+ Working Group</a>.
+ </p>
+
+ <p>
+ This document was published by the <a href="http://www.w3.org/2011/gld/">Government Linked Data Working Group</a> as a Note.
+
+
+ If you wish to make comments regarding this document, please send them to
+ <a href="mailto:public-gld-comments@w3.org">public-gld-comments@w3.org</a>
+ (<a href="mailto:public-gld-comments-request@w3.org?subject=subscribe">subscribe</a>,
+ <a href="http://lists.w3.org/Archives/Public/public-gld-comments/">archives</a>).
+
+
+
+
+ All comments are welcome.
+
+
+ </p><p>
+ Publication as a Note does not imply endorsement by the <abbr title="World Wide Web Consortium">W3C</abbr> Membership.
+ This is a draft document and may be updated, replaced or obsoleted by other documents at
+ any time. It is inappropriate to cite this document as other than work in progress.
+ </p>
+
+
+ <p>
+
+ This document was produced by a group operating under the
+ <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 <abbr title="World Wide Web Consortium">W3C</abbr> Patent Policy</a>.
+
+
+
+
+ <abbr title="World Wide Web Consortium">W3C</abbr> maintains a <a href="" rel="disclosure">public list of any patent disclosures</a>
+
+ made in connection with the deliverables of the group; that page also includes instructions for
+ disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains
+ <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the
+ information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section
+ 6 of the <abbr title="World Wide Web Consortium">W3C</abbr> Patent Policy</a>.
+
+
+ </p>
+
+
+
+
+</section><section id="toc"><h2 class="introductory">Table of Contents</h2><ul class="toc"><li class="tocline"><a href="#introduction-1" class="tocxref"><span class="secno">1. </span>Introduction</a><ul class="toc"><li class="tocline"><a href="#describing-statistics" class="tocxref"><span class="secno">1.1 </span>Describing statistics</a></li></ul></li><li class="tocline"><a href="#terminology-1" class="tocxref"><span class="secno">2. </span>Terminology</a></li><li class="tocline"><a href="#use-cases" class="tocxref"><span class="secno">3. </span>Use cases</a><ul class="toc"><li class="tocline"><a href="#sdmx-web-dissemination-use-case" class="tocxref"><span class="secno">3.1 </span>SDMX Web Dissemination Use
+ Case</a></li><li class="tocline"><a href="#publisher-use-case-uk-government-financial-data-from-combined-online-information-system-coins" class="tocxref"><span class="secno">3.2 </span>Publisher
+ Use Case: UK government financial data from Combined Online
+ Information System (COINS)</a></li><li class="tocline"><a href="#publisher-use-case-publishing-excel-spreadsheets-as-linked-data" class="tocxref"><span class="secno">3.3 </span>Publisher Use
+ Case: Publishing Excel Spreadsheets as Linked Data</a></li><li class="tocline"><a href="#publisher-use-case-publishing-hierarchically-structured-data-from-statswales-and-open-data-communities" class="tocxref"><span class="secno">3.4 </span>Publisher
+ Use Case: Publishing hierarchically structured data from StatsWales
+ and Open Data Communities</a></li><li class="tocline"><a href="#publisher-use-case-publishing-slices-of-data-about-uk-bathing-water-quality" class="tocxref"><span class="secno">3.5 </span>Publisher
+ Use Case: Publishing slices of data about UK Bathing Water Quality</a></li><li class="tocline"><a href="#publisher-use-case-eurostat-sdmx-as-linked-data" class="tocxref"><span class="secno">3.6 </span>Publisher Use Case: Eurostat
+ SDMX as Linked Data</a></li><li class="tocline"><a href="#publisher-use-case-representing-relationships-between-statistical-data" class="tocxref"><span class="secno">3.7 </span>Publisher
+ Use Case: Representing relationships between statistical data</a></li><li class="tocline"><a href="#consumer-use-case-simple-chart-visualisations-of-integrated-published-statistical-data" class="tocxref"><span class="secno">3.8 </span>Consumer
+ Use Case: Simple chart visualisations of (integrated) published
+ statistical data</a></li><li class="tocline"><a href="#consumer-use-case-visualising-published-statistical-data-in-google-public-data-explorer" class="tocxref"><span class="secno">3.9 </span>Consumer
+ Use Case: Visualising published statistical data in Google Public Data
+ Explorer</a></li><li class="tocline"><a href="#consumer-use-case-analysing-published-statistical-data-with-common-olap-systems" class="tocxref"><span class="secno">3.10 </span>Consumer
+ Use Case: Analysing published statistical data with common OLAP
+ systems</a></li><li class="tocline"><a href="#registry-use-case-registering-published-statistical-data-in-data-catalogs" class="tocxref"><span class="secno">3.11 </span>Registry
+ Use Case: Registering published statistical data in data catalogs</a></li></ul></li><li class="tocline"><a href="#requirements-1" class="tocxref"><span class="secno">4. </span>Requirements</a><ul class="toc"><li class="tocline"><a href="#vocabulary-should-build-upon-the-sdmx-information-model" class="tocxref"><span class="secno">4.1 </span>Vocabulary
+ should build upon the SDMX information model</a></li><li class="tocline"><a href="#vocabulary-should-clarify-the-use-of-subsets-of-observations" class="tocxref"><span class="secno">4.2 </span>Vocabulary
+ should clarify the use of subsets of observations</a></li><li class="tocline"><a href="#vocabulary-should-recommend-a-mechanism-to-support-hierarchical-code-lists" class="tocxref"><span class="secno">4.3 </span>Vocabulary
+ should recommend a mechanism to support hierarchical code lists</a></li><li class="tocline"><a href="#vocabulary-should-define-relationship-to-iso19156---observations-measurements" class="tocxref"><span class="secno">4.4 </span>Vocabulary
+ should define relationship to ISO19156 - Observations & Measurements</a></li><li class="tocline"><a href="#there-should-be-a-recommended-mechanism-to-allow-for-publication-of-aggregates-which-cross-multiple-dimensions" class="tocxref"><span class="secno">4.5 </span>There
+ should be a recommended mechanism to allow for publication of
+ aggregates which cross multiple dimensions</a></li><li class="tocline"><a href="#there-should-be-a-recommended-way-of-declaring-relations-between-cubes" class="tocxref"><span class="secno">4.6 </span>There
+ should be a recommended way of declaring relations between cubes</a></li><li class="tocline"><a href="#there-should-be-criteria-for-well-formedness-and-assumptions-consumers-can-make-about-published-data" class="tocxref"><span class="secno">4.7 </span>There
+ should be criteria for well-formedness and assumptions consumers can
+ make about published data</a></li><li class="tocline"><a href="#there-should-be-mechanisms-and-recommendations-regarding-publication-and-consumption-of-large-amounts-of-statistical-data" class="tocxref"><span class="secno">4.8 </span>There
+ should be mechanisms and recommendations regarding publication and
+ consumption of large amounts of statistical data</a></li><li class="tocline"><a href="#there-should-be-a-recommended-way-to-communicate-the-availability-of-published-statistical-data-to-external-parties-and-to-allow-automatic-discovery-of-statistical-data" class="tocxref"><span class="secno">4.9 </span>There
+ should be a recommended way to communicate the availability of
+ published statistical data to external parties and to allow automatic
+ discovery of statistical data</a></li></ul></li><li class="tocline"><a href="#acknowledgements-1" class="tocxref"><span class="secno">A. </span>Acknowledgements</a></li></ul></section>
+
+
+
+ <section id="introduction-1">
+ <!--OddPage--><h2 id="introduction"><span class="secno">1. </span>Introduction</h2>
+ The aim of this document is to present concrete use cases and
+ requirements for a vocabulary to publish statistics as Linked Data. An
+ earlier version of the data cube vocabulary [<cite><a href="#ref-QB-2010">QB-2010</a></cite>] has existed for some time and
+ has proven applicable in <a href="http://wiki.planet-data.eu/web/Datasets">several deployments</a>.
+ The <a href="http://www.w3.org/2011/gld/"><abbr title="World Wide Web Consortium">W3C</abbr> Government Linked
+ Data Working Group</a> intends to transform the data cube vocabulary into
+ a <abbr title="World Wide Web Consortium">W3C</abbr> recommendation of the RDF Data Cube Vocabulary [<cite><a href="#ref-QB-2013">QB-2013</a></cite>]. This document describes use cases
+ and requirements derived from existing data cube deployments in order
+ to document and illustrate design decisions that have driven the work.
+
+ <p>The rest of this document is structured as follows. We will
+ first give a short introduction to the specificities of modelling
+ statistics. Then, we will describe use cases that have been derived
+ from existing deployments or feedback to the earlier data cube
+ vocabulary version. In particular, we describe possible benefits and
+ challenges of use cases. Afterwards, we will describe concrete
+ requirements that were derived from those use cases and that have been
+ taken into account for the specification.</p>
+
+ <p>We use the name data cube vocabulary throughout the document
+ when referring to the vocabulary.</p>
+
+ <section id="describing-statistics">
+ <h3 id="describing statistics"><span class="secno">1.1 </span>Describing statistics</h3>
+ <p>In the following, we describe the challenges that an RDF vocabulary
+ for publishing statistics as Linked Data has to address.</p>
+ <p>Describing statistics - collected and aggregated numeric data -
+ is challenging for the following reasons:</p>
+ <ul>
+ <li>Representing statistics requires more complex modeling as
+ discussed by Martin Fowler [<cite><a href="#ref-FOWLER97">FOWLER97</a></cite>]:
+ Recording a statistic simply as an attribute of an object (e.g., the
+ fact that a person weighs 185 pounds) fails to represent
+ important concepts such as quantity, measurement, and unit. Instead,
+ a statistic is modeled as a distinguishable object, an observation.
+ </li>
+ <li>The object describes an observation of a value, e.g., a
+ numeric value (e.g., 185) in case of a measurement or a categorical
+ value (e.g., "blood group A") in case of a categorical observation.</li>
+ <li>To allow correct interpretation of the value, the object can
+ be further described by "dimensions", e.g., the specific phenomenon
+ "weight" observed and the unit "pounds". Given background
+ information, e.g., arithmetical and comparative operations, humans
+ and machines can appropriately visualize such observations or
+ convert between different quantities.</li>
+ <li>Also, an observation separates a value from the actual event
+ at which it was collected; for instance, one can describe the
+ "Person" that collected the observation and the "Time" the
+ observation was collected.</li>
+ </ul>
+ The following figure illustrates this specificity of modelling in a
+ class diagram:
+
+ <p class="caption">Figure: Illustration of specificities in
+ modelling of a statistic</p>
+
+ <p align="center">
+ <img alt="specificity of modelling a
+ statistic" src="./figures/modeling_quantity_measurement_observation.png">
+ </p>
+
+ <p>
+ The Statistical Data and Metadata eXchange [<cite><a href="#ref-SDMX">SDMX</a></cite>] - the ISO standard for exchanging and
+ sharing of statistical data and metadata among organisations - uses a
+ "multidimensional model" that caters for the specificity of modelling
+ statistics. It allows statistics to be described as observations.
+ Observations exhibit values (Measures) that depend on dimensions
+ (Members of Dimensions).
+ </p>
+ <p>Since the SDMX standard has proven applicable in many contexts,
+ the vocabulary adopts the multidimensional model that underlies SDMX
+ and will be compatible with SDMX.</p>
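+ <p>The modelling idea described in this section can be sketched in a
+ few lines of Python. This is only an illustration under stated
+ assumptions, not part of the vocabulary: the class name
+ <code>Observation</code>, the dimension keys, and the person and time
+ values are hypothetical. It shows a statistic recorded as its own
+ object whose value is interpreted through its dimensions, which makes
+ a unit conversion possible.</p>

```python
from dataclasses import dataclass, field
from typing import Dict

# International definition of the avoirdupois pound in kilograms.
POUNDS_TO_KILOGRAMS = 0.45359237


@dataclass
class Observation:
    """A statistic modelled as a distinguishable object, not a plain attribute."""
    value: float                                            # the observed value
    dimensions: Dict[str, str] = field(default_factory=dict)  # context for interpreting it


def to_kilograms(obs: Observation) -> Observation:
    """Return a new observation with the value converted from pounds to kilograms."""
    if obs.dimensions.get("unit") != "pounds":
        raise ValueError("expected an observation measured in pounds")
    return Observation(
        value=obs.value * POUNDS_TO_KILOGRAMS,
        dimensions={**obs.dimensions, "unit": "kilograms"},
    )


# Hypothetical example values; only "weight", "pounds" and 185 come from the text.
weight = Observation(
    value=185,
    dimensions={
        "phenomenon": "weight",
        "unit": "pounds",
        "person": "example-collector",  # who collected the observation (assumed)
        "time": "2013-02-27",           # when it was collected (assumed)
    },
)
converted = to_kilograms(weight)
```

+ <p>Because the unit is stored with the observation rather than implied
+ by an attribute name, the conversion can check it before computing a
+ new value, and the original observation is left unchanged.</p>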
+
+ </section> </section>
+
+ <section id="terminology-1">
+ <!--OddPage--><h2 id="terminology"><span class="secno">2. </span>Terminology</h2>
+ <p>
+ <dfn id="dfn-statistics">Statistics</dfn>
+ is the <a href="http://en.wikipedia.org/wiki/Statistics">study</a> of
+ the collection, organisation, analysis, and interpretation of data.
+ Statistics comprise statistical data.
+ </p>
+
+ <p>
+
+ The basic structure of
+ <dfn id="dfn-statistical-data">statistical data</dfn>
+ is a multidimensional table (also called a data cube) [<cite><a href="#ref-SDMX">SDMX</a></cite>], i.e., a set of observed values organized
+ along a group of dimensions, together with associated metadata. If
+ aggregated, we refer to statistical data as "macro-data"; otherwise,
+ we refer to it as "micro-data".
+ </p>
+ <p>
+ Statistical data can be collected in a
+ <dfn id="dfn-dataset">dataset</dfn>
+ , typically published and maintained by an organisation [<cite><a href="#ref-SDMX">SDMX</a></cite>]. The dataset contains metadata, e.g.,
+ about the time of collection and publication or about the maintaining
+ and publishing organisation.
+ </p>
+
+ <p>
+ <dfn id="dfn-source-data">Source data</dfn>
+ is data from datastores such as RDBs or spreadsheets that acts as a
+ source for the Linked Data publishing process.
+ </p>
+
+ <p>
+ <dfn id="dfn-metadata">Metadata</dfn>
+ about statistics defines the data structure and gives contextual
+ information about the statistics.
+ </p>
+
+ <p>
+ A format is
+ <dfn id="dfn-machine-readable">machine-readable</dfn>
+ if it is amenable to automated processing by a machine, as opposed to
+ presentation to a human user.
+ </p>
+
+ <p>
+ A
+ <dfn id="dfn-publisher">publisher</dfn>
+ is a person or organisation that exposes source data as Linked Data on
+ the Web.
+ </p>
+
+ <p>
+ A
+ <dfn id="dfn-consumer">consumer</dfn>
+ is a person or agent that uses Linked Data from the Web.
+ </p>
+ <p>
+ A
+ <dfn id="dfn-registry">registry</dfn>
+ collects metadata about statistical data in a registration fashion.
+ </p>
+ </section>
+
+
+ <section id="use-cases">
+ <!--OddPage--><h2 id="usecases"><span class="secno">3. </span>Use cases</h2>
+ <p>This section presents scenarios that are enabled by the
+ existence of a standard vocabulary for the representation of
+ statistics as Linked Data.</p>
+
+ <section id="sdmx-web-dissemination-use-case">
+ <h3 id="SDMXWebDisseminationUseCase"><span class="secno">3.1 </span>SDMX Web Dissemination Use
+ Case</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case taken from SDMX Web
+ Dissemination Use Case [<cite><a href="#ref-SDMX-21">SDMX
+ 2.1</a></cite>])
+ </span>
+ </p>
+ <p>Since we have adopted the multidimensional model that underlies
+ SDMX, we also adopt the "Web Dissemination Use Case" which is the
+ prime use case for SDMX since it is an increasingly popular use of SDMX
+ and enables organisations to build a self-updating dissemination
+ system.</p>
+ <p>The Web Dissemination Use Case involves three actors: a
+ structural metadata web service (registry) that collects metadata
+ about statistical data in a registration fashion, a data web service
+ (publisher) that publishes statistical data and its metadata as
+ registered in the structural metadata web service, and a data
+ consumption application (consumer) that first discovers data from the
+ registry, then queries data from the corresponding publisher of
+ selected data, and then visualises the data.</p>
+ <p>In the following, we illustrate the processes from this use case
+ in a flow diagram by SDMX and describe what activities are enabled in
+ this use case by having statistics described in a machine-readable
+ format.</p>
+
+ <p class="caption">
+ Figure: Process flow diagram by SDMX [<cite><a href="#ref-SDMX-21">SDMX 2.1</a></cite>]
+ </p>
+
+ <p align="center">
+ <img alt="SDMX Web Dissemination Use Case" src="./figures/SDMX_Web_Dissemination_Use_Case.png" width="1000px">
+ </p>
+ <p>Benefits:</p>
+ <ul>
+ <li>A structural metadata source (registry) can collect metadata
+ about statistical data.</li>
+
+ <li>A data web service (publisher) can register statistical data
+ in a registry, and can provide statistical data from a database and
+ metadata from a metadata repository for consumers. For that, the
+ publisher creates database tables (see 1 in figure), and loads
+ statistical data in a database and metadata in a metadata repository.</li>
+
+ <li>A consumer can discover data from a registry (3) and
+ can automatically create a query to the publisher for selected
+ statistical data (4).</li>
+
+ <li>The publisher can translate the query to a query to its
+ database (5) as well as metadata repository (6) and return the
+ statistical data and metadata.</li>
+
+ <li>The consumer can visualise the returned statistical data and
+ metadata.</li>
+ </ul>
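+ <p>The interaction between the three actors listed above can be
+ sketched as a small in-memory Python model. This is a rough sketch
+ under stated assumptions, not the SDMX interfaces: the class names
+ <code>Registry</code> and <code>Publisher</code>, their methods, and
+ the sample dataset are all hypothetical. The numbers in the comments
+ refer to the steps in the figure.</p>

```python
from typing import Dict, List


class Publisher:
    """Data web service: holds statistical data and answers queries for it."""

    def __init__(self) -> None:
        self._database: Dict[str, List[dict]] = {}

    def load(self, dataset_id: str, observations: List[dict]) -> None:
        # (1) the publisher loads statistical data into its database
        self._database[dataset_id] = list(observations)

    def query(self, dataset_id: str, **fixed) -> List[dict]:
        # (5) translate the consumer's query into a lookup on the database
        return [
            obs
            for obs in self._database[dataset_id]
            if all(obs.get(key) == value for key, value in fixed.items())
        ]


class Registry:
    """Structural metadata service: records which publisher offers which dataset."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def register(self, dataset_id: str, title: str, publisher: Publisher) -> None:
        self._entries[dataset_id] = {"title": title, "publisher": publisher}

    def discover(self, keyword: str) -> List[str]:
        # (3) the consumer discovers datasets by searching registered metadata
        return [
            dataset_id
            for dataset_id, entry in self._entries.items()
            if keyword.lower() in entry["title"].lower()
        ]

    def publisher_of(self, dataset_id: str) -> Publisher:
        return self._entries[dataset_id]["publisher"]


# Hypothetical sample data, for illustration only.
publisher = Publisher()
publisher.load("population", [
    {"area": "A", "year": 2010, "value": 100},
    {"area": "A", "year": 2011, "value": 110},
    {"area": "B", "year": 2010, "value": 200},
])
registry = Registry()
registry.register("population", "Population by area and year", publisher)

# Consumer: discover (3), then query the corresponding publisher (4).
found = registry.discover("population")
result = registry.publisher_of(found[0]).query(found[0], area="A")
```

+ <p>The consumer never needs to know the publisher in advance; it only
+ needs the registry, which is what makes automatic discovery possible
+ in this sketch.</p>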
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata">There
+ should be a recommended way to communicate the availability of
+ published statistical data to external parties and to allow
+ automatic discovery of statistical data</a></li>
+ </ul>
+
+
+ <p>The SDMX Web Dissemination Use Case can be concretised by
+ several sub-use cases, detailed in the following sections.</p>
+
+ </section> <section id="publisher-use-case-uk-government-financial-data-from-combined-online-information-system-coins">
+ <h3 id="UKgovernmentfinancialdatafromCombinedOnlineInformationSystem"><span class="secno">3.2 </span>Publisher
+ Use Case: UK government financial data from Combined Online
+ Information System (COINS)</h3>
+ <p>
+ <span style="font-size: 10pt">(This use case has been
+ summarised from Ian Dickinson et al. [<cite><a href="#ref-COINS">COINS</a></cite>])
+ </span>
+ </p>
+ <p>More and more organisations want to publish statistics on the
+ web, for reasons such as increasing transparency and trust. Although
+ in the ideal case, published data can be understood by both humans and
+ machines, data is often simply published as CSV, PDF, XLS etc.,
+ lacking elaborate metadata, which makes free usage and analysis
+ difficult.</p>
+ <p>Therefore, the goal in this use case is to use a
+ machine-readable and application-independent description of common
+ statistics with use of open standards, to foster usage and innovation
+ on the published data.</p>
+ <p>
+ In the "COINS as Linked Data" project [<cite><a href="#ref-COINS">COINS</a></cite>], the Combined Online Information System
+ (COINS) shall be published using a standard Linked Data vocabulary.
+ </p>
+ <p>
+ Via the Combined Online Information System (COINS), <a href="http://www.hm-treasury.gov.uk/psr_coins_data.htm">HM
+ Treasury</a>, the principal custodian of financial data for the UK
+ government, releases previously restricted financial information about
+ government spending.
+ </p>
+
+
+ <p>According to the COINS as Linked Data project, the reasons for
+ publishing COINS as Linked Data are threefold:</p>
+ <ul>
+ <li>
+ <ul>
+ <li>using open standard representation makes it easier to work
+ with the data with available technologies and promises innovative
+ third-party tools and usages</li>
+ <li>individual transactions and groups of transactions are
+ given an identity, and so can be referenced by web address (URL),
+ to allow them to be discussed, annotated, or listed as source data
+ for articles or visualizations</li>
+ <li>cross-links between linked-data datasets allow for much
+ richer exploration of related datasets</li>
+ </ul>
+ </li>
+ <li>The COINS data has a hypercube structure. It describes
+ financial transactions using seven independent dimensions (time,
+ data-type, department etc.) and one dependent measure (value). Also,
+ it allows thirty-three attributes that may further describe each
+ transaction. For further information, see the "COINS as Linked Data"
+ project website.</li>
+ <li>COINS is an example of one of the more complex statistical
+ datasets being published via data.gov.uk.</li>
+ <li>Part of the complexity of COINS arises from the nature of the
+ data being released.</li>
+ <li>The published COINS datasets cover expenditure related to
+ five different years (2005–06 to 2009–10). The actual COINS database
+ at HM Treasury is updated daily. In principle at least, multiple
+ snapshots of the COINS data could be released through the year.</li>
+ </ul>
+
+ <p>The COINS use case leads to the following challenges:</p>
+ <ul>
+ <li>The actual data and its hypercube structure are to be
+ represented separately so that an application first can examine the
+ structure before deciding to download the actual data, i.e., the
+ transactions. The hypercube structure also defines for each dimension
+ and attribute a range of permitted values that are to be represented.</li>
+ <li>An access or query interface to the COINS data, e.g., via a
+ SPARQL endpoint or the linked data API, is planned. Queries that are
+ expected to be interesting are: "spending for one department", "total
+ spending by department", "retrieving all data for a given
+ observation".</li>
+ <li>Also, the publisher favours a representation that is both as
+ self-descriptive as possible, i.e., others can link to and download
+ fully-described individual transactions, and as compact as possible,
+ i.e., information is not unnecessarily repeated.</li>
+ <li>Moreover, the publisher is thinking about the possible
+ benefit of publishing slices of the data, e.g., datasets that fix all
+ dimensions but the time dimension. For instance, such slices could be
+ particularly interesting for visualisations or comments. However,
+ depending on the number of dimensions, the number of possible slices
+ can become large, which makes it difficult to select all interesting
+ slices.</li>
+ <li>An important benefit of linked data is that we are able to
+ annotate data, at a fine-grained level of detail, to record
+ information about the data itself. This includes where it came from –
+ the provenance of the data – but could include annotations from
+ reviewers, links to other useful resources, etc. Being able to trust
+ that data to be correct and reliable is a central value for
+ government-published data, so recording provenance is a key
+ requirement for the COINS data.</li>
+ <li>A challenge also is the size of the data, especially since it
+ is updated regularly. Five data files already contain between 3.3 and
+ 4.9 million rows of data.</li>
+ </ul>
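+ <p>The slicing challenge mentioned above can be made concrete with a
+ short Python sketch. It is only an illustration: the function names
+ <code>slice_by</code> and <code>max_slice_count</code>, the dimension
+ values, and the cardinalities are hypothetical and not taken from the
+ COINS data. It groups observations into slices that fix all
+ dimensions but one, and computes an upper bound on how many such
+ slices can exist.</p>

```python
from collections import defaultdict
from math import prod
from typing import Dict, List, Tuple

SliceKey = Tuple[Tuple[str, str], ...]


def slice_by(observations: List[dict], free_dimension: str) -> Dict[SliceKey, List[dict]]:
    """Group observations into slices that fix every dimension except one."""
    slices: Dict[SliceKey, List[dict]] = defaultdict(list)
    for obs in observations:
        # The slice key is the combination of all fixed dimension values.
        key = tuple(sorted(
            (name, value)
            for name, value in obs["dimensions"].items()
            if name != free_dimension
        ))
        slices[key].append(obs)
    return dict(slices)


def max_slice_count(cardinalities: Dict[str, int], free_dimension: str) -> int:
    """Upper bound on the number of slices: product of the fixed dimensions' sizes."""
    return prod(size for name, size in cardinalities.items() if name != free_dimension)


# Hypothetical observations with two of the dimensions named in the text.
observations = [
    {"dimensions": {"time": "2008-09", "department": "dept-a"}, "value": 10.0},
    {"dimensions": {"time": "2009-10", "department": "dept-a"}, "value": 12.0},
    {"dimensions": {"time": "2008-09", "department": "dept-b"}, "value": 7.0},
    {"dimensions": {"time": "2009-10", "department": "dept-b"}, "value": 9.0},
]
time_series = slice_by(observations, free_dimension="time")

# Assumed cardinalities: even three small dimensions already allow 30 slices.
bound = max_slice_count({"time": 5, "department": 10, "data-type": 3}, "time")
```

+ <p>Since the bound is a product over all fixed dimensions, it grows
+ quickly as dimensions are added, which is why a publisher would
+ typically select only the slices expected to be of interest.</p>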
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Vocabularyshouldclarifytheuseofsubsetsofobservations">Vocabulary
+ should clarify the use of subsets of observations</a></li>
+ </ul>
+
+ </section> <section id="publisher-use-case-publishing-excel-spreadsheets-as-linked-data">
+ <h3 id="PublishingExcelSpreadsheetsasLinkedData"><span class="secno">3.3 </span>Publisher Use
+ Case: Publishing Excel Spreadsheets as Linked Data</h3>
+ <p>
+ <span style="font-size: 10pt">(Part of this use case has been
+ contributed by Rinke Hoekstra. See <a href="http://ehumanities.nl/ceda_r/">CEDA_R</a> and <a href="http://www.data2semantics.org/">Data2Semantics</a> for more
+ information.)
+ </span>
+ </p>
+
+ <p>Not only in government is there a need to publish considerable
+ amounts of statistical data to be consumed in various (also
+ unexpected) application scenarios. Typically, Microsoft Excel sheets
+ are made available for download. Those Excel sheets contain single
+ spreadsheets with several multidimensional data tables, having a name
+ and notes, as well as column values, row values, and cell values.</p>
+ <p>Benefits:</p>
+ <ul>
+ <li>The goal in this use case is to publish spreadsheet
+ information in a machine-readable format on the web, e.g., so that
+ crawlers can find spreadsheets that use a certain column value. The
+ published data should represent and make available for queries the
+ most important information in the spreadsheets, e.g., rows, columns,
+ and cell values.</li>
+ <li>For instance, in the <a href="http://ehumanities.nl/ceda_r/">CEDA_R</a>
+ and <a href="http://www.data2semantics.org/">Data2Semantics</a>
+ projects, publishing and harmonizing Dutch historical census data
+ (from 1795 onwards) is a goal. These censuses are currently available
+ only as Excel spreadsheets (obtained by data entry) that closely mimic
+ the way in which the data was originally published; they shall be
+ published as Linked Data.
+ </li>
+ </ul>
+ <p>Challenges in this use case:</p>
+
+ <ul>
+ <li>All context and so all meaning of the measurement point is
+ expressed by means of dimensions. The pure number is the star of an
+ ego-network of attributes or dimensions. In an RDF representation it
+ is then easily possible to define hierarchical relationships between
+ the dimensions (that can be exemplified further) as well as mapping
+ different attributes across different value points. This way a
+ harmonization among variables is performed around the measurement
+ points themselves.</li>
+ <li>In historical research, harmonization across datasets has
+ until now been performed by hand and in successive iterations of a
+ database, so it is very hard to trace the provenance of decisions
+ made during the harmonization procedure.</li>
+ <li>Combining Data Cube with SKOS [<cite><a href="#ref-skos">SKOS</a></cite>] to allow for cross-location and
+ cross-time historical analysis
+ </li>
+ <li>Novel visualisation of census data</li>
+ <li>Integration with provenance vocabularies, e.g., PROV-O, for
+ tracking of harmonization steps</li>
+ <li>These challenges may seem to be particular to the field of
+ historical research, but in fact apply to government information at
+ large. Government is not a single body that publishes information at
+ a single point in time. Government consists of multiple (changing)
+ bodies, scattered across multiple levels, jurisdictions and areas.
+ Publishing government information in a consistent, integrated manner
+ requires exactly the type of harmonization required in this use case.</li>
+ <li>Excel sheets provide much flexibility in arranging
+ information. It may be necessary to limit this flexibility to allow
+ automatic transformation.</li>
+ <li>There are many spreadsheets.</li>
+ <li>Semi-structured information, e.g., notes about the lineage of
+ data cells, may be impossible to formalize.</li>
+ </ul>
+ <p>Existing work:</p>
+ <ul>
+ <li>Another concrete example is the <a href="http://ontowiki.net/Projects/Stats2RDF?show_comments=1">Stats2RDF</a>
+ project that intends to publish biomedical statistical data that is
+ represented as Excel sheets. Here, Excel files are first translated
+ into CSV and then translated into RDF.
+ </li>
+ <li>Some of the challenges are met by the work on an ISO
+ Extension to SKOS [<cite><a href="#ref-xkos">XKOS</a></cite>].
+ </li>
+ </ul>
+
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists">Vocabulary
+ should recommend a mechanism to support hierarchical code lists</a></li>
+ <li><a href="#Thereshouldbearecommendedwayofdeclaringrelationsbetweencubes">There
+ should be a recommended way of declaring relations between cubes</a></li>
+ </ul>
+
+
+ </section> <section id="publisher-use-case-publishing-hierarchically-structured-data-from-statswales-and-open-data-communities">
+ <h3 id="PublishinghierarchicallystructureddatafromStatsWalesandOpenDataCommunities"><span class="secno">3.4 </span>Publisher
+ Use Case: Publishing hierarchically structured data from StatsWales
+ and Open Data Communities</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case has been taken from [<cite><a href="#ref-QB4OLAP">QB4OLAP</a></cite>] and from discussions at <a href="http://groups.google.com/group/publishing-statistical-data/msg/7c80f3869ff4ba0f">publishing-statistical-data
+ mailing list</a>)
+ </span>
+ </p>
+
+ <p>It often comes up in statistical data that you have some kind of
+ 'overall' figure, which is then broken down into parts.</p>
+
+ <p>Example (in pseudo-turtle RDF):</p>
+ <pre>ex:obs1
+ sdmx:refArea &lt;uk&gt;;
+ sdmx:refPeriod "2011";
+ ex:population "60" .
+ex:obs2
+ sdmx:refArea &lt;england&gt;;
+ sdmx:refPeriod "2011";
+ ex:population "50" .
+ex:obs3
+ sdmx:refArea &lt;scotland&gt;;
+ sdmx:refPeriod "2011";
+ ex:population "5" .
+ex:obs4
+ sdmx:refArea &lt;wales&gt;;
+ sdmx:refPeriod "2011";
+ ex:population "3" .
+ex:obs5
+ sdmx:refArea &lt;northernireland&gt;;
+ sdmx:refPeriod "2011";
+ ex:population "2" .
+ </pre>
+
+ <p>
+ We are looking for the best way (in the context of the RDF/Data
+ Cube/SDMX approach) to express that the values for
+ England, Scotland, Wales and Northern Ireland ought to add up to the value
+ for the UK and constitute a more detailed breakdown of the overall UK
+ figure. Since we might also have population figures for France,
+ Germany, or the EU27, it is not as simple as just taking a
+ <code>qb:Slice</code>
+ where you fix the time period and the measure.
+ </p>
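+ <p>One possible way to make the part-whole relationship explicit,
+ sketched here under the assumption that areas are modelled as SKOS
+ concepts in a hierarchical code list, is to link each constituent
+ country to the UK. The <code>ex:</code> names are hypothetical, and the
+ sketch does not by itself state that the narrower values sum to the
+ broader one:</p>
+ <pre>@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
+@prefix ex:   &lt;http://example.org/ns#&gt; .
+
+# Hypothetical code list of areas
+ex:areas a skos:ConceptScheme ;
+    skos:hasTopConcept ex:uk .
+
+ex:uk a skos:Concept ;
+    skos:inScheme ex:areas ;
+    skos:narrower ex:england, ex:scotland, ex:wales, ex:northernireland .
+
+# Each part points back to the whole
+ex:england         a skos:Concept ; skos:inScheme ex:areas ; skos:broader ex:uk .
+ex:scotland        a skos:Concept ; skos:inScheme ex:areas ; skos:broader ex:uk .
+ex:wales           a skos:Concept ; skos:inScheme ex:areas ; skos:broader ex:uk .
+ex:northernireland a skos:Concept ; skos:inScheme ex:areas ; skos:broader ex:uk .
+ </pre>
+ <p>A consumer could then tell the UK breakdown apart from figures for
+ France or Germany, because only the four constituent countries are
+ narrower than the UK.</p>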
+
+ <p>
+ Similarly, Etcheverry and Vaisman [<cite><a href="#ref-QB4OLAP">QB4OLAP</a></cite>]
+ present the use case to publish household data from <a href="http://statswales.wales.gov.uk/index.htm">StatsWales</a> and <a href="http://opendatacommunities.org/doc/dataset/housing/household-projections">Open
+ Data Communities</a>.
+ </p>
+
+ <p>This multidimensional data contains for each fact a time
+ dimension with one level Year and a location dimension with levels
+ Unitary Authority, Government Office Region, Country, and ALL.</p>
+
+ <p>Values are given in units of 1000 households.</p>
+
+ <p>In this use case, one wants to publish not only a dataset on the
+ bottom-most level, i.e., the number of households in each
+ Unitary Authority in each year, but also datasets on more aggregated
+ levels.</p>
+
+ <p>For instance, in order to publish a dataset with the number of
+ households at each Government Office Region per year, one needs to
+ aggregate the measure of each fact having the same Government Office
+ Region using the SUM function.</p>
+
+ <p>Importantly, one would like to maintain the relationship between
+ the resulting datasets, i.e., the levels and aggregation functions.</p>
+
+ <p>Again, this use case does not simply need a selection (or "dice"
+ in OLAP context) where one fixes the time period dimension.</p>
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists">Vocabulary
+ should recommend a mechanism to support hierarchical code lists</a></li>
+ </ul>
+
+
+ </section> <section id="publisher-use-case-publishing-slices-of-data-about-uk-bathing-water-quality">
+ <h3 id="PublishingslicesofdataaboutUKBathingWaterQuality"><span class="secno">3.5 </span>Publisher
+ Use Case: Publishing slices of data about UK Bathing Water Quality</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case has been provided by
+ Epimorphics Ltd, in their <a href="http://www.epimorphics.com/web/projects/bathing-water-quality">UK
+ Bathing Water Quality</a> deployment)
+ </span>
+ </p>
+ <p>
+ As part of their work with data.gov.uk and the UK Location Programme,
+ Epimorphics Ltd have been working to pilot the publication of both
+ current and historic bathing water quality information from the <a href="http://www.environment-agency.gov.uk/">UK Environment
+ Agency</a> as Linked Data.
+ </p>
+ <p>The UK has a number of areas, typically beaches, that are
+ designated as bathing waters where people routinely enter the water.
+ The Environment Agency monitors and reports on the quality of the
+ water at these bathing waters.</p>
+ <p>The Environment Agency's data can be thought of as structured
+ in three groups:</p>
+ <ul>
+ <li>There is basic reference data describing the bathing waters
+ and sampling points</li>
+ <li>There is a data set "Annual Compliance Assessment Dataset"
+ giving the rating for each bathing water for each year it has been
+ monitored</li>
+ <li>There is a data set "In-Season Sample Assessment Dataset"
+ giving the detailed weekly sampling results for each bathing water</li>
+ </ul>
+ <p>The most important dimensions of the data are bathing water,
+ sampling point, and compliance classification.</p>
+ <p>Challenges:</p>
+ <ul>
+ <li>Observations may exhibit a number of attributes, e.g.,
+ whether there was an abnormal weather exception.</li>
+ <li>Relevant slices of both datasets are to be created:
+ <ul>
+ <li>Annual Compliance Assessment Dataset: all the observations
+ for a specific sampling point, all the observations for a specific
+ year.</li>
+ <li>In-Season Sample Assessment Dataset: samples for a given
+ sampling point, samples for a given week, samples for a given year,
+ samples for a given year and sampling point, latest samples for
+ each sampling point.</li>
+ <li>The use case suggests more arbitrary subsets of the
+ observations, e.g., collecting all the "latest" observations in a
+ continuously updated data set.</li>
+ </ul>
+
+
+ </li>
+ </ul>
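+ <p>The slices that fix a single dimension, such as "samples for a given
+ sampling point", could be sketched with the slice terms of the draft
+ Data Cube vocabulary as follows. All <code>ex:</code> identifiers are
+ hypothetical. Subsets such as "latest samples" are not defined by fixed
+ dimension values and are not covered by this sketch:</p>
+ <pre>@prefix qb: &lt;http://purl.org/linked-data/cube#&gt; .
+@prefix ex: &lt;http://example.org/ns#&gt; .
+
+# Slice key: names the dimension that is fixed within each slice
+ex:sliceBySamplingPoint a qb:SliceKey ;
+    qb:componentProperty ex:samplingPoint .
+
+# One slice: all in-season samples for a single sampling point
+ex:slice-pointA a qb:Slice ;
+    qb:sliceStructure ex:sliceBySamplingPoint ;
+    ex:samplingPoint ex:pointA ;                      # the fixed dimension value
+    qb:observation ex:sample-week23, ex:sample-week24 .
+
+# The dataset lists its slices
+ex:inSeasonDataset a qb:DataSet ;
+    qb:slice ex:slice-pointA .
+ </pre>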
+ <p>Existing Work:</p>
+ <ul>
+ <li>The <a href="http://purl.oclc.org/NET/ssnx/ssn">Semantic
+ Sensor Network ontology</a> (SSN) already provides a way to publish
+ sensor information. SSN data provides statistical Linked Data and
+ grounds its data to the domain, e.g., sensors that collect
+ observations (e.g., sensors measuring average of temperature over
+ location and time).
+ </li>
+ <li>A number of organisations, particularly in the climate and
+ meteorological area, already have some commitment to the OGC
+ "Observations and Measurements" (O&amp;M) logical data model, also
+ published as ISO 19156.</li>
+ </ul>
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#VocabularyshoulddefinerelationshiptoISO19156ObservationsMeasurements">Vocabulary
+ should define relationship to ISO19156 - Observations &amp; Measurements</a></li>
+ <li><a href="#Vocabularyshouldclarifytheuseofsubsetsofobservations">Vocabulary
+ should clarify the use of subsets of observations</a></li>
+ </ul>
+
+
+ </section> <section id="publisher-use-case-eurostat-sdmx-as-linked-data">
+ <h3 id="EurostatSDMXasLinkedData"><span class="secno">3.6 </span>Publisher Use Case: Eurostat
+ SDMX as Linked Data</h3>
+ <p>
+ <span style="font-size: 10pt">(This use case has been taken
+ from <a href="http://estatwrap.ontologycentral.com/">Eurostat
+ Linked Data Wrapper</a> and <a href="http://eurostat.linked-statistics.org/">Linked Statistics
+ Eurostat Data</a>, both deployments for publishing Eurostat SDMX as
+ Linked Data using the draft version of the data cube vocabulary)
+ </span>
+ </p>
+
+ <p>
+ As mentioned already, the ISO standard for exchanging and sharing
+ statistical data and metadata among organisations is Statistical Data
+ and Metadata eXchange [<cite><a href="#ref-SDMX">SDMX</a></cite>].
+ Since this standard has proven applicable in many contexts, we adopt
+ the multidimensional model that underlies SDMX and intend the standard
+ vocabulary to be compatible with SDMX.
+ </p>
+
+ <p>
+ Therefore, in this use case we intend to explain the benefits and
+ challenges of publishing SDMX data as Linked Data. As one of the main
+ adopters of SDMX, <a href="http://epp.eurostat.ec.europa.eu/">Eurostat</a>
+ publishes large amounts of European statistics coming from a data
+ warehouse as SDMX and other formats on the web. Eurostat also provides
+ an interface to browse and explore the datasets. However, linking such
+ multidimensional data to related datasets and concepts currently requires
+ downloading the datasets of interest and integrating them manually. The goal
+ here is to improve integration with other datasets: Eurostat data
+ should be published on the web in a machine-readable format that can
+ be linked with other datasets and freely consumed
+ by applications. Both <a href="http://estatwrap.ontologycentral.com/">Eurostat
+ Linked Data Wrapper</a> and <a href="http://eurostat.linked-statistics.org/">Linked Statistics
+ Eurostat Data</a> intend to publish <a href="http://epp.eurostat.ec.europa.eu/portal/page/portal/eurostat/home/">Eurostat
+ SDMX data</a> as <a href="http://5stardata.info/">5-star Linked Open
+ Data</a>. Eurostat data is partly published as SDMX, partly as tabular
+ data (TSV, similar to CSV). Eurostat provides a <a href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&file=table_of_contents_en.xml">TOC
+ of published datasets</a> as well as a feed of modified and new datasets.
+
+ Eurostat provides a list of used codelists, i.e., <a href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&dir=dic">range
+ of permitted dimension values</a>. Any Eurostat dataset contains a
+ varying set of dimensions (e.g., date, geo, obs_status, sex, unit) as
+ well as measures (generic value, content is specified by dataset,
+ e.g., GDP per capita in PPS, Total population, Employment rate by
+ sex).
+ </p>
+
+
+ <p>Benefits:</p>
+
+ <ul>
+ <li>Possible implementation of ETL pipelines based on Linked Data
+ technologies (e.g., <a href="http://code.google.com/p/ldspider/">LDSpider</a>)
+ to effectively load the data into a data warehouse for analysis
+ </li>
+
+ <li>Allows useful queries to the data, e.g., comparison of
+ statistical indicators across EU countries.</li>
+
+ <li>Allows contextual information to be attached to statistics during
+ the interpretation process.</li>
+
+ <li>Allows single observations from the data to be reused.</li>
+
+ <li>Linking to information from other data sources, e.g., for
+ geo-spatial dimension.
+ </li></ul>
+
+ <p>Challenges:</p>
+
+ <ul>
+ <li>New Eurostat datasets are added regularly to Eurostat. The
+ Linked Data representation should automatically provide access to the
+ most up-to-date data.</li>
+
+ <li>How to match elements of the geo-spatial dimension to
+ elements of other data sources, e.g., NUTS, GADM.</li>
+
+ <li>There is a large number of Eurostat datasets, each possibly
+ containing a large number of columns (dimensions) and rows
+ (observations). Eurostat publishes more than 5200 datasets, which,
+ when converted into RDF, require more than 350GB of disk space,
+ yielding a dataspace with some 8 billion triples.</li>
+
+ <li>In the Eurostat Linked Data Wrapper, there is a timeout for
+ transforming SDMX to Linked Data, since Google App Engine is used.
+ Mechanisms to reduce the amount of data that needs to be translated
+ would be needed.</li>
+
+ <li>Provide a useful interface for browsing and visualising the
+ data. One problem is that the datasets have too high a dimensionality
+ to be displayed directly. Instead, one could visualise slices of time
+ series data. However, for that, one would need to either fix most
+ other dimensions (e.g., sex) or aggregate over them (e.g., via
+ average). The selection of useful slices from the large number of
+ possible slices is a challenge.</li>
+
+ <li>Each dimension used by a dataset has a range of permitted
+ values that need to be described.</li>
+
+ <li>The Eurostat SDMX as Linked Data use case suggests having
+ time lines on data aggregating over the gender dimension.</li>
+
+ <li>The Eurostat SDMX as Linked Data use case suggests providing
+ data on a gender level and on a level aggregating over the gender
+ dimension.</li>
+
+ <li>Updates to the data
+
+ <ul>
+ <li>Eurostat - Linked Data pulls in changes from the original
+ Eurostat datasets on a weekly basis. The conversion process runs every
+ Saturday at noon, taking into account new datasets along with
+ updates to existing datasets.</li>
+ <li>Eurostat Linked Data Wrapper translates Eurostat
+ datasets into RDF on the fly, so that the most current data is always used. The
+ only remaining problem is pointing users towards the URIs of Eurostat
+ datasets: Estatwrap provides a feed of modified and new <a href="http://estatwrap.ontologycentral.com/feed.rdf">datasets</a>.
+ Also, it provides a <a href="http://estatwrap.ontologycentral.com/table_of_contents.html">TOC</a>
+ that could be automatically updated from the <a href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&file=table_of_contents_en.xml">Eurostat
+ TOC</a>.
+ </li>
+ </ul>
+
+
+ </li>
+
+ <li>Query interface:
+
+ <ul>
+ <li>Eurostat - Linked Data provides a SPARQL endpoint for the
+ metadata (not the observations).</li>
+ <li>Eurostat Linked Data Wrapper supports querying the data with
+ Qcrumb.com and demonstrates how to do so.</li>
+ </ul>
+
+ </li>
+
+ <li>Browsing and visualising interface:
+ <ul>
+ <li>Eurostat Linked Data Wrapper provides for each dataset an
+ HTML page showing a visualisation of the data.</li>
+ </ul>
+
+
+ </li>
+ </ul>
+
+ <p>Non-requirements:</p>
+ <ul>
+ <li>One possible application would run validation checks over
+ Eurostat data. The standard vocabulary is intended for publishing the
+ Eurostat data as-is, not for representing information for
+ validation (similar to business rules).</li>
+ <li>Information of how to match elements of the geo-spatial
+ dimension to elements of other data sources, e.g., NUTS, GADM, is not
+ part of a vocabulary recommendation.</li>
+ </ul>
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#VocabularyshouldbuildupontheSDMXinformationmodel">There
+ should be mechanisms and recommendations regarding publication and
+ consumption of large amounts of statistical data</a></li>
+ <li><a href="#Thereshouldbearecommendedmechanismtoallowforpublicationofaggregateswhichcrossmultipledimensions">There
+ should be a recommended mechanism to allow for publication of
+ aggregates which cross multiple dimensions</a></li>
+ </ul>
+ </section> <section id="publisher-use-case-representing-relationships-between-statistical-data">
+ <h3 id="Representingrelationshipsbetweenstatisticaldata"><span class="secno">3.7 </span>Publisher
+ Use Case: Representing relationships between statistical data</h3>
+ <p>
+ <span style="font-size: 10pt">(This use case has mainly been
+ taken from the COINS project [<cite><a href="#ref-COINS">COINS</a></cite>])
+ </span>
+ </p>
+
+ <p>In several applications, relationships between statistical data
+ need to be represented.</p>
+
+ <p>The goal of this use case is to describe provenance,
+ transformations, and versioning around statistical data, so that the
+ history of statistics published on the web becomes clear. This may
+ also relate to the issue of having relationships between datasets
+ published.</p>
+
+ <p>
+ For instance, the COINS project [<cite><a href="#ref-COINS">COINS</a></cite>]
+ has at least four perspectives on what they mean by “COINS” data: the
+ abstract notion of “all of COINS”, the data for a particular year, the
+ version of the data for a particular year released on a given date,
+ and the constituent graphs, which hold both the authoritative data
+ translated from HMT’s own sources and additional supplementary
+ information derived from that data, for example by
+ cross-linking to other datasets.
+ </p>
+
+ <p>Another specific use case is that the Welsh Assembly government
+ publishes a variety of population datasets broken down in different
+ ways. For many uses, population broken down by some category (e.g.,
+ ethnicity) is expressed as a percentage. Separate datasets give the
+ actual counts per category and aggregate counts. In such cases it is
+ common to talk about the denominator (often DENOM), which is the
+ aggregate count against which the percentages can be interpreted.</p>
+
+ <p>
+ Another example of relationships between statistical
+ data is transformations of datasets, e.g., addition of derived
+ measures, conversion of units, aggregations, OLAP operations, and
+ enrichment of statistical data. A concrete example is given by Freitas
+ et al. [<cite><a href="#ref-COGS">COGS</a></cite>] and illustrated in
+ the following figure.
+ </p>
+
+ <p class="caption">Figure: Illustration of ETL workflows to process
+ statistics</p>
+
+ <p align="center">
+ <img alt="COGS relationships between statistics example" src="./figures/Relationships_Statistical_Data_Cogs_Example.png">
+ </p>
+
+ <p>Here, numbers from a sustainability report have been created by
+ a number of transformations of statistical data. Different numbers
+ (e.g., 600 for year 2009 and 503 for year 2010) might have been
+ created differently, which affects how reliably the two numbers
+ can be compared.</p>
+ <p>Benefits:</p>
+
+ <p>Making the transformations a dataset has undergone transparent
+ increases trust in the data.</p>
+
+ <p>Challenges:</p>
+
+ <ul>
+ <li>Operations on statistical data result in new statistical
+ data, depending on the operation. For instance, in terms of Data
+ Cube, operations such as slice, dice, roll-up, drill-down will result
+ in new Data Cubes. This may require representing general
+ relationships between cubes (as discussed in the <a href="http://groups.google.com/group/publishing-statistical-data/browse_thread/thread/75762788de10de95">publishing-statistical-data
+ mailing list</a>).
+ </li>
+ <li>Should Data Cube support explicit declaration of such
+ relationships, either between separate <code>qb:DataSet</code>s or between
+ measures within a single <code>qb:DataSet</code> (e.g., <code>ex:populationCount</code>
+ and <code>ex:populationPercent</code>)?
+ </li>
+ <li>If so, should that be scoped to simple, common relationships
+ like DENOM, or allow the expression of arbitrary mathematical relations?</li>
+ </ul>
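+ <p>To make the question concrete, the following pseudo-Turtle sketch
+ shows what a simple DENOM-style declaration between two measures might
+ look like. The property <code>ex:denominator</code> is invented for
+ this illustration only; it is not a term of the Data Cube vocabulary,
+ and whether such a term should exist is exactly what this challenge
+ asks:</p>
+ <pre>@prefix qb: &lt;http://purl.org/linked-data/cube#&gt; .
+@prefix ex: &lt;http://example.org/ns#&gt; .
+
+ex:populationCount   a qb:MeasureProperty .
+
+ex:populationPercent a qb:MeasureProperty ;
+    # Hypothetical link: percentages are to be interpreted against
+    # the aggregate value of ex:populationCount
+    ex:denominator ex:populationCount .
+ </pre>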
+
+ <p>Existing Work:</p>
+ <ul>
+ <li>Possible relation to <a href="http://www.w3.org/2011/gld/wiki/Best_Practices_Discussion_Summary#Versioning">Versioning</a>
+ part of GLD Best Practices Document, where it is specified how to
+ publish data which has multiple versions.
+ </li>
+ <li>The <a href="http://sites.google.com/site/cogsvocab/">COGS</a>
+ vocabulary [<cite><a href="#ref-COGS">COGS</a></cite>] is related to
+ this use case since it may complement the standard vocabulary for
+ representing ETL pipelines processing statistics.
+ </li>
+ </ul>
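+ <p>As an illustration of how a single transformation step between two
+ datasets could be recorded, the following sketch assumes the PROV-O
+ terms <code>prov:used</code>, <code>prov:wasGeneratedBy</code> and
+ <code>prov:wasDerivedFrom</code>; the <code>ex:</code> datasets and the
+ activity are hypothetical:</p>
+ <pre>@prefix qb:   &lt;http://purl.org/linked-data/cube#&gt; .
+@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
+@prefix ex:   &lt;http://example.org/ns#&gt; .
+
+# Source dataset
+ex:emissionsOriginal a qb:DataSet, prov:Entity .
+
+# The transformation step, e.g., a conversion of units
+ex:unitConversion a prov:Activity ;
+    prov:used ex:emissionsOriginal .
+
+# Resulting dataset, linked to both its source and the step that produced it
+ex:emissionsConverted a qb:DataSet, prov:Entity ;
+    prov:wasDerivedFrom ex:emissionsOriginal ;
+    prov:wasGeneratedBy ex:unitConversion .
+ </pre>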
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbearecommendedwayofdeclaringrelationsbetweencubes">There
+ should be a recommended way of declaring relations between cubes</a></li>
+ </ul>
+
+ </section> <section id="consumer-use-case-simple-chart-visualisations-of-integrated-published-statistical-data">
+ <h3 id="Simplechartvisualisationsofpublishedstatisticaldata"><span class="secno">3.8 </span>Consumer
+ Use Case: Simple chart visualisations of (integrated) published
+ statistical data</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case taken from <a href="http://www.iwrm-smart.org/">SMART research project</a>)
+ </span>
+ </p>
+
+ <p>Data that is published on the Web is typically visualized by
+ transforming it manually into CSV or Excel and then creating a
+ visualization on top of these formats using Excel, Tableau,
+ RapidMiner, Rattle, Weka etc.</p>
+ <p>This use case shall demonstrate how statistical data published
+ on the web can be visualized inside a web page with little effort, without
+ using commercial or highly complex tools.</p>
+ <p>
+ An example scenario is environmental research done within the <a href="http://www.iwrm-smart.org/">SMART research project</a>. Here,
+ statistics about environmental aspects (e.g., measurements about the
+ climate in the Lower Jordan Valley) shall be visualized for scientists
+ and decision makers. It should also be possible to integrate
+ statistics and display them together. The data is available as XML files
+ on the web. On a separate website, specific parts of the data shall be
+ queried and visualized in simple charts, e.g., line diagrams.
+ </p>
+
+ <p class="caption">Figure: HTML embedded line chart of an
+ environmental measure over time for three regions in the lower Jordan
+ valley</p>
+
+ <p align="center">
+ <img alt="display of an environmental measure over time for three regions in the lower Jordan valley" src="./figures/Level_above_msl_3_locations.png" width="1000px">
+ </p>
+
+ <p class="caption">Figure: Showing the same data in a pivot table.
+ Here, the aggregate COUNT of measures per cell is given.</p>
+ <p align="center">
+ <img alt="Figure: Showing the same data in a pivot
+ table. Here, the aggregate COUNT of measures per cell is given." src="./figures/pivot_analysis_measurements.PNG">
+ </p>
+ <p>Challenges of this use case are:</p>
+ <ul>
+ <li>The difficulty lies in structuring the data appropriately so
+ that the specific information can be queried.</li>
+ <li>Also, data should be published with potential
+ integration in mind. Therefore, e.g., units of measurement need to
+ be represented.</li>
+ <li>Integration becomes much more difficult if publishers use
+ different measures or dimensions.</li>
+ </ul>
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+ should be criteria for well-formedness and assumptions consumers can
+ make about published data</a></li>
+ </ul>
+
+ </section> <section id="consumer-use-case-visualising-published-statistical-data-in-google-public-data-explorer">
+ <h3 id="VisualisingpublishedstatisticaldatainGooglePublicDataExplorer"><span class="secno">3.9 </span>Consumer
+ Use Case: Visualising published statistical data in Google Public Data
+ Explorer</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case taken from <a href="http://code.google.com/apis/publicdata/">Google Public Data
+ Explorer (GPDE)</a>)
+ </span>
+ </p>
+ <p>
+ <a href="http://code.google.com/apis/publicdata/">Google Public
+ Data Explorer</a> (GPDE) provides an easy way to visualize and
+ explore statistical data. Data needs to be in the <a href="https://developers.google.com/public-data/overview">Dataset
+ Publishing Language</a> (DSPL) to be uploaded to the data explorer. A
+ DSPL dataset is a bundle that contains an XML file, the schema, and a
+ set of CSV files, the actual data. Google provides a tutorial to
+ create a DSPL dataset from your data, e.g., in CSV. This requires a
+ good understanding of XML, as well as a good understanding of the data
+ that shall be visualized and explored.
+ </p>
+ <p>In this use case, the goal is to take statistical data published
+ on the web and to transform it into DSPL for visualization and
+ exploration with as little effort as possible.</p>
+ <p>For instance, Eurostat data about the unemployment rate, downloaded
+ from the web, can be visualized as shown in the following figure:</p>
+
+ <p class="caption">Figure: An interactive chart in GPDE for
+ visualising Eurostat data described with DSPL</p>
+ <p align="center">
+ <img alt="An interactive chart in GPDE for visualising Eurostat data in the DSPL" src="./figures/Eurostat_GPDE_Example.png" width="1000px">
+ </p>
+
+ <p>Benefits:</p>
+ <ul>
+ <li>If a standard Linked Data vocabulary is used, visualising and
+ exploring new data that already is represented using this vocabulary
+ can easily be done using GPDE.</li>
+ <li>Datasets can first be integrated using Linked Data technology
+ and then analysed using GPDE.</li>
+ </ul>
+ <p>Challenges of this use case are:</p>
+ <ul>
+ <li>There are different possible approaches, each having
+ advantages and disadvantages: 1) a consumer downloads the
+ data into a triple store; SPARQL queries on this data are then used to
+ transform the data into DSPL, which is uploaded to and visualized in GPDE;
+ 2) alternatively, one or more XSLT transformations on the RDF/XML transform the
+ data into DSPL.</li>
+ <li>The technical challenges for the consumer here lie in knowing
+ where to download what data and how to get it transformed into DSPL
+ without knowing the data.</li>
+ </ul>
+
+ <p>Non-requirements:</p>
+ <ul>
+ <li>DSPL is representative of the use of statistical data published
+ on the web in available analysis tools. Similar tools that may
+ be automatically covered are: Weka (ARFF data format), Tableau,
+ SPSS, STATA, PC-Axis etc.</li>
+ </ul>
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+ should be criteria for well-formedness and assumptions consumers can
+ make about published data</a></li>
+ </ul>
+ </section> <section id="consumer-use-case-analysing-published-statistical-data-with-common-olap-systems">
+ <h3 id="AnalysingpublishedstatisticaldatawithcommonOLAPsystems"><span class="secno">3.10 </span>Consumer
+ Use Case: Analysing published statistical data with common OLAP
+ systems</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case taken from <a href="http://xbrl.us/research/appdev/Pages/275.aspx">Financial
+ Information Observation System (FIOS)</a>)
+ </span>
+ </p>
+
+ <p>Online Analytical Processing (OLAP) is an analysis method for
+ multidimensional data. It is an explorative analysis method that
+ allows users to interactively view the data from different angles
+ (rotate, select) or at different granularities (drill-down, roll-up), and to filter it
+ for specific information (slice, dice).</p>
+
+ <p>OLAP systems first use ETL (Extract, Transform, Load) pipelines to
+ prepare relevant data for efficient storage and querying
+ in a data warehouse and then provide interfaces for issuing OLAP queries
+ on the data. Such systems are commonly used in industry to analyse statistical data
+ on a regular basis.</p>
+
+ <p>
+ The goal in this use case is to allow analysis of published
+ statistical data with common OLAP systems [<cite><a href="#ref-OLAP4LD">OLAP4LD</a></cite>].
+ </p>
+
+ <p>For that a multidimensional model of the data needs to be
+ generated. A multidimensional model consists of facts summarised in
+ data cubes. Facts exhibit measures depending on members of dimensions.
+ Members of dimensions can be further structured along hierarchies of
+ levels.</p>
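+ <p>As a rough sketch of how such a multidimensional model might be
+ read off published data, the following pseudo-Turtle declares a
+ structure with two dimensions and one measure, plus a single fact,
+ using terms of the draft Data Cube vocabulary. The <code>ex:</code>
+ properties and the value are hypothetical, and hierarchies of levels
+ are not shown:</p>
+ <pre>@prefix qb: &lt;http://purl.org/linked-data/cube#&gt; .
+@prefix ex: &lt;http://example.org/ns#&gt; .
+
+# Dimensions and measure of the cube
+ex:company a qb:DimensionProperty .
+ex:year    a qb:DimensionProperty .
+ex:amount  a qb:MeasureProperty .
+
+# Structure definition listing the components
+ex:dsd a qb:DataStructureDefinition ;
+    qb:component [ qb:dimension ex:company ] ,
+                 [ qb:dimension ex:year ] ,
+                 [ qb:measure   ex:amount ] .
+
+# The cube itself
+ex:cube a qb:DataSet ;
+    qb:structure ex:dsd .
+
+# One fact: a measure value for one member of each dimension
+ex:fact1 a qb:Observation ;
+    qb:dataSet ex:cube ;
+    ex:company ex:companyA ;
+    ex:year "2010" ;
+    ex:amount 1000 .
+ </pre>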
+
+ <p>
+ An example scenario of this use case is the Financial Information
+ Observation System (FIOS) [<cite><a href="#ref-FIOS">FIOS</a></cite>],
+ where XBRL data provided by the SEC on the web is to be re-published
+ as Linked Data so that stakeholders can explore and analyse it
+ in the web-based OLAP client Saiku.
+ </p>
+
+ <p>The following figure shows an example of using FIOS. Here, for
+ three different companies, the cost of goods sold as disclosed in XBRL
+ documents is analysed. As cell values, either the number of
+ disclosures or, if only one is available, the actual number in USD is
+ given:</p>
+
+
+ <p class="caption">Figure: Example of using FIOS for OLAP
+ operations on financial data</p>
+ <p align="center">
+ <img alt="Example of using FIOS for OLAP operations on financial data" src="./figures/FIOS_example.PNG">
+ </p>
+
+ <p>Benefits:</p>
+
+ <ul>
+ <li>OLAP operations cover typical business requirements, e.g.,
+ slice, dice, drill-down.</li>
+ <li>OLAP frontends are intuitive, interactive, explorative, and fast.
+ Their interfaces are well known to many people in industry.</li>
+ <li>OLAP functionality is provided by many tools that may be reused.</li>
+ </ul>
+
+ <p>Challenges:</p>
+ <ul>
+ <li>ETL pipeline needs to automatically populate a data
+ warehouse. Common OLAP systems use relational databases with a star
+ schema.</li>
+ <li>A problem lies in the strict separation between queries for
+ the structure of data (metadata queries), and queries for actual
+ aggregated values (OLAP operations).</li>
+ <li>Another problem lies in defining Data Cubes without greater
+ insight in the data beforehand.</li>
+ <li>Depending on the expressivity of the OLAP queries (e.g.,
+ aggregation functions, hierarchies, ordering), performance plays an
+ important role.</li>
+ <li>OLAP systems have to cater for possibly missing information
+ (e.g., the aggregation function or a human readable label).</li>
+ </ul>
+
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+ should be criteria for well-formedness and assumptions consumers can
+ make about published data</a></li>
+ </ul>
+ </section> <section id="registry-use-case-registering-published-statistical-data-in-data-catalogs">
+ <h3 id="Registeringpublishedstatisticaldataindatacatalogs"><span class="secno">3.11 </span>Registry
+ Use Case: Registering published statistical data in data catalogs</h3>
+ <p>
+ <span style="font-size: 10pt">(Use case motivated by <a href="http://www.w3.org/TR/vocab-dcat/">Data Catalog vocabulary</a>)
+ </span>
+ </p>
+
+ <p>
+ After statistics have been published as Linked Data, the question
+ remains how to communicate the publication and let users discover the
+ statistics. There are catalogs to register datasets, e.g., CKAN, <a href="http://www.datacite.org/">datacite.org</a>, <a href="http://www.gesis.org/dara/en/home/?lang=en">da|ra</a>, and <a href="http://pangaea.de/">PANGAEA</a>. Those catalogs require specific
+ configurations to register statistical data.
+ </p>
+
+ <p>The goal of this use case is to demonstrate how to expose and
+ distribute statistics after publication, for instance by allowing
+ automatic registration of statistical data in such catalogs so that
+ datasets can be found and evaluated. To this end, it should be
+ possible to transform the published statistical data into formats that
+ can be used by data catalogs.</p>
+
+ <p>
+ A concrete use case is the structured collection of <a href="http://wiki.planet-data.eu/web/Datasets">RDF Data Cube
+ Vocabulary datasets</a> in the PlanetData Wiki. This list is supposed to
+ describe statistical datasets on a higher level - for easy discovery
+ and selection - and to provide a useful overview of RDF Data Cube
+ deployments in the Linked Data cloud.
+ </p>
+
+ <p>Unanticipated Uses:</p>
+
+ <ul>
+ <li>If data catalogs contain statistics, they do not expose them
+ as Linked Data but, for instance, as CSV or HTML (e.g., PANGAEA).
+ Publishing such data using the Data Cube vocabulary could be a
+ further use case.</li>
+ </ul>
+
+ <p>Existing Work:</p>
+ <ul>
+ <li>The <a href="http://www.w3.org/TR/vocab-dcat/">Data
+ Catalog vocabulary</a> (DCAT) is strongly related to this use case since
+ it may complement the standard vocabulary for representing statistics
+ in the case of registering data in a data catalog.
+ </li>
+ </ul>
+
+
+ <p>Requirements:</p>
+ <ul>
+ <li><a href="#Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata">There
+ should be a recommended way to communicate the availability of
+ published statistical data to external parties and to allow
+ automatic discovery of statistical data</a></li>
+ </ul>
+ </section> </section>
+
+ <section id="requirements-1">
+ <!--OddPage--><h2 id="requirements"><span class="secno">4. </span>Requirements</h2>
+
+ <p>The use cases presented in the previous section give rise to the
+ following requirements for a standard representation of statistics.
+ Requirements are cross-linked with the use cases that motivate them.</p>
+
+
+ <section id="vocabulary-should-build-upon-the-sdmx-information-model">
+ <h3 id="VocabularyshouldbuildupontheSDMXinformationmodel"><span class="secno">4.1 </span>Vocabulary
+ should build upon the SDMX information model</h3>
+ <p>
+ The draft version of the vocabulary builds upon <a href="http://sdmx.org/?page_id=16">SDMX Standards Version 2.0</a>. A
+ newer version of SDMX, <a href="http://sdmx.org/?p=899">SDMX
+ Standards, Version 2.1</a>, is available.
+ </p>
+ <p>The requirement is to build upon at least Version 2.0; if
+ specific use cases derived from Version 2.1 become available, the
+ working group may consider building upon Version 2.1.</p>
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/37">http://www.w3.org/2011/gld/track/issues/37</a></li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#SDMXWebDisseminationUseCase">SDMX Web
+ Dissemination Use Case</a></li>
+ <li><a href="#UKgovernmentfinancialdatafromCombinedOnlineInformationSystem">Publisher
+ Use Case: UK government financial data from Combined Online
+ Information System (COINS)</a></li>
+ <li><a href="#EurostatSDMXasLinkedData">Publisher Use Case:
+ Eurostat SDMX as Linked Data</a></li>
+ </ul>
+
+ </section> <section id="vocabulary-should-clarify-the-use-of-subsets-of-observations">
+ <h3 id="Vocabularyshouldclarifytheuseofsubsetsofobservations"><span class="secno">4.2 </span>Vocabulary
+ should clarify the use of subsets of observations</h3>
+ <p>There should be a consensus on the issue of flattening or
+ abbreviating data; one suggestion is to author data without the
+ duplication, but have the data publication tools "flatten" the compact
+ representation into standalone observations during the publication
+ process.</p>
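+ <p>The suggested "flattening" step can be sketched as follows. This
+ is a minimal Python illustration under stated assumptions, not a
+ description of any existing publication tool: the compact input
+ layout (a list of slices, each with <code>fixed</code> dimension
+ values and a list of <code>observations</code>), the function name
+ <code>flatten</code> and all example values are invented for this
+ sketch.</p>

```python
def flatten(slices):
    """Copy the dimension values fixed on each slice onto its observations."""
    observations = []
    for s in slices:
        for obs in s["observations"]:
            # Observation-level values win if a key appears at both levels.
            observations.append({**s["fixed"], **obs})
    return observations


# Hypothetical compact form: values shared by a slice are stated once.
compact = [
    {
        "fixed": {"area": "area-1", "sex": "male"},
        "observations": [
            {"period": "2010", "value": 1.0},
            {"period": "2011", "value": 2.0},
        ],
    }
]

flat = flatten(compact)
```

+ <p>Each entry of <code>flat</code> then carries all dimension values
+ itself and can be consumed without reference to the slice it came
+ from, at the cost of repeating the shared values.</p>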
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/33">http://www.w3.org/2011/gld/track/issues/33</a></li>
+
+ <li>Since there are no use cases for qb:subslice, the vocabulary
+ should clarify or drop the use of qb:subslice; issue: <a href="http://www.w3.org/2011/gld/track/issues/34">http://www.w3.org/2011/gld/track/issues/34</a>
+ </li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#UKgovernmentfinancialdatafromCombinedOnlineInformationSystem">Publisher
+ Use Case: UK government financial data from Combined Online
+ Information System (COINS)</a></li>
+ <li><a href="#PublishingslicesofdataaboutUKBathingWaterQuality">Publisher
+ Use Case: Publishing slices of data about UK Bathing Water Quality</a></li>
+ </ul>
+
+ </section> <section id="vocabulary-should-recommend-a-mechanism-to-support-hierarchical-code-lists">
+ <h3 id="Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists"><span class="secno">4.3 </span>Vocabulary
+ should recommend a mechanism to support hierarchical code lists</h3>
+ <p>
+ First, hierarchical code lists may be supported via SKOS [<cite><a href="#ref-skos">SKOS</a></cite>]. Such code lists should allow for cross-location and cross-time
+ analysis of statistical datasets.
+ </p>
+ <p>
+ Second, one can think of non-SKOS hierarchical code lists, e.g., if
+ simple
+ <code>skos:narrower</code>
+ /
+ <code>skos:broader</code>
+ relationships are not sufficient or if a vocabulary uses specific
+ hierarchical properties such as
+ <code>geo:containedIn</code>.
+ </p>
+ <p>
+ Also, the use of hierarchy levels needs to be clarified. It has been
+ suggested to allow instances of
+ <code>skos:Collection</code>
+ as values of
+ <code>qb:codeList</code>.
+ </p>
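+ <p>To illustrate why a consumer needs the hierarchy, the following
+ Python sketch rolls observed values up along
+ <code>skos:broader</code> links. It is an illustration only: the
+ code list, the function names <code>ancestors</code> and
+ <code>roll_up</code> and the numbers are hypothetical, the hierarchy
+ is assumed to be acyclic with at most one broader code per code, and
+ summation is assumed to be a meaningful aggregation for the
+ measure.</p>

```python
# Hypothetical code list: each code maps to its skos:broader parent
# (None marks the top of the hierarchy).
BROADER = {
    "city-a": "region-x",
    "city-b": "region-x",
    "region-x": "country",
    "country": None,
}


def ancestors(code, broader):
    """Return all broader codes of `code`, nearest first."""
    chain = []
    parent = broader.get(code)
    while parent is not None:
        chain.append(parent)
        parent = broader.get(parent)
    return chain


def roll_up(values, broader):
    """Add each leaf value to every broader code above it."""
    totals = dict(values)
    for code, value in values.items():
        for ancestor in ancestors(code, broader):
            totals[ancestor] = totals.get(ancestor, 0) + value
    return totals


totals = roll_up({"city-a": 10, "city-b": 5}, BROADER)
```

+ <p>A hierarchy expressed with a vocabulary-specific property such as
+ <code>geo:containedIn</code> could be handled the same way once the
+ consumer knows which property plays the role of the parent link,
+ which is what the requirement asks the vocabulary to make
+ explicit.</p>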
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/31">http://www.w3.org/2011/gld/track/issues/31</a></li>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/39">http://www.w3.org/2011/gld/track/issues/39</a>
+ </li>
+ <li>Discussion at publishing-statistical-data mailing list: <a href="http://groups.google.com/group/publishing-statistical-data/msg/7c80f3869ff4ba0f">http://groups.google.com/group/publishing-statistical-data/msg/7c80f3869ff4ba0f</a></li>
+ <li>Part of the requirement is met by the work on an ISO
+ Extension to SKOS [<cite><a href="#ref-xkos">XKOS</a></cite>]
+ </li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#PublishingExcelSpreadsheetsasLinkedData">Publisher
+ Use Case: Publishing Excel Spreadsheets as Linked Data</a></li>
+ </ul>
+
+ </section> <section id="vocabulary-should-define-relationship-to-iso19156---observations-measurements">
+ <h3 id="VocabularyshoulddefinerelationshiptoISO19156ObservationsMeasurements"><span class="secno">4.4 </span>Vocabulary
+ should define relationship to ISO19156 - Observations &amp; Measurements</h3>
+ <p>A number of organisations, particularly in the climate and
+ meteorological area, already have some commitment to the OGC
+ "Observations and Measurements" (O&amp;M) logical data model, also
+ published as ISO 19156. Are there any statements about compatibility
+ and interoperability between O&amp;M and Data Cube that can be made to
+ give guidance to such organisations?</p>
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/32">http://www.w3.org/2011/gld/track/issues/32</a></li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#PublishingslicesofdataaboutUKBathingWaterQuality">Publisher
+ Use Case: Publishing slices of data about UK Bathing Water Quality</a></li>
+ </ul>
+
+ </section> <section id="there-should-be-a-recommended-mechanism-to-allow-for-publication-of-aggregates-which-cross-multiple-dimensions">
+ <h3 id="Thereshouldbearecommendedmechanismtoallowforpublicationofaggregateswhichcrossmultipledimensions"><span class="secno">4.5 </span>There
+ should be a recommended mechanism to allow for publication of
+ aggregates which cross multiple dimensions</h3>
+
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/31">http://www.w3.org/2011/gld/track/issues/31</a></li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li>E.g., the Eurostat SDMX as Linked Data use case suggests
+ having time lines on data aggregating over the gender dimension: <a href="#EurostatSDMXasLinkedData">Publisher Use Case: Eurostat
+ SDMX as Linked Data</a>
+ </li>
+ <li>Another possible use case could be provided by the <a href="http://data.gov.uk/resources/payments">Payment Ontology</a>.
+ </li>
+ </ul>
+
+ </section> <section id="there-should-be-a-recommended-way-of-declaring-relations-between-cubes">
+ <h3 id="Thereshouldbearecommendedwayofdeclaringrelationsbetweencubes"><span class="secno">4.6 </span>There
+ should be a recommended way of declaring relations between cubes</h3>
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/30">http://www.w3.org/2011/gld/track/issues/30</a></li>
+ <li>Discussion in <a href="http://groups.google.com/group/publishing-statistical-data/browse_thread/thread/75762788de10de95">publishing-statistical-data
+ mailing list</a>
+ </li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#Representingrelationshipsbetweenstatisticaldata">Publisher
+ Use Case: Representing relationships between statistical data</a></li>
+ </ul>
+
+ </section> <section id="there-should-be-criteria-for-well-formedness-and-assumptions-consumers-can-make-about-published-data">
+ <h3 id="Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata"><span class="secno">4.7 </span>There
+ should be criteria for well-formedness and assumptions consumers can
+ make about published data</h3>
+
+ <p>Background information:</p>
+ <ul>
+ <li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/29">http://www.w3.org/2011/gld/track/issues/29</a></li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#Simplechartvisualisationsofpublishedstatisticaldata">Consumer
+ Use Case: Simple chart visualisations of (integrated) published
+ statistical data</a></li>
+ <li><a href="#VisualisingpublishedstatisticaldatainGooglePublicDataExplorer">Consumer
+ Use Case: Visualising published statistical data in Google Public
+ Data Explorer</a></li>
+ <li><a href="#AnalysingpublishedstatisticaldatawithcommonOLAPsystems">Consumer
+ Use Case: Analysing published statistical data with common OLAP
+ systems</a></li>
+ </ul>
+
+ </section> <section id="there-should-be-mechanisms-and-recommendations-regarding-publication-and-consumption-of-large-amounts-of-statistical-data">
+ <h3 id="VocabularyshouldbuildupontheSDMXinformationmodel"><span class="secno">4.8 </span>There
+ should be mechanisms and recommendations regarding publication and
+ consumption of large amounts of statistical data</h3>
+ <p>Background information:</p>
+ <ul>
+ <li>Related issue regarding abbreviations: <a href="http://www.w3.org/2011/gld/track/issues/29">http://www.w3.org/2011/gld/track/issues/29</a>
+ </li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#EurostatSDMXasLinkedData">Publisher Use Case:
+ Eurostat SDMX as Linked Data</a></li>
+ </ul>
+
+ </section> <section id="there-should-be-a-recommended-way-to-communicate-the-availability-of-published-statistical-data-to-external-parties-and-to-allow-automatic-discovery-of-statistical-data">
+ <h3 id="Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata"><span class="secno">4.9 </span>There
+ should be a recommended way to communicate the availability of
+ published statistical data to external parties and to allow automatic
+ discovery of statistical data</h3>
+ <p>Clarify the relationship between DCAT and QB.</p>
+ <p>Background information:</p>
+ <ul>
+ <li>None.</li>
+ </ul>
+
+ <p>Required by:</p>
+ <ul>
+ <li><a href="#SDMXWebDisseminationUseCase">SDMX Web
+ Dissemination Use Case</a></li>
+ <li><a href="#Registeringpublishedstatisticaldataindatacatalogs">Registry
+ Use Case: Registering published statistical data in data catalogs</a></li>
+ </ul>
+
+ </section> </section>
+ <section class="appendix" id="acknowledgements-1">
+ <!--OddPage--><h2 id="acknowledgements"><span class="secno">A. </span>Acknowledgements</h2>
+ <p>We thank Rinke Hoekstra, Dave Reynolds, Bernadette Hyland,
+ Biplav Srivastava, John Erickson, Villazón-Terrazas for
+ feedback and input.</p>
+ </section>
+
+ <h2 id="references">References</h2>
+
+ <dl>
+
+ <dt id="ref-cog">[COG]</dt>
+ <dd>
+ SDMX Content Oriented Guidelines, <a href="http://sdmx.org/?page_id=11">http://sdmx.org/?page_id=11</a>
+ </dd>
+
+ <dt id="ref-COGS">[COGS]</dt>
+ <dd>
+ Freitas, A., Kämpgen, B., Oliveira, J. G., O’Riain, S., &amp; Curry, E.
+ (2012). Representing Interoperable Provenance Descriptions for ETL
+ Workflows. ESWC 2012 Workshop Highlights (pp. 1–15). Springer Verlag,
+ 2012 (in press). (Extended Paper published in Conf. Proceedings.). <a href="http://andrefreitas.org/papers/preprint_provenance_ETL_workflow_eswc_highlights.pdf">http://andrefreitas.org/papers/preprint_provenance_ETL_workflow_eswc_highlights.pdf</a>.
+ </dd>
+
+ <dt id="ref-COINS">[COINS]</dt>
+ <dd>
+ Ian Dickinson et al., COINS as Linked Data, <a href="http://data.gov.uk/resources/coins">http://data.gov.uk/resources/coins</a>,
+ last visited on Jan 9 2013
+ </dd>
+
+ <dt id="ref-FIOS">[FIOS]</dt>
+ <dd>
+ Andreas Harth, Sean O'Riain, Benedikt Kämpgen. Submission XBRL
+ Challenge 2011. <a href="http://xbrl.us/research/appdev/Pages/275.aspx">http://xbrl.us/research/appdev/Pages/275.aspx</a>.
+ </dd>
+
+
+ <dt id="ref-FOWLER97">[FOWLER97]</dt>
+ <dd>Fowler, Martin (1997). Analysis Patterns: Reusable Object
+ Models. Addison-Wesley. ISBN 0201895420.</dd>
+
+
+ <dt id="ref-linked-data">[LOD]</dt>
+ <dd>
+ Linked Data, <a href="http://linkeddata.org/">http://linkeddata.org/</a>
+ </dd>
+
+ <dt id="ref-OLAP">[OLAP]</dt>
+ <dd>
+ Online Analytical Processing Data Cubes, <a href="http://en.wikipedia.org/wiki/OLAP_cube">http://en.wikipedia.org/wiki/OLAP_cube</a>
+ </dd>
+
+ <dt id="ref-OLAP">[OLAP4LD]</dt>
+ <dd>
+ Kämpgen, B. and Harth, A. (2011). Transforming Statistical Linked
+ Data for Use in OLAP Systems. I-Semantics 2011. <a href="http://www.aifb.kit.edu/web/Inproceedings3211">http://www.aifb.kit.edu/web/Inproceedings3211</a>
+ </dd>
+
+ <dt id="ref-QB">[QB-2010]</dt>
+ <dd>
+ RDF Data Cube vocabulary, <a href="http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html">http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html</a>
+ </dd>
+
+ <dt id="ref-QB">[QB-2013]</dt>
+ <dd>
+ RDF Data Cube vocabulary, <a href="http://www.w3.org/TR/vocab-data-cube/">http://www.w3.org/TR/vocab-data-cube/</a>
+ </dd>
+
+ <dt id="ref-QB4OLAP">[QB4OLAP]</dt>
+ <dd>
+ Etcheverry, Vaisman. QB4OLAP: A New Vocabulary for OLAP Cubes on
+ the Semantic Web. <a href="http://publishing-multidimensional-data.googlecode.com/git/index.html">http://publishing-multidimensional-data.googlecode.com/git/index.html</a>
+ </dd>
+
+ <dt id="ref-rdf">[RDF]</dt>
+ <dd>
+ Resource Description Framework, <a href="http://www.w3.org/RDF/">http://www.w3.org/RDF/</a>
+ </dd>
+
+ <dt id="ref-scovo">[SCOVO]</dt>
+ <dd>
+ The Statistical Core Vocabulary, <a href="http://sw.joanneum.at/scovo/schema.html">http://sw.joanneum.at/scovo/schema.html</a>
+ <br> SCOVO: Using Statistics on the Web of data, <a href="http://sw-app.org/pub/eswc09-inuse-scovo.pdf">http://sw-app.org/pub/eswc09-inuse-scovo.pdf</a>
+ </dd>
+
+ <dt id="ref-skos">[SKOS]</dt>
+ <dd>
+ Simple Knowledge Organization System, <a href="http://www.w3.org/2004/02/skos/">http://www.w3.org/2004/02/skos/</a>
+ </dd>
+
+ <dt id="ref-SDMX">[SDMX]</dt>
+ <dd>
+ SDMX User Guide Version 2009.1, <a href="http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf">http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf</a>,
+ last visited Jan 8 2013.
+ </dd>
+
+ <dt id="ref-SDMX-21">[SDMX 2.1]</dt>
+ <dd>
+ SDMX 2.1 User Guide, Version 0.1, 19/09/2012, <a href="http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf">http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf</a>,
+ last visited on 8 Jan 2013.
+ </dd>
+
+ <dt id="ref-xkos">[XKOS]</dt>
+ <dd>
+ Extended Knowledge Organization System (XKOS), <a href="https://github.com/linked-statistics/xkos">https://github.com/linked-statistics/xkos</a>
+ </dd>
+
+ </dl>
+
+
+</body></html>
\ No newline at end of file