gld: changeset 289:34d03b6b4249

--- a/data-cube-ucr/index.html	Mon Feb 25 16:52:18 2013 +0100
+++ b/data-cube-ucr/index.html	Wed Feb 27 19:19:36 2013 +0100
@@ -91,8 +91,8 @@
 	The following figure illustrates this specificitiy of modelling in a
 	class diagram:
 
-	<p class="caption">Figure demonstrating specificity of modelling a
-		statistic</p>
+	<p class="caption">Figure: Illustration of specificities in
+		modelling of a statistic</p>
 
 	<p align="center">
 		<img alt="specificity of modelling a
@@ -196,7 +196,7 @@
 	<p>
 		<span style="font-size: 10pt">(Use case taken from SDMX Web
 			Dissemination Use Case [<cite><a href="#ref-SDMX-21">SDMX
-					2.1</a></cite>]
+					2.1</a></cite>])
 		</span>
 	</p>
 	<p>Since we have adopted the multidimensional model that underlies
@@ -217,13 +217,13 @@
 		SDMX and in more detail described as follows:</p>
 
 	<p class="caption">
-		Process flow diagram by SDMX [<cite><a href="#ref-SDMX-21">SDMX
-				2.1</a></cite>]
+		Figure: Process flow diagram by SDMX [<cite><a
+			href="#ref-SDMX-21">SDMX 2.1</a></cite>]
 	</p>
 
 	<p align="center">
 		<img alt="SDMX Web Dissemination Use Case"
-			src="./figures/SDMX_Web_Dissemination_Use_Case.png"></img>
+			src="./figures/SDMX_Web_Dissemination_Use_Case.png" width="1000px"></img>
 	</p>
 	<p>Benefits:</p>
 	<p>A structural metadata source (registry) collects metadata about
@@ -242,18 +242,894 @@
 		metadata.</p>
 
 	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata">There
+				should be a recommended way to communicate the availability of
+				published statistical data to external parties and to allow
+				automatic discovery of statistical data</a></li>
+	</ul>
+
 
 	<p>The SDMX Web Dissemination Use Case can be concretised by
 		several sub-use cases, detailed in the following sections.</p>
 
 	</section> <section>
-	<h3 id="COINS">Publisher Use Case: UK government financial data
-		from Combined Online Information System (COINS)</h3>
+	<h3 id="UKgovernmentfinancialdatafromCombinedOnlineInformationSystem">Publisher
+		Use Case: UK government financial data from Combined Online
+		Information System (COINS)</h3>
 	<p>
 		<span style="font-size: 10pt">(This use case has been
-			summarised from Ian Dickinson et al. (COINS as Linked Data.
-			http://data.gov.uk/resources/coins. Last visited on Jan 9 2013). </span>
+			summarised from Ian Dickinson et al. [<cite><a
+				href="#ref-COINS">COINS</a></cite>])
+		</span>
 	</p>
+	<p>More and more organizations want to publish statistics on the
+		web, for reasons such as increasing transparency and trust. Although
+		in the ideal case, published data can be understood by both humans and
+		machines, data often is simply published as CSV, PDF, XLS etc.,
+		lacking elaborate metadata, which makes free usage and analysis
+		difficult.</p>
+	<p>Therefore, the goal in this use case is to use a
+		machine-readable and application-independent description of common
+		statistics with use of open standards, to foster usage and innovation
+		on the published data.</p>
+	<p>In the "COINS as Linked Data" project (Ian Dickinson et al.
+		COINS as Linked Data. http://data.gov.uk/resources/coins. Last visited
+		on Jan 9 2013), the Combined Online Information System (COINS)
+		(Treasury's web site.
+		http://www.hm-treasury.gov.uk/psr_coins_data.htm. Last visited on Jan
+		9 2013) shall be published using a standard Linked Data vocabulary.</p>
+	<p>In the Combined Online Information System (COINS), HM Treasury,
+		the principal custodian of financial data for the UK government,
+		released previously restricted financial information about government
+		spending.</p>
+
+	<p>Benefits:</p>
+
+	According to the COINS as Linked Data project, the reasons for
+	publishing COINS as Linked Data are threefold.
+
+	<ul>
+		<li>using open standard representation makes it easier to work
+			with the data with available technologies and promises innovative
+			third-party tools and usages</li>
+		<li>individual transactions and groups of transactions are given
+			an identity, and so can be referenced by web address (URL), to allow
+			them to be discussed, annotated, or listed as source data for
+			articles or visualizations</li>
+		<li>cross-links between linked-data datasets allow for much
+			richer exploration of related datasets</li>
+	</ul>
+
+	<p>The COINS data has a hypercube structure. It describes financial
+		transactions using seven independent dimensions (time, data-type,
+		department etc.) and one dependent measure (value). Also, it allows
+		thirty-three attributes that may further describe each transaction.
+		For further information, see the "COINS as Linked Data" project
+		website.</p>
+
+	<p>COINS is an example of one of the more complex statistical
+		datasets being published via data.gov.uk.</p>
+
+	<p>Part of the complexity of COINS arises from the nature of the
+		data being released.</p>
+
+	<p>The published COINS datasets cover expenditure related to five
+		different years (2005–06 to 2009–10). The actual COINS database at HM
+		Treasury is updated daily. In principle at least, multiple snapshots
+		of the COINS data could be released through the year.</p>
+
+	<p>The COINS use case leads to the following challenges:</p>
+	<ul>
+		<li>The actual data and its hypercube structure are to be
+			represented separately so that an application first can examine the
+			structure before deciding to download the actual data, i.e., the
+			transactions. The hypercube structure also defines for each dimension
+			and attribute a range of permitted values that are to be represented.</li>
+		<li>An access or query interface to the COINS data, e.g., via a
+			SPARQL endpoint or the linked data API, is planned. Queries that are
+			expected to be interesting are: "spending for one department", "total
+			spending by department", "retrieving all data for a given
+			observation".</li>
+		<li>Also, the publisher favours a representation that is both as
+			self-descriptive as possible, i.e., others can link to and download
+			fully-described individual transactions and as compact as possible,
+			i.e., information is not unnecessarily repeated.</li>
+		<li>Moreover, the publisher is thinking about the possible
+			benefit of publishing slices of the data, e.g., datasets that fix all
+			dimensions but the time dimension. For instance, such slices could be
+			particularly interesting for visualisations or comments. However,
+			depending on the number of dimensions, the number of possible slices
+			can become large, which makes it difficult to select all interesting
+			slices.</li>
+		<li>An important benefit of linked data is that we are able to
+			annotate data, at a fine-grained level of detail, to record
+			information about the data itself. This includes where it came from –
+			the provenance of the data – but could include annotations from
+			reviewers, links to other useful resources, etc. Being able to trust
+			that data to be correct and reliable is a central value for
+			government-published data, so recording provenance is a key
+			requirement for the COINS data.</li>
+		<li>A challenge also is the size of the data, especially since it
+			is updated regularly. Five data files already contain between 3.3 and
+			4.9 million rows of data.</li>
+	</ul>
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Vocabularyshouldclarifytheuseofsubsetsofobservations">Vocabulary
+				should clarify the use of subsets of observations</a></li>
+	</ul>
+
+
+
+
+	</section> <section>
+	<h3 id="PublishingExcelSpreadsheetsasLinkedData">Publisher Use
+		Case: Publishing Excel Spreadsheets as Linked Data</h3>
+	<p>
+		<span style="font-size: 10pt">(Part of this use case has been
+			contributed by Rinke Hoekstra. See <a
+			href="http://ehumanities.nl/ceda_r/">CEDA_R</a> and <a
+			href="http://www.data2semantics.org/">Data2Semantics</a> for more
+			information.)
+		</span>
+	</p>
+
+	<p>Not only in government, there is a need to publish considerable
+		amounts of statistical data to be consumed in various (also
+		unexpected) application scenarios. Typically, Microsoft Excel sheets
+		are made available for download. Those Excel sheets contain single
+		spreadsheets with several multidimensional data tables, having a name
+		and notes, as well as column values, row values, and cell values.</p>
+	<p>Benefits:</p>
+	<p>The goal in this use case is to publish spreadsheet
+		information in a machine-readable format on the web, e.g., so that
+		crawlers can find spreadsheets that use a certain column value. The
+		published data should represent and make available for queries the
+		most important information in the spreadsheets, e.g., rows, columns,
+		and cell values.</p>
+	<p>
+		For instance, in the <a href="http://ehumanities.nl/ceda_r/">CEDA_R</a>
+		and <a href="http://www.data2semantics.org/">Data2Semantics</a>
+		projects publishing and harmonizing Dutch historical census data (from
+		1795 onwards) is a goal. These censuses are now only available as
+		Excel spreadsheets (obtained by data entry) that closely mimic the way
+		in which the data was originally published and shall be published as
+		Linked Data.
+	</p>
+	<p>Challenges in this use case:</p>
+	<p>All context and so all meaning of the measurement point is
+		expressed by means of dimensions. The pure number is the star of an
+		ego-network of attributes or dimensions. In an RDF representation it is
+		then easily possible to define hierarchical relationships between the
+		dimensions (that can be exemplified further) as well as mapping
+		different attributes across different value points. This way a
+		harmonization among variables is performed around the measurement
+		points themselves.</p>
+	<p>In historical research, until now, harmonization across datasets
+		is performed by hand, and in subsequent iterations of a database: it
+		is very hard to trace back the provenance of decisions made during the
+		harmonization procedure.</p>
+	<p>Combining Data Cube with SKOS to allow for cross-location and
+		cross-time historical analysis</p>
+	<p>Novel visualisation of census data</p>
+	<p>Integration with provenance vocabularies, e.g., PROV-O, for
+		tracking of harmonization steps</p>
+	<p>These challenges may seem to be particular to the field of
+		historical research, but in fact apply to government information at
+		large. Government is not a single body that publishes information at a
+		single point in time. Government consists of multiple (altering)
+		bodies, scattered across multiple levels, jurisdictions and areas.
+		Publishing government information in a consistent, integrated manner
+		requires exactly the type of harmonization required in this use case.</p>
+	<p>Excel sheets provide much flexibility in arranging information.
+		It may be necessary to limit this flexibility to allow automatic
+		transformation.</p>
+	<p>There are many spreadsheets.</p>
+	<p>Semi-structured information, e.g., notes about lineage of data
+		cells, may be impossible to formalize.</p>
+	<p>Another concrete example is the Stats2RDF [1] project that
+		intends to publish biomedical statistical data that is represented as
+		Excel sheets. Here, Excel files are first translated into CSV and then
+		translated into RDF.</p>
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists">Vocabulary
+				should recommend a mechanism to support hierarchical code lists</a></li>
+	</ul>
+
+
+	</section> <section>
+	<h3
+		id="PublishinghierarchicallystructureddatafromStatsWalesandOpenDataCommunities">Publisher
+		Use Case: Publishing hierarchically structured data from StatsWales
+		and Open Data Communities</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case has been taken from [<cite><a
+				href="#ref-SDMX-21">QB4OLAP</a></cite>])
+		</span>
+	</p>
+
+	<p>It often comes up in statistical data that you have some kind of
+		'overall' figure, which is then broken down into parts (GLD mailing
+		list discussion.
+		http://groups.google.com/group/publishing-statistical-data/msg/7c80f3869ff4ba0f).</p>
+
+	<p>
+		Etcheverry and Vaisman [<cite><a href="#ref-SDMX-21">QB4OLAP</a></cite>]
+		present the use case to publish household data from <a
+			href="http://statswales.wales.gov.uk/index.htm">StatsWales</a> and <a
+			href="http://opendatacommunities.org/doc/dataset/housing/household-projections">Open
+			Data Communities</a>.
+	</p>
+
+	<p>This multidimensional data contains for each fact a time
+		dimension with one level year and a location dimension with levels
+		Unitary Authority, Government Office Region, Country, and ALL.</p>
+
+	<p>The unit of measurement is 1000 households.</p>
+
+	<p>In this use case, one wants to publish not only a dataset on the
+		bottom-most level, i.e., the number of households at each
+		Unitary Authority in each year, but also a dataset on more aggregated
+		levels.</p>
+
+	<p>For instance, in order to publish a dataset with the number of
+		households at each Government Office Region per year, one needs to
+		aggregate the measure of each fact having the same Government Office
+		Region using the SUM function.</p>
+
+	<p>Importantly, one would like to maintain the relationship between
+		the resulting datasets, i.e., the levels and aggregation functions.</p>
+
+	<p>Note that this use case needs more than a selection (or "dice"
+		in the OLAP context) where one fixes the time period and the measure,
+		as could be expressed with a qb:Slice.</p>
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists">Vocabulary
+				should recommend a mechanism to support hierarchical code lists</a></li>
+	</ul>
+
+
+	</section> <section>
+	<h3 id="PublishingslicesofdataaboutUKBathingWaterQuality">Publisher
+		Use Case: Publishing slices of data about UK Bathing Water Quality</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case has been provided by
+			Epimorphics Ltd (<a
+			href="http://www.epimorphics.com/web/projects/bathing-water-quality">http://www.epimorphics.com/web/projects/bathing-water-quality</a>))
+		</span>
+	</p>
+	<p>As part of their work with data.gov.uk and the UK Location
+		Programme, Epimorphics Ltd have been working to pilot the publication
+		of both current and historic bathing water quality information from
+		the UK Environment Agency (http://www.environment-agency.gov.uk/) as
+		Linked Data.</p>
+	<p>The UK has a number of areas, typically beaches, that are
+		designated as bathing waters where people routinely enter the water.
+		The Environment Agency monitors and reports on the quality of the
+		water at these bathing waters.</p>
+	<p>The Environment Agency's data can be thought of as structured
+		in 3 groups:</p>
+	<ul>
+		<li>There is basic reference data describing the bathing waters
+			and sampling points</li>
+		<li>There is a data set "Annual Compliance Assessment Dataset"
+			giving the rating for each bathing water for each year it has been
+			monitored</li>
+		<li>There is a data set "In-Season Sample Assessment Dataset"
+			giving the detailed weekly sampling results for each bathing water</li>
+	</ul>
+	<p>The most important dimensions of the data are bathing water,
+		sampling point, and compliance classification.</p>
+	<p>Challenges:</p>
+
+	<p>Observations may exhibit a number of attributes, e.g., whether
+		there was an abnormal weather exception.</p>
+	<p>Relevant slices of both datasets are to be created:</p>
+	<ul>
+		<li>Annual Compliance Assessment Dataset: all the observations
+			for a specific sampling point, all the observations for a specific
+			year.</li>
+		<li>In-Season Sample Assessment Dataset: samples for a given
+			sampling point, samples for a given week, samples for a given year,
+			samples for a given year and sampling point, latest samples for each
+			sampling point.</li>
+		<li>The use case suggests more arbitrary subsets of the
+			observations, e.g., collecting all the "latest" observations in a
+			continuously updated data set.</li>
+	</ul>
+
+
+
+
+	<p>Existing Work:</p>
+	<ul>
+		<li>Semantic Sensor Network ontology (SSN) [2] already provides a
+			way to publish sensor information. SSN data provides statistical
+			Linked Data and grounds its data to the domain, e.g., sensors that
+			collect observations (e.g., sensors measuring average of temperature
+			over location and time).</li>
+		<li>A number of organizations, particularly in the climate and
+			meteorological area, already have some commitment to the OGC
+			"Observations and Measurements" (O&M) logical data model, also
+			published as ISO 19156.</li>
+	</ul>
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#VocabularyshoulddefinerelationshiptoISO19156ObservationsMeasurements">Vocabulary
+				should define relationship to ISO19156 - Observations & Measurements</a></li>
+		<li><a
+			href="#Vocabularyshouldclarifytheuseofsubsetsofobservations">Vocabulary
+				should clarify the use of subsets of observations</a></li>
+	</ul>
+
+
+	</section> <section>
+	<h3 id="EurostatSDMXasLinkedData">Publisher Use Case: Eurostat
+		SDMX as Linked Data</h3>
+	<p>
+		<span style="font-size: 10pt">(This use case has been taken
+			from <a href="http://estatwrap.ontologycentral.com/">Eurostat
+				Linked Data Wrapper</a> and <a
+			href="http://eurostat.linked-statistics.org/">Linked Statistics
+				Eurostat Data</a>, both deployments for publishing Eurostat SDMX as
+			Linked Data using the draft version of the vocabulary)
+		</span>
+	</p>
+
+	<p>As mentioned already, the ISO standard for exchanging and
+		sharing statistical data and metadata among organizations is
+		Statistical Data and Metadata eXchange (SDMX). Since this standard has
+		proven applicable in many contexts, we adopt the multidimensional
+		model that underlies SDMX and intend the standard vocabulary to be
+		compatible with SDMX.</p>
+
+	<p>
+		Therefore, in this use case we intend to explain the benefit and
+		challenges of publishing SDMX data as Linked Data. As one of the main
+		adopters of SDMX, <a href="http://epp.eurostat.ec.europa.eu/">Eurostat</a>
+		publishes large amounts of European statistics coming from a data
+		warehouse as SDMX and other formats on the web. Eurostat also provides
+		an interface to browse and explore the datasets. However, linking such
+		multidimensional data to related data sets and concepts would require
+		download of interesting datasets and manual integration. The goal here
+		is to improve integration with other datasets; Eurostat data should be
+		published on the web in a machine-readable format that can be
+		linked with other datasets and freely consumed by
+		applications. Both <a href="http://estatwrap.ontologycentral.com/">Eurostat
+			Linked Data Wrapper</a> and <a
+			href="http://eurostat.linked-statistics.org/">Linked Statistics
+			Eurostat Data</a> intend to publish Eurostat SDMX data as Linked Data. In
+		these use cases, <a
+			href="http://epp.eurostat.ec.europa.eu/portal/page/portal/eurostat/home/">Eurostat
+			data</a> shall be published as <a href="http://5stardata.info/">5-star
+			Linked Open Data</a>. Eurostat data is partly published as SDMX, partly
+		as tabular data (TSV, similar to CSV). Eurostat provides a <a
+			href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&file=table_of_contents_en.xml">TOC
+			of published datasets</a> as well as a feed of modified and new datasets.
+
+		Eurostat provides a list of used codelists, i.e., <a
+			href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&dir=dic">range
+			of permitted dimension values</a>. Any Eurostat dataset contains a
+		varying set of dimensions (e.g., date, geo, obs_status, sex, unit) as
+		well as measures (generic value, content is specified by dataset,
+		e.g., GDP per capita in PPS, Total population, Employment rate by
+		sex).
+	</p>
+
+
+	<p>Benefits:</p>
+
+	<ul>
+		<li>Possible implementation of ETL pipelines based on Linked Data
+			technologies (e.g., LDSpider) to load the data into a data warehouse
+			for analysis</li>
+
+		<li>Allows useful queries to the data, e.g., comparison of
+			statistical indicators across EU countries.</li>
+
+		<li>Allows attaching contextual information to statistics during
+			the interpretation process.</li>
+
+		<li>Allows reuse of single observations from the data.</li>
+
+		<li>Linking to information from other data sources, e.g., for
+			geo-spatial dimension.</li>
+	</ul>
+
+	<p>Challenges:</p>
+
+	<ul>
+		<li>New Eurostat datasets are added regularly to Eurostat. The
+			Linked Data representation should automatically provide access to the
+			most-up-to-date data.</li>
+
+		<li>How to match elements of the geo-spatial dimension to
+			elements of other data sources, e.g., NUTS, GADM.</li>
+
+		<li>There is a large number of Eurostat datasets, each possibly
+			containing a large number of columns (dimensions) and rows
+			(observations). Eurostat publishes more than 5200 datasets, which,
+			when converted into RDF require more than 350GB of disk space
+			yielding a dataspace with some 8 billion triples.</li>
+
+		<li>In the Eurostat Linked Data Wrapper, there is a timeout for
+			transforming SDMX to Linked Data, since Google App Engine is used.
+			Mechanisms to reduce the amount of data that needs to be translated
+			would be needed.</li>
+
+		<li>Provide a useful interface for browsing and visualising the
+			data. One problem is that the data sets have too high a dimensionality
+			to be displayed directly. Instead, one could visualise slices of time
+			series data. However, for that, one would need to either fix most
+			other dimensions (e.g., sex) or aggregate over them (e.g., via
+			average). The selection of useful slices from the large number of
+			possible slices is a challenge.</li>
+
+		<li>Each dimension used by a dataset has a range of permitted
+			values that ought to be represented.</li>
+
+		<li>The Eurostat SDMX as Linked Data use case suggests having
+			time lines on data aggregating over the gender dimension.</li>
+
+		<li>The Eurostat SDMX as Linked Data use case suggests providing
+			data on a gender level and on a level aggregating over the gender
+			dimension.</li>
+
+		<li>Updates to the data
+
+			<ul>
+				<li>Eurostat - Linked Data pulls in changes from the original
+					Eurostat dataset on a weekly basis; the conversion process runs every
+					Saturday at noon, taking into account new datasets along with
+					updates to existing datasets.</li>
+				<li>Eurostat Linked Data Wrapper translates Eurostat datasets
+					into RDF on the fly so that the most current data is always used. The
+					problem is only to point users towards the URIs of Eurostat
+					datasets: Estatwrap provides a feed of modified and new <a
+					href="http://estatwrap.ontologycentral.com/feed.rdf">datasets</a>.
+					Also, it provides a <a
+					href="http://estatwrap.ontologycentral.com/table_of_contents.html">TOC</a>
+					that could be automatically updated from the <a
+					href="http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&file=table_of_contents_en.xml">Eurostat
+						TOC</a>.
+				</li>
+			</ul>
+
+
+		</li>
+	</ul>
+		<p>Query interface</p>
+
+		<ul>
+			<li>Eurostat - Linked Data provides a SPARQL endpoint for the
+				metadata (not the observations).</li>
+			<li>Eurostat Linked Data Wrapper allows and demonstrates how to
+				use Qcrumb.com to query the data.</li>
+		</ul>
+
+		<p>Browsing and visualising interface:</p>
+		<ul>
+			<li>Eurostat Linked Data Wrapper provides for each dataset an
+				HTML page showing a visualisation of the data.</li>
+		</ul>
+
+
+
+
+
+		<p>Non-requirements:</p>
+		<ul>
+			<li>One possible application would run validation checks over
+				Eurostat data. The intended standard vocabulary is to publish the
+				Eurostat data as-is and is not intended to represent information for
+				validation (similar to business rules).</li>
+		</ul>
+
+		<p>Requirements:</p>
+		<ul>
+			<li><a href="#VocabularyshouldbuildupontheSDMXinformationmodel">There
+					should be mechanisms and recommendations regarding publication and
+					consumption of large amounts of statistical data</a></li>
+			<li><a
+				href="#Thereshouldbearecommendedmechanismtoallowforpublicationofaggregateswhichcrossmultipledimensions">There
+					should be a recommended mechanism to allow for publication of
+					aggregates which cross multiple dimensions</a></li>
+		</ul>
+	</section> <section>
+	<h3 id="Representingrelationshipsbetweenstatisticaldata">Publisher
+		Use Case: Representing relationships between statistical data</h3>
+	<p>
+		<span style="font-size: 10pt">(This use case has mainly been
+			taken from the COINS project [<cite><a href="#ref-COINS">COINS</a></cite>])
+		</span>
+	</p>
+
+	<p>In several applications, relationships between statistical data
+		need to be represented.</p>
+
+	<p>The goal of this use case is to describe provenance,
+		transformations, and versioning around statistical data, so that the
+		history of statistics published on the web becomes clear. This may
+		also relate to the issue of having relationships between datasets
+		published.</p>
+
+	<p>
+		For instance, the COINS project [<cite><a href="#ref-COINS">COINS</a></cite>]
+		has at least four perspectives on what they mean by “COINS” data: the
+		abstract notion of “all of COINS”, the data for a particular year, the
+		version of the data for a particular year released on a given date,
+		and the constituent graphs which hold both the authoritative data
+		translated from HMT’s own sources and additional supplementary
+		information which they derive from the data, for example by
+		cross-linking to other datasets.
+	</p>
+
+	<p>Another specific use case is that the Welsh Assembly government
+		publishes a variety of population datasets broken down in different
+		ways. For many uses, the population broken down by some category (e.g.
+		ethnicity) is expressed as a percentage. Separate datasets give the
+		actual counts per category and aggregate counts. In such cases it is
+		common to talk about the denominator (often DENOM) which is the
+		aggregate count against which the percentages can be interpreted.</p>
+
+	<p>
+		Further examples of relationships between statistical
+		data are transformations on datasets, e.g., addition of derived
+		measures, conversion of units, aggregations, OLAP operations, and
+		enrichment of statistical data. A concrete example is given by Freitas
+		et al. [<cite><a href="#ref-COGS">COGS</a></cite>] and illustrated in
+		the following figure.
+	</p>
+
+	<p class="caption">Figure: Illustration of ETL of statistics</p>
+
+	<p align="center">
+		<img alt="COGS relationships between statistics example"
+			src="./figures/Relationships_Statistical_Data_Cogs_Example.png"></img>
+	</p>
+
+	<p>Here, numbers from a sustainability report have been created by
+		a number of transformations to statistical data. Different numbers
+		(e.g., 600 for year 2009 and 503 for year 2010) might have been
+		created differently, which affects how reliably the two numbers can
+		be compared.</p>
+	<p>Benefits:</p>
+
+	<p>Making transparent the transformations a dataset has undergone
+		increases trust in the data.</p>
+
+	<p>Challenges:</p>
+
+	<ul>
+		<li>Operations on statistical data result in new statistical
+			data, depending on the operation. For instance, in terms of Data
+			Cube, operations such as slice, dice, roll-up, drill-down will result
+			in new Data Cubes. This may require representing general
+			relationships between cubes (as discussed in the <a
+			href="http://groups.google.com/group/publishing-statistical-data/browse_thread/thread/75762788de10de95">publishing-statistical-data
+				mailing list</a>).
+		</li>
+		<li>Should Data Cube support explicit declaration of such
+			relationships either between separate qb:DataSets or between
+			measures within a single <code>qb:DataSet</code> (e.g. <code>ex:populationCount</code>
+			and <code>ex:populationPercent</code>)?
+		</li>
+		<li>If so should that be scoped to simple, common relationships
+			like DENOM or allow expression of arbitrary mathematical relations?</li>
+	</ul>
+
+	<p>
+		Existing Work (optional):
+	</p>
+	<p>
+		Possible relation to <a
+			href="http://www.w3.org/2011/gld/wiki/Best_Practices_Discussion_Summary#Versioning">Versioning</a>
+		part of GLD Best Practices Document, where it is specified how to
+		publish data which has multiple versions.
+	</p>
+	<p>
+		The <a href="http://sites.google.com/site/cogsvocab/">COGS</a>
+		vocabulary [<cite><a href="#ref-COGS">COGS</a></cite>] is related to
+		this use case since it may complement the standard vocabulary for
+		representing ETL pipelines processing statistics.
+	</p>
+
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbearecommendedwayofdeclaringrelationsbetweencubes">There
+				should be a recommended way of declaring relations between cubes</a></li>
+	</ul>
+
+	</section> <section>
+	<h3 id="Simplechartvisualisationsofpublishedstatisticaldata">Consumer
+		Use Case: Simple chart visualisations of (integrated) published
+		statistical data</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case taken from <a
+			href="http://www.iwrm-smart.org/">SMART research project</a>)
+		</span>
+	</p>
+
+	<p>Data that is published on the Web is typically visualized by
+		transforming it manually into CSV or Excel and then creating a
+		visualization on top of these formats using Excel, Tableau,
+		RapidMiner, Rattle, Weka etc.</p>
+	<p>This use case shall demonstrate how statistical data published
+		on the web can be visualized inside a webpage with little effort, without
+		using commercial or highly-complex tools.</p>
+	<p>
+		An example scenario is environmental research done within the <a
+			href="http://www.iwrm-smart.org/">SMART research project</a>. Here,
+		statistics about environmental aspects (e.g., measurements about the
+		climate in the Lower Jordan Valley) shall be visualized for scientists
+		and decision makers. It should also be possible to integrate
+		statistics and display them together. The data is available as XML files
+		on the web. On a separate website, specific parts of the data shall be
+		queried and visualized in simple charts, e.g., line diagrams.
+	</p>
+
+	<p class="caption">Figure: HTML embedded line chart of an
+		environmental measure over time for three regions in the lower Jordan
+		valley</p>
+
+	<p align="center">
+		<img
+			alt="display of an environmental measure over time for three regions in the lower Jordan valley"
+			src="./figures/Level_above_msl_3_locations.png" width="1000px"></img>
+	</p>
+
+	<p class="caption">Figure: Showing the same data in a pivot table.
+		Here, the aggregate COUNT of measures per cell is given.</p>
+	<p align="center">
+		<img
+			alt="Figure: Showing the same data in a pivot
+		table. Here, the aggregate COUNT of measures per cell is given."
+			src="./figures/pivot_analysis_measurements.PNG"></img>
+	</p>
+	<p>Challenges of this use case are:</p>
+	<ul>
+		<li>The difficulties lie in structuring the data appropriately so
+			that the specific information can be queried.</li>
+		<li>Also, data shall be published with potential
+			integration in mind. Therefore, e.g., units of measurement need to
+			be represented.</li>
+		<li>Integration becomes much more difficult if publishers use
+			different measures or dimensions.</li>
+	</ul>
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+				should be criteria for well-formedness and assumptions consumers can
+				make about published data</a></li>
+	</ul>
+
+	</section> <section>
+	<h3 id="VisualisingpublishedstatisticaldatainGooglePublicDataExplorer">Consumer
+		Use Case: Visualising published statistical data in Google Public Data
+		Explorer</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case taken from <a
+			href="http://code.google.com/apis/publicdata/">Google Public Data
+				Explorer (GPDE)</a>)
+		</span>
+	</p>
+	<p>
+		<a href="http://code.google.com/apis/publicdata/">Google Public
+			Data Explorer</a> (GPDE) provides an easy way to visualize and
+		explore statistical data. Data needs to be in the <a
+			href="https://developers.google.com/public-data/overview">Dataset
+			Publishing Language</a> (DSPL) to be uploaded to the data explorer. A
+		DSPL dataset is a bundle that contains an XML file, the schema, and a
+		set of CSV files, the actual data. Google provides a tutorial to
+		create a DSPL dataset from your data, e.g., in CSV. This requires a
+		good understanding of XML, as well as a good understanding of the data
+		that shall be visualized and explored.
+	</p>
+	<p>In this use case, the goal is to take statistical data published
+		on the web and to transform it into DSPL for visualization and
+		exploration with as little effort as possible.</p>
+	<p>For instance, Eurostat data about the unemployment rate can be
+		downloaded from the web and visualized as shown in the following
+		figure:</p>
+
+	<p class="caption">Figure: An interactive chart in GPDE for
+		visualising Eurostat data in the DSPL</p>
+	<p align="center">
+		<img
+			alt="An interactive chart in GPDE for visualising Eurostat data in the DSPL"
+			src="./figures/Eurostat_GPDE_Example.png" width="1000px"></img>
+	</p>
+
+	<p>Benefits:</p>
+	<ul>
+		<li>If a standard Linked Data vocabulary is used, visualising and
+			exploring new data that already is represented using this vocabulary
+			can easily be done using GPDE.</li>
+		<li>Datasets can be first integrated using Linked Data technology
+			and then analysed using GPDE.</li>
+	</ul>
+	<p>Challenges of this use case are:</p>
+	<ul>
+		<li>There are different possible approaches, each having
+			advantages and disadvantages: 1) A consumer downloads the data
+			into a triple store; SPARQL queries on this data are then used to
+			transform the data into DSPL, which is uploaded and visualized
+			using GPDE. 2) Alternatively, one or more XSLT transformations on
+			the RDF/XML transform the data into DSPL.</li>
+		<li>The technical challenges for the consumer here lie in knowing
+			where to download what data and how to get it transformed into DSPL
+			without knowing the data.</li>
+	</ul>
+
+	<p>Unanticipated Uses (optional): DSPL is representative of the
+		formats required to use statistical data published on the web in
+		available analysis tools. Similar tools that may be automatically
+		covered are: Weka (ARFF data format), Tableau, SPSS, STATA, PC-Axis
+		etc.</p>
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+				should be criteria for well-formedness and assumptions consumers can
+				make about published data</a></li>
+	</ul>
+	</section> <section>
+	<h3 id="AnalysingpublishedstatisticaldatawithcommonOLAPsystems">Consumer
+		Use Case: Analysing published statistical data with common OLAP
+		systems</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case taken from <a
+			href="http://xbrl.us/research/appdev/Pages/275.aspx">Financial
+				Information Observation System (FIOS)</a>)
+		</span>
+	</p>
+
+	<p>Online Analytical Processing (OLAP) is an analysis method on
+		multidimensional data. It is an explorative analysis method that
+		allows users to interactively view the data from different angles
+		(rotate, select) or at different granularities (drill-down, roll-up),
+		and to filter it for specific information (slice, dice).</p>
+
+	<p>OLAP systems are commonly used in industry to analyse statistical
+		data on a regular basis. They first use ETL pipelines to extract,
+		transform and load relevant data into a data warehouse for efficient
+		storage and querying, and then provide interfaces to issue OLAP
+		queries on the data.</p>
+
+	<p>
+		The goal in this use case is to allow analysis of published
+		statistical data with common OLAP systems [<cite><a
+			href="#ref-OLAP4LD">OLAP4LD</a></cite>].
+	</p>
+
+	<p>For that, a multidimensional model of the data needs to be
+		generated. A multidimensional model consists of facts summarised in
+		data cubes. Facts exhibit measures depending on members of dimensions.
+		Members of dimensions can be further structured along hierarchies of
+		levels.</p>
+
+	<p>
+		An example scenario of this use case is the Financial Information
+		Observation System (FIOS) [<cite><a href="#ref-FIOS">FIOS</a></cite>],
+		where XBRL data provided by the SEC on the web is to be re-published
+		as Linked Data and made analysable for stakeholders in the web-based
+		OLAP client Saiku.
+	</p>
+
+	<p>The following figure shows an example of using FIOS. Here, for
+		three different companies, the cost of goods sold as disclosed in
+		XBRL documents is analysed. Cell values give either the number of
+		disclosures or, if only one is available, the actual amount in
+		USD:</p>
+
+
+	<p class="caption">Figure: Example of using FIOS for OLAP
+		operations on financial data</p>
+	<p align="center">
+		<img alt="Example of using FIOS for OLAP operations on financial data"
+			src="./figures/FIOS_example.PNG"></img>
+	</p>
+
+	<p>Benefits:</p>
+
+	<ul>
+		<li>OLAP operations cover typical business requirements, e.g.,
+			slice, dice, drill-down.</li>
+		<li>OLAP frontends are intuitive, interactive, explorative and
+			fast. Their interfaces are well known to many people in industry.</li>
+		<li>OLAP functionality is provided by many tools that may be
+			reused.</li>
+	</ul>
+
+	<p>Challenges:</p>
+	<ul>
+		<li>An ETL pipeline needs to automatically populate a data
+			warehouse. Common OLAP systems use relational databases with a star
+			schema.</li>
+		<li>A problem lies in the strict separation between queries for
+			the structure of data (metadata queries), and queries for actual
+			aggregated values (OLAP operations).</li>
+		<li>Another problem lies in defining Data Cubes without greater
+			insight into the data beforehand.</li>
+		<li>Depending on the expressivity of the OLAP queries (e.g.,
+			aggregation functions, hierarchies, ordering), performance plays an
+			important role.</li>
+	</ul>
+
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+				should be criteria for well-formedness and assumptions consumers can
+				make about published data</a></li>
+	</ul>
+	</section> <section>
+	<h3 id="Registeringpublishedstatisticaldataindatacatalogs">Registry
+		Use Case: Registering published statistical data in data catalogs</h3>
+	<p>
+		<span style="font-size: 10pt">(Use case motivated by <a
+			href="http://www.w3.org/TR/vocab-dcat/">Data Catalog vocabulary</a>)
+		</span>
+	</p>
+
+	<p>
+		After statistics have been published as Linked Data, the question
+		remains how to communicate the publication and let users discover the
+		statistics. There are catalogs to register datasets, e.g., CKAN, <a
+			href="http://www.datacite.org/">datacite.org</a>, <a
+			href="http://www.gesis.org/dara/en/home/?lang=en">da|ra</a>, and <a
+			href="http://pangaea.de/">PANGAEA</a>. Those catalogs require specific
+		configurations to register statistical data.
+	</p>
+
+	<p>The goal of this use case is to demonstrate how to expose and
+		distribute statistics after publication, for instance, to allow
+		automatic registration of statistical data in such catalogs so that
+		datasets can be found and evaluated. To achieve this, it should be
+		possible to transform the published statistical data into formats that
+		can be used by data catalogs.</p>
+
+	<p>
+		A concrete use case is the structured collection of <a
+			href="http://wiki.planet-data.eu/web/Datasets">RDF Data Cube
+			Vocabulary datasets</a> in the PlanetData Wiki. It is intended to list
+		published statistical data, to describe the formal RDF descriptions
+		at a higher level, and to provide a useful overview of RDF Data Cube
+		deployments in the Linked Data cloud.
+	</p>
+
+	<p>Unanticipated Uses: If data catalogs contain statistics, they do
+		not expose them as Linked Data but, for instance, as CSV or HTML
+		(e.g., PANGAEA). Publishing such data using the standard vocabulary
+		could also be a use case.</p>
+	<p>
+		Existing Work: The <a href="http://www.w3.org/TR/vocab-dcat/">Data
+			Catalog vocabulary</a> (DCAT) is strongly related to this use case since
+		it may complement the standard vocabulary for representing statistics
+		in the case of registering data in a data catalog.
+	</p>
+
+	<p>Requirements:</p>
+	<ul>
+		<li><a
+			href="#Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata">There
+				should be a recommended way to communicate the availability of
+				published statistical data to external parties and to allow
+				automatic discovery of statistical data</a></li>
+	</ul>
 	</section> </section>
 
 	<section>
@@ -261,260 +1137,278 @@
 
 	<p>The use cases presented in the previous section give rise to the
 		following requirements for a standard representation of statistics.
-		Requirements are cross-linked with the use cases that motivate them.
-		Requirements are similarly categorized as deriving from publishing or
-		consuming use cases.</p>
+		Requirements are cross-linked with the use cases that motivate them.</p>
 
-	<section>
-	<h3>Publishing requirements</h3>
 
 	<section>
-	<h4>Machine-readable and application-independent representation of
-		statistics</h4>
-	<p>It should be possible to add abstraction, multiple levels of
-		description, summaries of statistics.</p>
-
-	<p>Required by: UC1, UC2, UC3, UC4</p>
-	</section> <section>
-	<h4>Representing statistics from various resource</h4>
-	<p>Statistics from various resource data should be possible to be
-		translated into QB. QB should be very general and should be usable for
-		other data sets such as survey data, spreadsheets and OLAP data cubes.
-		What kind of statistics are described: simple CSV tables (UC 1), excel
-		(UC 2) and more complex SDMX (UC 3) data about government statistics
-		or other public-domain relevant data.</p>
-
-	<p>Required by: UC1, UC2, UC3</p>
-	</section> <section>
-	<h4>Communicating, exposing statistics on the web</h4>
-	<p>It should become clear how to make statistical data available on
-		the web, including how to expose it, and how to distribute it.</p>
-
-	<p>Required by: UC5</p>
-	</section> <section>
-	<h4>Coverage of typical statistics metadata</h4>
-	<p>It should be possible to add metainformation to statistics as
-		found in typical statistics or statistics catalogs.</p>
-
-	<p>Required by: UC1, UC2, UC3, UC4, UC5</p>
-	</section> <section>
-	<h4>Expressing hierarchies</h4>
-	<p>It should be possible to express hierarchies on Dimensions of
-		statistics. Some of this requirement is met by the work on ISO
-		Extension to SKOS [17].</p>
-
-	<p>Required by: UC3, UC9</p>
-	</section> <section>
-	<h4>Machine-readable and application-independent representation of
-		statistics</h4>
-	<p>It should be possible to add abstraction, multiple levels of
-		description, summaries of statistics.</p>
-
-	<p>Required by: UC1, UC2, UC3, UC4</p>
-	</section> <section>
-	<h4>Expressing aggregation relationships in Data Cube</h4>
-	<p>Based on [18]: It often comes up in statistical data that you
-		have some kind of 'overall' figure, which is then broken down into
-		parts. To Supposing I have a set of population observations, expressed
-		with the Data Cube vocabulary - something like (in pseudo-turtle):</p>
-	<pre>
-ex:obs1
-  sdmx:refArea <UK>;
-  sdmx:refPeriod "2011";
-  ex:population "60" .
-
-ex:obs2
-  sdmx:refArea <England>;
-  sdmx:refPeriod "2011";
-  ex:population "50" .
-
-ex:obs3
-  sdmx:refArea <Scotland>;
-  sdmx:refPeriod "2011";
-  ex:population "5" .
-
-ex:obs4
-  sdmx:refArea <Wales>;
-  sdmx:refPeriod "2011";
-  ex:population "3" .
-
-ex:obs5
-  sdmx:refArea <NorthernIreland>;
-  sdmx:refPeriod "2011";
-  ex:population "2" .
-  	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	
-	</pre>
-	<p>What is the best way (in the context of the RDF/Data Cube/SDMX
-		approach) to express that the values for the England/Scotland/Wales/
-		Northern Ireland ought to add up to the value for the UK and
-		constitute a more detailed breakdown of the overall UK figure? I might
-		also have population figures for France, Germany, EU27, etc...so it's
-		not as simple as just taking a qb:Slice where you fix the time period
-		and the measure.</p>
-	<p>Some of this requirement is met by the work on ISO Extension to
-		SKOS [19].</p>
-
-
-	<p>Required by: UC1, UC2, UC3, UC9</p>
-	</section> <section>
-	<h4>Scale - how to publish large amounts of statistical data</h4>
-	<p>Publishers that are restricted by the size of the statistics
-		they publish, shall have possibilities to reduce the size or remove
-		redundant information. Scalability issues can both arise with
-		peoples's effort and performance of applications.</p>
-
-	<p>Required by: UC1, UC2, UC3, UC4</p>
-	</section> <section>
-	<h4>Compliance-levels or criteria for well-formedness</h4>
-	<p>The formal RDF Data Cube vocabulary expresses few formal
-		semantic constraints. Furthermore, in RDF then omission of
-		otherwise-expected properties on resources does not lead to any formal
-		inconsistencies. However, to build reliable software to process Data
-		Cubes then data consumers need to know what assumptions they can make
-		about a dataset purporting to be a Data Cube.</p>
-	<p>What *well-formedness* criteria should Data Cube publishers
-		conform to? Specific areas which may need explicit clarification in
-		the well-formedness criteria include (but may not be limited to):</p>
+	<h3 id="VocabularyshouldbuildupontheSDMXinformationmodel">Vocabulary
+		should build upon the SDMX information model</h3>
+	<p>
+		The draft version of the vocabulary builds upon <a
+			href="http://sdmx.org/?page_id=16">SDMX Standards Version 2.0</a>. A
+		newer version of SDMX, <a href="http://sdmx.org/?p=899">SDMX
+			Standards, Version 2.1</a>, is available.
+	</p>
+	<p>The requirement is to build upon at least Version 2.0; if
+		specific use cases derived from Version 2.1 become available, the
+		working group may consider building upon Version 2.1.</p>
+	<p>Background information:</p>
 	<ul>
-		<li>use of abbreviated data layout based on attachment levels</li>
-		<li>use of qb:Slice when (completeness, requirements for an
-			explicit qb:SliceKey?)</li>
-		<li>avoiding mixing two approaches to handling multiple-measures
-		</li>
-		<li>optional triples (e.g. type triples)</li>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/37">http://www.w3.org/2011/gld/track/issues/37</a></li>
 	</ul>
 
-	<p>Required by all use cases.</p>
-	</section> <section>
-	<h4>Declaring relations between Cubes</h4>
-	<p>In some situations statistical data sets are used to derive
-		further datasets. Should Data Cube be able to explicitly convey these
-		relationships?</p>
-	<p>Note that there has been some work towards this within the SDMX
-		community as indicated here:
-		http://groups.google.com/group/publishing-statistical-data/msg/b3fd023d8c33561d</p>
-
-	<p>Required by: UC6</p>
-	</section> </section> <section>
-	<h3>Consumption requirements</h3>
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#SDMXWebDisseminationUseCase">SDMX Web
+				Dissemination Use Case</a></li>
+		<li><a
+			href="#UKgovernmentfinancialdatafromCombinedOnlineInformationSystem">Publisher
+				Use Case: UK government financial data from Combined Online
+				Information System (COINS)</a></li>
+		<li><a href="#EurostatSDMXasLinkedData">Publisher Use Case:
+				Eurostat SDMX as Linked Data</a></li>
+	</ul>
 
-	<section>
-	<h4>Finding statistical data</h4>
-	<p>Finding statistical data should be possible, perhaps through an
-		authoritative service</p>
-
-	<p>Required by: UC5</p>
-	</section> <section>
-	<h4>Retrival of fine grained statistics</h4>
-	<p>Query formulation and execution mechanisms. It should be
-		possible to use SPARQL to query for fine grained statistics.</p>
-
-	<p>Required by: UC1, UC2, UC3, UC4, UC5, UC6, UC7</p>
-	</section> <section>
-	<h4>Understanding - End user consumption of statistical data</h4>
-	<p>Must allow presentation, visualization .</p>
-
-	<p>Required by: UC7, UC8, UC9, UC10</p>
 	</section> <section>
-	<h4>Comparing and trusting statistics</h4>
-	<p>Must allow finding what's in common in the statistics of two or
-		more datasets. This requirement also deals with information quality -
-		assessing statistical datasets - and trust - making trust judgements
-		on statistical data.</p>
+	<h3 id="Vocabularyshouldclarifytheuseofsubsetsofobservations">Vocabulary
+		should clarify the use of subsets of observations</h3>
+	<p>There should be a consensus on the issue of flattening or
+		abbreviating data; one suggestion is to author data without the
+		duplication, but have the data publication tools "flatten" the compact
+		representation into standalone observations during the publication
+		process.</p>
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/33">http://www.w3.org/2011/gld/track/issues/33</a></li>
 
-	<p>Required by: UC5, UC6, UC9</p>
-	</section> <section>
-	<h4>Integration of statistics</h4>
-	<p>Interoperability - combining statistics produced by multiple
-		different systems. It should be possible to combine two statistics
-		that contain related data, and possibly were published independently.
-		It should be possible to implement value conversions.</p>
+		<li>Since there are no use cases for qb:subSlice, the vocabulary
+			should clarify or drop the use of qb:subSlice; issue: <a
+			href="http://www.w3.org/2011/gld/track/issues/34">http://www.w3.org/2011/gld/track/issues/34</a>
+		</li>
+	</ul>
 
-	<p>Required by: UC1, UC3, UC4, UC7, UC9, UC10</p>
+	<p>Required by:</p>
+	<ul>
+		<li><a
+			href="#UKgovernmentfinancialdatafromCombinedOnlineInformationSystem">Publisher
+				Use Case: UK government financial data from Combined Online
+				Information System (COINS)</a></li>
+		<li><a href="#PublishingslicesofdataaboutUKBathingWaterQuality">Publisher
+				Use Case: Publishing slices of data about UK Bathing Water Quality</a></li>
+	</ul>
+
 	</section> <section>
-	<h4>Scale - how to consume large amounts of statistical data</h4>
-	<p>Consumers that want to access large amounts of statistical data
-		need guidance.</p>
+	<h3
+		id="Vocabularyshouldrecommendamechanismtosupporthierarchicalcodelists">Vocabulary
+		should recommend a mechanism to support hierarchical code lists</h3>
+	<p>First, hierarchical code lists may be supported via SKOS,
+		allowing for cross-location and cross-time analysis of statistical
+		datasets.</p>
+	<p>Second, one can think of non-SKOS hierarchical code lists, e.g.,
+		if simple skos:narrower/skos:broader relationships are not sufficient
+		or if a vocabulary uses specific hierarchical properties such as
+		geo:containedIn.</p>
+	<p>Also, the use of hierarchy levels needs to be clarified. It has
+		been suggested to allow skos:Collections as values of qb:codeList.</p>
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/31">http://www.w3.org/2011/gld/track/issues/31</a></li>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/39">http://www.w3.org/2011/gld/track/issues/39</a>
+		</li>
+	</ul>
 
-	<p>Required by: UC7, UC9</p>
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#PublishingExcelSpreadsheetsasLinkedData">Publisher
+				Use Case: Publishing Excel Spreadsheets as Linked Data</a></li>
+	</ul>
+
 	</section> <section>
-	<h4>Common internal representation of statistics, to be exported
-		in other formats</h4>
-	<p>QB data should be possible to be transformed into data formats
-		such as XBRL which are required by certain institutions.</p>
+	<h3
+		id="VocabularyshoulddefinerelationshiptoISO19156ObservationsMeasurements">Vocabulary
+		should define relationship to ISO19156 - Observations &amp; Measurements</h3>
+	<p>A number of organizations, particularly in the climate and
+		meteorological area, already have some commitment to the OGC
+		"Observations and Measurements" (O&amp;M) logical data model, also
+		published as ISO 19156. Are there any statements about compatibility
+		and interoperability between O&amp;M and Data Cube that can be made to
+		give guidance to such organizations?</p>
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/32">http://www.w3.org/2011/gld/track/issues/32</a></li>
+	</ul>
 
-	<p>Required by: UC10</p>
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#PublishingslicesofdataaboutUKBathingWaterQuality">Publisher
+				Use Case: Publishing slices of data about UK Bathing Water Quality</a></li>
+	</ul>
+
 	</section> <section>
-	<h4>Dealing with imperfect statistics</h4>
-	<p>Imperfections - reasoning about statistical data that is not
-		complete or correct.</p>
+	<h3
+		id="Thereshouldbearecommendedmechanismtoallowforpublicationofaggregateswhichcrossmultipledimensions">There
+		should be a recommended mechanism to allow for publication of
+		aggregates which cross multiple dimensions</h3>
 
-	<p>Required by: UC7, UC8, UC9, UC10</p>
-	</section> </section> </section>
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/31">http://www.w3.org/2011/gld/track/issues/31</a></li>
+	</ul>
+
+	<p>Required by:</p>
+	<ul>
+		<li>E.g., the Eurostat SDMX as Linked Data use case suggests having
+			timelines on data aggregated over the gender dimension: <a
+			href="#EurostatSDMXasLinkedData">Publisher Use Case: Eurostat
+				SDMX as Linked Data</a>
+		</li>
+		<li>Another possible use case could be provided by the <a
+			href="http://data.gov.uk/resources/payments">Payment Ontology</a>.
+		</li>
+	</ul>
+
+	</section> <section>
+	<h3 id="Thereshouldbearecommendedwayofdeclaringrelationsbetweencubes">There
+		should be a recommended way of declaring relations between cubes</h3>
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/30">http://www.w3.org/2011/gld/track/issues/30</a></li>
+	</ul>
+
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#Representingrelationshipsbetweenstatisticaldata">Publisher
+				Use Case: Representing relationships between statistical data</a></li>
+	</ul>
+
+	</section> <section>
+	<h3
+		id="Thereshouldbecriteriaforwell-formednessandassumptionsconsumerscanmakeaboutpublisheddata">There
+		should be criteria for well-formedness and assumptions consumers can
+		make about published data</h3>
+
+	<p>Background information:</p>
+	<ul>
+		<li>Issue: <a href="http://www.w3.org/2011/gld/track/issues/29">http://www.w3.org/2011/gld/track/issues/29</a></li>
+	</ul>
+
+	<p>Required by:</p>
+	<ul>
+		<li><a
+			href="#Simplechartvisualisationsofpublishedstatisticaldata">Consumer
+				Use Case: Simple chart visualisations of (integrated) published
+				statistical data</a></li>
+		<li><a
+			href="#VisualisingpublishedstatisticaldatainGooglePublicDataExplorer">Consumer
+				Use Case: Visualising published statistical data in Google Public
+				Data Explorer</a></li>
+		<li><a
+			href="#AnalysingpublishedstatisticaldatawithcommonOLAPsystems">Consumer
+				Use Case: Analysing published statistical data with common OLAP
+				systems</a></li>
+	</ul>
+
+	</section> <section>
+	<h3 id="Thereshouldbemechanismsandrecommendationsregardingpublicationandconsumptionoflargeamountsofstatisticaldata">There
+		should be mechanisms and recommendations regarding publication and
+		consumption of large amounts of statistical data</h3>
+	<p>Background information:</p>
+	<ul>
+		<li>Related issue regarding abbreviations <a
+			href="http://www.w3.org/2011/gld/track/issues/29">http://www.w3.org/2011/gld/track/issues/29</a>
+		</li>
+	</ul>
+
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#EurostatSDMXasLinkedData">Publisher Use Case:
+				Eurostat SDMX as Linked Data</a></li>
+	</ul>
+
+	</section> <section>
+	<h3
+		id="Thereshouldbearecommendedwaytocommunicatetheavailabilityofpublishedstatisticaldatatoexternalpartiesandtoallowautomaticdiscoveryofstatisticaldata">There
+		should be a recommended way to communicate the availability of
+		published statistical data to external parties and to allow automatic
+		discovery of statistical data</h3>
+	<p>Clarify the relationship between DCAT and QB.</p>
+	<p>Background information:</p>
+	<ul>
+		<li>None.</li>
+	</ul>
+
+	<p>Required by:</p>
+	<ul>
+		<li><a href="#SDMXWebDisseminationUseCase">SDMX Web
+				Dissemination Use Case</a></li>
+		<li><a href="#Registeringpublishedstatisticaldataindatacatalogs">Registry
+				Use Case: Registering published statistical data in data catalogs</a></li>
+	</ul>
+
+	</section> </section>
 	<section class="appendix">
 	<h2 id="acknowledgements">Acknowledgements</h2>
-	<p>The editors are very thankful for comments and suggestions ...</p>
+	<p>We thank Rinke Hoekstra, Dave Reynolds, Bernadette Hyland,
+		Biplav Srivastava, John Erickson, Villaz&oacute;n-Terrazas for
+		feedback and input.</p>
 	</section>
 
 	<h2 id="references">References</h2>
 
 	<dl>
-		<dt id="ref-SDMX">[SMDX]</dt>
+
+		<dt id="ref-cog">[COG]</dt>
 		<dd>
-			SMDX - SDMX User Guide Version 2009.1, <a
-				href="http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf">http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf</a>,
-			last visited Jan 8 2013.
+			SDMX Content Oriented Guidelines, <a
+				href="http://sdmx.org/?page_id=11">http://sdmx.org/?page_id=11</a>
 		</dd>
 
-		<dt id="ref-SDMX-21">[SMDX 2.1]</dt>
+		<dt id="ref-COGS">[COGS]</dt>
 		<dd>
-			SDMX 2.1 User Guide Version. Version 0.1 - 19/09/2012. <a
-				href="http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf">http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf</a>.
-			Last visited on 8 Jan 2013.
+			Freitas, A., Kämpgen, B., Oliveira, J. G., O’Riain, S., & Curry, E.
+			(2012). Representing Interoperable Provenance Descriptions for ETL
+			Workflows. ESWC 2012 Workshop Highlights (pp. 1–15). Springer Verlag,
+			2012 (in press). (Extended Paper published in Conf. Proceedings.). <a
+				href="http://andrefreitas.org/papers/preprint_provenance_ETL_workflow_eswc_highlights.pdf">http://andrefreitas.org/papers/preprint_provenance_ETL_workflow_eswc_highlights.pdf</a>.
 		</dd>
 
+		<dt id="ref-COINS">[COINS]</dt>
+		<dd>
+			Ian Dickinson et al., COINS as Linked Data <a
+				href="http://data.gov.uk/resources/coins">http://data.gov.uk/resources/coins</a>,
+			Last visited on Jan 9 2013
+		</dd>
+
+		<dt id="ref-FIOS">[FIOS]</dt>
+		<dd>
+			Andreas Harth, Sean O'Riain, Benedikt Kämpgen. Submission XBRL
+			Challenge 2011. <a
+				href="http://xbrl.us/research/appdev/Pages/275.aspx">http://xbrl.us/research/appdev/Pages/275.aspx</a>.
+		</dd>
+
+
 		<dt id="ref-Fowler1997">[Fowler1997]</dt>
 		<dd>Fowler, Martin (1997). Analysis Patterns: Reusable Object
 			Models. Addison-Wesley. ISBN 0201895420.</dd>
 
+
+		<dt id="ref-linked-data">[LOD]</dt>
+		<dd>
+			Linked Data, <a href="http://linkeddata.org/">http://linkeddata.org/</a>
+		</dd>
+
+		<dt id="ref-OLAP">[OLAP]</dt>
+		<dd>
+			Online Analytical Processing Data Cubes, <a
+				href="http://en.wikipedia.org/wiki/OLAP_cube">http://en.wikipedia.org/wiki/OLAP_cube</a>
+		</dd>
+
+		<dt id="ref-OLAP4LD">[OLAP4LD]</dt>
+		<dd>
+			Kämpgen, B. and Harth, A. (2011). Transforming Statistical Linked
+			Data for Use in OLAP Systems. I-Semantics 2011. <a
+				href="http://www.aifb.kit.edu/web/Inproceedings3211">http://www.aifb.kit.edu/web/Inproceedings3211</a>
+		</dd>
+
 		<dt id="ref-QB">[QB-2010]</dt>
 		<dd>
 			RDF Data Cube vocabulary, <a
@@ -527,15 +1421,11 @@
 				href="http://www.w3.org/TR/vocab-data-cube/">http://www.w3.org/TR/vocab-data-cube/</a>
 		</dd>
 
-		<dt id="ref-OLAP">[OLAP]</dt>
+		<dt id="ref-QB4OLAP">[QB4OLAP]</dt>
 		<dd>
-			Online Analytical Processing Data Cubes, <a
-				href="http://en.wikipedia.org/wiki/OLAP_cube">http://en.wikipedia.org/wiki/OLAP_cube</a>
-		</dd>
-
-		<dt id="ref-linked-data">[LOD]</dt>
-		<dd>
-			Linked Data, <a href="http://linkeddata.org/">http://linkeddata.org/</a>
+			Etcheverry, Vaisman. QB4OLAP: A New Vocabulary for OLAP Cubes on
+			the Semantic Web. <a
+				href="http://publishing-multidimensional-data.googlecode.com/git/index.html">http://publishing-multidimensional-data.googlecode.com/git/index.html</a>
 		</dd>
 
 		<dt id="ref-rdf">[RDF]</dt>
@@ -557,10 +1447,18 @@
 				href="http://www.w3.org/2004/02/skos/">http://www.w3.org/2004/02/skos/</a>
 		</dd>
 
-		<dt id="ref-cog">[COG]</dt>
+		<dt id="ref-SDMX">[SDMX]</dt>
 		<dd>
-			SDMX Content Oriented Guidelines, <a
-				href="http://sdmx.org/?page_id=11">http://sdmx.org/?page_id=11</a>
+			SDMX User Guide Version 2009.1, <a
+				href="http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf">http://sdmx.org/wp-content/uploads/2009/02/sdmx-userguide-version2009-1-71.pdf</a>,
+			Last visited Jan 8 2013.
+		</dd>
+
+		<dt id="ref-SDMX-21">[SDMX 2.1]</dt>
+		<dd>
+			SDMX 2.1 User Guide. Version 0.1 - 19/09/2012. <a
+				href="http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf">http://sdmx.org/wp-content/uploads/2012/11/SDMX_2-1_User_Guide_draft_0-1.pdf</a>.
+			Last visited on 8 Jan 2013.
 		</dd>
 
 	</dl>
author	bkaempge
	Wed, 27 Feb 2013 19:19:36 +0100
changeset 289	34d03b6b4249
parent 288	9901db54f738
child 290	cfea00750fcb