Clean up, simplification of BP doc
author bhyland
Wed, 21 Mar 2012 22:59:09 -0400
changeset 130 77ca46544091
parent 129 d14c994f4d5c
child 132 8ccd1b1144c7
bp/index.html
--- a/bp/index.html	Wed Mar 21 21:42:50 2012 -0400
+++ b/bp/index.html	Wed Mar 21 22:59:09 2012 -0400
@@ -26,67 +26,52 @@
 </section>
 
 
+<!--    INTRODUCTION    -->
+
 <section class="introductory">
-<h2>Scope</h2>
+<h2>Purpose of the Document</h2>
 
 <p>
-This document is aimed at assisting government IT managers, procurement officers, Web developers, vendors, and researchers who are interested in publishing open government data using W3C standards.  The benefit of using international standards for data exchange is significantly increased interoperability of data.
+This document sets out a series of best practices designed to facilitate development and delivery of Linked Open Data. The recommendations are offered to creators, maintainers and operators of Web sites publishing government data in both human and machine readable data formats.
 </p>
-<p>
-Readers of this document are expected to be familiar with delivery of content via the Web, and to have a general familiarity with the technologies involved, but are not required to have a background in semantic technologies or previous experience with Linked Data. Data stewards, curators, database administrators and other personnel involved in Open Government initiatives are encouraged to read this Linked Open Data Best Practices document. 
-</section>
 
-<!--    INTRODUCTION    -->
-<section>
-<h2>Introduction</h2>
+<h2>Audience</h2>
+<p>
+Readers of this document are expected to be familiar with the creation of Web applications, and to have a general familiarity with the technologies involved, but are not expected to have a background in Linked Data technologies or previous experience with publishing data as Linked Open Data on the Web.</p>
 
-<section>
-<h3>Overview - Bernadette</h3>
 <p>
-Many governments have mandated publication of open government data to the public via the Web. The intention of these mandates is to facilitate the maintenance of open societies and support governmental accountability and transparency initiatives. However, publication of unstructured data on the World Wide Web is in itself insufficient; to realize the goals of efficiency, transparency and accountability, published data must be reusable: members of the public must be able to find it readily via search, visualize it and process it programmatically.
+The document is not targeted solely at developers; others, such as government procurement officers, website administrators, and tool developers, are encouraged to read it.</p>
+
+<h2>Scope</h2>
+
+<p>This document aims to ease the adoption of Linked Open Data by providing an intuitive explanation of what is involved in publishing open government data on the Web.  <a href="http://www.w3.org/DesignIssues/LinkedData.html" title="Linked Data - Design Issues">Linked Data</a> addresses many objectives of open government transparency initiatives through the use of international Web standards for the publication, dissemination and reuse of structured data.
 </p>
 
 <p>
-This compilation of best practices for Linked Open Data produced by the W3C Government Linked Data Working Group is intended to help data curators and publishers better understand how to best use their time and resources to achieve the goals of Open Government. Linked Data principles address many of the data description and data format requirements for realizing the goals of Open Government. Linked Data uses a family of international standards and best practices for the publication, dissemination and reuse of structured data. Linked Data, unlike previous data formatting and publication approaches, provides a simple mechanism for combining data from multiple sources across the Web. 
+Linked Data uses a family of international standards and best practices for the publication, dissemination and reuse of structured data. Linked Data, unlike previous data formatting and publication approaches, provides a simple mechanism for combining data from multiple sources across the Web. 
 </p>
 
-<p>
-<a href="http://www.w3.org/DesignIssues/LinkedData.html" title="Linked Data - Design Issues">Linked Data</a> addresses key requirements of open government by providing a family of international standards for the publication, dissemination and reuse of structured data.
-</p>
-</section>
-
-<section>
-<h3>Scope - Bernadette</h3>
-
+<h2>Government Motivation for Publishing Linked Open Data</h2>
 <p>
-The approach in writing this document has been to collate and present the most relevant engineering practices prevalent in the Linked Data development community today and identify those that:
-<ul>
-<li> Facilitate the exploitation of Linked Data to enable better search, access and re-use of open government information;</li>
-<li> Are considered harmful and can have non-obvious detrimental effects on the overall quality of data publishing on the Web.</li>
-</ul>
-The goal of this best practices document is not to endorse specific technologies; rather, this document focuses on key considerations and guidance for success. However, there are a number of cases where explicitly omitting a Best Practice that references an emerging technology on the grounds that it was too recent to have received wide adoption would have unnecessarily excluded a valuable recommendation. As such, some Best Practices have been included on the grounds that the Working Group believes that they will soon become fully qualified Best Practices (e.g. in prevalent use within the development community).
-</p>
-
-<p>
-Finally, in publishing Linked Open Data, it is not necessary to implement everything described herein. Instead, each Best Practice should be considered as a possible measure that may be implemented towards the goal of providing as rich and dynamic an experience as possible via a Web browser and Linked Data client. 
+Many governments have mandated publication of open government data on the public Web. The intention of these mandates is to facilitate the maintenance of open societies and support governmental accountability and transparency initiatives. However, publication of unstructured data on the World Wide Web is in itself insufficient; to realize the goals of efficiency, transparency and accountability, published data must be reusable: members of the public must be able to find it readily via search, visualize it and process it programmatically.
 </p>
 
 </section>
 
+
 <section>
 <h3>Motivation</h3>
 The best practices provided here offer a methodical approach for the creation, publication and dissemination of government Linked Data, including:
 <ul>
-	<li>Description of the full life cycle of a Government Linked Data project, starting with identification of suitable data sets, procurement, modeling, vocabulary selection, through publication and ongoing maintenance.</li>
-	<li>Definition of known, proven steps to create and maintain government data sets using Linked Data principles.</li>
-	<li>Guidance in explaining the value proposition for LOD to stakeholders, managers and executives.</li>
-	<li>Assist the Working Group in later stages of the Standards Process, in order to solicit feedback, use cases, etc.</li>
+	<li> Description of the life cycle of a Linked Data project, starting with identification of suitable data sets, modeling, vocabulary selection, through publication and ongoing maintenance.</li>
+	<li> Definition of proven steps to create and maintain government data sets using Linked Data principles.</li>
+	<li> Guidance on the procurement process for publishing Linked Open Data.</li>
 </ul>
+
 <p class='issue'>Does it make sense to base the GLD life cycle on one of the general LD life cycles? See <a href="https://www.w3.org/2011/gld/track/issues/15">ISSUE-15</a></p>
 </section>
 
-<p class='issue'>Michael suggests to include the three available GLD life cycles we have</p>
-
+<section>
 <p>Currently we have identified the following Government Linked Data Life Cycles
 </p>
 <p class="todo"> Include a brief description for each one of them.
@@ -100,18 +85,13 @@
 <p>Villazon-terrazas</p>
 <img src="img/GLF_Villazon-terrazas.PNG" width="600" />
 
-<section>
-<h3>Brief History of Open Government Linked Data - Bernadette</h3>
-
-</section>
-
 </section>
 <!--    PROCUREMENT   -->
 <section>
-<h3>Procurement - Bernadette</h3>
-<p class='responsible'>George Thomas (Health & Human Services, US), Mike Pendleton (Environmental Protection Agency, US), John Sheridan (OPSI, UK)</p>
+<h3>Procurement</h3>
+<p class='responsible'>Mike Pendleton (Environmental Protection Agency, USA)</p>
 <p>
-Specific products and services involved in governments publishing linked data will be defined, suitable for use during government procurement. Just as the <a href="http://www.w3.org/WAI/intro/wcag" title="WCAG Overview">Web Content Accessibility Guidelines</a> allow governments to easily specify what they mean when they contract for an accessible Website, these definitions will simplify contracting for data sites and applications.
+Just as the <a href="http://www.w3.org/WAI/intro/wcag" title="WCAG Overview">Web Content Accessibility Guidelines</a> allow governments to easily specify what they mean when they contract for an accessible Website, the definitions in this section will simplify contracting for data sites and applications.
 </p>
 
 <p>
@@ -120,27 +100,18 @@
 
 <h4>Overview</h4>
 <p>
-Recent Open Government initiatives call for more and better access to government data. To meet expanding consumer needs, many governments are now looking to go beyond traditional provisioning formats (e.g. CSV, XML), and are beginning to provision data using Linked Open Data (LOD) approaches.
-</p>
-
-<p>
-In contrast to provisioning data on the Web, LOD provisions data into the Web so it can be interlinked with other linked data, making it easier to discover, and more useful and reusable. LOD leverages World Wide Web standards such as Hypertext Transfer Protocol (HTTP), Resource Description Framework (RDF), and Uniform Resource Identifiers (URIs), which make data self-describing so that it is both human and machine readable. Self-describing data is important because most government data comes from relational data systems that do not fully describe the source data schema needed for application development by third parties.
-</p>
-
-<p>
-While LOD is a relatively new approach to data provisioning, growth has been exponential. LOD has been adopted by other national governments including the UK, Sweden, Germany, France, Spain, New Zealand and Australia.
+LOD provisions data into the Web so it can be interlinked with other linked data, making it easier to discover, and more useful and reusable. LOD leverages World Wide Web standards such as Hypertext Transfer Protocol (HTTP), Resource Description Framework (RDF), and Uniform Resource Identifiers (URIs), which make data self-describing so that it is both human and machine readable. Self-describing data is important because most government data comes from relational data systems that do not fully describe the source data schema needed for application development by third parties.
 </p>
 
 <h5>LOD Production through Consumption Lifecycle</h5>
 
 <p>
-The following categorizes activities associated with LOD development and maintenance, and identifies products and services and associated with these activities:
+The following categorizes general activities associated with LOD development and maintenance:
 </p>
 
 <ol type="1">
 	<li>LOD Preparation</li>
-	<p>Products :</p>
-	<p>Services : Services that support modeling relational or other data sources using URIs, developing scripts used to generate/create linked open data. Overlap exists between LOD preparation and publishing.</p>
+	<p>Services: Services that support modeling relational or other data sources using URIs and developing scripts used to generate linked open data.</p>
 	<li>LOD Publishing</li>
 	<p>Products: RDF database (a.k.a. triple store) enables hosting of linked data</p>
 	<p>Services: These are services that support creation, interlinking and deployment of linked data (see also linked data preparation). Hosting data via a triple store is a key aspect of publishing. LD publishing may include implementing a PURL strategy. During preparation for publishing linked data, data and publishing infrastructure may be tested and debugged to ensure it adheres to linked data principles and best practices. (Source: Linked Data: Evolving the Web into a Global Data Space, Heath and Bizer, Morgan and Claypool, 2011, Section 5.4, p. 53)</p>
@@ -210,36 +181,12 @@
 
 <p>As such, opportunities may exist to streamline the development of a security plan, or conversely, to identify potential project security vulnerabilities and risks, through early engagement with hosting providers, software vendors and others who may be responsible for those common, inherited controls. If the inherited controls meet the recommendations, they can then be assembled following the requisite templates, and the system security plan can be completed through addition of any applicable controls specific or unique to the linked data application's configuration, implementation, processes or other elements described in the security control and security plan guidance.</p>
 
-<h4>Glossary</h4>
-<ul>
-<li>
-Linked Open Data: A pattern for hyper-linking machine-readable data sets to each other using Semantic Web techniques, especially via the use of RDF and URIs. Enables distributed SPARQL queries of the data sets and a “browsing” or “discovery” approach to finding information (as compared to a search strategy). (Source: Linking Enterprise Data, David Wood, Springer, 2010, p. 286)
-</li>
-<li>
-Linked Open Data Cloud: Linked Open Data that has been published is depicted in a LOD cloud diagram. The diagram shows connections between linked data sets and color codes them based on data type (e.g., government, media, life sciences, etc.). The diagram can be viewed at: http://richard.cyganiak.de/2007/10/lod/
-</li>
-<li>
-RDF (Resource Description Framework): A language for representing information about resources in the World Wide Web. RDF is based on the idea of identifying things using Web identifiers (called Uniform Resource Identifiers, or URIs), and describing resources in terms of simple properties and property values. This enables RDF to represent simple statements about resources as a graph of nodes and arcs representing the resources, and their properties and values. (http://www.w3.org/TR/rdf-primer/)
-</li>
-<li>
-Semantic Technologies: The broad set of technologies that related to the extraction, representation, storage, retrieval and analysis of machine-readable information. The Semantic Web standards are a subset of semantic technologies and techniques. (Source: Linking Enterprise Data, David Wood, Springer, 2010, p. 286) Semantic Web: An evolution or part of the World Wide Web that consists of machine-readable data in RDF and an ability to query that information in standard ways (e.g. via SPARQL)
-</li>
-<li>
-Semantic Web Standards: Standards of the World Wide Web Consortium (W3C) relating to the Semantic Web, including RDF, RDFa, SKOS and OWL. (Source: Linking Enterprise Data, David Wood, Springer, 2010, p. 287)
-</li>
-<li>
-SPARQL: SPARQL Protocol and RDF Query Language (SPARQL) defines a standard query language and data access protocol for use with the Resource Description Framework (RDF) data model. (http://msdn.microsoft.com/en-us/library/aa303673.aspx) Just as SQL is used to query relational data, SPARQL is used to query graph, or linked, data.
-</li>
-<li>
-Uniform Resource Identifiers (URIs): URI’s play a key role in enabling linked data. To publish data on the Web, the items in a domain of interest must first be identified. These are the things whose properties and relationships will be described in the data, and may include Web documents as well as real-world entities and abstract concepts. As Linked Data builds directly on Web architecture [67], the Web architecture term resource is used to refer to these things of interest, which are, in turn, identified by HTTP URIs. Wide Web Consortium’s Government Linked Data (W3C/GLD) workgroup: http://www.w3.org/2011/gld/charter
-</li>
-</ul>
 </section>
 
 
 <!--    << VOCABULARY SELECTION   -->
 <section>
-<h3>Vocabulary Selection -  	Boris</h3>
+<h3>Vocabulary Selection</h3>
 <p class='responsible'>Michael Hausenblas (DERI), Ghislain Atemezing (INSTITUT TELECOM), Boris Villazon-Terrazas (UPM),  Daniel Vila-Suero (UPM), George Thomas (Health & Human Services, US), John Erickson (RPI), Biplav Srivastava (IBM)</p>
 <p>
 Modeling is an important phase in any Government Linked Data life cycle. Within this phase, governments need to build a vocabulary that models the data sources they want to publish as Linked Data. The most important recommendation in this context is to reuse available vocabularies as much as possible. This reuse-based approach speeds up vocabulary development, saving governments time, effort and resources. However, it raises two main questions: (1) where and how do I discover available vocabularies, and (2) how do I select the vocabulary that best fits my needs? Moreover, there may be cases in which governments need to mint their own vocabulary terms, which raises a third question: (3) how do I mint my own vocabulary terms? This section answers these questions by means of a checklist for each.
@@ -370,9 +317,12 @@
 <p class="highlight"><b>Vocabulary should be published following available best practices</b><br/>
 	<i>What it means:</i> One of the goals is to contribute to the community by sharing the new vocabulary. To this end, it is recommended to follow available recipes for publishing RDF vocabularies, e.g., <a href="http://www.w3.org/TR/swbp-vocab-pub/" target="_blank">Best Practice Recipes for Publishing RDF Vocabularies</a>.	
 </p>
-</section> <!-- Vocabulary management/creation >> -->
+</section> <!-- Vocabulary management/creation -->
 
 <section> <!-- << Multilingualism in vocabs -->
+
+<!-- TODO add references to Felix Sasaki's work on multilingual Web and new W3C WG -->
+
 	<h4>Multilingualism in vocabs</h4>
 <p>
 This section provides some considerations when we are dealing with multilingualism in vocabularies. We have identified that multilingualism in vocabularies can be found nowadays in the following formats:
@@ -510,7 +460,8 @@
 
 <section> 
 <h4>URI Persistence</h4>
-<p>@@[email protected]@ Expand this section (Bernadette)</p>
+<p class='responsible'>Bernadette Hyland (3 Round Stones), John Erickson (RPI)</p>
+
 <p><i>Advice, info related to persistent URIs</i></p>
 <p>As is the case with many human interactions, confidence in interactions via the Web depends on stability and predictability. For an information resource, persistence depends on the consistency of representations. The representation provider decides when representations are sufficiently consistent (although that determination generally takes user expectations into account).</p>
 <p>
@@ -559,7 +510,7 @@
 
 <!--    VERSIONING   -->
 <section>
-<h3>Versioning - Boris</h3>
+<h3>Versioning</h3>
 <p class='responsible'>John Erickson (RPI), Ghislain Atemezing (INSTITUT TELECOM), Hadley Beeman (LinkedGov)</p>
 <p>
 This section specifies how to publish data which has multiple versions, including variations such as:
@@ -606,7 +557,7 @@
 
 <!--  << STABILITY   -->
 <section>
-<h3>Stability - Boris</h3>
+<h3>Stability</h3>
 <p class='responsible'>Anne Washington (GMU), Ron Reck</p>
 
 <section> <!-- << STABILITY.overview -->
@@ -779,7 +730,7 @@
 
 <!--    COOKBOOK   -->
 <section>
-<h3>Cookbook - Bernadette</h3>
+<h3>Linked Open Data Cookbook</h3>
 <p class='responsible'>Bernadette Hyland (3 Round Stones)</p>
 <p>
 See <a href="http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook">Cookbook for Open Government Linked Data</a>.
@@ -834,52 +785,9 @@
 </ul>
 <p>Possible organization of use cases (Adapted from <a href="http://bit.ly/wlKOEF" target="_blank">Trust and Linked Data</a>):</p>
 
-<ul>
-	<li>Simple "Oh Yeah?" scenario</li>
-	<ul>
-		<li>User retrieves a dataset, then clicks on “oh yeah” button, then site returns a provenance record</li>
-	</ul>
-</ul>
-
-<ul>
-	<li>Licensing scenario</li>
-	<ul>
-		<li>User retrieves dataset, then wants to check permission to use</li>
-	</ul>
-</ul>
-
-<ul>
-	<li>Referral scenario</li>
-	<ul>
-		<li>Site refers queries about provenance in terms of pointers to another site’s provenance facilities</li>
-	</ul>
-</ul>
-
-<ul>
-	<li>Repeated queries scenario</li>
-	<ul>
-		<li>Service repeatedly queries a site, wants provenance for all the answers</li>
-		<li>This is similar to PROV WG example, where user follows provenance record, asking follow-up questions based on previous answers</li>
-	</ul>
-</ul>
-
-<ul>
-	<li>Versioning scenario</li>
-	<ul>
-		<li>User retrieves a dataset, then wants to see its provenance, but the dataset has been updated in the original site (its provenance as well)</li>
-	</ul>
-</ul>
-
-<ul>
-	<li>Dynamic scenario</li>
-	<ul>
-		<li>User retrieves a resource that is dynamically created</li>
-	</ul>
-</ul>
 </section>
 
 
-
 </section> <!-- Pragmatic Provenance >> -->