First round of Marios comments
author Boris Villazon-Terrazas <bvillazon@fi.upm.es>
Wed, 27 Mar 2013 05:33:51 +0100
changeset 441 64e7bc77f396
parent 440 e948ff07e769
child 442 839493d50752
bp/index.html
--- a/bp/index.html	Tue Mar 26 16:25:05 2013 -0400
+++ b/bp/index.html	Wed Mar 27 05:33:51 2013 +0100
@@ -42,17 +42,17 @@
 
 <h2>Audience</h2>
 <p>
-Readers of this document are expected to be familiar with fundemental Web technologies such as HTML, URIs, and HTTP.  The document is targeted at developers, government information management staff, website administrators.
+Readers of this document are expected to be familiar with fundamental Web technologies such as HTML, URIs, and HTTP.  The document is targeted at developers, government information management staff, and website administrators.
 </p>
 
 <h2>Scope</h2>
 <p>
-This document aims to facilitate the adoption of Linked Open Data Principles for publishing open government data on the Web.  Linked Data utilizes the Resource Description Framework (RDF) 
+This document aims to facilitate the adoption of Linked Open Data Principles for publishing open government data on the Web.  Linked Data utilizes the Resource Description Framework (RDF). 
 
 <p>
 Linked Data refers to a set of best practices for publishing and interlinking structured data for access by both humans and machines via the use of the RDF family of syntaxes (e.g., RDF/XML, N3, Turtle and N-Triples) and HTTP URIs. RDF and Linked Data are not the same thing.  
 
-Linked Data can be published by an person or organization behind the firewall or on the public Web. If Linked Data is published on the public Web, it is generally called Linked Open Data.
+Linked Data can be published by a person or organization behind the firewall or on the public Web. If Linked Data is published on the public Web, it is generally called Linked Open Data.
 </p>
 
 <h2>Background</h2>
@@ -72,7 +72,7 @@
 <p class='stmt'><a href="#IDENTIFY">IDENTIFY</a> The first step is identifying data sets that other people may wish to re-use.
 </p>
 
-<p class='stmt'><a href="#MODEL">MODEL</a> Sketch the main objects the data describes.  Use lines to describe how they are related to each other.  Denormalize the data as necessary.  Put aside immediate needs of any given application and model the data.
+<p class='stmt'><a href="#MODEL">MODEL</a> Sketch the main objects the data describes.  Use lines to describe how they are related to each other.  Denormalize the data as necessary. Model the data in an application-independent, objective way in terms of representation.
 </p>
 
 <p class='stmt'><a href="#NAME">NAME</a> Use HTTP URIs as names for your objects. Give careful consideration to the URI naming strategy. Consider how the data will change over time and name as necessary.
@@ -104,6 +104,9 @@
 
 <h2> Linked Open Data Lifecycle </h2>
 <!-- <p class='issue'>Does it make sense to base the GLD life cycle on one of the general LD life cycles? See <a href="https://www.w3.org/2011/gld/track/issues/15">ISSUE-15</a></p> -->
+<p>
+The process of publishing Government Linked Open Data should comprise tractable and manageable steps, forming a life cycle in the same way Software Engineering uses life cycles in development projects. A GLD life cycle should cover all steps from identifying appropriate datasets to actually publishing and maintaining them. In the following paragraph three different life cycle models are presented; however, it is evident that they all share common (and sometimes overlapping) characteristics in their constituents. For example, they all identify the need to specify, model and publish data in acceptable LOD formats. In essence, they capture the same tasks that are needed in the process, but provide different boundaries between these tasks.
+</p>
 
 <p class="todo"> (Editors) - Please provide a brief description of lifecycle diagrams.
 </p>
@@ -126,7 +129,7 @@
 
 <ul>
 	<li>
-	<p>Villaz&oacute;n-terrazas et al. claim that the process of publishing Government Linked Data must have a life cycle, in the same way of Software Engineering, in which every development project has a life cycle. According to our experience this process has an iterative incremental life cycle model, which is based on the continuous improvement and extension of the Government Linked Data resulted from performing several iterations.
+	<p>Villaz&oacute;n-Terrazas et al. propose a Linked Data life cycle that consists of the following steps: (1) Specify, (2) Model, (3) Generate, (4) Publish, and (5) Exploit.
 </p>
 	</li>
 </ul>
@@ -171,7 +174,8 @@
 Examples of domain: Geography, Environment, Administrations, State Services, Statistics, People, Organisation.</p>
 
 <p class="highlight"><b>Identify relevant keywords in the dataset</b><br/>
-	<i>What it means:</i> Identify words that describe the main ideas or concepts. By identifying the relevant keywords or categories of your dataset, it helps for the searching process using a Semantic Web Search Engine. If you have raw data in <a href='http://www.w3.org/TR/gld-glossary/#csv'>CSV</a>, the columns of the tables can be used for the searching process. <br/><br/>
+	<i>What it means:</i> Identify words that describe the main ideas or concepts. Identifying the relevant keywords or categories of your dataset helps the search process when using a Semantic Web search engine. 
+	<!-- If you have raw data in <a href='http://www.w3.org/TR/gld-glossary/#csv'>CSV</a>, the columns of the tables can be used for the searching process.--> <br/><br/>
 	Examples: commune, county, feature	
 </p>
 
@@ -222,12 +226,12 @@
 
 This checklist aims to help in vocabulary selection, in summary:
 <li>Ensure vocabularies you use are published by a trusted group or organization</li>
-<li>Ensure vocabularies have permanent URI</li>
+<li>Ensure vocabularies have permanent URIs</li>
 <li>Confirm the versioning policy </li>
 </p>
 
 <p class="highlight"><b>Vocabularies MUST be documented</b><br/>
-	<i>What it means:</i> A vocabulary MUST be documented. This includes the liberal use of labels and comments; tags to language used.  Human-readable pages must be provided by the publisher describe the classes and properties, preferably with use cases defined.	
+	<i>What it means:</i> A vocabulary MUST be documented. This includes the liberal use of labels and comments, as well as appropriate language tags. The publisher must provide human-readable pages that describe the vocabulary, along with its constituent classes and properties. Preferably, easily comprehensible use-cases should be defined and documented.	
 </p>
 
 <p class="highlight"><b>Vocabularies SHOULD be self-descriptive</b><br/>
@@ -248,7 +252,7 @@
 </p>
 
 <p class="highlight"><b>Vocabularies SHOULD be used by other data sets</b><br/>
-	<i>What it means:</i> If the vocabulary is used by other authoritative Linked Open Data sets that is helpful.  It is in re-use of vocabularies that we achieve the benefits of Linked Open Data.  For example: An analysis on the <a href="http://stats.lod2.eu/vocabularies" target="_blank">use of vocabularies</a> on the Linke Data cloud reveals that <a href="http://xmlns.com/foaf/0.1" target="_blank">FOAF</a> is reused by more than 55 other vocabularies.
+	<i>What it means:</i> It is helpful if the vocabulary is used by other authoritative Linked Open Data sets.  It is in re-use of vocabularies that we achieve the benefits of Linked Open Data. Selected vocabularies from third parties should already be in use by other data sets, as this shows that they are already established in the LOD community, and are thus better candidates for wider adoption and reuse. For example: An analysis on the <a href="http://stats.lod2.eu/vocabularies" target="_blank">use of vocabularies</a> on the Linked Data cloud reveals that <a href="http://xmlns.com/foaf/0.1" target="_blank">FOAF</a> is reused by more than 55 other vocabularies.
 </p>
 
 <p class="highlight"><b>Vocabularies SHOULD be accessible for a long period</b><br/>
@@ -338,7 +342,7 @@
 	<li>As a set of rdfs:label in which the language has been restricted (@en, @fr...). Currently, this is the most commonly used approach. It is also a best practice to always include an rdfs:label for which the language tag is not indicated. This term corresponds to the "default" language of the vocabulary.</li>
 	<li>As skos:prefLabel (or skosxl:Label), in which the language has also been restricted.</li>
 	<li>As a set of monolingual ontologies (ontologies in which labels are expressed in one natural language) in the same domain mapped or aligned to each other (see the example of EuroWordNet, in which wordnets in different natural languages are mapped to each other through the so-called ILI - inter-lingual-index-, which consists of a set of concepts common to all categorizations).</li>
-	<li>As a set of ontology + lexicon. This represent the latest trend in the representation of linguistic (multilingual) information associated to ontologies. The idea is that the ontology is associated to an external ontology of linguistic descriptions. One of the best exponents in this case is the lemon model <a href="http://tia2011.crim.fr/Workshop-Proceedings/pdf/TIAW15.pdf" target="_blank">REF1</a>, <a href="http://lexinfo.net/" target="_blank">REF2</a>, an ontology of linguistic descriptions that is to be related with the concepts and properties in an ontology to provide lexical, terminological, morphosintactic, etc., information. One of the main advantages of this approach is that semantics and linguistic information are kept separated. One can link several lemon models in different natural languages to the same ontology.</li>
+	<li>As a set of ontology + lexicon. This represents the latest trend in the representation of linguistic (multilingual) information associated with ontologies. The idea is that the ontology is associated with an external ontology of linguistic descriptions. One of the best exponents in this case is the lemon model <a href="http://tia2011.crim.fr/Workshop-Proceedings/pdf/TIAW15.pdf" target="_blank">REF1</a>, <a href="http://lexinfo.net/" target="_blank">REF2</a>, an ontology of linguistic descriptions that is to be related to the concepts and properties in an ontology to provide lexical, terminological, morphosyntactic, etc., information. One of the main advantages of this approach is that semantics and linguistic information are kept separate. One can link several lemon models in different natural languages to the same ontology.</li>
 </ul>
 The current trend is to follow the first approach, i.e., to use rdfs:label and rdfs:comment for each term in the vocabulary.
 	
@@ -350,7 +354,7 @@
 <!-- << URI CONSTRUCTION   -->
 <section>
 <h2>URI Construction</h2>
-The following guidance is providing with respect to creating or sometimes called "minting" URIs for vocabularies, concepts, and datasets.  This section specifies how to create good URIs for use in government linked data. Input documents include 
+The following guidance is provided with the intention to address URI minting, i.e., URI creation for vocabularies, concepts and datasets. This section specifies how to create good URIs for use in government linked data. Input documents include 
 <ul>
 	<li><a href="http://www.w3.org/TR/cooluris/" title="Cool URIs for the Semantic Web">Cool URIs for the Semantic Web</a></li>
 	<li><a href="http://www.cabinetoffice.gov.uk/media/308995/public_sector_uri.pdf">Designing URI Sets for the UK Public Sector</a> (PDF)</li>
@@ -365,7 +369,7 @@
 </p>
 
 <h3>URI Design Principles</h3>
-<p>The Web makes use of the URI (Uniform Resource Identifiers) as a single global identification system. The global scope of URIs promotes large-scale "network effects", in order to benefit from the value of Linked Data government and governmental agencies need to identify their resources using URIs. This section provides a set of general principles aimed at helping to government stakeholders to define and manage URIs for their resources.</p>
+<p>The Web makes use of URIs (Uniform Resource Identifiers) as a single global identification system. The global scope of URIs promotes large-scale "network effects". Therefore, in order to benefit from the value of LD, governments and governmental agencies need to identify their resources using URIs. This section provides a set of general principles aimed at helping government stakeholders to define and manage URIs for their resources.</p>
 
 <p class="highlight"><b>Use HTTP URIs</b><br>
 What it means: To benefit from and increase the value of the World Wide Web, governments and agencies SHOULD provide HTTP URIs as identifiers for their resources. There are many benefits to participating in the existing network of URIs, including linking, caching, and indexing by search engines. As stated in [LDPrinciples], HTTP URIs enable people to "look-up" or "dereference" a URI in order to access a representation of the resource identified by that URI.
@@ -373,7 +377,7 @@
 </p>
 
 <p class="highlight"><b>Provide at least one machine-readable representation of the resource identified by the URI</b><br>
-What it means: In order to enable HTTP URIs to be "dereferenced", data publishers have to set up the neccesary infraestructure elements (e.g. TCP-based HTTP servers) to serve representations of the resources they want to make available (e.g. a human-readable HTML representation or a machine-readable RDF/XML). A publisher may supply zero or more representations of the resource identified by that URI. However, there is a clear benefit to data users in providing at least one machine-readable representation. More information about serving different representations of a resource can be found in <a href="http://www.w3.org/TR/cooluris/" target="_blank">Cool URIs for the Semantic Web</a>.
+What it means: In order to enable HTTP URIs to be "dereferenced", data publishers have to set up the necessary infrastructure elements (e.g. TCP-based HTTP servers) to serve representations of the resources they want to make available (e.g. a human-readable HTML representation or a machine-readable RDF/XML). A publisher may supply zero or more representations of the resource identified by that URI. However, there is a clear benefit to data users in providing at least one machine-readable representation. More information about serving different representations of a resource can be found in <a href="http://www.w3.org/TR/cooluris/" target="_blank">Cool URIs for the Semantic Web</a>.
 </p>
 
 <p class="highlight"><b>A URI structure will not contain anything that could change</b><br>
@@ -392,7 +396,6 @@
 The World Wide Web Consortium's (W3C) Technical Architecture Group (TAG) attempted to settle a long standing debate about the use of URL resolution on 15 June 2005. Specifically, they decided:
 
 The TAG provides advice to the community that they may mint "http" URIs for any resource provided that they follow this simple rule for the sake of removing ambiguity:
-
 <ul>
 <li> If an "http" resource responds to a GET request with a 2xx response, then the resource identified by that URI is an information resource;</li>
 <li> If an "http" resource responds to a GET request with a 303 (See Other) response, then the resource identified by that URI could be any resource;</li>
@@ -400,7 +403,8 @@
 </li>
 </ul>
 
-<p>
+
+
 The practical implication of http-range-14 for Linked Data and Semantic Web implementors is the requirement to return an HTTP 303 (See Other) response when resolving HTTP URI identifiers for conceptual or physical resources (that is, for resources whose canonical content is non-informational in nature, cf. [Wood2007]).  Current implementations of the Persistent URL (PURL) server provide support for 303 URIs [Wood2010]. Although the issue remains unsettled, and occasional attempts have been (and probably will be) made to revisit the TAG’s decision, compliance with the http-range-14 decision is recommended until such time as it may be updated.
 </p>
 
@@ -479,7 +483,7 @@
 </p>
 
 <p>
-A Persistent URL is an address on the World Wide Web that causes a redirection to another Web resource. If a Web resource changes location (and hence URL), a PURL pointing to it can be updated. A user of a PURL always uses the same Web address, even though the resource in question may have moved. PURLs may be used by publishers to manage their own information space or by Web users to manage theirs; a PURL service is independent of the publisher of information. PURL services thus allow the management of hyperlink integrity. Hyperlink integrity is a design trade-off of the World Wide Web, but may be partially restored by allowing resource users or third parties to influence where and how a URL resolves. A simple PURL works by responding to an HTTP GET request with a response
+A Persistent URL (PURL) is an address on the World Wide Web that causes a redirection to another Web resource. If a Web resource changes location (and hence URL), a PURL pointing to it can be updated. A user of a PURL always uses the same Web address, even though the resource in question may have moved. PURLs may be used by publishers to manage their own information space or by Web users to manage theirs; a PURL service is independent of the publisher of information. PURL services thus allow the management of hyperlink integrity. Hyperlink integrity is a design trade-off of the World Wide Web, but may be partially restored by allowing resource users or third parties to influence where and how a URL resolves. A simple PURL works by responding to an HTTP GET request with a response
 of type 302 (“Found”). The response contains an HTTP “Location” header, the value of which is a URL that the client should subsequently retrieve via a new HTTP GET request. 
 </p>