--- a/bp/index.html Thu Dec 05 00:47:14 2013 +0100
+++ b/bp/index.html Thu Dec 05 09:45:56 2013 +0100
@@ -177,7 +177,7 @@
<h2>Scope</h2>
<p>
-This document aims to facilitate the adoption of Linked Open Data Principles for publishing open government data on the Web. Linked Data utilizes the Resource Description Framework [[!RDF-SCHEMA]].
+This document aims to facilitate the adoption of Linked Open Data Principles for publishing open government data on the Web. Linked Data utilizes the Resource Description Framework [[!RDF-CONCEPTS]].
<p>
Linked Data refers to a set of best practices for publishing and interlinking structured data for access by both humans and machines via the use of the RDF family of syntaxes (e.g., RDF/XML, N3, Turtle and N-Triples) and HTTP URIs. RDF and Linked Data are not the same thing.
@@ -204,8 +204,10 @@
The following best practices are discussed in this document and listed here for convenience.
-<p class='stmt'><a href="#NOMINATE">NOMINATE A PILOT:</a> <br>Nominate a pilot project on Linked Open Data.
-</p>
+<!--drop this suggested by Dave R. <p class='stmt'><a href="#NOMINATE">NOMINATE A PILOT:</a> <br>Nominate a pilot project on Linked Open Data.
+</p> -->
+<p class='stmt'><a href="#WROFLOW">CHOOSE A WORKFLOW:</a><br>Determine which workflow to use for your Linked Open Data use case.
+ </p>
<p class='stmt'><a href="#SELECT">SELECT A DATA SET:</a> <br>Select a data set that provides benefit to others for re-use. Consider a data set that your organization collects unique information that when combined with other open data provides greater value. For example, publishing facilities that can then be linked with geographic information including postal codes is a popular example.
</p>
@@ -234,13 +236,13 @@
<p class='stmt'><a href="#MACHINE">MACHINE ACCESSIBLE:</a><br>A major benefit of Linked Data is providing access to data for machines. Machines can use a variety of methods to read data including, but not limited to: a RESTful API, SPARQL endpoints or download.
</p>
-<p class='stmt'><a href="#SERIALIZATION">SERIALIZATION:</a><br> Convert the source data into a Linked Data representation. This process is also called "Triplifaction" however, more accurately, it is the process serializing data into one or more RDF serializations. RDF serializations include Turtle, Notation-3 (N3), N-Triples, XHTML with embedded RDFa, and RDF/XML.
+<p class='stmt'><a href="#SERIALIZATION">DATA CONVERSION:</a><br> Convert the source data to a Linked Data representation. This typically means mapping the source data to a set of RDF statements about the entities described by the data. These statements can then be serialized into a range of RDF serializations, including Turtle, N-Triples, JSON-LD, (X)HTML with embedded RDFa, and RDF/XML.
</p>
<p class='stmt'><a href="#LINK">LINKS ARE KEY:</a> <br>As the name suggests, Linked Open Data means the data is linked to other stuff. Data in isolation is rarely valauble, however, interlinked data is suddenly very valuable. There are many popular data sets, such as DBpedia that provide valuable data, including photos and geographic information. Being able to connect Linked Open Data from a government authority with DBpedia is quick way to show the value of adding content to the Linked Data Cloud.
</p>
-<p class='stmt'><a href="#HOST">DOMAIN:</a> <br>Deliver Linked Open Data on authoritative domain. Using an authoritative domain increases the perception of trusted content. Authoritative data that is regularly updated on a government domain is critical to uptake and reuse of the data set(s).
+<p class='stmt'><a href="#HOST">DOMAIN:</a> <br>Deliver Linked Open Data on an authoritative domain. Using an authoritative domain increases the perception of trusted content. Authoritative data that is regularly updated on a government domain is critical to uptake and reuse of the data set(s).
</p>
<p class='stmt'><a href="#ANNOUNCE">ANNOUNCE:</a><br> Announce the Linked Open Data on multiple channels. Be sure to have a plan in place to handle timely feedback. Linked Open Data implies the public is looking at and using the data, so ensure you have people in place to handle the customer service and technical support required to support the global Web audience.
@@ -383,65 +385,6 @@
</p>
</section>
-<!-- Discovery Checklist
- <p class="issue"> (Editors) - Requested (12-Apr-2013, GLD WG F2F) - guidance on creating a simple controlled vocabulary using SKOS. Confirm this fits with the scope of BP document. <br/>
- Some pointers: <br/>
- + SKOS datasets: http://www.w3.org/2001/sw/wiki/SKOS/Datasets <br/>
- + SKOS implementation records: http://www.w3.org/2006/07/SWD/SKOS/reference/20090315/implementation.html <br/>
- + An introduction to SKOS: http://www.xml.com/pub/a/2005/06/22/skos.html
-
-</p>-->
-
-<section id='skos'>
-<h2>Using SKOS to Create a Controlled Vocabulary</h2>
-
-<div class='note'>
- [[SKOS-REFERENCE]] , which stands for Simple Knowledge Organization System, is a W3C standard, based on other Semantic Web standards (RDF and OWL), that provides a way to represent controlled vocabularies, taxonomies and thesauri. Specifically, SKOS itself is an OWL ontology and it can be written out in any RDF flavour.
-</div>
-
-<p>The W3C SKOS standard defines a portable, flexible controlled vocabulary format that is increasingly popular, with the added benefit of a good entry-level step toward the use of Semantic Web technology. </p>
-<div class="highlight"> It is a good practice to use SKOS in the following situations:
- <ul>
- <li>There is a need to publish a list of terms or taxonomies having a special meaning for the domain</li>
- <li> There is a clear distinction between the collections of concepts (ConceptScheme) and the different concepts. </li>
- <li> Define when possible a different namespace for each <code>skos:ConceptScheme</code> </li>
- <li> Structure the categories using properties <code>skos:hasTopConcept</code>, <code>skos:broader</code>. </li>
- <li> When the mappings with the concepts are not only of the form <b>owl:sameAs</b>, hence it could be possible to have other semantic relationships among them, e.g.: broader, related, narrower.
- </ul>
-</div>
-
-
-<p><i>Let's consider a list of equipments where the codes used are: A101="Police", A206="Post Office" and A504="Restaurant". With SKOS, we could define the following structure:</i></p>
-
-
-<pre class="example">
- <http://example.org/codes/typeEquipment/A101>
- rdf:type skos:Concept, ex:TypeEquipmentA ;
- skos:notation "A101"
- skos:prefLabel "Police"@en ;
- skos:inScheme <http://example.org/codes/typeEquipment> .
-
- <http://example.org/codes/typeEquipment/A206>
- rdf:type skos:Concept, ex:TypeEquipmentA ;
- skos:notation "A206"
- skos:prefLabel "Post office"@en ;
- skos:inScheme <http://example.org/codes/typeEquipment> .
-
- <http://example.org/codes/typeEquipment/A504>
- rdf:type skos:Concept, ex:TypeEquipmentA ;
- skos:notation "A504"
- skos:prefLabel "Restaurant"@en ;
- skos:inScheme <http://example.org/codes/typeEquipment> .
-
- <http://example.org/codes/typeEquipment>
- rdf:type skos:ConceptScheme ;
- rdfs:label "Type of Equipments"@en;
- rdfs:label "Type d'equipements"@fr .
- </pre>
-
-
-
-</section>
@@ -500,22 +443,11 @@
</section>
-<!-- << Vocabulary management/creation -->
+<!-- << Vocabulary creation -->
<section id="MODEL">
<h2>Vocabulary Creation</h2>
-<!-- Editorial notes for creators/maintainers:
-
-Creation Namespace management
-stability (PURL)
-longevity (hit by bus)
-Available ontology development methodologies (Informative)
-Usage (instance-level, SPARQL, etc.)
-Versioning
-go back to creation
-Partial or full deprecation
-Cross-cutting issues: "Hit-by-bus" -->
<p><i>There will be cases in which authorities will need to mint their own vocabulary terms. This section provides a set of considerations aimed at helping to government stakeholders mint their own vocabulary terms. This section includes some items of the previous section because some recommendations for vocabulary selection also apply to vocabulary creation.</i> </p>
<p class="note"> Ensure new vocabularies you create are:
@@ -554,7 +486,7 @@
</p>
<p class="highlight"><b>Vocabularies should provide a versioning policy</b><br/>
- <i>What it means:</i> It refers to the mechanism put in place by the publisher to always take care of backward compatibilities of the versions, the ways those changes affected the previous versions. Major changes of the vocabularies should be reflected on the documentation, in both machine or human-readable formats. This is strongly related to the best practices described in the Versioning section.
+	<i>What it means:</i> This refers to the mechanism the publisher puts in place to preserve backward compatibility across versions and to document how changes affect previous versions. Major changes to a vocabulary should be reflected in its documentation, in both machine-readable and human-readable formats.
</p>
<p class="highlight"><b>Vocabularies should provide documentation</b><br/>
@@ -591,9 +523,7 @@
</p> -->
-<!-- TODO -->
-<!--
-<p class="todo">Add references to Felix Sasaka's work on multilingual Web and new W3C WG</p> -->
+
<section id="multilingual">
<h2>Multilingual Vocabularies</h2>
@@ -607,8 +537,8 @@
<!--remove this, suggested by Phil? It is also a best practice to always include an <code>rdfs:label</code> for which the language tag in not indicated. This term corresponds to the <b>"default"</b>language of the vocabulary</li> -->
<li>As <code>skos:prefLabel</code> (or <code>skosxl:Label</code>), in which the language has also been restricted.</li>
- <li>As a set of monolingual ontologies (ontologies in which labels are expressed in one natural language) in the same domain mapped or aligned to each other (see the example of EuroWordNet, in which wordnets in different natural languages are mapped to each other through the so-called ILI - inter-lingual-index-, which consists of a set of concepts common to all categorizations).</li>
- <li>As a set of ontology + lexicon. This represents the latest trend in the representation of linguistic (multilingual) information associated to ontologies. The idea is that the ontology is associated to an external ontology of linguistic descriptions. One of the best exponents in this case is the lemon model </a>, <a href="http://lexinfo.net/" target="_blank"></a>, an ontology of linguistic descriptions that is to be related with the concepts and properties in an ontology to provide lexical, terminological, morphosintactic, etc., information. One of the main advantages of this approach is that semantics and linguistic information are kept separated. One can link several lemon models in different natural languages to the same ontology.</li>
+ <li>As a set of monolingual ontologies (ontologies in which labels are expressed in one natural language) in the same domain mapped or aligned to each other (see the example of EuroWordNet, in which wordnets in different natural languages are mapped to each other through the so-called <code>ILI - inter-lingual-index-</code>, which consists of a set of concepts common to all categorizations).</li>
+		<li>As a combination of ontology and lexicon. This is an approach to representing linguistic (multilingual) information associated with ontologies. The idea is that the ontology is associated with an external ontology of linguistic descriptions. A prominent example is the <a href="http://lexinfo.net/" target="_blank">lemon model</a>, an ontology of linguistic descriptions that can be related to the concepts and properties in an ontology to provide lexical, terminological, morphosyntactic, and other information. One of the main advantages of this approach is that semantic and linguistic information are kept separate. One can link several lemon models in different natural languages to the same ontology.</li>
<li> It could be also useful to use the <code>lexInfo</code> ontology available at <code>http://www.lexinfo.net/lmf#</code> where they provide stable resources for languages, such as <code>http://lexvo.org/id/iso639-3/eng </code> for English, or <code>http://lexvo.org/id/iso639-3/cmn</code> for Chinese Mandarin.
</ul>
<p class="note">The current trend is to follow the first approach, i.e., to use at least a <code>rdfs:label</code> and <code>rdfs:comment</code> for each term in the vocabulary.</p>
@@ -616,14 +546,64 @@
</section>
-<!--<ul>
- <li>Find out the frequence of usage of the term in Linked Open Data using tools like <a href="http://stats.lod2.eu/properties" target="_blank">lodstats</a> or <a href="http://lov.okfn.org/dataset/lov/stats/" target="_blank">lov stats</a></li>
- <li>Always use first a property in the namespace of a "standard", or recommended vocabulary by the W3C. </li>
- <li>Always use a property in a vocabulary more recently published, because it likely extendes or reuses a similar previous vocabulary in the same scope.</li>
- <li> Consider with higher priority the criteria of sustainability or long term presence of the namespace.</li>
- <li>Authoritive criteria of the underlying vocabulary has to be taken into account.</li>
- <li>Don't be ashame, learn from others vocabularies published in the Wild .</li>
- </ul> -->
+<!-- Using SKOS to create a controlled vocabulary -->
+
+<section id='skos'>
+<h2>Using SKOS to Create a Controlled Vocabulary</h2>
+
+<div class='note'>
+    SKOS [[SKOS-REFERENCE]], the Simple Knowledge Organization System, is a W3C standard, based on other Semantic Web standards (RDF and OWL), that provides a way to represent controlled vocabularies, taxonomies and thesauri. SKOS itself is an OWL ontology and can be written out in any RDF syntax.
+</div>
+
+<p>The W3C SKOS standard defines a portable, flexible controlled vocabulary format that is increasingly popular and offers a good entry-level step toward the use of Semantic Web technology. </p>
+
+<div class="highlight"> SKOS is appropriate in the following situations:
+ <ul>
+ <li>There is a need to publish a controlled list of terms or taxonomies having a special meaning for the domain.</li>
+ <li> The complexity and formality of an OWL ontology is not appropriate (for example the terms are not themselves entities that will be richly described).</li>
+ </ul>
+</div>
+<div class="highlight"> When creating a SKOS vocabulary, bear the following good practices in mind:
+ <ul>
+ 	<li>Make a clear distinction between the collections of concepts (ConceptScheme) and the different individual concepts. </li>
+ 	<li> Where possible, define a different namespace for each <code>skos:ConceptScheme</code>. </li>
+ 	<li> Structure the concepts in the list using the properties <code>skos:hasTopConcept</code>, <code>skos:broader</code> and <code>skos:narrower</code>. </li>
+ <li>Consider defining a Class to represent all the skos:Concepts in your controlled list (this can facilitate declaration of properties that will use this list).</li>
+ <li> Provide multilingual labels for the terms.</li>
+ </ul>
+</div>
+
+
+<p><i>Consider a list of equipment types with the codes A101="Police", A206="Post Office" and A504="Restaurant". With SKOS, we could define the following structure:</i></p>
+
+
+<pre class="example">
+   @prefix rdf:  &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
+   @prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
+   @prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
+   # The ex: namespace below is illustrative only.
+   @prefix ex:   &lt;http://example.org/def#&gt; .
+
+   &lt;http://example.org/codes/typeEquipment/A101&gt;
+      rdf:type skos:Concept, ex:TypeEquipmentA ;
+      skos:notation "A101" ;
+      skos:prefLabel "Police"@en ;
+      skos:inScheme &lt;http://example.org/codes/typeEquipment&gt; .
+
+   &lt;http://example.org/codes/typeEquipment/A206&gt;
+      rdf:type skos:Concept, ex:TypeEquipmentA ;
+      skos:notation "A206" ;
+      skos:prefLabel "Post office"@en ;
+      skos:inScheme &lt;http://example.org/codes/typeEquipment&gt; .
+
+   &lt;http://example.org/codes/typeEquipment/A504&gt;
+      rdf:type skos:Concept, ex:TypeEquipmentA ;
+      skos:notation "A504" ;
+      skos:prefLabel "Restaurant"@en ;
+      skos:inScheme &lt;http://example.org/codes/typeEquipment&gt; .
+
+   &lt;http://example.org/codes/typeEquipment&gt;
+      rdf:type skos:ConceptScheme ;
+      rdfs:label "Types of equipment"@en ;
+      rdfs:label "Types d'équipements"@fr .
+ </pre>
+
+
+</section>
+
<section id="howto">
<h2>Best Practice for choosing entity URIs</h2>
@@ -659,29 +639,29 @@
<p class="highlight"><b>Q: Do the entities have existing unique, non-URI, IDs?</b> <br>
YQ: Are they globally unique?<br>
- YQ: Is there an existing URI mapping for these IDs from a reliable party?<br>
- YQ: Do you have additional information about the entities beyond what they have?<br>
- Y: Mint your own URIs based on the existing ID, and map using a mapping property<br>
- N: Use their URIs directly.<br>
- N: Mint your own URI based on the existing ID by sticking it onto a unique base.<br>
+ YQ: Is there an existing URI mapping for these IDs from a reliable party?<br>
+ YQ: Do you have additional information about the entities beyond what they have?<br>
+ Y: Mint your own URIs based on the existing ID, and map using a mapping property<br>
+ N: Use their URIs directly.<br>
+ N: Mint your own URI based on the existing ID by sticking it onto a unique base.<br>
N: So you have only strings that are not guaranteed to be in a stable 1:1 correspondence with the entities. Use a blank node; make sure that there's a good skos:prefLabel, rdfs:label, dc:title, and other standard metadata properties.
</p>
<p class="highlight"><b>Q: Can the entity be represented as one of the standard RDF datatypes (that is, it's a date, number, etc.)?</b> <br>
YQ: Is the entity annotated with additional information beyond what the datatype represents?<br>
- N: Use a typed literal.
+ N: Use a typed literal.
</p>
<p class="highlight"><b>Q: Can you map to reliable remote URIs?</b> <br>
Q: Do you have data about the entities beyond what's already available from the remote URIs?<br>
-A: No? Then use the remote URIs.
+N: Use the remote URIs.
</p>
<p class="highlight"><b>Q: Data is on its own web page with permalink?</b> <br>
Q: Can you deploy RDFa in the web page, or can you deploy Turtle via content negotiation on the same URI?<br>
-A: Use <code>permalink#{fragment}</code> pattern, where <code>{fragment}</code> might be "this", "id", "product", "user", etc.
+YN: Use <code>permalink#{fragment}</code> pattern, where <code>{fragment}</code> might be "this", "id", "product", "user", etc.
</p>
</section>
@@ -742,7 +722,7 @@
<p class="note"><b>TAG advices on http issues</b><br>
The TAG provides advice to the community that they may mint "http" URIs for any resource provided that they follow this simple rule for the sake of removing ambiguity as below:
-<pre class="example">
+<pre class="highlight">
<li> If an "http" resource responds to a GET request with a 2xx response, then the resource identified by that URI is an information resource;</li>
<li> If an "http" resource responds to a GET request with a 303 (See Other) response, then the resource identified by that URI could be any resource;</li>
<li> If an "http" resource responds to a GET request with a 4xx (error) response, then the nature of the resource is unknown.
@@ -876,7 +856,8 @@
</li>
</ul>
-</section> <!-- URI CONSTRUCTION >> -->
+</section>
+
<!-- SPECIFY LICENSE -->
@@ -1069,7 +1050,7 @@
<h2>Acknowledgments</h2>
This document has been produced by the Government Linked Data Working Group, and its contents reflect extensive discussion within the Working Group as a whole.
<p>
-The editors gratefully acknowledge the many contributors to this Best Practices document including: <a href="http://mhausenblas.info/#i">Michael Hausenblas</a> (MapR), <a href="http://logd.tw.rpi.edu/person/john_erickson" target="_blank">John Erickson</a> (Rensselaer Polytechnic Institute), <a href="http://3roundstones.com/about-us/leadership-team/david-wood/" target="_blank">David Wood</a> (3 Round Stones), <a href="http://data.semanticweb.org/person/bernard-vatant/">Bernard Vatant </a> (Semantic Web - Mondeca), Michael Pendleton (U.S. Environmental Protection Agency), <a href="http://researcher.watson.ibm.com/researcher/view_person_subpage.php?id=3088" target="_blank">Biplav Srivastava</a> (IBM India), <a href="http://www.oeg-upm.net">Daniel Vila </a> (Ontology Engineering Group), Martín Álvarez Espinar (CTIC-Centro Tecnológico), and <a href="http://linkedgov.org">Hadley Beeman </a> (UK LinkedGov) and Phil Archer <a href="http://www.w3.org" target="_blank">(W3C / ERCIM)</a>.
+The editors gratefully acknowledge the many contributors to this Best Practices document including: <a href="http://mhausenblas.info/#i">Michael Hausenblas</a> (MapR), <a href="http://logd.tw.rpi.edu/person/john_erickson" target="_blank">John Erickson</a> (Rensselaer Polytechnic Institute), <a href="http://3roundstones.com/about-us/leadership-team/david-wood/" target="_blank">David Wood</a> (3 Round Stones), <a href="http://data.semanticweb.org/person/bernard-vatant/">Bernard Vatant</a> (Semantic Web - Mondeca), Michael Pendleton (U.S. Environmental Protection Agency), <a href="http://researcher.watson.ibm.com/researcher/view_person_subpage.php?id=3088" target="_blank">Biplav Srivastava</a> (IBM India), <a href="http://www.oeg-upm.net">Daniel Vila</a> (Ontology Engineering Group), Martín Álvarez Espinar (CTIC-Centro Tecnológico), <a href="http://linkedgov.org">Hadley Beeman</a> (UK LinkedGov), <a href="http://www.epimorphics.com" target="_blank">Dave Reynolds</a>, and Phil Archer <a href="http://www.w3.org" target="_blank">(W3C / ERCIM)</a>.
</p>
</section>