BP document update with Vocab Selection section
author Boris Villazon-Terrazas <bvillazon@fi.upm.es>
Fri, 02 Mar 2012 19:17:41 +0100
changeset 110 c17fe482db32
parent 108 11e7dc631232
child 111 eb80ae20e7a4
bp/img/GLF_Hausenblas.PNG
bp/img/GLF_Hyland.PNG
bp/img/GLF_Villazon-terrazas.PNG
bp/index.html
bp/local-style.css
Binary file bp/img/GLF_Hausenblas.PNG has changed
Binary file bp/img/GLF_Hyland.PNG has changed
Binary file bp/img/GLF_Villazon-terrazas.PNG has changed
--- a/bp/index.html	Thu Feb 23 19:50:52 2012 +0100
+++ b/bp/index.html	Fri Mar 02 19:17:41 2012 +0100
@@ -85,6 +85,21 @@
 <p class='issue'>Does it make sense to base the GLD life cycle on one of the general LD life cycles? See <a href="https://www.w3.org/2011/gld/track/issues/15">ISSUE-15</a></p>
 </section>
 
+<p class='issue'>Michael suggests including the three GLD life cycles currently available.</p>
+
+<p>So far we have identified the following Government Linked Data life cycles:
+</p>
+<p class="todo"> Include a brief description for each one of them.
+</p>
+<p>Hyland et al.</p>
+<div id="centerImg">
+<img src="img/GLF_Hyland.PNG"  width="550"/>
+</div>
+<p>Hausenblas et al.</p>
+<img src="img/GLF_Hausenblas.PNG" width="600"/>
+<p>Villazon-Terrazas</p>
+<img src="img/GLF_Villazon-terrazas.PNG" width="600" />
+
 <section>
 <h3>Brief History of Open Government Linked Data - Bernadette</h3>
 
@@ -231,62 +246,87 @@
 </p>
 
 <section> <!-- Discovery checklist -->
-<h4>Discovery checklist</h4>
-<p>As we already stated, following the reuse-based approach, governments have to look for available vocabularies to reuse, instead of building new vocabularies from scratch. This checklist provides some considerations when trying to find out existing vocabularies that could best fit the needs of a Government or a specialized agency.
+	<h4>Discovery checklist</h4>
+	<p>As already stated, under the reuse-based approach governments should look for existing vocabularies to reuse instead of building new ones from scratch. This checklist provides some considerations for finding existing vocabularies that best fit the needs of a government or a specialized agency.
+	</p>
+	
+<p class="highlight"><b>Define the scope of the domain</b><br/>
+<i>What it means:</i> Developing a common understanding of what is included in, or excluded from, the domain. Defining the scope narrows the search and helps to quickly find related work in Linked Open Data initiatives, which in turn makes it easier to reuse existing vocabularies from the same domain. Most of the time, the dataset itself gives some hints about the domain. <br/><br/>
+Examples of domains: Geography, Environment, Administrations, State Services, Statistics, People, Organisation, etc.
+	</p>
+
+<p class="highlight"><b>Identify relevant keywords in the dataset</b><br/>
+	<i>What it means:</i> Identifying words that describe the main ideas or concepts. The relevant keywords or categories of your dataset serve as search terms for Semantic Web search engines. If you have raw data in CSV, the column headers of the tables can be used as search terms. <br/><br/>
+	Examples: commune, county, point, feature, address, etc.
 </p>
 
-<section>
-	<h5>Define the scope of the domain</h5>
-<p>
-<i>What it means:</i> Developing a common understanding as to what is included in, or excluded from, in the domain. By defining the scope of the domain, it restricts and helps to quickly find out related works in Linked Open Data initiatives. Hence, it could help in reusing some existing vocabularies of the same domain. Most of the time, the dataset gives you some hints about the domain.
-</p>
-<p>
-Examples of domain: Geography, Environment, Administrations, State Services, Statistics, People, Organisation, etc.	
-</p>
-</section>
-
-<section>
-	<h5>Identify relevant keywords in the dataset</h5>
-<p>
-<i>What it means:</i> Identifying words that describe the main ideas or concepts. By identifying the relevant keywords or categories of your dataset, it helps for the searching process using Semantic Web Search Engine. If you have raw data in csv, the columns of the tables can be used for the searching process.
+<p class="highlight"><b>Searching for a vocabulary in one specific language</b><br/>
+	<i>What it means:</i> Many of the available vocabularies are in English. You may want a vocabulary in your own language.
+	Take this into account, as it may restrict your search. Sometimes it is useful to translate some of the keywords into English.
 </p>
-<p>
-Examples: commune, county, point, feature, address, etc.
-</p>
-</section>
 
-<section>
-	<h5>Searching for a vocabulary in one specific language</h5>
-<p>
-<i>What it means:</i>Many of the available vocabularies are in English. You may be aware of having a vocabulary in your own language.
-Consider this issue as it may restrict your search. Sometimes it might be useful to translate some of the keywords to English.
-</p>
-</section>
-
-<section>
-	<h5>How to find vocabularies</h5>
-<p>
-<i>What it means:</i>There are some specific search tools (<a href="http://ws.nju.edu.cn/falcons/" target="_blank">Falcons</a>, <a href="http://watson.kmi.open.ac.uk/WatsonWUI/" target="_blank">Watson</a>, <a href="http://sindice.com/" target="_blank">Sindice</a>, <a href="http://swse.deri.org/" target="_blank">Semantic Web Search Engine</a>, <a href="http://swoogle.umbc.edu/" target="_blank">Swoogle</a>) that collect, analyse and index vocabularies and semantic data available online for efficient access.
-</p>
-<p>
+<p class="highlight"><b>How to find vocabularies</b><br/>
+	<i>What it means:</i> There are specialized search tools (<a href="http://ws.nju.edu.cn/falcons/" target="_blank">Falcons</a>, <a href="http://watson.kmi.open.ac.uk/WatsonWUI/" target="_blank">Watson</a>, <a href="http://sindice.com/" target="_blank">Sindice</a>, <a href="http://swse.deri.org/" target="_blank">Semantic Web Search Engine</a>, <a href="http://swoogle.umbc.edu/" target="_blank">Swoogle</a>, <a href="http://schemapedia.com/" target="_blank">Schemapedia</a>) that collect, analyse and index vocabularies and semantic data available online for efficient access.<br/><br/>
 	Examples: It is possible to perform a search on a relevant term or category present in your data.
 </p>
-</section>
 
-<section>
-	<h5>Where to find existing vocabularies in datasets catalogues</h5>
-<p>
-<i>What it means:</i>Another way around is to perform search using the previously identified key terms in datasets catalogues. Some of these catalogues provide samples of how the underlying data was modelled and how it was used for.
-</p>
-<p>
+<p class="highlight"><b>Where to find existing vocabularies in dataset catalogues</b><br/>
+	<i>What it means:</i> Another option is to search dataset catalogues using the previously identified key terms. Some of these catalogues provide samples of how the underlying data was modelled and what it was used for.<br/><br/>
 	Some existing catalogues are: <a href="http://thedatahub.org/" target="_blank">Data Hub</a> (former CKAN), <a href="http://labs.mondeca.com/dataset/lov/" target="_blank">LOV</a> directory, etc.
 </p>
-</section>
+
 </section> <!-- Discovery checklist >> -->
 
 <section> <!-- << Vocabulary Selection Criteria checklist -->
 <h4>Vocabulary Selection Criteria checklist</h4>
-<p>This checklist aims at giving some advices to better assess and select the vocabulary that best fits your needs, according to the output of the vocabularies discovered in the Discovery section. The final result should be one or two vocabularies that could be reused for your own purpose (mappings, extension, etc..)
+<p>This checklist gives some advice to help you assess the vocabularies found in the Discovery section and select the one that best fits your needs. The final result should be one or two vocabularies that can be reused for your own purpose (mappings, extension, etc.)
+</p>
+
+<p class="highlight"><b>Vocabularies should be self-descriptive</b><br/>
+	<i>What it means:</i> Each property or term in a vocabulary should have a Label, Definition and Comment defined.
+	Self-describing data suggests that information about the encodings used for each representation is provided explicitly within the representation. The ability for Linked Data to describe itself, to place itself in context, contributes to the usefulness of the underlying data.<br/><br/>
+For example, the popular DCMI Metadata Terms vocabulary has a term named <a href="http://dublincore.org/documents/dcmi-terms/#terms-contributor" target="_blank">Contributor</a>, which has:<br/>
+	  Label: Contributor<br/>
+	  Definition: An entity responsible for making contributions to the resource<br/>
+	  Comment: Examples of a Contributor include a person, an organization, or a service.<br/>
+</p>
+
+<p class="highlight"><b>Vocabularies should be described in more than one language</b><br/>
+	<i>What it means:</i> The vocabulary should support multilingualism, i.e., all the elements of the vocabulary should have labels, definitions and comments available in the government's official language, e.g., Spanish, and at least in English.
+	It is also important that each label and comment carries the appropriate language tag, so that the documentation is unambiguous.<br/><br/>
+For example, for the same term <a href="http://dublincore.org/documents/dcmi-terms/#terms-contributor" target="_blank">Contributor</a><br/>
+	  rdfs:label "Contributor"@en, "Colaborador"@es<br/>
+	  rdfs:comment "Examples of a Contributor include a person, an organization, or a service"@en , "Ejemplos de colaborador incluyen persona, organización o servicio"@es<br/>
+</p>
+
+<p class="highlight"><b>Vocabulary reusability</b><br/>
+	<i>What it means:</i> Check how widely the vocabulary is used by other initiatives and how popular it is.<br/><br/>
+For example: Recent <a href="http://stats.lod2.eu/vocabularies" target="_blank">statistics</a> on the use of vocabularies in the cloud reveal that <a href="http://xmlns.com/foaf/0.1" target="_blank">foaf</a> is reused by more than 55 other vocabularies.
+</p>
+
+<p class="highlight"><b>Vocabularies should be accessible for a long period</b><br/>
+	<i>What it means:</i> The selected vocabulary should come with a guarantee of long-term maintenance, or at least its editors should be aware of the issue.
+	This also includes checking the permanence of the URIs and the vocabulary's versioning policy. This is strongly related to the best practices described in the Stability section.
+</p>
+
+<p class="highlight"><b>Vocabularies should be published by a trusted group or organization</b><br/>
+	<i>What it means:</i> Although anyone can create a vocabulary, always check whether a single person, a group or an organization is responsible for publishing and maintaining it.
+	A well-known organization is generally a safer choice than a single person.
+</p>
+
+<p class="highlight"><b>Vocabularies should have permanent URIs</b><br/>
+	<i>What it means:</i> Accessing any term of the vocabulary should never result in an HTTP 404 error. This also covers permanent access to the server hosting the vocabulary, which facilitates reuse of the vocabulary and consumption of the data built upon it.<br/><br/>
+	Example: The <a href="http://www.w3.org/2003/01/geo/wgs84_pos#/" target="_blank">Geo W3C vocabulary</a> is one of the most used vocabularies for the basic representation of geometry points (latitude/longitude) and has been around since 2003, always available at the same namespace. This is strongly related to the best practices described in the Stability section.
+</p>
+
+<p class="highlight"><b>Vocabularies should provide a versioning policy</b><br/>
+	<i>What it means:</i> This refers to the mechanism the publisher puts in place to preserve backward compatibility between versions and to record how changes affect previous versions.
+	Major changes to a vocabulary should be reflected in its documentation, in both machine-readable and human-readable formats. This is strongly related to the best practices described in the Versioning section.
+</p>
+
+<p class="highlight"><b>Vocabularies should provide documentation</b><br/>
+	<i>What it means:</i> A vocabulary should be well documented for machines (use of labels and comments, with tags for the language used).
+	For human readers, the publisher should provide additional documentation that explains the classes and properties, if possible with some representative use cases.
 </p>
 </section> <!--  Vocabulary Selection Criteria checklist >> -->
 
@@ -294,6 +334,58 @@
 <h4>Vocabulary management/creation</h4>
 <p>As we already mentioned, we have to take into account that there may be cases in which Governments will need to mint their own vocabulary terms. This section provides a set of considerations aimed at helping to government stakeholders to mint their own vocabulary terms. This section includes some items of the previous section because some recommendations for vocabulary selection also apply to vocabulary creation.
 </p>
+
+<p class="highlight"><b>Define the URI of the vocabulary.</b><br/>
+	<i>What it means:</i> The URI that identifies your vocabulary must be defined. This is strongly related to the Best Practices described in section URI Construction.<br/><br/>
+	For example: If we are minting new vocabulary terms for a particular government, we should define the URI of that particular vocabulary.
+</p>
+
+<p class="highlight"><b>Vocabularies should be self-descriptive</b><br/>
+	<i>What it means:</i> Each property or term in a vocabulary should have a Label, Definition and Comment defined.
+	Self-describing data suggests that information about the encodings used for each representation is provided explicitly within the representation. The ability for Linked Data to describe itself, to place itself in context, contributes to the usefulness of the underlying data.<br/><br/>
+For example, the popular DCMI Metadata Terms vocabulary has a term named <a href="http://dublincore.org/documents/dcmi-terms/#terms-contributor" target="_blank">Contributor</a>, which has:<br/>
+	  Label: Contributor<br/>
+	  Definition: An entity responsible for making contributions to the resource<br/>
+	  Comment: Examples of a Contributor include a person, an organization, or a service.<br/>
+</p>
+
+<p class="highlight"><b>Vocabularies should be described in more than one language</b><br/>
+	<i>What it means:</i> The vocabulary should support multilingualism, i.e., all the elements of the vocabulary should have labels, definitions and comments available in the government's official language, e.g., Spanish, and at least in English.
+	It is also important that each label and comment carries the appropriate language tag, so that the documentation is unambiguous.<br/><br/>
+For example, for the same term <a href="http://dublincore.org/documents/dcmi-terms/#terms-contributor" target="_blank">Contributor</a><br/>
+	  rdfs:label "Contributor"@en, "Colaborador"@es<br/>
+	  rdfs:comment "Examples of a Contributor include a person, an organization, or a service"@en , "Ejemplos de colaborador incluyen persona, organización o servicio"@es<br/>
+</p>
+
+<p class="highlight"><b>Vocabularies should provide a versioning policy</b><br/>
+	<i>What it means:</i> This refers to the mechanism the publisher puts in place to preserve backward compatibility between versions and to record how changes affect previous versions.
+	Major changes to a vocabulary should be reflected in its documentation, in both machine-readable and human-readable formats. This is strongly related to the best practices described in the Versioning section.
+</p>
+
+<p class="highlight"><b>Vocabularies should provide documentation</b><br/>
+	<i>What it means:</i> A vocabulary should be well documented for machines (use of labels and comments, with tags for the language used).
+	For human readers, the publisher should provide additional documentation that explains the classes and properties, if possible with some representative use cases.
+</p>
+
+<p class="highlight"><b>Vocabulary should be published following available best practices</b><br/>
+	<i>What it means:</i> One of the goals is to contribute to the community by sharing the new vocabulary. To this end, it is recommended to follow available recipes for publishing RDF vocabularies, e.g., <a href="http://www.w3.org/TR/swbp-vocab-pub/" target="_blank">Best Practice Recipes for Publishing RDF Vocabularies</a>.	
+</p>
+
+<section> <!-- << Multilingualism in vocabs -->
+	<h4>Multilingualism in vocabs</h4>
+<p>
+This section provides some considerations for dealing with multilingualism in vocabularies. Multilingualism in vocabularies can currently be found in the following forms:
+</p>
+<ul>
+	<li>As a set of rdfs:label values in which the language has been restricted (@en, @fr...). Currently, this is the most commonly used approach. It is also a best practice to always include an rdfs:label for which the language tag is not indicated. This label corresponds to the "default" language of the vocabulary.</li>
+	<li>As skos:prefLabel (or skosxl:Label), in which the language has also been restricted.</li>
+	<li>As a set of monolingual ontologies (ontologies in which labels are expressed in one natural language) in the same domain, mapped or aligned to each other (see the example of EuroWordNet, in which wordnets in different natural languages are mapped to each other through the so-called ILI, or inter-lingual-index, which consists of a set of concepts common to all categorizations).</li>
+	<li>As an ontology plus a lexicon. This represents the latest trend in the representation of linguistic (multilingual) information associated with ontologies. The idea is that the ontology is associated with an external ontology of linguistic descriptions. One of the best examples of this approach is the lemon model <a href="http://tia2011.crim.fr/Workshop-Proceedings/pdf/TIAW15.pdf" target="_blank">REF1</a>, <a href="http://lexinfo.net/" target="_blank">REF2</a>, an ontology of linguistic descriptions that is related to the concepts and properties in an ontology to provide lexical, terminological, morphosyntactic, etc., information. One of the main advantages of this approach is that semantic and linguistic information are kept separate. One can link several lemon models in different natural languages to the same ontology.</li>
+</ul>
+<p>The current trend is to follow the first approach, i.e., to use rdfs:label and rdfs:comment for each term in the vocabulary.</p>
+	
+</section> <!-- Multilingualism in vocabs >> -->
+
 </section> <!-- Vocabulary management/creation >> -->
 
 <!-- Editorial notes for creators/maintainers:
@@ -308,7 +400,7 @@
 Partial or full deprecation
 Cross-cutting issues: "Hit-by-bus" -->
 
-
+<p class="todo">Add references</p>
 </section>
 <!--  VOCABULARY SELECTION >>  -->
 
--- a/bp/local-style.css	Thu Feb 23 19:50:52 2012 +0100
+++ b/bp/local-style.css	Fri Mar 02 19:17:41 2012 +0100
@@ -61,7 +61,7 @@
 
 .highlight {
 border: 3px solid #005a9c;
-margin: 5 5 5 20px;
+margin: 5px 25px 0 25px;
 padding: 10px;
 }