Craig M Trim's feedback
author dgarijo
Mon, 26 Nov 2012 13:00:38 +0100
changeset 5057 72a6dd0990e1
parent 5047 a2f836dc5c70
child 5058 01ec52b5d42c
dc-note/Overview.html
--- a/dc-note/Overview.html	Mon Nov 26 07:32:44 2012 +0000
+++ b/dc-note/Overview.html	Mon Nov 26 13:00:38 2012 +0100
@@ -517,15 +517,16 @@
  <img width="72" height="48" src="http://www.w3.org/Icons/w3c_home" alt="W3C" /></a></p>
  <h1 class="title" id="title">Dublin Core to PROV Mapping</h1><h2 id="w3c-working-draft-19-august-2012">
  <acronym title="World Wide Web Consortium">W3C</acronym> Editor's Draft 28 October 2012</h2>
- <dl><dt>This version:</dt><dd><a href="http://dvcs.w3.org/hg/prov/raw-file/38ace4244610/dc-note/Overview.html">http://dvcs.w3.org/hg/prov/raw-file/38ace4244610/dc-note/Overview.html</a></dd>
- <dt>Latest published version:</dt><dd><a href="http://dvcs.w3.org/hg/prov/raw-file/38ace4244610/dc-note/Overview.html">http://dvcs.w3.org/hg/prov/raw-file/38ace4244610/dc-note/Overview.html</a></dd>
+ <dl><dt>This version:</dt><dd><a href="http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html">http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html</a></dd>
+ <dt>Latest published version:</dt><dd><a href="http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html">http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html</a></dd>
  <dt>Latest editor's draft:</dt><dd><a href="http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html">http://dvcs.w3.org/hg/prov/raw-file/default/dc-note/Overview.html</a></dd>
  <dt>Editors:</dt> 
-<dd><a href="http://www.kaiec.org/">Kai Eckert</a>, Manheim University Library, Germany</dd>
-<dd><a href="http://www.oeg-upm.net/index.php/en/phdstudents/28-dgarijo">Daniel Garijo</a></span>, Universidad Politécnica de Madrid, Spain</dd>
+<dd><span><a href="http://www.kaiec.org/">Kai Eckert</a></span>, Mannheim University Library, Germany</dd>
+<dd><span><a href="http://www.oeg-upm.net/index.php/en/phdstudents/28-dgarijo">Daniel Garijo</a></span>, Universidad Politécnica de Madrid, Spain</dd>
 <dt>Contributors:</dt>
-<dd><a href="http://www.inf.kcl.ac.uk/staff/simonm">Simon Miles</a>, King's College London, UK</dd>
-<dd><span>Michael Panzer </span>OCLC Online Computer Library center, USA</dd>
+<dd><span><a href="http://www.inf.kcl.ac.uk/staff/simonm">Simon Miles</a></span>, King's College London, UK</dd>
+<dd><span>Craig M. Trim</span>, IBM, USA</dd>
+<dd><span>Michael Panzer</span>, OCLC Online Computer Library Center, USA</dd>
 </dl><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2012 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.eu/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. <acronym title="World Wide Web Consortium">W3C</acronym> <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p><hr /></div>
   <div id="abstract" class="introductory section"><h2>Abstract</h2>
    <p>
@@ -628,16 +629,24 @@
 <h2><span class="secno">1. </span>Introduction</h2>
    <p>
     The Dublin Core Metadata Initiative (DCMI) [<a href="#bib-DCMI">DCMI</a>] provides a core metadata vocabulary,
-	commonly referred to as Dublin Core. The original element set consisted of 15 elements that are still available nowadays.
-	These elements are defined very broadly, in particular they have no
-	range specification, i.e., they can be used with arbitrary values as objects. The elements have been further
-	refined and new types have been introduced. This more specific vocabulary is called the terms and currently consists
-	of 55 properties [<a href="#bib-DCTERMS">DCTERMS</a>].
+	commonly referred to as Dublin Core. The original element set, from 1995, contains 15 broadly defined elements that are still in use.
+	The core elements have no range specification, so arbitrary values can be used as objects. The vocabulary has since been 
+	expanded beyond the original fifteen elements: existing elements have been refined and new ones have been added. This expanded vocabulary is 
+	referred to as "DCMI Terms" and currently consists of 55 properties [<a href="#bib-DCTERMS">DCTERMS</a>].
 	</p>
-The Dublin Core elements are considered legacy and the use of the DCMI terms is preferred. They have different namespaces;
- if abbreviated, the elements are usually used with the <code>dc</code> prefix, while <code>dct</code> or <code>dcterms</code> 
- prefix is used for the terms.
-
+<p>The use of DCMI terms is preferred, and the original Dublin Core element set has been deprecated. 
+The two sets have different namespaces. The original element set is typically referred to with the 
+<code>dc</code> prefix, while <code>dct</code> (or <code>dcterms</code>) is used as the prefix for the newer DCMI terms. 
+</p>
+<p>
+DCMI terms hold a lot of provenance information: they tell us <i>when</i> a resource was affected in the past, 
+<i>who</i> affected it and <i>how</i> it was affected. The remaining DCMI terms (description metadata) tell us <i>what</i> was affected. 
+There is no direct information in Dublin Core describing <i>where</i> a resource was affected. Such information is usually 
+only available for the publication of a resource (i.e., an action located at the address of the publisher). <!--Note that 
+spatial is not related to this question, as it is a descriptive property that links a resource to the location 
+referred to in its content, but not to the location where it was created, modified, issued or published. -->
+</p>
+<!--
 Consider the following example for a metadata record:
 </p><p>
 <a href="#example1">Example 1</a>: a simple metadata record:
@@ -651,6 +660,7 @@
     dct:replaces ex:doc2 ;
     dct:format "HTML" .
 </pre>
+
 <p>
 Clearly not all metadata statements deal with provenance. 
  <code>dct:title</code>, <code>dct:subject</code> and <code>dct:format</code> are descriptions of the resource <code>ex:doc1</code>. 
@@ -680,30 +690,35 @@
 <b>Provenance metadata</b>: available, contributor, created, creator, date, dateAccepted, dateCopyrighted,
  dateSubmitted, hasFormat, hasVersion, isFormatOf, isReferencedBy, isReplacedBy, issued, isVersionOf, license, modified,
  provenance, publisher, references, replaces, rightsHolder, rights, source, valid.
-</p><p>
-This is a conservative classification of provenance metadata. It can be argued that other elements contain 
-provenance information as well, depending on their usage in a concrete implementation or application.
+</p>
+-->
+<p>
+A classification of the <code>dct</code> terms is provided in <a href="#categories">Table 1</a>. This classification is by necessity
+somewhat conservative, as it can be argued that terms classified as description metadata contain 
+provenance information as well, depending on their usage. Under this classification, 25 of the 55 terms can be considered 
+provenance related. These terms can be further categorized according to the question they answer regarding the 
+provenance of a resource:
 </p><p>
-According to the proposed classification, there are 25 terms out of 55 that can be considered as provenance related.
- The terms can further be categorized according to the question they answer regarding the provenance of a resource:
-</p><p>
-<b>Who?</b> (contributor, creator, publisher, rightsHolder): Category that includes all properties that have <code>dct:Agent</code> as range,
- i.e., a resource that acts or has the power to act. The contributor, creator, and publisher clearly influence
- the resource and therefore are important for its origin. This is not immediately clear for the <code>dct:rightsHolder</code>,
- but as ownership is considered the important provenance information for many resources, like artworks, it is included in this category.
-</p><p>
-<b>When?</b> (available, created, date, dateAccepted, dateCopyrighted, dateSubmitted, issued, modified, valid):
+
+<b>Date and Time Terms (When?):</b> This category contains date- and time-related terms.
  Dates typically belong to the provenance record of a resource. It can be questioned whether a resource changes by
  being published or not. Depending on the application, however, the publication can be seen as an action that changes 
  the state of the resource. Two dates can be considered special regarding their relevance for
- provenance: available and valid. They are different from the other dates as by definition they can represent a
+ provenance: <code>dct:available</code> and <code>dct:valid</code>. They are different from the other dates as by definition they can represent a
  date range. Often, the range of availability or validity of a resource is inherent to the resource and known
  beforehand – consider the validity of a passport or the availability of a limited special offer published on the web.
  In these cases, there is no action involved that makes the resource invalid or unavailable, it is simply determined
  by the validity range. On the other hand, if an action is involved, e.g., a resource is declared invalid because
  a mistake has been found, then it is relevant for its provenance.
+ </p><p>
+ 
+<b>Agency Terms (Who?):</b> This category contains agent-related terms: all properties that have <code>dct:Agent</code> as range,
+ i.e., a resource that acts or has the power to act. The <code>dct:contributor</code>, <code>dct:creator</code>, 
+ and <code>dct:publisher</code> clearly influence
+ the resource and therefore are important for its origin. This is not immediately clear for the <code>dct:rightsHolder</code>,
+ but as ownership is considered important provenance information for many resources, like artworks, it is included in this category.
 </p><p>
-<b>How?</b> (isVersionOf, hasVersion, isFormatOf, hasFormat, references, isReferencedBy, replaces, isReplacedBy, source, rights, license):
+<b>Derivation Terms (How?):</b> This category contains derivation-related terms.  
  Resources are often derived from other resources. In this case, the original resource becomes part of the provenance
  record of the derived resource. Derivations can be further classified as <code>dct:isVersionOf, dct:isFormatOf, dct:replaces, dct:source</code>.
   <code>dct:references</code> is a weaker relation, but it can be assumed that a referenced resource influenced the described resource
@@ -714,9 +729,9 @@
  between the resources involved. Finally, licensing and rights are considered part of the provenance of the resource as well, 
  since they restrict how the resource has been used by its owners.
 </p>
-<p>
+<!--<p>
 <a href="#categories">Table 1</a> summarizes the terms in their respective categories:
-</p>
+</p>-->
 <div id="categories" ALIGN="center">
  <table>
 	<caption> <a href="#categories"> Table 1:</a> Categorization of the Dublin Core Terms </caption>
@@ -769,7 +784,7 @@
  this definition may overlap partially with almost half of the DCMI terms, which
 specify concrete aspects of provenance of a resource.
 </p><p>
-In summary, the DCMI terms – and therefore any Dublin Core metadata record – hold a lot of provenance information and
+<!--In summary, the DCMI terms – and therefore any Dublin Core metadata record – hold a lot of provenance information and
  tell us about a resource, <i>when</i> it was affected in the past, <i>who</i> affected it and <i>how</i> it was affected.
  The other DCMI terms (description metadata), tell us <i>what</i> was affected. There is 
  no direct information in Dublin Core describing <i>where</i> a resource was affected. Such information is usually only available for the
@@ -777,6 +792,28 @@
  to this question, as it is a descriptive property that links a resource to the location referred to in its content, but not to the
  location where it was created, modified, issued or published. <!--– or even that it has ever been or is otherwise related to Berlin. <!--And finally, the question,
  why a resource was affected, lacks – apart from subtle hints from terms like replaces – as usual a satisfying answer. -->
+ An example of a simple metadata record annotated with <code>dct</code> terms can be seen below:
+</p><p>
+<a href="#example1">Example 1</a>: a simple metadata record:
+<pre class="example" id="example1">
+ ex:doc1 dct:title "A mapping from Dublin Core..." ;
+    dct:creator ex:kai, ex:daniel, ex:simon, ex:michael ;
+    dct:created "2012-02-28" ;
+    dct:publisher ex:w3c ;
+    dct:issued "2012-02-29" ;
+    dct:subject ex:dublincore ;
+    dct:replaces ex:doc2 ;
+    dct:format "HTML" .
+</pre>
+In <a href="#example1">Example 1</a>, <code>dct:title</code>, <code>dct:subject</code> and <code>dct:format</code> 
+are descriptions of the resource <code>ex:doc1</code>. 
+They do not provide any information on how the resource was created or modified in the past.
+ On the other hand, some statements imply provenance-related information. For example, <code>dct:creator</code> 
+ implies that the document has been created and refers to an author. Similarly, the existence 
+ of the <code>dct:issued</code> date implies that the document has been published. This information is redundantly 
+ implied by the <code>dct:publisher</code> statement as well. Finally, <code>dct:replaces</code> relates 
+ the document to another document <code>ex:doc2</code>, which probably had
+ some kind of influence on <code>ex:doc1</code>.
 </p>
 <h3 id ="namespaces">1.1 Namespaces</h3> 
 <p>The namespaces used through the document can be seen in <a href="#ns"> Table 2</a> below:
@@ -797,10 +834,10 @@
 <h2>2. Mapping from Dublin Core to PROV</h2>
 <p>A mapping between Dublin Core Terms and PROV-O has many advantages. First, it can provide valuable insights
  into the different characteristics of both data models (in particular it explains PROV from a Dublin Core point of view).
- Second, such a mapping can be used to extract PROV data from the huge amount of Dublin Core data that is available on 
- the Web today. Third, it can translate PROV data to Dublin Core and make it accessible for applications that
- understand Dublin Core. Last, but not least, it can lower the barrier to adopt PROV, as simple Dublin Core statements can be
- used as starting point to generate PROV data. </p>
+ Second, such a mapping can be used to extract PROV data from the large amount of Dublin Core data available on 
+ the Web today. Third, the mapping can translate PROV data to Dublin Core and make it accessible for applications that
+ understand Dublin Core. Finally, the mapping can lower the barrier to entry for PROV adoption: simple Dublin Core 
+ statements can be used as a starting point for PROV data generation. </p>
 <div id="basic" class="section">
 <h3>2.1 Basic considerations </h3>
 <p>