--- a/dc-note/dc-note.html Wed Feb 06 21:31:49 2013 +0100
+++ b/dc-note/dc-note.html Thu Feb 07 01:50:27 2013 +0100
@@ -800,56 +800,60 @@
</p>-->
</section>
<section id="entities-in-dublin-core">
- <h3>What is ex:doc1? Entities in Dublin Core</h3>
+ <h3>Entities in Dublin Core</h3>
<p>
- Consider the example metadata record below (in <a href="#example1">example 1</a>). As a <code>dc</code>
- metadata record describes the resulting document as a whole,
+ Consider the example metadata record below (<a href="#example1">Example 1</a>), where a document (<code>ex:doc1</code>) is described
+ with several <code>dct</code> statements:
+ </p>
+ <p>
+ <a href="#example1">Example 1</a>: a simple metadata record:
+ <pre class="example" id="example1">
+ ex:doc1 dct:title "A mapping from Dublin Core..." ;
+ dct:creator ex:kai, ex:daniel, ex:simon, ex:michael ;
+ dct:created "2012-02-28" ;
+ dct:publisher ex:w3c ;
+ dct:issued "2012-02-29" ;
+ dct:subject ex:dublincore ;
+ dct:replaces ex:doc2 ;
+ dct:format "HTML" .
+ </pre>
+ In <a href="#example1">Example 1</a>, <code>dct:title</code>, <code>dct:subject</code> and <code>dct:format</code>
+ are descriptions of the resource <code>ex:doc1</code>.
+ They do not provide any information on how the resource was created or modified in the past.
+ On the other hand, some statements imply provenance-related information. For example <code>dct:creator</code>
+ implies that the document has been created and refers to an author. Similarly, the existence
+ of the <code>dct:issued</code> date implies that the document has been published. This information is redundantly
+ implied by the <code>dct:publisher</code> statement as well. Finally, <code>dct:replaces</code> relates
+ the document to another document <code>ex:doc2</code> which probably had some kind of influence on <code>ex:doc1</code>.
+ </p>
+ As a <code>dc</code>
+ metadata record describes the document as a whole,
it is not clear how this document relates to the different states that the document had until it reached its final state.
For example, a document may have a <code>dct:created</code> date and a <code>dct:issued</code> date. According to
the PROV ontology, the activity of issuing a document involves two different states of the document: the document before it was issued
- and the issued document. Each of these states correspond to a different specialization of the document, even if the document
+ and the issued document. Each of these states corresponds to a different <code>prov:specialization</code> of the document, even if the document
has not changed. Generally, there are two approaches to deal with this issue:</p>
- </p>
- --------------->TO DEAL WITH THIS
+ </p>
<p>
- An example of a simple metadata record annotated with <code>dct</code> terms can be seen below:
- </p><p>
- <a href="#example1">Example 1</a>: a simple metadata record:
- <pre class="example" id="example1">
-ex:doc1 dct:title "A mapping from Dublin Core..." ;
- dct:creator ex:kai, ex:daniel, ex:simon, ex:michael ;
- dct:created "2012-02-28" ;
- dct:publisher ex:w3c ;
- dct:issued "2012-02-29" ;
- dct:subject ex:dublincore ;
- dct:replaces ex:doc2 ;
- dct:format "HTML" .
- </pre>
- In <a href="#example1">Example 1</a>, <code>dct:title</code>, <code>dct:subject</code> and <code>dct:format</code>
- are descriptions of the resource <code>ex:doc1</code>.
- They do not provide any information on how the resource was created or modified in the past.
- On the other hand, some statements imply provenance-related information. For example <code>dct:creator</code>
- implies that the document has been created and refers to an author. Similarly, the existence
- of the <code>dct:issued</code> date implies that the document has been published. This information is redundantly
- implied by the <code>dct:publisher</code> statement as well. Finally, <code>dct:replaces</code> relates
- the document to another document <code>ex:doc2</code> which probably had some kind of influence on <code>ex:doc1</code>.
- </p>
-
- <p>
- 1) To create new instances of entities, typically as blank nodes, that are all related to the original
- document by means of <code>prov:specializationOf</code>. This leads to bloated and not very intuitive data models, e.g. think
- about the translation of a single <code>dct:publisher</code> statement, where anyone would expect to somehow find some activity and
- agent that are directly related to the document (as in <a href="#figure_mapping_example">Figure 1</a>).
+ 1) Create new instances of entities, typically as blank nodes, that are all related to the original
+ document by means of <code>prov:specializationOf</code>. This leads to bloated and not very intuitive data models. For example, consider
+ the translation of a single <code>dct:publisher</code> statement (as shown at the top of <a href="#figure_mapping_example">Figure 1</a>):
+ having a publisher implies a "Publish" activity (represented with a blank node), which is related to the <code>ex:publisher</code> agent. The
+ activity must have taken as input the document to be published (<code>:_usedEntity</code>, which is a <code>prov:specializationOf</code> the
+ resource we are describing), and generated the published resource (<code>:_resultingEntity</code>). Since we cannot ensure that the published
+ resource has not undergone any further modifications, <code>:_resultingEntity</code> is also a <code>prov:specializationOf</code> the resource
+ <code>ex:doc1</code>.
</p><p>
<div id = "figure_mapping_example" class="figure" style="text-align: center;">
<img src="img/example1.png"></img>
<div style="text-align: center;">
- <a href="#figure_mapping_example">Figure 1</a>. A mapping example creating blank nodes for each state of the resource. In PROV entities are represented
- with ellipses, activities with rectangles and agents with pentagons.
+ <a href="#figure_mapping_example">Figure 1</a>. A mapping example creating blank nodes for each state of the resource. PROV entities are represented
+ with ellipses, activities with rectangles and agents with pentagons. The bold arrow indicates how the DC statement (at the top of the figure) would be converted
+ to PROV (the graph at the bottom).
</div>
</div>
</p><p>
- 2) To adopt the original resource (<code>ex:doc1</code>) as the <code>prov:Entity</code> used and then generated by the PublicationActivity
+ 2) Adopt the original resource (<code>ex:doc1</code>) as the <code>prov:Entity</code> used and then generated by the PublicationActivity
(<code>:_activity</code>). However, this representation leads to a misinterpretation of the <code>dct</code> statement, as shown in the example of
<a href="#figure_mapping_example_conflating">Figure 2</a>. The representation implies that <code>ex:doc1</code>
was generated by <code>_:activity</code> and then used by <code>_:activity</code> afterwards, instead of being used and then being generated by
@@ -872,14 +876,23 @@
<section>
<h2>Mapping from Dublin Core to PROV</h2>
- --->INTRODUCE THE MAPPING HERE<---
- Substantially, the mapping consists of three parts:
-
+ <p>
+ This section describes the mapping between Dublin Core and PROV. The mapping is divided into several subsections:
+ <ul>
+ <li> <a href="#direct-mappings">Section 3.1</a>: Direct mappings between Dublin Core and PROV.</li>
+ <li> <a href="#prov-refinements">Section 3.2</a>: Extension of PROV classes and properties to represent DC activities.</li>
+ <li> <a href="#complex-mappings">Section 3.3</a>: Complex mappings between Dublin Core and PROV (inferring activities using blank nodes).</li>
+ <li> <a href="#cleanup">Section 3.4</a>: Strategies for cleaning up some of the blank nodes of <a href="#complex-mappings">Section 3.3</a>.</li>
+ <li> <a href="#list-of-terms-excluded-from-the-mapping">Section 3.5</a>: Rationale for the terms excluded from the mapping.</li>
+ <li> <a href="#mapping-from-prov-to-dc">Section 3.6</a>: Mapping PROV to Dublin Core (out of the scope of this mapping).</li>
+ </ul>
+ </p>
<section>
<h3></span>Direct mappings</h3>
<p>
- The direct mappings provide basic interoperability using the integration mechanisms of RDF. By means
- of OWL 2 RL reasoning, any PROV application can at least make some sense from Dublin Core data. The direct mappings also
+ The direct mappings relate the <code>dct</code> terms to the PROV binary relationships by using the integration mechanisms of RDF.
+ PROV applications will be able to interoperate with these <code>dct</code> statements by applying OWL 2 RL reasoning (i.e.,
+ they will be able to understand DC statements). The direct mappings also
contribute to the formal definition of the vocabularies by translating them to PROV.</p>
<p>Dublin Core, while less complex from a modeling perspective,
is more specific about the type of the activity taking place. PROV provides general attribution, and
@@ -1028,16 +1041,12 @@
</tbody>
</table>
</div>
- With the direct mapping, a metadata record such as <a href="#example1">example 1</a> will infer that
- the resource was <code>prov:generatedAtTime</code> at two different
-times. Although this may seem inconsistent, it is supported by PROV and
-it is due to the difference
- between Dublin Core and PROV resources: while the former conflates
-more than one version or "state" of the resource in a single entity, the
- latter
- proposes to separate all of them. Thus, the mapping produces
-provenance that complies with the current definition of entity but
- it does not comply with all the PROV constraints [<cite><a href="#bib-PROV-CONSTRAINTS" class="bibref">PROV-CONSTRAINTS</a></cite>].
+ Note that applying the direct mappings to a metadata record such as <a href="#example1">Example 1</a> leads to the inference that
+ the resource was <code>prov:generatedAtTime</code> at two different times. Although this may seem inconsistent, it is supported by PROV and
+ it is due to the difference between Dublin Core and PROV resources: while the former conflates more than one version or "state" of
+ the resource in a single entity, the latter proposes to separate all of them. Thus, the mapping produces provenance that complies with the
+ current definition of entity but it does not comply with all the PROV
+ constraints [<cite><a href="#bib-PROV-CONSTRAINTS" class="bibref">PROV-CONSTRAINTS</a></cite>].
<p></p>
<p>
Some properties have been found to be superproperties of certain prov concepts. These can be seen below in <a href="#list_of_direct_mappings2">Table 4</a>:
@@ -1045,7 +1054,9 @@
<pre rel="prov:wasQuotedFrom" resource="http://dvcs.w3.org/hg/prov/raw-file/tip/examples/eg-24-prov-o-html-examples/rdf/create/rdf/property_qualifiedAttribution.ttl"
-->
</p>
-
+ <div class="note">
+ <p>Some of the terms in the list are still under discussion: <code>dct:isVersionOf</code>.</p>
+ </div>
<div id="list_of_direct_mappings2">
<table>
<caption> <a href="#list_of_direct_mappings2"> Table 4:</a> Direct mappings (2) </caption>
@@ -1083,9 +1094,9 @@
<caption> <a href="#list_of_direct_mappings_no_prov_core"> Table 5:</a> Direct mappings to the PROV terms not included in the core </caption>
<tbody>
<tr>
+ <th>DC Term</th>
+ <th>Relation</th>
<th>PROV Term</th>
- <th>Relation</th>
- <th>DC Term</th>
<th>Rationale</th>
</tr>
<tr>
@@ -1179,28 +1190,28 @@
To properly reflect the meaning of the Dublin Core terms, more specific subclasses are needed:
</p><p>
<pre class="code">
- prov:PublicationActivity rdfs:subClassOf prov:Activity .
- prov:ContributionActivity rdfs:subClassOf prov:Activity .
- prov:CreationActivity rdfs:subClassOf prov:Activity, prov:ContributionActivity .
- prov:ModificationActivity rdfs:subClassOf prov:Activity .
- prov:AcceptanceActivity rdfs:subClassOf prov:Activity .
- prov:CopyrightingActivity rdfs:subClassOf prov:Activity .
- prov:SubmissionActivity rdfs:subClassOf prov:Activity .
- prov:PublisherRole rdfs:subClassOf prov:Role .
- prov:ContributorRole rdfs:subClassOf prov:Role .
- prov:CreatorRole rdfs:subClassOf prov:Role, prov:ContributorRole .
+ prov:Publish rdfs:subClassOf prov:Activity .
+ prov:Contribute rdfs:subClassOf prov:Activity .
+ prov:Create rdfs:subClassOf prov:Activity, prov:Contribute .
+ prov:Modify rdfs:subClassOf prov:Activity .
+ prov:Accept rdfs:subClassOf prov:Activity .
+ prov:Copyright rdfs:subClassOf prov:Activity .
+ prov:Submit rdfs:subClassOf prov:Activity .
+ prov:Publisher rdfs:subClassOf prov:Role .
+ prov:Contributor rdfs:subClassOf prov:Role .
+ prov:Creator rdfs:subClassOf prov:Role, prov:Contributor .
</pre>
</p>
<p>
- Custom refinements of the properties should be omitted as they would be identical to the Dublin Core terms. If these more
- specific properties are needed, the Dublin Core terms should be used directly, according to the direct mappings presented in section 2.3.
+ Custom refinements of the properties have been omitted, as they would be identical to the Dublin Core terms. If these more
+ specific properties are needed, the Dublin Core terms can be used directly, according to the direct mappings presented in section 2.3.
</p>
</section>
<section>
<h3>Complex Mappings</h3>
<p>
- The complex mappings consist on a set of patterns defined to generate qualified PROV statements from Dublin Core statements. This type of qualification may not be
+ The complex mappings consist of a set of patterns defined to generate qualified PROV statements from Dublin Core statements. This type of qualification may not be
always needed, and it is the choice of the implementor whether to use them or not depending on the use case. It is also important to note that not all the
direct mappings have a complex mapping associated, just those which imply a specific activity: creation, publication, etc.
The complex mappings are provided in form of SPARQL CONSTRUCT queries, i.e., queries that describe a
@@ -1562,7 +1573,7 @@
<div id = "figure_cleanup2" class="figure" style="text-align: center;">
<img src="img/cleanup2.png"></img>
<div style="text-align: center;">
- <a href="#figure_cleanup2">Figure 4</a>. Sorting the activities by date to conflate blank nodes.
+ <a href="#figure_cleanup2">Figure 4</a>. Sorting the activities by date to conflate blank nodes. The creation activity occurs before the publishing activity.
</div>
</div>
</p>