updated derivation, simplified derivation record, prov:steps can be single/any
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Mon, 16 Jan 2012 13:32:00 +0000
changeset 1360 e7cc8d306e38
parent 1359 51dd8481a91b
child 1361 2e47d8728ead
model/ProvenanceModel.html
--- a/model/ProvenanceModel.html	Fri Jan 13 13:23:47 2012 -0500
+++ b/model/ProvenanceModel.html	Mon Jan 16 13:32:00 2012 +0000
@@ -1858,29 +1858,36 @@
 -->
 
 
-<p>Hence, given a precision axis, with values <em>precise</em> and <em>imprecise</em>, and an activity axis, with values  <em>one activity</em> and <em>n activities</em>, we can then form a matrix of possible derivations, precise or imprecise, or corresponding to one activity or  n activities.
-Out of the four possibilities, PROV-DM offers three forms of derivation, while the  fourth one is not meaningful.  The following table summarises names for the three kinds of derivation, which we then explain.</p>
+<p>Hence, we can consider two axes.  An activity number axis has values  <em>single</em>, <em>multiple</em>,  and <em>unknown</em>, respectively representing the cases where one activity is known to have occurred, more than one activity is known to have occurred, or an unknown number of activities have occurred. Likewise, another axis covers the other details (identities, generation and usage records, attributes), with values <em>asserted</em> and <em>not asserted</em>. We can then form a matrix of possible derivations. Out of the six possibilities, 
+PROV-DM offers three forms of derivation records to cater for five of them, while the remaining one is not meaningful.  The following table summarises names for the three kinds of derivation, which we then explain.</p>
 
 <div style="text-align: center;">
 <table border="1" style="margin-left: auto; margin-right: auto;">
 <caption>PROV-DM Derivation Type Summary</caption>
-<tr><td colspan=2 rowspan=2></td><td colspan=2>precision axis</td></tr>
-<tr><td>precise</td><td>imprecise</td></tr> 
-<tr><td rowspan=2>activity<br>axis</td><td>one activity</td><td><a>precise-1 derivation record</a></td><td><a>imprecise-1 derivation record</a></td></tr> 
-<tr><td>n activities</td><td>---</td><td><a>imprecise-n derivation record</a></td></tr> 
+<tr><td colspan=2 rowspan=2></td><td colspan=2><em>other details</em> axis</td></tr>
+<tr><td>asserted</td><td>not asserted</td></tr> 
+<tr><td rowspan=3><em>activity number</em><br>axis</td><td>single</td><td><a>precise-1 derivation record</a></td><td><a>imprecise-1 derivation record</a></td></tr> 
+<tr><td>multiple</td><td><a>imprecise-n derivation record</a></td><td rowspan=2><a>imprecise-n derivation record</a></td></tr> 
+<tr><td>unknown</td><td>&mdash;</td></tr> 
 </table>
 </div>
 
 <ul>
 <li> The asserter asserts that derivation is due to exactly one activity, and all the details are asserted. We call this a precise-1 derivation record.</li>
 <li> The asserter asserts that derivation is due to exactly one activity, but other details,  whether known or unknown, are not asserted. We call this an imprecise-1 derivation record.</li>
-<li> The asserter does not know how many activities are involved in the derivation, and other details, whether known or unknown, are also not asserted. We call this an imprecise-n derivation record.</li>
+<li> The following cases are captured by an imprecise-n derivation record.
+<ul>
+<li> The asserter knows that multiple activities are involved, or does not know the number of activities involved in the derivation, and  other details are not asserted. </li>
+<li> The asserter knows that multiple activities are involved in the derivation, and all their details are asserted. In this case,  these activities are connected by means of generated and used intermediary entities.  Despite all activities and details being known, there is no guarantee that any of these activities plays an active role in the derivation; hence, this case is also regarded as imprecise. Instead, precise derivations need to be expressed between these intermediary entities.  </li>
 </ul>
-
-<p> We note that the fourth theoretical case of a precise derivation, where the number of activities is not known or asserted cannot occur. </p>
-
-<p>In order to represent the number of activities in a derivation, we introduce a PROV-DM attribute <span class="name">steps</span>, which can take two possible values.  
-When <span class="name">prov:steps="1"</span>, derivation is due to one activity; when <span class="name">prov:steps="n"</span>, the number of activities is not known.</p>
+</ul>
+
+<p> We note that the remaining theoretical case cannot occur, since
+  asserting the details of an unknown number of activities is a contradiction.
+</p>
+
+<p>In order to represent the number of activities in a derivation, we introduce a PROV-DM attribute <span class="name">steps</span>, which can take two possible values:   <span class="name">single</span> and <span class="name">any</span>.
+When <span class="name">prov:steps="single"</span>, derivation is due to one activity; when <span class="name">prov:steps="any"</span>, the number of activities is multiple or not known.</p>
 
 
 <p>The three kinds of derivation records are successively introduced.  Making use of the attribute <span class="name">steps</span>, we can distinguish the various derivation types.</p>
@@ -1893,7 +1900,7 @@
 <li><em>activity</em>: an identifier <span class="name">a</span> of an activity record, which is a representation of the activity using and generating the above entities;</li>
 <li><em>generation</em>: an identifier  <span class="name">g2</span> of the generation record pertaining to <span class="name">e2</span> and <span class="name">a</span>;</li> 
 <li><em>usage</em>: an identifier  <span class="name">u1</span> of the usage record pertaining to <span class="name">e1</span> and <span class="name">a</span>.</li> 
-<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation, optionally including the attribute-value pair <span class="name">prov:steps="1"</span>.</li>
+<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation, optionally including the attribute-value pair <span class="name">prov:steps="single"</span>.</li>
 </ul>
 <p>It is OPTIONAL to include  the attribute <span class="name">prov:steps</span> in a precise-1 derivation since the record already refers to the one and only one activity underpinning the derivation.</p>
 
@@ -1903,7 +1910,7 @@
 <li><em>id</em>:  an OPTIONAL identifier  <span class="name">id</span> identifying the derivation record;</li> 
 <li><em>generatedEntity</em>: the identifier <span class="name">e2</span> of  an entity record, which is a representation of the generated entity;</li>
 <li><em>usedEntity</em>: the identifier <span class="name">e1</span> of  an entity record, which is a representation of the used entity.</li>
-<li><em>attributes</em>: a set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation; it MUST include the attribute-value pair <span class="name">prov:steps="1"</span>.</li>
+<li><em>attributes</em>: a set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation; it MUST include the attribute-value pair <span class="name">prov:steps="single"</span>.</li>
 </ul>
 <p>An imprecise-1 derivation MUST include the attribute <span class="name">prov:steps</span>,  since it is the only means to distinguish this record from an imprecise-n derivation record.</p>
 
@@ -1913,9 +1920,9 @@
 <li><em>id</em>:  an OPTIONAL identifier  <span class="name">id</span> identifying the derivation record;</li> 
 <li><em>generatedEntity</em>: the identifier <span class="name">e2</span> of  an entity record, which is a representation of the generated entity;</li>
 <li><em>usedEntity</em>: the identifier <span class="name">e1</span> of  an entity record, which is a representation of the used entity.</li>
-<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation; it optionally includes the attribute-value pair <span class="name">prov:steps="n"</span>.</li>
+<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">attrs</span> that describe the modalities of this derivation; it optionally includes the attribute-value pair <span class="name">prov:steps="any"</span>.</li>
 </ul>
-<p>It is OPTIONAL to include  the attribute <span class="name">prov:steps</span> in an imprecise-n derivation record. It defaults to <span class="name">prov:steps="n"</span>.</p> 
+<p>It is OPTIONAL to include  the attribute <span class="name">prov:steps</span> in an imprecise-n derivation record. It defaults to <span class="name">prov:steps="any"</span>.</p> 
 
 
 <p>None of the three kinds of derivation is defined to be transitive. Domain-specific specializations of these derivations may be defined in such a way that the transitivity property holds.</p>
@@ -1925,61 +1932,38 @@
 
 <div class='grammar'>
 <span class="nonterminal">derivationRecord</span>&nbsp;::= 
-<span class="nonterminal">precise-1-derivationRecord</span>
-| <span class="nonterminal">imprecise-1-derivationRecord</span>
-| <span class="nonterminal">imprecise-n-derivationRecord</span><br/>
-<br/>
-<span class="nonterminal">precise-1-derivationRecord</span>&nbsp;::=  
-<span class="name">wasDerivedFrom</span>
-<span class="name">(</span>
-<span class="optional"> <span class="nonterminal">identifier</span>,</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">aIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">gIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">uIdentifier</span>
-<span class="nonterminal">optional-attribute-values</span>
-<span class="name">)</span><br/>
-<span class="nonterminal">imprecise-1-derivationRecord</span>::=  
 <span class="name">wasDerivedFrom</span>
 <span class="name">(</span>
 <span class="optional"> <span class="nonterminal">identifier</span>,</span>
 <span class="nonterminal">eIdentifier</span>
 <span class="name">,</span>
 <span class="nonterminal">eIdentifier</span>
+<span class="optional">
 <span class="name">,</span>
-<span class="nonterminal">attribute-values</span>
-<span class="name">)</span>
-<br/>
-<span class="nonterminal">imprecise-n-derivationRecord</span>::=  
-<span class="name">wasDerivedFrom</span>
-<span class="name">(</span>
-<span class="optional"> <span class="nonterminal">identifier</span>,</span>
-<span class="nonterminal">eIdentifier</span>
+<span class="nonterminal">aIdentifier</span>
 <span class="name">,</span>
-<span class="nonterminal">eIdentifier</span>
+<span class="nonterminal">gIdentifier</span>
+<span class="name">,</span>
+<span class="nonterminal">uIdentifier</span>
+</span>
 <span class="nonterminal">optional-attribute-values</span>
 <span class="name">)</span>
 </div>
-<div class="note">
-The grammar should make it clear that attribute <span class="name">prov:steps="1"</span> is required for imprecise-1-derivationRecord.<br/>
-  PM: suggestion -- remove the distinction between imprecise-1 and imprecise-n in the grammar and instead explain that the qualification (1 vs n) is through attribute prov:steps.
-</div>
+<p>
+When the activity, generation and usage record identifiers are present, a derivation record is precise-1.  Otherwise, the distinction between imprecise-1 and imprecise-n is given by the 
+attribute <span class="name">prov:steps</span>.  
+</p>
 
 <div class="anexample">
 <p>The following assertions state the existence of derivations.</p>
 <pre class="codeexample">
-wasDerivedFrom(e5,e3,a4,g2,u2,[])
-wasDerivedFrom(e5,e3,a4,g2,u2,[prov:steps="1"])
-
-wasDerivedFrom(e3,e2,[prov:steps="1"])
+wasDerivedFrom(e5,e3,a4,g2,u2)
+wasDerivedFrom(e5,e3,a4,g2,u2,[prov:steps="single"])
+
+wasDerivedFrom(e3,e2,[prov:steps="single"])
 
 wasDerivedFrom(e2,e1,[])
-wasDerivedFrom(e2,e1,[prov:steps="n"])
+wasDerivedFrom(e2,e1,[prov:steps="any"])
 </pre>
 <p>
 The first two are precise-1 derivation records expressing that the activity represented by the activity <span class="name">a4</span>, by
@@ -1996,9 +1980,9 @@
 <p>An precise-1  derivation record is richer  than an imprecise-1 derivation record, itself, being more informative that an imprecise-n derivation record. Hence, the following implications hold.</p>
 <div class='inference' id='derivation-implications'>
 Given two entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, <span class='conditional'>if</span> the assertion  <span class="name">wasDerivedFrom(e2, e1, a, g2, u1, attrs)</span>
- holds for some generation record identified by <span class="name">g2</span>, and usage record identified by <span class="name">u1</span>, then <span class="name">wasDerivedFrom(e2,e1,[prov:steps="1"] &cup; attrs)</span> also holds.<br>
-
-Given two entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, <span class='conditional'>if</span> the assertion  <span class="name">wasDerivedFrom(e2, e1, [prov:steps="1"] &cup; attrs)</span>
+ holds for some generation record identified by <span class="name">g2</span>, and usage record identified by <span class="name">u1</span>, then <span class="name">wasDerivedFrom(e2,e1,[prov:steps="single"] &cup; attrs)</span> also holds.<br>
+
+Given two entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, <span class='conditional'>if</span> the assertion  <span class="name">wasDerivedFrom(e2, e1, [prov:steps="single"] &cup; attrs)</span>
  holds, then <span class="name">wasDerivedFrom(e2,e1,attrs)</span> also holds.<br>
  </div>
 
@@ -2014,7 +1998,7 @@
 First, we consider one-activity derivations.</p>
 
 <div class='interpretation' id='derivation-usage-generation-ordering'>Given an activity record identified by <span class="name">a</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, generation record identified by <span class="name">g2</span>, and usage record identified by <span class="name">u1</span>, <span class='conditional'>if</span> the record <span class="name">wasDerivedFrom(e2,e1,a,g2,u1,attrs)</span>
-or <span class="name">wasDerivedFrom(e2,e1,[prov:steps="1"] &cup; attrs)</span> holds, <span class='conditional'>then</span>
+or <span class="name">wasDerivedFrom(e2,e1,[prov:steps="single"] &cup; attrs)</span> holds, <span class='conditional'>then</span>
 the following temporal constraint holds:
 the <a title="entity usage event">usage</a>
 of entity denoted by <span class="name">e1</span> <a>precedes</a> the <a title="entity generation event">generation</a> of
@@ -2670,7 +2654,7 @@
 </div>
 
 
-<li> The  attribute <dfn id="dfn-steps"><span class="name">prov:steps</span></dfn>  defines the level of precision associated with a derivation record. The value associated with a <span class="name">prov:steps</span> attribute  MUST be   <span class="name">"1"</span> or <span class="name">"n"</span>. The attribute <span class="name">prov:step</span> occurs at most once in a derivation record. A derivation record without attribute <span class="name">prov:step</span> is considered to be equivalent to the same record extended with an extra attribute  <span class="name">prov:step</span> and associated value <span class="name">"n"</span>. </li>
+<li> The  attribute <dfn id="dfn-steps"><span class="name">prov:steps</span></dfn>  defines the level of precision associated with a derivation record. The value associated with a <span class="name">prov:steps</span> attribute  MUST be   <span class="name">"single"</span> or <span class="name">"any"</span>. The attribute <span class="name">prov:steps</span> occurs at most once in a derivation record. A derivation record without attribute <span class="name">prov:steps</span> is considered to be equivalent to the same record extended with an extra attribute  <span class="name">prov:steps</span> and associated value <span class="name">"any"</span>. </li>
 
 </ul>
 </section>
@@ -3569,7 +3553,7 @@
 
 
 <div class='interpretation' id='derivation-usage-generation-ordering'>Given an activity record identified by <span class="name">a</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, generation record identified by <span class="name">g2</span>, and usage record identified by <span class="name">u1</span>, <span class='conditional'>if</span> the record <span class="name">wasDerivedFrom(e2,e1,a,g2,u1,attrs)</span>
-or <span class="name">wasDerivedFrom(e2,e1,[prov:steps="1"] &cup; attrs)</span> holds, <span class='conditional'>then</span>
+or <span class="name">wasDerivedFrom(e2,e1,[prov:steps="single"] &cup; attrs)</span> holds, <span class='conditional'>then</span>
 the following ordering constraint holds:
 the <a title="entity usage event">usage</a>
 of entity denoted by <span class="name">e1</span> <a>precedes</a> the <a title="entity generation event">generation</a> of
@@ -3580,7 +3564,7 @@
 illustrated by Subfigure <a href="#constraint-summary">constraint-summary</a> (f) and  expressed by constraint <a href="#derivation-generation-generation-ordering">derivation-generation-generation-ordering</a>.</p>
 
 <div class='interpretation' id='derivation-generation-generation-ordering'>
-Given two entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, <span class='conditional'>if</span> the record <span class="name">wasDerivedFrom(e2,e1,[prov:steps="n"] &cup; attrs)</span>
+Given two entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, <span class='conditional'>if</span> the record <span class="name">wasDerivedFrom(e2,e1,[prov:steps="any"] &cup; attrs)</span>
  holds, <span class='conditional'>then</span> the following ordering constraint holds:
 the <a title="entity generation event">generation event</a> of the entity denoted by <span class="name">e1</span> <a>precedes</a> the <a title="entity generation event">generation event</a> of
 the entity  denoted by <span class="name">e2</span>.
@@ -3699,7 +3683,7 @@
 <p>A further inference is permitted from the imprecise-1 derivation record: </p>
 <div class='inference' id='derivation-use'>
 <p>Given an activity record identified by <span class="name">a</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, and set of attribute-value pairs <span class="name">attrs2</span>,
-<span class='conditional'>if</span> <span class="name">wasDerivedFrom(e2,e1, [prov:steps="1"])</span> and <span class="name">wasGeneratedBy(e2,a,attrs2)</span> hold, <span class='conditional'>then</span> <span class="name">used(a,e1,attrs1)</span> also holds
+<span class='conditional'>if</span> <span class="name">wasDerivedFrom(e2,e1, [prov:steps="single"])</span> and <span class="name">wasGeneratedBy(e2,a,attrs2)</span> hold, <span class='conditional'>then</span> <span class="name">used(a,e1,attrs1)</span> also holds
 for some set of attribute-value pairs <span class="name">attrs1</span>.
 </div>
 <p>This inference is justified by the fact that the entity represented by entity record identified by <span class="name">e2</span> is generated by at most one activity in a given account (see <a href="#generation-uniqueness">generation-uniqueness</a>). Hence,  this activity record is also the one referred to in the usage record of <span class="name">e1</span>. 
@@ -3897,6 +3881,9 @@
 <section class="appendix"> 
       <h2>Changes Since Second Public Working Draft</h2> 
 <ul>
+<li>01/16/12: updated derivation, simplified derivation record, prov:steps can be single/any. </li>
+<li>01/02/12: revision of collections </li>
+<li>01/02/12: definition alternate and specialization </li>
 <li>12/21/11: updated example with new relations. </li>
 <li>12/21/11: definition alternate and specialization (5.3.3.3). </li>
 <li>12/21/11: Specified permitted occurrences of prov attributes. </li>
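
Illustrative note, not part of the changeset: the rule introduced by this revision (a record with activity, generation and usage identifiers is precise-1; otherwise prov:steps separates imprecise-1 from imprecise-n, defaulting to "any") can be sketched as below. This is a minimal sketch under stated assumptions. The Derivation class and the classify and weaken helpers are hypothetical names that do not appear in PROV-DM, and rejecting an explicit prov:steps="any" on a record that carries identifiers is an assumption of this sketch, not something the text states.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

STEPS = "prov:steps"


@dataclass
class Derivation:
    """Hypothetical in-memory form of wasDerivedFrom(e2, e1, [a, g2, u1,] attrs)."""
    generated: str                    # e2
    used: str                         # e1
    activity: Optional[str] = None    # a
    generation: Optional[str] = None  # g2
    usage: Optional[str] = None       # u1
    attrs: Dict[str, str] = field(default_factory=dict)


def classify(d: Derivation) -> str:
    """Return 'precise-1', 'imprecise-1' or 'imprecise-n' for a record."""
    steps = d.attrs.get(STEPS)
    if steps not in (None, "single", "any"):
        raise ValueError(f"{STEPS} must be 'single' or 'any', got {steps!r}")
    ids = (d.activity, d.generation, d.usage)
    if all(i is not None for i in ids):
        # Assumption of this sketch: "any" contradicts an identified activity.
        if steps == "any":
            raise ValueError("precise-1 record cannot carry prov:steps='any'")
        return "precise-1"
    if any(i is not None for i in ids):
        # The grammar makes the three identifiers optional as a group.
        raise ValueError("activity, generation and usage identifiers go together")
    # A missing prov:steps defaults to "any", hence imprecise-n.
    return "imprecise-1" if steps == "single" else "imprecise-n"


def weaken(d: Derivation) -> Derivation:
    """Apply one step of the derivation-implications inference."""
    kind = classify(d)
    if kind == "precise-1":
        # wasDerivedFrom(e2,e1,a,g2,u1,attrs) implies the imprecise-1 record.
        return Derivation(d.generated, d.used, attrs={**d.attrs, STEPS: "single"})
    if kind == "imprecise-1":
        # Dropping prov:steps="single" yields the imprecise-n record.
        rest = {k: v for k, v in d.attrs.items() if k != STEPS}
        return Derivation(d.generated, d.used, attrs=rest)
    return d  # imprecise-n is already the weakest form


if __name__ == "__main__":
    precise = Derivation("e5", "e3", "a4", "g2", "u2")
    print(classify(precise), classify(weaken(precise)))
```

The records passed to classify mirror the wasDerivedFrom examples in the diff above. weaken follows the two implications in the derivation-implications block, one step at a time.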