updated wasDerivedFrom to refer to gen/usage record ids
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Thu, 17 Nov 2011 15:29:35 +0000
changeset 951 9487fc1112d7
parent 950 6b096273804f
child 952 5f6d39d503c6
model/ProvenanceModel.html
--- a/model/ProvenanceModel.html	Thu Nov 17 14:42:18 2011 +0000
+++ b/model/ProvenanceModel.html	Thu Nov 17 15:29:35 2011 +0000
@@ -420,6 +420,9 @@
     <section id="prov-dm-overview"> 
 <h2>PROV-DM Overview </h2>
 
+<div class="note"> The figure and text in this section need to be adapted to the new relations between activity and agent. The term "characterizing attribute" also needs to be removed from the figure. 
+</div>
+
 <p>
 The following ER diagram provides a high level overview of the <strong>structure of PROV-DM assertions</strong>. Examples of provenance assertions that conform to this schema are provided in the next section (note: cardinality constraints on the associations are 0..* unless otherwise stated)</p>
 
@@ -557,14 +560,14 @@
 
 <p>
 Generation Records (described in <a href="#record-Generation">Section Generation</a>) represent the event at which a file is created in a specific form. Attributes are used to describe the modalities according to which a given entity is generated by a given activity.  The interpretation of attributes is application specific. Illustrations of such attributes for the scenario are: no attribute is provided for <span class="name">e0</span>;
-<span class="name">e2</span> was generated by the editor's  save function;  <span class="name">e4</span> can be found on the smtp port, in the attachment section of the mail message; <span class="name">e6</span> was produced on the standard output of <span class="name">a5</span>.</p>
+<span class="name">e2</span> was generated by the editor's  save function;  <span class="name">e4</span> can be found on the smtp port, in the attachment section of the mail message; <span class="name">e6</span> was produced on the standard output of <span class="name">a5</span>. Two identifiers <span class="name">g1</span> and <span class="name">g2</span> identify the generation records referenced in derivations introduced below.</p>
 <pre>
 wasGeneratedBy(e0, a0, [])
 wasGeneratedBy(e1, a0, [ex:fct="create"])
 wasGeneratedBy(e2, a1, [ex:fct="save"])     
 wasGeneratedBy(e3, a3, [ex:fct="save"])     
-wasGeneratedBy(e4, a2, [ex:port="smtp", ex:section="attachment"])  
-wasGeneratedBy(e5, a4, [ex:port="smtp", ex:section="attachment"])    
+wasGeneratedBy(g1, e4, a2, [ex:port="smtp", ex:section="attachment"])  
+wasGeneratedBy(g2, e5, a4, [ex:port="smtp", ex:section="attachment"])    
 wasGeneratedBy(e6, a5, [ex:file="stdout"])
 </pre>
 
@@ -574,23 +577,23 @@
 Usage Records (described in <a href="#record-Usage">Section Usage</a>) represent the event by which a file is read by an activity. 
 
 Likewise, attributes describe the modalities according to which the various entities are used by activities.  Illustrations of such attributes are: 
-<span class="name">e1</span> is used in the context of  <span class="name">a1</span>'s <span class="name">load</span> functionality; <span class="name">e2</span> is used by <span class="name">a2</span> in the context of its attach functionality; <span class="name">e3</span> is used on the standard input by <span class="name">a5</span>. </p>
+<span class="name">e1</span> is used in the context of  <span class="name">a1</span>'s <span class="name">load</span> functionality; <span class="name">e2</span> is used by <span class="name">a2</span> in the context of its attach functionality; <span class="name">e3</span> is used on the standard input by <span class="name">a5</span>. Two identifiers <span class="name">u1</span> and <span class="name">u2</span> identify the usage records referenced in derivations introduced below.</p>
 <pre>
 used(a1,e1,[ex:fct="load"])
 used(a3,e2,[ex:fct="load"])
-used(a2,e2,[ex:fct="attach"])
-used(a4,e3,[ex:fct="attach"])
+used(u1,a2,e2,[ex:fct="attach"])
+used(u2,a4,e3,[ex:fct="attach"])
 used(a5,e3,[ex:file="stdin"])
 </pre>
 
 
 <p>
-Derivation Records (described in <a href="#Derivation-Relation">Section Derivation Relation</a>) express that an entity is derived from another.  The first two are expressed in their compact version, whereas the following two are expressed in their full version, including the activity underpinning the derivation, and relevant attribute describing the usage and generation of entities.</p>
+Derivation Records (described in <a href="#Derivation-Relation">Section Derivation Relation</a>) express that an entity is derived from another.  The first two are expressed in their compact version, whereas the following two are expressed in their full version, including the activity underpinning the derivation, and associated  usage (<span class="name">u1</span>, <span class="name">u2</span>) and generation (<span class="name">g1</span>, <span class="name">g2</span>) records.</p>
 <pre>
 wasDerivedFrom(e2,e1)
 wasDerivedFrom(e3,e2)
-wasDerivedFrom(e4,e2,a2,[ex:port="smtp", ex:section="attachment"],[ex:fct="attach"])
-wasDerivedFrom(e5,e3,a4,[ex:port="smtp", ex:section="attachment"],[ex:fct="attach"])
+wasDerivedFrom(e4,e2,a2,g1,u1)
+wasDerivedFrom(e5,e3,a4,g2,u2)
 </pre>
 
 
@@ -1546,12 +1549,12 @@
 [<span class="name">,</span>
 <span class="nonterminal">identifier</span>
 <span class="name">,</span>
-<span class="nonterminal">generationAttributeValues</span>
+<span class="nonterminal">generationIdentifier</span>
 <span class="name">,</span>
-<span class="nonterminal">useAttributesValues</span>]
+<span class="nonterminal">usageIdentifier</span>]
 <span class="name">)</span><br/>
-<span class="nonterminal">generationAttributeValues</span>:=  <span class="nonterminal">attributeValues</span><br/>
-<span class="nonterminal">useAttributeValues</span>:=  <span class="nonterminal">attributeValues</span><br/>
+<span class="nonterminal">generationIdentifier</span>:=  <span class="nonterminal">identifier</span><br/>
+<span class="nonterminal">usageIdentifier</span>:=  <span class="nonterminal">identifier</span><br/>
 
 <span class="nonterminal">pe-independent-derivationRecord</span>:=  
 <span class="name">wasEventuallyDerivedFrom</span>
@@ -1576,15 +1579,16 @@
 <section id="pe-linked-derivationRecord">
 <h4>Activity Linked Derivation Record</h4>
 
-<p>an activity linked derivation record, which, by definition of a derivation record, is a representation that some entity is transformed from, created from, or affected by another entity, also entails the existence of an activity record that represents an activity that transforms, creates or affects this entity.</p>
-
-<p>In its full form, a activity linked derivation record, written <span class="name">wasDerivedFrom(e2,e1,pe,attrs2,attrs1)</span> in PROV-ASN:</p>
+<p>An activity linked derivation record, which, by definition of a derivation record, is a representation that some entity is transformed from, created from, or affected by another entity, also entails the existence of an activity record that represents an activity that transforms, creates or affects this entity.</p>
+
+<p>In its full form, an activity linked derivation record, written <span class="name">wasDerivedFrom(id, e2,e1,a,g2,u1)</span> in PROV-ASN, contains:</p>
 <ul>
-<li> refers to an entity record identified by <span class="name">e2</span>, which is a representation of the generated entity;</li>
-<li> refers to an entity record identified by <span class="name">e1</span>, which is a representation of the used entity;</li>
-<li> refers to an activity record identified by <span class="name">pe</span>, which is a representation of the activity using and generating the above entities;</li>
-<li> contains a set of attribute-value pairs <span class="name">attrs2</span>, which describes the generation record pertaining to <span class="name">e2</span> and <span class="name">pe</span>;</li>
-<li> contains a set of attribute-value pairs <span class="name">attrs1</span>, which describes the usage record pertaining to <span class="name">e1</span> and <span class="name">pe</span>.</li>
+<li><em>id</em>:  an OPTIONAL identifier  <span class="name">id</span> identifying the derivation record;</li> 
+<li><em>generatedEntity</em>: the identifier <span class="name">e2</span> of  an entity record, which is a representation of the generated entity;</li>
+<li><em>usedEntity</em>: the identifier <span class="name">e1</span> of  an entity record, which is a representation of the used entity;</li>
+<li><em>activity</em>: an identifier <span class="name">a</span> of an activity record, which is a representation of the activity using and generating the above entities;</li>
+<li><em>generation</em>: an identifier  <span class="name">g2</span> of the generation record pertaining to <span class="name">e2</span> and <span class="name">a</span>;</li> 
+<li><em>usage</em>: an identifier  <span class="name">u1</span> of the usage record pertaining to <span class="name">e1</span> and <span class="name">a</span>.</li> 
 </ul>
 
 
@@ -1601,20 +1605,20 @@
 <div class="anexample">
 <p>The following derivation assertions</p>
 <pre class="codeexample">
-wasDerivedFrom(e5,e3,a4,[ex:channel="out"],[ex:channel="in"])
+wasDerivedFrom(e5,e3,a4,g2,u2)
 wasDerivedFrom(e3,e2)
 </pre>
 <p>
 state the existence of activity-linked derivations;
 the first expresses that the activity represented by the activity <span class="name">a4</span>, by
-using the entity denoted by <span class="name">e3</span> obtained on the <span class="name">in</span> channel
+using the entity denoted by <span class="name">e3</span> obtained during use documented by usage record <span class="name">u2</span>
  derived the
-entity denoted by <span class="name">e5</span> and generated it on
-channel <span class="name">out</span>. The second is similar for <span class="name">e3</span> and <span class="name">e2</span>, but it leaves the activity record and associated attributes implicit. The meaning of "channel" is application specific.
+entity denoted by <span class="name">e5</span> and generated it according to generation record
+ <span class="name">g2</span>. The second is similar for <span class="name">e3</span> and <span class="name">e2</span>, but it leaves the activity record and associated attributes implicit. 
 </p>
 </div>
 
-
+<!--
 <p>If a derivation record holds for <span class="name">e2</span> and <span class="name">e1</span>, then it means that the entity represented by the entity record identified by <span class="name">e1</span> has an influence on the entity represented by the entity record identified by <span class="name">e2</span>, which is captured by a dependency between their attribute values; it also implies temporal ordering. These are specified as follows:</p>
 
 <div class='deprecatedconstraint' id='derivation-attributes'>Given an activity record denoted by <span class="name">pe</span>, entity records denoted by <span class="name">e1</span> and <span class="name">e2</span>, set of attribute-value pairs <span class="name">attrs1</span> and <span class="name">attrs2</span>, the assertion <span class="name">wasDerivedFrom(e2,e1,pe,attrs2,attrs1)</span>
@@ -1624,6 +1628,7 @@
 attributes of the entity record identified by <span class="name">e1</span>. </div>
 
 <div class='note'>The WG has approved that this constraint should be dropped.  It and others had some influence on derivation transitivity. They will be removed from the documents once the proposal on derivation has been approved. </div>
+-->
 
 
 <div class='interpretation' id='derivation-use-generation-ordering'>Given an activity record identified by <span class="name">pe</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, sets of attribute-value pairs <span class="name">attrs1</span> and <span class="name">attrs2</span>, <span class='conditional'>if</span> the assertion <span class="name">wasDerivedFrom(e2,e1,pe,attrs2,attrs1)</span>
@@ -1633,7 +1638,7 @@
 the entity denoted by <span class="name">e2</span>.
 </div>
 
-
+<!--  If there is a generation/usage record, then this is now trivial!
 <p>
 The following inference rule states that a generation and usage event can be inferred from an activity linked derivation record.
 </p>
@@ -1643,6 +1648,7 @@
   <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> and <span class="name">used(pe,e1,attrs1)</span> also
   hold.
 </div>
+-->
 
 
 <p>The compact version has the same meaning as the fully formed
@@ -1650,10 +1656,15 @@
 record is known to exist, though it does not need to be 
 asserted.  This is formalized by the following inference rule,
 referred to as <em>activity introduction</em>:</p>
-<div class='constraint' id="derivation-activity">
-<span class='conditional'>If</span> <span class="name">wasDerivedFrom(e2,e1)</span> holds, <span class='conditional'>then</span> there exists an activity record identified by <span class="name">pe</span>, and sets of attribute-value pairs <span class="name">attrs1</span>,<span class="name">attrs2</span>,
+<div class='constraint' id="activity-introduction">
+<span class='conditional'>If</span> <span class="name">wasDerivedFrom(e2,e1)</span> holds, <span class='conditional'>then</span> there exist an activity record identified by <span class="name">a</span>, a usage record identified by <span class="name">u</span>, and a generation record identified by <span class="name">g</span>
 such that:
-  <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> and <span class="name">used(pe,e1,attrs1)</span>. 
+<pre class="codeexample">
+activity(a,attrs)
+wasGeneratedBy(g,e2,a,gAttrs)
+used(u,a,e1,uAttrs)
+</pre>
+for sets of attribute-value pairs <span class="name">gAttrs</span>, <span class="name">uAttrs</span>, and <span class="name">attrs</span>.
 </div>
 
 
@@ -1662,12 +1673,12 @@
 
 <p>
 Note that inferring derivation from usage and generation does not hold
-in general. Indeed, when a generation <span class="name">wasGeneratedBy(e2,pe,attrs2)</span>
-precedes <span class="name">used(pe,e1,attrs1)</span>, for
-some <span class="name">e1</span>, <span class="name">e2</span>, <span class="name">attrs1</span>, <span class="name">attrs2</span>, and <span class="name">pe</span>, one
-cannot infer derivation <span class="name">wasDerivedFrom(e2,e1,pe,attrs2,attrs1)</span>
-or <span class="name">wasDerivedFrom(e2,e1)</span> since the values of attributes
-of <span class="name">e2</span> cannot possibly be determined by the values of attributes
+in general. Indeed, when a generation <span class="name">wasGeneratedBy(g, e2, a, attrs2)</span>
+precedes <span class="name">used(u, a, e1, attrs1)</span>, for
+some <span class="name">e1</span>, <span class="name">e2</span>, <span class="name">attrs1</span>, <span class="name">attrs2</span>, and <span class="name">a</span>, one
+cannot infer derivation <span class="name">wasDerivedFrom(e2, e1, a, g, u)</span>
+or <span class="name">wasDerivedFrom(e2,e1)</span>, since the generation
+of <span class="name">e2</span> cannot possibly be determined by the use
 of <span class="name">e1</span>, given the creation of <span class="name">e2</span> precedes the use
 of <span class="name">e1</span>.
 </p>
@@ -1689,7 +1700,9 @@
 derive <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> because identifier <span class="name">e1</span> may occur in usage records referring to 
 many activity records, but they may not be referred to in generation records containing identifier <span class="name">e2</span>.</p>
 
-
+<div class="note">This property holds for accounts where
+generation-unicity applies. Maybe move it to separate section with all
+related material. </div>
 
 </section>
 
@@ -1737,8 +1750,6 @@
 <p>Hence, an activity independent derivation record can be directly asserted or can be inferred (by means of <a href="#derivation-linked-independent">derivation-linked-independent</a>).</p>
 
 
-<div class='note'>Should we link wasEventuallyDerivedFrom to attributes as we did for wasDerivedFrom?  If so, this type of inference should be presented upfront, for both.</div>
-
 
 
 
@@ -2410,6 +2421,7 @@
 <div class="note">Proposal to change the name to "Dependencies amongst Activities"  to avoid ambiguities</div>
 
 <p>PROV-DM allows temporal relationships between activities to be expressed.
+
 An <dfn id="InformationFlowOrdering">information flow ordering record</dfn> is a representation that an entity was generated by an activity, before it was used by another activity.
 A <dfn id="ControlOrdering">control ordering record</dfn> is a representation that the end of
 an activity precedes the start of another activity.
@@ -2865,6 +2877,7 @@
 <section class="appendix"> 
       <h2>Changes Since Previous Version</h2> 
 <ul>
+<li>11/17/11: Updated wasDerivedFrom to refer to generation/usage record ids.</li>
 <li>11/17/11: Simplified hasAnnotation mechanism, now requiring the to-be-annotated record to have an id.</li>
 <li>11/17/11: Renamed annotation into note.</li>
 <li>11/17/11: Introduced wasStartedBy, wasEndedBy, and actedOnBehalfOf.</li>
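
The full form of wasDerivedFrom introduced by this changeset refers to a generation record pertaining to the generated entity and the activity, and to a usage record pertaining to the used entity and the same activity. The following is a minimal sketch of how a reader might check that such references line up, using the g1, g2, u1 and u2 records from the scenario above. The class names, field names and the is_consistent helper are hypothetical and not part of PROV-DM or PROV-ASN; attribute lists are omitted for brevity.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-ins for the records in the diff above.
@dataclass(frozen=True)
class Generation:
    id: str
    entity: str
    activity: str


@dataclass(frozen=True)
class Usage:
    id: str
    activity: str
    entity: str


@dataclass(frozen=True)
class Derivation:
    generated: str   # e2 in wasDerivedFrom(e2,e1,a,g2,u1)
    used: str        # e1
    activity: str    # a
    generation: str  # identifier of a generation record
    usage: str       # identifier of a usage record


def is_consistent(d, generations, usages):
    """Return True if the referenced generation and usage records exist
    and pertain to the same entities and activity as the derivation."""
    g = generations.get(d.generation)
    u = usages.get(d.usage)
    if g is None or u is None:
        return False
    return (g.entity == d.generated and g.activity == d.activity
            and u.entity == d.used and u.activity == d.activity)


# wasGeneratedBy(g1, e4, a2, ...) and wasGeneratedBy(g2, e5, a4, ...)
generations = {g.id: g for g in [Generation("g1", "e4", "a2"),
                                 Generation("g2", "e5", "a4")]}
# used(u1, a2, e2, ...) and used(u2, a4, e3, ...)
usages = {u.id: u for u in [Usage("u1", "a2", "e2"),
                            Usage("u2", "a4", "e3")]}

ok = Derivation("e5", "e3", "a4", "g2", "u2")   # matches g2 and u2
bad = Derivation("e5", "e3", "a4", "g2", "u1")  # u1 pertains to a2 and e2

print(is_consistent(ok, generations, usages))
print(is_consistent(bad, generations, usages))
```

Under these assumptions, the first derivation is accepted because g2 and u2 both pertain to a4, while the second is rejected because u1 documents the use of e2 by a2. The specification text does not mandate such a check; it is shown only to make the role of the two identifiers concrete.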