updateds on structural constraints
authorLuc Moreau <l.moreau@ecs.soton.ac.uk>
Wed, 21 Dec 2011 12:36:52 +0000
changeset 1302 9a94c9c2fbaf
parent 1301 1d696f076f51
child 1303 c0836e4892ce
updateds on structural constraints
model/ProvenanceModel.html
--- a/model/ProvenanceModel.html	Wed Dec 21 12:09:58 2011 +0000
+++ b/model/ProvenanceModel.html	Wed Dec 21 12:36:52 2011 +0000
@@ -3484,7 +3484,7 @@
 that time information. It is expected that such instantiated
 constraint can help corroborate provenance information. We anticipate
 that verification algorithms could be developed though this
-verification is outside the scope of this specfication.
+verification is outside the scope of this specification.
 </p>
 
 <p>The following figure summarizes the ordering constraints in a
@@ -3598,13 +3598,13 @@
 
 
 
-<p>Sections 5 and 6 define a data model for provenance, which, for the most part, is unconstrained. Section 7.1 defines an interpretation of this data model, in terms of event ordering constraints.  This section introduces further constraints on the structure of PROV-DM records.  Instances of records that satisfy these constraints are said to be <dfn>structurally well-formed</dfn>.  A benefit of structurally well-formed provenance records is that further inferences can be made, because records are more precise, and therefore, richer. </p>
+<p>Sections 5 and 6 define a data model for provenance, which, for the most part, is unconstrained. Section 7.1 defines an interpretation of this data model, in terms of event ordering constraints.  This section introduces further constraints on the structure of PROV-DM records.  Records that satisfy these constraints are said to be <dfn>structurally well-formed</dfn>.  A benefit of structurally well-formed provenance records is that further inferences can be made, because records are more precise, and therefore, richer. </p>
 
 <p>According to the definition of a <a>generation record</a>, an entity becomes available after this entity's generation event, and does not exist before this event.  From this definition, we conclude that PROV-DM does not allow for an entity to have two generation records occurring at two different instants.
 The rationale for this constraint is as follows.
  Two distinct <a title="entity generation event">generation events</a> (by a same activity or by two distinct activities), occurring one after the other, necessarily create two distinct entities; otherwise, the second <a title="entity generation event">generation event</a> would have resulted in a entity that existed before its creation, which contradicts the definition of <a>generation record</a>.</p>
 
-<p>So, PROV-DM allows for  two distinct <a title="entity generation event">generation events</a>  <span class="name">g1</span> and <span class="name">g2</span> of a same entity provided they occur <em>simultaneously</em>. 
+<p>So, PROV-DM allows for two distinct <a>generation records</a>  <span class="name">g1</span> and <span class="name">g2</span> referencing a same entity record provided they occur <em>simultaneously</em>. 
 <!-- (This means that <span class="name">g1</span> <a>precedes</a> <span class="name">g2</span> and <span class="name">g2</span> <a>precedes</a> <span class="name">g1</span>.) -->
   For such a simultaneous generation to occur, the generation event has to be unique and caused by a <em>single world activity</em>, though the provenance records may contain different activity records providing alternative descriptions of that same world activity. 
 </p>
@@ -3613,9 +3613,9 @@
 <p>
 In the following assertions, a workflow execution  <span class="name">a0</span> consists of two sub-workflow executions  <span class="name">a1</span> and <span class="name">a2</span>. Sub-workflow execution <span class="name">a2</span> generates entity <span class="name">e</span>, so does <span class="name">a0</span>.</p>
 <pre class="codeexample">
-activity(a0,t1,t2,[prov:type="workflow execution"])
-activity(a1,t1,t3,[prov:type="workflow execution"])
-activity(a2,t3,t2,[prov:type="workflow execution"])
+activity(a0,,,[prov:type="workflow execution"])
+activity(a1,,,[prov:type="workflow execution"])
+activity(a2,,,[prov:type="workflow execution"])
 wasInformedBy(a2,a1)
 
 wasGeneratedBy(e,a0)
@@ -3624,9 +3624,9 @@
 This example is permitted in PROV-DM if the two activity records <span class="name">a0</span> and <span class="name">a2</span> provide alternate descriptions of what happens in the world with respect to this generation event.
 </div>
 
-<p>While this example is permitted in PROV-DM, it does not expose the hierarchical organization of executions and  it mixes records providing two descriptions of a same execution. This issue is highlighted by two different <a title="generation record">generation records</a> for entity <span class="name">e</span>, which make reasoning about this kind of provenance records unncessarily difficult.  Such assertions are said not be structurally well-formed.</p>
-
-<p>Instead,  the structure of the descriptions can be exposed with separate accounts to form a structurally well-formed provenance record.</p> 
+<p>While this example is permitted in PROV-DM, it does not expose the hierarchical organization of executions and  it mixes records providing two descriptions of a same execution. This issue is highlighted by two different <a title="generation record">generation records</a> for entity <span class="name">e</span>, which make reasoning about this kind of provenance records unnecessarily difficult.  Such assertions are said not be structurally well-formed.</p>
+
+<p>Structurally well-formed provenance records can be obtained by partitioning the generation records into different accounts. This makes it clear that these records provide alternative descriptions of the same real-world generation event, rather than describing two generation events for the same entity. When accounts are used, the example can be encoded as follows.</p>
 
 
 <div class="anexample">
@@ -3655,7 +3655,8 @@
 If two activities sequentially set different values to some attribute by means of two different <a title="entity generation event">generation events</a>, then they generate distinct entities. Alternatively,  for two activities to generate an entity simultaneously, they would require some synchronization by which they agree the entity is released for use; the end of this synchronization would constitute the actual generation of the entity, but is performed by a single activity. This unicity constraint is formalized as follows.
 -->
 
-<p>Structurally well-formed records satisfy some constraints, which force the structure of descriptions to be exposed by means of accounts. In return for satisfying these constraints, further inferences can be made about structurally well-formed records.
+
+<p>Structurally well-formed records satisfy some constraints, which force the structure of descriptions to be exposed by means of accounts. With these constraints satisfied, further inferences can be made about structurally well-formed records.
 The unicity of generation records in accounts is formulated as follows.
 </p>
 
@@ -3668,8 +3669,8 @@
 
 <p>A further inference is permitted from the imprecise-1 derivation record: </p>
 <div class='inference' id='derivation-use'>
-<p>Given an activity record identified by <span class="name">pe</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, and set of attribute-value pairs <span class="name">attrs2</span>,
-<span class='conditional'>if</span> <span class="name">wasDerivedFrom(e2,e1, [prov:steps="1"])</span> and <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> hold, <span class='conditional'>then</span> <span class="name">used(pe,e1,attrs1)</span> also holds
+<p>Given an activity record identified by <span class="name">a</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, and set of attribute-value pairs <span class="name">attrs2</span>,
+<span class='conditional'>if</span> <span class="name">wasDerivedFrom(e2,e1, [prov:steps="1"])</span> and <span class="name">wasGeneratedBy(e2,a,attrs2)</span> hold, <span class='conditional'>then</span> <span class="name">used(a,e1,attrs1)</span> also holds
 for some set of attribute-value pairs <span class="name">attrs1</span>.
 </div>
 <p>This inference is justified by the fact that the entity represented by entity record identified by <span class="name">e2</span> is generated by at most one activity in a given account (see <a href="#generation-unicity">generation-unicity</a>). Hence,  this activity record is also the one referred to in the usage record of <span class="name">e1</span>. 
@@ -3678,8 +3679,8 @@
 
 
 <p>We note that the converse inference, does not hold.
-From <span class="name">wasDerivedFrom(e2,e1)</span> and <span class="name">used(pe,e1)</span>, one cannot
-derive <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> because identifier <span class="name">e1</span> may occur in usage records referring to 
+From <span class="name">wasDerivedFrom(e2,e1)</span> and <span class="name">used(a,e1)</span>, one cannot
+derive <span class="name">wasGeneratedBy(e2,a,attrs2)</span> because identifier <span class="name">e1</span> may occur in usage records referring to 
 many activity records, but they may not be referred to in generation records containing identifier <span class="name">e2</span>.</p>
 
 
@@ -3695,9 +3696,10 @@
 constraint <a href="#generation-unicity">generation-unicity</a> may no
 longer be satisfied in the resulting union.  </p>
 
+<p>
+Indeed, let us reconsider example <a href="#account-example-1">account-example-1</a>, and let us define another account record as follows.</p>
+
 <div class="anexample">
-<p>
-Indeed, let us reconsider example <a href="#account-example-1">account-example-1</a>, and let us define another account record as follows:</p>
 <pre class="codeexample">
 account(ex:acc2,
         http://example.org/asserter2,