updated structural constraints section
authorLuc Moreau <l.moreau@ecs.soton.ac.uk>
Wed, 21 Dec 2011 09:43:26 +0000
changeset 1298 57fb738d7452
parent 1297 f0264df289ee
child 1299 8f8764cad3a1
updated structural constraints section
model/ProvenanceModel.html
--- a/model/ProvenanceModel.html	Wed Dec 21 07:56:38 2011 +0000
+++ b/model/ProvenanceModel.html	Wed Dec 21 09:43:26 2011 +0000
@@ -2436,7 +2436,7 @@
 <div class='note'>
 Currently, the non-terminal <span class="nonterminal">asserter</span> is defined as IRI and its interpretation is outside PROV-DM. We may want the asserter to be an agent instead, and therefore use PROV-DM to express the provenance of PROV-DM assertions.  The editors seek inputs on how to resolve this issue. </div>
 
-<div class="anexample">
+<div class="anexample" id="account-example-1">
 <p>
 The following account record</p>
 <pre class="codeexample">
@@ -3581,14 +3581,68 @@
 <section id="structural-constraints"> 
 <h3>PROV-DM Structural Constraints</h3>
 
-<div class="note">
-This section,  in development,  is grouping all structural constraints.
+
+
+<p>Sections 5 and 6 define a data model for provenance, which, for the most part, is unconstrained. Section 7.1 defines an interpretation of this data model, in terms of event ordering constraints.  This section introduces further constraints on the structure of PROV-DM records.  Instances of records that satisfy these constraints are said to be <dfn>structurally well-formed</dfn>.  A benefit of structurally well-formed provenance records is that further inferences can be made, because records are more precise, and therefore, richer. </p>
+
+<p>According to the definition of a <a>generation record</a>, an entity becomes available after this entity's generation event, and does not exist before this event.  From this definition, we conclude that PROV-DM does not allow for an entity to have two generation records occurring at two different instants.
+The rationale for this constraint is as follows.
+ Two distinct <a title="entity generation event">generation events</a> (by a same activity or by two distinct activities), occurring one after the other, necessarily create two distinct entities; otherwise, the second <a title="entity generation event">generation event</a> would have resulted in a entity that existed before its creation, which contradicts the definition of <a>generation record</a>.</p>
+
+<p>So, PROV-DM allows for  two distinct <a title="entity generation event">generation events</a>  <span class="name">g1</span> and <span class="name">g2</span> of a same entity provided they occur <em>simultaneously</em>. 
+<!-- (This means that <span class="name">g1</span> <a>precedes</a> <span class="name">g2</span> and <span class="name">g2</span> <a>precedes</a> <span class="name">g1</span>.) -->
+  For such a simultaneous generation to occur, the generation event has to be unique and caused by a <em>single world activity</em>, though the provenance records may contain different activity records providing alternative descriptions of that same world activity. 
+</p>
+
+<div class="anexample">
+<p>
+In the following assertions, a workflow execution  <span class="name">a0</span> consists of two sub-workflow executions  <span class="name">a1</span> and <span class="name">a2</span>. Sub-workflow execution <span class="name">a2</span> generates entity <span class="name">e</span>, so does <span class="name">a0</span>.</p>
+<pre class="codeexample">
+activity(a0,t1,t2,[prov:type="workflow execution"])
+activity(a1,t1,t3,[prov:type="workflow execution"])
+activity(a2,t3,t2,[prov:type="workflow execution"])
+wasInformedBy(a2,a1)
+
+wasGeneratedBy(e,a0)
+wasGeneratedBy(e,a2)
+</pre>
+This example is permitted in PROV-DM if the two activity records <span class="name">a0</span> and <span class="name">a2</span> provide alternate descriptions of what happens in the world with respect to this generation event.
 </div>
 
-
+<p>While this example is permitted in PROV-DM, it does not expose the hierarchical organization of executions and  it mixes records providing two descriptions of a same execution. This issue is highlighted by two different <a title="generation record">generation records</a> for entity <span class="name">e</span>, which make reasoning about this kind of provenance records unncessarily difficult.  Such assertions are said not be structurally well-formed.</p>
+
+<p>Instead,  the structure of the descriptions can be exposed with separate accounts to form a structurally well-formed provenance record.</p> 
+
+
+<div class="anexample">
+<p>
+The same example is now revisited, with the following assertions that are structurally well-formed. Two accounts are introduced, and there is a single generation record for entity <span class="name">e</span> per account.</p>
+<pre class="codeexample">
+account(ex:summary,
+        http://example.org/asserter, 
+        activity(a0,t1,t2,[prov:type="workflow execution"])
+        wasGeneratedBy(e,a0))
+
+account(ex:detailed,
+        http://example.org/asserter, 
+        activity(a1,t1,t3,[prov:type="workflow execution"])
+        activity(a2,t3,t2,[prov:type="workflow execution"])
+        wasInformedBy(a2,a1)
+        wasGeneratedBy(e,a2))
+</pre>
+</div>
+
+
+
+<!--
 <p>A given entity record can be referred to in a single generation record in the scope of a given <a href="#record-Account">account</a>.
 The rationale for this constraint is as follows.
 If two activities sequentially set different values to some attribute by means of two different <a title="entity generation event">generation events</a>, then they generate distinct entities. Alternatively,  for two activities to generate an entity simultaneously, they would require some synchronization by which they agree the entity is released for use; the end of this synchronization would constitute the actual generation of the entity, but is performed by a single activity. This unicity constraint is formalized as follows.
+-->
+
+<p>Structurally well-formed records satisfy some constraints, which force the structure of descriptions to be exposed by means of accounts. In return for satisfying these constraints, further inferences can be made about structurally well-formed records.
+The unicity of generation records in accounts is formulated as follows.
+</p>
 
 <div class='constraint' id='generation-unicity'>Given an entity record denoted by <span class="name">e</span>, two activity records denoted by <span class="name">a1</span> and <span class="name">a2</span>, and two sets of attribute-value pairs <span class="name">attrs1</span> and <span class="name">attrs2</span>,
 <span class='conditional'>if</span> the records <span class="name">wasGeneratedBy(e,a1,attrs1)</span> and <span class="name">wasGeneratedBy(e,a2,attrs2)</span> exist in the scope of a given account,
@@ -3598,7 +3652,7 @@
 
 
 <p>A further inference is permitted from the imprecise-1 derivation record: </p>
-<div class='constraint' id='derivation-use'>
+<div class='inference' id='derivation-use'>
 <p>Given an activity record identified by <span class="name">pe</span>, entity records identified by <span class="name">e1</span> and <span class="name">e2</span>, and set of attribute-value pairs <span class="name">attrs2</span>,
 <span class='conditional'>if</span> <span class="name">wasDerivedFrom(e2,e1, [prov:steps="1"])</span> and <span class="name">wasGeneratedBy(e2,pe,attrs2)</span> hold, <span class='conditional'>then</span> <span class="name">used(pe,e1,attrs1)</span> also holds
 for some set of attribute-value pairs <span class="name">attrs1</span>.
@@ -3615,12 +3669,12 @@
 
 
 <p>
-An account is said to be well-formed if
-it satisfies the constraints  <a href="#generation-unicity">generation-unicity</a> and <a href="#derivation-use">derivation-use</a>.</p>
+An account is said to be structurally well-formed if
+it satisfies the constraint  <a href="#generation-unicity">generation-unicity</a>. If an account is structurally well-formed, it support the inference <a href="#derivation-use">derivation-use</a>.</p>
 
 <p> The union of two accounts is another account, 
 containing the unions of their respective records, where
- records with a same identifier should be understood according to constraint <a href="#identified-entity-in-account">identified-entity-in-account</a>. Well-formed
+ records with a same identifier should be understood according to constraint <a href="#identified-entity-in-account">identified-entity-in-account</a>. Structurally well-formed
 accounts are not
 closed under union because the
 constraint <a href="#generation-unicity">generation-unicity</a> may no
@@ -3628,7 +3682,7 @@
 
 <div class="anexample">
 <p>
-Indeed, let us consider another account record</p>
+Indeed, let us reconsider example <a href="#account-example-1">account-example-1</a>, and let us define another account record as follows:</p>
 <pre class="codeexample">
 account(ex:acc2,
         http://example.org/asserter2, 
@@ -3643,6 +3697,43 @@
 </div>
 
 
+<div class="note">
+Can the semantics characterize better what can be achieved with structurally well-formed accounts?
+</div>
+
+
+<div class="note">
+Satya discussed the example of a sculpture, whose hand and leg are sculpted independently by two different sculptors. He suggested that the sculpture is generated by two distinct activities. This section explains that it is not the case.  The example can be formulated as follows.
+<pre class="codeexample">
+
+entity(ex:s, [ prov:type="sculpture" ])
+entity(ex:h, [ prov:type="hand" ])
+entity(ex:l, [ prov:type="leg" ])
+wasDerivedFrom(ex:s,ex:h, [ prov:type="contained" ])
+wasDerivedFrom(ex:s,ex:l, [ prov:type="contained" ])
+
+entity(ex:s_2, [ prov:type="sculpture" ])
+entity(ex:h_2, [ prov:type="hand" ])
+wasDerivedFrom(ex:s_2,ex:h_2, [ prov:type="contained" ])
+wasDerivedFrom(ex:s_2,ex:l,   [ prov:type="contained" ])
+wasDerivedFrom(ex:h_2,ex:h,   [ prov:type="refinementOf" ])
+wasDerivedFrom(ex:s_2,ex:s,   [ prov:type="refinementOf" ])
+wasGeneratedBy(ex:h_2,ex:a1)
+activity(ex:a1, [ prov:type="sculptHand" ])
+
+entity(ex:s_3, [ prov:type="sculpture" ])
+entity(ex:l_3, [ prov:type="leg" ])
+wasDerivedFrom(ex:s_3,ex:h_2, [ prov:type="contained" ])
+wasDerivedFrom(ex:s_3,ex:l_3, [ prov:type="contained" ])
+wasDerivedFrom(ex:l_2,ex:l,   [ prov:type="refinementOf" ])
+wasDerivedFrom(ex:s_3,ex:s_2, [ prov:type="refinementOf" ])
+wasGeneratedBy(ex:l_3,ex:a2)
+activity(ex:a2, [ prov:type="sculptLeg" ])
+</pre>
+We see that ex:s_3 (the sculpture in its final state) was derived from ex:l_2 (containment) which was generated by ex:a2. However, ex:s_3 is not directly generated by ex:a2.  We may want to consider an abbreviation for this: wasGeneratedBy*(ex:s_3,ex:a2).
+</div>
+
+
 </section>