updated dm-constraints with collections
author Paolo Missier <pmissier@acm.org>
Fri, 30 Mar 2012 17:04:02 +0100
changeset 2129 d31f335adead
parent 2128 989b8910f4f5
child 2130 9de0045d5fe4
model/prov-dm-constraints.html
model/working-copy/wd5-prov-dm-collections.html
--- a/model/prov-dm-constraints.html	Fri Mar 30 16:02:08 2012 +0100
+++ b/model/prov-dm-constraints.html	Fri Mar 30 17:04:02 2012 +0100
@@ -1562,29 +1562,75 @@
 <section id="collection-constraints">
 <h3>PROV-DM Collection Constraints</h3>
 
-<div class='note'>
-Edited by PM --- TBC
+<div class="note"> unique key constraint removed as it follows from the "update semantics" which is explained in the DM</div>
+
+
+It is desirable to restrict the derivation of one collection from another to a single insertion or removal relation. The following examples illustrate what may happen when multiple derivations are allowed.
+
+<div class="anexample">
+  <pre class="codeexample">
+entity(c1, [prov:type="Collection"])
+entity(c2, [prov:type="Collection"])
+entity(c, [prov:type="Collection"])
+
+derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
+derivedByInsertionFrom(c, c2, {(k3, v3)})
+</pre>
+<span class="name">c</span> is the state of a collection that is obtained by (i) adding <span class="name">{(k1, v1), (k2, v2)}</span> to the collection in state <span class="name">c1</span>, and also (ii) adding <span class="name">{(k3, v3)}</span> to the collection in state <span class="name">c2</span>. Thus, <span class="name">c</span> represents a collection with content <span class="name">c1 union c2 union {(k1, v1), (k2, v2), (k3, v3)}</span>.
+</pre>
 </div>
+  
+This interpretation, however, may lead to confusion. For example:
+<div class="anexample">
+<pre class="codeexample">
+entity(c1, [prov:type="Collection"])
+entity(c2, [prov:type="Collection"])
+entity(c, [prov:type="Collection"])
+
+derivedByInsertionFrom(c, c1, {(k1, v1)})
+derivedByRemovalFrom(c, c2, {k1})
+</pre>
+Here the insertion into <span class="name">c1</span> and the removal from <span class="name">c2</span> "cancel" each other. This is allowed if no constraint is enforced; however, it is not meaningful.
+</div>
+ On the other hand, it is desirable to be able to express the fact that <span class="name">c</span> is obtained precisely as the result of <em>merging</em> <span class="name">c1</span> and <span class="name">c2</span>. <br/>
+This is achieved by (i) adding a constraint to ensure that each derivation is unique, and (ii) making use of <span class="name">merge(c,c1,c2)</span> to define the state <span class="name">c</span> precisely as the union of the states <span class="name">c1</span> and <span class="name">c2</span>. This justifies the introduction of the following constraint.
 
 
 
-<div class='constraint' id='collection-parallel-insertions'>
-<p>One can have multiple assertions regarding the state of a collection following a <em>set</em> of insertions, for example:</p>
+<div class='constraint' id='collection-unique-derivation'>
+
+  <p>One cannot have multiple assertions that define the state of a collection by means of insertion and removal relations. Thus:</p>
 <pre class="codeexample">
-CollectionAfterInsertion(c2, c1, k1, v1)
-CollectionAfterInsertion(c2, c1, k2, v2)
-...
+entity(c1, [prov:type="Collection"])
+entity(c2, [prov:type="Collection"])
+entity(c, [prov:type="Collection"])
+
+derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
+derivedByInsertionFrom(c, c2, {(k3, v3)})
 </pre>
-<p>This is interpreted as <em>" <span class="name">c2</span> is the state that results from inserting  <span class="name">(k1, v1)</span>,  <span class="name">(k2, v2)</span> etc. into  <span class="name">c1</span>"</em></p>
-</div>
+is not allowed (unless the two sets were identical, in which case one of the two statements would be redundant)<p/>
 
-<div class='note'>
-Shouldn't we have the same for deletion, and combination of insertion and deletion?
+ In particular, one cannot derive the state of a collection from another using multiple statements. Thus: <p/>
+<pre class="codeexample">
+derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2)})
+derivedByInsertionFrom(id2, c, c1, {(k3, v3), (k4, v4)})
+</pre>
+  is not allowed.<p/>
+
+The same applies to removal and combinations of insertions and removals, for example:
+
+<pre class="codeexample">
+derivedByInsertionFrom(c, c1, {(k1, v1)})
+derivedByRemovalFrom(c, c2, {k2})
+</pre>
+  is not allowed.
 </div>
 
 
-<div class='constraint' id='collection-branching-derivations'>
-It is possible to have multiple derivations from a single root collection, as shown in the following example.
+<section id="Collection-branching">
+<h3>Collection branching.</h3>
+
+It is possible to have multiple derivations from a single root collection, as long as the resulting entities are distinct, as shown in the following example.
 
 <div class="anexample">
 <pre class="codeexample">
@@ -1596,78 +1642,69 @@
   entity(c2, [prov:type="Collection"])
   entity(c3, [prov:type="Collection"])
   
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 = { (k1,v1) }
-  CollectionAfterInsertion(c2, c, k2, v2)       // c2 = { (k2 v2) }
-  CollectionAfterInsertion(c3, c1, k3,v3)       // c3 = { (k1,v1),  (k3,v3) }
+  derivedByInsertionFrom(c1, c, {(k1, v1)})      
+  derivedByInsertionFrom(c2, c, {(k2, v2)})       
+  derivedByInsertionFrom(c3, c1, {(k3,v3)})       
 </pre>
-</div>
-</div>
-
-
-
-
-
-
-<div class='constraint' id='collection-unique-ancestor'>
-Given the pair of assertions:
-<pre class="codeexample">
-CollectionAfterInsertion(c, c1, k1, v1)
-CollectionAfterInsertion(c, c2, k2, v2)
-</pre>
-it follows that  <span class="name">c1==c2</span>.
-</div>
+    From this set of assertions, we conclude:
+  <pre class="codeexample">
+  c1 = { (k1,v1) }
+  c2 = { (k2, v2) }
+  c3 = { (k1,v1),  (k3,v3) }
+  </pre>
 
-<div class='note'>
-Original text stated it follows that <span class="name">c1==c2, k1==k2, v1==v2</span>, because one cannot have two different derivations for the same final collection state. This is incompatible with parallel insertion constraint.
-</div>
-
-
-<div class='note'>
-Shouldn't we have the same for deletion, and combination of insertion and deletion?
-</div>
-
-
+</section>
 
+<section id="collections-derivation">
 
-<div class='constraint' id='collection-unique-value-for-key'>
-Given the following set of insertions:
-<pre class="codeexample">
-CollectionAfterInsertion(c1, c, k, v1)
-CollectionAfterInsertion(c1, c, k, v2)
-</pre>
-it follows that  <span class="name">v1==v2</span>.
-</div>
+<h3>State of collections and use of weaker <a href="#Derivation-Relation">derivation</a> relation</h3>
 
-
-<p>The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of assertions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those assertions. In general, all assertions reflect the asserter's partial knowledge of a sequence of data transformation events. In the particular case of collection evolution, in which the asserter  <em>knows</em> that some of the state changes may have been missed, then the more generic  <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred, which cannot be precisely asserted as insertions or removals. The following two examples illustrate this.</p>
+<p>The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of assertions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those assertions. In general, all assertions reflect partial knowledge regarding a sequence of data transformation events. In the particular case of collection evolution, in which some of the state changes may have been missed, the more generic  <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred, which cannot be precisely asserted as insertions or removals. The following two examples illustrate this.</p>
 
 <div class="anexample">
 <pre class="codeexample">
-  entity(c, [prov:type="collection"])    // e is a collection, possibly not empty
+  entity(c, [prov:type="Collection"])    // c is a collection, possibly not empty
   entity(v1)
-  entity(v2, [prov:type="collection"])    // v2 is a collection
-
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 <em>includes</em> { (k1,v1) } but may contain additional unknown pairs
-  CollectionAfterInsertion(c2, c1, k2, v2)      // c2 includes { (k1,v1), (k2 v2) } where v2 is a collection with unknown state
-</pre>
-</div>
-  In the example, the state of <span class="name">c2</span> is only partially known because the collection is constructed from partially known other collections.
+  entity(v2, [prov:type="Collection"])    // v2 is a collection
 
-<div class="anexample">
-<pre class="codeexample">
-  entity(c, [prov:type="emptyCollection"])    // e is an empty collection
-  entity(v1)
-  entity(v2)
-
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 = { (k1,v1) }
-  wasDerivedFrom(c2, c1)                        // the asserted knows that c2 is somehow derived from c1, but cannot assert the precise sequence of updates
-    CollectionAfterInsertion(c3, c2, k2, v2)       
-</pre>
-
-<p>Here  <span class="name">c3</span> includes <span class="name">{ (k2 v2) }</span> but the earlier "gap" leaves uncertainty regarding  <span class="name">(k1,v1)</span>  (it may have been removed) or any other pair that may have been added as part of the derivation activities.</p>
-</div>
+  derivedByInsertionFrom(c1, c, {(k1, v1)})       
+  derivedByInsertionFrom(c2, c1, {(k2, v2)})    
+ </pre>
+     From this set of assertions, we conclude:
+   <pre class="codeexample">
+    c1 includes (k1,v1) but may contain additional unknown pairs
+    c2 includes (k1,v1), (k2, v2) (and possibly more pairs), where v2 is a collection with unknown state
+   </pre>
+ 
+ </div>
+ 
+   In the example, the state of <span class="name">c2</span> is only partially known because the collection is constructed from partially known other collections.
+ 
+ <div class="anexample">
+ <pre class="codeexample">
+   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+   entity(v1)
+   entity(v2)
+   entity(c1, [prov:type="Collection"])    
+   entity(c2, [prov:type="Collection"])    
+   entity(c3, [prov:type="Collection"])    
+ 
+   derivedByInsertionFrom(c1, c, {(k1, v1)})       
+   wasDerivedFrom(c2, c1)                       
+   derivedByInsertionFrom(c3, c2, {(k2, v2)})       
+ </pre>
+     From this set of assertions, we conclude:
+   <pre class="codeexample">
+    c1 = { (k1,v1) }
+    c2 is somehow derived from c1, but the precise sequence of updates is unknown
+    c3 includes (k2, v2) but the earlier "gap" leaves uncertainty regarding (k1,v1) <br/>  (it may have been removed) or any other pair that may have been added as part of the derivation activities.
+   </pre>
+ </div>
 
 </section>
+ 
+
+</section>  <!-- end of collections -->
 
 
 <!--
--- a/model/working-copy/wd5-prov-dm-collections.html	Fri Mar 30 16:02:08 2012 +0100
+++ b/model/working-copy/wd5-prov-dm-collections.html	Fri Mar 30 17:04:02 2012 +0100
@@ -207,8 +207,86 @@
 <h3>Collections</h3>
 
 <section id="collection-constraints">
-<h3>PROV-DM Collection Constraints  [to go in part II]</h3>
+<h3>PROV-DM Collection Constraints and further considerations  [to go in part II]</h3>
 
+<div class='constraint' id='collection-unique-derivation'>
+
+
+  <p>One cannot have multiple assertions that define the state of a collection by means of insertion and removal relations. Thus:</p>
+<pre class="codeexample">
+entity(c1, [prov:type="Collection"])
+entity(c2, [prov:type="Collection"])
+entity(c, [prov:type="Collection"])
+
+derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
+derivedByInsertionFrom(c, c2, {(k3, v3)})
+</pre>
+is not allowed (unless the two sets were identical, in which case one of the two statements would be redundant)<p/>
+
+ In particular, one cannot derive the state of a collection from another using multiple statements. Thus: <p/>
+<pre class="codeexample">
+derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2)})
+derivedByInsertionFrom(id2, c, c1, {(k3, v3), (k4, v4)})
+</pre>
+  is not allowed.<p/>
+
+The same applies to removal and combinations of insertions and removals, for example:
+
+<pre class="codeexample">
+derivedByInsertionFrom(c, c1, {(k1, v1)})
+derivedByRemovalFrom(c, c2, {k2})
+</pre>
+  is not allowed.
+</div>
+
+
+<div class='constraint' id='collection-unique-value-for-key'>
+  Keys are unique within a collection. Thus:
+<pre class="codeexample">
+entity(c, [prov:type="Collection"])
+entity(c1, [prov:type="Collection"])
+
+derivedByInsertionFrom(c1, c, {(k, v1), ...})
+derivedByInsertionFrom(c1, c, {(k, v2), ...})
+</pre>
+implies  <span class="name">v1==v2</span>.
+</div>
+
+</section>
+
+<section id="further-considerations">
+<h3>Further considerations.</h3>
+
+
+<section id="Collection-branching">
+<h3>Collection branching.</h3>
+
+It is possible to have multiple derivations from a single root collection, as long as the resulting entities are distinct, as shown in the following example.
+
+<div class="anexample">
+<pre class="codeexample">
+  entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+  entity(v1)
+  entity(v2)
+  entity(v3)
+  entity(c1, [prov:type="Collection"])
+  entity(c2, [prov:type="Collection"])
+  entity(c3, [prov:type="Collection"])
+  
+  derivedByInsertionFrom(c1, c, {(k1, v1)})      
+  derivedByInsertionFrom(c2, c, {(k2, v2)})       
+  derivedByInsertionFrom(c3, c1, {(k3,v3)})       
+</pre>
+    From this set of assertions, we conclude:
+  <pre class="codeexample">
+  c1 = { (k1,v1) }
+  c2 = { (k2, v2) }
+  c3 = { (k1,v1),  (k3,v3) }
+  </pre>
+
+</section>
+
+<section id="collections-derivation">
 
 <h3>State of collections and use of weaker <a href="#Derivation-Relation">derivation</a> relation</h3>
 
@@ -216,12 +294,12 @@
 
 <div class="anexample">
 <pre class="codeexample">
-  entity(c, [prov:type="collection"])    // c is a collection, possibly not empty
+  entity(c, [prov:type="Collection"])    // c is a collection, possibly not empty
   entity(v1)
-  entity(v2, [prov:type="collection"])    // v2 is a collection
+  entity(v2, [prov:type="Collection"])    // v2 is a collection
 
- derivedByInsertionFrom(c1, c, {(k1, v1)})       
-   derivedByInsertionFrom(c2, c1, {(k2, v2)})    
+  derivedByInsertionFrom(c1, c, {(k1, v1)})       
+  derivedByInsertionFrom(c2, c1, {(k2, v2)})    
  </pre>
      From this set of assertions, we conclude:
    <pre class="codeexample">
@@ -234,12 +312,12 @@
  
  <div class="anexample">
  <pre class="codeexample">
-   entity(c, [prov:type="emptyCollection"])    // c is an empty collection
+   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
    entity(v1)
    entity(v2)
-   entity(c1, [prov:type="collection"])    
-   entity(c2, [prov:type="collection"])    
-   entity(c3, [prov:type="collection"])    
+   entity(c1, [prov:type="Collection"])    
+   entity(c2, [prov:type="Collection"])    
+   entity(c3, [prov:type="Collection"])    
  
    derivedByInsertionFrom(c1, c, {(k1, v1)})       
    wasDerivedFrom(c2, c1)                       
@@ -253,92 +331,8 @@
    </pre>
  </div>
 
-
-<div class='constraint' id='collection-unique-insertion'>
-<p>One cannot have multiple assertions regarding the state of a collection. Thus:</p>
-<pre class="codeexample">
-derivedByInsertionFrom(id1, c2, c1, k1, v1)
-derivedByInsertionFrom(id2, c2, c1, k2, v2)
-</pre>
-implies <span class="name">k1 = v1, k2 = v2, id1 = id2</span>. <p/>
-
-  Similarly for removal:
-<pre class="codeexample">
-derivedByRemovalFrom(c2, c1, k1)
-derivedByRemovalFrom(c2, c1, k2)
-</pre>
-implies <span class="name">k1 = k2</span>. <p/>
-  
-</div>
-
-<div class='constraint' id='collection-branching-derivations'>
-It is possible to have multiple derivations from a single root collection, as long as the resuting entities are distinct, as shown in the following example.
-
-<div class="anexample">
-<pre class="codeexample">
-  entity(c, [prov:type="EmptyCollection"])    // e is an empty collection
-  entity(v1)
-  entity(v2)
-  entity(v3)
-  entity(c1, [prov:type="Collection"])
-  entity(c2, [prov:type="Collection"])
-  entity(c3, [prov:type="Collection"])
-  
-  derivedByInsertionFrom(c1, c, k1, v1)      
-  derivedByInsertionFrom(c2, c, k2, v2)       
-  derivedByInsertionFrom(c3, c1, k3,v3)       
-</pre>
-    From this set of assertions, we conclude:
-  <pre class="codeexample">
-  c1 = { (k1,v1) }
-  c2 = { (k2 v2) }
-  c3 = { (k1,v1),  (k3,v3) }
-  </pre>
-
-</div>
-</div>
-
-
-<div class='constraint' id='collection-unique-ancestor'>
-A collection can only be derived from a single prior collection. Thus:
-<pre class="codeexample">
-entity(c1, [prov:type="Collection"])
-entity(c2, [prov:type="Collection"])
-entity(c, [prov:type="Collection"])
-
-derivedByInsertionFrom(c, c1, k1, v1)
-derivedByInsertionFrom(c, c2, k2, v2)
-</pre>
-implies  <span class="name">c1==c2</span>. <p/>
-
-<pre class="codeexample">
-derivedByRemovalFrom(c, c1, k1)
-derivedByRemovalFrom(c, c2, k2)
-</pre>
-implies <span class="name">c1 = c2, k1 = k2</span>. <p/>
-
-  This also applies to any combination of insertions and removals, for example:
-
-  <pre class="codeexample">
-derivedByInsertionFrom(c, c1, k1, v1)
-derivedByRemovalFrom(c, c2, k2)
-</pre>
-  implies  <span class="name">c1 = c2, k1 = k2</span>.
-  
-</div>
-
-<div class='constraint' id='collection-unique-value-for-key'>
-  Keys are unique within a collection. Thus:
-<pre class="codeexample">
-entity(c, [prov:type="Collection"])
-entity(c1, [prov:type="Collection"])
-
-derivedByInsertionFrom(c1, c, k, v1)
-derivedByInsertionFrom(c1, c, k, v2)
-</pre>
-implies  <span class="name">v1==v2</span>.
-</div>
-
+</section>
+ 
 
 </section>  <!-- end of collections -->
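
The collection-unique-derivation constraint introduced in this changeset can be illustrated with a small checker. This is only a minimal sketch under stated assumptions, not part of the changeset or of any PROV tooling: the `Derivation` record and the function `unique_derivation_violations` are hypothetical names, and statements that are identical in every field are assumed to be the "redundant" case the constraint text tolerates.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, List, NamedTuple


class Derivation(NamedTuple):
    """One derivedByInsertionFrom / derivedByRemovalFrom statement (hypothetical record)."""
    kind: str         # "insertion" or "removal"
    derived: str      # collection state being defined, e.g. "c"
    source: str       # collection state it is derived from, e.g. "c1"
    items: FrozenSet  # (key, value) pairs for an insertion, keys for a removal


def unique_derivation_violations(statements: Iterable[Derivation]) -> List[str]:
    """Return the collections whose state is defined by more than one distinct statement."""
    by_derived = defaultdict(set)
    for s in statements:
        # A set drops exact duplicates, which are assumed here to be merely redundant.
        by_derived[s.derived].add(s)
    return sorted(c for c, ds in by_derived.items() if len(ds) > 1)


# Allowed: branching from a single root, since each resulting entity is distinct.
branching = [
    Derivation("insertion", "c1", "c", frozenset({("k1", "v1")})),
    Derivation("insertion", "c2", "c", frozenset({("k2", "v2")})),
    Derivation("insertion", "c3", "c1", frozenset({("k3", "v3")})),
]

# Not allowed: c is defined by two insertions, one from c1 and one from c2.
double_insertion = [
    Derivation("insertion", "c", "c1", frozenset({("k1", "v1"), ("k2", "v2")})),
    Derivation("insertion", "c", "c2", frozenset({("k3", "v3")})),
]

# Not allowed: c is defined by a combination of an insertion and a removal.
mixed = [
    Derivation("insertion", "c", "c1", frozenset({("k1", "v1")})),
    Derivation("removal", "c", "c2", frozenset({"k2"})),
]

for name, stmts in [("branching", branching),
                    ("double_insertion", double_insertion),
                    ("mixed", mixed)]:
    print(name, unique_derivation_violations(stmts))
```

Grouping by the derived collection rather than by the source is what lets the branching example pass while both multi-derivation examples are flagged.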