update to collections
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Mon, 02 Apr 2012 21:54:37 +0100
changeset 2231 b0f6afda4187
parent 2228 37529d61e1b3
child 2232 bb6d96e96a45
model/prov-dm-constraints.html
--- a/model/prov-dm-constraints.html	Mon Apr 02 21:00:01 2012 +0100
+++ b/model/prov-dm-constraints.html	Mon Apr 02 21:54:37 2012 +0100
@@ -1612,92 +1612,69 @@
 <section id="collection-constraints">
 <h3>PROV-DM Collection Constraints</h3>
 
-<div class="note"> unique key constraint removed as it follows from the "update semantics" which is explained in the DM</div>
-
-
-It is desirable to restrict the derivation of one collection from another to one single insertion or removal relation, or to one membership relation.
-The interpretation of two (or more) insertion, removal, membership relations that result in the same collection is undefined.
-
-<!--
-The following examples illustrate what may happen when multiple derivations are allowed.
-
-<div class="anexample">
-  <pre class="codeexample">
-entity(c1, [prov:type="Collection"])
-entity(c2, [prov:type="Collection"])
-entity(c, [prov:type="Collection"])
-
-derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
-derivedByInsertionFrom(c, c2, {(k3, v3)})
-</pre>
-<span class="name">c</span> is the state of a collection that is obtained by (i) adding <span class="name">{(k1, v1), (k2, v2)}</span> to the collection in state <span class="name">c1</span>, and also (ii) adding <span class="name">{(k3, v3)}</span> to the collection in state <span class="name">c2</span>. Thus, c represents a collection with content  <span class="name">c1  union c1 union {(k1, v1), (k2, v2), (k3,v3}</span>.
-</pre>
+<p>Membership is a convenience notation, since it can be expressed in terms of an insertion into some collection. The membership definition is formalized by constraint <a href="#membership-as-insertion">membership-as-insertion</a>.</p>
+
+<div class='constraint' id='membership-as-insertion'>
+ <span class="name">memberOf(c, {(k1, v1), ...})</span> holds
+<span class='conditional'>if and only if</span> there exists a collection <span class="name">c0</span>, such that
+<span class="name">derivedByInsertionFrom(c, c0, {(k1, v1), ...})</span>.
 </div>
-  
-This interpretation, however, may lead to confusion. For example:
-<div class="anexample">
-<pre class="codeexample">
-entity(c1, [prov:type="Collection"])
-entity(c2, [prov:type="Collection"])
-entity(c, [prov:type="Collection"])
-
-derivedByInsertionFrom(c, c1, {(k1, v1)})
-derivedByRemovalFrom(c, c2, {k1})
-</pre>
-Here the insertion and removal into, and removal from <span class="name">c1</span> and <span class="name">c2</span> "cancel" each other. This is allowed if no constraint is enforced, however it is not meaningful.
-</div>
---> 
-<!--
-On the other hand, it is desirable to be able to express the fact that <span class="name">c</span> is obtained precisely as the result of <em>merging</em> <span class="name">c1</span> and <span class="name">c2</span>. <br/>
--->
-<!--
-This is achieved by adding a constraint to ensure that each derivation is unique, and (ii) making use of the <span class="name">merge(c,c1,c2)</span> to define the state <span class="name">c</span>  precisely as the union of the states <span class="name">c1</span> and <span class="name">c2</span>. -->
-The following constraint ensures unique derivation.
+
+<p>A collection may be obtained by insertion or removal, or may be said to satisfy the membership relation.
+To provide an interpretation of collections, PROV-DM
+ restricts each collection to a single derivation by insertion or removal, or to a single membership relation.
+PROV-DM does not provide an interpretation for descriptions that consist of two (or more) insertion, removal, or membership relations that result in the same collection.</p>
+
+
+
+<p>The following constraint ensures unique derivation.</p>
 
 
 <div class='constraint' id='collection-unique-derivation'>
-
-  <p>The state of a collection that is derived through multiple insertions, removal, or membership relations is undefined.
-
+A collection MUST NOT be derived through multiple insertion, removal, or membership relations.
 </div>
 
 <div class="anexample">
+Consider the following descriptions about three collections.
   <pre class="codeexample">
-entity(c1, [prov:type="Collection"])
-entity(c2, [prov:type="Collection"])
-entity(c, [prov:type="Collection"])
-
-derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
-derivedByInsertionFrom(c, c2, {(k3, v3)})
+entity(c1, [prov:type="prov:Collection"  %% xsd:QName])
+entity(c2, [prov:type="prov:Collection"  %% xsd:QName])
+entity(c3, [prov:type="prov:Collection"  %% xsd:QName])
+
+
+derivedByInsertionFrom(c3, c1, {("k1", e1), ("k2", e2)})
+derivedByInsertionFrom(c3, c2, {("k3", e3)})
 </pre>
-<p>is undefined (unless the two sets were identical, in which case one of the two statements would be redundant)</p>
+<p>There is no interpretation for such descriptions since <span class="name">c3</span> is derived multiple times by insertion.</p>
 </div>
 
-<p>As a particular case, the state of <span class="name">c</span> as derived multiple times from the same <span class="name">c1</span> is undefined. </p>
+
 <div class="anexample">
-  <pre class="codeexample">
-derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2)})
-derivedByInsertionFrom(id2, c, c1, {(k3, v3), (k4, v4)})
+<p>As a particular case, collection <span class="name">c</span> is derived multiple times from the same <span class="name">c1</span>. </p>
+<pre class="codeexample">
+derivedByInsertionFrom(id1, c, c1, {("k1", e1), ("k2", e2)})
+derivedByInsertionFrom(id2, c, c1, {("k3", e3), ("k4", e4)})
 </pre>
-<p>is undefined. </p>
-<p> The expected way to accomplish the effect intended with these statements, is as follows:</p>
+<p>The interpretation of such descriptions is also unspecified. </p>
+<p>To describe the insertion of the 4 key-entity pairs, one would instead write:</p>
 <pre class="codeexample">
-derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2), (k3, v3), (k4, v4)})
+derivedByInsertionFrom(id1, c, c1, {("k1", e1), ("k2", e2), ("k3", e3), ("k4", e4)})
 </pre>  
 </div>
 
 The same is true for any combination of insertions, removals, and membership relations:
 <div class="anexample">
+<p>The following descriptions</p>
 <pre class="codeexample">
-derivedByInsertionFrom(c, c1, {(k1, v1)})
-derivedByRemovalFrom(c, c2, {k2})
+derivedByInsertionFrom(c, c1, {("k1", e1)})
+derivedByRemovalFrom(c, c2, {"k2"})
 </pre>
-  is undefined.
+have no interpretation.
+Nor do the following:
 <pre class="codeexample">
-derivedByInsertionFrom(c, c1, {(k1, v1)})
-memberOf(c, c2, {k2})
+derivedByInsertionFrom(c, c1, {("k1", e1)})
+memberOf(c, {"k2"})
 </pre>
-  is undefined.
 </div>
 
 
@@ -1705,85 +1682,89 @@
 <!--
 <section id="Collection-branching">
 -->
-<h3>Collection branching.</h3>
-
-It is possible to have multiple derivations from a single root collection, as long as the resulting entities are distinct, as shown in the following example.
-
-<div class="anexample">
-<pre class="codeexample">
-  entity(c, [prov:type="EmptyCollection"])    // e is an empty collection
-  entity(v1)
-  entity(v2)
-  entity(v3)
-  entity(c1, [prov:type="Collection"])
-  entity(c2, [prov:type="Collection"])
-  entity(c3, [prov:type="Collection"])
-  
-  derivedByInsertionFrom(c1, c, {(k1, v1)})      
-  derivedByInsertionFrom(c2, c, {(k2, v2)})       
-  derivedByInsertionFrom(c3, c1, {(k3,v3)})       
-</pre>
-    From this set of assertions, we conclude:
-  <pre class="codeexample">
-  c1 = { (k1,v1) }
-  c2 = { (k2 v2) }
-  c3 = { (k1,v1),  (k3,v3) }
-  </pre>
-</div>
-  <!--
-</section>
--->
-  
-<!--
-  <section id="collections-derivation">
--->
-  
-<h3>State of collections and use of weaker <a href="#Derivation-Relation">derivation</a> relation</h3>
-
-<p>The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of assertions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those assertions. In general, all assertions reflect partial knowledge reagrding a sequence of data transformation events. In the particular case of collection evolution, in which some of the state changes may have been missed, the more generic  <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred, which cannot be precisely asserted as insertions or removals. The following two examples illustrate this.</p>
+<section id="collection-branching">
+<h4>Collection branching</h4>
+
+Multiple derivations from a single root collection are allowed, as long as the resulting entities are distinct, as shown in the following example.
 
 <div class="anexample">
 <pre class="codeexample">
-  entity(c, [prov:type="Collection"])    // c is a collection, possibly not empty
-  entity(v1)
-  entity(v2, [prov:type="Collection"])    // v2 is a collection
-
-  derivedByInsertionFrom(c1, c, {(k1, v1)})       
-  derivedByInsertionFrom(c2, c1, {(k2, v2)})    
- </pre>
-     From this set of assertions, we conclude:
-   <pre class="codeexample">
-    c1 includes (k1,v1) but may contain additional unknown pairs
-    c2 includes (k1,v1), (k2 v2) (and possibly more pairs), where v2 is a collection with unknown state
-   </pre>
- 
+entity(c0, [prov:type="prov:EmptyCollection" %% xsd:QName])    // c0 is an empty collection
+entity(c1, [prov:type="prov:Collection" %% xsd:QName])
+entity(c2, [prov:type="prov:Collection" %% xsd:QName])
+entity(c3, [prov:type="prov:Collection" %% xsd:QName])
+entity(e1)
+entity(e2)
+entity(e3)
+
+derivedByInsertionFrom(c1, c0, {("k1", e1)})      
+derivedByInsertionFrom(c2, c0, {("k2", e2)})       
+derivedByInsertionFrom(c3, c1, {("k3", e3)})       
+</pre>
+From this set of descriptions, we conclude:
+<pre class="codeexample">
+  c1 = { ("k1", e1) }
+  c2 = { ("k2", e2) }
+  c3 = { ("k1", e1), ("k3", e3) }
+</pre>
+</div>
+
+</section>
+
+  
+
+<section id="collections-and-derivation">
+
+  
+<h4>Collections and Weaker Derivation Relation</h4>
+
+<p>The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of descriptions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those descriptions. In general, all descriptions reflect partial knowledge regarding a sequence of data transformation events. In the particular case of collection evolution, in which some of the state changes may have been missed, the more generic  <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred, which cannot be expressed as insertions or removals. The following  example illustrates this.</p>
+
+<!--
+<div class="anexample">
+<pre class="codeexample">
+entity(c, [prov:type="prov:Collection" %% xsd:QName])    // c is a collection, possibly not empty
+entity(c1, [prov:type="prov:Collection" %% xsd:QName])    
+entity(c2, [prov:type="prov:Collection" %% xsd:QName])    
+entity(e1)
+entity(e2)
+
+derivedByInsertionFrom(c1, c,  {("k1", e1)})       
+derivedByInsertionFrom(c2, c1, {("k2", e2)})    
+</pre>
+From this set of descriptions, we conclude:
+<ul>
+<li> <span class="name">c1</span> includes <span class="name">("k1", e1)</span> but may contain additional unknown pairs
+<li> <span class="name">c2</span> includes <span class="name">("k1", e1), ("k2", e2)</span> (and possibly more pairs), where <span class="name">e2</span> is a collection with unknown state
+</ul>
  </div>
- 
-   In the example, the state of <span class="name">c2</span> is only partially known because the collection is constructed from partially known other collections.
+--> 
+
  
  <div class="anexample">
+In the following example, the state of <span class="name">c2</span> is only partially known because the precise updates that led from <span class="name">c1</span> to <span class="name">c2</span> are not described.
  <pre class="codeexample">
-   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
-   entity(v1)
-   entity(v2)
-   entity(c1, [prov:type="Collection"])    
-   entity(c2, [prov:type="Collection"])    
-   entity(c3, [prov:type="Collection"])    
- 
-   derivedByInsertionFrom(c1, c, {(k1, v1)})       
-   wasDerivedFrom(c2, c1)                       
-   derivedByInsertionFrom(c3, c2, {(k2, v2)})       
+entity(c0, [prov:type="prov:EmptyCollection" %% xsd:QName])    // c0 is an empty collection
+entity(c1, [prov:type="prov:Collection" %% xsd:QName])    
+entity(c2, [prov:type="prov:Collection" %% xsd:QName])    
+entity(c3, [prov:type="prov:Collection" %% xsd:QName])    
+entity(e1)
+entity(e2)
+
+derivedByInsertionFrom(c1, c0, {("k1", e1)})       
+wasDerivedFrom(c2, c1)                       
+derivedByInsertionFrom(c3, c2, {("k2", e2)})       
  </pre>
-     From this set of assertions, we conclude:
-   <pre class="codeexample">
-    c1 = { (k1,v1) }
-    c2 is somehow derived from c1, but the precise sequence of updates is unknown
-    c3  includes  (k2 v2) but the earlier "gap" leaves uncertainty regarding  (k1,v1) <br/>  (it may have been removed) or any other pair that may have been added as part of the derivation activities.
-   </pre>
+From this set of descriptions, we conclude:
+<ul>
+<li>    <span class="name">c1 = { ("k1", e1) }</span>
+<li>    <span class="name">c2</span> is somehow derived from <span class="name">c1</span>, but the precise sequence of updates is unknown
+<li>    <span class="name">c3</span>  includes  <span class="name">("k2", e2)</span> but the earlier "gap" leaves uncertainty regarding  <span class="name">("k1", e1)</span>  (it may have been removed) or any other pair that may have been added as part of the derivation activities.
+</ul>
  </div>
-<!--
+
 </section>
- -->
+
 
 </section>  <!-- end of collections -->
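
The constraint and the two examples added in this changeset can be checked mechanically. The following is only a minimal sketch in Python, not part of PROV-DM or of this changeset: the tuple encoding of descriptions and the names `unique_derivation_violations` and `known_state` are assumptions made for illustration. It flags collections that violate collection-unique-derivation, and reconstructs what is known about a collection by walking its derivation chain back toward an empty collection.

```python
from collections import defaultdict

# Hypothetical tuple encoding of PROV-DM descriptions (an assumption of this sketch):
#   ("insert",  target, source, {key: entity})   derivedByInsertionFrom
#   ("remove",  target, source, {key, ...})      derivedByRemovalFrom
#   ("member",  target, None,   {key: entity})   memberOf
#   ("derived", target, source, None)            wasDerivedFrom
SPECIFIC = ("insert", "remove", "member")


def unique_derivation_violations(descriptions):
    """List collections targeted by more than one insertion, removal, or membership."""
    counts = defaultdict(int)
    for kind, target, _source, _payload in descriptions:
        if kind in SPECIFIC:
            counts[target] += 1
    return sorted(name for name, n in counts.items() if n > 1)


def known_state(collection, descriptions, empty):
    """Return (pairs, complete): the pairs known to be in `collection`, and
    whether an unbroken chain of insertions/removals reaches an empty collection.
    Assumes the descriptions are acyclic and satisfy unique derivation."""
    if collection in empty:
        return {}, True
    for kind, target, source, payload in descriptions:
        if target != collection:
            continue
        if kind == "insert":
            base, complete = known_state(source, descriptions, empty)
            return {**base, **payload}, complete
        if kind == "remove":
            base, complete = known_state(source, descriptions, empty)
            return {k: v for k, v in base.items() if k not in payload}, complete
        if kind == "member":
            # Membership implies insertion into some unknown collection.
            return dict(payload), False
        if kind == "derived":
            # Generic derivation: earlier pairs may have been removed.
            return {}, False
    return {}, False


# The "collection branching" example: c0 is the empty collection.
branching = [
    ("insert", "c1", "c0", {"k1": "e1"}),
    ("insert", "c2", "c0", {"k2": "e2"}),
    ("insert", "c3", "c1", {"k3": "e3"}),
]

# The "weaker derivation" example: the wasDerivedFrom step leaves a gap.
with_gap = [
    ("insert", "c1", "c0", {"k1": "e1"}),
    ("derived", "c2", "c1", None),
    ("insert", "c3", "c2", {"k2": "e2"}),
]

# The first counter-example: c3 is derived twice by insertion.
double_insert = [
    ("insert", "c3", "c1", {"k1": "e1", "k2": "e2"}),
    ("insert", "c3", "c2", {"k3": "e3"}),
]

print(known_state("c3", branching, {"c0"}))
print(known_state("c3", with_gap, {"c0"}))
print(unique_derivation_violations(double_insert))
```

In this sketch the `complete` flag is what separates the two examples: for the branching descriptions the state of `c3` is fully reconstructed, whereas after the `wasDerivedFrom` gap only `("k2", e2)` is known to be in `c3`. Treating a generic derivation as "nothing is known about earlier pairs" is a deliberately conservative choice of the sketch, not something the text mandates.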