update from khalid and jun - dropbox + merge
author Tim L <lebot@rpi.edu>
Wed, 28 Mar 2012 11:26:04 -0400
changeset 2063 da7970e98d7e
parent 2062 f14c485bb220 (current diff)
parent 2061 404af3d11cb2 (diff)
child 2064 9eeccafd7f95
child 2065 e8f88f2fb9fc
child 2071 f56bdd5c03a7
--- a/model/glossary.html	Wed Mar 28 11:05:09 2012 -0400
+++ b/model/glossary.html	Wed Mar 28 11:26:04 2012 -0400
@@ -39,7 +39,7 @@
 </span>
 
 <span class="glossary" id="glossary-collection">  
-A <dfn id="concept-collection">collection</dfn> is an entity that provides a structure to some constituents, which are themselves entities. These constituents are said to be <dfn>part of</dfn> the collections. 
+A <dfn id="concept-collection">collection</dfn> is an entity that provides a structure to some constituents, which are themselves entities. These constituents are said to be <dfn>member of</dfn> the collections. 
 </span>
 
 <span class="glossary" id="glossary-account">  
--- a/model/glossary.js	Wed Mar 28 11:05:09 2012 -0400
+++ b/model/glossary.js	Wed Mar 28 11:26:04 2012 -0400
@@ -46,7 +46,7 @@
 '</span> ' + 
 ' ' + 
 '<span class="glossary" id="glossary-collection">   ' + 
-'A <dfn id="concept-collection">collection</dfn> is an entity that provides a structure to some constituents, which are themselves entities. These constituents are said to be <dfn>part of</dfn> the collections.  ' + 
+'A <dfn id="concept-collection">collection</dfn> is an entity that provides a structure to some constituents, which are themselves entities. These constituents are said to be <dfn>member of</dfn> the collections.  ' + 
 '</span> ' + 
 ' ' + 
 '<span class="glossary" id="glossary-account">   ' + 
--- a/model/prov-dm-constraints.html	Wed Mar 28 11:05:09 2012 -0400
+++ b/model/prov-dm-constraints.html	Wed Mar 28 11:26:04 2012 -0400
@@ -1538,6 +1538,15 @@
 </section>
 
 
+<section id="collection-constraints">
+<h3>PROV-DM Collection Constraints</h3>
+
+<div class='note'>
+Edited by PM --- TBC
+</div>
+
+
+
 <div class='constraint' id='collection-parallel-insertions'>
 <p>One can have multiple assertions regarding the state of a collection following a <em>set</em> of insertions, for example:</p>
 <pre class="codeexample">
--- a/model/prov-dm.html	Wed Mar 28 11:05:09 2012 -0400
+++ b/model/prov-dm.html	Wed Mar 28 11:26:04 2012 -0400
@@ -337,7 +337,7 @@
 </ol>
 
 
-<p>This specification intentionally presents the key concepts of the PROV Data Model, without drilling down into all its subtleties.  Using these key concepts, it becomes possible to write useful provenance assertions very quickly, and publish or embed them along side the data they relate to. </p>
+<p>This specification intentionally presents the key concepts of the PROV Data Model, without drilling down into all its subtleties.  Using these key concepts, it becomes possible to write useful provenance descriptions very quickly, and publish or embed them alongside the data they relate to. </p>
 
 <p>However, if data changes, then it is challenging to express its provenance precisely, like it would be for any other form of metadata. To address this challenge, a <em>refinement</em> is proposed to enrich simple provenance, with extra-descriptions that  help qualify the specific subject of provenance and provenance itself, with attributes and temporal interval, intended to satisfy a comprehensive set of constraints.  These aspects are covered in the companion specification [[PROV-DM-CONSTRAINTS]].
 </p>
@@ -871,7 +871,7 @@
 <li> Several persons are associated with activity <span class="name">ex:edit1</span>, some in an editorial role, some in a contributor's role.</li>
 </ul>
 
-<p>Again, we paraphrase some PROV-DM assertions, and illustrate them with the PROV-N notation.
+<p>Again, we paraphrase some PROV-DM descriptions, and illustrate them with the PROV-N notation.
 Full details of the provenance record can be found <a href="examples/w3c-publication3.pn">here</a>.</p>
 
 <ul>
@@ -1019,10 +1019,10 @@
 <tr class="component3-color"><td><a>Traceability</a></td><td><a title="tracedTo">tracedTo(id,e2,e1,attrs)</a></td></tr>
 <tr class="component4-color"><td><a>Alternate</a></td><td><a title="alternateOf">alternateOf(alt1, alt2)</a></td></tr>
 <tr class="component4-color"><td><a>Specialization</a></td><td><a title="specializationOf">specializationOf(sub, super)</a></td></tr>
-<tr class="component5-color"><td><a>Collection</a></td><td></td></tr>
-<tr class="component5-color"><td><a>Insertion</a></td><td>derivedByInsertionFrom(id, collAfter, collBefore, key, value, attrs)</td></tr>
-<tr class="component5-color"><td><a>Removal</a></td><td>derivedByRemovalFrom(id, collAfter, collBefore, key, attrs)</td></tr>
-<tr class="component5-color"><td><a>Membership</a></td><td>contained(id, coll, key, values, attrs)</td></tr>
+<tr class="component5-color"><td><a>Collection</a></td><td><a>Collection</a></td></tr>
+<tr class="component5-color"><td><a>Insertion</a></td><td><a title="derivedByInsertionFrom">derivedByInsertionFrom(id, c2, c1, {(key_1, e_1), ..., (key_n, e_n)}, attrs)</a></td></tr>
+<tr class="component5-color"><td><a>Removal</a></td><td><a title="derivedByRemovalFrom">derivedByRemovalFrom(id, c2, c1, {key_1, ... key_n}, attrs)</a></td></tr>
+<tr class="component5-color"><td><a>Membership</a></td><td><a title="memberOf">memberOf(c, {(key_1, e_1), ..., (key_n, e_n)})</a></td></tr>
 <tr class="component6-color"><td><a>Note</a></td><td><a title="note">note(id, [ attr1=val1, ...])</a></td></tr>
 <tr class="component6-color"><td><a>Annotation</a></td><td><a title="hasAnnotation">hasAnnotation(r,n)</a></td></tr>
 </table>
@@ -1261,7 +1261,7 @@
 </div>
 
 
-<p>The relations wasStartedBy and used are orthogonal, and thus need to be asserted independently, according to the situation being described.</p>
+<p>The relations wasStartedBy and used are orthogonal, and thus need to be expressed independently, according to the situation being described.</p>
 
 </section>
 
@@ -1930,7 +1930,7 @@
 <h3>Component 5: Collections</h3>
 
 <p>The fifth component of PROV-DM is concerned with the notion of collections. 
-A collection is an entity that has some parts. The parts are themselves entities, and therefore their provenance can be expressed. In many applications, it is also of interest to be able to express the provenance of the collection  itself: e.g. who maintains the collection, which part it contains at which point in time, and how it was assembled. The purpose of Component 5 is to define the types and relations that are useful to express the provenance of collections. </p>
+A collection is an entity that has some members. The members are themselves entities, and therefore their provenance can be expressed. In many applications, it is also of interest to be able to express the provenance of the collection  itself: e.g. who maintains the collection, which members it contains at which point in time, and how it was assembled. The purpose of Component 5 is to define the types and relations that are useful to express the provenance of collections. </p>
 
 <p>Figure <a href="#figure-component5">figure-component5</a> overviews
 the component, which consists of two "UML Class" and three associations.
@@ -1946,7 +1946,7 @@
 
 
 <p>The intent of these relations and types is to express the <em>history of changes that occurred to a collection</em>. 
-Changes to collections are about the insertion of parts to collections and the removal of parts from collections.
+Changes to collections are about the insertion of entities into collections and the removal of members from collections.
 Indirectly, such history provides a way to reconstruct, the contents of a collection.</p>
 
 <section id="term-collection">
@@ -1957,7 +1957,7 @@
 
 <p>Conceptually, a collection has a logical structure consisting of key-entity pairs. This structure is often referred to as a <em>map</em>, and is a generic indexing mechanisms that can abstract commonly used data structures, including associative lists (also known as "dictionaries" in some programming languages), relational tables, ordered lists, and more (the specification of such specialized structures in terms of key-value pairs is out of the scope of this document).</p>
 
-<p>A given collection forms a given structure for its  parts.  A different structure (obtained either by insertion or removal parts) constitutes a different collection. Hence,
+<p>A given collection forms a given structure for its members.  A different structure (obtained either by insertion or removal of members) constitutes a different collection. Hence,
  for the purpose of provenance, a collection entity is viewed as a snapshot of a structure. Insertion and removal operations result in new snapshots, each snapshot forming an identifiable collection entity.</p>
 
 
@@ -1996,56 +1996,70 @@
 
 
 
+
+
 <p><div class="attributes" id="attributes-derivedByInsertionFrom">
-A <dfn title="derivedByInsertionFrom">Derivation-by-Insertion</dfn> relation <span class="name">derivedByInsertionFrom(id, c2, c1,  {(key_1, e_1), ..., (key_n, e_n)})</span> states that  <span class="name">c2</span> is the state of the collection
-following the insertion of pairs <span class="name">(key_1, e_1)</span>, ..., <span class="name">(key_n, e_n)</span> into collection  <span class="name">c1</span>, with the provision that each <span class="name">key_i</span> is unique.
-
-<p> A Derivation-by-Insertion relation<span class="withPn">, written <span class="pnExpression"> derivedByInsertionFrom(id, c2, c1, key-value-set, attrs)</span>,</span> contains:</p>
+A <dfn title="derivedByInsertionFrom">Derivation-by-Insertion</dfn> relation<span class="withPn">, written <span class="pnExpression">derivedByInsertionFrom(id, c2, c1, {(key_1, e_1), ..., (key_n, e_n)}, attrs)</span>,</span> contains:</p>
 <ul>
 <li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
 <li><span class='attribute'>after</span>: an identifier (<span class="name">c2</span>) for the collection <em>after</em> insertion; </li>
 <li><span class='attribute'>before</span>: an identifier (<span class="name">c1</span>) for the collection <em>before</em> insertion;</li>
-<li><span class='attribute'>key-value-set</span>: the inserted key-value pairs, of the form {(key_1, e_1), ..., (key_n, e_n)} where each key_i is a value, and e_i is an identifier  for the value that has been inserted with the key. This may be an entity identifier;</li>
-<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+<li><span class='attribute'>key-entity-set</span>: the inserted key-entity pairs <span class="name">(key_1, e_1)</span>, ..., <span class="name">(key_n, e_n)</span> in which each <span class="name">key_i</span> is a <a>value</a>, and <span class="name">e_i</span> is an identifier  for the entity that has been inserted with the key;
+ each <span class="name">key_i</span> is expected to be unique for the key-entity-set;
+</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set (<span class="name">attrs</span>) of attribute-value pairs to further describe the properties of the relation.</li>
 </ul>
+</div>
+
+<p>
+A Derivation-by-Insertion relation <span class="name">derivedByInsertionFrom(id, c2, c1,  {(key_1, e_1), ..., (key_n, e_n)})</span> states that  <span class="name">c2</span> is the state of the collection
+following the insertion of pairs <span class="name">(key_1, e_1)</span>, ..., <span class="name">(key_n, e_n)</span> into collection  <span class="name">c1</span>.</p>
+
+
+
+
 
 
 <div class="anexample">
 <pre class="codeexample">
-   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
-   entity(v1)
-   entity(v2)
-   entity(c1, [prov:type="Collection"])
-   entity(c2, [prov:type="Collection"])
+   entity(c, [prov:type="EmptyCollection" %% xsd:QName])    // c is an empty collection
+   entity(e1)
+   entity(e2)
+   entity(e3)
+   entity(c1, [prov:type="Collection" %% xsd:QName])
+   entity(c2, [prov:type="Collection" %% xsd:QName])
   
-  derivedByInsertionFrom(c1, c, {("k1", v1), ("k2",v2)})       
-  derivedByInsertionFrom(c2, c1, {("k3", v3)})    
+  derivedByInsertionFrom(c1, c,  {("k1", e1), ("k2", e2)})       
+  derivedByInsertionFrom(c2, c1, {("k3", e3)})    
 </pre>
-  From this set of assertions, we conclude:
-  <pre class="codeexample">
-   c =  {  }
-   c1 = { ("k1",v1),("k2",v2) }
-   c2 =  { ("k1",v1),("k2",v2), ("k3", v3) }
+From this set of descriptions, we conclude:
+<pre class="codeexample">
+   c  = {  }
+   c1 = { ("k1", e1), ("k2", e2) }
+   c2 = { ("k1", e1), ("k2", e2), ("k3", e3) }
   </pre>
 </div>
 
+<p>Insertion provides an "update semantics" for the keys that are already present in the collection, as illustrated by the following example. </p>
+
 <div class="anexample">
 <pre class="codeexample">
-   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
-   entity(v1)
-   entity(v2)
-   entity(c1, [prov:type="Collection"])
-   entity(c2, [prov:type="Collection"])
+   entity(c, [prov:type="EmptyCollection" %% xsd:QName])    // c is an empty collection
+   entity(e1)
+   entity(e2)
+   entity(e3)
+   entity(c1, [prov:type="Collection" %% xsd:QName])
+   entity(c2, [prov:type="Collection" %% xsd:QName])
   
-  derivedByInsertionFrom(c1, c, {("k1", v1), ("k2",v2)})       
-  derivedByInsertionFrom(c2, c1, {(<strong>"k1"</strong>, v3)})    
+  derivedByInsertionFrom(c1, c,  {("k1", e1), ("k2", e2)})       
+  derivedByInsertionFrom(c2, c1, {("k1", e3)})    
 </pre>
-   This is a case of <strong>update</strong> of v1 to v3 for the same key, "k1". <br/>
-  From this set of assertions, we conclude:
+   This is a case of <em>update</em> of <span class="name">e1</span> to <span class="name">e3</span> for the same key, <span class="name">"k1"</span>. <br/>
+  From this set of descriptions, we conclude:
   <pre class="codeexample">
    c =  {  }
-   c1 = { ("k1",v1),("k2",v2) }
-   c2 =  { ("k1",v3),("k2",v2) }
+   c1 = { ("k1", e1), ("k2", e2) }
+   c2 = { ("k1", e3), ("k2", e2) }
   </pre>
 </div>
 
@@ -2058,36 +2072,41 @@
 <span class="glossary-ref" data-ref="glossary-removal"></span>
 
 
-<p><strong>Derivation-by-Removal</strong> relation <span class="name">derivedByRemovalFrom(id, c2,c1, {key_1, ... key_n})</span> states that  <span class="name">c2</span> is  the  state of the collection following the removal of the set of pairs corresponding to keys  <span class="name">key_1...key_n</span> from  <span class="name">c1</span>.
-
-<p> A Derivation-by-Removal relation, written <span class="pnExpression"> derivedByRemovalFrom(id, collAfter, collBefore, key-set, attrs)</span>, contains:</p>
+
+
+<p>
+<div class="attributes" id="attributes-derivedByRemovalFrom">
+ A <dfn title="derivedByRemovalFrom">Derivation-by-Removal</dfn> relation, written <span class="pnExpression">derivedByRemovalFrom(id, c2, c1, {key_1, ... key_n}, attrs)</span>, contains:</p>
 <ul>
 <li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
-<li><span class='attribute'>after</span>: an identifier  for the collection  <em>after</em> the deletion; </li>
-<li><span class='attribute'>before</span>: an identifier  for the collection <em>before</em> the deletion;</li>
-<li><span class='attribute'>key-set</span>: a set of deleted keys, of the form {key_1,..., key_n};</li>
-<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+<li><span class='attribute'>after</span>: an identifier (<span class="name">c2</span>) for the collection  <em>after</em> the deletion; </li>
+<li><span class='attribute'>before</span>: an identifier (<span class="name">c1</span>)  for the collection <em>before</em> the deletion;</li>
+<li><span class='attribute'>key-set</span>: a set of deleted keys  <span class="name">key_1</span>, ..., <span class="name">key_n</span>, for which each <span class="name">key_i</span> is a <a>value</a>;</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set (<span class="name">attrs</span>) of attribute-value pairs to further describe the properties of the relation.</li>
 </ul>
-
+</div>
+
+<p>A Derivation-by-Removal relation <span class="name">derivedByRemovalFrom(id, c2, c1, {key_1, ... key_n})</span> states that <span class="name">c2</span> is the state of the collection following the removal of the set of pairs corresponding to keys <span class="name">key_1, ..., key_n</span> from <span class="name">c1</span>.</p>
 
 <div class="anexample">
 <pre class="codeexample">
-   entity(c, [prov:type="EmptyCollection"])    // e is an empty collection
-   entity(v1)
-   entity(v2)
+   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+   entity(e1)
+   entity(e2)
+   entity(e3)
    entity(c1, [prov:type="Collection"])
    entity(c2, [prov:type="Collection"])
 
-  derivedByInsertionFrom(c1, c, {("k1", v1), ("k2",v2)})       
-  derivedByInsertionFrom(c2, c1, {("k3", v3)})
-  derivedByRemovalFrom(c3, c2, {k1,k3})   
+  derivedByInsertionFrom(c1, c, {("k1", e1), ("k2",e2)})       
+  derivedByInsertionFrom(c2, c1, {("k3", e3)})
+  derivedByRemovalFrom(c3, c2, {"k1", "k3"})   
 </pre>
-  From this set of assertions, we conclude:
+From this set of descriptions, we conclude:
   <pre class="codeexample">
    c =  {  }
-   c1 = { ("k1",v1), ("k2", v2)  }
-   c2 = { ("k1",v1), ("k2", v2), ("k3", v3) }
-   c3 = { ("k2",v2) }
+   c1 = { ("k1", e1), ("k2", e2)  }
+   c2 = { ("k1", e1), ("k2", e2), ("k3", e3) }
+   c3 = { ("k2", e2) }
   </pre>
 
   
@@ -2103,17 +2122,21 @@
 <span class="glossary-ref" data-ref="glossary-membership"></span>
 
 <p>
-The insertion and removal relations make insertions and removals explicit as part of the history of a collection. This, however, requires explicit mention of the state of the collection prior to each insertion. The membership relation removes this needs, allowing the state of a collection <span class="name">c</span> to be asserted without having to introduce a prior state. This allows for the natural expression of a collection state, for instance in cases where a program or workflow block produces a new collection <span class="name">c</span>  with known content. In such cases, 
-<span class="name">memberOf(id, c,{(key_1, e_1), ..., (key_n, e_n)})</span> asserts that  <span class="name">c</span> is known to include <span class="name">{(key_1, e_1), ..., (key_n, e_n)}</span>, without having to introduce an initial state. <br/>
-
-<p> A <strong>Membership</strong> relation, written <span class="pnExpression"> memberOf(id, coll, key-value-set, attrs)</span>, contains:</p>
+The insertion and removal relations make insertions and removals explicit as part of the history of a collection. This, however, requires explicit mention of the state of the collection prior to each operation. The membership relation removes this need, allowing the state of a collection <span class="name">c</span> to be expressed without having to introduce a prior state.</p>
+
+<p>
+<div class="attributes" id="attributes-memberOf">
+ A <dfn title="memberOf">membership</dfn> relation, written <span class="pnExpression">memberOf(id, c, {(key_1, e_1), ..., (key_n, e_n)}, attrs)</span>, contains:
 <ul>
 <li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
-<li><span class='attribute'>after</span>: an identifier  for the collection whose members are asserted; </li>
-<li><span class='attribute'>key-value-set</span>: a set of key-value pairs that are members of the collection, of the form {(key_1, e_1), ..., (key_n, e_n)}</li>
-<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
-
+<li><span class='attribute'>after</span>: an identifier (<span class="name">c</span>) for the collection whose members are asserted; </li>
+<li><span class='attribute'>key-entity-set</span>: a set of key-entity pairs <span class="name">(key_1, e_1)</span>, ..., <span class="name">(key_n, e_n)</span> that are members of the collection;</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set (<span class="name">attrs</span>) of attribute-value pairs to further describe the properties of the relation.</li>
 </ul>
+</div>
+
+<p>The description <span class="name">memberOf(c, {(key_1, e_1), ..., (key_n, e_n)})</span> states that <span class="name">c</span> is known to include <span class="name">(key_1, e_1)</span>, ..., <span class="name">(key_n, e_n)</span>, without having to introduce an initial state. <br/>
+
 
 
 <div class="anexample">
@@ -2122,28 +2145,34 @@
    activity(a)
    wasGeneratedBy(c,a)   // a produced c
   
-   entity(v1)
-   entity(v2)
-   memberOf(c, {("k1", v1), ("k2", v2)} )  
+   entity(e1)
+   entity(e2)
+   memberOf(c, {("k1", e1), ("k2", e2)} )  
   
-   entity(v3)
+   entity(e3)
    entity(c1, [prov:type="Collection"])
   
-   derivedByInsertionFrom(c1, c, {("k3", v3)})     
+   derivedByInsertionFrom(c1, c, {("k3", e3)})     
 </pre>
   From this set of assertions, we conclude:
   <pre class="codeexample">
-   c  contains   ("k1", v1), ("k2", v2) 
-   c1 contains   ("k1", v1), ("k2", v2), ("k3", v3) 
+   c  contains   ("k1", e1), ("k2", e2) 
+   c1 contains   ("k1", e1), ("k2", e2), ("k3", e3) 
   </pre>
  Note that the state of <span class="name">c1</span> with these relations is only partially known, because the initial state of <span class="name">c</span> is unknown.
+</div>
+
+<!-- To go to part 2
+
 
   Note that one cannot have, at the same time, an empty collection and membership relations for it, i.e., the following example is invalid:
 <pre class="codeexample">
   <span class="name"> entity(c, [prov:type="EmptyCollection"])</span>
-   memberOf(c, {("k1", v1), ("k2", v2)} )  
+   memberOf(c, {("k1", e1), ("k2", e2)} )  
   </pre>
-</div>
+
+
+-->
 
 </section>  <!-- Membership -->
 
@@ -2152,16 +2181,11 @@
 <p>Further considerations: </p>
 
 <ul>
-<li>In Key-Value pairs, Keys are <a href="#term-value">values</a>, and Values are entities. This allows expressing nested collections, that is, collections whose values include entities of type collection.</li>
-
-<li>As the relation names suggest, insertion and removal relations are a particular case of <a href="#Derivation-Relation">derivation</a>.</li>
-
-
-
-<li>The state of a collection (i.e., the set of key-value pairs it contains) at a given point in a sequence of operations is never stated explicitly. Rather, it can be obtained by querying the chain of derivations involving insertions and removals. Entity type <span class="name">emptyCollection</span> can be used in this context as it marks the start of a sequence of collection operations.</li>
-
-
-<li>The representation of a collection through these relations, makes no assumption regarding the underlying data structure used to store and manage collections. In particular, no assumptions are needed regarding the mutability of a data structure that is subject to updates. Entities, however, are immutable and this applies  to those entities that represent collections. This is reflected in the constraints listed in Part II.  </li>
+
+<li>The state of a collection (i.e., the set of key-entity pairs it contains) at a given point in a sequence of operations is never stated explicitly. Rather, it can be obtained by querying the chain of derivations involving insertions and removals. Entity type <span class="name">EmptyCollection</span> can be used in this context as it marks the start of a sequence of collection operations.</li>
+
+
+<li>The representation of a collection through these relations makes no assumption regarding the underlying data structure used to store and manage collections. In particular, no assumptions are needed regarding the mutability of a data structure that is subject to updates. Entities, however, are immutable and this applies  to those entities that represent collections. This is reflected in the constraints listed in Part II.  </li>
 </ul>
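
The snapshot semantics described in this changeset (each insertion or removal yields a new collection entity, and insertion updates keys that are already present) can be illustrated with a minimal Python sketch. This is not part of PROV-DM or of any PROV tooling: the function names `derived_by_insertion` and `derived_by_removal`, and the use of plain dictionaries mapping keys to entity identifiers, are assumptions made for illustration only. The sketch also assumes that removing a key that is absent has no effect, which the text above does not specify.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a collection snapshot is a dict from key to
# entity identifier. Snapshots are never mutated, mirroring the statement that
# entities representing collections are immutable.
Collection = Dict[str, str]


def derived_by_insertion(before: Collection,
                         pairs: Iterable[Tuple[str, str]]) -> Collection:
    """Return the snapshot that follows insertion of key-entity pairs."""
    after = dict(before)          # copy, so the 'before' snapshot is preserved
    for key, entity in pairs:
        after[key] = entity       # "update semantics" for keys already present
    return after


def derived_by_removal(before: Collection, keys: Iterable[str]) -> Collection:
    """Return the snapshot that follows removal of the given keys."""
    removed = set(keys)
    # Assumption: keys that are not present are silently ignored.
    return {k: e for k, e in before.items() if k not in removed}


# Replay of the examples in the diff above.
c: Collection = {}                                         # EmptyCollection
c1 = derived_by_insertion(c, [("k1", "e1"), ("k2", "e2")])
c2 = derived_by_insertion(c1, [("k3", "e3")])
c3 = derived_by_removal(c2, ["k1", "k3"])

# Update case: inserting under an existing key replaces its entity.
c2_update = derived_by_insertion(c1, [("k1", "e3")])
```

Replaying the chain of derivations from `c` in this way corresponds to "querying the chain of derivations" to obtain a collection's state. Starting instead from a collection described only with `memberOf` would give a partially known state, which this sketch does not model.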