new collections doc for WD5
author Paolo Missier <pmissier@acm.org>
Fri, 09 Mar 2012 17:10:21 +0000
changeset 1859 249da9e101cb
parent 1858 32a85c0391fd
child 1860 668fd87cb727
model/prov-dm-constraints.html
model/prov-dm.html
model/working-copy/wd5-prov-dm-collections.html
--- a/model/prov-dm-constraints.html	Fri Mar 09 16:38:49 2012 +0000
+++ b/model/prov-dm-constraints.html	Fri Mar 09 17:10:21 2012 +0000
@@ -1535,7 +1535,6 @@
 consider an abbreviation for this: wasGeneratedBy*(ex:s_3,ex:a2).</p>
 </div>
 
-
 </section>
 
 
@@ -1546,7 +1545,7 @@
 <h3>PROV-DM Collection Constraints</h3>
 
 <div class='note'>
-Raw material taken from prov-dm3. Some further text required.
+Edited by PM --- TBC
 </div>
 
 
--- a/model/prov-dm.html	Fri Mar 09 16:38:49 2012 +0000
+++ b/model/prov-dm.html	Fri Mar 09 17:10:21 2012 +0000
@@ -1929,7 +1929,9 @@
 <h3>Collections</h3>
 
 <p><strong>Collection relations</strong> address the need to describe the evolution of entities that have a collection structure, that is, which may contain other entities. Specifically, this section exploits the built-in type for entities, called <a title="concept-collection">collection</a>, and two relations to describe the effect of adding elements to, and removing elements from, a collection entity.
-The intent of these relations and entity types is to capture the <em>history of changes that occurred to a collection</em>. </p>
+The intent of these relations and entity types is to capture the <em>history of changes that occurred to a collection</em>.
+Thus, a collection entity is an immutable representation of the state of a collection data structure following a sequence of insertion and deletion operations.
+</p>
 
 <p>A collection is an entity that has a logical internal structure consisting of key-value pairs, often referred to as a map.
 More precisely, the following entity types are introduced:
@@ -1965,7 +1967,7 @@
 </div>
 
 
-<p> A relation derivedByInsertionFrom<span class="withPn">, written <span class="pnExpression"> derivedByInsertionFrom(id, collAfter, collBefore, key, value, attrs)</span>,</span> contains:</p>
+<p> A Derivation-by-Insertion relation<span class="withPn">, written <span class="pnExpression"> derivedByInsertionFrom(id, collAfter, collBefore, key, value, attrs)</span>,</span> contains:</p>
 <ul>
 <li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
 <li><span class='attribute'>after</span>: an identifier for the collection <em>after</em> insertion; </li>
@@ -1975,7 +1977,7 @@
 <li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
 </ul>
 
-<p> A relation CollectionAfterDeletion, written <span class="pnExpression"> CollectionAfterDeletion(id, collAfter, collBefore, key, attrs)</span>, contains:</p>
+<p> A Derivation-by-Removal relation, written <span class="pnExpression"> derivedByRemovalFrom(id, collAfter, collBefore, key, attrs)</span>, contains:</p>
 <ul>
 <li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
 <li><span class='attribute'>after</span>: an identifier  for the collection  <em>after</em> the deletion; </li>
@@ -1984,12 +1986,26 @@
 <li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
 </ul>
 
-<!--
-<div class='note'>
-I propose to call them afterInsertion instead of CollectionAfterInsertion (likewise, for deletion).
-What about attributes and optional Id?
-</div>
--->
+<p>As a convenience, corresponding relations for <em>bulk operations</em> involving a set of key-value pairs are introduced, as follows.</p>
+
+<p> A Derivation-by-Bulk-Insertion relation<span class="withPn">, written <span class="pnExpression"> derivedByBulkInsertionFrom(id, collAfter, collBefore, key-value-set, attrs)</span>,</span> contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier for the collection <em>after</em> insertion; </li>
+<li><span class='attribute'>before</span>: an identifier for the collection <em>before</em> insertion;</li>
+<li><span class='attribute'>key-value-set</span>: a set of inserted key-value pairs, of the form {(key_1, value_1), ..., (key_n, value_n)}</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
+<p> A Derivation-by-Bulk-Removal relation, written <span class="pnExpression"> derivedByBulkRemovalFrom(id, collAfter, collBefore, key-set, attrs)</span>, contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier  for the collection  <em>after</em> the deletion; </li>
+<li><span class='attribute'>before</span>: an identifier  for the collection <em>before</em> the deletion;</li>
+<li><span class='attribute'>key-set</span>: a set of deleted keys, of the form {key_1,..., key_n}</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
 
 <p>Further considerations:</p>
 
@@ -2004,86 +2020,18 @@
 
 <li>The state of a collection (i.e., the set of key-value pairs it contains) at a given point in a sequence of operations is never stated explicitly. Rather, it can be obtained by querying the chain of derivations involving insertions and removals. Entity type <span class="name">emptyCollection</span> can be used in this context as it marks the start of a sequence of collection operations.</li>
 
-<li>Collections with a content may be generated as a result of an activity. This can be modelled in PROV by asserting insertions to reflect the state of the collection after generation.
-
-<div class='note'>
-TBC with an example
-</div>
-
-
-
-<!-- 
-  <li> One can have multiple assertions regarding the state of a collection following a <em>set</em> of insertions, for example:<br/>
-<span class="name">CollectionAfterInsertion(c2,c1, k1, v1)</span><br/>
-<span class="name">CollectionAfterInsertion(c2,c1, k2, v2)</span><br/>
-  <span class="name">...</span><br/>
-This is interpreted as <em>" <span class="name">c2</span> is the state that results from inserting  <span class="name">(k1, v1)</span>,  <span class="name">(k2, v2)</span> etc. into  <span class="name">c1</span>"</em></li></p>
-
-<li> It is possible to have multiple derivations from a single root collection, possibly by different asserters, as shown in the following example.
+<li>An activity may generate a new collection, complete with its content, without any visible insertion operations. Nevertheless, the construction of the collection can be abstracted as a sequence of insertions (or a single bulk insertion). Thus, if the content of the collection at the end of the generation is known, it can be stated using Derivation-by-Insertion or Derivation-by-Bulk-Insertion relations, starting from the empty collection (or, more generally, from a collection with unknown prior state).
 
 <div class="anexample">
 <pre class="codeexample">
-  entity(c, [prov:type="EmptyCollection"])    // e is an empty collection
-  entity(v1)
-  entity(v2)
-  entity(v3)
-  entity(c1, [prov:type="Collection"])
-  entity(c2, [prov:type="Collection"])
-  entity(c3, [prov:type="Collection"])
-  
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 = { (k1,v1) }
-  CollectionAfterInsertion(c2, c, k2, v2)       // c2 = { (k2 v2) }
-  CollectionAfterInsertion(c3, c1, k3,v3)       // c3 = { (k1,v1),  (k3,v3) }
+   entity(c0, [prov:type="EmptyCollection"])    // c0 is an empty collection
+   activity(a)
+   entity(c1, [prov:type="Collection"]) 
+   wasGeneratedBy(c1,a)  
+   derivedByBulkInsertionFrom(c1, c0, {("k1", v1), ("k2", v2)})       // c1 = { ("k1",v1), ("k2",v2) }
 </pre>
 </div>
 
-<div class='note'>Asserter not defined</div>
-</li></p>
-
-
-<li>Given the pair of assertions:
-
-<span class="name">CollectionAfterInsertion(c, c1, k1, v1)</span><br/>
-<span class="name">CollectionAfterInsertion(c, c2, k2, v2)</span><br/>
-
-it follows that <span class="name">c1==c2, k1==k2, v1==v2</span>, because one cannot have two different derivations for the same final collection state.</li></p>
-
-
-<li>Given the following set of insertions:<br/>
-
-<span class="name">CollectionAfterInsertion(c1, c, k, v1)</span><br/>
-<span class="name">CollectionAfterInsertion(c1, c, k, v2)</span><br/>
-
-it follows that  <span class="name">v1==v2</span>.</li></p>
-
-
-<li> The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of assertions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those assertions. In general, all assertions reflect the asserter's partial knowledge of a sequence of data transformation events. In the particular case of collection evolution, in which the asserter  <em>knows</em> that some of the state changes may have been missed, then the more generic  <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred, which cannot be precisely asserted as insertions or removals. The following two examples illustrate this.
-
-<div class="anexample">
-<pre class="codeexample">
-  entity(c, [prov:type="collection"])    // e is a collection, possibly not empty
-  entity(v1)
-  entity(v2, [prov:type="collection"])    // v2 is a collection
-
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 <em>includes</em> { (k1,v1) } but may contain additional unknown pairs
-  CollectionAfterInsertion(c2, c1, k2, v2)      // c2 includes { (k1,v1), (k2 v2) } where v2 is a collection with unknown state
-</pre>
-</div>
-  In the example, the state of <span class="name">c2</span> is only partially known because the collection is constructed from partially known other collections.
-
-<div class="anexample">
-<pre class="codeexample">
-  entity(c, [prov:type="emptyCollection"])    // e is an empty collection
-  entity(v1)
-  entity(v2)
-
-  CollectionAfterInsertion(c1, c, k1, v1)       // c1 = { (k1,v1) }
-  wasDerivedFrom(c2, c1)                        // the asserted knows that c2 is somehow derived from c1, but cannot assert the precise sequence of updates
-    CollectionAfterInsertion(c3, c2, k2, v2)       
-</pre>
-</div>
-Here  <span class="name">c3</span> includes <span class="name">{ (k2 v2) }</span> but the earlier "gap" leaves uncertainty regarding  <span class="name">(k1,v1)</span>  (it may have been removed) or any other pair that may have been added as part of the derivation activities.</li></p>
--->
 </ul>
 <div class='note'>Deleted further items. Some of them are constraints which belong to part 2.</div>
 
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/working-copy/wd5-prov-dm-collections.html	Fri Mar 09 17:10:21 2012 +0000
@@ -0,0 +1,423 @@
+<!DOCTYPE html>
+
+<html><head> 
+    <title>PROV-DM new version of Collections</title> 
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> 
+    <!-- 
+      === NOTA BENE ===
+      For the three scripts below, if your spec resides on dev.w3 you can check them
+      out in the same tree and use relative links so that they'll work offline,
+     -->
+<!-- PM -->
+    <style type="text/css">
+      .note { font-size:small; margin-left:50px }
+     </style>
+
+    <script src="http://dev.w3.org/2009/dap/ReSpec.js/js/respec.js" class="remove"></script> 
+    <script src="http://www.w3.org/2007/OWL/toggles.js" class="remove"></script> 
+    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js" class="remove"></script>
+
+    <script src="../glossary.js" class="remove"></script>
+
+    <script class="remove">
+      function updateGlossaryRefs() {
+        $('.glossary-ref').each(function(index) {
+          var ref=$(this).attr('ref');
+          var span=$(this).attr('withspan');
+          $('#'+ref+'.glossary').contents().clone().appendTo($(this));
+          $(this).attr('prov:hadOriginalSource',glossary_hg);
+          if (span) {
+            $(this).children('dfn').replaceWith(function(){return $('<span>').addClass('dfn').append($(this).contents())});
+          }
+        });
+      }
+      $(document).ready(function(){
+        // if glossary is in a string:
+        $('#glossary_div').html(glossary_string);
+        updateGlossaryRefs();
+      });
+
+    </script>
+
+    <script class="remove"> 
+      var addExtraReferences = function() {
+          for (var k in extraReferences)
+              berjon.biblio[k] = extraReferences[k];
+      };
+      var extraReferences = {
+        "CLOCK":
+         "Lamport, L. "+
+         "<a href=\"http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf\"><cite>Time, clocks, and the ordering of events in a distributed system</cite></a>. "+
+         "Communications of the ACM 21 (7): 558–565. 1978. "+
+         "URL: <a href=\"http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf\">http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf</a> " +
+         "DOI: doi:10.1145/359545.359563.",
+        "CSP":
+         "Hoare, C. A. R. "+
+         "<a href=\"http://www.usingcsp.com/cspbook.pdf\"><cite>Communicating Sequential Processes</cite></a>. "+
+         "Prentice-Hall. 1985. "+
+         "URL: <a href=\"http://www.usingcsp.com/cspbook.pdf\">http://www.usingcsp.com/cspbook.pdf</a>",
+        "Logic":
+          "W. E. Johnson "+
+          "<a href=\"http://www.ditext.com/johnson/intro-3.html\"><cite>Logic: Part III</cite></a>. "+
+          "1924. "+
+          "URL: <a href=\"http://www.ditext.com/johnson/intro-3.html\">http://www.ditext.com/johnson/intro-3.html</a>",
+        "PROV-SEM":
+          "James Cheney "+
+          "<a href=\"http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman\"><cite>Formal Semantics Strawman</cite></a>. "+
+          "2011, Work in progress. "+
+          "URL: <a href=\"http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman\">http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman</a>",
+
+        "PROV-PRIMER":
+          "Yolanda Gil and Simon Miles (eds.) Khalid Belhajjame, Helena Deus, Daniel Garijo, Graham Klyne, Paolo Missier, Stian Soiland-Reyes, and Stephan Zednik "+
+          "<a href=\"http://www.w3.org/TR/prov-primer/\"><cite>Prov Model Primer</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-primer/\">http://www.w3.org/TR/prov-primer/</a>",
+
+        "PROV-O":
+          "Satya Sahoo and Deborah McGuinness (eds.) Khalid Belhajjame, James Cheney, Daniel Garijo, Timothy Lebo, Stian Soiland-Reyes, and Stephan Zednik "+
+          "<a href=\"http://www.w3.org/TR/prov-o/\"><cite>Provenance Formal Model</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-o/\">http://www.w3.org/TR/prov-o/</a>",
+
+
+        "PROV-DM":
+          "Luc Moreau and Paolo Missier (eds.) ... "+
+          "<a href=\"http://www.w3.org/TR/prov-dm/\"><cite>PART 1: PROV-DM ...</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-dm/\">http://www.w3.org/TR/prov-dm/</a>",
+
+
+        "PROV-DM-CONSTRAINTS":
+          "Luc Moreau and Paolo Missier (eds.) ... "+
+          "<a href=\"http://www.w3.org/TR/prov-dm-constraints/\"><cite>PROV-DM Constraints</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-dm-constraints/\">http://www.w3.org/TR/prov-dm-constraints/</a>",
+
+        "PROV-ASN":
+          "Luc Moreau and Paolo Missier (eds.) ... "+
+          "<a href=\"http://www.w3.org/TR/prov-asn/\"><cite>PROV-ASN ....</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-asn/\">http://www.w3.org/TR/prov-asn/</a>",
+
+        "PROV-AQ":
+          "Graham Klyne and Paul Groth (eds.) Luc Moreau, Olaf Hartig, Yogesh Simmhan, James Meyers, Timothy Lebo, Khalid Belhajjame, and Simon Miles "+
+          "<a href=\"http://www.w3.org/TR/prov-aq/\"><cite>Provenance Access and Query</cite></a>. "+
+          "2011, Working Draft. "+
+          "URL: <a href=\"http://www.w3.org/TR/prov-aq/\">http://www.w3.org/TR/prov-aq/</a>",
+      };
+      var respecConfig = {
+          // specification status (e.g. WD, LCWD, NOTE, etc.). If in doubt use ED.
+          specStatus:           "ED",
+          
+          // the specification's short name, as in http://www.w3.org/TR/short-name/
+          shortName:            "prov-dm",
+ 
+          // if your specification has a subtitle that goes below the main
+          // formal title, define it here
+          subtitle   :  "Draft proposal (to be incorporated in Editor's Draft when approved)",
+
+ 
+          // if you wish the publication date to be other than today, set this
+          // publishDate:  "2011-10-18",
+ 
+          // if the specification's copyright date is a range of years, specify
+          // the start date here:
+          // copyrightStart: "2005"
+ 
+          // if there is a previously published draft, uncomment this and set its YYYY-MM-DD date
+          // and its maturity status
+          previousPublishDate:  "2012-02-02",
+          previousMaturity:  "WD",
+ 
+          // if there a publicly available Editor's Draft, this is the link
+          edDraftURI:           "http://dvcs.w3.org/hg/prov/raw-file/default/model/prov-dm.html",
+ 
+          // if this is a LCWD, uncomment and set the end of its review period
+          // lcEnd: "2009-08-05",
+ 
+          // if you want to have extra CSS, append them to this list
+          // it is recommended that the respec.css stylesheet be kept
+          extraCSS:             ["http://dev.w3.org/2009/dap/ReSpec.js/css/respec.css", "../extra.css"],
+ 
+          // editors, add as many as you like
+          // only "name" is required
+          editors:  [
+              { name: "Luc Moreau", url: "http://www.ecs.soton.ac.uk/~lavm/",
+                company: "University of Southampton" },
+              { name: "Paolo Missier", url: "http://www.cs.ncl.ac.uk/people/Paolo.Missier",
+                company: "Newcastle University" },
+          ],
+ 
+          // authors, add as many as you like. 
+          // This is optional, uncomment if you have authors as well as editors.
+          // only "name" is required. Same format as editors.
+ 
+          authors:  [
+              { name: "Khalid Belhajjame", url: "http://semanticweb.org/wiki/Khalid_Belhajjame",
+                company: "University of Manchester" },
+              { name: "Stephen Cresswell",
+                company: "legislation.gov.uk"},
+              { name: "Yolanda Gil",
+                company: "Invited Expert", url:"http://www.isi.edu/~gil/"},
+              { name: "Reza B'Far",
+                company: "Oracle Corporation" },
+              { name: "Paul Groth", url: "http://www.few.vu.nl/~pgroth/",
+                company: "VU University of Amsterdam" },
+              { name: "Graham Klyne",
+                company: "University of Oxford" },
+              { name: "Jim McCusker", url: "http://tw.rpi.edu/web/person/JamesMcCusker",
+                company: "Rensselaer Polytechnic Institute" },
+              { name: "Simon Miles", 
+                company: "Invited Expert", url:"http://www.inf.kcl.ac.uk/staff/simonm/" },
+              { name: "James Myers", url:"http://www.rpi.edu/research/ccni/",
+                company: "Rensselaer Polytechnic Institute"},
+              { name: "Satya Sahoo", url:"http://cci.case.edu/cci/index.php/Satya_Sahoo",
+                company: "Case Western Reserve University" },
+          ],
+          
+          // name of the WG
+          wg:           "Provenance Working Group",
+          
+          // URI of the public WG page
+          wgURI:        "http://www.w3.org/2011/prov/",
+          
+          // name (with the @w3c.org) of the public mailing to which comments are due
+          wgPublicList: "public-prov-wg",
+          
+          // URI of the patent status for this WG, for Rec-track documents
+          // !!!! IMPORTANT !!!!
+          // This is important for Rec-track documents, do not copy a patent URI from a random
+          // document unless you know what you're doing. If in doubt ask your friendly neighbourhood
+          // Team Contact.
+          wgPatentURI:  "http://www.w3.org/2004/01/pp-impl/46974/status",
+
+          // Add extraReferences to bibliography database
+          preProcess: [addExtraReferences],
+      };
+    </script> 
+  </head> 
+  <body> 
+
+    <section id="abstract">
+    </section> 
+
+
+
+
+<section id="term-original-source">
+
+<section id="term-Collection">
+<h3>Collections</h3>
+
+<p><strong>Collection relations</strong> address the need to describe the evolution of entities that have a collection structure, that is, which may contain other entities. Specifically, this section exploits the built-in type for entities, called <a title="concept-collection">collection</a>, and two relations to describe the effect of adding elements to, and removing elements from, a collection entity.
+The intent of these relations and entity types is to capture the <em>history of changes that occurred to a collection</em>.
+Thus, a collection entity is an immutable representation of the state of a collection data structure following a sequence of insertion and deletion operations.
+</p>
+
+<p>A collection is an entity that has a logical internal structure consisting of key-value pairs, often referred to as a map.
+More precisely, the following entity types are introduced:
+
+<ul>
+  <li><span class="name">Collection</span> denotes an entity of type collection, i.e. an entity that can participate in insertion and removal relations;</li>
+
+  <li><span class="name">EmptyCollection</span> denotes an empty collection.</li>
+</ul>
+
+The following relations relate a collection <span class="name">c1</span> with a collection <span class="name">c2</span> obtained by adding a pair to, or removing a pair from, <span class="name">c1</span>:
+
+<ul>
+  <li>Derivation-by-Insertion relation <span class="name">derivedByInsertionFrom(c2, c1, k, v)</span> states that  <span class="name">c2</span> is the state of the collection
+following the insertion of pair <span class="name">(k,v)</span> into collection  <span class="name">c1</span>;</li>
+
+<li>  Derivation-by-Removal relation <span class="name">derivedByRemovalFrom(c2,c1, k)</span> states that  <span class="name">c2</span> is  the  state of the collection following the removal of the pair corresponding to key  <span class="name">k</span> from  <span class="name">c1</span>.</li>
+
+</ul>
+
+<div class="anexample">
+<pre class="codeexample">
+   entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+   entity(v1)
+   entity(v2)
+   entity(c1, [prov:type="Collection"])
+   entity(c2, [prov:type="Collection"])
+   entity(c3, [prov:type="Collection"])
+  
+  derivedByInsertionFrom(c1, c, "k1", v1)       // c1 = { ("k1",v1) }
+  derivedByInsertionFrom(c2, c1, "k2", v2)      // c2 = { ("k1",v1), ("k2", v2) }
+  derivedByRemovalFrom(c3, c2, "k1")            // c3 = { ("k2",v2) }
+</pre>
+</div>
+
+
+<p> A Derivation-by-Insertion relation<span class="withPn">, written <span class="pnExpression"> derivedByInsertionFrom(id, collAfter, collBefore, key, value, attrs)</span>,</span> contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier for the collection <em>after</em> insertion; </li>
+<li><span class='attribute'>before</span>: an identifier for the collection <em>before</em> insertion;</li>
+<li><span class='attribute'>key</span>: the key that has been inserted</li>
+<li><span class='attribute'>value</span>: an identifier  for the value that has been inserted with the key.</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
+<p> A Derivation-by-Removal relation, written <span class="pnExpression"> derivedByRemovalFrom(id, collAfter, collBefore, key, attrs)</span>, contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier  for the collection  <em>after</em> the deletion; </li>
+<li><span class='attribute'>before</span>: an identifier  for the collection <em>before</em> the deletion;</li>
+<li><span class='attribute'>key</span>: the key corresponding to the (key, value) pair that has been deleted from the collection.</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
+<p>As a convenience, corresponding relations for <em>bulk operations</em> involving a set of key-value pairs are introduced, as follows.</p>
+
+<p> A Derivation-by-Bulk-Insertion relation<span class="withPn">, written <span class="pnExpression"> derivedByBulkInsertionFrom(id, collAfter, collBefore, key-value-set, attrs)</span>,</span> contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier for the collection <em>after</em> insertion; </li>
+<li><span class='attribute'>before</span>: an identifier for the collection <em>before</em> insertion;</li>
+<li><span class='attribute'>key-value-set</span>: a set of inserted key-value pairs, of the form {(key_1, value_1), ..., (key_n, value_n)}</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
+<p> A Derivation-by-Bulk-Removal relation, written <span class="pnExpression"> derivedByBulkRemovalFrom(id, collAfter, collBefore, key-set, attrs)</span>, contains:</p>
+<ul>
+<li><span class='attribute'>id</span>:  an OPTIONAL identifier identifying the relation;</li>
+<li><span class='attribute'>after</span>: an identifier  for the collection  <em>after</em> the deletion; </li>
+<li><span class='attribute'>before</span>: an identifier  for the collection <em>before</em> the deletion;</li>
+<li><span class='attribute'>key-set</span>: a set of deleted keys, of the form {key_1,..., key_n}</li>
+<li><span class='attribute'>attributes</span>: an OPTIONAL set of attribute-value pairs to further describe the properties of the relation.</li>
+</ul>
+
+
+<p>Further considerations:</p>
+
+<ul>
+  <li>The <strong>map</strong> collection type provides a generic indexing structure that can be used to model commonly used data structures, including associative lists (also known as "dictionaries" in some programming languages), relational tables, ordered lists, and more (the specification of such specialized structures in terms of key-value pairs is out of the scope of this document).</li>
+
+<li>Values are entities. This allows expressing nested collections, that is, collections whose values include entities of type collection.</li>
+
+<li>As the relation names suggest, insertion and removal relations are a particular case of <a href="#Derivation-Relation">derivation</a>.</li>
+
+<li>This representation of a collection's evolution makes no assumption regarding the underlying data structure used to store and manage collections. In particular, no assumptions are needed regarding the mutability of a data structure that is subject to updates. Entities, however, are immutable and this applies  to those entities that represent collections. This is reflected in the constraints listed in Part II.  </li>
+
+<li>The state of a collection (i.e., the set of key-value pairs it contains) at a given point in a sequence of operations is never stated explicitly. Rather, it can be obtained by querying the chain of derivations involving insertions and removals. Entity type <span class="name">EmptyCollection</span> can be used in this context as it marks the start of a sequence of collection operations.</li>
+
+<li>An activity may generate a new collection, complete with its content, without any visible insertion operations. Nevertheless, the construction of the collection can be abstracted as a sequence of insertions (or a single bulk insertion). Thus, if the content of the collection at the end of the generation is known, it can be stated using Derivation-by-Insertion or Derivation-by-Bulk-Insertion relations, starting from the empty collection (or, more generally, from a collection with unknown prior state).
+
+<div class="anexample">
+<pre class="codeexample">
+   entity(c0, [prov:type="EmptyCollection"])    // c0 is an empty collection
+   activity(a)
+   entity(c1, [prov:type="Collection"]) 
+   wasGeneratedBy(c1,a)  
+   derivedByBulkInsertionFrom(c1, c0, {("k1", v1), ("k2", v2)})       // c1 = { ("k1",v1), ("k2",v2) }
+</pre>
+</div>
+
+</ul>
+<div class='note'>Deleted further items. Some of them are constraints which belong to part 2.</div>
+
+
+</section>   <!-- end collections-->
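The remark above that a collection's state is never stated explicitly, but is obtained by querying the chain of insertions and removals, can be illustrated with a minimal Python sketch. The dictionary encoding of the assertions and the function name `reconstruct` are assumptions made for illustration only; they are not part of PROV-DM.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical encoding (not PROV-DM syntax):
#   after -> (kind, before, key, value), where kind is "insert" or "remove".
Derivation = Tuple[str, str, str, Optional[str]]


def reconstruct(coll: str,
                derivations: Dict[str, Derivation],
                empty: Set[str]) -> Dict[str, Optional[str]]:
    """Replay the derivation chain from an empty collection up to `coll`."""
    chain = []
    seen = set()
    current = coll
    # Walk backwards until a collection typed as empty is reached.
    while current not in empty:
        if current in seen or current not in derivations:
            raise ValueError(f"no derivation chain from an empty collection to {coll}")
        seen.add(current)
        step = derivations[current]
        chain.append(step)
        current = step[1]
    # Replay the operations in chronological order.
    state: Dict[str, Optional[str]] = {}
    for kind, _before, key, value in reversed(chain):
        if kind == "insert":
            state[key] = value
        else:
            state.pop(key, None)
    return state


# The three assertions of the first example in this section.
derivations = {
    "c1": ("insert", "c", "k1", "v1"),
    "c2": ("insert", "c1", "k2", "v2"),
    "c3": ("remove", "c2", "k1", None),
}
print(reconstruct("c3", derivations, {"c"}))
```

The sketch only succeeds when the chain reaches a collection known to be empty; any other case is reported as an error rather than guessed.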
+
+<section id="collection-constraints">
+<h3>PROV-DM Collection Constraints  [to go in part II]</h3>
+
+<div class='constraint' id='collection-unique-insertion'>
+<p>One cannot have multiple assertions regarding the state of a collection. Thus:</p>
+<pre class="codeexample">
+derivedByInsertionFrom(c2, c1, k1, v1)
+derivedByInsertionFrom(c2, c1, k2, v2)
+</pre>
+implies <span class="name">k1 = k2, v1 = v2</span>. <p/>
+
+  Similarly for removal:
+<pre class="codeexample">
+derivedByRemovalFrom(c2, c1, k1)
+derivedByRemovalFrom(c2, c1, k2)
+</pre>
+implies <span class="name">k1 = k2</span>. <p/>
+  
+</div>
+
+<div class='constraint' id='collection-branching-derivations'>
+It is possible to have multiple derivations from a single root collection, as long as the resulting entities are distinct, as shown in the following example.
+
+<div class="anexample">
+<pre class="codeexample">
+  entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+  entity(v1)
+  entity(v2)
+  entity(v3)
+  entity(c1, [prov:type="Collection"])
+  entity(c2, [prov:type="Collection"])
+  entity(c3, [prov:type="Collection"])
+  
+  derivedByInsertionFrom(c1, c, k1, v1)       // c1 = { (k1,v1) }
+  derivedByInsertionFrom(c2, c, k2, v2)       // c2 = { (k2,v2) }
+  derivedByInsertionFrom(c3, c1, k3, v3)      // c3 = { (k1,v1), (k3,v3) }
+</pre>
+</div>
+</div>
+
+
+<div class='constraint' id='collection-unique-ancestor'>
+A collection can only be derived from a single prior collection. Thus:
+<pre class="codeexample">
+derivedByInsertionFrom(c, c1, k1, v1)
+derivedByInsertionFrom(c, c2, k2, v2)
+</pre>
+implies  <span class="name">c1==c2</span>. <p/>
+
+And:
+<pre class="codeexample">
+derivedByRemovalFrom(c, c1, k1)
+derivedByRemovalFrom(c, c2, k2)
+</pre>
+implies <span class="name">c1==c2</span>. <p/>
+
+ This also applies to any combination of insertions and removals.
+</div>
+
+<div class='constraint' id='collection-unique-value-for-key'>
+  Keys are unique within a collection. Thus:
+<pre class="codeexample">
+derivedByInsertionFrom(c1, c, k, v1)
+derivedByInsertionFrom(c1, c, k, v2)
+</pre>
+implies  <span class="name">v1==v2</span>.
+</div>
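As a rough illustration of how the insertion constraints above might be checked over a set of assertions, here is a minimal Python sketch. The tuple encoding `(after, before, key, value)` and the function name `violations` are hypothetical and chosen for this example; the sketch covers insertions only and assumes that two insertions yielding the same collection must agree on the prior collection, the key and the value.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of derivedByInsertionFrom(after, before, key, value).
Insertion = Tuple[str, str, str, str]


def violations(insertions: List[Insertion]) -> List[str]:
    """Report insertions that disagree about the same resulting collection."""
    problems: List[str] = []
    first_seen: Dict[str, Insertion] = {}
    for ins in insertions:
        after, before, key, value = ins
        first = first_seen.setdefault(after, ins)
        if first is ins:
            continue  # first assertion about this collection
        _, before0, key0, value0 = first
        if before != before0:
            # unique ancestor: one prior collection only
            problems.append(f"{after}: two prior collections ({before0}, {before})")
        elif key != key0:
            # unique insertion: same before/after implies the same key
            problems.append(f"{after}: two inserted keys ({key0}, {key})")
        elif value != value0:
            # unique value for key
            problems.append(f"{after}: key {key} has two values ({value0}, {value})")
    return problems


# Branching from a single root is allowed because c1, c2, c3 are distinct.
branching = [("c1", "c", "k1", "v1"), ("c2", "c", "k2", "v2"), ("c3", "c1", "k3", "v3")]
print(violations(branching))

# Two values for the same key in the same derivation are reported.
print(violations([("c1", "c", "k", "v1"), ("c1", "c", "k", "v2")]))
```

Repeating an identical assertion is not treated as a violation, since it adds no conflicting information.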
+
+<h3>Use of weaker <a href="#Derivation-Relation">derivation</a> relation</h3>
+
+<p>The state of a collection is only known to the extent that a chain of derivations starting from an empty collection can be found. Since a set of assertions regarding a collection's evolution may be incomplete, so is the reconstructed state obtained by querying those assertions. In general, all assertions reflect the asserter's partial knowledge of a sequence of data transformation events. In the particular case of collection evolution, when the asserter <em>knows</em> that some of the state changes may have been missed, the more generic <a href="#Derivation-Relation">derivation</a> relation should be used to signal that some updates may have occurred which cannot be precisely asserted as insertions or removals. The following two examples illustrate this.</p>
+
+<div class="anexample">
+<pre class="codeexample">
+  entity(c, [prov:type="Collection"])    // c is a collection, possibly not empty
+  entity(v1)
+  entity(v2, [prov:type="Collection"])    // v2 is a collection
+
+  derivedByInsertionFrom(c1, c, k1, v1)       // c1 <em>includes</em> { (k1,v1) } but may contain additional unknown pairs
+  derivedByInsertionFrom(c2, c1, k2, v2)      // c2 includes { (k1,v1), (k2,v2) } where v2 is a collection with unknown state
+</pre>
+</div>
+  In the example, the state of <span class="name">c2</span> is only partially known because the collection is constructed from other collections whose state is itself only partially known.
+
+<div class="anexample">
+<pre class="codeexample">
+  entity(c, [prov:type="EmptyCollection"])    // c is an empty collection
+  entity(v1)
+  entity(v2)
+
+  derivedByInsertionFrom(c1, c, k1, v1)       // c1 = { (k1,v1) }
+  wasDerivedFrom(c2, c1)                      // the asserter knows that c2 is somehow derived from c1, but cannot assert the precise sequence of updates
+  derivedByInsertionFrom(c3, c2, k2, v2)      // c3 includes { (k2,v2) }
+</pre>
+
+<p>Here <span class="name">c3</span> includes <span class="name">{ (k2,v2) }</span> but the earlier "gap" leaves uncertainty regarding <span class="name">(k1,v1)</span> (it may have been removed) or any other pair that may have been added as part of the derivation activities.</p>
+</div>
+
+</section>
+
+
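The effect of a "gap" introduced by a plain derivation can be sketched in Python as follows. This is an illustrative sketch under assumptions: the encoding of assertions, the `"derived"` marker for wasDerivedFrom and the function name `known_state` are hypothetical. The function returns the pairs that can be asserted with certainty together with a flag saying whether the state is completely known.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical encoding (not PROV-DM syntax):
#   after -> (kind, before, key, value)
# kind is "insert", "remove", or "derived" (a plain wasDerivedFrom, details unknown).
Step = Tuple[str, str, Optional[str], Optional[str]]


def known_state(coll: str,
                steps: Dict[str, Step],
                empty: Set[str]) -> Tuple[Dict[str, Optional[str]], bool]:
    """Return (pairs known to be in `coll`, whether the state is fully known)."""
    chain = []
    seen = set()
    current = coll
    complete = True
    while current not in empty:
        step = steps.get(current)
        # A missing step, a cycle, or a plain derivation ends certain knowledge.
        if step is None or current in seen or step[0] == "derived":
            complete = False
            break
        seen.add(current)
        chain.append(step)
        current = step[1]
    known: Dict[str, Optional[str]] = {}
    for kind, _before, key, value in reversed(chain):
        if kind == "insert":
            known[key] = value
        else:
            known.pop(key, None)
    return known, complete


# Second example above: c2 is derived from c1 by unknown updates.
steps = {
    "c1": ("insert", "c", "k1", "v1"),
    "c2": ("derived", "c1", None, None),
    "c3": ("insert", "c2", "k2", "v2"),
}
print(known_state("c3", steps, {"c"}))
print(known_state("c1", steps, {"c"}))
```

Only operations asserted after the gap contribute to the known pairs of `c3`; everything before the gap, including `(k1, v1)`, is left undetermined, which is why the flag is false in that case.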
+