* Weakened generation/inv/start/end constraints, and added simultaneity constraints.
authorJames Cheney <jcheney@inf.ed.ac.uk>
Fri, 10 Aug 2012 16:31:17 +0100
changeset 4328 c20c2419340a
parent 4327 15e3355292ba
child 4329 af7ffdd0f1d2
* Weakened generation/inv/start/end constraints, and added simultaneity constraints.
model/prov-constraints.html
--- a/model/prov-constraints.html	Fri Aug 10 15:44:29 2012 +0100
+++ b/model/prov-constraints.html	Fri Aug 10 16:31:17 2012 +0100
@@ -433,7 +433,7 @@
 form">normalize</a> PROV instances to forms that can easily be
 compared in order to determine whether two PROV instances are
 <a>equivalent</a>.  Validity and equivalence are also defined for PROV
-bundles (that is, named instances) and datasets (that is, a toplevel
+bundles (that is, named instances) and documents (that is, a toplevel
 instance together with zero or more bundles).</p>
 
 </section>
@@ -519,10 +519,10 @@
 
 <p>The PROV Data Model, PROV-DM, is a conceptual data model for provenance, which is
 realizable using different representations such as PROV-N and PROV-O.
-A PROV <dfn>dataset</dfn> is a set of PROV statements,
+A PROV <dfn>document</dfn> is a set of PROV statements,
 together with zero or more <a>bundles</a>, or named sets of
 statements.  For
-example, a PROV dataset could be a .provn document, the result
+example, a PROV document could be a .provn document, the result
 of a query, a triple store containing PROV statements in RDF, etc. The
 PROV-DM specification [[PROV-DM]] imposes minimal requirements upon
 PROV instances. A <a>valid</a> PROV instance corresponds to a
@@ -550,15 +550,15 @@
 impossibility constraints can be checked on the normal form to determine
 <a title="valid">validity</a>.  Equivalence of two PROV 
 instances can be determined by comparing their normal forms.  For PROV
-datasets, validity and equivalence amount to checking the validity or
-pairwise equivalence of their respective datasets.
+documents, validity and equivalence amount to checking the validity or
+pairwise equivalence of their respective documents.
 </p>
 <p>
 This document outlines an algorithm for validity checking based on
 <a title="normal form">normalization</a>.  Applications MAY implement
 validity and equivalence checking using normalization, as suggested
 here, or in
-any other way as long as the same instances or datasets are considered valid or
+any other way as long as the same instances or documents are considered valid or
 equivalent, respectively.
 </p>
 
@@ -568,9 +568,9 @@
 to support them at all.  
 Applications producing provenance SHOULD ensure that it is
 <a>valid</a>, and similarly applications consuming provenance MAY reject provenance that is not <a>valid</a>.  Applications
-which are determining whether PROV instances or datasets convey the same
+which are determining whether PROV instances or documents convey the same
 information SHOULD check equivalence as specified here, and SHOULD
-treat equivalent instances or datasets in the same way.
+treat equivalent instances or documents in the same way.
 </p>
 </section>
 <section>
@@ -1054,14 +1054,14 @@
   those obtained by the algorithm in this specification.
   </p>
 
-<h2>From Instances to Datasets</h2>
-
-<p>PROV datasets can contain multiple instances: a <a>toplevel
+<h2>From Instances to Documents</h2>
+
+<p>PROV documents can contain multiple instances: a <a>toplevel
 instance</a>, the set of statements not appearing within a bundle, and
 zero or more named instances called <a>bundle</a>s.  For the purpose
 of this document, these instances are treated independently: that is,
-a PROV dataset is valid provided that each instance in it is valid
-and the names of its bundles are distinct; a PROV dataset is
+a PROV document is valid provided that each instance in it is valid
+and the names of its bundles are distinct; a PROV document is
 equivalent to another if their toplevel instances are equivalent, they
 have the same number of bundles with the same names, and the instances
 of their corresponding bundles are equivalent.  As for blank nodes in
@@ -1069,7 +1069,7 @@
 level, so existential variables with the same name occurring in
 different instances do not necessarily denote the same term.  This
 is a consequence of the fact that the instances of two equivalent
-datasets only need to be  pairwise isomorphic; this is a weaker
+documents only need to be  pairwise isomorphic; this is a weaker
 property than requiring that there be a single isomorphism that works
 for all of the corresponding instances.
 </p>
@@ -1121,6 +1121,7 @@
 		<a class="rule-text" href="#generation-within-activity"><span>TBD</span></a><br>
 		<a class="rule-text" href="#generation-precedes-invalidation"><span>TBD</span></a><br>
 		<a class="rule-text" href="#generation-precedes-usage"><span>TBD</span></a><br>
+		<a class="rule-text" href="#generation-generation-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#derivation-usage-generation-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#derivation-generation-generation-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasStartedBy-ordering"><span>TBD</span></a><br>
@@ -1167,6 +1168,7 @@
 		<a class="rule-text" href="#usage-within-activity"><span>TBD</span></a><br>
 		<a class="rule-text" href="#generation-within-activity"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasInformedBy-ordering"><span>TBD</span></a><br>
+		<a class="rule-text" href="#start-start-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasStartedBy-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasAssociatedWith-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#impossible-property-overlap"><span>TBD</span></a><br>
@@ -1184,6 +1186,7 @@
 		<a class="rule-text" href="#usage-within-activity"><span>TBD</span></a><br>
 		<a class="rule-text" href="#generation-within-activity"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasInformedBy-ordering"><span>TBD</span></a><br>
+		<a class="rule-text" href="#end-end-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasEndedBy-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasAssociatedWith-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#impossible-property-overlap"><span>TBD</span></a><br>
@@ -1197,6 +1200,7 @@
 		<a class="rule-text" href="#unique-invalidation"><span>TBD</span></a><br>
 		<a class="rule-text" href="#generation-precedes-invalidation"><span>TBD</span></a><br>
 		<a class="rule-text" href="#usage-precedes-invalidation"><span>TBD</span></a><br>
+		<a class="rule-text" href="#invalidation-invalidation-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasStartedBy-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#wasEndedBy-ordering"><span>TBD</span></a><br>
 		<a class="rule-text" href="#specialization-invalidation-ordering"><span>TBD</span></a><br>
@@ -1358,13 +1362,13 @@
   <ol><li>When processing provenance, an
     application MAY apply the inferences and definitions in <a
     href="#inferences" class='sectionRef'></a>.</li>
-   <li>If determining whether a PROV instance or dataset is <a>valid</a>, an
+   <li>If determining whether a PROV instance or document is <a>valid</a>, an
     application MUST check that all of the
     constraints of <a href="#constraints" class="sectionRef"></a> are
-  satisfied  on the <a>normal form</a> of the instance or dataset.  </li>
+  satisfied  on the <a>normal form</a> of the instance or document.  </li>
    <li> If producing provenance meant for other applications to
     use, the application SHOULD produce <a>valid</a> provenance, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>. </li>
-    <li>If determining whether two PROV instances or datasets are
+    <li>If determining whether two PROV instances or documents are
   <a>equivalent</a>, an application MUST determine whether their
   normal forms are equal, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>.
   </ol>
@@ -2607,11 +2611,18 @@
 
 
 <div id='unique-generation_text'>
-<p>An entity has exactly one generation and
-invalidation event (either or both may, however, be left implicit).
-  Together with the key constraints above, this implies that
+<p> Entities may have multiple generation or invalidation events
+  (either or both may, however, be left implicit).  An entity can be
+  generated by more than one activity, with one generation event per
+  interaction.  These events must be simultaneous, as required by <a
+  class="rule-ref"
+  href="#generation-generation-ordering"><span>generation-generation-ordering</span></a>
+  and <a
+  class="rule-ref" href="#invalidation-invalidation-ordering"><span>invalidation-invalidation-ordering</span></a>.
+<!--  Together with the key constraints above, this implies that
   <span class="name">e</span> is also a key for generation and
   invalidation statements.
+  -->
 </p>
 </div>
 
@@ -2622,7 +2633,6 @@
 <span class='conditional'>THEN</span> <span class="name">gen1</span> = <span class="name">gen2</span>.</p>
 </div> 
 
-<hr>
 
 <div id='unique-invalidation_text'>
 
@@ -2634,13 +2644,14 @@
 
 
 <div class="remark"> <p> It follows from the above uniqueness and key
-  constraints that the generation and invalidation activities and
-  times of an entity are unique, if specified.  However, because we
+  constraints that the generation and invalidation events linking an
+  entity and activity are unique, if specified.  However, because we
   apply the constraints by merging, it is possible for a valid PROV instance
-to contain multiple statements about the same identifiers.
+to contain multiple statements about the same generation or
+  invalidation event, for example:
   <pre>
 wasGeneratedBy(id1; e,a,-,[prov:location="Paris"])
-wasGeneratedBy(-; e,-,-,[color="Red"])
+wasGeneratedBy(-; e,a,-,[color="Red"])
 </pre>
   When the uniqueness and key constraints are applied, the instance is
   <a title="normal form">normalized</a> to the following form:
@@ -2654,10 +2665,18 @@
 
 <hr />
 <p id='unique-wasStartedBy_text'>
-An activity has exactly one start and
-end event (either or both may, however, be left implicit).  Again,
-together with above key constraints these constraints imply that the
-activity is a key for activity start and end statements.
+An activity may have more than one start and
+end event, each having a different activity (either or both may,
+however, be left implicit).  However,
+the triggering entity linking any two activities in a start or end event is unique.
+That is, an activity may be started by
+several other activities, with shared or separate triggering
+entities.  If an activity is started or ended by multiple events, they must all
+be simultaneous, as specified in <a
+  class="rule-ref"
+href="#start-start-ordering"><span>start-start-ordering</span></a>
+and <a
+  class="rule-ref" href="#end-end-ordering"><span>end-end-ordering</span></a>.
 </p>
 
 <div class='constraint' id='unique-wasStartedBy'>
@@ -3064,7 +3083,7 @@
 </p>
 </div>
 
-<p>
+
 
 
 <hr />
@@ -3086,7 +3105,7 @@
 </p>
 </div>
 
-<p>
+
 <hr />
 
 <p id='usage-precedes-invalidation_text'>All usages of an entity precede its invalidation, which is captured by <a class="rule-ref" href="#usage-precedes-invalidation"><span>TBD</span></a> (without any explicit graphical representation).</p> 
@@ -3103,10 +3122,52 @@
 <span class="name">inv</span>.</p>
 </div>
 
-<p>
+<hr />
+
 
 <hr />
 
+<p id='generation-generation-ordering_text'>
+If an activity is generated by more than one activity, the events must all
+be simultaneous.  The following constraint requires that if there are two generation
+events that generate the same activity, then one <a>precedes</a> the
+other.  Using this constraint in both directions means that each event
+<a>precedes</a> the other.
+</p>
+<div class='constraint' id='generation-generation-ordering'>
+ <p>
+    <span class="conditional">IF</span>
+<span class="name">wasGeneratedBy(gen1; e,_a1,_t1,_attrs1)</span> 
+and
+<span class="name">wasGeneratedBy(gen2; a,_a2,_t2,_attrs2)</span> 
+<span class="conditional">THEN</span>
+<span class="name">gen1</span> 
+<a title="precedes">precedes</a>
+<span class="name">gen2</span>.
+  </p>
+</div>
+
+<hr />
+
+<p id='invalidation-invalidation-ordering_text'>
+If an activity is invalidated by more than one activity, the events must all
+be simultaneous.  The following constraint requires that if there are two start
+events that start the same activity, then one <a>precedes</a> the
+other.  Using this constraint in both directions means that each event
+<a>precedes</a> the other, that is, they are simultaneous.
+</p>
+<div class='constraint' id='invalidation-invalidation-ordering'>
+ <p>
+    <span class="conditional">IF</span>
+<span class="name">wasInvalidatedBy(inv1; a,_e2,_a1,_t1,_attrs1)</span> 
+and
+<span class="name">wasInvalidatedBy(inv2; a,_e2,_a2,_t2,_attrs2)</span> 
+<span class="conditional">THEN</span>
+<span class="name">inv1</span> 
+<a title="precedes">precedes</a>
+<span class="name">inv2</span>.
+  </p>
+</div>
 
 
 
@@ -3173,6 +3234,50 @@
 which implies ordering between the usage of <span class="name">e1</span> and
 generation of <span class="name">e2</span>.  </p>
 </div>
+
+<hr />
+
+<p id='start-start-ordering_text'>
+If an activity is started by more than one activity, the events must all
+be simultaneous.  The following constraint requires that if there are two start
+events that start the same activity, then one <a>precedes</a> the
+other.  Using this constraint in both directions means that each event
+<a>precedes</a> the other.
+</p>
+<div class='constraint' id='start-start-ordering'>
+ <p>
+    <span class="conditional">IF</span>
+<span class="name">wasStartedBy(start1; a,_e1,_a1,_t1,_attrs1)</span> 
+and
+<span class="name">wasStartedBy(start2; a,_e2,_a2,_t2,_attrs2)</span> 
+<span class="conditional">THEN</span>
+<span class="name">start1</span> 
+<a title="precedes">precedes</a>
+<span class="name">start2</span>.
+  </p>
+</div>
+
+<hr />
+
+<p id='end-end-ordering_text'>
+If an activity is ended by more than one activity, the events must all
+be simultaneous.  The following constraint requires that if there are two start
+events that start the same activity, then one <a>precedes</a> the
+other.  Using this constraint in both directions means that each event
+<a>precedes</a> the other, that is, they are simultaneous.
+</p>
+<div class='constraint' id='end-end-ordering'>
+ <p>
+    <span class="conditional">IF</span>
+<span class="name">wasEndedBy(end1; a,_e2,_a1,_t1,_attrs1)</span> 
+and
+<span class="name">wasEndedBy(end2; a,_e2,_a2,_t2,_attrs2)</span> 
+<span class="conditional">THEN</span>
+<span class="name">end1</span> 
+<a title="precedes">precedes</a>
+<span class="name">end2</span>.
+  </p>
+</div>
 <hr />
 
 <p id='wasStartedBy-ordering_text'>
@@ -3204,7 +3309,7 @@
   </li>
   </ol>
 </div>
-<p>
+
 <hr />
 
 <p id='wasEndedBy-ordering_text'> Similarly, the entity that triggered
@@ -3849,8 +3954,8 @@
 
   <p>We define the notions of <a title="normal form">normalization</a>, <a
 title="valid">validity</a> and
-<a title="equivalence">equivalence</a> of PROV datasets and instances.  We first define these concepts
-for PROV instances and then extend them to PROV datasets.</p>
+<a title="equivalence">equivalence</a> of PROV documents and instances.  We first define these concepts
+for PROV instances and then extend them to PROV documents.</p>
 
 <section id="instance">
   <h2>Instances</h2>
@@ -3984,21 +4089,21 @@
 </section>
 
 <section id="bundle-constraints">
-<h2>Bundles and Datasets</h2>
+<h2>Bundles and Documents</h2>
 
 
 <p>The definitions, inferences, and constraints, and
 the resulting notions of normalization, validity and equivalence,
-assume a PROV dataset that consists only of a <a>toplevel
+assume a PROV document that consists only of a <a>toplevel
 instance</a>, containing all PROV statements in the top level of the
-dataset (that is, not enclosed in a named <a>bundle</a>).  In this
+document (that is, not enclosed in a named <a>bundle</a>).  In this
 section, we describe how to deal with general PROV
-datasets, possibly including multiple named bundles.  Briefly, each bundle is
+documents, possibly including multiple named bundles.  Briefly, each bundle is
 handled independently; there is no interaction between bundles from
 the perspective of applying definitions, inferences, or constraints,
 computing normal forms, or checking validity or equivalence.</p>
 
-<p> We model a general PROV dataset, containing <span class="name">n</span> named bundles
+<p> We model a general PROV document, containing <span class="name">n</span> named bundles
 <span class="name">b<sub>1</sub>...b<sub>n</sub></span>, as a tuple
 <span
 class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>])</span>
@@ -4017,15 +4122,15 @@
 endBundle
 </pre>
 
-<p> The <a>normal form</a> of a PROV dataset
+<p> The <a>normal form</a> of a PROV document
 <span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,[b<sub>n</sub>=B<sub>n</sub>])</span> is <span class="name">(B'<sub>0</sub>,[b<sub>1</sub>=B'<sub>1</sub>,...,b<sub>n</sub>=B'<sub>n</sub>])</span>
 where <span class="name">B'<sub>i</sub></span> is the normal
 form of <span class="name">B<sub>i</sub></span> for each <span class="name">i</span> between 0 and <span class="name">n</span>.  </p>
 
-<p>A PROV dataset is <a>valid</a> if each of the bundles <span class="name">B<sub>0</sub></span>,
+<p>A PROV document is <a>valid</a> if each of the bundles <span class="name">B<sub>0</sub></span>,
 ..., <span class="name">B<sub>n</sub></span> are valid and none of the bundle identifiers <span class="name">b<sub>i</sub></span> are repeated.</p>
 
-<p>Two (valid) PROV datasets <span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>])</span> and
+<p>Two (valid) PROV documents <span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>])</span> and
 <span class="name">(B'<sub>0</sub>,[b<sub>1</sub>'=B'<sub>1</sub>,...,b'<sub>m</sub>=B'<sub>m</sub>])</span> are <a>equivalent</a> if <span class="name">B<sub>0</sub></span> is
 equivalent to <span class="name">B'<sub>0</sub></span> and <span class="name">n = m</span> and
 there exists a permutation <span class="name">P : {1..n} -> {1..n}</span> such that for each <span class="name">i</span>, <span class="name">b<sub>i</sub> =