updated figure 3
authorLuc Moreau <l.moreau@ecs.soton.ac.uk>
Wed, 08 Aug 2012 12:23:55 +0100
changeset 4307 2d79b8a7c3aa
parent 4306 df3e192734d7 (current diff)
parent 4305 3bfb87c8369a (diff)
child 4309 6528ab8410f7
updated figure 3
--- a/model/comments/issue-459-paul.txt	Wed Aug 08 12:23:39 2012 +0100
+++ b/model/comments/issue-459-paul.txt	Wed Aug 08 12:23:55 2012 +0100
@@ -23,6 +23,8 @@
 I made this appendix non-normative. This allows us to do it later.
 
 @James: OK?
+-- Yes, I'd stated appendices are non-normative in the compliance, but
+doesn't hurt to restate.  -- JRC.
 
 
    > 
--- a/model/prov-constraints.html	Wed Aug 08 12:23:39 2012 +0100
+++ b/model/prov-constraints.html	Wed Aug 08 12:23:55 2012 +0100
@@ -527,16 +527,16 @@
 reasoning can be safely applied. By default, PROV instances need not
 be <a>valid</a>.  </p>
 
-<p> This document specifies <em>inferences</em> over PROV instances
-that applications MAY employ, including <em>definitions</em> of some
-provenance statements in terms of others, and also defines a class of
+<p> This document specifies <em>definitions</em> of some
+provenance statements in terms of others, <em>inferences</em> over PROV instances
+that applications MAY employ, and also defines a class of
 <a>valid</a> PROV instances by specifying <em>constraints</em> that
 <a>valid</a> PROV instances must satisfy. There are four kinds of
 constraints: <em>uniqueness constraints</em>, <em>event ordering
 constraints</em>, <em>impossibility constraints</em>, and <em>type
 constraints</em>.
 Further discussion
-of the semantics of PROV statements, which justifies the inferences
+of the semantics of PROV statements, which justifies the definitions, inferences
 and constraints, can be found in the formal semantics [[PROV-SEM]].
 </p>
 
@@ -694,12 +694,12 @@
 attributes; this leads to potential ambiguity, which is mitigated through the
 use of identifiers.</p>
 
-<p>An <a>activity</a> is delimited by its <a title="activity start
+<p>An <a>activity</a>'s lifetime is delimited by its <a title="activity start
 event">start</a> and its <a title="activity end event">end</a>
 events.  It occurs over
 an interval delimited by two <a title="instantaneous event">instantaneous
 events</a>. However, an activity statement need not mention start or end time information, because they may not be known.
-An activity's attribute-value pairs are expected to describe the activity's situation during its interval, i.e. an interval between two instantaneous events, namely its <a title="activity start event">start</a> event and its <a title="activity end event">end</a> event.
+An activity's attribute-value pairs are expected to describe the activity's situation during its lifetime.
 </p>
 
 
@@ -783,6 +783,9 @@
 </figure> <!-- <b>new Figure 1:</b>  -->
 </div>
 
+<div class="note">
+  James to add prose
+  </div>
 
 
 </section>
@@ -1103,23 +1106,49 @@
 </section>
 
 
-
 <section id="inferences">
-<h2>Inferences and Definitions</h2>
+<h2>Definitions and Inferences</h2>
 <p>
-This section  describes <a title="inference">inferences</a> and <a title="definition">definitions</a> that MAY be used on
+This section  describes <a title="definition">definitions</a> and <a title="inference">inferences</a> that MAY be used on
   provenance data, and preserve <a>equivalence</a> on <a>valid</a>
 PROV instances (as detailed in <a href="#normalization-validity-equivalence" class="sectionRef"></a>).
-An  <dfn id="inference">inference</dfn> is a rule that can be applied
-  to PROV instances to add new PROV statements.  A <dfn
+A <dfn
   id="definition">definition</dfn> is a rule that can be applied to
-  PROV instances to replace defined expressions with definitions.  A definition states that a
+  PROV instances to replace defined expressions with definitions. An  <dfn id="inference">inference</dfn> is a rule that can be applied
+  to PROV instances to add new PROV statements.  A definition states that a
   provenance statement is equivalent to some other statements, whereas
   an inference only states one direction of an implication; thus,
   defined provenance statements can be replaced by their definitions.
 </p>
 
-<p> Inferences have the following general form:</p>
+
+<p> Definitions have the following general form:</p>
+
+<div class='definition-example' id='definition-example'>
+<p>
+  <span class="name">defined_stmt</span> <span
+  class='conditional'>IF AND ONLY IF</span>
+  there exists <span class="name">a<sub>1</sub></span>,..., <span
+  class="name">a<sub>m</sub></span> such that <span
+  class="name">defining_stmt<sub>1</sub></span> and  ... and <span class="name">defining_stmt<sub>n</sub></span>.</p>
+  </div>
+ 
+  <p>
+  A definition can be applied to a PROV instance, since its <span class="name">defined_stmt</span> is defined in
+  terms of other statements.  Applying a
+  definition to an instance means that if an occurrence of a defined
+  provenance statement  <span class="name">defined_stmt</span>
+  can be found in a PROV instance, then we can remove it and add all of the statements
+<span class="name">defining_stmt<sub>1</sub></span> ... <span class="name">defining_stmt<sub>n</sub></span> to the instance, possibly after generating fresh
+  identifiers <span class="name">a<sub>1</sub></span>,...,<span
+  class="name">a<sub>m</sub></span> for existential variables.  In
+  other words, it is safe to replace
+  a defined statement with
+  its definition.  
+</p>
+
+
+  <p> Inferences have the following general form:</p>
 <div class='inference-example' id='inference-example'>
 <p>
   <span class='conditional'>IF</span> <span class="name">hyp<sub>1</sub></span> and ... and
@@ -1141,7 +1170,7 @@
   class="name">a<sub>m</sub></span> for the existential variables.  These fresh
   identifiers might later be found to be equal to known identifiers;
   they play a similar role in PROV constraints to existential
-  variables in logic, "labeled nulls" in database theory
+  variables in logic, to "labeled nulls" in database theory
   [[DBCONSTRAINTS]],  or to blank nodes in [[!RDF]].  In general, omitted optional parameters to
   [[PROV-N]] statements, or explicit <span class="name">-</span>
   markers, are placeholders for existentially quantified variables;
@@ -1149,31 +1178,6 @@
   this general rule, which are specified in <a class="rule-ref"
   href="#optional-placeholders"><span>TBD</span></a>.</p>
 
-<p> Definitions have the following general form:</p>
-
-<div class='definition-example' id='definition-example'>
-<p>
-  <span class="name">defined_exp</span> <span
-  class='conditional'>IF AND ONLY IF</span>
-  there exists <span class="name">a<sub>1</sub></span>,..., <span
-  class="name">a<sub>m</sub></span> such that <span
-  class="name">defining_exp<sub>1</sub></span> and  ... and <span class="name">defining_exp<sub>n</sub></span>.</p>
-  </div>
- 
-  <p>
-  A definition can be applied to a PROV instance, since its <span class="name">defined_exp</span> is defined in
-  terms of other statements.  Applying a
-  definition to an instance means that if an occurrence of a defined
-  provenance statement  <span class="name">defined_exp</span>
-  can be found in a PROV instance, then we can remove it and add all of the statements
-<span class="name">defining_exp<sub>1</sub></span> ... <span class="name">defining_exp<sub>n</sub></span> to the instance, possibly after generating fresh
-  identifiers <span class="name">a<sub>1</sub></span>,...,<span
-  class="name">a<sub>m</sub></span> for existential variables.  In
-  other words, it is safe to replace
-  a defined statement with
-  its definition.  
-</p>
-  
 <p> Definitions and inferences can be viewed as logical formulas;
   similar formalisms are often used in rule-based reasoning [[CHR]]
   and in databases [[DBCONSTRAINTS]].  In particular, the identifiers
@@ -1185,7 +1189,7 @@
   nodes in RDF.  In contrast, distinct URIs or literal values in PROV
   are assumed to be distinct for the purpose of checking validity or
   inferences.  This issue is discussed in more detail under <a
-   href="#uniqueness-constraints">Uniqueness Constraints</a> below.
+   href="#uniqueness-constraints">Uniqueness Constraints</a>.
   </p>
 
 <section>
@@ -1196,9 +1200,18 @@
   be omitted in [[PROV-O]] notation.  For the purpose of inference and
   validity checking, we generate special identifiers called
   <dfn>existential variables</dfn> denoting the unknown values.
-  Existential variables can be <em>substituted</em> with constant
-  identifiers, literals, the placeholder <span class="name">-</span>,
-  or other <a>existential variables</a>.  
+</p>
+<p>
+Existential variables can be <a title="substitution">substituted</a>
+  with other terms.  Specifically, a
+  <dfn>substitution</dfn> is a function from a set of existential
+  variables to identifiers, literals, the placeholder <span class="name">-</span>,
+  or other <a>existential variables</a>.  A substitution <span class="math">S</span> can be
+  applied to an instance <span class="math">I</span> by replacing all occurrences of existential
+  variables <span class="math">x</span> in the instance with <span class="math">S(x)</span>.  <div
+  class="note">More explanation may be needed</div>
+</p>
+<p>
 <a class="rule-ref" href="#optional-identifiers"><span>TBD</span></a>,
 <a class="rule-ref" href="#optional-attributes"><span>TBD</span></a>, and
 <a class="rule-ref" href="#definition-short-forms"><span>TBD</span></a>,
@@ -1495,9 +1508,9 @@
 
   
 
-<p id="communication-generation-use-inference_text">Communication between activities is <a
- title="definition">defined</a> as the existence of an underlying
-entity generated by one activity and used by the other.</p>
+<p id="communication-generation-use-inference_text">Communication between activities <a
+ title="definition">implies</a> the existence of an underlying
+entity generated by one activity and used by the other, and vice versa.</p>
 
 <div class='inference' id='communication-generation-use-inference'>
 <p>
@@ -1589,7 +1602,7 @@
 -->
 
 <p id="entity-generation-invalidation-inference_text">
-From an entity, we can infer the existence of
+From an entity statement, we can infer the existence of
 generation and invalidation events.
 </p>
 <div class='inference' id='entity-generation-invalidation-inference'>
@@ -1755,8 +1768,10 @@
 <div class='inference' id='revision-is-alternate-inference'>
 <p>
   <span class='conditional'>IF</span> <span
-  class="name">wasDerivedFrom(_id; e2,e1,[prov:type='prov:Revision'])</span>, <span class='conditional'>THEN</span> <span
-  class="name">alternateOf(e2,e1)</span>.
+  class="name">wasDerivedFrom(_id; e2,e1,_a,_g,_u,[prov:type='prov:Revision'])</span>, <span class='conditional'>THEN</span> <span
+  class="name">alternateOf(e2,e1)</span>.  The activity <span class="name">_a</span>,
+  generation <span class="name">_g</span>,
+  or use <span class="name">_u</span> may be the placeholder <span class="name">-</span>.
 </p>
 <!--
 <li><span class='conditional'>IF</span> <span
@@ -2050,7 +2065,7 @@
     <li> <em>event ordering constraints</em> that say that it
   should be possible to arrange the 
   events (generation, usage, invalidation, start, end) described in a
-  PROV instance into a preorder that corresponds to a sensible
+  PROV instance into a <a>preorder</a> that corresponds to a sensible
   "history" (for example, an entity should not be generated after it
   is used); and
     </li>
@@ -2242,9 +2257,10 @@
   <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
   the <span class="name">wasInvalidatedBy(id; e,a,t,attrs)</span> statement.
   </li>
-  <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
+<!--  <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
   the <span class="name">wasDerivedFrom(id;  e2, e1, attrs)</span> statement.
   </li>
+  -->
   <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
   the <span class="name">wasDerivedFrom(id;  e2, e1, a, g2, u1, attrs)</span> statement.
   </li>
@@ -2439,7 +2455,7 @@
 as or after <span class="name">e2</span>. Both relations are
 <a>preorder</a>s, meaning that they are <a>reflexive</a> and
 <a>transitive</a>.  Moreover, we sometimes consider <em>strict</em> forms of these
-orders: we say <span class="name">e1</span> strictly precedes <span
+orders: we say <span class="name">e1</span> <a>strictly precedes</a> <span
 class="name">e2</span> to indicate that <span class="name">e1</span>
 happened before <span class="name">e2</span>, but not at the same time.  This is a
 <a>transitive</a> relation. </p>
@@ -2808,7 +2824,7 @@
   id='derivation-generation-generation-ordering'>
 <p>
  <span class="conditional">IF</span>
-<span class="name">wasDerivedFrom(_d; e2,e1,attrs)</span>
+<span class="name">wasDerivedFrom(_d; e2,e1,-,-,-,attrs)</span>
   and
 <span class="name">wasGeneratedBy(gen1; e1,_a1,_t1,_attrs1)</span>
   and
@@ -3236,8 +3252,8 @@
   </p>
   <div class='constraint' id='impossible-object-property-overlap'>
   <p>
-For each <span class="name">p</span> in <span class="name">entity</span>, <span class="name">activity</span>
-   or <span class="name">agent</span> and for each  <span class="name">r</span> in { 
+For each <span class="name">p</span> in {<span class="name">entity</span>, <span class="name">activity</span>
+   or <span class="name">agent</span>} and for each  <span class="name">r</span> in { 
 <span class="name">used</span>,
 <span class="name">wasGeneratedBy</span>,
 <span class="name">wasInvalidatedBy</span>,
@@ -3264,8 +3280,8 @@
 <h2>Type Constraints</h2>
 
 <p id="typing_text">The following rule  establishes types denoted by identifiers from their use within expressions. 
-For this, the function <span class="name">typeOf</span> gives the set of types denoted by an identifier.
-For example,  <span class="name">typeOf(e)</span> returns the set of types associated with identifier  <span class="name">e</span>. The function <span class="name">typeOf</span> is not a term of PROV, but a construct introduced to validate PROV statements. 
+The function <span class="name">typeOf</span> gives the set of types denoted by an identifier.
+That is,  <span class="name">typeOf(e)</span> returns the set of types associated with identifier  <span class="name">e</span>. The function <span class="name">typeOf</span> is not a term of PROV, but a construct introduced to validate PROV statements. 
 </p>
 
 
@@ -3273,7 +3289,15 @@
 
 <p>
  For any identifier  <span class="name">id</span>,  <span class="name">typeOf(id)</span>  is a subset of {<span class="name">'entity'</span>, <span class="name">'activity'</span>, <span class="name">'agent'</span>, <span class="name">'prov:Collection'</span>, <span class="name">'prov:EmptyCollection'</span>}.
-For identifiers that do not have a type,  <span class="name">typeOf</span> gives the empty set.
+For identifiers that do not have a type,  <span
+ class="name">typeOf</span> gives the empty set.
+  Identifiers can have more than one type, because of subtyping
+ (e.g. <span class="name">'prov:EmptyCollection'</span> is a subtype of <span
+ class="name">'prov:Collection'</span>) or because certain types are not
+ disjoint (such as <span class="name">'agent'</span> and <span class="name">'entity'</span>). The set of types
+ does not reflect all of the distinctions among objects, only those
+ relevant for checking validity.  In particular, subtypes such as
+ <span class="name">'plan'</span> are omitted.
 </p>
 
 <p>To check if a PROV instance satisfies type constraints, one obtains the types of identifiers by application of
@@ -3288,7 +3312,7 @@
 <div class='constraint' id="typing">
 
 
-<ul>
+<ol>
 <li>
 <span class='conditional'>IF</span> 
    <span class='name'>wasGeneratedBy(gen; e,a,t,attrs)</span>  
@@ -3461,7 +3485,7 @@
 <span class="name">'prov:Collection' &isin; typeOf(c)</span>AND
 <span class="name">'prov:EmptyCollection' &isin; typeOf(c)</span>.
 
-</ul>
+</ol>
 </div>
 
 
@@ -3534,9 +3558,8 @@
   </div>
 
 <p> We define the <dfn>normal form</dfn> of a PROV instance as the set
-of provenance statements resulting from merging to resolve all
-uniqueness constraints in the instance and applying all possible
-inference rules to this set. </p>
+of provenance statements resulting from applying all definitions,
+  inferences, and uniqueness constraints.</p>
 
 
 
@@ -3548,7 +3571,7 @@
     </li>
   <li>
     Apply all inferences to <span class="math">I<sub>1</sub></span> by adding the conclusion of each inference
-    whose hypotheses are satisfied and whose entire conclusions do not
+    whose hypotheses are satisfied and whose entire conclusion does not
     already hold (again, possibly introducing fresh existential
     variables), yielding an instance <span class="math">I<sub>2</sub></span>.
     </li>
@@ -3585,7 +3608,7 @@
   validity:</p>
 
 <ol>
-  <li>Normalize the instance, obtaining normalized instance <span class="math">I'</span>.  If
+  <li>Normalize the instance #<I>#, obtaining normalized instance <span class="math">I'</span>.  If
   normalization fails, then <span class="math">I</span> is not <a>valid</a>.
   </li>
   <li>Apply all event ordering constraints to <span class="math">I'</span> to build a graph <span class="math">G</span> whose nodes
@@ -3595,18 +3618,20 @@
   <li> Determine whether there is a cycle in <span class="math">G</span> that contains a
   "strictly precedes" edge.  If so, then <span class="math">I</span> is not <a>valid</a>.  
   </li>
+  <li>Apply the type constraints <a href="#type-constraints">(section
+  5.3)</a> to determine whether there are any violations of
+  disjointness.  If so, then <span class="math">I</span> is not <a>valid</a>.  
   <li>
-  If no
-  such cycle exists, and none of the impossibility constraints <a href="#impossibility-constraints">(section 5.3)</a>  and type constraints <a href="#type-constraints">(section 5.4)</a> are
-  violated, then <span class="math">I</span> is <a>valid</a>.
+  Check that none of the impossibility constraints <a href="#impossibility-constraints">(section 5.4)</a>  are
+  violated.  If any are violated, then <span class="math">I</span> is
+  not <a>valid</a>.  Otherwise, #<I># is <a>valid</a>
   </li>
   </ol>
 
 <p>A normal form of a PROV instance does not exist when a uniqueness constraint fails due to merging failure. </p>
 
 
-  <p>Two PROV instances are <dfn>equivalent</dfn> if they have the
-isomorphic normal forms (that is, after applying all possible inference
+  <p>Two PROV instances are <dfn>equivalent</dfn> if they have isomorphic normal forms (that is, after applying all possible inference
 rules, the two instances produce the same set of PROV statements,
 up to reordering of statements and attributes within attribute lists,
   and renaming of existential variables).
@@ -3617,18 +3642,20 @@
   The order of provenance statements is irrelevant to the meaning of
   a PROV instance.  That is, a
   PROV instance is equivalent to any other instance obtained by
-  permuting its statements.
+reordering its statements.
   </li>
   <li>The order of attribute-value pairs in attribute lists is
   irrelevant to the meaning of a PROV statement.  That is, a PROV
   statement carrying attributes is equivalent to any other statement
-  obtained by permuting attribute-value pairs.
+  obtained by reordering attribute-value pairs and eliminating
+  duplicate pairs.
   </li>
   <li>The particular choices of names of existential variables are irrelevant to the meaning
-  of an instance; that is, the names can be permuted without changing
-  the meaning.  (Replacing two different names with equal names,
+  of an instance; that is, the names can be renamed without changing
+  the meaning, as long as different names are always replaced with
+  different names.  (Replacing two different names with equal names,
   however, can
-  change the meaning.)</li>
+  change the meaning, so does not preserve equivalence.)</li>
   <li>
   Applying inference rules, definitions, and uniqueness constraints preserves equivalence.  That is, a <a>PROV
   instance</a> is equivalent to the instance obtained by applying any