prov: changeset 4375:b0c1e2e3f83f

--- a/model/prov-constraints.html	Thu Aug 30 13:26:48 2012 +0100
+++ b/model/prov-constraints.html	Thu Aug 30 19:01:02 2012 +0100
@@ -1047,7 +1047,7 @@
 <p>
   In addition, a substitution can be applied to an atomic formula
   (PROV statement) <span class="math">p(t<sub>1</sub>,...,t<sub>n</sub>)</span> by applying it to each term,
-  that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,t<sub>n</sub>)</span>.  Likewise, a
+  that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,S(t<sub>n</sub>))</span>.  Likewise, a
   substitution <span class="math">S</span> can be applied to an instance <span class="math">I</span> by applying
   it to each atomic formula (PROV statement) in <span class="math">I</span>, that is, <span class="math">S(I)
   = {S(A) | A ∈ I}</span>.
@@ -1074,38 +1074,40 @@
   <li>Ordering constraints employ atomic precedence relations that can
   be thought of as binary formulas <span class="math">precedes(t,t')</span> or <span class="math">strictly_precedes(t,t')</span>
   </li>
-  <li>Typing constraints employ the set-valued <span class="name">typeOf</span> function, but
-  the property <span class="name">type ∈ typeOf(id)</span> can be represented as a binary
-  relation <span class="name">typeOf(id,type)</span>.
+  <li>Typing constraints <span class="name">'type' ∈ typeOf(id)</span>
+  can be represented as a atomic formulas <span class="name">typeOf(id,'type')</span>.
   </li>
   <li>Impossibility constraints employ the conclusion <span class="name">INVALID</span>,
   which is equivalent to the logical constant <span class="math">False</span>. </li>
   </ul>
   <p> Similarly, the definitions, inferences, and constraint rules in this
-  specification can also be viewed as logical formulas.</p>
+  specification can also be viewed as logical formulas., built up out
+  of atomic formulas, logical connectives "and" (∧), "implies" (⇒),
+  and quantifiers "for all" (∀) and "there exists" (∃).  For more
+  background on logical formulas, see a logic textbook such as [[Logic]].</p>
 <ul>
   <li>
   A definition of the form <span class="name">A</span> <span class="conditional">IF AND ONLY IF</span> there
   exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span>
   can be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the
-  free variables of the formula.  
+  free variables of the definition.  
 </li>
 <li>An inference of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub></span> and ... and <span class="name">A<sub>l</sub></span>  <span class="conditional">THEN</span>  there
   exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span> can
   be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>.  A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the
-  free variables of the formula.  
+  free variables of the inference.  
 </li>
-<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula
+<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula
   <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ C</span>.  </li>
-<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN INVALID</span> can be viewed as a formula
-  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒ False</span>.  </li>
+<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN INVALID</span> can be viewed as a formula
+  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ False</span>.  </li>
   </ul>
 
 
  <h4>Satisfying definitions, inferences, and constraints</h4>
   <p>
   In logic, a formula's meaning is defined by saying when it is
-  <em>satisfied</em> in a situation.  Likewise, we can view
+  <em>satisfied</em>.  We can view
   definitions, inferences, and constraints as being satisfied or not
   satisfied in a PROV instance, augmented with information about the constraints.
   </p>
@@ -1136,10 +1138,10 @@
   A<sub>l</sub></span> is true, then <span class="math">C</span> is
   also true.</li>
   <li>An impossibility constraint is satisfied when the formula
-  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒
+  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒
   False</span> holds.  This is logically equivalent to <span class="math">∄
-  x<sub>1</sub>...x<sub>n</sub>. A</span>, that is, there exists no
-  substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A</span> true.
+  x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span>, that is, there exists no
+  substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> true.
 </ol>  
 
 <h4>Merging</h4>
@@ -1166,7 +1168,7 @@
 </p>
   <ul>
   <li>
-  A logical equivalence of the form <span class="math">∀
+  A definition of the form <span class="math">∀
   x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃
   y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧
   B<sub>k</sub></span>
@@ -1176,7 +1178,7 @@
     occurrence of <span class="math"> B<sub>1</sub>, ..., B<sub>k</sub></span>, adding
     <span class="math">A</span>.
     In our setting, the defined formulas <span class="math">A</span> are never used in other
-    constraints, so it is sufficient to replace all occurrences of
+    formulas, so it is sufficient to replace all occurrences of
     <span class="math">A</span> with their definitions.  The formula <span
     class="math">A</span> is then redundant, and can be
   removed from the instance.
@@ -1228,8 +1230,9 @@
   definition, inference, and constraint forms is not guaranteed to
   terminate.  A simple example is the inference <span class="math">R(x,y) ⇒ ∃z. R(x,z)
   ∧R(z,y)</span>, which can be applied to <span class="math">{R(a,b)}</span> to generate an
-  infinite sequence of larger and larger instances.  This is obviously something we want to
-  avoid.  There is a great deal of work on termination of the chase in
+  infinite sequence of larger and larger instances.  To ensure that
+  normalization, validity, and equivalence are decidable, we require
+  that normalization terminates.  There is a great deal of work on termination of the chase in
   databases, or of sets of constraint handling rules.  The termination
   of the notion of normalization defined in this specification is
   guaranteed because the definitions, inferences and uniqueness/key
@@ -1245,26 +1248,34 @@
   termination.  This specification draws a distinction between knowing
   that an identifier has type <span class="name">'entity'</span>, <span class="name">'activity'</span>, or <span class="name">'agent'</span>, and having
   an explicit <span class="name">entity(id)</span>, <span class="name">activity(id)</span>, or <span class="name">agent(id)</span> statement in the instance.
-  For example, focusing on entity, it is a stronger statement to say
-  that <span class="name">entity(id)</span> holds in the instance, because we can apply
-  entity-generation-invalidation-inference to infer generation and
-  invalidation events for the entity.  In contrast, if we only know
+  For example, focusing on entity statements, we can infer <span
+  class="name">'entity' ∈ typeOf(id)</span> if  <span
+  class="name">entity(id)</span> holds in the instance.  In contrast, if we only know
   that <span class="name">'entity' ∈ typeOf(id)</span>, this does not imply that <span class="name">entity(id)</span>
   holds.
   </p>
   <p>
   This distinction (for both entities and activities) is essential to
   ensure termination of the inferences, because we allow inferring
-  that a declared <span class="name">entity(id)</span> or  <span class="name">activity(id)</span> has a generation
-  or respectively start event.  If we strengthened the type
+  that a declared <span class="name">entity(id,attrs)</span> has a generation
+  and invalidation event, using
+  <a class="rule-text"
+  href="#entity-generation-invalidation-inference"><span>TBD</span></a>.
+  Likewise, for activities, we allow inferring that a declared <span class="name">activiy(id,t1,t2,attrs)</span> has a generation
+  and invalidation event, using
+  <a class="rule-text"
+  href="#activity-start-end-inference"><span>TBD</span></a>.  These
+  inferences do not apply to identifiers whose types are kown, but for
+  which there is not an explicit entity or activity statement.
+If we strengthened the type
   inference constraints to add new entity or activity statements for
   the entities and activities involved in generating or starting other
   declared entities or activities, then we could keep generating new entities and
   activities in an unbounded chain into the past (as in the "chicken
   and egg" paradox).  The
-  design adopted here assumes that instances explicitly declare the
-  entities and activities that are relevant for validity checking, and
-  these can be inferred to have start and end events.  This inference
+  design adopted here requires that instances explicitly declare the
+  entities and activities that are relevant for validity checking, and only
+  these can be inferred to have invalidation/generation and start/end events.  This inference
   is not supported for identifiers
 that are indirectly referenced in other relations and therefore have
   type <span class="name">'entity'</span> or <span class="name">'activity'</span>.  
@@ -1290,7 +1301,7 @@
   rather than applied.  This means that they do not generate new
   formulas expressible in PROV, but they do generate basic constraints
   that might or might not be consistent with each other.  Checking
-  such constraints follows a saturation strategy similar to that for 
+  such constraints follows a saturation strategy similar to that for normalization:
   </p>
   <ol>
   <li><p>
@@ -1303,7 +1314,7 @@
 precedes(t,t')</span> can be applied by searching for occurrences of
 <span class="math"> A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> and for each such match
 adding the atomic formula <span class="math">precedes(t,t')</span> to
-  the instance.  After all such constraints have been checked, and the
+  the instance, and similarly for strictly-precedes constraints.  After all such constraints have been checked, and the
   resulting edges added to the graph, the ordering constraints are
   violated if there is a cycle in the graph that includes a
   strictly-precedes edge, and satisfied otherwise.
@@ -1358,8 +1369,8 @@
   forms may describe the same situation but differ in inessential
   details such as the order of statements or of elements of
   attribute-value lists.  To remedy this, we can easily consider
-  instances to be equvalent up to reordering of attributes.  However,
-  instances can also be equvalent if they differ only in choice of
+  instances to be equivalent up to reordering of attributes.  However,
+  instances can also be equivalent if they differ only in choice of
   names of existential variables. Because of this, the appropriate
   notion of equivalence of normal forms is <em>isomorphism</em>.  Two
   instances <span class="math">I<sub>1</sub></span> and <span class="math">I<sub>2<sub></span> are <em>isomorphic</em> if there is an invertible
@@ -1383,11 +1394,11 @@
 <h4>From Instances to Bundles and Documents</h4>
 
 <p>PROV documents can contain multiple instances: a <a>toplevel
-instance</a>, the set of statements not appearing within a bundle, and
+instance</a> consisting of the set of statements not appearing within a bundle, and
 zero or more named instances called <a>bundle</a>s.  For the purpose
-of this document, these instances are treated independently: that is,
+of inference and constraint checking, these instances are treated independently. That is,
 a PROV document is valid provided that each instance in it is valid
-and the names of its bundles are distinct; a PROV document is
+and the names of its bundles are distinct.  Similarly, a PROV document is
 equivalent to another if their toplevel instances are equivalent, they
 have the same number of bundles with the same names, and the instances
 of their corresponding bundles are equivalent.  Analogously to blank nodes in
@@ -1405,7 +1416,8 @@
 <h4>Summary of definitions, inferences, and constraints</h4>
 
 <p><a href="">Table 2</a> summarizes the definitions, inferences, and
-constraints of this document.
+constraints specified in this document, broken down by component and
+type or relation involved.
 </p>
 
 <div class="note">Table: work in progress; these entries might change when the document is updated.</div>
author	James Cheney <jcheney@inf.ed.ac.uk>
	Thu, 30 Aug 2012 19:01:02 +0100
changeset 4375	b0c1e2e3f83f
parent 4374	9d2967fd1345
child 4376	95ad452a23c6