author James Cheney Thu, 30 Aug 2012 19:01:02 +0100 changeset 4375 b0c1e2e3f83f parent 4374 9d2967fd1345 child 4376 95ad452a23c6
* further rev overview section
 model/prov-constraints.html
```--- a/model/prov-constraints.html	Thu Aug 30 13:26:48 2012 +0100
+++ b/model/prov-constraints.html	Thu Aug 30 19:01:02 2012 +0100
@@ -1047,7 +1047,7 @@
<p>
In addition, a substitution can be applied to an atomic formula
(PROV statement) <span class="math">p(t<sub>1</sub>,...,t<sub>n</sub>)</span> by applying it to each term,
-  that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,t<sub>n</sub>)</span>.  Likewise, a
+  that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,S(t<sub>n</sub>))</span>.  Likewise, a
substitution <span class="math">S</span> can be applied to an instance <span class="math">I</span> by applying
it to each atomic formula (PROV statement) in <span class="math">I</span>, that is, <span class="math">S(I)
= {S(A) | A ∈ I}</span>.
@@ -1074,38 +1074,40 @@
<li>Ordering constraints employ atomic precedence relations that can
be thought of as binary formulas <span class="math">precedes(t,t')</span> or <span class="math">strictly_precedes(t,t')</span>
</li>
-  <li>Typing constraints employ the set-valued <span class="name">typeOf</span> function, but
-  the property <span class="name">type ∈ typeOf(id)</span> can be represented as a binary
-  relation <span class="name">typeOf(id,type)</span>.
+  <li>Typing constraints <span class="name">'type' ∈ typeOf(id)</span>
+  can be represented as a atomic formulas <span class="name">typeOf(id,'type')</span>.
</li>
<li>Impossibility constraints employ the conclusion <span class="name">INVALID</span>,
which is equivalent to the logical constant <span class="math">False</span>. </li>
</ul>
<p> Similarly, the definitions, inferences, and constraint rules in this
-  specification can also be viewed as logical formulas.</p>
+  specification can also be viewed as logical formulas., built up out
+  of atomic formulas, logical connectives "and" (∧), "implies" (⇒),
+  and quantifiers "for all" (∀) and "there exists" (∃).  For more
+  background on logical formulas, see a logic textbook such as [[Logic]].</p>
<ul>
<li>
A definition of the form <span class="name">A</span> <span class="conditional">IF AND ONLY IF</span> there
exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span>
can be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the
-  free variables of the formula.
+  free variables of the definition.
</li>
<li>An inference of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub></span> and ... and <span class="name">A<sub>l</sub></span>  <span class="conditional">THEN</span>  there
exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span> can
be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>.  A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the
-  free variables of the formula.
+  free variables of the inference.
</li>
-<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula
+<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula
<span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ C</span>.  </li>
-<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN INVALID</span> can be viewed as a formula
-  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒ False</span>.  </li>
+<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN INVALID</span> can be viewed as a formula
+  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ False</span>.  </li>
</ul>

<h4>Satisfying definitions, inferences, and constraints</h4>
<p>
In logic, a formula's meaning is defined by saying when it is
-  <em>satisfied</em> in a situation.  Likewise, we can view
+  <em>satisfied</em>.  We can view
definitions, inferences, and constraints as being satisfied or not
satisfied in a PROV instance, augmented with information about the constraints.
</p>
@@ -1136,10 +1138,10 @@
A<sub>l</sub></span> is true, then <span class="math">C</span> is
also true.</li>
<li>An impossibility constraint is satisfied when the formula
-  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒
+  <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒
False</span> holds.  This is logically equivalent to <span class="math">∄
-  x<sub>1</sub>...x<sub>n</sub>. A</span>, that is, there exists no
-  substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A</span> true.
+  x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span>, that is, there exists no
+  substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> true.
</ol>

<h4>Merging</h4>
@@ -1166,7 +1168,7 @@
</p>
<ul>
<li>
-  A logical equivalence of the form <span class="math">∀
+  A definition of the form <span class="math">∀
x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃
y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧
B<sub>k</sub></span>
@@ -1176,7 +1178,7 @@
occurrence of <span class="math"> B<sub>1</sub>, ..., B<sub>k</sub></span>, adding
<span class="math">A</span>.
In our setting, the defined formulas <span class="math">A</span> are never used in other
-    constraints, so it is sufficient to replace all occurrences of
+    formulas, so it is sufficient to replace all occurrences of
<span class="math">A</span> with their definitions.  The formula <span
class="math">A</span> is then redundant, and can be
removed from the instance.
@@ -1228,8 +1230,9 @@
definition, inference, and constraint forms is not guaranteed to
terminate.  A simple example is the inference <span class="math">R(x,y) ⇒ ∃z. R(x,z)
∧R(z,y)</span>, which can be applied to <span class="math">{R(a,b)}</span> to generate an
-  infinite sequence of larger and larger instances.  This is obviously something we want to
-  avoid.  There is a great deal of work on termination of the chase in
+  infinite sequence of larger and larger instances.  To ensure that
+  normalization, validity, and equivalence are decidable, we require
+  that normalization terminates.  There is a great deal of work on termination of the chase in
databases, or of sets of constraint handling rules.  The termination
of the notion of normalization defined in this specification is
guaranteed because the definitions, inferences and uniqueness/key
@@ -1245,26 +1248,34 @@
termination.  This specification draws a distinction between knowing
that an identifier has type <span class="name">'entity'</span>, <span class="name">'activity'</span>, or <span class="name">'agent'</span>, and having
an explicit <span class="name">entity(id)</span>, <span class="name">activity(id)</span>, or <span class="name">agent(id)</span> statement in the instance.
-  For example, focusing on entity, it is a stronger statement to say
-  that <span class="name">entity(id)</span> holds in the instance, because we can apply
-  entity-generation-invalidation-inference to infer generation and
-  invalidation events for the entity.  In contrast, if we only know
+  For example, focusing on entity statements, we can infer <span
+  class="name">'entity' ∈ typeOf(id)</span> if  <span
+  class="name">entity(id)</span> holds in the instance.  In contrast, if we only know
that <span class="name">'entity' ∈ typeOf(id)</span>, this does not imply that <span class="name">entity(id)</span>
holds.
</p>
<p>
This distinction (for both entities and activities) is essential to
ensure termination of the inferences, because we allow inferring
-  that a declared <span class="name">entity(id)</span> or  <span class="name">activity(id)</span> has a generation
-  or respectively start event.  If we strengthened the type
+  that a declared <span class="name">entity(id,attrs)</span> has a generation
+  and invalidation event, using
+  <a class="rule-text"
+  href="#entity-generation-invalidation-inference"><span>TBD</span></a>.
+  Likewise, for activities, we allow inferring that a declared <span class="name">activiy(id,t1,t2,attrs)</span> has a generation
+  and invalidation event, using
+  <a class="rule-text"
+  href="#activity-start-end-inference"><span>TBD</span></a>.  These
+  inferences do not apply to identifiers whose types are kown, but for
+  which there is not an explicit entity or activity statement.
+If we strengthened the type
inference constraints to add new entity or activity statements for
the entities and activities involved in generating or starting other
declared entities or activities, then we could keep generating new entities and
activities in an unbounded chain into the past (as in the "chicken
-  design adopted here assumes that instances explicitly declare the
-  entities and activities that are relevant for validity checking, and
-  these can be inferred to have start and end events.  This inference
+  design adopted here requires that instances explicitly declare the
+  entities and activities that are relevant for validity checking, and only
+  these can be inferred to have invalidation/generation and start/end events.  This inference
is not supported for identifiers
that are indirectly referenced in other relations and therefore have
type <span class="name">'entity'</span> or <span class="name">'activity'</span>.
@@ -1290,7 +1301,7 @@
rather than applied.  This means that they do not generate new
formulas expressible in PROV, but they do generate basic constraints
that might or might not be consistent with each other.  Checking
-  such constraints follows a saturation strategy similar to that for
+  such constraints follows a saturation strategy similar to that for normalization:
</p>
<ol>
<li><p>
@@ -1303,7 +1314,7 @@
precedes(t,t')</span> can be applied by searching for occurrences of
<span class="math"> A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> and for each such match
adding the atomic formula <span class="math">precedes(t,t')</span> to
-  the instance.  After all such constraints have been checked, and the
+  the instance, and similarly for strictly-precedes constraints.  After all such constraints have been checked, and the
resulting edges added to the graph, the ordering constraints are
violated if there is a cycle in the graph that includes a
strictly-precedes edge, and satisfied otherwise.
@@ -1358,8 +1369,8 @@
forms may describe the same situation but differ in inessential
details such as the order of statements or of elements of
attribute-value lists.  To remedy this, we can easily consider
-  instances to be equvalent up to reordering of attributes.  However,
-  instances can also be equvalent if they differ only in choice of
+  instances to be equivalent up to reordering of attributes.  However,
+  instances can also be equivalent if they differ only in choice of
names of existential variables. Because of this, the appropriate
notion of equivalence of normal forms is <em>isomorphism</em>.  Two
instances <span class="math">I<sub>1</sub></span> and <span class="math">I<sub>2<sub></span> are <em>isomorphic</em> if there is an invertible
@@ -1383,11 +1394,11 @@
<h4>From Instances to Bundles and Documents</h4>

<p>PROV documents can contain multiple instances: a <a>toplevel
-instance</a>, the set of statements not appearing within a bundle, and
+instance</a> consisting of the set of statements not appearing within a bundle, and
zero or more named instances called <a>bundle</a>s.  For the purpose
-of this document, these instances are treated independently: that is,
+of inference and constraint checking, these instances are treated independently. That is,
a PROV document is valid provided that each instance in it is valid
-and the names of its bundles are distinct; a PROV document is
+and the names of its bundles are distinct.  Similarly, a PROV document is
equivalent to another if their toplevel instances are equivalent, they
have the same number of bundles with the same names, and the instances
of their corresponding bundles are equivalent.  Analogously to blank nodes in
@@ -1405,7 +1416,8 @@
<h4>Summary of definitions, inferences, and constraints</h4>

<p><a href="">Table 2</a> summarizes the definitions, inferences, and
-constraints of this document.
+constraints specified in this document, broken down by component and
+type or relation involved.
</p>

<div class="note">Table: work in progress; these entries might change when the document is updated.</div>```