author | James Cheney <jcheney@inf.ed.ac.uk> |

Thu, 30 Aug 2012 19:01:02 +0100 | |

changeset 4375 | b0c1e2e3f83f |

parent 4374 | 9d2967fd1345 |

child 4376 | 95ad452a23c6 |

* further rev overview section

model/prov-constraints.html |

--- a/model/prov-constraints.html Thu Aug 30 13:26:48 2012 +0100 +++ b/model/prov-constraints.html Thu Aug 30 19:01:02 2012 +0100 @@ -1047,7 +1047,7 @@ <p> In addition, a substitution can be applied to an atomic formula (PROV statement) <span class="math">p(t<sub>1</sub>,...,t<sub>n</sub>)</span> by applying it to each term, - that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,t<sub>n</sub>)</span>. Likewise, a + that is, <span class="math">S(p(t<sub>1</sub>,...,t<sub>n</sub>)) = p(S(t<sub>1</sub>),...,S(t<sub>n</sub>))</span>. Likewise, a substitution <span class="math">S</span> can be applied to an instance <span class="math">I</span> by applying it to each atomic formula (PROV statement) in <span class="math">I</span>, that is, <span class="math">S(I) = {S(A) | A ∈ I}</span>. @@ -1074,38 +1074,40 @@ <li>Ordering constraints employ atomic precedence relations that can be thought of as binary formulas <span class="math">precedes(t,t')</span> or <span class="math">strictly_precedes(t,t')</span> </li> - <li>Typing constraints employ the set-valued <span class="name">typeOf</span> function, but - the property <span class="name">type ∈ typeOf(id)</span> can be represented as a binary - relation <span class="name">typeOf(id,type)</span>. + <li>Typing constraints <span class="name">'type' ∈ typeOf(id)</span> + can be represented as a atomic formulas <span class="name">typeOf(id,'type')</span>. </li> <li>Impossibility constraints employ the conclusion <span class="name">INVALID</span>, which is equivalent to the logical constant <span class="math">False</span>. </li> </ul> <p> Similarly, the definitions, inferences, and constraint rules in this - specification can also be viewed as logical formulas.</p> + specification can also be viewed as logical formulas., built up out + of atomic formulas, logical connectives "and" (∧), "implies" (⇒), + and quantifiers "for all" (∀) and "there exists" (∃). For more + background on logical formulas, see a logic textbook such as [[Logic]].</p> <ul> <li> A definition of the form <span class="name">A</span> <span class="conditional">IF AND ONLY IF</span> there exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span> can be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the - free variables of the formula. + free variables of the definition. </li> <li>An inference of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub></span> and ... and <span class="name">A<sub>l</sub></span> <span class="conditional">THEN</span> there exists <span class="name">y<sub>1</sub></span>...<span class="name">y<sub>m</sub></span> such that <span class="name">B<sub>1</sub></span> and ... and <span class="name">B<sub>k</sub></span> can be thought of as a formula <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span>, where <span class="math">x<sub>1</sub></span>...<span class="math">x<sub>n</sub></span> are the - free variables of the formula. + free variables of the inference. </li> -<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula +<li>A uniqueness, ordering, or typing constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN</span> <span class="name">C</span> can be viewed as a formula <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ C</span>. </li> -<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A</span> <span class="conditional">THEN INVALID</span> can be viewed as a formula - <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒ False</span>. </li> +<li>A constraint of the form <span class="conditional">IF</span> <span class="name">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> <span class="conditional">THEN INVALID</span> can be viewed as a formula + <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ False</span>. </li> </ul> <h4>Satisfying definitions, inferences, and constraints</h4> <p> In logic, a formula's meaning is defined by saying when it is - <em>satisfied</em> in a situation. Likewise, we can view + <em>satisfied</em>. We can view definitions, inferences, and constraints as being satisfied or not satisfied in a PROV instance, augmented with information about the constraints. </p> @@ -1136,10 +1138,10 @@ A<sub>l</sub></span> is true, then <span class="math">C</span> is also true.</li> <li>An impossibility constraint is satisfied when the formula - <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A ⇒ + <span class="math">∀ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub> ⇒ False</span> holds. This is logically equivalent to <span class="math">∄ - x<sub>1</sub>...x<sub>n</sub>. A</span>, that is, there exists no - substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A</span> true. + x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span>, that is, there exists no + substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> true. </ol> <h4>Merging</h4> @@ -1166,7 +1168,7 @@ </p> <ul> <li> - A logical equivalence of the form <span class="math">∀ + A definition of the form <span class="math">∀ x<sub>1</sub>,....,x<sub>n</sub>. A ⇔ ∃ y<sub>1</sub>...y<sub>m</sub> . B<sub>1</sub> ∧ ... ∧ B<sub>k</sub></span> @@ -1176,7 +1178,7 @@ occurrence of <span class="math"> B<sub>1</sub>, ..., B<sub>k</sub></span>, adding <span class="math">A</span>. In our setting, the defined formulas <span class="math">A</span> are never used in other - constraints, so it is sufficient to replace all occurrences of + formulas, so it is sufficient to replace all occurrences of <span class="math">A</span> with their definitions. The formula <span class="math">A</span> is then redundant, and can be removed from the instance. @@ -1228,8 +1230,9 @@ definition, inference, and constraint forms is not guaranteed to terminate. A simple example is the inference <span class="math">R(x,y) ⇒ ∃z. R(x,z) ∧R(z,y)</span>, which can be applied to <span class="math">{R(a,b)}</span> to generate an - infinite sequence of larger and larger instances. This is obviously something we want to - avoid. There is a great deal of work on termination of the chase in + infinite sequence of larger and larger instances. To ensure that + normalization, validity, and equivalence are decidable, we require + that normalization terminates. There is a great deal of work on termination of the chase in databases, or of sets of constraint handling rules. The termination of the notion of normalization defined in this specification is guaranteed because the definitions, inferences and uniqueness/key @@ -1245,26 +1248,34 @@ termination. This specification draws a distinction between knowing that an identifier has type <span class="name">'entity'</span>, <span class="name">'activity'</span>, or <span class="name">'agent'</span>, and having an explicit <span class="name">entity(id)</span>, <span class="name">activity(id)</span>, or <span class="name">agent(id)</span> statement in the instance. - For example, focusing on entity, it is a stronger statement to say - that <span class="name">entity(id)</span> holds in the instance, because we can apply - entity-generation-invalidation-inference to infer generation and - invalidation events for the entity. In contrast, if we only know + For example, focusing on entity statements, we can infer <span + class="name">'entity' ∈ typeOf(id)</span> if <span + class="name">entity(id)</span> holds in the instance. In contrast, if we only know that <span class="name">'entity' ∈ typeOf(id)</span>, this does not imply that <span class="name">entity(id)</span> holds. </p> <p> This distinction (for both entities and activities) is essential to ensure termination of the inferences, because we allow inferring - that a declared <span class="name">entity(id)</span> or <span class="name">activity(id)</span> has a generation - or respectively start event. If we strengthened the type + that a declared <span class="name">entity(id,attrs)</span> has a generation + and invalidation event, using + <a class="rule-text" + href="#entity-generation-invalidation-inference"><span>TBD</span></a>. + Likewise, for activities, we allow inferring that a declared <span class="name">activiy(id,t1,t2,attrs)</span> has a generation + and invalidation event, using + <a class="rule-text" + href="#activity-start-end-inference"><span>TBD</span></a>. These + inferences do not apply to identifiers whose types are kown, but for + which there is not an explicit entity or activity statement. +If we strengthened the type inference constraints to add new entity or activity statements for the entities and activities involved in generating or starting other declared entities or activities, then we could keep generating new entities and activities in an unbounded chain into the past (as in the "chicken and egg" paradox). The - design adopted here assumes that instances explicitly declare the - entities and activities that are relevant for validity checking, and - these can be inferred to have start and end events. This inference + design adopted here requires that instances explicitly declare the + entities and activities that are relevant for validity checking, and only + these can be inferred to have invalidation/generation and start/end events. This inference is not supported for identifiers that are indirectly referenced in other relations and therefore have type <span class="name">'entity'</span> or <span class="name">'activity'</span>. @@ -1290,7 +1301,7 @@ rather than applied. This means that they do not generate new formulas expressible in PROV, but they do generate basic constraints that might or might not be consistent with each other. Checking - such constraints follows a saturation strategy similar to that for + such constraints follows a saturation strategy similar to that for normalization: </p> <ol> <li><p> @@ -1303,7 +1314,7 @@ precedes(t,t')</span> can be applied by searching for occurrences of <span class="math"> A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> and for each such match adding the atomic formula <span class="math">precedes(t,t')</span> to - the instance. After all such constraints have been checked, and the + the instance, and similarly for strictly-precedes constraints. After all such constraints have been checked, and the resulting edges added to the graph, the ordering constraints are violated if there is a cycle in the graph that includes a strictly-precedes edge, and satisfied otherwise. @@ -1358,8 +1369,8 @@ forms may describe the same situation but differ in inessential details such as the order of statements or of elements of attribute-value lists. To remedy this, we can easily consider - instances to be equvalent up to reordering of attributes. However, - instances can also be equvalent if they differ only in choice of + instances to be equivalent up to reordering of attributes. However, + instances can also be equivalent if they differ only in choice of names of existential variables. Because of this, the appropriate notion of equivalence of normal forms is <em>isomorphism</em>. Two instances <span class="math">I<sub>1</sub></span> and <span class="math">I<sub>2<sub></span> are <em>isomorphic</em> if there is an invertible @@ -1383,11 +1394,11 @@ <h4>From Instances to Bundles and Documents</h4> <p>PROV documents can contain multiple instances: a <a>toplevel -instance</a>, the set of statements not appearing within a bundle, and +instance</a> consisting of the set of statements not appearing within a bundle, and zero or more named instances called <a>bundle</a>s. For the purpose -of this document, these instances are treated independently: that is, +of inference and constraint checking, these instances are treated independently. That is, a PROV document is valid provided that each instance in it is valid -and the names of its bundles are distinct; a PROV document is +and the names of its bundles are distinct. Similarly, a PROV document is equivalent to another if their toplevel instances are equivalent, they have the same number of bundles with the same names, and the instances of their corresponding bundles are equivalent. Analogously to blank nodes in @@ -1405,7 +1416,8 @@ <h4>Summary of definitions, inferences, and constraints</h4> <p><a href="">Table 2</a> summarizes the definitions, inferences, and -constraints of this document. +constraints specified in this document, broken down by component and +type or relation involved. </p> <div class="note">Table: work in progress; these entries might change when the document is updated.</div>