author | James Cheney <jcheney@inf.ed.ac.uk> |

Fri, 26 Oct 2012 18:32:58 +0100 | |

changeset 4570 | 34fdcb59ee1b |

parent 4569 | 5ad661dd8b57 |

child 4574 | 2edbe20885c1 |

* More constraints issues

model/prov-constraints.html |

--- a/model/prov-constraints.html Fri Oct 26 13:55:28 2012 +0100 +++ b/model/prov-constraints.html Fri Oct 26 18:32:58 2012 +0100 @@ -425,7 +425,9 @@ <p> This document defines a subset of PROV instances called <i><a>valid</a></i> PROV instances. -The intent of validation is ensure that a PROV instance represents a history of objects and their interactions which is consistent, and thus safe to use for the purpose of logical reasoning and other kinds of analysis. +The intent of validation is ensure that a PROV instance represents a +consistent history of objects and their interactions that is safe to +use for the purpose of logical reasoning and other kinds of analysis. Valid PROV instances satisfy certain <a>definitions</a>, <a>inferences</a>, and <a>constraints</a>. These definitions, inferences, and constraints @@ -468,13 +470,16 @@ <li>The Linked Data and Semantic Web community should focus on PROV-O defining PROV classes and properties specified in an OWL2 ontology. For further details, PROV-DM and PROV-CONSTRAINTS specify -the constraints applicable to the data model, and its interpretation. -<!--PROV-SEM provides a mathematical semantics. ---> +the constraints applicable to the data model, and its interpretation. </li> <li>Developers seeking to retrieve or publish provenance should focus on PROV-AQ.</li> <li>Readers seeking to implement other PROV serializations -should focus on PROV-DM and PROV-CONSTRAINTS. PROV-O and PROV-N offer examples of mapping to RDF and text, respectively.</li> +should focus on PROV-DM and PROV-CONSTRAINTS. PROV-O and PROV-N offer +examples of mapping to RDF and text, respectively.</li> +<li>[[PROV-SEM]] provides a mathematical semantics and a collection of +axioms that provide a (non-normative, but equivalent) declarative specification, aimed +at readers with a mathematical logic or formal background. This can +serve as a starting point for alteranative implementations. </li> </ul> </section> @@ -530,8 +535,11 @@ PROV-DM specification [[PROV-DM]] imposes minimal requirements upon PROV instances. A <a>valid</a> PROV instance corresponds to a consistent history of objects and interactions to which logical -reasoning can be safely applied. By default, PROV instances need not -be <a>valid</a>. </p> +reasoning can be safely applied. PROV instances need not +be <a>valid</a>. (The term <a>valid</a> is chosen by analogy with +notions of validity in other W3C specifications; it is unfortunately +incompatible with the usual meaning of "validity" in logic.) +</p> <p> This document specifies <em>definitions</em> of some provenance statements in terms of others, <em>inferences</em> over PROV instances @@ -543,7 +551,8 @@ constraints</em>. Further discussion of the semantics of PROV statements, which justifies the definitions, inferences -and constraints, can be found in the formal semantics [[PROV-SEM]]. +and constraints, and relates the procedural specification approach +taken here to a declarative specification, can be found in the formal semantics [[PROV-SEM]]. </p> <p>We define validity and equivalence in terms of a @@ -557,23 +566,28 @@ pairwise equivalence of their respective instances. </p> <p> -This document outlines an algorithm for validity checking based on +This specification defines +validity and equivalence procedurally, via an algorithm based on <a title="normal form">normalization</a>. Applications MAY implement -validity and equivalence checking using normalization, as suggested -here, or in +validity and equivalence checking using normalization, as outlined +here. Applications MAY also implement validation and equivalence +checking in any other way as long as the same instances or documents are considered valid or equivalent, respectively. </p> -<p>Checking validity or equivalence are RECOMMENDED, but not required, for -applications compliant with PROV. This specification defines how -validity and equivalence are to be checked, if an application elects -to support them at all. +<p> Checking validity or equivalence are RECOMMENDED, but not required, for +applications compliant with PROV. Applications producing provenance SHOULD ensure that it is <a>valid</a>, and similarly applications consuming provenance MAY reject provenance that is not <a>valid</a>. Applications which are determining whether PROV instances or documents convey the same information SHOULD check equivalence as specified here, and SHOULD -treat equivalent instances or documents in the same way. +treat equivalent instances or documents in the same way. This is a +guideline only, and the meaning of "in the same way" is +application-specific. For example, applications that manipulate the syntax of +PROV instances in particular representations, such as pretty-printing +or digital signing, have good reasons to treat different equivalent +documents differently. </p> </section> <section> @@ -1151,17 +1165,24 @@ substitution for <span class="math">x<sub>1</sub>...x<sub>n</sub></span> making <span class="math">A<sub>1</sub> ∧ ... ∧ A<sub>l</sub></span> true. </ol> -<h4>Merging</h4> - - <p><em>Merging</em> is an operation that takes two terms and compares them to - see if they are equal, or can be made equal by substituting an - existential variable with another term. This operation is a special - case of <em>unification</em>, a common operation in logical - reasoning, including logic programming and automated reasoning. -Merging two terms <span class="math">t,t'</span> results in either substitution <span class="math">S</span> +<h4>Unification and Merging</h4> + + <p><em>Unification</em> is an operation that takes two terms and compares them to + determine whether they can be made equal by substituting an + existential variable with another term. If so, the result is such a + substitution; otherwise, the result is failure. Unification is an + essential concept in logic programming and automated reasoning, +where terms can involve variables, constants and function symbols. In PROV, + by comparison, unification only needs to deal with +</p> +<p> +Unifying two terms <span class="math">t,t'</span> results in either substitution <span class="math">S</span> such that <span class="math">S(t) = S(t')</span>, or failure indicating that there is no substitution that can be applied to both <span class="math">t</span> and <span class="math">t'</span> to make - them equal. + them equal. Unification is also used to define an operation on PROV + statements called <em>merging</em>. Merging takes two statements + that have equal identifiers, unifies their corresponding term + arguments, and combines their attribute lists. </p> @@ -1205,20 +1226,20 @@ x<sub>1</sub>...x<sub>n</sub>. A<sub>1</sub> ∧ ... ∧ A<sub>p</sub> ⇒ t = t'</span> can be applied by searching for an occurrence <span class="math">A<sub>1</sub> ∧ ... ∧ A<sub>p</sub></span> in the instance, and - if one is found, merging the terms <span class="math">t</span> and + if one is found, unifying the terms <span class="math">t</span> and <span class="math">t'</span>. If successful, the resulting substitution is applied to the instance; otherwise, the application of the uniqueness constraint fails. </li> <li>A key constraint can similarly be applied by searching for -different occurrences of a statement with the same identifier, merging the +different occurrences of a statement with the same identifier, unifying the corresponding parameters of the statements, and concatenating their -attribute lists. The substitutions obtained by merging are applied to -the instance. +attribute lists, to form a single statement. The substitutions obtained by unification are applied to +the merged statement and the rest of the instance. </li> </ul> <p>As noted above, uniqueness or key constraint - application can <em>fail</em>, if a required merging step fails. Failure of constraint + application can <em>fail</em>, if a required unification or merging step fails. Failure of constraint application means that there is no way to add information to the instance to satisfy the constraint, which in turn implies that the instance is <em>invalid</em>. @@ -1230,6 +1251,16 @@ called <em>chasing</em> [[DBCONSTRAINTS]] or <em>saturation</em> [[CHR]]. We call this process <em>normalization</em>. </p> + <p> + Although this specification outlines one particular way of + performing inferences and checking constraints, based on + normalization, implementations can use any other equivalent + algorithm. The logical formulas corresponding to the definitions, + inferences, and constraints outlined above (and further elaborated + in [[PROV-SEM]]) provides an equivalent specification, and any + implementation that correctly checks validity and equivalence (whether it performs nornalization or not) complies + with this specification. + </p> <h4>Termination</h4> @@ -1350,7 +1381,8 @@ </li> </ol> <p> - A normalized instance that satisfies all of the checked constraints + A normalized instance that passes all of the ordering, typing, + and impossibility constraint checks is called <a>valid</a>. Validity can be, but is not required to be, checked by normalizing and then checking constraints. Any other algorithm that provides equivalent behavior (that is, accepts the @@ -1720,9 +1752,10 @@ application MAY apply the inferences and definitions in <a href="#inferences" class='sectionRef'></a>.</li> <li>If determining whether a PROV instance or document is <a>valid</a>, an - application MUST check that all of the + application MUST determine whether all of the constraints of <a href="#constraints" class="sectionRef"></a> are - satisfied on the <a>normal form</a> of the instance or document. </li> + satisfied on + the <a>normal form</a> of the instance or document. </li> <li> If producing provenance meant for other applications to use, the application SHOULD produce <a>valid</a> provenance, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>. </li> <li>If determining whether two PROV instances or documents are @@ -1730,10 +1763,15 @@ normal forms are equal, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>. </ol> - <p>Compliant applications are not required to explicitly compute - normal forms; however, if checking validity or equivalence, the - results should be the same as would be obtained by computing normal - forms as defined in this specification.</p> + <p>This specification defines validity and equivalence procedurally + via reduction to normal forms. If checking validity or equivalence, the + results MUST be the same as would be obtained by computing normal + forms as defined in this specification. Applications that explicitly compute + normal forms, or to follow the implementation strategy suggested by + this specification, are by definition compliant. However, + applications can also comply by checking validity and equivalence in any + other way that yields the same answers without explicitly applying + definitions, inferences, and constraints or constructing normal forms.</p> <p> All figures are for illustration purposes @@ -2572,10 +2610,10 @@ <div class="remark"> In the above inference, <span class="name">_pl</span> is an - existential variable, so it can be - merged with a constant identifier, another existential variable, or a + existential variable, so it can be unified +with a constant identifier, another existential variable, or a placeholder <span class="name">-</span>, as explained - in the definition of <a>merging</a>. + in the definition of <a>unification</a>. </div> <hr /> <p id='delegation-inference_text'> Delegation relates agents where one agent acts on behalf of @@ -2608,7 +2646,7 @@ <span class="name">_pl1</span> and <span class="name">_pl2</span> can be replaced with constant identifiers, existential variables, or placeholders <span class="name">-</span> independently, as explained - in the definition of <a>merging</a>. + in the definition of <a>unification</a>. </div> <hr /> @@ -2844,44 +2882,41 @@ class="name">a</span>) is <span class="name">activity(a,2011-11-16T16:00:00,2011-11-16T18:00:00,[a=1,b=2])</span>. </p> - <p> <dfn>Merging</dfn> is an operation that can be applied - to a pair of terms, or a pair of attribute lists. - The result of merging is either a substitution (mapping - existentially quantified variables to terms) or failure, indicating that the merge - cannot be performed. Merging of pairs of terms, attribute lists, - or statements is defined as follows.</p> + <p> <dfn>Unification</dfn> is an operation that can be applied + to a pair of terms. + The result of unification is either a <dfn>unifier</dfn>, that is, + a substitution <span class="math">S</span> such that <span class="math">S(t) + = S(t')</span>, or failure, indicating + that there is no <a>unifier</a>. Unification of pairs of terms is defined as follows.</p> <ul> <li> If <span class="name">t</span> and <span class="name">t'</span> are constant identifiers or values (including the placeholder <span class="name">-</span>), then - their <a>merge</a> exists only if they are equal, otherwise merging + there are two cases. If <span class="name">t = t'</span> then their <a>unifier</a> is the + empty substitution, otherwise unification fails. </li> <li> If <span class="name">x</span> is an existential variable and <span class="name">t'</span> is any term (identifier, constant, placeholder <span class="name">-</span>, or existential variable), then their - <a>merge</a> is <span class="name">t'</span>, and the resulting substitution is - <span class="name">[x=t']</span>. In the special case where <span class="name">t'=x</span>, the merge is - <span class="name">x</span> and the resulting substitution is empty.</li> + <a>unifier</a> is + <span class="name">[x=t']</span>. In the special case where + <span class="name">t'=x</span>, the unifier is the empty substitution.</li> <li> If <span class="name">t</span> is any term (identifier, constant, placeholder <span class="name">-</span>, or existential variable) and <span class="name">x'</span> is an existential variable, then their - merge is the same as the merge of <span class="name">x</span> and <span class="name">t</span>.</li> - <li> The <a>merge</a> of two attribute lists <span - class="name">attrs1</span> and <span class="name">attrs2</span> - is their union, considered as sets of key-value pairs, written <span - class="name">attrs1 ∪ attrs2</span>. Duplicate keys with - different are - allowed, but equal key-value pairs are merged.</li> -</ul> - - -<div class="remark">Merging for terms is analogous to unification in + unifier is the same as the unifier of <span class="name">x</span> + and <span class="name">t</span>.</li> + </ul> + + + +<div class="remark">Unification is analogous to unification in logic programming and theorem proving, restricted to flat terms with - no function symbols. No occurs check is needed because there are no +constants and variables but no function symbols. No occurs check is needed because there are no function symbols.</div> <p> @@ -2895,15 +2930,15 @@ <ol> <li>Suppose PROV instance <span class="math">I</span> contains all of the hypotheses <span class="name">hyp<sub>1</sub></span> and ... and <span class="name">hyp<sub>n</sub></span>. </li> - <li>Attempt to merge all of the equated terms in the conclusion + <li>Attempt to unify all of the equated terms in the conclusion <span class="name">t<sub>1</sub></span> = <span class="name">u<sub>1</sub></span> and ... and <span class="name">t<sub>n</sub></span> = <span class="name">u<sub>n</sub></span>. </li> - <li>If merging fails, then the constraint + <li>If unification fails, then the constraint is unsatisfiable, so application of the constraint to <span class="math">I</span> fails. If this failure occurs during <a>normalization</a> prior to validation, then <span class="math">I</span> is invalid, as explained in <a href="#normalization-validity-equivalence">Section 6</a>. </li> - <li>If merging succeeds with a substitution <span class="math">S</span>, then + <li>If unification succeeds with a substitution <span class="math">S</span>, then <span class="math">S</span> is applied to the instance <span class="math">I</span>, yielding result <span class="math">S(I)</span>.</li> </ol> @@ -2920,20 +2955,20 @@ <p> Because of the presence of attributes, key constraints do not reduce directly to uniqueness constraints. Instead, we enforce key - constraints as follows. </p> + constraints using the following <dfn>merging</dfn> process. </p> <ol> <li> Suppose <span class="name">r(a<sub>0</sub>; a<sub>1</sub>,...a<sub>n</sub>,attrs1)</span> and <span class="name">r(b<sub>0</sub>; b<sub>1</sub>,...b<sub>n</sub>,attrs2)</span> hold in PROV instance <span class="math">I</span>, where the key fields <span class="name">a<sub>k</sub> = b<sub>k</sub></span> are equal.</li> - <li> Attempt to merge all of the corresponding parameters <span + <li> Attempt to unify all of the corresponding parameters <span class="name">a<sub>0</sub> = b<sub>0</sub> </span> and ... and <span class="name">a<sub>n</sub> = b<sub>n</sub></span>. </li> - <li>If merging fails, then the constraint is unsatisfiable, so + <li>If unification fails, then the constraint is unsatisfiable, so application of the key constraint to <span class="math">I</span> fails. </li> - <li>If merging succeeds with substitution <span class="math">S</span>, then we remove <span + <li>If unification succeeds with substitution <span class="math">S</span>, then we remove <span class="name">r(a<sub>0</sub>; a<sub>1</sub>,...a<sub>n</sub>,attrs1)</span> and <span class="name">r(b<sub>0</sub>; b<sub>1</sub>,...b<sub>n</sub>,attrs2)</span> from <span class="math">I</span>, obtaining instance <span class="math">I'</span>, and return instance <span @@ -2941,14 +2976,15 @@ attrs2)}</span> ∪ <span class="math">S(I')</span>. </ol> + <p>Thus, if a PROV instance contains an apparent violation of a uniqueness - constraint or key constraint, merging can be used to determine + constraint or key constraint, unification or merging can be used to determine whether the constraint can be satisfied by instantiating some existential variables with other terms. For key constraints, this is the same as merging pairs of statements whose keys are equal and whose corresponding arguments are compatible, because after - merging respective arguments and attribute lists, the two statements + unifying respective arguments and combining attribute lists, the two statements become equal and one can be omitted. </p> @@ -4435,7 +4471,8 @@ variables), yielding an instance <span class="math">I<sub>2</sub></span>. </li> <li> - Apply all uniqueness constraints to <span class="math">I<sub>2</sub></span> by merging terms or statements + Apply all uniqueness constraints to <span + class="math">I<sub>2</sub></span> by unifying terms or merging statements and applying the resulting substitution to the instance, yielding an instance <span class="math">I<sub>3</sub></span>. If some uniqueness constraint cannot be applied, then normalization fails. @@ -4460,9 +4497,8 @@ <p> A PROV instance is <dfn id="dfn-valid">valid</dfn> -if its normal form exists and satisfies all of - the validity constraints; this implies that the instance satisfies - all of the definitions, inferences, and constraints. +if its normal form exists and all of + the validity constraints succeed on the normal form. The following algorithm can be used to test validity:</p> @@ -4487,7 +4523,8 @@ </li> </ol> -<p>A normal form of a PROV instance does not exist when a uniqueness constraint fails due to merging failure. </p> +<p>A normal form of a PROV instance does not exist when a uniqueness + constraint fails due to unification or merging failure. </p> <p>Two PROV instances are <dfn>equivalent</dfn> if they have isomorphic normal forms (that is, after applying all possible inference @@ -4518,17 +4555,16 @@ <li> Applying inference rules, definitions, and uniqueness constraints preserves equivalence. That is, a <a>PROV instance</a> is equivalent to the instance obtained by applying any - inference rule or definition, or by <a>merging</a> two statements to + inference rule or definition, or by <a>unifying</a> two terms or <a>merging</a> two statements to enforce a uniqueness constraint. </li> <li>Equivalence is <a>reflexive</a>, <a>symmetric</a>, and <a>transitive</a>.</li> </ul> <p> An application that processes PROV data SHOULD handle -equivalent instances in the same way. (Common exceptions to this rule -include, for example, pretty-printers that seek to preserve the -original order of statements in a file and avoid expanding -inferences.) </p> +equivalent instances in the same way. (Common exceptions to this guideline +include, for example, applications that pretty-print or digitally sign +provenance, where the order of statements matters.) </p> </section>