prov: changeset 4577:e1799f847977

--- a/model/prov-constraints.html	Sun Oct 28 20:14:34 2012 +0100
+++ b/model/prov-constraints.html	Mon Oct 29 10:46:25 2012 +0000
@@ -424,7 +424,8 @@
 
 
 <p> This document defines a subset of PROV instances called
-<i><a>valid</a></i> PROV instances.  
+<i><a>valid</a></i> PROV instances, by analogy with notions of
+validity for other Web standards.
 The intent of validation is ensure that a PROV instance represents a
 consistent history of objects and their interactions that is safe to
 use for the purpose of logical reasoning and other kinds of analysis.
@@ -479,7 +480,7 @@
 <li>[[PROV-SEM]] provides a mathematical semantics and a collection of
 axioms that provide a (non-normative, but equivalent) declarative specification, aimed
 at readers with a mathematical logic or formal background.  This can
-serve as a starting point for alteranative implementations. </li>
+serve as a starting point for alternative implementations. </li>
 </ul>
 
 </section>
@@ -536,9 +537,14 @@
 PROV instances. A <a>valid</a> PROV instance corresponds to a
 consistent history of objects and interactions to which logical
 reasoning can be safely applied. PROV instances need not
-be <a>valid</a>.  (The term <a>valid</a> is chosen by analogy with
-notions of validity in other W3C specifications; it is unfortunately
-incompatible with the usual meaning of "validity" in logic.)
+be <a>valid</a>.
+</p>
+<p>
+The term <a>valid</a> is chosen by analogy with
+notions of validity in other W3C specifications. Unfortunately, this
+terminology is incompatible with the usual meaning of "validity" in logic;
+our notion of validity of a PROV instance/document is closer to
+logical "consistency".
 </p>
 
 <p> This document specifies <em>definitions</em> of some
@@ -583,11 +589,11 @@
 which are determining whether PROV instances or documents convey the same
 information SHOULD check equivalence as specified here, and SHOULD
 treat equivalent instances or documents in the same way.  This is a
-guideline only, and the meaning of "in the same way" is
+guideline only, because meaning of "in the same way" is
 application-specific.  For example, applications that manipulate the syntax of
 PROV instances in particular representations, such as pretty-printing
-or digital signing, have good reasons to treat different equivalent
-documents differently.
+or digital signing, have good reasons to treat syntactically
+different, but equivalent, documents differently.
 </p>
 </section>
 <section>
@@ -1009,9 +1015,25 @@
   does not attempt to provide a complete introduction to these topics,
   but it is provided in order to aid readers familiar with one or more
   of these topics in understanding the specification, and to clarify
-  some of the motivations for choices in the specification to all readers.
+  some of the motivations for choices in the specification to all
+  readers.
   </p>
 
+    <p>As discussed below, the definitions, inferences and constraints
+can be viewed as pure logical assertions that could be checked in a
+variety of ways.  The rest of this document specifies validity and
+equivalence procedurally, that is, in terms of a reference
+implementation based on normalization.  Although both declarative and
+procedural specification techniques have advantages, a purely
+declarative specification offers much less assistance for
+implementers, while a procedural approach immediately demonstrates
+implementability and provides an adequate (polynomial-time) default implementation.  In
+this section we relate the declarative meaning of formulas to their
+procedural meaning.  [[PROV-SEM]] will provide an alternative,
+declarative characterization of validity and equivalence which could
+be used as a starting point for other implementation strategies.  </p>
+
+
  
  <h4>Constants, Variables and Placeholders</h4>
   <p>
@@ -1258,7 +1280,7 @@
   algorithm.  The logical formulas corresponding to the definitions,
   inferences, and constraints outlined above (and further elaborated
   in [[PROV-SEM]]) provides an equivalent specification, and any
-  implementation that correctly checks validity and equivalence (whether it performs nornalization or not) complies
+  implementation that correctly checks validity and equivalence (whether it performs normalization or not) complies
   with this specification.
   </p>
 
@@ -1430,6 +1452,29 @@
   those obtained by the algorithm in this specification.
   </p>
 
+<p>Equivalence is only explicitly specified for
+valid instances (whose normal forms exist and are unique up to
+isomorphism).  Implementations may test equivalences involving valid
+and invalid documents.  This specification does not constrain the
+behavior of equivalence checking involving invalid instances, provided
+that:
+</p>
+<ul>
+  <li>instance equivalence is <a>reflexive</a>, <a>symmetric</a> and
+  <a>transitive</a> on all instances</li>
+  <li> no valid instance is equivalent to an invalid instance.</li>
+  </ul>
+<p>
+Because of the second constraint, equivalence is essentially the union
+  of two equivalence relations on the disjoint sets of valid and
+  invalid instances.
+  There are two simple implementations of equivalence for invalid
+  documents that are correct:</p>
+  <ol>
+    <li>each invalid instance is equivalent only to itself</li>
+    <li>every pair of invalid instances are equivalent</li>
+  </ol>
+  
 <h4>From Instances to Bundles and Documents</h4>
 
 <p>PROV documents can contain multiple instances: a <a>toplevel
@@ -1767,7 +1812,7 @@
   via reduction to normal forms.  If checking validity or equivalence, the
   results MUST be the same as would be obtained by computing normal
   forms as defined in this specification.  Applications that explicitly compute
-  normal forms, or to follow the implementation strategy suggested by
+  normal forms, following the implementation strategy suggested by
   this specification, are by definition compliant.  However,
   applications can also comply by checking validity and equivalence in any
   other way that yields the same answers without explicitly applying
@@ -1831,6 +1876,11 @@
   its definition.  
 </p>
 
+  <div class="remark">
+    We use definitions primarily to expand the compact, concrete
+    PROV-N syntax, including short forms and optional parameters to the abstract syntax
+    implicitly used in PROV-DM.
+  </div>
 
   <p> Inferences have the following general form:</p>
 <div class='inference-example' id='inference-example'>
@@ -2161,7 +2211,7 @@
   Essentially, the definitions state that parameters  
 <span class="name">g,u</span> are expandable only if the activity is specified, i.e., if parameter <span class="name">a</span> is provided.
 
-The rationale for this is that when a is provided, then there have to be two events, namely <span class="name">u</span> and <span class="name">g</span>, which account for the usage of <span class="name">e1</span> and the generation of <span class="name">e2</span>, respectively, by <span class="name">a</span>. Conversely, if <span class="name">a</span> is not provided, then one cannot tell whether one or more activities are involved in the derivation, and the explicit introduction of such events, which correspond to a single acitivity, would therefore not be justified.  </p>
+The rationale for this is that when a is provided, then there have to be two events, namely <span class="name">u</span> and <span class="name">g</span>, which account for the usage of <span class="name">e1</span> and the generation of <span class="name">e2</span>, respectively, by <span class="name">a</span>. Conversely, if <span class="name">a</span> is not provided, then one cannot tell whether one or more activities are involved in the derivation, and the explicit introduction of such events, which correspond to a single activity, would therefore not be justified.  </p>
 
 
 <p>  A  later constraint, <a class="rule-ref"
@@ -3272,7 +3322,7 @@
 implies event ordering, or vice versa.  However, an application MAY flag time values
 that appear inconsistent with the event ordering as possible
 inconsistencies.  When generating provenance, an application SHOULD
-use a consistent imeline for related PROV statements within an
+use a consistent timeline for related PROV statements within an
 instance.</p>
 
 
@@ -4527,11 +4577,30 @@
   constraint fails due to unification or merging failure. </p>
 
 
-  <p>Two PROV instances are <dfn>equivalent</dfn> if they have isomorphic normal forms (that is, after applying all possible inference
+  <p>Two PROV instances <span class="math">I</span> and <span class="math">I'</span> are <dfn>isomorphic</dfn> if
+  there exists an invertible substitution <span class="math">S</span> that maps each
+  variable of <span class="math">I</span> to a distinct variable of <span class="math">I'</span> and such that
+  <span class="math">S(I) = I'</span>.</p>
+
+<p>  Two <a>valid</a> PROV instances are <dfn>equivalent</dfn> if they
+  have <a>isomorphic</a> normal forms.  That is, after applying all possible inference
 rules, the two instances produce the same set of PROV statements,
 up to reordering of statements and attributes within attribute lists,
-  and renaming of existential variables).
-Equivalence has the following characteristics: </p>
+  and renaming of existential variables.  
+</p>
+  <p>Equivalence can also be checked over pairs of PROV instances that
+  are not necessarily valid, subject to the following rules:
+  <ul>
+    <li>If both are valid, then equivalence is
+  defined above.</li>
+    <li>If both are invalid, then equivalence can be
+  implemented in any way provided it is <a>reflexive</a>, <a>symmetric</a>, and <a>transitive</a>.
+  </li>
+  <li>If one instance is valid and the other is invalid, then the two
+  instances are not equivalent.</li>
+  </ul>
+  <p>
+Equivalence has the following characteristics over valid instances: </p>
 
 <ul>
   <li>
@@ -4555,16 +4624,20 @@
   <li>
   Applying inference rules, definitions, and uniqueness constraints preserves equivalence.  That is, a <a>PROV
   instance</a> is equivalent to the instance obtained by applying any
-  inference rule or definition, or by <a>unifying</a> two terms or <a>merging</a> two statements to
+  inference rule or definition, or by <a title="unification">unifying</a> two terms or <a>merging</a> two statements to
   enforce a uniqueness constraint.
   </li>
-  <li>Equivalence is <a>reflexive</a>, <a>symmetric</a>, and <a>transitive</a>.</li>
+  <li>Equivalence is <a>reflexive</a>, <a>symmetric</a>, and
+  <a>transitive</a>.  (This is because a valid instance has a unique
+  normal form up to isomorphism [[DBCONSTRAINTS]]). </li>
 </ul>
 
 <p> An application that processes PROV data SHOULD handle
-equivalent instances in the same way. (Common exceptions to this guideline
+equivalent instances in the same way.  This guideline is necessarily
+imprecise because "in the same way" is application-specific.
+Common exceptions to this guideline
 include, for example, applications that pretty-print or digitally sign
-provenance, where the order of statements matters.)  </p>
+provenance, where the order and syntactic form of statements matters.  </p>
 
 </section>
 
@@ -4921,7 +4994,11 @@
  -->
 <!--  LocalWords:  disjointness inferrable subtyping subtype subtypes hadMember
  -->
-<!--  LocalWords:  hasMember toplevel sameAs tuple acyclicity isomorphism
+<!--  LocalWords:  hasMember toplevel sameAs tuple acyclicity isomorphism IRI
  -->
-<!--  LocalWords:  endBundle typeof equational acyclic invertible
+<!--  LocalWords:  endBundle typeof equational acyclic invertible procedurally
  -->
+<!--  LocalWords:  implementers multi unifier ERCIM Groth Stian Soiland Ilkay
+ -->
+<!--  LocalWords:  Altintas Reza
+ -->
author	James Cheney <jcheney@inf.ed.ac.uk>
	Mon, 29 Oct 2012 10:46:25 +0000
changeset 4577	e1799f847977
parent 4576	af2d939b4df4
child 4578	9e9b6e57383c