--- a/model/prov-constraints.html Wed Aug 08 12:10:14 2012 +0100
+++ b/model/prov-constraints.html Wed Aug 08 12:17:48 2012 +0100
@@ -527,16 +527,16 @@
reasoning can be safely applied. By default, PROV instances need not
be <a>valid</a>. </p>
-<p> This document specifies <em>inferences</em> over PROV instances
-that applications MAY employ, including <em>definitions</em> of some
-provenance statements in terms of others, and also defines a class of
+<p> This document specifies <em>definitions</em> of some
+provenance statements in terms of others, <em>inferences</em> over PROV instances
+that applications MAY employ, and also defines a class of
<a>valid</a> PROV instances by specifying <em>constraints</em> that
<a>valid</a> PROV instances must satisfy. There are four kinds of
constraints: <em>uniqueness constraints</em>, <em>event ordering
constraints</em>, <em>impossibility constraints</em>, and <em>type
constraints</em>.
Further discussion
-of the semantics of PROV statements, which justifies the inferences
+of the semantics of PROV statements, which justifies the definitions, inferences
and constraints, can be found in the formal semantics [[PROV-SEM]].
</p>
@@ -694,12 +694,12 @@
attributes; this leads to potential ambiguity, which is mitigated through the
use of identifiers.</p>
-<p>An <a>activity</a> is delimited by its <a title="activity start
+<p>An <a>activity</a>'s lifetime is delimited by its <a title="activity start
event">start</a> and its <a title="activity end event">end</a>
events. It occurs over
an interval delimited by two <a title="instantaneous event">instantaneous
events</a>. However, an activity statement need not mention start or end time information, because they may not be known.
-An activity's attribute-value pairs are expected to describe the activity's situation during its interval, i.e. an interval between two instantaneous events, namely its <a title="activity start event">start</a> event and its <a title="activity end event">end</a> event.
+An activity's attribute-value pairs are expected to describe the activity's situation during its lifetime.
</p>
@@ -783,6 +783,9 @@
</figure> <!-- <b>new Figure 1:</b> -->
</div>
+<div class="note">
+ James to add prose
+ </div>
</section>
@@ -1103,23 +1106,49 @@
</section>
-
<section id="inferences">
-<h2>Inferences and Definitions</h2>
+<h2>Definitions and Inferences</h2>
<p>
-This section describes <a title="inference">inferences</a> and <a title="definition">definitions</a> that MAY be used on
+This section describes <a title="definition">definitions</a> and <a title="inference">inferences</a> that MAY be used on
provenance data, and preserve <a>equivalence</a> on <a>valid</a>
PROV instances (as detailed in <a href="#normalization-validity-equivalence" class="sectionRef"></a>).
-An <dfn id="inference">inference</dfn> is a rule that can be applied
- to PROV instances to add new PROV statements. A <dfn
+A <dfn
id="definition">definition</dfn> is a rule that can be applied to
- PROV instances to replace defined expressions with definitions. A definition states that a
+ PROV instances to replace defined expressions with definitions. An <dfn id="inference">inference</dfn> is a rule that can be applied
+ to PROV instances to add new PROV statements. A definition states that a
provenance statement is equivalent to some other statements, whereas
an inference only states one direction of an implication; thus,
defined provenance statements can be replaced by their definitions.
</p>
-<p> Inferences have the following general form:</p>
+
+<p> Definitions have the following general form:</p>
+
+<div class='definition-example' id='definition-example'>
+<p>
+ <span class="name">defined_stmt</span> <span
+ class='conditional'>IF AND ONLY IF</span>
+ there exists <span class="name">a<sub>1</sub></span>,..., <span
+ class="name">a<sub>m</sub></span> such that <span
+ class="name">defining_stmt<sub>1</sub></span> and ... and <span class="name">defining_stmt<sub>n</sub></span>.</p>
+ </div>
+
+ <p>
+ A definition can be applied to a PROV instance, since its <span class="name">defined_stmt</span> is defined in
+ terms of other statements. Applying a
+ definition to an instance means that if an occurrence of a defined
+ provenance statement <span class="name">defined_stmt</span>
+ can be found in a PROV instance, then we can remove it and add all of the statements
+<span class="name">defining_stmt<sub>1</sub></span> ... <span class="name">defining_stmt<sub>n</sub></span> to the instance, possibly after generating fresh
+ identifiers <span class="name">a<sub>1</sub></span>,...,<span
+ class="name">a<sub>m</sub></span> for existential variables. In
+ other words, it is safe to replace
+ a defined statement with
+ its definition.
+</p>
+
+
+ <p> Inferences have the following general form:</p>
<div class='inference-example' id='inference-example'>
<p>
<span class='conditional'>IF</span> <span class="name">hyp<sub>1</sub></span> and ... and
@@ -1141,7 +1170,7 @@
class="name">a<sub>m</sub></span> for the existential variables. These fresh
identifiers might later be found to be equal to known identifiers;
they play a similar role in PROV constraints to existential
- variables in logic, "labeled nulls" in database theory
+ variables in logic, to "labeled nulls" in database theory
[[DBCONSTRAINTS]], or to blank nodes in [[!RDF]]. In general, omitted optional parameters to
[[PROV-N]] statements, or explicit <span class="name">-</span>
markers, are placeholders for existentially quantified variables;
@@ -1149,31 +1178,6 @@
this general rule, which are specified in <a class="rule-ref"
href="#optional-placeholders"><span>TBD</span></a>.</p>
-<p> Definitions have the following general form:</p>
-
-<div class='definition-example' id='definition-example'>
-<p>
- <span class="name">defined_exp</span> <span
- class='conditional'>IF AND ONLY IF</span>
- there exists <span class="name">a<sub>1</sub></span>,..., <span
- class="name">a<sub>m</sub></span> such that <span
- class="name">defining_exp<sub>1</sub></span> and ... and <span class="name">defining_exp<sub>n</sub></span>.</p>
- </div>
-
- <p>
- A definition can be applied to a PROV instance, since its <span class="name">defined_exp</span> is defined in
- terms of other statements. Applying a
- definition to an instance means that if an occurrence of a defined
- provenance statement <span class="name">defined_exp</span>
- can be found in a PROV instance, then we can remove it and add all of the statements
-<span class="name">defining_exp<sub>1</sub></span> ... <span class="name">defining_exp<sub>n</sub></span> to the instance, possibly after generating fresh
- identifiers <span class="name">a<sub>1</sub></span>,...,<span
- class="name">a<sub>m</sub></span> for existential variables. In
- other words, it is safe to replace
- a defined statement with
- its definition.
-</p>
-
<p> Definitions and inferences can be viewed as logical formulas;
similar formalisms are often used in rule-based reasoning [[CHR]]
and in databases [[DBCONSTRAINTS]]. In particular, the identifiers
@@ -1185,7 +1189,7 @@
nodes in RDF. In contrast, distinct URIs or literal values in PROV
are assumed to be distinct for the purpose of checking validity or
inferences. This issue is discussed in more detail under <a
- href="#uniqueness-constraints">Uniqueness Constraints</a> below.
+ href="#uniqueness-constraints">Uniqueness Constraints</a>.
</p>
<section>
@@ -1196,9 +1200,18 @@
be omitted in [[PROV-O]] notation. For the purpose of inference and
validity checking, we generate special identifiers called
<dfn>existential variables</dfn> denoting the unknown values.
- Existential variables can be <em>substituted</em> with constant
- identifiers, literals, the placeholder <span class="name">-</span>,
- or other <a>existential variables</a>.
+</p>
+<p>
+Existential variables can be <a title="substitution">substituted</a>
+ with other terms. Specifically, a
+ <dfn>substitution</dfn> is a function from a set of existential
+ variables to identifiers, literals, the placeholder <span class="name">-</span>,
+ or other <a>existential variables</a>. A substitution <span class="math">S</span> can be
+ applied to an instance <span class="math">I</span> by replacing all occurrences of existential
+ variables <span class="math">x</span> in the instance with <span class="math">S(x)</span>. <div
+ class="note">More explanation may be needed</div>
+</p>
+<p>
<a class="rule-ref" href="#optional-identifiers"><span>TBD</span></a>,
<a class="rule-ref" href="#optional-attributes"><span>TBD</span></a>, and
<a class="rule-ref" href="#definition-short-forms"><span>TBD</span></a>,
@@ -1495,9 +1508,9 @@
-<p id="communication-generation-use-inference_text">Communication between activities is <a
- title="definition">defined</a> as the existence of an underlying
-entity generated by one activity and used by the other.</p>
+<p id="communication-generation-use-inference_text">Communication between activities <a
+ title="definition">implies</a> the existence of an underlying
+entity generated by one activity and used by the other, and vice versa.</p>
<div class='inference' id='communication-generation-use-inference'>
<p>
@@ -1589,7 +1602,7 @@
-->
<p id="entity-generation-invalidation-inference_text">
-From an entity, we can infer the existence of
+From an entity statement, we can infer the existence of
generation and invalidation events.
</p>
<div class='inference' id='entity-generation-invalidation-inference'>
@@ -1755,8 +1768,10 @@
<div class='inference' id='revision-is-alternate-inference'>
<p>
<span class='conditional'>IF</span> <span
- class="name">wasDerivedFrom(_id; e2,e1,[prov:type='prov:Revision'])</span>, <span class='conditional'>THEN</span> <span
- class="name">alternateOf(e2,e1)</span>.
+ class="name">wasDerivedFrom(_id; e2,e1,_a,_g,_u,[prov:type='prov:Revision'])</span>, <span class='conditional'>THEN</span> <span
+ class="name">alternateOf(e2,e1)</span>. The activity <span class="name">_a</span>,
+ generation <span class="name">_g</span>,
+ or use <span class="name">_u</span> may be the placeholder <span class="name">-</span>.
</p>
<!--
<li><span class='conditional'>IF</span> <span
@@ -2050,7 +2065,7 @@
<li> <em>event ordering constraints</em> that say that it
should be possible to arrange the
events (generation, usage, invalidation, start, end) described in a
- PROV instance into a preorder that corresponds to a sensible
+ PROV instance into a <a>preorder</a> that corresponds to a sensible
"history" (for example, an entity should not be generated after it
is used); and
</li>
@@ -2242,9 +2257,10 @@
<li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
the <span class="name">wasInvalidatedBy(id; e,a,t,attrs)</span> statement.
</li>
- <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
+<!-- <li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
the <span class="name">wasDerivedFrom(id; e2, e1, attrs)</span> statement.
</li>
+ -->
<li>The identifier field <span class="name">id</span> is a <span class="conditional">KEY</span> for
the <span class="name">wasDerivedFrom(id; e2, e1, a, g2, u1, attrs)</span> statement.
</li>
@@ -2439,7 +2455,7 @@
as or after <span class="name">e2</span>. Both relations are
<a>preorder</a>s, meaning that they are <a>reflexive</a> and
<a>transitive</a>. Moreover, we sometimes consider <em>strict</em> forms of these
-orders: we say <span class="name">e1</span> strictly precedes <span
+orders: we say <span class="name">e1</span> <a>strictly precedes</a> <span
class="name">e2</span> to indicate that <span class="name">e1</span>
happened before <span class="name">e2</span>, but not at the same time. This is a
<a>transitive</a> relation. </p>
@@ -2808,7 +2824,7 @@
id='derivation-generation-generation-ordering'>
<p>
<span class="conditional">IF</span>
-<span class="name">wasDerivedFrom(_d; e2,e1,attrs)</span>
+<span class="name">wasDerivedFrom(_d; e2,e1,-,-,-,attrs)</span>
and
<span class="name">wasGeneratedBy(gen1; e1,_a1,_t1,_attrs1)</span>
and
@@ -3236,8 +3252,8 @@
</p>
<div class='constraint' id='impossible-object-property-overlap'>
<p>
-For each <span class="name">p</span> in <span class="name">entity</span>, <span class="name">activity</span>
- or <span class="name">agent</span> and for each <span class="name">r</span> in {
+For each <span class="name">p</span> in {<span class="name">entity</span>, <span class="name">activity</span>
+ or <span class="name">agent</span>} and for each <span class="name">r</span> in {
<span class="name">used</span>,
<span class="name">wasGeneratedBy</span>,
<span class="name">wasInvalidatedBy</span>,
@@ -3264,8 +3280,8 @@
<h2>Type Constraints</h2>
<p id="typing_text">The following rule establishes types denoted by identifiers from their use within expressions.
-For this, the function <span class="name">typeOf</span> gives the set of types denoted by an identifier.
-For example, <span class="name">typeOf(e)</span> returns the set of types associated with identifier <span class="name">e</span>. The function <span class="name">typeOf</span> is not a term of PROV, but a construct introduced to validate PROV statements.
+The function <span class="name">typeOf</span> gives the set of types denoted by an identifier.
+That is, <span class="name">typeOf(e)</span> returns the set of types associated with identifier <span class="name">e</span>. The function <span class="name">typeOf</span> is not a term of PROV, but a construct introduced to validate PROV statements.
</p>
@@ -3273,7 +3289,15 @@
<p>
For any identifier <span class="name">id</span>, <span class="name">typeOf(id)</span> is a subset of {<span class="name">'entity'</span>, <span class="name">'activity'</span>, <span class="name">'agent'</span>, <span class="name">'prov:Collection'</span>, <span class="name">'prov:EmptyCollection'</span>}.
-For identifiers that do not have a type, <span class="name">typeOf</span> gives the empty set.
+For identifiers that do not have a type, <span
+ class="name">typeOf</span> gives the empty set.
+ Identifiers can have more than one type, because of subtyping
+ (e.g. <span class="name">'prov:EmptyCollection'</span> is a subtype of <span
+ class="name">'prov:Collection'</span>) or because certain types are not
+ disjoint (such as <span class="name">'agent'</span> and <span class="name">'entity'</span>). The set of types
+ does not reflect all of the distinctions among objects, only those
+ relevant for checking validity. In particular, subtypes such as
+ <span class="name">'plan'</span> are omitted.
</p>
<p>To check if a PROV instance satisfies type constraints, one obtains the types of identifiers by application of
@@ -3288,7 +3312,7 @@
<div class='constraint' id="typing">
-<ul>
+<ol>
<li>
<span class='conditional'>IF</span>
<span class='name'>wasGeneratedBy(gen; e,a,t,attrs)</span>
@@ -3461,7 +3485,7 @@
<span class="name">'prov:Collection' ∈ typeOf(c)</span>AND
<span class="name">'prov:EmptyCollection' ∈ typeOf(c)</span>.
-</ul>
+</ol>
</div>
@@ -3534,9 +3558,8 @@
</div>
<p> We define the <dfn>normal form</dfn> of a PROV instance as the set
-of provenance statements resulting from merging to resolve all
-uniqueness constraints in the instance and applying all possible
-inference rules to this set. </p>
+of provenance statements resulting from applying all definitions,
+ inferences, and uniqueness constraints.</p>
@@ -3548,7 +3571,7 @@
</li>
<li>
Apply all inferences to <span class="math">I<sub>1</sub></span> by adding the conclusion of each inference
- whose hypotheses are satisfied and whose entire conclusions do not
+ whose hypotheses are satisfied and whose entire conclusion does not
already hold (again, possibly introducing fresh existential
variables), yielding an instance <span class="math">I<sub>2</sub></span>.
</li>
@@ -3585,7 +3608,7 @@
validity:</p>
<ol>
- <li>Normalize the instance, obtaining normalized instance <span class="math">I'</span>. If
+ <li>Normalize the instance #<I>#, obtaining normalized instance <span class="math">I'</span>. If
normalization fails, then <span class="math">I</span> is not <a>valid</a>.
</li>
<li>Apply all event ordering constraints to <span class="math">I'</span> to build a graph <span class="math">G</span> whose nodes
@@ -3595,18 +3618,20 @@
<li> Determine whether there is a cycle in <span class="math">G</span> that contains a
"strictly precedes" edge. If so, then <span class="math">I</span> is not <a>valid</a>.
</li>
+ <li>Apply the type constraints <a href="#type-constraints">(section
+ 5.3)</a> to determine whether there are any violations of
+ disjointness. If so, then <span class="math">I</span> is not <a>valid</a>.
<li>
- If no
- such cycle exists, and none of the impossibility constraints <a href="#impossibility-constraints">(section 5.3)</a> and type constraints <a href="#type-constraints">(section 5.4)</a> are
- violated, then <span class="math">I</span> is <a>valid</a>.
+ Check that none of the impossibility constraints <a href="#impossibility-constraints">(section 5.4)</a> are
+ violated. If any are violated, then <span class="math">I</span> is
+ not <a>valid</a>. Otherwise, #<I># is <a>valid</a>
</li>
</ol>
<p>A normal form of a PROV instance does not exist when a uniqueness constraint fails due to merging failure. </p>
- <p>Two PROV instances are <dfn>equivalent</dfn> if they have the
-isomorphic normal forms (that is, after applying all possible inference
+ <p>Two PROV instances are <dfn>equivalent</dfn> if they have isomorphic normal forms (that is, after applying all possible inference
rules, the two instances produce the same set of PROV statements,
up to reordering of statements and attributes within attribute lists,
and renaming of existential variables).
@@ -3617,18 +3642,20 @@
The order of provenance statements is irrelevant to the meaning of
a PROV instance. That is, a
PROV instance is equivalent to any other instance obtained by
- permuting its statements.
+reordering its statements.
</li>
<li>The order of attribute-value pairs in attribute lists is
irrelevant to the meaning of a PROV statement. That is, a PROV
statement carrying attributes is equivalent to any other statement
- obtained by permuting attribute-value pairs.
+ obtained by reordering attribute-value pairs and eliminating
+ duplicate pairs.
</li>
<li>The particular choices of names of existential variables are irrelevant to the meaning
- of an instance; that is, the names can be permuted without changing
- the meaning. (Replacing two different names with equal names,
+ of an instance; that is, the names can be renamed without changing
+ the meaning, as long as different names are always replaced with
+ different names. (Replacing two different names with equal names,
however, can
- change the meaning.)</li>
+ change the meaning, so does not preserve equivalence.)</li>
<li>
Applying inference rules, definitions, and uniqueness constraints preserves equivalence. That is, a <a>PROV
instance</a> is equivalent to the instance obtained by applying any