--- a/model/prov-constraints.html Wed Jul 18 13:37:37 2012 -0400
+++ b/model/prov-constraints.html Wed Jul 18 13:37:48 2012 -0400
@@ -206,8 +206,6 @@
"<a href=\"http://www.w3.org/TR/prov-n/\"><cite>PROV-N: The Provenance Notation</cite></a>. "+
"2011, Working Draft. "+
"URL: <a href=\"http://www.w3.org/TR/prov-n/\">http://www.w3.org/TR/prov-n/</a>",
-
-
"PROV-O":
"Satya Sahoo and Deborah McGuinness (eds.) Khalid Belhajjame, James Cheney, Daniel Garijo, Timothy Lebo, Stian Soiland-Reyes, and Stephan Zednik "+
"<a href=\"http://www.w3.org/TR/prov-o/\"><cite>Provenance Formal Model</cite></a>. "+
@@ -219,7 +217,12 @@
"<a href=\"http://www.w3.org/TR/prov-aq/\"><cite>Provenance Access and Query</cite></a>. "+
"2011, Working Draft. "+
"URL: <a href=\"http://www.w3.org/TR/prov-aq/\">http://www.w3.org/TR/prov-aq/</a>",
- };
+"RDF":
+ "Graham Klyne and Jeremy J. Carroll (eds.) "+
+ "<a href=\"http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/\"><cite>Resource Description Framework (RDF): Concepts and Abstract Syntax</cite></a>. "+
+ "2004, W3C Recommendation. "+
+ "URL: <a href=\"http://www.w3.org/TR/2004/REC-rdf-concepts-20040210//\">http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/</a>",
+};
var respecConfig = {
// specification status (e.g. WD, LCWD, NOTE, etc.). If in doubt use ED.
specStatus: "ED",
@@ -304,14 +307,14 @@
</p>
-<p> This document introduces <i>inferences</i> and <i>definitions</i>
- that are allowed on provenance statements and <i>constraints</i>
+<p> This document introduces <a>inferences</a> and <a>definitions</a>
+ that are allowed on provenance statements and <a>constraints</a>
that PROV instances MUST satisfy in order to be considered
- <i>valid</i>. These inferences and constraints are useful for
+ <a>valid</a>. These inferences and constraints are useful for
readers who develop applications that generate provenance or reason
- over provenance. They can also be used to <i>normalize</i> PROV
+ over provenance. They can also be used to <a title="normal form">normalize</a> PROV
instances to forms that can easily be compared in order to determine
- whether two PROV instances are equivalent.</p>
+ whether two PROV instances are <a>equivalent</a>.</p>
</section>
@@ -370,7 +373,15 @@
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
[[!RFC2119]].</p>
- </section>
+
+<p>In this document, logical formulas contain variables written as
+ lower-case identifiers. Some of these ariables are written
+ beginning with the underscore character <span
+ class="name">_</span>, by convention, to indicate that they
+ (intentionally) appear only once in the formula; thus, the textual
+ variable name is mnemonic only. </p>
+
+</section>
<section id="purpose">
@@ -378,18 +389,22 @@
<h3>Purpose of this document</h3>
<p> PROV-DM is a conceptual data model for provenance, which is
-realizable using different serializations such as PROV-O and PROV-N.
- The PROV-DM specification [[!PROV-DM]] imposes few
-requirements upon sets of PROV statements (or <a>instances</a>). By
-default, PROV instances need not be valid, that is, they may not
-correspond to a consistent history of objects and interactions to
-which logical reasoning can be safely applied. </p>
+realizable using different serializations such as PROV-N, PROV-O, or
+PROV-XML. A PROV <dfn>instance</dfn> is a set of PROV statements,
+possibly including <a>bundles</a>, or named sets of statements. For
+example, such a PROV instance could be a .provn document, the result
+of a query, a triple store containing PROV statements in RDF, etc. The
+PROV-DM specification [[PROV-DM]] imposes minimal requirements upon
+PROV instances. A <em>valid</em> PROV instance corresponds to a
+consistent history of objects and interactions to which logical
+reasoning can be safely applied. By default, PROV instances need not
+be valid. </p>
<p> This document specifies <em>inferences</em> over PROV instances
that applications MAY employ, including <em>definitions</em> of some
provenance statements in terms of others, and also defines a class of
<em>valid</em> PROV instances by specifying <em>constraints</em> that
-valid PROV instances must satisfy. Applications SHOULD produce valid
+xsvalid PROV instances must satisfy. Applications SHOULD produce valid
provenance and MAY reject provenance that is not valid. Applications
SHOULD also use definitions, inferences and constraints to normalize
PROV instances in order to determine whether two such instances convey
@@ -426,8 +441,8 @@
provenance. </p>
<p><a href="#constraints">Section 5</a> presents two kinds of constraints,
-<em>uniqueness</em> constraints that prescribe properties of PROV
-instances that can be checked directly by inspecting the syntax, and
+<em>uniqueness</em> constraints that prescribe that certain statments
+must be unique within PROV <a>instances</a>, and
<em>event ordering</em> constraints that require that the records in a
PROV <a>instance</a> are consistent with a sensible ordering of events
relating the activities, entities and agents involved.
@@ -528,7 +543,7 @@
other. Similarly, there is no assumption that the attributes of an
entity uniquely identify it. Two different entities that present the
same aspects of possibly different things can have the same
-attributes; this leads to ambiguity, which is mitigated through the
+attributes; this leads to potential ambiguity, which is mitigated through the
use of identifiers.</p>
<p>An <a>activity</a> is delimited by its <a title="activity start event">start</a> and its <a title="activity end event">end</a> events; hence, it occurs over
@@ -582,8 +597,7 @@
<strong>entity generation</strong>, and <strong>entity
invalidation</strong> events apply to entities, and the generation and
invalidation events delimit the <a>lifetime</a> of an entity. More
-specifically:
-
+precisely:
</p>
<p>An <dfn id="dfn-start-event">activity start event</dfn> is the <a title="instantaneous event">instantaneous event</a> that marks the instant an activity starts.</p>
@@ -605,41 +619,13 @@
for use. The entity no longer exists after this event.</p>
</section>
-</section>
-
-<section id="compliance">
-<h2>Compliance with this document</h2>
-
-<p>
- For the purpose of compliance, the normative sections of this document
- are <a href="#compliance"
-class="sectionRef"></a>, <a href="#inferences"
-class="sectionRef"></a>, <a href="#constraints"
-class="sectionRef"></a>, <a href="#equivalence"
-class="sectionRef"></a>, and <a
- href="#bundle-constraints" class="sectionRef"></a>.
- To be compliant:
- <ol><li>When processing provenance, an
- application MAY apply the inferences and definitions in <a
- href="#inferences" class='sectionRef'></a>.</li>
- <li>When determining whether a PROV instance is <a>valid</a>, an
- application MUST check that all of the
- constraints of <a href="#constraints" class="sectionRef"></a> are
- satisfied on the <a>normal form</a> of the instance.</li>
- <li> When producing provenance meant for other applications to
- use, the application SHOULD produce <a>valid</a> provenance, as specified in <a href="#equivalence" class="sectionRef"></a>. </li>
- <li>When determining whether two PROV instances are
- <a>equivalent</a>, an application MUST determine whether their
- normal forms are equal, as specified in <a href="#equivalence" class="sectionRef"></a>.
- <li>For PROV instances containing multiple <a>bundle</a>s, the
-inferences, definitions, constraints and inferences
-SHOULD be applied to each bundle independently, for the purpose of
- checking validity or equivalence, as specified in <a
- href="#bundle-constraints" class="sectionRef"></a>. </li>
- </ol>
+
+<section>
+<h4>Summary of constraints and inferences</h4>
+
+<p><a href="">Table 5</a> summarizes the definitions, inferences, and
+constraints of this document.
</p>
-
- <p>
<div class="note">Table: work in progress; do after constraints are done.</div>
@@ -696,6 +682,50 @@
</section>
+</section>
+
+<section id="compliance">
+<h2>Compliance with this document</h2>
+
+<p>
+ For the purpose of compliance, the normative sections of this document
+ are <a href="#compliance"
+class="sectionRef"></a>, <a href="#inferences"
+class="sectionRef"></a>, <a href="#constraints"
+class="sectionRef"></a>, and <a href="#normalization-validity-equivalence"
+class="sectionRef"></a>.
+
+
+ To be compliant:
+ <ol><li>When processing provenance, an
+ application MAY apply the inferences and definitions in <a
+ href="#inferences" class='sectionRef'></a>.</li>
+ <li>When determining whether a PROV instance is <a>valid</a>, an
+ application MUST check that all of the
+ constraints of <a href="#constraints" class="sectionRef"></a> are
+ satisfied on the <a>normal form</a> of the instance.</li>
+ <li> When producing provenance meant for other applications to
+ use, the application SHOULD produce <a>valid</a> provenance, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>. </li>
+ <li>When determining whether two PROV instances are
+ <a>equivalent</a>, an application MUST determine whether their
+ normal forms are equal, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>.
+ </ol>
+</p>
+
+ <p>
+ All diagrams are for illustration purposes
+ only. Text in appendices and
+in boxes labeled "Remark" is informative. Where there is any apparent
+ ambiguity between the descriptive text and the formal text in a
+ "definition", "inference" or "constraint" box, the formal text takes
+priority.
+ </p>
+
+ <div class="note"> To reviewers: We specifically invite review for
+ consistency between the informal and formal text.</div>
+
+</section>
+
<section id="inferences">
@@ -708,8 +738,8 @@
An <dfn id="inference">inference</dfn> is a rule that can be applied
to PROV instances to add new PROV statements. A <dfn
id="definition">definition</dfn> is a rule that states that a
- provenance expression is equivalent to some other expressions; thus,
- defined provenance expressions can be replaced by their definitions,
+ provenance statement is equivalent to some other statements; thus,
+ defined provenance statements can be replaced by their definitions,
and vice versa.
</p>
@@ -723,14 +753,20 @@
class="name">conclusion_1</span> and ... and <span class="name">conclusion_n</span>.</p>
</div>
-<p>
- This means that if all of the provenance expressions matching <span class="name">hyp_1</span>... <span class="name">hyp_k</span>
- can be found in a PROV instance, we can add all of the expressions
- <span class="name">concl_1</span> ... <span class="name">concl_n</span> to the instance, possibly after generating fresh
- identifiers <span class="name">a_1</span>,...,<span class="name">a_m</span> for unknown objects. These fresh
+<p> This means that if all of the provenance statements matching
+ <span class="name">hyp_1</span>... <span class="name">hyp_k</span>
+ can be found in a PROV instance, we can add all of the statements
+ <span class="name">concl_1</span> ... <span
+ class="name">concl_n</span> to the instance, possibly after
+ generating fresh identifiers <span class="name">a_1</span>,...,<span
+ class="name">a_m</span> for unknown objects. These fresh
identifiers might later be found to be equal to known identifiers;
- they play a similar role in PROV constraints to existential variables in logic.
-</p>
+ they play a similar role in PROV constraints to existential
+ variables in logic or to blank nodes in [[RDF]]. With a few
+ exceptions (discussed below), omitted optional parameters to
+ [[PROV-N]] statements, or explicit <span class="name">-</span>
+ markers, are placeholders for existentially quantified variables;
+ that is, they denote unknown values. </p>
<p> Definitions have the following general form:</p>
@@ -743,14 +779,14 @@
</div>
<p>
- This means that a provenance expression <span class="name">defined_exp</span> is defined in
- terms of other expressions. This can be viewed as a two-way
+ This means that a provenance statement <span class="name">defined_exp</span> is defined in
+ terms of other statements. This can be viewed as a two-way
inference: If <span class="name">defined_exp</span>
- can be found in a PROV instance, we can add all of the expressions
+ can be found in a PROV instance, we can add all of the statements
<span class="name">defining_exp_1</span> ... <span class="name">defining_exp_n</span> to the instance, possibly after generating fresh
identifiers <span class="name">a_1</span>,...,<span
class="name">a_m</span> for unknown objects. It is safe to replace
- a defined expression with
+ a defined statement with
its definition.
</p>
@@ -772,40 +808,127 @@
<h4>Optional Identifiers and Attributes</h4>
<p>Many PROV relation statements have an identifier,
- identifting a link between two or more related objects. This
- identifier is optional and can be omitted in [[PROV-N]] notation.
- For the purpose of inference and validity checking, it is convenient
- to generate dummy identifiers (i.e. existentially quantified
- variables) denoting the missing identifiers.
+ identifying a link between two or more related objects. Identifiers
+ can sometimes be omitted in [[PROV-N]] notation.
+ For the purpose of inference and validity checking, we generate
+ special identifiers called <dfn>existential variables</dfn> denoting
+ the unknown values.
</p>
+
<div class="definition" id="optional-identifiers">
<p>
<span class="name">r(a_1,...,a_n) </span> holds <span class="conditional">IF AND ONLY IF</span>
there exists <span class="name">id</span> such that <span class="name">r(id;a_1,...,a_n)</span> holds.</p>
</div>
- <p>Likewise, many PROV expressions allow for an optional attribute
+ <p>Likewise, many PROV statements allow for an optional attribute
list. If it is omitted, this is the same as specifying an empty
attribute list:</p>
<div class="definition" id="optional-attributes">
- <p>
+<ol>
+ <li>
<span class="name">r(a_1,...,a_n)</span> holds and does not contain an explicit attribute
list <span class="conditional">IF AND ONLY IF</span> <span
- class="name">r(a_1,...,a_n,[])</span> holds.</p>
- <p>
+ class="name">r(a_1,...,a_n,[])</span> holds.</li>
+ <li>
Similarly, <span class="name">r(id;a_1,...,a_n)</span> holds and does not contain an explicit attribute
- list <span class="conditional">IF AND ONLY IF</span> <span class="name">r(id;a_1,...,a_n,[])</span> holds.</p>
+ list <span class="conditional">IF AND ONLY IF</span> <span
+ class="name">r(id;a_1,...,a_n,[])</span> holds.</li></ol>
</div>
<p> Finally, many PROV
- expressions have other optional arguments or short forms that can be
+ statements have other optional arguments or short forms that can be
used if none of the optional arguments is present. These are
- handled by specific rules listed below. </p> </section>
-
-<div class="note">Add expansion rules for optionals</div>
+ handled by specific rules listed below. </p>
+
+ <div class="note"> The following list assumes that we have already
+ expanded id and attribute parameters, so that we do not need
+ extra cases. Moreover it assumes that "-" will be normalized
+ away in the next step.</div>
+<div class="definition" id="definition-short-forms">
+ <p>
+
+ <ol>
+ <li> <span class="name">activity(id,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">activity(id,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">wasGeneratedBy(id;e,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">wasGeneratedBy(id;e,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">used(id;a,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">used(id;a,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">wasStartedBy(id;a,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">wasStartedBy(id;a,-,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">wasEndedBy(id;a,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">wasEndedBy(id;a,-,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">wasInvalidatedBy(id;e,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">wasInvalidatedBy(id;e,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">wasAssociatedWith(id;e,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">wasAssociatedWith(id;e,-,-,attrs)</span>.
+ </li>
+ <li><span class="name">actedOnBehalfOf(id;a2,a1,attrs)</span> <span class="conditional">IF AND ONLY IF</span> <span class="name">actedOnBehalfOf(id;a2,a1,-,attrs)</span>.
+ </li>
+ </ol>
+ </p>
+ </div>
+
+ <div class="remark">
+<p>
+ Note that there is no expansion rule for wasDerivedFrom. In a derivation of the form
+ <span class="name">wasDerivedFrom(id;e1,e2,attr)</span>, the
+ absence of the optional activity, generation and use identifiers
+ means that the derivation relationship may encompass multiple activities,
+generations, and uses. Thus, it is not equivalent to <span
+ class="name">wasDerivedFrom(id;e1,e2,a,g,u,attr)</span> where some
+ activity, generation and use are given explicitly.
+ The short
+ form is not defined in terms of the long form in this case.</p>
+
+ <p>
+ There
+ are also no expansion rules for entity, agent, communiction,
+ attribution, influence, alternate, or specialization, because these have no optional parameters aside
+ from the identifier and attribute, which are expanded by other
+ rules above.</p>
+ </div>
+
+<div id="optional-attributes1">
+<p>Finally, most optional parameters (written <span
+ class="name">-</span>) are, for the purpose of this document,
+ considered to be distinct, fresh existential variables. Thus,
+ before proceeding to apply other definitions or inferences, most
+ occurrences of <span class="name">-</span> MUST be replaced
+ by fresh existential variables, distinct from any others occurring in
+ the instance.
+ The only exceptions, where <span class="name">-</span> MUST be left
+ in place, is the plan parameter in <span class="name">wasAssociatedWith</span>.
+ </p>
+
+ <div class="note">
+ The following is a template for a large number of more precise
+ rules specifying how to fill in optional parameters. It would be
+ nice to find a uniform way of saying this without lots of tedious
+ rules, since the expansion is completely uniform except for the
+ plan parameter.</div>
+
+ <div class="definition" id="optional-variables">
+
+ <p> <span class="name">r(a_0;...,a_{i-1}, -, a_i, ...,a_n) </span> <span class="conditional">IF AND ONLY IF</span> there exists <span class="name">a'</span>
+ such that <span class="name">r(a_0;...,a_{i-1},a',a_i,...,a_n)</span>. This definiton
+ applies to any relation containing a placeholder <span
+ class="name">-</span>, <b>except</b> for a placeholder in the <span class="name">plan</span>
+ position (i.e., the fourth argument) of <span class="name">wasAssociatedWith</span>.
+ </p>
+ </div>
+
+ <div class="remark">In an association of the form
+ <span class="name">wasAssociatedWith(id;a, ag,-,attr)</span>, the
+ absence of a plan means: either no plan exists, or a plan exists but
+ it is not identified. Thus, it is not equivalent to <span
+ class="name">wasAssociatedWith(id;a,ag,p,attr)</span> where a
+ plan <span class="name">p</span> is given.
+ </div>
+</section>
<section>
- <h3>Component 1: Entities and Activities</h3>
+ <h3>Entities and Activities</h3>
@@ -838,7 +961,8 @@
wasInformedBy(a2,a1)
wasInformedBy(a3,a2)
</pre>
-<p> We cannot infer <span class="name">wasInformedBy(a3,a1)</span> from these expressions. Indeed,
+ <div class="remark">
+<p> We cannot infer <span class="name">wasInformedBy(a3,a1)</span> from these statements. Indeed,
from
<span class="name">wasInformedBy(a2,a1)</span>, we know that there exists <span class="name">e1</span> such that <span class="name">e1</span> was generated by <span class="name">a1</span>
and used by <span class="name">a2</span>. Likewise, from <span class="name">wasInformedBy(a3,a2)</span>, we know that there exists <span class="name">e2</span> such that <span
@@ -856,6 +980,7 @@
<figcaption id="counterexample-wasInformedBy"><b>Figure 1:</b> Counter-example for transitivity of wasInformedBy</figcaption>
</figure>
</div>
+</div>
<hr />
<p id="activity-start-end-inference">
@@ -867,7 +992,9 @@
<p>
<span class='conditional'>IF</span> <span
class="name">activity(a,t1,t2,_attrs)</span> <span
- class="conditional">THEN</span> <span
+ class="conditional">THEN</span> there exist <span class="name">_id1</span>, <span class="name">_e1</span>,
+ <span class="name">_attrs1</span>, <span class="name">_id2</span>, <span class="name">_e2</span>, and <span class="name">_attrs2</span> such that
+ <span
class="name">wasStartedBy(_id1;a,_e1,t1,_attrs1)</span> and <span class="name">wasEndedBy(_id2;a,_e2,t2,_attrs2)</span>.
</div>
@@ -883,7 +1010,8 @@
<div class='inference' id='wasStartedBy-inference'>
<p><span class='conditional'>IF</span>
<span class="name">wasStartedBy(_id;a,e1,a1,_t,_attrs)</span>,
-<span class='conditional'>THEN</span>
+<span class='conditional'>THEN</span> there exist <span class="name">_gen</span>, <span class="name">_t1</span>,
+ and <span class="name">_attrs1</span> such that
<span class="name">wasGeneratedBy(_gen;e1,a1,_t1,_attrs1)</span>.</p>
</div>
<p>
@@ -900,7 +1028,8 @@
<div class='inference' id='wasEndedBy-inference'>
<p><span class='conditional'>IF</span>
<span class="name">wasEndedBy(_id;a,e1,a1,_t,_attrs)</span>,
-<span class='conditional'>THEN</span>
+<span class='conditional'>THEN</span> there exist <span class="name">_gen</span>, <span class="name">_t1</span>,
+ and <span class="name">_attrs1</span> such that
<span class="name">wasGeneratedBy(_gen;e1,a1,_t1,_attrs1)</span>.</p>
</div>
@@ -910,7 +1039,7 @@
</section>
<section>
-<h3>Component 2: Derivations</h3>
+<h3>Derivations</h3>
<hr>
@@ -925,12 +1054,12 @@
class="name">wasDerivedFrom(_id;e2,e1,a,id2,_id1,_attrs)</span>,
<span class='conditional'>THEN</span> there exist <span
class="name">_t2</span> and <span class="name">_attrs2</span>
- such that <span class="name">wasGeneratedBy(id2;e2,a,_t2,_attrs2)</span>
+ such that <span class="name">wasGeneratedBy(id2;e2,a,_t2,_attrs2)</span>.
<li><span class='conditional'>IF</span> <span
class="name">wasDerivedFrom(_id;e2,e1,a,_id2,id1,_attrs)</span>,
<span class='conditional'>THEN</span> there exist <span
class="name">_t1</span> and <span class="name">_attrs1</span>
- such that <span class="name">used(id1;a,e1,_t1,_attrs1)</span>
+ such that <span class="name">used(id1;a,e1,_t1,_attrs1)</span>.
</ol>
</div>
<p>
@@ -952,7 +1081,8 @@
</p>
-<p>The converse inference does not hold. Informally, from <span
+<div class="remark">
+ <p>The converse inference does not hold. Informally, from <span
class="name">wasDerivedFrom(e2,e1)</span> and <span
class="name">used(a,e1,-)</span>, one cannot derive <span
class="name">wasGeneratedBy(e2,a,-)</span> because identifier <span
@@ -965,23 +1095,23 @@
transitive either, following similar reasoning as for <span
class="name">wasInformedBy</span>. </p>
-
-
-
+ </div>
+
+<hr />
<p>A revision admits the following inference, stating that the two entities
linked by a revision are also alternates.</p>
<div class='inference' id='revision-is-alternate'>
-<p><span class='conditional'>IF</span> <span
+<ol><li><span class='conditional'>IF</span> <span
class="name">wasDerivedFrom(_id;e2,e1,[prov:type='prov:Revision'])</span>
holds, <span class='conditional'>THEN</span> <span
class="name">alternateOf(e2,e1)</span> holds.
-</p>
-<p><span class='conditional'>IF</span> <span
+</li>
+<li><span class='conditional'>IF</span> <span
class="name">wasDerivedFrom(_id;e2,e1,_act,_gen,_use,[prov:type='prov:Revision'])</span>
holds, <span class='conditional'>THEN</span> <span
class="name">alternateOf(e2,e1)</span> holds.
-</p>
+</li></ol>
</div>
@@ -1004,8 +1134,8 @@
<span class='conditional'>IF</span>
<span class="name">wasQuotedFrom(e2,e1,ag2,ag1,attrs)</span> holds for some identifiers
<span class="name">e2</span>, <span class="name">e1</span>, <span class="name">ag2</span>, <span class="name">ag1</span>,
-<span class='conditional'>THEN</span> the following hold:
-<pre>
+<span class='conditional'>THEN</span>
+<pre
wasDerivedFrom(e2,e1)
wasAttributedTo(e2,ag2)
wasAttributedTo(e1,ag1)
@@ -1019,7 +1149,7 @@
<section >
-<h3>Component 3: Agents</h3>
+<h3>Agents</h3>
<p id='inference-attribution_text'> Attribution identifies an agent as responsible for an entity. An
agent can only be responsible for an entity if it was associated with
@@ -1043,12 +1173,10 @@
<span class="name">_attrs1</span>,
<span class="name">_attrs2</span>,
<span class="name">_attrs3</span>,
- such that the following statements hold:
-<pre>
-activity(a,_t1,_t2,_attrs1)
-wasGeneratedBy(_gen;e,a,_t,_attrs2)
-wasAssociatedWith(_assoc;a,ag,_pl,_attrs3)
-</pre>
+ such that
+<span class="name">activity(a,_t1,_t2,_attrs1)</span> and
+<span class="name">wasGeneratedBy(_gen;e,a,_t,_attrs2)</span> and
+<span class="name">wasAssociatedWith(_assoc;a,ag,_pl,_attrs3)</span> hold.
</p>
</div>
<hr />
@@ -1056,21 +1184,36 @@
another, in the context of some activity. The supervising agent
delegates some responsibility for part of the activity to the
subordinate agent, while retaining some responsibility for the overall
-activity. </p>
-<div class="note">
- @@TODO: Do we want both of these inferences?
- </div>
+activity. Both agents are associated with this activity.</p>
+
<div class='inference' id='inference-delegation'>
<p>
<span class='conditional'>IF</span>
-<span class="name">wasAssociatedWith(_id1;a, ag1, _pl1, _attrs1)</span>
- and <span class="name">actedOnBehalfOf(_id;ag1, ag2, a, _attrs)</span>,
-<span class='conditional'>THEN</span> there exist <span
- class="name">_id2</span>, <span class="name">_pl2</span>, and <span class="name">_attrs2</span> such that <span class="name">wasAssociatedWith(_id2;a, ag2, _pl2, _attrs2)</span>.
+<span class="name">actedOnBehalfOf(_id;ag1, ag2, a, _attrs)</span>
+<span class='conditional'>THEN</span> there exist <span
+ class="name">_id1</span>, <span class="name">_pl1</span>, <span
+ class="name">_attrs1</span>, <span
+ class="name">_id2</span>, <span class="name">_pl2</span>, and <span
+ class="name">_attrs2</span> such that <span
+ class="name">wasAssociatedWith(_id1;a, ag1, _pl1, _attrs1)</span>
+ and <span class="name">wasAssociatedWith(_id2;a, ag2, _pl2, _attrs2)</span>.
</p>
</div>
+<div class="note">Make it clear somewhere that the existential
+ variables _pl1 and _pl2 can be replaced with either a concrete plan
+ identifier or <span class="name">-</span> placeholder, thus, there is no need for a
+ disjunction in the head of the rule to cover both cases.
+ </div>
+<div class="remark">
+ Note that the two associations between the agents and the activity
+ may have different identifiers, different plans, and different
+ attributes. In particular, the plans of the two agents need not be
+ the same, and one, both, or neither can be the placeholder <span class="name">-</span>
+ indicating that there is no plan.
+ </div>
+
<hr />
<p id="inference-influence_text">
The <span class="name">wasInfluencedBy</span> relation is implied by other relations, including
@@ -1082,7 +1225,7 @@
<p>
<ol>
<li>
- <span class="conditional">IF</span> <span class="name">wasGeneratedBy(id;e,a,t,attrs)</span> <span class="conditional">THEN</span> <span class="name">wasInfluencedBy(id; o2, o1, attrs)</span>.
+ <span class="conditional">IF</span> <span class="name">wasGeneratedBy(id;e,a,t,attrs)</span> <span class="conditional">THEN</span> <span class="name">wasInfluencedBy(id; e, a, attrs)</span>.
</li>
<li>
<span class="conditional">IF</span> <span class="name">used(id;a,e,t,attrs)</span> <span class="conditional">THEN</span> <span class="name">wasInfluencedBy(id; a, e, attrs)</span>.
@@ -1118,7 +1261,7 @@
<section>
-<h3>Component 5: Alternate and Specialized Entities</h3>
+<h3>Alternate and Specialized Entities</h3>
@@ -1148,6 +1291,7 @@
<hr>
+ <div class="note">Perhaps this should be a constraint.</div>
<p id="specialization-irreflexive_text">Similarly, specialization is a
<a>strict partial order</a>: it is <a>irreflexive</a> and
<a>transitive</a>.</p>
@@ -1158,11 +1302,14 @@
<span class='name'>specializationOf(e,e)</span> holds.</p>
</div>
-<p>
+
<hr>
-<p id="specialization-antisymmetric_text"/>
-
-<div class='inference' id="specialization-antisymmetric">
+<div class="note"> This is also a constraint, but follows from
+ irreflexivity and transitivity so we may omit it.</div>
+
+ <p id="specialization-asymmetric_text"/>
+
+<div class='inference' id="specialization-asymmetric">
<p> For any
entities <span class='name'>e1</span>, <span
class='name'>e2</span>,
@@ -1196,10 +1343,11 @@
For any entities <span class='name'>e1</span>, <span class='name'>e2</span>, <span class='conditional'>IF</span> <span class='name'>specializationOf(e1,e2)</span> <span class='conditional'>THEN</span> <span class='name'>alternateOf(e1,e2)</span>.</p>
</div>
-
- <div class="note">Confirm inferences about attributes,
- generation, invalidation? See below. What about shared attributes
- of overlapping entities?
+<hr />
+ <div class="note"> Explanation needed. Also, the following
+ inference is too specific - it only handles the case of a single
+ attribute, whereas in general we want the sub-entity to inherit all
+ attributes of the super-entity.
</div>
<div class="inference" id="specialization-attributes">
@@ -1210,11 +1358,12 @@
</div>
+ <hr />
+
<div class="note">
<p>Note: The following inference is associated with a feature "<a href="http://www.w3.org/2005/10/Process-20051014/tr#cfi">at risk</a>" and may be removed from this specification based on feedback. Please send feedback to public-prov-comments@w3.org.</p>
</div>
- <hr>
<p id="mention-specialization_text">If one entity is a mention of another in a bundle, then the former is also a specialization of the latter:</p>
@@ -1241,10 +1390,10 @@
This section defines a collection of constraints on PROV instances.
There are two kinds of constraints:
<ul><li><em>uniqueness constraints</em> that say that a <a>PROV
- instance</a> can contain at most one expression or that multiple
- expressions about the same objects need to have the same values (for
+ instance</a> can contain at most one statement of each kind with a
+ given identifier. For
example, if we describe the same generation event twice, then the
- two expressions should have the same times);
+ two statements should have the same times;
</li>
<li> and <em>event ordering constraints</em> that say that it
should be possible to arrange the
@@ -1264,17 +1413,22 @@
- <p> Uniqueness constraints are checked through <a>merging</a> pairs of
- statements subject to equalities. For example, suppose we have
- two activity statements <span class="name">activity(a,t1,-,[a=1])</span>
- and <span class="name">activity(a,-,t2,[b=2])</span>. The
- <a>merge</a> of these two statements (describing the same activity
- <span class="name">a</span>) is <span
- class="name">activity(a,t1,t2,[a=1,b=2])</span>.
- </p>
+ <p> In the absence of existential variables, uniqueness constraints
+ could be checked directly by checking that no identifier appears
+ more than once for a given statement. However, in the presence of
+ existential variables, we need to be more careful to combine
+ partial information that might be present in multiple compatible
+ statements, due to inferences. Uniqueness constraints are
+ enforced through <a>merging</a> pairs of statements subject to
+ equalities. For example, suppose we have two activity statements
+ <span class="name">activity(a,t1,-,[a=1])</span> and <span
+ class="name">activity(a,-,t2,[b=2])</span>. The <a>merge</a> of
+ these two statements (describing the same activity <span
+ class="name">a</span>) is <span
+ class="name">activity(a,t1,t2,[a=1,b=2])</span>. </p>
<p> Merging can be applied
- to a pair of terms, a pair of attributes, or a pair of statements.
+ to a pair of terms, or a pair of attribute lists.
The result of merging is either a substitution (mapping
existentially quantified variables to terms) or a special symbol
<span class="name">undefined</span> indicating that the merge
@@ -1283,24 +1437,56 @@
<ul>
<li> If <span class="name">t</span> and <span
- class="name">t'</span> are concrete identifiers or values, then
+ class="name">t'</span> are concrete identifiers or values
+ (including the placeholder <span class="name">-</span>), then
their <a>merge</a> exists only if they are equal, otherwise merging
is undefined. </li>
- <li> If one <span class="name">t</span> and
- <span class="name">t'</span> is an existential variable, then their
- <a>merge</a> is </li>
- <li> The <a>merge</a> of two attribute lists <span
+ <li> If one <span class="name">x</span> is an existential variable
+ and
+ <span class="name">t'</span> is any term (identifier, constant,
+ placeholder <span class="name">-</span>, or
+ existential variable), then their
+ <a>merge</a> is <span class="name">t'</span>, and the resulting substitution is
+ <span class="name">[x=t']</span>. In the special case where <span class="name">t'=x</span>, the merge is
+ <span class="name">x</span> and the resulting substitution is empty.</li>
+ <li> If one <span class="name">t</span> is any term (identifier, constant,
+ placeholder <span class="name">-</span>, or
+ existential variable) and
+ <span class="name">x'</span> is an existential variable, then their
+ merge is the same as the merge of <span class="name">x</span> and <span class="name">t</span>.</li>
+ <li> The <a>merge</a> of two attribute lists <span
class="name">attrs1</span> and <span class="name">attrs2</span>
is their union, considered as unordered lists, written <span
- class="name">attrs1 ∪ attrs2</span> </li>
+ class="name">attrs1 ∪ attrs2</span>. </li>
</ul>
-
+
+<div class="remark">Merging for terms is analogous to unification in
+ logic programming and theorem proving, restricted to flat terms with
+ no function symbols. No occurs check is needed because there are no
+ function symbols.</div>
+
+ <p>
A typical uniqueness constraint is as follows:
+ </p>
<div class="constraint-example" id="uniqueness-example">
-<p> IF hyp_1 AND ... AND hyp_n THEN t_1 = u_1 AND ... AND t_n = u_n.</p>
+<p> IF <span class="name">hyp_1</span> AND ... AND <span class="name">hyp_n</span> THEN <span class="name">t_1</span> = <span class="name">u_1</span> AND ... AND <span class="name">t_n</span> = <span class="name">u_n</span>.</p>
</div>
+ <p> Such a constraint is enforced as follows:</p>
+ <ol> <li>Suppose PROV instance I contains all of the hypotheses
+ <span class="name">hyp_1</span> AND ... AND <span class="name">hyp_n</span>.
+ </li>
+ <li>Attempt to merge all of the equated terms in the conclusion
+ <span class="name">t_1</span> = <span class="name">u_1</span> AND ... AND <span class="name">t_n</span> = <span class="name">u_n</span>.
+ </li>
+ <li>If merging fails, then the constraint
+ is unsatisfiable, so application of the constraint to I
+ fails.</li>
+ <li>If merging succeeds with a substitution S, then
+ S is applied to the instance I, yielding result I(S).</li>
+ </ol>
+
<p> In the common case that a particular field of a relation uniquely
determines the other fields, we call the uniqueness constraint a
<a>key constraint</a>. Key constraints are written as follows:
@@ -1310,29 +1496,40 @@
class="constraint-example" id="key-example">
<p>The <span
class="name">a_k</span> field is a KEY for relation <span
- class="name">r(a_0;a_1,...a_n)</span>. </p></div>
-
- <p> A key constraint is interpreted as a uniqueness constraint of the
- following form:</p>
-
- <div
- class="constraint-example" id="key-example-expanded">
- <p>IF <span
- class="name">r(a_0;a_1,...a_n)</span> AND <span
- class="name">r(b_0;b_1,...b_n)</span> (where <span
- class="name">a_k = b_k</span>) THEN <span
+ class="name">r(a_0;a_1,...,a_n)</span>. </p></div>
+
+ <p> Because of the presence of attributes, key constraints do not
+ reduce directly to uniqueness constraints. Instead, we enforce key
+ constraints as follows. </p>
+ <ol>
+ <li> Suppose <span
+ class="name">r(a_0;a_1,...a_n,attrs1)</span> AND <span
+ class="name">r(b_0;b_1,...b_n,attrs2)</span> hold in PROV instance I, where the key fields <span
+ class="name">a_k = b_k</span> are equal.</li>
+ <li> Attempt to merge all of the corresponding parameters <span
class="name">a_0 = b_0 </span> AND ... AND <span
- class="name">a_n = b_n</span>. </p></div>
-
-<p>If a PROV instance contains an apparent violation of a uniqueness
+ class="name">a_n = b_n</span>.
+ </li>
+ <li>If merging fails, then the constraint is unsatisfiable, so
+ application of the key constraint to I fails.
+ </li>
+ <li>If merging succeeds with substitution S, then we remove <span
+ class="name">r(a_0;a_1,...a_n,attrs1)</span> AND <span
+ class="name">r(b_0;b_1,...b_n,attrs2)</span> from I, obtaining
+ instance I', and return instance <span
+ class="name">{r(S(a_0);S(a_1),...S(a_n),attrs1 ∪
+ attrs2)}</span> ∪ S(I').
+ </ol>
+
+
+<p>Thus, if a PROV instance contains an apparent violation of a uniqueness
constraint or key constraint, merging can be used to determine
- whether the constraint can be satisfied by unifying an existential
- variable with another term. For key constraints, this is the same
- as merging pairs of statements whose keys are equal, because after
- merging, the two statements become equal and one can be omitted.
- Finally, the substitution obtained by merging is applied to all
- other statements in the instance, to reflect additional knowledge
- about existential variables gained through merging.</p>
+ whether the constraint can be satisfied by instantiating some existential
+ variables with other terms. For key constraints, this is the same
+ as merging pairs of statements whose keys are equal and whose
+ coresponding arguments are compatible, because after
+ merging respective arguments and attribute lists, the two statements
+ become equal and one can be omitted. </p>
@@ -1369,15 +1566,12 @@
<div class='constraint' id='key-relation'>
<p><ol>
<li>The identifier field <span class="name">id</span> is a KEY for
- the <span class="name">used(id;a,e,t,attrs)</span> statement.
- </li>
- <li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">wasGeneratedBy(id;e,a,t,attrs)</span> statement.
</li>
<li>The identifier field <span class="name">id</span> is a KEY for
- the <span class="name">wasInvalidatedBy(id;e,a,t,attrs)</span> statement.
+ the <span class="name">used(id;a,e,t,attrs)</span> statement.
</li>
- <li>The identifier field <span class="name">id</span> is a KEY for
+ <li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">wasInformedBy(id;a2,a1,attrs)</span> statement.
</li>
<li>The identifier field <span class="name">id</span> is a KEY for
@@ -1390,25 +1584,32 @@
the <span class="name">wasInvalidatedBy(id;e,a,t,attrs)</span> statement.
</li>
<li>The identifier field <span class="name">id</span> is a KEY for
+ the <span class="name">wasDerivedFrom(id; e2, e1, attrs)</span> statement.
+ </li>
+ <li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">wasDerivedFrom(id; e2, e1, a, g2, u1, attrs)</span> statement.
</li>
- <li>The identifier field <span class="name">id</span> is a KEY for
+ <li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">wasAttributedTo(id;e,ag,attr)</span> statement.
</li>
<li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">wasAssociatedWith(id;a,ag,pl,attrs)</span> statement.
</li>
<li>The identifier field <span class="name">id</span> is a KEY for
+ the <span class="name">wasAssociatedWith(id;a,ag,-,attrs)</span> statement.
+ </li>
+ <li>The identifier field <span class="name">id</span> is a KEY for
the <span class="name">actedOnBehalfOf(id;ag2,ag1,a,attrs)</span> statement.
</li>
+ <li>The identifier field <span class="name">id</span> is a KEY for
+ the <span class="name">wasInfluencedBy(id;o2,o1,attrs)</span> statement.
+ </li>
</ol>
</p> </div>
<hr>
-<div class="note"> The next four constraints can be stated more simply
-as key constraints</div>
<div id='unique-generation_text'>
<p>We assume that an entity has exactly one generation and
@@ -1422,13 +1623,13 @@
<div class='constraint' id='unique-generation'>
<p>
-<span class='conditional'>IF</span> <span class="name">wasGeneratedBy(id1;e,a1,t1,attrs1)</span> and <span class="name">wasGeneratedBy(id2;e,a2,t2,attrs2)</span>,
+<span class='conditional'>IF</span> <span class="name">wasGeneratedBy(id1;e,_a1,_t1,_attrs1)</span> and <span class="name">wasGeneratedBy(id2;e,_a2,_t2,_attrs2)</span>,
<span class='conditional'>THEN</span> <span class="name">id1</span>=<span class="name">id2</span>.</p>
</div>
<div class='constraint' id='unique-invalidation'>
<p>
-<span class='conditional'>IF</span> <span class="name">wasInvalidatedBy(id1;e,a1,t1,attrs1)</span> and <span class="name">wasInvalidatedBy(id2;e,a2,t2,attrs2)</span>,
+<span class='conditional'>IF</span> <span class="name">wasInvalidatedBy(id1;e,_a1,_t1,_attrs1)</span> and <span class="name">wasInvalidatedBy(id2;e,_a2,_t2,_attrs2)</span>,
<span class='conditional'>THEN</span> <span class="name">id1</span>=<span class="name">id2</span>.</p>
</div>
@@ -1452,8 +1653,8 @@
<div class='constraint' id='unique-wasStartedBy'>
<p>
<span class='conditional'>IF</span> <span
- class="name">wasStartedBy(id;a,e1,a1,t,attrs)</span> and <span
- class="name">wasStartedBy(id';a,e1',a1',t',attrs')</span>, <span
+ class="name">wasStartedBy(id1;a,_e1,_a1,_t1,_attrs1)</span> and <span
+ class="name">wasStartedBy(id2;a,_e2,_a2,_t2,_attrs2)</span>, <span
class='conditional'>THEN</span> <span class="name">id</span>=<span
class="name">id'</span>.</p>
</div>
@@ -1462,8 +1663,8 @@
<div class='constraint' id='unique-wasEndedBy'>
<p>
<span class='conditional'>IF</span> <span
- class="name">wasEndedBy(id;a,e1,a1,t,attrs)</span> and <span
- class="name">wasEndedBy(id';a,e1',a1',t',attrs')</span>, <span
+ class="name">wasEndedBy(id1;a,_e1,_a1,_t1,_attrs1)</span> and <span
+ class="name">wasEndedBy(id2;a,_e2,_a2,_t2,_attrs2)</span>, <span
class='conditional'>THEN</span> <span class="name">id</span>=<span
class="name">id'</span>.</p>
</div>
@@ -1559,7 +1760,7 @@
this means that <span class="name">e1</span> happened at the same time
as or after <span class="name">e2</span>. Both relations are
<a>preorder</a>s, meaning that they are <a>reflexive</a> and
-<a>transitive</a>. Moreover, we consider strict forms of these
+<a>transitive</a>. Moreover, we sometimes consider <em>strict</em> forms of these
orders: we say <span class="name">e1</span> strictly precedes <span
class="name">e2</span> to indicate that <span class="name">e1</span>
happened before <span class="name">e2</span>. This is a
@@ -1588,18 +1789,16 @@
<p>IF <span
class="name">hyp_1</span> AND ... AND <span
class="name">hyp_n</span> THEN <span
- class="name">a_0</span> <a>precedes</a>/<a title="precedes">strictly precedes</a> <span
- class="name">b_0</span>. </p></div>
+ class="name">evt1</span> <a>precedes</a>/<a title="precedes">strictly precedes</a> <span
+ class="name">evt2</span>. </p></div>
<p>
The conclusion of an ordering constraint is either <a>precedes</a>
or <a title="precedes">strictly precedes</a>. To check ordering constraints, we
- generate all possible <a>precedes</a> and <a title="precedes">strictly precedes</a>
- relationships (including transitive), and check that the resulting
- relation has no strict cycles (that is, the <a title="precedes">strictly precedes</a>
- relation induced by the edges is irreflexive). This is equivalent to
- constructing a directed graph, with edges marked <a>precedes</a> or
- <a title="precedes">strictly precedes</a>, and checking that there is no cycle
- containing a <a>strictly-precedes</a> edge.
+ generate all <a>precedes</a> and <a title="precedes">strictly
+ precedes</a>
+ relationships arising from the ordering constraints to form a directed graph, with edges marked <a>precedes</a> or
+ <a title="precedes">strictly precedes</a>, and check that there is no cycle
+ containing a <a title="precedes">strictly precedes</a> edge.
</p>
@@ -1642,7 +1841,8 @@
line precedes the event denoted by the right line.</p>
<div class="note"> Miscellanous suggestions about figures
(originally from Tim Lebo):
-<li>
+<ul>
+ <li>
I think it would help if the "corresponding edges between entities and activities" where the same visual style as the vertical line marking the time the Usage, generation and derivation occurred. A matching visual style provides a Gestalt that matches the concept. I am looking at subfigures b and c in 5.2.0</li>
<li>
Is Invalidation missing in "The following figure summarizes the ordering constraints" paragraph?</li> </ul>
@@ -1895,7 +2095,8 @@
-<p id='derivation-usage-generation-ordering_text'>If there is a derivation between <span class="name">e2</span> and <span class="name">e1</span>, then
+<p id='derivation-usage-generation-ordering_text'>If there is a
+derivation relationship linking <span class="name">e2</span> and <span class="name">e1</span>, then
this means that the entity <span class="name">e1</span> had some influence on the entity <span class="name">e2</span>; for this to be possible, some event ordering must be satisfied.
First, we consider derivations, where the activity and usage are known. In that case, the <a title="entity usage event">usage</a> of <span class="name">e1</span> has to precede the <a title="entity generation
event">generation</a> of <span class="name">e2</span>.
@@ -1941,7 +2142,7 @@
<p>Note that event ordering is between generations of <span class="name">e1</span>
and <span class="name">e2</span>, as opposed to derivation where usage is known,
-which implies ordering ordering between the usage of <span class="name">e1</span> and
+which implies ordering between the usage of <span class="name">e1</span> and
generation of <span class="name">e2</span>. </p>
<hr />
@@ -2160,93 +2361,48 @@
</section> <!-- constraints -->
<section id="normalization-validity-equivalence">
-<h2>Normalization, Equivalence and Validity</h2>
-
-
- <p>We define notions of <a>normalization</a>, <a>validity</a> and
-<a>equivalence</a> of PROV instances. We first define these concepts
-for PROV instances that consist of a single, unnamed bundle of
-statements, called the <a>toplevel bundle</a>.</p>
-
- <section id="optional-attributes">
- <h3>Optional Attributes</h3>
-
-
-<div id="optional-attributes1">
-<p>PROV-N allows some statement parameters and attributes to
- be optional. Unless otherwise specified, when an
- optional attribute is not present in a statement, it has a default
- value. Thus, most optional parameters (written <span
- class="name">-</span>) should be replaced by existential variables.
- The only exceptions are:
- </p>
- <ul>
-<li><div id="optional-attributes4">In an association of the form
- <span class="name">wasAssociatedWith(id;a, ag,-,attr)</span>, the
- absence of a plan means: either no plan exists, or a plan exists but
- it is not identified. Thus, it is not equivalent to <span
- class="name">wasAssociatedWith(id;a,ag,p,attr)</span> where a
- plan <span class="name">p</span> is given.</div></li>
- <li><div id="optional-attributes4">In an association of the form
- <span class="name">wasDerivedFrom(id;e1,e2,attr)</span>, the
- absence of the optional activity, generation and use identifiers
- means that the derivation relationship may encompass multiple activities,
-generations, and uses. Thus, it is not equivalent to <span
- class="name">wasDerivedFrom(id;e1,e2,a,g,u,attr)</span> where some
- activity, generation and use are given explicitly.</div></li>
-</div></li>
- </ul>
-</div>
-
-</section>
-
-<section id="normalization">
-<h3>Normalization and Validity</h3>
+<h2>Normalization, Validity, and Equivalence</h2>
+
+
+ <p>We define the notions of <a title="normal form">normalization</a>, <a
+title="valid">validity</a> and
+<a title="equivalent">equivalence</a> of PROV instances. We first define these concepts
+for PROV instances that consist of a single, unnamed <a>bundle</a> of
+statements, called the <dfn>toplevel bundle</dfn>.</p>
+
+
<p> We define the <dfn>normal form</dfn> of a PROV instance as the set
-of provenance expressions resulting from merging to resolve all
+of provenance statements resulting from merging to resolve all
uniqueness constraints in the instance and applying all possible
-inference rules to this set. Formally, we say that two PROV instances
-are <dfn>equivalent</dfn> if they have the same normal form (that is,
-after applying all possible inference rules, the two instances produce
-the same set of PROV-DM expressions.) </p>
-
-<p> An application that processes PROV-DM data SHOULD handle
-equivalent instances in the same way. (Common exceptions to this rule
-include, for example, pretty printers that seek to preserve the
-original order of statements in a file and avoid expanding
-inferences.) </p>
-
- A PROV instance is <dfn id="dfn-valid">valid</dfn>
-if its normal form exists and satisfies all of
- the validity constraints; this implies that the instance satisfies
- all of the inferences, definitions and constraints.
- The following algorithm can be used to compute normal forms and test
- validity:
+inference rules to this set. </p>
+
+
<ol>
<li>
- Apply all definitions by replacing each defined statement by its
- definition (possibly introducing fresh identifiers in the process).
- </li>
- <li>
- Apply all inferences by adding the conclusion of each inference
- whose hypotheses are satisfied and whose conclusion does not
- already hold.
+ Apply all definitions to <span class="math">I</span> by replacing each defined statement by its
+ definition (possibly introducing fresh existential variables in
+ the process), yielding an instance <span class="math">I<sub>1</sub></span>.
</li>
<li>
- Apply all uniqueness constraints by merging pairs of statements
- and unifying existential variables and constants (or failing if this would require merging two
- distinct constants). Merging may lead to substitutions of
- existential variables that enable new inferences; thus, steps 2
- and 3 may
- need to be repeated.
+ Apply all inferences to <span class="math">I<sub>1</sub></span> by adding the conclusion of each inference
+ whose hypotheses are satisfied and whose conclusions do not
+ already hold (again, possibly introducing fresh existential
+ variables), yielding an instance <span class="math">I<sub>2</sub></span>.
</li>
- <li>Apply event ordering constraints to determine whether there is
- a consistent ordering on the events in the normal form.
+ <li>
+ Apply all uniqueness constraints to <span class="math">I<sub>2</sub></span> by merging terms or statements
+ and applying the resulting substitution to the instance, yielding
+ an instance <span class="math">I<sub>3</sub></span>. If some uniqueness constraint cannot be
+ applied, then normalization fails.
</li>
- </ol>
-
+ <li>If no inferences can be applied to instance <span class="math">I<sub>3</sub></span>, then <span class="math">I<sub>3</sub></span> is the
+ normal form of <span class="math">I</span>.</li>
+ <li>Otherwise, the normal form of <span class="math">I</span> is the same as the normal form
+ of <span class="math">I<sub>3</sub></span> (that is, proceed by recursively normalizing <span class="math">I<sub>3</sub></span>).
+ </ol>
+
<p>Because of the potential interaction among inferences, definitons and
constraints, there is a loop in the above algorithm. Nevertheless,
all of our constraints fall into a class of <a>tuple-generating
@@ -2254,24 +2410,47 @@
satisfy a termination condition called <a>weak acyclicity</a> that
has been studied in the context of relational databases
[[DBCONSTRAINTS]]. Therefore, the above algorithm terminates, independently
- of the order in which inferences and constraints are applied.
-<div class="note">
- Must check this for final constraints/inferences!
- </div>
+ of the order in which inferences and constraints are applied.
+ <a href="#termination">Appendix C</a> gives a proof that normalization terminates and produces
+ a unique (up to isomorphism) normal form.
</p>
-</section>
- <section id="equivalence">
-<h3>Equivalence</h3>
- <p>
-Equivalence has the following characteristics:
-</p>
+
+ <p>
+ A PROV instance is <dfn id="dfn-valid">valid</dfn>
+if its normal form exists and satisfies all of
+ the validity constraints; this implies that the instance satisfies
+ all of the inferences, definitions and constraints.
+ The following algorithm can be used to test
+ validity:</p>
+
+<ol>
+ <li>Normalize the instance, obtaining normalized instance <span class="math">I'</span>. If
+ normalization fails, then <span class="math">I</span> is not <a>valid</a>.
+ </li>
+ <li>Apply all event ordering constraints to <span class="math">I'</span> to build a graph <span class="math">G</span> whose nodes
+ are event identifiers and edges
+ are labeled by "precedes"
+ and "strictly precedes" relationships among events induced by the constraints.</li>
+ <li> Determine whether there is a cycle in <span class="math">G</span> that contains a
+ "strictly precedes" edge. If so, then <span class="math">I</span> is not <a>valid</a>. If no
+ such cycle exists, then I is <a>valid</a>.
+ </li>
+ </ol>
+
+
+
+ <p>Two PROV instances are <dfn>equivalent</dfn> if they have the
+same normal form (that is, after applying all possible inference
+rules, the two instances produce the same set of PROV-DM statements,
+up to reordering of statements and attributes within attribute lists).
+Equivalence has the following characteristics: </p>
<ul>
<li>
- The order of provenance expressions is irrelevant to the meaning of
+ The order of provenance statements is irrelevant to the meaning of
a PROV instance. That is, a
PROV instance is equivalent to any other instance obtained by
- permuting its expressions.
+ permuting its statements.
</li>
<li>The order of attribute-value pairs in attribute lists is
irrelevant to the meaning of a PROV statement. That is, a PROV
@@ -2284,32 +2463,36 @@
inference rule or definition, or by <a>merging</a> two statements to
enforce a uniqueness constraint.
</li>
- <li>Equivalence is reflexive, symmetric, and transitive.</li>
+ <li>Equivalence is <a>reflexive</a>, <a>symmetric</a>, and <a>transitive</a>.</li>
</ul>
-
-
-</section>
+<p> An application that processes PROV-DM data SHOULD handle
+equivalent instances in the same way. (Common exceptions to this rule
+include, for example, pretty printers that seek to preserve the
+original order of statements in a file and avoid expanding
+inferences.) </p>
+
+
<section id="bundle-constraints">
-<h2>Bundles and Constraints</h2>
-
-<div class="note">
-TODO: Fix notation.
-</div>
-
-<p>The definitions, inferences, and constraints introduced so far, and
+<h2>Bundles</h2>
+
+
+<p>The definitions, inferences, and constraints, and
the resulting notions of normalization, validity and equivalence,
assume a PROV instance with exactly one <a>bundle</a>, the <a>toplevel
bundle</a>, consisting of all PROV statements in the toplevel of the
-instance. In this section, we describe how to deal with PROV
+instance (that is, not enclosed in a named <a>bundle</a>). In this section, we describe how to deal with PROV
instances consisting of multiple bundles. Briefly, each bundle is
handled independently; there is no interaction between bundles from
the perspective of applying definitions, inferences, or constraints,
computing normal forms, or checking validity or equivalence.</p>
<p> We model a general PROV instance, containing <span class="name">n</span> named bundles
-<span class="name">b<sub>1</sub>...b<sub>n</sub></span>, as a tuple <span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,\ldots,b<sub>n</sub>=B<sub>n</sub>])</span> where TB is the set of
-statements of the toplevel bundle, and for each <span class="name">i, <span class="name">B<sub>i</sub></span> is the set of
+<span class="name">b<sub>1</sub>...b<sub>n</sub></span>, as a tuple
+<span
+class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>])</span>
+where <span class="name">B<sub>0</sub></span> is the set of
+statements of the <a>toplevel bundle</a>, and for each <span class="name">i</span>, <span class="name">B<sub>i</sub></span> is the set of
statements of bundle <span class="name">b<sub>i</sub></span>. This notation is shorthand for the
following PROV-N syntax:</p>
<pre>
@@ -2324,18 +2507,18 @@
</pre>
<p> The <a>normal form</a> of a general PROV instance
-(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,[b<sub>n</sub>=B<sub>n</sub>]) is (B'<sub>0</sub>,[b<sub>1</sub>=B'<sub>1</sub>,...,b<sub>n</sub>=B'<sub>n</sub>])
-where B'<sub>i</sub> is the normal
-form of B<sub>i</sub> for each i between 0 and n. </p>
-
-<p>A general PROV instance is <a>valid</a> if each of the bundles B<sub>0</sub>,
-..., B<sub>n</sub> are valid and none of the bundle identifiers b<sub>i</sub> are repeated.</p>
-
-<p>Two (valid) general PROV instances (B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>]) and
-(B'<sub>0</sub>,[b<sub>1</sub>'=B'<sub>1</sub>,...,b'_m=B'_m]) are <a>equivalent</a> if B<sub>0</sub> is
-equivalent to B'<sub>0</sub> and n = m and
-there exists a permutation P : n -> n such that for each i, b<sub>i</sub> =
-b'<sub>P(i)</sub> and B<sub>i</sub> is equivalent to B'<sub>P(i)</sub>.
+<span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,[b<sub>n</sub>=B<sub>n</sub>])</span> is <span class="name">(B'<sub>0</sub>,[b<sub>1</sub>=B'<sub>1</sub>,...,b<sub>n</sub>=B'<sub>n</sub>])</span>
+where <span class="name">B'<sub>i</sub></span> is the normal
+form of <span class="name">B<sub>i</sub></span> for each <span class="name">i</span> between 0 and <span class="name">n</span>. </p>
+
+<p>A general PROV instance is <a>valid</a> if each of the bundles <span class="name">B<sub>0</sub></span>,
+..., <span class="name">B<sub>n</sub></span> are valid and none of the bundle identifiers <span class="name">b<sub>i</sub></span> are repeated.</p>
+
+<p>Two (valid) general PROV instances <span class="name">(B<sub>0</sub>,[b<sub>1</sub>=B<sub>1</sub>,...,b<sub>n</sub>=B<sub>n</sub>])</span> and
+<span class="name">(B'<sub>0</sub>,[b<sub>1</sub>'=B'<sub>1</sub>,...,b'_m=B'_m])</span> are <a>equivalent</a> if <span class="name">B<sub>0</sub></span> is
+equivalent to <span class="name">B'<sub>0</sub></span> and <span class="name">n = m</span> and
+there exists a permutation <span class="name">P : {1..n} -> {1..n}</span> such that for each <span class="name">i</span>, <span class="name">b<sub>i</sub> =
+b'<sub>P(i)</sub></span> and <span class="name">B<sub>i</sub></span> is equivalent to <span class="name">B'<sub>P(i)</sub></span>.
</p>
</section> <!-- bundle-constraints-->
@@ -2360,36 +2543,45 @@
</section>
<section class="glossary">
- <h2>Glossary</h2>
+ <h2>Glossary</h2>
+
<ul>
- <li> <dfn>antisymmetric</dfn>: A relation R over X is <a>antisymmetric</a> if
- for any elements x, y of X, if x R y and y R x then x = y.</li>
- <li> <dfn>asymmetric</dfn>: A relation R over X is <a>asymmetric</a> if
- x R y and y R x do not hold for any elements x, y of X.</li>
- <li> <dfn>irreflexive</dfn>: A relation R over X is <a>irreflexive</a> if
- for x R x does not hold for any element x of X.</li>
- <li> <dfn>reflexive</dfn>: A relation R over X is <a>reflexive</a> if
- for any element x of X, we have x R x.</li>
+ <li> <dfn>antisymmetric</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>antisymmetric</a> if
+ for any elements <span class="math">x</span>, <span class="math">y</span> of <span class="math">X</span>, if <span class="math">x R y</span> and <span class="math">y R x</span> then <span class="math">x = y</span>.</li>
+ <li> <dfn>asymmetric</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>asymmetric</a> if
+ <span class="math">x R y</span> and <span class="math">y R x</span> do not hold for any elements <span class="math">x</span>, <span class="math">y</span> of <span class="math">X</span>.</li>
+ <li> <dfn>irreflexive</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>irreflexive</a> if
+ for <span class="math">x R x</span> does not hold for any element <span class="math">x</span> of <span class="math">X</span>.</li>
+ <li> <dfn>reflexive</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>reflexive</a> if
+ for any element <span class="math">x</span> of <span class="math">X</span>, we have <span class="math">x R x</span>.</li>
<li><dfn>partial order</dfn>: A partial order is a relation
that is <a>reflexive</a>, <a>antisymmetric</a>, and <a>transitive</a>.</li>
- <li><dfn>preorder</dfn>: A preorder is a relation R that is
+ <li><dfn>preorder</dfn>: A preorder is a relation that is
<a>reflexive</a> and <a>transitive</a>. (It is not necessarily antisymmetric,
- meaning there can be cycles of distinct elements x1 R x2 R ... R
- xn R x1.)</li>
+ meaning there can be cycles of distinct elements <span
+ class="math">x<sub>1</sub> R x<sub>2</sub> R ... R
+ x<sub>n</sub> R x<sub>1</sub>.</span></li>
<li><dfn>strict partial order</dfn>: A strict partial order is a
relation that is <a>irreflexive</a>, <a>asymmetric</a> and <a>transitive</a>.</li>
<li><dfn>strict preorder</dfn>: A strict preorder is a relation
that is <a>irreflexive</a> and <a>transitive</a>.</li>
- <li> <dfn>symmetric</dfn>: A relation R over X is <a>symmetric</a> if
- for any elements x, y of X, if x R y then y R x.</li>
- <li> <dfn>transitive</dfn>: A relation R over X is <a>transitive</a> if
- for any elements x, y, z of X, if x R y and y R z then x R z.</li>
+ <li> <dfn>symmetric</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>symmetric</a> if
+ for any elements <span class="math">x</span>, <span class="math">y</span> of <span class="math">X</span>, if <span class="math">x R y</span> then <span class="math">y R x</span>.</li>
+ <li> <dfn>transitive</dfn>: A relation <span class="math">R</span> over <span class="math">X</span> is <a>transitive</a> if
+ for any elements <span class="math">x</span>, <span class="math">y</span>, <span class="math">z</span> of <span class="math">X</span>, if <span class="math">x R y</span> and <span class="math">y R z</span> then <span class="math">x R z</span>.</li>
</ul>
</section>
+ <section class="appendix" id="termination">
+ <h2>Termination of normalization</h2>
+<div class="note">TODO: give proof that normalization terminates and
+ produces unique normal forms.
+ </div>
+ </h2>
+
</body></html>
<!-- LocalWords: px DM RL RDF AQ SEM SOTD Definitional wasInformedBy attrs ag