--- a/model/comments/issue-459-stian.txt Tue Aug 07 12:32:13 2012 +0100
+++ b/model/comments/issue-459-stian.txt Tue Aug 07 12:45:25 2012 +0100
@@ -1,7 +1,7 @@
>
>
> The following is my review of the PROV Constraints document from:
- > http://dvcs.w3.org/hg/prov/raw-file/tip/model/releases/ED-prov-constraints-20120723/prov-constraints.htm > l
+ > http://dvcs.w3.org/hg/prov/raw-file/tip/model/releases/ED-prov-constraints-20120723/prov-constraints.html
>
> Please accept my apologies in the delay, this document, being rather
> formal, required a bit more time on my behalf to be reviewed.
@@ -28,6 +28,9 @@
> document changes from a specification-style language to a "we assume"
> paper style language. I would try to keep it at specification level.
>
+
+"We assume" has been reworded.
+
> Some of the introductory text seems to assume the reader wants to do
> normalization, however I suspect most readers will only be interested
> in validation; in particular to ensure their own PROV instance is
@@ -158,6 +161,8 @@
> specification might be unclear or confusing.
>
>
+
+1.
>
> Abstract: "statements and constraints that PROV instances must satisfy
> in order to be considered valid".
@@ -173,11 +178,21 @@
> Without the clarification of "according to this document" the meaning
> of "MUST" here is difficult - would it not be valid PROV-DM either?
>
- >
+
+Fixed.
+
+2.
+
+ >
> > 1.1 Conventions
> > Some of these ariables
> --> variables
>
+
+Fixed
+
+3.
+
>
> > _, by convention, to indicate that they (intentionally) appear
> only once in the formula; thus, the textual variable name is
@@ -193,6 +208,12 @@
> Replace with something like "such passive variables are provided
> merely as an aid to the reader."
>
+
+Fixed.
+
+
+
+4.
>
> 1.2 Purpose of this document
>
@@ -203,6 +224,11 @@
>
> This statement might be in other documents as well, but I've not checked.
>
+
+Fixed. I think this phrasing only stated here...
+
+5.
+
>
> > > A valid PROV instance corresponds to a consistent history
> of objects and interactions to which logical reasoning can be
@@ -211,6 +237,11 @@
> Again the valid, but here with a small definition. A forward-link to
> section 6 would be helpful.
>
+
+Done.
+
+6. BLOCKER
+
>
>
> > Applications should also use definitions, inferences and constraints
@@ -236,6 +267,11 @@
> similarly applications consuming provenance MAY reject provenance that
> is not _valid_."
>
+
+This is ISSUE-464. Your wording is adopted.
+
+7.
+
>
> > To summarize: compliant applications use definitions, inferences,
> > and uniqueness constraints to normalize PROV instances, and then
@@ -250,12 +286,25 @@
> normalization. It is saying you can't be compliant without doing all
> of the above!
>
+
+This is ISSUE-464. I have reworded to avoid giving the impression
+that we require normalization, validity checking, or equivalence
+checking, but kept the summary - I feel it is important to summarize
+how the components fit together.
+
+8.
+
>
> 1.3 Structure of this document
> > certain statments
> statements
>
>
+
+Fixed.
+
+9.
+
>
> 2.1 Entities, Activities and Agents
>
@@ -270,21 +319,40 @@
> entirety of an entity's lifetime"
>
>
+
+Fixed.
+
+10.
+
>
> > Different entities that are aspects of the same thing are called alternate,
> 'alternate' -> 'alternates'
>
>
+
+Fixed.
+
+11.
+
> > PROV relations of specialization and alternate can be used
> Use actual relation names and code style:
>
> 'specialization' -> '<code>specializationOf</code>'
> 'alternate' -> '<code>alternateOf</code>'
>
+
+Fixed.
+
+12.
> > express dependencies between the various entities using events
>
> italics on 'events'.
>
+
+Done.
+
+13.
+
>
> > > An activity is delimited by its start and its end events;
> hence, it occurs over an interval delimited by two instantaneous
@@ -292,6 +360,10 @@
>
> Remove "hence, "
>
+Fixed.
+
+14.
+
>
> > > In contrast, an activity (..) is typically not identifiable by the characteristics it exhibits at any point during its duration.
>
@@ -301,6 +373,11 @@
> use the argument that activities are not things in the world, they are
> what happens to the things in the world.
>
+
+Fixed.
+
+15.
+
> > > An entity usage event is the instantaneous event that
> marks the first instant of an entity's consumption timespan by an
> activity.
@@ -313,6 +390,11 @@
> this instant, although the activity could potentially have other
> usages of the same entity starting at different times."
>
+
+Fixed.
+
+16.
+
>
> 2.3 Summary of constraints..
>
@@ -328,6 +410,19 @@
> horisontal headers? Perhaps even the Type/Relation column could be
> done as headers?
>
+
+I had put this in its own section, in the non-normative part. It would
+be good to summarize the constraints somehow, but this doesn't seem to
+be working well. It would also be good to summarize the typing
+constaints in the informative section (e.g. previewing the typing
+constraint rules).
+
+Should this table / index be an appendix or something?
+
+@Luc?
+
+17.
+
>
> > To be compliant:
>
@@ -347,6 +442,17 @@
>
> Replace "When" with "If".
>
+
+I believe when and if are synonymous in this context. I replaced the
+last three since these are the only ones that change the emphasis.
+The first reads strangely to me (If...may) so I left it as is.
+
+I also added a paragraph to make it clear that we do not require
+computing normal forms, only that whatever algorithms people use
+produce the same results as the specification.
+
+18.
+
>
> 4. Inferences and Definitions
>
@@ -357,6 +463,11 @@
>
> use concl1 and concln in first line. (Consistency).
>
+
+Fixed.
+
+19.
+
>
> 4.1 Optional identifiers
>
@@ -365,6 +476,11 @@
> "We note that" -> "Note that" ("we" are not observing here, but specifying!)
>
>
+
+Fixed.
+
+20.
+
>
> > then the following definition holds:
> > r(id;a1,...,an) holds IF AND ONLY IF r(id;a1,...,an,[]) holds.
--- a/model/prov-constraints.html Tue Aug 07 12:32:13 2012 +0100
+++ b/model/prov-constraints.html Tue Aug 07 12:45:25 2012 +0100
@@ -421,14 +421,18 @@
</p>
-<p> This document introduces <a>definitions</a> and <a>inferences</a>
- that are allowed on provenance statements and <a>constraints</a>
- that PROV instances MUST satisfy in order to be considered
- <a>valid</a>. These definitions, inferences, and constraints are useful for
- readers who develop applications that generate provenance or reason
- over provenance. They can also be used to <a title="normal form">normalize</a> PROV
- instances to forms that can easily be compared in order to determine
- whether two PROV instances are <a>equivalent</a>.</p>
+<p> This document defines a subset of PROV instances called
+<i><a>valid</a></i> PROV instances. A <a>valid</a> PROV instance corresponds to a
+consistent history of objects and interactions to which logical
+reasoning can be safely applied.
+Valid PROV instances satisfy
+certain <a>definitions</a>, <a>inferences</a>, and
+<a>constraints</a>. These definitions, inferences, and constraints
+provide a measure of consistency checking for provenance and reasoning
+over provenance. They can also be used to <a title="normal
+form">normalize</a> PROV instances to forms that can easily be
+compared in order to determine whether two PROV instances are
+<a>equivalent</a>.</p>
</section>
@@ -501,8 +505,8 @@
lower-case identifiers. Some of these variables are written
beginning with the underscore character <span
class="name">_</span>, by convention, to indicate that they
- (intentionally) appear only once in the formula; thus, the textual
- variable name is mnemonic only. </p>
+ appear only once in the formula. Such variables are
+ provided merely as an aid to the reader. </p>
</section>
@@ -512,7 +516,7 @@
<h3>Purpose of this document</h3>
<p>The PROV Data Model, PROV-DM, is a conceptual data model for provenance, which is
-realizable using different serializations such as PROV-N and PROV-O.
+realizable using different representations such as PROV-N and PROV-O.
A PROV <dfn>instance</dfn> is a set of PROV statements,
possibly including <a>bundles</a>, or named sets of statements. For
example, such a PROV instance could be a .provn document, the result
@@ -529,24 +533,40 @@
<a>valid</a> PROV instances by specifying <em>constraints</em> that
<a>valid</a> PROV instances must satisfy. There are four kinds of
constraints: <em>uniqueness constraints</em>, <em>event ordering
-constraints</em>, <em>impossibility constraints</em>, and <em>type constraints</em>.
-Applications SHOULD produce valid
-provenance and MAY reject provenance that is not <a>valid</a>. Applications
-SHOULD also use definitions, inferences and constraints to normalize
-PROV instances in order to determine whether two such instances convey
-the same information.
-
-
-<p>To summarize: compliant applications use definitions,
-inferences, and uniqueness constraints to normalize PROV instances,
-and then check event ordering, impossibility, and type constraints. If these
-checks succeed, the instance is
-<a>valid</a>, and the normal form is considered <a>equivalent</a> to
-the original instance. Also, any two PROV instances that yield the
-same normal form are considered <a>equivalent</a>. Further discussion
+constraints</em>, <em>impossibility constraints</em>, and <em>type
+constraints</em>.
+Further discussion
of the semantics of PROV statements, which justifies the inferences
and constraints, can be found in the formal semantics [[PROV-SEM]].
</p>
+
+<p>We define validity and equivalence in terms of a
+concept called <a title="normal form">normalization</a>. Definitions, inferences,
+and uniqueness constraints can be applied to <a title="normal
+form">normalize</a> PROV instances, and event ordering, typing, and
+impossibility constraints are checked on the normal form to determine
+<a title="valid">validity</a>. Equivalence of two PROV
+instances can be determined by comparing their normal forms.
+</p>
+<p>
+This document outlines an algorithm for validity checking based on
+<a title="normal form">noramlization</a>. Applications MAY implement
+validity and equivalence checking using normalization, as suggested
+here, or in
+any other way as long as the same instances are considered valid or
+equivalent, respectively.
+</p>
+
+<p>Checking validity or equivalence are RECOMMENDED, but not required, for
+applications compliant with PROV. This specification defines how
+validity and equivalence are to be checked, if an application elects
+to support them at all.
+Applications producing provenance SHOULD ensure that it is
+<a>valid</a>, and similarly applications consuming provenance MAY reject provenance that is not <a>valid</a>. Applications
+which are determining whether PROV instances convey the same
+information SHOULD check equivalence as specified here, and SHOULD
+treat equivalent instances in the same way.
+</p>
</section>
<section>
<h4>Structure of this document</h4>
@@ -604,8 +624,9 @@
<section id='rationale' class="informative">
<h3>Rationale</h3>
-<p> In this section we give a high-level rationale that provides some
- further background for the constraints. </p>
+<p> This section gives a high-level rationale that provides some
+ further background for the constraints, but does not affect the
+technical content of the rest of the specification.</p>
<section>
<h4>Entities, Activities and Agents</h4>
@@ -627,15 +648,15 @@
between its <a title="entity generation event">generation event</a>
and its <a title="entity invalidation event">invalidation event</a>.
An entity's attributes are established when the entity is
-created and describe the entity's situation and (partial) state
-during the entity's lifetime.</p>
+created and (partially) describe the entity's situation and state
+during the entirety of the entity's lifetime.</p>
<p>
A different entity (perhaps representing a different user or
system perspective) may fix other aspects of the same thing, and its provenance
may be different. Different entities that fix aspects of the same
-thing are called <em>alternate</em>, and the PROV relations of
-specialization and alternate can be used to link such entities.</p>
+thing are called <em>alternates</em>, and the PROV relations of
+<span class="name">specializationOf</span> and <span class="name">alternateOf</span> can be used to link such entities.</p>
<p>Besides entities, a variety of other PROV objects have
attributes, including activity, generation, usage, invalidation, start, end,
@@ -657,9 +678,9 @@
over which relevant attributes of the thing are not fixed, a PROV
instance would describe multiple entities, each with its own
identifier, <a>lifetime</a>, and fixed attributes, and express dependencies between
- the various entities using events. For example, if we want to
+ the various entities using <i><a href="instantaneous event">events</a></i>. For example, in order to
describe the provenance of several versions of a document, involving
- attributes such as authorship that change over time, we need
+ attributes such as authorship that change over time, one can use
different entities for the versions linked by appropriate
generation, usage, revision, and invalidation events.
</p>
@@ -673,7 +694,9 @@
attributes; this leads to potential ambiguity, which is mitigated through the
use of identifiers.</p>
-<p>An <a>activity</a> is delimited by its <a title="activity start event">start</a> and its <a title="activity end event">end</a> events; hence, it occurs over
+<p>An <a>activity</a> is delimited by its <a title="activity start
+event">start</a> and its <a title="activity end event">end</a>
+events. It occurs over
an interval delimited by two <a title="instantaneous event">instantaneous
events</a>. However, an activity statement need not mention start or end time information, because they may not be known.
An activity's attribute-value pairs are expected to describe the activity's situation during its interval, i.e. an interval between two instantaneous events, namely its <a title="activity start event">start</a> event and its <a title="activity end event">end</a> event.
@@ -685,8 +708,7 @@
any point in its lifetime, persists during this interval, and
preserves the characteristics provided. In
contrast, an activity is something that occurs, happens, unfolds, or
-develops through time, but is typically not identifiable by the
-characteristics it exhibits at any point during its duration. This
+develops through time. This
distinction is similar to the distinction between 'continuant' and
'occurrent' in logic [[Logic]].</p>
@@ -699,7 +721,7 @@
in many different contexts within individual systems and across the
Web. Different systems may use different clocks which may not be
precisely synchronized, so when provenance statements are combined by
-different systems, we may not be able to align the times involved to a
+different systems, an application may not be able to align the times involved to a
single global timeline. Hence, PROV is designed to minimize
assumptions about time. Instead, PROV talks about (identified)
events. </p>
@@ -737,8 +759,9 @@
<p>An <dfn id="dfn-usage-event">entity usage event</dfn> is the <a
title="instantaneous event">instantaneous event</a> that marks the first instant of
-an entity's consumption timespan by an activity. Before this instant
-the entity had not begun to be used by the activity.</p>
+an entity's consumption timespan by an activity. The described usage
+had not started before this instant, although the activity could
+potentially have used the same entity at a different time.</p>
<p>An <dfn id="dfn-invalidation-event">entity invalidation event</dfn>
@@ -1032,17 +1055,21 @@
<ol><li>When processing provenance, an
application MAY apply the inferences and definitions in <a
href="#inferences" class='sectionRef'></a>.</li>
- <li>When determining whether a PROV instance is <a>valid</a>, an
+ <li>If determining whether a PROV instance is <a>valid</a>, an
application MUST check that all of the
constraints of <a href="#constraints" class="sectionRef"></a> are
- satisfied on the <a>normal form</a> of the instance.</li>
- <li> When producing provenance meant for other applications to
+ satisfied on the <a>normal form</a> of the instance. </li>
+ <li> If producing provenance meant for other applications to
use, the application SHOULD produce <a>valid</a> provenance, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>. </li>
- <li>When determining whether two PROV instances are
+ <li>If determining whether two PROV instances are
<a>equivalent</a>, an application MUST determine whether their
normal forms are equal, as specified in <a href="#normalization-validity-equivalence" class="sectionRef"></a>.
</ol>
+ <p>Compliant applications are not required to explicitly compute
+ normal forms; however, if checking validity or equivalence, the
+ results should be the same as would be obtained by computing normal
+ forms as defined in this specification.</p>
<p>
All diagrams are for illustration purposes
@@ -1063,7 +1090,7 @@
<section id="inferences">
<h2>Inferences and Definitions</h2>
<p>
-In this section, we describe <a title="inference">inferences</a> and <a title="definition">definitions</a> that MAY be used on
+This section describes <a title="inference">inferences</a> and <a title="definition">definitions</a> that MAY be used on
provenance data, and preserve <a>equivalence</a> on <a>valid</a>
PROV instances (as detailed in <a href="#normalization-validity-equivalence" class="sectionRef"></a>).
An <dfn id="inference">inference</dfn> is a rule that can be applied
@@ -1082,16 +1109,19 @@
<span class="name">hyp<sub>k</sub></span> <span class='conditional'>THEN</span>
there exists <span class="name">a<sub>1</sub></span> and ... and <span
class="name">a<sub>m</sub></span> such that <span
- class="name">conclusion<sub>1</sub></span> and ... and <span class="name">conclusion<sub>n</sub></span>.</p>
+ class="name">concl<sub>1</sub></span> and ... and <span class="name">concl<sub>n</sub></span>.</p>
</div>
-<p> This means that if all of the provenance statements matching
+<p> Inferences can be applied to PROV instances. Applying an inference to an instance means that if all of the provenance statements matching
<span class="name">hyp<sub>1</sub></span>... <span class="name">hyp<sub>k</sub></span>
- can be found in a PROV instance, we can add all of the statements
+ can be found in the instance, then we check whether the conclusion
<span class="name">concl<sub>1</sub></span> ... <span
- class="name">concl<sub>n</sub></span> to the instance, possibly after
+ class="name">concl<sub>n</sub></span> is satisfied for some values
+ of existential variables. If so, application of the inference has
+ no effect on the instance. If not, then a copy the
+ conclusion should be added to the instance, after
generating fresh identifiers <span class="name">a<sub>1</sub></span>,...,<span
- class="name">a<sub>m</sub></span> for unknown objects. These fresh
+ class="name">a<sub>m</sub></span> for the existential variables. These fresh
identifiers might later be found to be equal to known identifiers;
they play a similar role in PROV constraints to existential
variables in logic, "labeled nulls" in database theory
@@ -1114,15 +1144,17 @@
</div>
<p>
- This means that a provenance statement <span class="name">defined_exp</span> is defined in
- terms of other statements. This can be viewed as a two-way
- inference: If <span class="name">defined_exp</span>
- can be found in a PROV instance, we can add all of the statements
+ A definition can be applied to a PROV instance, since its <span class="name">defined_exp</span> is defined in
+ terms of other statements. Applying a
+ definition to an instance means that if an occurrence of a defined
+ provenance statement <span class="name">defined_exp</span>
+ can be found in a PROV instance, then we can remove it and add all of the statements
<span class="name">defining_exp<sub>1</sub></span> ... <span class="name">defining_exp<sub>n</sub></span> to the instance, possibly after generating fresh
identifiers <span class="name">a<sub>1</sub></span>,...,<span
- class="name">a<sub>m</sub></span> for unknown objects. It is safe to replace
+ class="name">a<sub>m</sub></span> for existential variables. In
+ other words, it is safe to replace
a defined statement with
- its definition.
+ its definition.
</p>
<p> Definitions and inferences can be viewed as logical formulas;
@@ -2142,8 +2174,9 @@
<p>
<hr>
- <p id='key-object_text'> We assume that the various identified objects of PROV have
- unique statements describing them within a PROV instance, through
+ <p id='key-object_text'>The various identified objects of PROV MUST have
+ unique statements describing them within a valid PROV instance.
+ This is enforced through
the following key constraints:
</p>
<div class='constraint' id='key-object'>
@@ -2217,9 +2250,9 @@
<div id='unique-generation_text'>
-<p>We assume that an entity has exactly one generation and
+<p>An entity has exactly one generation and
invalidation event (either or both may, however, be left implicit).
- Note that together with the key constraints above, this implies that
+ Together with the key constraints above, this implies that
<span class="name">e</span> is also a key for generation and
invalidation statements.
</p>
@@ -2252,7 +2285,7 @@
<hr />
<p id='unique-wasStartedBy_text'>
-We assume that an activity has exactly one start and
+An activity has exactly one start and
end event (either or both may, however, be left implicit). Again,
together with above key constraints these constraints imply that the
activity is a key for activity start and end statements.
@@ -2342,9 +2375,9 @@
<p>Given that provenance consists of a description of past entities
and activities, <a>valid</a> provenance instances MUST
-satisfy <em>ordering constraints</em> between instantaneous events, which we introduce in
+satisfy <em>ordering constraints</em> between instantaneous events, which are introduced in
this section. For instance, an entity can only be used after it was
-generated; hence, we say that an entity's <a title="entity generation
+generated; in other words, an entity's <a title="entity generation
event">generation event</a> precedes any of this
entity's <a title="entity usage event">usage events</a>. Should this
ordering constraint be violated, the associated generation and
@@ -2360,14 +2393,15 @@
<p>Specifically, <dfn id="dfn-precedes">precedes</dfn> is a <a>preorder</a>
-between <a title="instantaneous event">instantaneous events</a>. When
-we say <span class="name">e1</span> precedes <span
-class="name">e2</span>, this means that <span class="name">e1</span>
+between <a title="instantaneous event">instantaneous events</a>. A
+constraint of the form
+<span class="name">e1</span> precedes <span
+class="name">e2</span> means that <span class="name">e1</span>
happened at the same time as or before <span class="name">e2</span>.
For symmetry, <dfn id="dfn-follows">follows</dfn> is defined as the
-inverse of <a title="precedes">precedes</a>; that is, when we say
-<span class="name">e1</span> follows <span class="name">e2</span>,
-this means that <span class="name">e1</span> happened at the same time
+inverse of <a title="precedes">precedes</a>; that is, a constraint of
+the form
+<span class="name">e1</span> follows <span class="name">e2</span> means that <span class="name">e1</span> happened at the same time
as or after <span class="name">e2</span>. Both relations are
<a>preorder</a>s, meaning that they are <a>reflexive</a> and
<a>transitive</a>. Moreover, we sometimes consider <em>strict</em> forms of these
@@ -2428,7 +2462,7 @@
<h3>Activity constraints</h3>
<p>
-In this section we discuss constraints from the perspective of
+This section specifies ordering constraints from the perspective of
the <a>lifetime</a> of an activity. An activity starts, then during
its lifetime uses, generates or invalidates entities, and communicates with or starts
other
@@ -2642,8 +2676,8 @@
<p id='generation-precedes-invalidation_text'>
Generation of an entity precedes its invalidation. (This
-follows from other constraints if the entity is used, but we state it
-explicitly to cover the case of an entity that is generated and
+follows from other constraints if the entity is used, but it is stated
+explicitly here to cover the case of an entity that is generated and
invalidated without being used.)</p>
<div class='constraint' id='generation-precedes-invalidation'>
@@ -2997,7 +3031,8 @@
<p><span class="conditional">IF</span> <span class="name">hyp<sub>1</sub></span> and ... and <span class="name">hyp<sub>n</sub></span> <span class="conditional">THEN</span> <span class="conditional">INVALID</span>.</p>
</div>
-<p> To check an impossibility constraint on instance <span class="math">I</span>, we check whether there is
+<p> Checking an impossibility constraint on instance <span
+ class="math">I</span> means checking whether there is
any way of matching the pattern <span class="name">hyp<sub>1</sub></span>, ..., <span class="name">hyp<sub>n</sub></span>. If there
is, then checking the constraint on <span class="math">I</span> fails (which implies that
<span class="math">I</span> is invalid).
@@ -3022,7 +3057,7 @@
<!--
<hr>
<div class="note"> This is also a constraint, but follows from
- irreflexivity and transitivity so we may omit it.</div>
+ irreflexivity and transitivity so it may be omitted.</div>
<p id="specialization-asymmetric_text"/>