--- a/model/prov-dm-constraints.html Mon Apr 02 09:22:49 2012 +0100
+++ b/model/prov-dm-constraints.html Mon Apr 02 09:23:03 2012 +0100
@@ -219,7 +219,6 @@
</p>
-<p>PROV-DM is essentially defined without any constraints [[PROV-DM]]. This document introduces a further set of concepts underpinning this data model and defines constraints that well-structured provenance descriptions should follow and that provide an interpretation for these descriptions. </p>
<p>This specification is one of several specifications, referred to as the PROV family of specifications, defining the various aspects
@@ -239,6 +238,25 @@
</ul>
+<p>PROV-DM is essentially defined without any constraints [[PROV-DM]]. This document introduces a further set of concepts underpinning this data model and defines constraints that well-structured provenance descriptions should follow and that provide an interpretation for these descriptions. </p>
+
+
+<p>In [[PROV-DM]], a data model for provenance has been defined without introducing any constraint that this data model has to satisfy. First we introduce and refine various concepts such as attributes, event, entity interval, activity interval, and accounts, which underpin the PROV-DM data model. Using these notions, we explore the constraints
+that the PROV-DM data model has to satisfy. </p>
+
+<p>Several types of constraints are specified.</p>
+<ul>
+<li>Definitional constraints are constraints directly following the definition of concepts in the PROV data model (<a href="#definitional-constraints">Section 4</a>). </li>
+<li>Account constraints have to be satisfied by provenance descriptions in the context of a given account (<a href="#account-constraints">Section 5</a>).</li>
+<li>Event ordering constraints provide a "temporal interpretation" for provenance descriptions (<a href="#interpretation">Section 6</a>).</li>
+<li>Structural constraints are further constraints to be satisfied by generation descriptions (<a href="#structural-constraints">Section 7</a>).</li>
+<li>Collection constraints are the constraints that hold for collections (<a href="#collection-constraints">Section 8</a>).</li>
+</ul>
+</div>
+
+
+
+<!--
<section id="structure-of-this-document">
<h3>Structure of this Document</h3>
@@ -264,7 +282,7 @@
</section>
-
+-->
<section id="conventions">
<h3>Conventions</h3>
@@ -282,7 +300,7 @@
<section id='prov-dm-refinement'>
<h2>Data Model Refinement</h2>
-<p>Underpinning the PROV-DM data model is a notion of event, marking transitions in the world (when entities are generated, used, or destroyed, or activities started or ended). This notion of event is not first-class in the data model, but underpins many of its concepts and its semantics [[PROV-SEM]]. Thus, using this notion of event, we can provide an interpretation for the data model, which in turn can allow creators of provenance assertions to make their assertions more robust. </p>
+<p>Underpinning the PROV-DM data model is a notion of event, marking transitions in the world (when entities are generated, used, or invalidated, or activities started or ended). This notion of event is not first-class in the data model, but underpins many of its concepts and its semantics [[PROV-SEM]]. Thus, using this notion of event, we can provide an interpretation for the data model, which in turn can allow creators of provenance assertions to make their assertions more robust. </p>
<section id='section-time-event'>
@@ -315,8 +333,8 @@
<section id="types-of-events">
<h4>Types of Events</h4>
-<p>Four kinds of <a title="event">instantaneous events</a> underpin the PROV-DM data model. The <strong>activity start</strong> and <strong>activity end</strong> events demarcate the
-beginning and the end of activities, respectively. The <strong>entity generation</strong> and <strong>entity usage</strong> events demarcate the characterization interval for entities. More
+<p>Five kinds of <a title="event">instantaneous events</a> underpin the PROV-DM data model. The <strong>activity start</strong> and <strong>activity end</strong> events demarcate the
+beginning and the end of activities, respectively. The <strong>entity generation</strong>, <strong>entity usage</strong>, and <strong>entity invalidation</strong> events demarcate the characterization interval for entities. More
specifically:
</p>
@@ -326,11 +344,7 @@
<p>An <dfn id="dfn-usage-event">entity usage event</dfn> is the <a title="event">instantaneous event</a> that marks the first instant of an entity's consumption timespan by an activity.</p>
-<p>An <dfn id="dfn-destruction-event">entity destruction event</dfn> is the <a title="event">instantaneous event</a> that marks the initial instant of an entity's destruction or cessation timespan, after which the entity
-is no longer available for use.</p>
-
-<div class='note'>Tentative definition of destruction!</div>
-
+<p>An <dfn id="dfn-invalidation-event">entity invalidation event</dfn> is the <a title="event">instantaneous event</a> that marks the initial instant of the destruction, invalidation, or cessation of an entity, after which the entity is no longer available for use.</p>
<p>An <dfn id="dfn-start-event">activity start event</dfn> is the <a title="event">instantaneous event</a> that marks the instant an activity starts.</p>
@@ -366,7 +380,7 @@
<p>This specification introduces a set of "temporal interpretation"
-rules allowing the derivation of <a title="event">instantaneous event</a> ordering constraints from
+rules allowing ordering constraints between <a title="event">instantaneous events</a> to be inferred from
provenance descriptions. According to such temporal interpretation,
descriptions MUST satisfy such constraints. We note that the
actual verification of such ordering constraints is outside the
@@ -400,11 +414,11 @@
<p>It is the purpose of attributes in PROV-DM to help fix some aspect of entities.
Indeed, we previously defined
-entities as things in the world one wants to provide provenance for;
+entities as things one wants to provide provenance for;
we refine this definition as follows, using attribute-values to describe entities' "partial states", and linking them to the very existence of entities.</p>
<p>
-An <dfn>entity</dfn> is a thing in the world one wants to provide provenance for and whose situation in the world is represented by some attribute-value pairs; an entity's attribute-value pairs remain unchanged during an entity's characterization interval, which is defined as the period comprised between its <a title="entity generation event">generation event</a> and its <a title="entity destruction event">destruction event</a>.</p>
+An <dfn>entity</dfn> is a thing one wants to provide provenance for and whose situation in the world is described by some attribute-value pairs. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged. An entity's attribute-value pairs are expected to describe the entity's situation and (partial) state during the entity's characterization interval, which is defined as the period between its <a title="entity generation event">generation event</a> and its <a title="entity invalidation event">invalidation event</a>.</p>
<p>An entity fixes some aspects of a thing and its situation in the
world. An alternative entity may fix other aspects, and its provenance
@@ -434,7 +448,7 @@
<p>We do not assume that any entity is more important than any other; in fact, it is possible to describe the processing that occurred for the report to be commissioned, for
individual versions to be created, for those versions to be published at the given URL, etc., each via a different entity with attribute-value pairs that fix some aspect of the report appropriately.</p>
-<p>Attributes are not restricted to entities, but they belong to a variety of PROV-DM objects, including activities, activity associations, responsibility chains, generations, usages, derivations, and alternates. Each object has its duration interval, and attribute-value pairs for a given object, are expected to be unchanged for the object's duration.</p>
+<p>Attributes are not restricted to entities, but they belong to a variety of PROV-DM objects, including activity, generation, usage, start, end, communication, attribution, association, responsibility, and derivation. Each object has its duration interval (potentially collapsing to a single time point), and attribute-value pairs for a given object are expected to be descriptions that hold for the object's duration.</p>
</section>
@@ -473,15 +487,13 @@
</p>
-
</section>
-<div class='issue'> We need to refine the definition of entity and activity, and all the concepts in general. This is <a href="http://www.w3.org/2011/prov/track/issues/223">ISSUE-223</a>.</div>
-
-
-
-
- <section id="account-and-accountEntity">
- <h3>Account and AccountEntity</h3>
+
+
+
+
+ <section id="account-section">
+ <h3>Account</h3>
<p>It is common for multiple provenance records to co-exist. For
@@ -501,53 +513,30 @@
<p>
<span class="glossary" id="glossary-account">
-An <dfn>account</dfn> is a named bundle of provenance descriptions.
+An <dfn>account</dfn> is an entity that contains a bundle of provenance descriptions.
</span> PROV-DM does not provide an actual mechanism for creating accounts, i.e. for bundling up provenance descriptions and naming them. Accounts MUST satisfy some properties:
<ul>
-<li>An account can be seen as a container of provenance descriptions, hence its content MAY change over time.</li>
+<li>An account is a container of provenance descriptions, hence its content MAY change over time.</li>
<li>If an account's set of descriptions changes over time, it increases monotonically with time. </li>
<li>A given description of e.g. an entity in a given account, in terms of its identifier and attribute-value pairs, does not change over time. </li>
</ul>
<div class='note'>
-The last point is important and needs to be discussed by the Working Group.
-It indicates that within an account:
+The last point is important. It indicates that within an account:
<ul>
-<li>It is always possible to add new provenance descriptions, e.g. stating that a given entity was used by an activity. This is very much an open world assumption.
+<li>It is always possible to add new provenance descriptions, e.g. stating that a given entity was used by an activity, or derived from another. This is very much an open world assumption.
<li>It is not permitted to add new attributes to a given entity (a form of closed world assumption from the attributes point of view), though it is always permitted to create a new description for an entity, which is a "copy" of the original description extended with novel attributes (cf Example <a href="#merge-with-rename">merge-with-rename</a>).
</ul>
</div>
<p>
-There is no construct in PROV-DM to create such named bundles. Instead, it is assumed that some mechanism, outside PROV-DM can create them. However, from a provenance viewpoint, such accounts are things we may want to describe the provenance of. In order to be able to do so, we need to see accounts as entities, whose origin can be described using PROV-DM vocabulary. Thus, PROV-DM introduces the reserved type AccountEntity, defined as follows:
-<span class="glossary" id="glossary-entity">
- <dfn id="concept-accountEntity">AccountEntity</dfn> is the category of entities that are accounts, i.e. named bundles of provenance descriptions.
-</span>
+There is no construct in PROV-DM to create such bundles of descriptions. Instead, it is assumed that some mechanism outside PROV-DM can create them. However, from a provenance viewpoint, such accounts are things we may want to describe the provenance of. In order to be able to do so, we need to see accounts as entities, whose origin can be described using PROV-DM vocabulary. Thus, PROV-DM introduces the reserved type <span class="name">Account</span>.
</p>
</section>
</section>
-
-<section id="data-model-constraints">
-<h2>Constraints Applicable to PROV-DM </h2>
-
-<p>In [[PROV-DM]], a data model for provenance has been defined without introducing any constraint that this data model has to satisfy. In <a href="#prov-dm-refinement">Section 2</a>, various notions have been introduced, attributes, event, entity interval, activity interval, accounts, which underpin the PROV-DM data model. Using these notion, we explore the constraints
-that the PROV-DM data model has to satisfy. </p>
-
-<div class='note'>
-<p> Overview the kind of constraints</p>
-<ul>
-<li>Definitional constraints (<a href="#definitional-constraints">Section 4</a>)</li>
-<li>Account constraints (<a href="#account-constraints">Section 5</a>)</li>
-<li>Event ordering constraints (<a href="#interpretation">Section 6</a>)</li>
-<li>Structural constraints (<a href="#structural-constraints">Section 7</a>)</li>
-<li>Collection constraints (<a href="#collection-constraints">Section 8</a>)</li>
-</ul>
-</div>
-
-</section>
<section id="definitional-constraints">
@@ -567,8 +556,8 @@
<p>
-An <dfn>entity</dfn> is a thing one wants to provide provenance for and whose situation in the world is represented by some attribute-value pairs. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged. An entity's attribute-value pairs are expected to represent the entity's situation and (partial) state during an entity's characterization interval,
- i.e. a continuous interval between two <a title="event">instantaneous events</a> in the world, namely its <a title="entity generation event">generation event</a> and its <a title="entity destruction event">destruction event</a>.</p>
+An <dfn>entity</dfn> is a thing one wants to provide provenance for and whose situation in the world is described by some attribute-value pairs. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged. An entity's attribute-value pairs are expected to describe the entity's situation and (partial) state during an entity's characterization interval,
+ i.e. a continuous interval between two <a title="event">instantaneous events</a> in the world, namely its <a title="entity generation event">generation event</a> and its <a title="entity invalidation event">invalidation event</a>.</p>
<p>If an entity's situation changes in the world, this may result in its description being invalid, because one or more attribute-value pairs no longer hold. In that case, from the PROV viewpoint, there exists a new entity, which needs to be given a distinct identifier, and associated with the attribute-value pairs that reflect its new situation in the world.</p>
@@ -1197,7 +1186,7 @@
introduces a notion of <a title="event">instantaneous event</a>
marking changes in the world, in its activities and entities. PROV-DM
identifies five kinds of <a title="event">instantaneous events</a>, namely <a>entity generation
-event</a>, <a>entity usage event</a>, <a>entity destruction event</a>, <a>activity start event</a>
+event</a>, <a>entity usage event</a>, <a>entity invalidation event</a>, <a>activity start event</a>
and <a>activity end event</a>. PROV-DM adopts Lamport's clock
assumptions [[CLOCK]] in the form of a reflexive, transitive partial order <a>follows</a>
(and its inverse <a>precedes</a>) between <a title="event">instantaneous events</a>. Furthermore,
@@ -1251,7 +1240,7 @@
<div class="note">THis is new</div>
<div class='interpretation' id='generation-precedes-invalidation'>For any entity, the following ordering constraint holds: the <a title="entity generation event">generation</a> of an entity always
-<a>precedes</a> any of its <a title="entity destruction event">destruction</a>.
+<a>precedes</a> any of its <a title="entity invalidation event">invalidation</a>.
</div>
@@ -1360,7 +1349,7 @@
</div>
<div class='issue'>In the following, we assume that we can talk about the end of an entity (or agent)
-For this, we use the term 'destruction' This is <a href="http://www.w3.org/2011/prov/track/issues/204">ISSUE-204</a>.
+For this, we use the term 'invalidation'. This is <a href="http://www.w3.org/2011/prov/track/issues/204">ISSUE-204</a>.
</div>
@@ -1384,12 +1373,12 @@
class="name">wasStartedBy(a,ag)</span>
holds, <span class='conditional'>then</span> the following ordering constraints hold: the
<a title="activity start event">start</a> event of the activity denoted by <span class="name">a</span> <a>follows</a> the <a title="entity generation event">generation event</a> for agent denoted by <span class="name">ag</span>, and
-<a>precedes</a> the destruction event of
+<a>precedes</a> the invalidation event of
the same agent.
</div>
-<p>An activity that was associated with an agent must have some overlap with the agent. The agent may be generated, or may only become associated with the activity, after its start: so, the agent is required to exist before the activity end. Likewise, the agent may be destructed, or may terminate its association with the activity, before the activity end: hence, the agent destruction is required to happen after the activity start.
+<p>An activity that was associated with an agent must have some overlap with the agent. The agent may be generated, or may only become associated with the activity, after its start: so, the agent is required to exist before the activity end. Likewise, the agent may be invalidated, or may terminate its association with the activity, before the activity end: hence, the agent invalidation is required to happen after the activity start.
This is
illustrated by Subfigure <a href="#constraint-summary2">constraint-summary2</a> (b) and expressed by constraint <a href="#wasAssociatedWith-ordering">wasAssociatedWith-ordering</a>.</p>
@@ -1399,7 +1388,7 @@
class="name">wasAssociatedWith(a,ag)</span>
holds, <span class='conditional'>then</span> the following ordering constraints hold: the
<a title="activity start event">start</a> event of the activity denoted by <span class="name">a</span>
-precedes the destruction event of
+precedes the invalidation event of
the agent denoted by <span class="name">ag</span>, and
the <a title="entity generation event">generation event</a> for agent denoted by <span class="name">ag</span>
<a>precedes</a> the activity <a title="activity end event">end</a> event.
@@ -1591,7 +1580,11 @@
<div class="note"> unique key constraint removed as it follows from the "update semantics" which is explained in the DM</div>
-It is desirable to restrict the derivation of one collection from another to one single insertion or removal relation. The following examples illustrate what may happen when multiple derivation is allowed.
+It is desirable to restrict the derivation of one collection from another to one single insertion or removal relation, or to one membership relation.
+The interpretation of two (or more) insertion, removal, or membership relations that result in the same collection is undefined.
+
+<!--
+The following examples illustrate what may happen when multiple derivations are allowed.
<div class="anexample">
<pre class="codeexample">
@@ -1618,19 +1611,23 @@
</pre>
Here the insertion and removal into, and removal from <span class="name">c1</span> and <span class="name">c2</span> "cancel" each other. This is allowed if no constraint is enforced, however it is not meaningful.
</div>
+-->
<!--
On the other hand, it is desirable to be able to express the fact that <span class="name">c</span> is obtained precisely as the result of <em>merging</em> <span class="name">c1</span> and <span class="name">c2</span>. <br/>
-->
<!--
This is achieved by adding a constraint to ensure that each derivation is unique, and (ii) making use of the <span class="name">merge(c,c1,c2)</span> to define the state <span class="name">c</span> precisely as the union of the states <span class="name">c1</span> and <span class="name">c2</span>. -->
-This justifies the introduction of the following constraint.
-
+The following constraint ensures unique derivation.
<div class='constraint' id='collection-unique-derivation'/>
- <p>One cannot have multiple assertions that define the state of a collection by means of insertions and removal relations. Thus:</p>
-<pre class="codeexample">
+ <p>The state of a collection that is derived through multiple insertion, removal, or membership relations is undefined.</p>
+
+</div>
+
+<div class="anexample">
+ <pre class="codeexample">
entity(c1, [prov:type="Collection"])
entity(c2, [prov:type="Collection"])
entity(c, [prov:type="Collection"])
@@ -1638,24 +1635,39 @@
derivedByInsertionFrom(c, c1, {(k1, v1), (k2, v2)})
derivedByInsertionFrom(c, c2, {(k3, v3)})
</pre>
-is not allowed (unless the two sets were identical, in which case one of the two statements would be redundant)<p/>
-
- In particular, one cannot derive the state of a collection from another using multiple statements. Thus: <p/>
-<pre class="codeexample">
+is undefined (unless the two sets were identical, in which case one of the two statements would be redundant)<p/>
+</div>
+
+As a particular case, the state of <span class="name">c</span> as derived multiple times from the same <span class="name">c1</span> is undefined. <p/>
+<div class="anexample">
+ <pre class="codeexample">
derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2)})
derivedByInsertionFrom(id2, c, c1, {(k3, v3), (k4, v4)})
</pre>
- is not allowed.<p/>
-
-The same applies to removal and combinations of insertions and removals, for example:
-
+ is undefined. <p/>
+ The expected way to accomplish the effect intended with these statements is as follows:
+ <pre class="codeexample">
+derivedByInsertionFrom(id1, c, c1, {(k1, v1), (k2, v2), (k3, v3), (k4, v4)})
+</pre>
+</div>
+
+The same is true for any combination of insertions, removals, and membership relations:
+<div class="anexample">
<pre class="codeexample">
derivedByInsertionFrom(c, c1, {(k1, v1)})
derivedByRemovalFrom(c, c2, {k2})
</pre>
- is not allowed.
+ is undefined.
+<pre class="codeexample">
+derivedByInsertionFrom(c, c1, {(k1, v1)})
+memberOf(c, c2, {k2})
+</pre>
+ is undefined.
</div>
+</div>
+
+
<!--
<section id="Collection-branching">
@@ -1893,7 +1905,7 @@
</li>
-<li> A destruction event for <span class="name">tr:prov-dm</span> is provided.
+<li> An invalidation event for <span class="name">tr:prov-dm</span> is provided.
<pre>
entity(tr:prov-dm)
agent(ex:Simon)
@@ -1902,7 +1914,7 @@
wasAttributedTo(tr:prov-dm,ex:Simon)
</pre>
<div class='note'>
-Speculative, since we have not defined the destruction event (yet?.
+Speculative, since we have not defined the invalidation event (yet?).
What is the meaning here? that only the versions that existed during this characterization interval were attributed to ex:Simon.
</div>
</li>