* Fixed bug in figure
author James Cheney <jcheney@inf.ed.ac.uk>
Fri, 13 Apr 2012 18:03:25 +0100
changeset 2294 d399bf53971f
parent 2293 d9b8e2cf4f09
child 2295 98d5d2dec1da
* Fixed bug in figure
* Addressed some of Tim's comments
model/images/constraints2.png
model/working-copy/wd5-prov-dm-constraints-revised.html
Binary file model/images/constraints2.png has changed
--- a/model/working-copy/wd5-prov-dm-constraints-revised.html	Fri Apr 13 17:20:34 2012 +0100
+++ b/model/working-copy/wd5-prov-dm-constraints-revised.html	Fri Apr 13 18:03:25 2012 +0100
@@ -38,6 +38,11 @@
           "<a href=\"http://www.ditext.com/johnson/intro-3.html\"><cite>Logic: Part III</cite></a>."+
           "1924. "+
           "URL: <a href=\"http://www.ditext.com/johnson/intro-3.html\">http://www.ditext.com/johnson/intro-3.html</a>",
+        "Lattices":
+          "TODO"+
+          "<a href=\"http://TODO\"><cite>TODO</cite></a>."+
+          "YYYY. "+
+          "URL: <a href=\"http://TODO\">http://TODO</a>",
         "PROV-SEM":
           "James Cheney "+
           "<a href=\"http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman\"><cite>Formal Semantics Strawman</cite></a>. "+
@@ -155,16 +160,20 @@
 
     <section id="abstract">
 <p>
-PROV-DM, the PROV data model, is a data model for provenance that describes the entities, people and activities involved in producing a piece of data or thing. PROV-DM is structured in six components, dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) agents bearing responsibility for entities that were generated and actities that happened; (3) derivations between entities; (4) properties to link entities that refer to a same thing; (5) collections of entities, whose provenance can itself be tracked; (6) a simple annotation mechanism.</p>
-
-
-<p> This document introduces a further set of concepts underpinning
-  the PROV-DM data model and defines <i>inferences</i> that are
-  allowed on provenance descriptions and <i>validity constraints</i>
-  that well-structured provenance descriptions should follow. These
-  inferences and constraints are useful for readers who develop
-  applications that generate provenance or reason over provenance.
-  </p> </section>
+PROV-DM, the PROV data model, is a data model for provenance that
+describes the entities, people and activities involved in producing a
+piece of data or thing. PROV-DM is structured in six components,
+dealing with: (1) entities and activities, and the time at which they
+were created, used, or ended; (2) agents bearing responsibility for activities; (3) derivations between entities; (4) properties to link entities that refer to the same thing; (5) collections of entities, whose provenance can itself be tracked; (6) a simple annotation mechanism.</p>
+
+
+<p> This document introduces a further set of concepts useful for
+  understanding the PROV-DM data model and defines <i>inferences</i>
+  that are allowed on provenance descriptions and <i>validity
+  constraints</i> that well-structured provenance descriptions should
+  follow. These inferences and constraints are useful for readers who
+  develop applications that generate provenance or reason over
+  provenance.  </p> </section>
 
 <section id="sotd">
 <h4>PROV Family of Specifications</h4>
@@ -238,7 +247,8 @@
 <li>A data model for provenance, which is presented in three documents:
 <ul>
 <li> PROV-DM (part I): the provenance data model itself, expressed in natural language  [[PROV-DM]];
-<li> PROV-DM-CONSTRAINTS (part II): constraints underpinning the data model (this document);
+<li> PROV-DM-CONSTRAINTS (part II): inferences and constraints on
+  valid PROV-DM data (this document);
 <li> PROV-N (part III): a notation to express instances of that data model for human consumption [[PROV-N]];
 </ul> 
 </li>
@@ -249,10 +259,15 @@
 </ul>
 
 
-<p>PROV-DM is essentially defined without any constraints  [[PROV-DM]]. This document introduces a further set of concepts underpinning this data model and defines constraints that well-structured provenance descriptions should follow and that provide an interpretation for these descriptions. </p>
-
-
-<p>In [[PROV-DM]], a data model for provenance has been defined without introducing any constraint that this data model has to satisfy.   First we introduce and refine various concepts such as attributes, event, entity, entity, interval, accounts, which underpin the PROV-DM data model. Using these notions, we explore the constraints
+<p>PROV-DM is essentially defined without any constraints 
+[[PROV-DM]]. This document introduces a further set of concepts useful
+for understanding the rationale for the data model, defines inferences
+over PROV-DM data, and defines constraints that valid provenance descriptions should follow. </p>
+
+
+<p>In [[PROV-DM]], a data model for provenance has been defined
+without introducing any constraint that this data model has to
+satisfy.   First we introduce and refine various PROV-DM concepts such as attributes, event, entity, interval, accounts. Using these notions, we explore the constraints
 that the PROV-DM data model has to satisfy. </p> 
 
 <p>Several types of constraints are specified.</p>
@@ -273,7 +288,8 @@
 
 <div class='note'>TODO</div>
 
-<p>In <a href="#prov-dm-refinement">section 2</a>, further concepts underpinning PROV-DM are introduced.</p>
+<p>In <a href="#prov-dm-refinement">section 2</a>, further concepts
+useful for explaining PROV-DM are introduced.</p>
 
 <p><a href="#data-model-constraints">Section 3</a></p>
 
@@ -311,7 +327,16 @@
 <section id='prov-dm-refinement'>
 <h2>Data Model Refinement</h2>
 
-<p>Underpinning the PROV-DM data model is a notion of event, marking transitions in the world (when entities are generated, used, or invalidated, or activities started or ended).  This notion of event is not first-class in the data model, but underpins many of its concepts and its semantics [[PROV-SEM]].  Thus, using this notion of event, we can provide an interpretation for the data model, which in turn can allow creators of provenance descriptons to make their descriptions more robust. </p>
+<div class='note'>TODO: Better section title/headings; move this
+  material later or embed it into appropriate sections</div>
+  
+<p>The PROV-DM data model is implicitly based on a notion of event, marking
+transitions in the world.  Events include generation, usage, or
+invalidation of entities, as well as starting or ending of activities.  This
+notion of event is not first-class in the data model, but it is useful
+for explaining its other concepts and its semantics [[PROV-SEM]].
+Thus, events help justify  <i>inferences</i> on provenance as well as
+<i>validity</i> constraints indicating when provenance is self-consistent. </p>
 
 
     <section id='section-time-event'> 
@@ -320,48 +345,34 @@
 <p>Time is critical in the context of provenance, since it can help corroborate provenance claims. For instance, if an entity is claimed to be obtained by transforming another, then the
 latter must have existed before the former. If it is not the case, then there is something wrong with such a provenance claim. </p>
 
-<p> Although time is critical, we should also recognize that provenance can be used in many different contexts: in a single system, across the Web, or in spatial data management, to name a
-few. Hence, it is a design objective of PROV-DM to minimize the assumptions about time, so that PROV-DM can be used in varied contexts.  </p>
-
-
-<p>Furthermore, consider two activities that started at the same time
+<p> Although time is critical, we should also recognize that
+provenance can be used in many different contexts within individual
+systems and across the Web. Different systems may use different clocks
+which may not be precisely synchronized, so when provenance records
+are combined by different systems, we may not be able to align the
+times involved to a single global timeline.  Hence, PROV-DM is
+designed to minimize assumptions about time.  </p>
+
+
+<!--<p>Furthermore, consider two activities that started at the same time
 instant. Just by referring to that instant, we cannot distinguish
 which activity start we refer to. This is particularly relevant if we
 try to explain that the start of these activities had different
 reasons.  We need to be able to refer to the start of an activity as a
 first class concept, so that we can talk about it and about its
 relation with respect to other similar starts. </p>
-
-
-<p>Hence, in our conceptualization of the world, an <em>instantaneous event</em>, or <dfn id="dfn-event">event</dfn> for short, happens in the world and marks a change in the world, in its
-activities and in its entities.  
+-->
+
+<p>Hence, to talk about the constraints on valid PROV-DM data, we
+refer to <em>instantaneous events</em>, or <dfn
+id="dfn-event">events</dfn> for short, that correspond to interactions
+between activities and entities.  
 The term "event" is commonly used in process algebra with a similar meaning. For instance, in CSP [[CSP]], events represent communications or interactions; they are assumed to be atomic and
 instantaneous.</p>
 
 
 
 
-<section id="types-of-events">
-<h4>Types of Events</h4>
-
-<p>Five kinds of <a title="event">instantaneous events</a> underpin the PROV-DM data model. The <strong>activity start</strong> and <strong>activity end</strong>  events demarcate the
-beginning and the end of activities, respectively. The <strong>entity generation</strong>, <strong>entity usage</strong>, and <strong>entity invalidation</strong> events demarcate the characterization interval for entities. More
-specifically:
-
-</p>
-
-<p>An <dfn id="dfn-generation-event">entity generation event</dfn> is the <a title="event">instantaneous event</a> that marks the  final instant of an entity's creation timespan, after which
-it is available for use.</p>
-
-<p>An <dfn id="dfn-usage-event">entity usage event</dfn> is the <a title="event">instantaneous event</a> that marks the first instant of an entity's consumption timespan by an activity.</p>
-
-<p>An <dfn id="dfn-invalidation-event">entity invalidation event</dfn> is the <a title="event">instantaneous event</a> that marks the  initial instant of the destruction, invalidation, or cessation of an entity, after which the entity is  no longer available for use.</p>
-
-<p>An <dfn id="dfn-start-event">activity start event</dfn> is the <a title="event">instantaneous event</a> that marks the instant an activity starts.</p>
-
-<p>An <dfn id="dfn-end-event">activity end event</dfn> is the <a title="event">instantaneous event</a> that marks the instant an activity ends.</p>
-
-</section>
 
 <section id="event-ordering">
 <h4>Event Ordering</h4>
@@ -372,18 +383,39 @@
 </p>
 
 
-<p>Specifically, <dfn id="dfn-follows">follows</dfn> is a partial
-order between <a title="event">instantaneous events</a>, indicating that an <a title="event">instantaneous event</a> occurs at the same time as or after another.
-For symmetry, <dfn id="dfn-precedes">precedes</dfn> is defined as
-the inverse of follows. (Hence, these relations are reflexive and transitive.)</p>
-
-
-<p> How such partial order is realized in practice is beyond the scope
+<p>Specifically, <dfn id="dfn-precedes">precedes</dfn> is a partial
+order between <a title="event">instantaneous events</a>.  When we say
+<span class="name">e1</span> precedes <span class="name">e2</span>,
+this means that either the two events are equal or <span
+class="name">e1</span> happened before <span class="name">e2</span>.
+For symmetry, <dfn id="dfn-follows">follows</dfn> is defined as the
+inverse of <a title="precedes">precedes</a>; that is, when we say
+<span class="name">e1</span> follows <span class="name">e2</span>,
+this means that either the two events are equal or <span
+class="name">e1</span> happened after <span
+class="name">e2</span>. Both relations are partial orders, meaning
+that they are reflexive, transitive, and antisymmetric
+[[Lattices]].</p>
+
+<div class="note"> Define reflexivity, transitivity
+and antisymmetry in glossary.  Also, do we want to allow an event to
+  "precede" itself?
+</div>
+
+<div class="note">
+ The following paragraphs are unclear and need to be revised, to
+ address review concerns: if
+ we aren't saying anything about how events and time relate, and time
+ is the only concrete information about event ordering in PROV-DM,
+ then how can implementations check that event ordering constraints
+ are satisfied?
+  </div>  
+<p> How such a partial order is implemented in practice is beyond the scope
 of this specification.  This specification only assumes that
 each <a title="event">instantaneous event</a> can be mapped to an instant in some form of
 timeline. The actual mapping is not in scope of this
 specification. Likewise, whether this timeline is formed of a single
-global timeline or whether it consists of multiple Lamport's style
+global timeline or whether it consists of multiple Lamport-style
 clocks is also beyond this specification.  The <a>follows</a> and
 <a>precedes</a> orderings of events should be consistent with the
 ordering of their associated times
@@ -391,10 +423,9 @@
 </p>
 
 
-<p>This specification introduces a set of <i>event ordering constraints</i>
+<p>This specification defines <i>event ordering constraints</i>
 between  <a title="event">instantaneous events</a> associated with 
-provenance descriptions.  According to such temporal interpretations,
-descriptions MUST satisfy such constraints.  </p>
+provenance descriptions.  PROV-DM data MUST satisfy such constraints.  </p>
 
 <p>PROV-DM also allows for time observations to be inserted in
 specific descriptions, for each recognized <a
@@ -410,6 +441,32 @@
 
 </section>
 
+<section id="types-of-events">
+<h4>Types of Events</h4>
+
+<p>Five kinds of <a title="event">instantaneous events</a> are used
+for the PROV-DM data model. The <strong>activity start</strong> and <strong>activity end</strong>  events demarcate the
+beginning and the end of activities, respectively. The <strong>entity
+usage</strong>, <strong>entity generation</strong>, and <strong>entity invalidation</strong> events demarcate the characterization interval for entities. More
+specifically:
+
+</p>
+
+<p>An <dfn id="dfn-start-event">activity start event</dfn> is the <a title="event">instantaneous event</a> that marks the instant an activity starts.</p>
+
+<p>An <dfn id="dfn-end-event">activity end event</dfn> is the <a title="event">instantaneous event</a> that marks the instant an activity ends.</p>
+
+<p>An <dfn id="dfn-usage-event">entity usage event</dfn> is the <a
+title="event">instantaneous event</a> that marks the first instant of
+an entity's consumption timespan by an activity.</p>
+
+<p>An <dfn id="dfn-generation-event">entity generation event</dfn> is the <a title="event">instantaneous event</a> that marks the  final instant of an entity's creation timespan, after which
+it is available for use.</p>
+
+<p>An <dfn id="dfn-invalidation-event">entity invalidation event</dfn> is the <a title="event">instantaneous event</a> that marks the  initial instant of the destruction, invalidation, or cessation of an entity, after which the entity is  no longer available for use.</p>
+
+</section>
+
     </section> 
 
 
@@ -427,17 +484,20 @@
 
 <p>
 <!--From a provenance viewpoint, it is important to identify a
-<em>partial state</em> of something, i.e. something with some
-aspects that have been fixed, so that it becomes possible to express
-its provenance, and what causes that thing, with these specific
+<em>partial state</em> of something, i.e. something with some aspects
+that have been fixed, so that it becomes possible to express its
+provenance, and what causes that thing, with these specific
 aspects.-->
-
-To describe the provenance of things that can change over time,
-PROV-DM uses the concept of <i>entities</i> with fixed attributes.  An
-entity encompasses a part of a thing's history during which some of the
+To describe the provenance of things that can change over
+time, PROV-DM uses the concept of <i>entities</i> with fixed
+attributes.  From a provenance viewpoint, it is important to identify
+a partial state of something, i.e. something with some aspects that
+have been fixed, so that it becomes possible to express its provenance
+(i.e. what caused the thing with these specific aspects).  An entity
+encompasses a part of a thing's history during which some of the
 attributes are fixed.  An entity can thus be thought of as a part of a
 thing with some associated partial state.
-</p>
+Attributes in PROV-DM are used to fix certain aspects of entities.</p>
 
 <!--<p>Attributes in PROV-DM describe some aspects of entities.
 Indeed, we previously defined 
@@ -447,7 +507,14 @@
 
 <p>
 An <dfn>entity</dfn> is a thing one wants to provide provenance for
-and whose situation in the world is described by some fixed attributes. An entity's attributes pairs are expected to describe the entity's situation and (partial) state  during an entity's characterization interval, which  is defined as the period between its <a title="entity generation event">generation event</a> and its <a title="entity invalidation event">invalidation event</a>.</p>
+and whose situation in the world is described by some fixed
+attributes. An entity has a characterization interval, or
+lifetime, which is defined as the period
+between its <a title="entity generation event">generation event</a>
+and its <a title="entity invalidation event">invalidation event</a>.
+An entity's attributes are established when the entity is
+created and describe the entity's situation and (partial) state
+during the entity's lifetime.</p>
 
 <p>An entity fixes some aspects of a thing and its situation in the
 world. A different entity (perhaps representing a different user or