updated blog example
authorLuc Moreau <l.moreau@ecs.soton.ac.uk>
Mon, 10 Sep 2012 13:55:50 +0100
changeset 4442 f7053b2e4f85
parent 4441 d5024a7552e5
child 4443 d85ea94d156f
updated blog example
model/constraints-blog.html
--- a/model/constraints-blog.html	Mon Sep 10 12:42:04 2012 +0100
+++ b/model/constraints-blog.html	Mon Sep 10 13:55:50 2012 +0100
@@ -4,46 +4,21 @@
 
 <h2>Last Call: Constraints of the Provenance Data Model</h2>
 
-<p>On Sept. XX, 2012 the Provenance Working Group has announced Last Call on a new document: <a href="http://www.w3.org/TR/prov-constraints/">PROV-CONSTRAINTS</a> in the suite that defines the core of the PROV family of specifications.</p>
+<p>On Sept. 11, 2012 the Provenance Working Group has announced Last Call on a new document: <a href="http://www.w3.org/TR/prov-constraints/">PROV-CONSTRAINTS</a> in the suite that defines the core of the PROV family of specifications.</p>
 
 <p>This follows the recent <a href="http://www.w3.org/blog/SW/2012/07/24/last-call-3-working-drafts-for-provenance-interchange/">Last Call announcement for 3 other documents</a>, namely <a href="http://www.w3.org/TR/prov-dm/">PROV-DM<a>, <a href="http://www.w3.org/TR/prov-o/">PROV-O<a>, <a href="http://www.w3.org/TR/prov-n/">PROV-N<a>. The meaning of <em>Last Call</em> is clarified in the earlier announcement. Essentially, it means that the specification document is open to public comments for a set period of time, at the end of which the editors commit to produce the final version of the document, where all such comments are accounted for following internal group discussions.</p>
 
-<p> The <a href="http://www.w3.org/TR/prov-constraints/">PROV-CONSTRAINTS</a> document complements the first three, and is focused on the notion of <em>valid</em> provenance. The intent of provenance validation  is to ensure that a set of PROV statements, called a <strong>PROV instance</strong>, represents a history of objects and their interactions which is consistent, and thus safe to use for the purpose of logical reasoning and other kinds of analysis.  
- Valid PROV statements satisfy all the definitions, inferences, and constraints found in the  The <a href="http://www.w3.org/TR/prov-constraints/">PROV-CONSTRAINTS</a> document.
-
+<p> The <a href="http://www.w3.org/TR/prov-constraints/">PROV-CONSTRAINTS</a> document complements the first three, and is focused on the notion of <em>valid</em> provenance. The intent of provenance validation  is to ensure that a set of PROV statements represents a history of objects and their interactions which is consistent, and thus safe to use for the purpose of logical reasoning and other kinds of analysis.  </p>
 
-Thus, the document can be used for two main purposes:
-
-<ul> 
-<li>To design a validator that can be used to check the consistency of a PROV instance, and to infer new statements from the given ones by means of logic reasoning;
-<li>To determine whether or not two PROV instances are <em>equivalent</em>. For this, the notion of <strong>normal form</strong> for PROV statements is defined in the document, along with guidelines for normalizing  PROV instances, and for comparing two normalized such sets. 
+<p>Thus, the document can be used to design a validator that can be used to check the consistency of a PROV statements.</p>
 </ul>
 
 <h3>What is in the CONSTRAINTS document?</h3>
 
-<p>The document includes <em>definitions</em>,  <em>constraints</em>, and <em>inference rules</em>. 
+<p>Three types of <strong>constraints</strong> are defined.</p>
 
 <ul>
-<li><strong>Definitions</strong> are rules that specify when two PROV statements are equivalent. For example, <a href="http://www.w3.org/TR/prov-constraints#optional-identifiers">Def. 1</a> stipulates the optional nature of identifiers, by stating  that an expression of the form:<br/>
-
-<code>r(a1,...,an) </code><br/>
-
-is equivalent to: <br/>
-
-<code>r(id; a1,...,an)</code> <br/>
-
-where <code>id</code> is an identifier that in unique across all instances of relation <code>r</code> (this is true for the majority of relations in PROV, with a few exceptions noted as part of the <a href="http://www.w3.org/TR/prov-constraints#optional-identifiers">definition</a>).</li>
-
-<li><strong>Inferences</strong> are IF-THEN rules by which new valid statements can be generated from a given set of statements. For example,  <a href="http://www.w3.org/TR/prov-constraints#derivation-generation-use-inference">Inference 11</a> specifies that, if entity <code>e2</code> was generated from <code>e1</code> during the course of activity <code>a</code>, then it must be the case that <code>a</code> used <code>e1</code> at some point in time,  and that <code>e2</code> was generated by <code>a</code> at some other time. This is expressed as follows:</p>
-
-<code>IF wasDerivedFrom(_id; e2,e1,a,gen2,use1,_attrs), THEN there exists _t1 and _t2 such that used(use1; a,e1,_t1,[]) and wasGeneratedBy(gen2; e2,a,_t2,[]).</p></code>
-
-</li>
-
-<li>Finally, three types of <strong>constraints</strong> are defined.
-
-<ul>
-<li><strong>Uniqueness constraints</strong>. These include key constraints, stating for instance that <code>e</code> is key for statement <code>entity(e,attrs)</code>, but also constraints that state the uniqueness of events such as the generation of an entity.  <a href="http://www.w3.org/TR/prov-constraints#unique-generation">Constraint 25</a> for example states that only one generation event can be associated to a generated entity and a generating activity:</p>
+<li><strong>Uniqueness constraints</strong>. These include key constraints, stating for instance that identifier <code>e</code> is key for statement <code>entity(e,attrs)</code>, but also constraints that state the uniqueness of events such as the generation of an entity by an activity.  <a href="http://www.w3.org/TR/prov-constraints#unique-generation">Constraint 25</a> for example states that only one generation event can be associated to a generated entity and a generating activity:</p>
 
 <code>IF wasGeneratedBy(gen1; e,a,_t1,_attrs1) and wasGeneratedBy(gen2; e,a,_t2,_attrs2), THEN gen1 = gen2.</code></p></li>
 
@@ -58,43 +33,56 @@
 </ul>
 
 </ul>
-<h3>Examples</h3>  
-
-The rules and constraints machinery just described is intended to be used for validation and inference of PROV statements. Some simple examples follow.
-
-<h4>Example 1</h4>  
-
-<strong>Luc maybe you can add a simple inference here? </strong>
-
-<h4>Example 2</h4>  
-
-We now show an inference process involving ordering constaints, which leads to concluding that <em>all the events involved in the provenance must all be simultaneous</em>. Although logically this is a possibility, this is most likely an indication of some of the statements disrupt the consistency of the entire history.  The example involves a case of <em>mutual derivation</em> of an entity from another. Suppose our PROV instance includes the following statement:</p>
-<code>wasDerivedFrom(e2,e1,a,gen,use)</code></p>
-
-that is, <code>e2</code> was generated from <code>e1</code> through activity <code>a</code>. Here <code>gen</code> and <code>use</code> denote the events for the generation of <code>e2</code> and the <code>use</code> of <code>e1</code>, respectively, by <code>a</code>.</p>
+<h3>Example</h3>  
 
 
-<a href="http://www.w3.org/TR/prov-constraints#derivation-usage-generation-ordering">Constraint 43</a> defines the precedence of use over generation in the context of derivation:</p>
+<p>We now show an inference process involving ordering constaints, which leads to concluding that <em>all the events involved in the provenance must all be simultaneous</em>. Although logically this is a possibility, this is most likely an indication of some of the statements disrupt the consistency of the entire history.  The example involves a case of <em>mutual derivation</em> of an entity from another. Consider the following statements:</p>
 
-<code>IF wasDerivedFrom(e2,e1,a,gen,use) THEN use precedes gen.</code></p>
+<p>
+<code>entity(e1)</code><br>
+<code>entity(e2)</code><br>
+<code>activity(a1)</code><br>
+<code>activity(a2)</code><br>
+<code>wasGeneratedBy(gen2; e2,a2,t2)</code><br>
+<code>wasGeneratedBy(gen1; e1,a1,t1)</code><br>
+<code>wasDerivedFrom(d1; e2,e1,-,-,-)</code></p>
 
-Intuitively, <code>a</code> must have used <code>e1</code> prior to generating  <code>e2</code>: </p>
-<code>  use precedes gen.</code></p>
+<p>That is, <code>e2</code> was derived from <code>e1</code>, each of <code>e2</code>, <code>e1</code> being respectively generated by an activity <code>a2</code>, <code>a1</code>, at time <code>t2</code>, <code>t1</code>, as illustrated by the followign figure.</p>
+
+<img src="blog1.png">
+
+
+
+<p><a href="http://www.w3.org/TR/2012/WD-prov-constraints-20120911/#derivation-generation-generation-ordering">Constraint 44</a> defines the precedence of generation of the second entity over generation of the first entity in the context of derivation:</p>
+
+<code>IF wasDerivedFrom(d; e2,e1,a,g,u,attrs) and wasGeneratedBy(gen1; e1,a1,t1,attrs1) and wasGeneratedBy(gen2; e2,a2,t2,attrs2) THEN gen1 strictly precedes gen2.</code>
+
+
+
+<p>Intuitively,  <code>e1</code> must be generated prior to generating  <code>e2</code>: </p>
+<code>  gen1 strictly precedes gen2.</code></p>
    
-Suppose we add the following statement to the instance:</p>
-<code>wasDerivedFrom(e1,e2,a',gen',use')</code></p>     
-That is, <code>e1</code> was generated from <code>e0</code> through activity <code>a'</code>. This new statement may come about because of a merge operation that attempts to blend together two independent sets of statements which predicate on the same objects.  Adding this new statement, however, creates a circular derivation between <code>e1</code> and <code>e2</code>, a fairly atypical situation. We therefore expect that our constraint system be able to tell us something interesting. Indeed, by application of the same <a href="http://www.w3.org/TR/prov-constraints#derivation-usage-generation-ordering">Constraint 43</a>, this new statement entails: </p>
-<code> use' precedes gen'</code></p>
+<p>Suppose we add the following statement to the our set of statements:</p>
 
-Furthermore, <a href="http://www.w3.org/TR/prov-constraints#generation-precedes-usage">Constraint 39</a>, mentioned earlier, specifies  that the generation of any entity must precede all of its uses. Thus we have the additional precedence relations:</p>
-<code> gen   precedes use'  (e1 generated before use),  gen'  precedes use   (e2 generated before use)</code></p>
+<p><code>wasDerivedFrom(d2; e1,e2,-,-,-)</code></p>  
 
-The precedence relation is a preorder between instantaneous events: a constraint of the form <code>e1 precedes e2</code> means that <code>e1</code> happened <em>at the same time as</em> or before <code>e2</code>. By this definition, the set of precedence relations entailed so far leads to a single logical conclusion: all the events involved in the circular derivation must have happened at the same time:</p>
-<code>use = gen = use' = gen'.</code></p>
- Although this is feasible in theory, in practice this conclusion flags these two statements as potentially inconsistent. 
-A reasoner can of course extend these conclusions to the case of a circular derivation involving more than two activities.
+<p>This would form the following overall PROV graph.</p>
+
+<img src="blog2.png">
+   
+<p> Adding this new statement, however, creates a circular derivation between <code>e1</code> and <code>e2</code>, an invalid situation. We therefore expect that our constraint system be able to tell us something interesting. Indeed, by application of the same <a href="http://www.w3.org/TR/2012/WD-prov-constraints-20120911/#derivation-generation-generation-ordering">Constraint 44</a>, this new statement entails: </p>
+<code>  gen2 strictly precedes gen1.</code></p>
+
+
+<p> Hence, we obtain that <code>  gen2 strictly precedes gen1 strictly precedes gen2</code>, which is impossible.</p>
+
+<h2>Conclusion</h2>
+
+<p>This example was simple and may not have required an automated validator to detect invalidation. However, when graph patterns become more complex, an automated validator turns out to be an essential component for provenance user, whether they intend to publish provenance, or whether they intend to consume it. The prov-constraints document defines a set of constraints that validators are expected to implement.</p>
+
+
  
-<h2>Conclusion</h2>
+