Binary file .DS_Store has changed
Binary file model/.DS_Store has changed
--- a/model/ProvenanceModel.html Wed Dec 14 14:22:45 2011 +0000
+++ b/model/ProvenanceModel.html Wed Dec 14 16:03:51 2011 +0000
@@ -709,12 +709,12 @@
<p>
Activity Records (described in <a href="#record-Activity">Section Activity</a>) represent activities in the scenario.</p>
<pre>
-activity(a0, create-file, 2011-11-16T16:00:00,)
-activity(a1, add-crime-in-london, 2011-11-16T16:05:00,)
-activity(a2, email, 2011-11-16T17:00:00,)
-activity(a3, edit-London-New-York, 2011-11-17T09:00:00,)
-activity(a4, email, 2011-11-17T09:30:00,)
-activity(a5, spellcheck,,)
+activity(a0, 2011-11-16T16:00:00,,[prov:type="createFile"])
+activity(a1, 2011-11-16T16:05:00,,[prov:type="edit"])
+activity(a2, 2011-11-16T17:00:00,,[prov:type="email"])
+activity(a3, 2011-11-17T09:00:00,,[prov:type="edit"])
+activity(a4, 2011-11-17T09:30:00,,[prov:type="email"])
+activity(a5,,,[prov:type="spellcheck"])
</pre>
@@ -992,10 +992,10 @@
<p>
Examples of activities include assembling a data set based on a set of measurements, performing a statistical analysis over a data set, sorting news items according to some criteria, running a sparql query over a triple store, editing a file, and publishing a web page. </p>
-<p> An activity record, written <span class="name">activity(id, rl, st, et, [ attr1=val1, ...])</span> in PROV-ASN, contains:</p>
+<p> An activity record, written <span class="name">activity(id, st, et, [ attr1=val1, ...])</span> in PROV-ASN, contains:</p>
<ul>
<li><em>id</em>: an identifier <span class="name">id</span> identifying an activity; the identifier of the activity record is defined to be the same as the identifier of the activity;</li>
-<li><em>recipeLink</em>: an OPTIONAL <a href="#record-RecipeLink">recipe link</a> <span class="name">rl</span>, which consists of a domain specific specification of the activity;</li>
+<!--<li><em>recipeLink</em>: an OPTIONAL <a href="#record-RecipeLink">recipe link</a> <span class="name">rl</span>, which consists of a domain specific specification of the activity;</li>-->
<li><em>startTime</em>: an OPTIONAL time <span class="name">st</span> indicating the start of the activity;</li>
<li><em>endTime</em>: an OPTIONAL time <span class="name">et</span> indicating the end of the activity;</li>
<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">[ attr1=val1, ...]</span>, representing other attributes of this activity that hold for its whole duration.</li>
@@ -1011,8 +1011,10 @@
<span class="name">(</span>
<span class="nonterminal">identifier</span>
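-<span class="optional"><span class="name">,</span>
-<span class="nonterminal">recipeLink</span> </span>
+<!--
+<span class="optional"><span class="name">,</span>
+<span class="nonterminal">recipeLink</span> </span>
+-->
<span class="name">,</span>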
<span class="optional"><span class="nonterminal">time</span></span>
<span class="name">,</span>
<span class="optional"><span class="nonterminal">time</span></span>
@@ -1024,7 +1026,7 @@
<p>
The following activity assertion</p>
<pre class="codeexample">
-activity(a1,add-crime-in-london,2011-11-16T16:05:00,2011-11-16T16:06:00,[ex:host="server.example.org",prov:type="ex:edit" %% xsd:QName])
+activity(a1,2011-11-16T16:05:00,2011-11-16T16:06:00,[ex:host="server.example.org",prov:type="ex:edit" %% xsd:QName])
</pre>
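-<p>identified by identifier <span class="name">a1</span>, states the existence of an activity with recipe link <span class="name">add-crime-in-london</span>, start time <span class="name">2011-11-16T16:05:00</span>, and end time <span class="name">2011-11-16T16:06:00</span>, running on host <span class="name">server.example.org</span>, and of type <span class="name">edit</span> (declared in some namespace with prefix <span class="name">ex</span>). The attribute <span class="name">host</span> is application specific, but MUST hold for the duration of activity. The attribute <span class="name">type</span> is a reserved attribute of PROV-DM, allowing for subtyping to be expressed.</p>
+<p>identified by identifier <span class="name">a1</span>, states the existence of an activity with start time <span class="name">2011-11-16T16:05:00</span> and end time <span class="name">2011-11-16T16:06:00</span>, running on host <span class="name">server.example.org</span>, and of type <span class="name">edit</span> (declared in some namespace with prefix <span class="name">ex</span>). The attribute <span class="name">host</span> is application specific, but MUST hold for the duration of the activity. The attribute <span class="name">type</span> is a reserved attribute of PROV-DM, allowing for subtyping to be expressed.</p>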
</div>
@@ -1073,8 +1075,14 @@
<li><span class="name">Organization</span>: agents of type Organization are social institutions such as companies, societies etc. (This type is equivalent to a "foaf:organization" [[FOAF]])</li>
<li><span class="name">SoftwareAgent</span>: a software agent is a piece of software. </li>
</ul>
+<p>Furthermore, section <a href="#record-planLink">Plan Link Record</a> introduces the idea of <em>plans</em> being associated with activities:</p>
+<ul>
+<li><span class="name">Plan</span>: agents of type <a >plan</a> are entities that represent a set of
+actions or steps intended to achieve some goal.</li>
+</ul>
<p>These types are mutually exclusive, though they do not cover all kinds of agent. </p>
+
<p>An agent record, noted <span class="name">agent(id, [ attr1=val1, ...])</span> in PROV-ASN, contains:</p>
<ul>
<li><em>id</em>: an identifier <span class="name">id</span> identifying an agent; the identifier of the agent record is defined to be the same as the identifier of the agent;</li>
@@ -1184,7 +1192,7 @@
<caption>PROV-DM Core Relation Summary</caption>
<tr><td></td><td>Entity</td><td>Activity</td><td>Agent</td><td>Note</td></tr>
<tr><td>Entity</td><td><a title="derivation record">wasDerivedFrom</a><br><a title="complementarity record">wasComplementOf</a></td><td><a title="generation record">wasGeneratedBy</a></td><td>-</td><td><a title="annotation record">hasAnnotation</a></td></tr>
-<tr><td>Activity</td><td><a title="usage record">used</a></td><td>-</td><td><a title="start record">wasStartedBy</a><br><a title="end record">wasEndedBy</a><br><a title="activity association record">wasAssociatedWith</a></td><td><a title="annotation record">hasAnnotation</a></td></tr>
+<tr><td>Activity</td><td><a title="usage record">used</a></td><td>-</td><td><a title="start record">wasStartedBy</a><br><a title="end record">wasEndedBy</a><br><a title="activity association record">wasAssociatedWith</a><br><a title="plan link record">hadPlan</a></td><td><a title="annotation record">hasAnnotation</a></td></tr>
<tr><td>Agent</td><td>-</td><td>-</td><td><a title="responsibility record">actedOnBehalfOf</a></td><td><a title="annotation record">hasAnnotation</a></td></tr>
<tr><td>Note</td><td>-</td><td>-</td><td>-</td><td><a title="annotation record">hasAnnotation</a></td></tr>
</table>
@@ -1486,12 +1494,84 @@
was started and ended by an agent, represented by record denoted by <span class="name">ah</span>, in "manual" mode, an application specific characterization of these relations.
</p>
</div>
-</section>
+
<div class="note">Temporal constraints for these relations remain to be
written. The temporal constraints should ensure that the agent started
its existence before the effect it may have on the activity. </div>
-
+</section>
+
+
+<section id="record-planLink">
+<h4>Plan Link Record</h4>
+
+<p>In the context of PROV-DM, a <dfn>plan</dfn> should be understood as a set of
+actions or steps intended to achieve some goal. PROV-DM is not
+prescriptive about the nature of plans, their representation, the
+actions and steps they consist of, and their intended goals. Hence,
+for the purpose of this specification, a plan can be a workflow for a
+scientific experiment, a recipe for a cooking activity, or a list of
+instructions for a micro-processor execution. While PROV-DM does not
+specify the representations of plans, it allows for activities to be
+associated with plans. Furthermore, since plans may evolve over time,
+it may become necessary to track their provenance, and hence, plans are
+entities. An activity MAY be associated with multiple plans. This
+allows for the description of an activity that was initially associated
+with a plan that was then changed, on the fly, as the activity progressed. Plans
+can be successfully executed or they can fail. We expect applications
+to exploit PROV-DM <a href="#extensibility-section">extensibility
+mechanisms</a> to capture the rich nature of plans and associations
+between activities and plans.</p>
+
+
+<p>A
+<dfn>plan link record</dfn> is a representation of an association
+between an activity and a plan. It is a specialization of
+an <a>activity association record</a>. Hence, a plan is also a PROV-DM
+agent, by virtue of being associated with an activity (cf. constraint <a href="#association-agent">association-agent</a>).</p>
+
+
+
+
+<p>A plan link record, written <span class="name">hadPlan(id,a,e,attrs)</span> in PROV-ASN, contains:</p>
+<ul>
+<li><em>id</em>: an OPTIONAL identifier <span class="name">id</span> identifying the plan link record;</li>
+<li><em>activity</em>: an identifier <span class="name">a</span> denoting an activity record;</li>
+<li><em>entity</em>: an identifier <span class="name">e</span> for an entity, representing a plan;</li>
+<li><em>attributes</em>: an OPTIONAL set of attribute-value pairs <span class="name">attrs</span>, describing modalities of this relation.</li>
+</ul>
+
+
+<div class='grammar'>
+<span class="nonterminal">planLink</span> ::=
+<span class="name">hadPlan</span>
+<span class="name">(</span>
+<span class="optional"> <span class="nonterminal">identifier</span>,</span>
+<span class="nonterminal">aIdentifier</span>,
+<span class="nonterminal">eIdentifier</span>
+<span class="nonterminal">optional-attribute-values</span>
+<span class="name">)</span>
+</div>
+
+
+<div class="anexample">
+<p>Below, we find assertions about
+activity <span class="name">ex:a2</span>. It is said to be associated
+with two plans <span class="name">ex:s1</span>
+and <span class="name">ex:s2</span>, two strategy documents at
+specific URLs. The first one is superseded, whereas the second, derived
+from the first, is the most recent.</p>
+<pre class="codeexample">
+entity(ex:s1,[prov:type="prov:Plan"%% xsd:QName, ex:label="Communication Strategy 1",
+ ex:url="http://example.org/strategy1.html" %% xsd:anyURI])
+entity(ex:s2,[prov:type="prov:Plan"%% xsd:QName, ex:label="Communication Strategy 2",
+ ex:url="http://example.org/strategy2.html" %% xsd:anyURI])
+wasDerivedFrom(ex:s2,ex:s1)
+hadPlan(ex:a2, ex:s1,[ex:kind="superseded"])
+hadPlan(ex:a2, ex:s2,[ex:kind="newest"])
+</pre>
+</div>
+</section>
<!--
@@ -1882,15 +1962,74 @@
<section id="record-complement-of">
-<h4>Complementarity Record</h4>
-
-<div class="note">While the working group recognizes the importance of the complementarity record concept, its name and its exact semantics are still being discussed.
+<h4>View Record</h4>
+
+<div class="note">This section is currently under revision and in flux</div>
+
+<p>A <dfn id="view">view record</dfn> establishes a relationship between two entity records, asserting that the two records provide different characterizations of the same entity.<br/>
+
+Two versions of this relation are considered.
+<ul>
+
+ <li>The <span class="name">strictViewOf(e2,e1)</span> relation captures the intuition that <span class="name">e2</span> and <span class="name">e1</span> denote the same entity, but <span class="name">e2</span> provides a <em>more specific</em> characterization of the entity than <span class="name">e1</span> does. One example is an entity record <span class="name">e1</span> describing "Bob as a person", and another entity record <span class="name">e2</span> describing "Bob, the holder of twitter account XYZ". Following this interpretation, relation <span class="name">strictViewOf()</span> is both <strong>transitive</strong> and <strong>anti-symmetric</strong>:
+
+ <ul>
+ <li>if <span class="name">strictViewOf(e3,e2)</span> and <span class="name">strictViewOf(e2,e1)</span> hold then it follows that <span class="name">strictViewOf(e3,e1)</span> holds.
+
+ <li>if <span class="name">strictViewOf(e2,e1)</span> then it follows that <span class="name">strictViewOf(e1,e2)</span> does not hold.
+ </ul>
+
+ <li>The <span class="name">viewOf(e2,e1)</span> relation captures the intuition that <span class="name">e2</span> and <span class="name">e1</span> provide different characterizations of the same entity, but there is no indication that one is more specific than the other. One example is an entity record <span class="name">e1</span> describing "Bob, the holder of facebook account ABC", and another entity record <span class="name">e2</span> describing "Bob, the holder of twitter account XYZ". Following this interpretation, relation <span class="name">viewOf()</span> is <strong>transitive</strong> and <strong>symmetric</strong>:
+
+ <ul>
+ <li>if <span class="name">viewOf(e3,e2)</span> and <span class="name">viewOf(e2,e1)</span> hold then it follows that <span class="name">viewOf(e3,e1)</span> holds.
+
+ <li>if <span class="name">viewOf(e2,e1)</span> then it follows that <span class="name">viewOf(e1,e2)</span> also holds.
+ </ul>
+
+</ul>
+
+<p>The definition of entity record includes a specification of the characterization interval within which the record is valid. In both versions, the relation is only valid within the intersection of the characterization intervals of the two participating entity records.</p>
+
+
+<p>A strict view record is written <span class="name">strictViewOf(e2,e1)</span>, where <span class="name">e1</span> and <span class="name">e2</span> are two identifiers denoting entity records.</p>
+
+<p>A view record is written <span class="name">viewOf(e2,e1)</span>, where <span class="name">e1</span> and <span class="name">e2</span> are two identifiers denoting entity records.</p>
+
+
+<div class="anexample">
+
+ <div class="note">RS example to be replaced once we reach consensus on the definition</div>
</div>
+
+<p>In PROV-ASN, the text of a view record or a strict view record matches the <span class="nonterminal">viewRecord</span> production of the grammar defined in this specification document.</p>
+
+<div class='grammar'>
+<span class="nonterminal">viewRecord</span> ::=
+<span class="name">viewOf</span>
+<span class="name">(</span>
+<span class="nonterminal">eIdentifier</span>
+<span class="name">,</span>
+<span class="nonterminal">eIdentifier</span>
+<span class="nonterminal">optional-attribute-values</span>
+<span class="name">)</span> <br/>
+|
+<span class="name">strictViewOf</span>
+<span class="name">(</span>
+<span class="nonterminal">eIdentifier</span>
+<span class="name">,</span>
+<span class="nonterminal">eIdentifier</span>
+<span class="nonterminal">optional-attribute-values</span>
+<span class="name">)</span> <br/>
+</div>
+
+
+<!--
+
<p>A <dfn id="complementOf">complementarity record</dfn> is a relationship between two entities stated to have compatible characterization over some continuous interval between two events.</p>
-<!-- PAOLO, which event do you refer to? <a>event</a>?? -->
<p>
@@ -1912,13 +2051,7 @@
<p>It is important to note that the relation holds only for the characterization intervals of the entity expessions involved As soon as one attribute changes value in one of them, new correspondences need to be found amongst the new entities. Thus, the relation has a validity span that can be expressed in terms of the event lines of the entity.</p>
-<!--
-The "IVP of" relationship is designed to represent pairs of entities that correspond to each other. By their own nature, an entity remains valid only as long as all of its attributes do not change their value. It follows that the correspondence "B IVP of A" is only valid within the time interval during which such invariance attribute holds for both A and B. When any of the attribute values change in either A or B, those entities are replaced by new ones, and a new correspondence may be established. Thus, "IVP of" is defined relative to the intersection of the temporal intervals for which A and B are valid.
--->
-
-
-
-<p>A complementarity record is written <span class="name">wasComplementOf(e2,e1)</span>, where <span class="name">e1</span> and <span class="name">e2</span> are two identifiers denoting entity records.</p>
+
<div class="anexample">
@@ -1960,31 +2093,6 @@
<span class="name">wasComplementOf(e3,e2)</span> and <span class="name">wasComplementOf(e2,e1)</span> hold. The record <span class="name">wasComplementOf(e3,e1)</span> may not hold because the characterization intervals of the denoted entity records may not overlap.</p>
-<p>In PROV-ASN, a complementarity record's text matches the <span class="nonterminal">complementarityRecord</span> production of the grammar defined in this specification document.</p>
-
-<div class='grammar'>
-<span class="nonterminal">complementarityRecord</span> ::=
-<span class="name">wasComplementOf</span>
-<span class="name">(</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="nonterminal">optional-attribute-values</span>
-<span class="name">)</span> <br/>
-|
-<span class="name">wasComplementOf</span>
-<span class="name">(</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">accIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">eIdentifier</span>
-<span class="name">,</span>
-<span class="nonterminal">accIdentifier</span>
-<span class="nonterminal">optional-attribute-values</span>
-<span class="name">)</span>
-</div>
-
<p>
An entity record identifier can optionally be accompanied by an account identifier. When this is the case, it becomes possible to link two entity record identifiers that are appear in different accounts. (In particular, the entity record identifiers in two different account are allowed to be the same.). When account identifiers are not available, then the linking of entity records through complementarity can only take place within the scope of a single account.
@@ -2022,7 +2130,7 @@
Furthermore, there is a suggestion that an alternative relation that is transitive may also be useful.
This is raised in the following <a href="http://lists.w3.org/Archives/Public/public-prov-wg/2011Sep/0315.html">email</a>.</div>
-
+-->
<div class='issue'>A discussion on alternative definition of wasComplementOf has not reached a satisfactory conclusion yet. This is <a href="http://www.w3.org/2011/prov/track/issues/29">ISSUE-29</a></div>
@@ -2109,7 +2217,7 @@
<pre class="codexample">
entity(e1,[prov:type="document"])
entity(e2,[prov:type="document"])
-activity(a,transform,t1,t2,[])
+activity(a,t1,t2)
used(u1,a,e1,[ex:file="stdin"])
wasGeneratedBy(e2, a, [ex:file="stdout"])
@@ -2188,9 +2296,9 @@
...
wasDerivedFrom(e2,e1)
...
- activity(a0,create-file,t)
+ activity(a0,t,,[prov:type="createFile"])
...
- wasGeneratedBy(e0,a0,[])
+ wasGeneratedBy(e0,a0)
...
wasAssociatedWith(a4, ag5, [prov:role="communicator"]) )
</pre>
@@ -2243,7 +2351,7 @@
http://example.org/asserter2,
entity(e0, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice" ])
...
- activity(a1,create-file,t1)
+ activity(a1,t1,,[prov:type="createFile"])
...
wasGeneratedBy(e0,a1,[ex:fct="create"])
... )
@@ -2260,12 +2368,12 @@
account(ex:acc3,
http://example.org/asserter1,
entity(e0, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice" ])
- activity(a0,create-file,t)
+ activity(a0,t,,[prov:type="createFile"])
wasGeneratedBy(e0,a0,[])
account(ex:acc4,
http://example.org/asserter2,
entity(e1, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice", ex:content="" ])
- activity(a0,copy-file,t)
+ activity(a0,t,,[prov:type="copyFile"])
wasGeneratedBy(e1,a0,[ex:fct="create"])
wasComplementOf(e1,e0)))
</pre>
@@ -2520,6 +2628,7 @@
</section>
+<!--
<section id="record-RecipeLink">
<h3>Recipe Link</h3>
@@ -2555,7 +2664,7 @@
</section>
-
+-->
@@ -2888,8 +2997,8 @@
<p>
In the following assertions, we find two activity records, identified by <span class="name">a1</span> and <span class="name">a2</span>, representing two activities, which took place on two separate hosts. The third record indicates that the latter activity was started by the former.</p>
<pre class="codeexample">
-activity(a1,workflow,t1,t2,[ex:host="server1.example.org"])
-activity(a2,sub-workflow,t3,t4,[ex:host="server2.example.org"])
+activity(a1,t1,t2,[ex:host="server1.example.org",prov:type="workflow"])
+activity(a2,t3,t4,[ex:host="server2.example.org",prov:type="subworkflow"])
wasStartedBy(a2,a1)
</pre>
@@ -3029,7 +3138,7 @@
<span class="name">e</span> and <span class="name">ag</span>,
<span class='conditional'>then</span> there exists an activity identified by <span class="name">pe</span> such that the following statements hold:
<pre>
-activity(pe,recipe,t1,t2,attr1)
+activity(pe,t1,t2,attr1)
wasGenerateBy(e,pe)
wasAssociatedWith(pe,ag,attr2)
</pre>
@@ -3435,6 +3544,7 @@
<section class="appendix">
<h2>Changes Since Second Public Working Draft</h2>
<ul>
+<li>12/13/11: Introduced hadPlan as a specialization of wasAssociatedWith. </li>
<li>12/12/11: Section 7 on interpretation. </li>
<li>12/08/11: Restructuring of Constraints. </li>
</ul>
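The activity hunks above all apply the same mechanical rewrite: the recipe link leaves the positional arguments and a `prov:type` attribute takes its place. A minimal Python sketch of that rewrite follows. The function name `migrate_activity` is hypothetical, and copying the recipe link verbatim into `prov:type` is an assumption; the changeset itself renames some values by hand (for example, `add-crime-in-london` becomes `edit`). Records that already carry an attribute list are rejected rather than guessed at.

```python
import re

# Old form (before this changeset): activity(id, recipeLink, startTime, endTime)
# New form (after this changeset):  activity(id, startTime, endTime, [prov:type="..."])
# The pattern only accepts records without an attribute list (no brackets).
OLD_ACTIVITY = re.compile(r"^activity\(([^()\[\]]*)\)$")


def migrate_activity(record: str) -> str:
    """Rewrite a four-field old-style activity record into the new form.

    Assumption: the recipe link is copied verbatim into prov:type.
    """
    match = OLD_ACTIVITY.match(record.strip())
    if match is None:
        raise ValueError(f"not a simple activity record: {record!r}")
    fields = [field.strip() for field in match.group(1).split(",")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {record!r}")
    ident, recipe, start, end = fields
    if not recipe:
        # No recipe link to carry over, so no attribute list is emitted.
        return f"activity({ident},{start},{end})"
    return f'activity({ident},{start},{end},[prov:type="{recipe}"])'
```

For instance, `migrate_activity("activity(a2, email, 2011-11-16T17:00:00,)")` yields the same record the first hunk introduces for `a2`, modulo whitespace.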
--- a/model/releases/WD-prov-dm-20111215/Overview.html Wed Dec 14 14:22:45 2011 +0000
+++ b/model/releases/WD-prov-dm-20111215/Overview.html Wed Dec 14 16:03:51 2011 +0000
@@ -3190,7 +3190,7 @@
The purpose of this section is to enable modelling of part-of relationships amongst entities. In particular, a form of <strong>collection</strong> entity type is introduced, consisting of a set of key-value pairs. Key-value pairs provide a generic indexing structure that can be used to model commonly used data structures, including associative lists (also known as "dictionaries" in some programming languages), relational tables, ordered lists, and more.<br>
-The relations introduced here are all specializations of the <a href="#record-Derivation"><strong>wasDerivedFrom</strong></a> relation, specifically precise-1 or imprecise-1. They are designed to model:
+The relations introduced here are all specializations of the <a href="#Derivation-Relation"><strong>wasDerivedFrom</strong></a> relation, specifically precise-1 or imprecise-1. They are designed to model:
<ul>
<li><strong>insertion</strong>: a collection entity c' is obtained from collection entity c, by adding entity e having key k to c;
@@ -3923,8 +3923,8 @@
</dd><dt id="bib-CSP">[CSP]</dt><dd>Hoare, C. A. R. <a href="http://www.usingcsp.com/cspbook.pdf"><cite>Communicating Sequential Processes</cite></a>.Prentice-Hall. 1985URL: <a href="http://www.usingcsp.com/cspbook.pdf">http://www.usingcsp.com/cspbook.pdf</a>
</dd><dt id="bib-FOAF">[FOAF]</dt><dd>Dan Brickley, Libby Miller. <a href="http://xmlns.com/foaf/spec/"><cite>FOAF Vocabulary Specification 0.98.</cite></a> 9 August 2010. URL: <a href="http://xmlns.com/foaf/spec/">http://xmlns.com/foaf/spec/</a>
</dd><dt id="bib-Logic">[Logic]</dt><dd>W. E. Johnson<a href="http://www.ditext.com/johnson/intro-3.html"><cite>Logic: Part III</cite></a>.1924. URL: <a href="http://www.ditext.com/johnson/intro-3.html">http://www.ditext.com/johnson/intro-3.html</a>
-</dd><dt id="bib-PROV-O">[PROV-O]</dt><dd>Satya Sahoo and Deborah McGuinness <a href="http://dvcs.w3.org/hg/prov/raw-file/default/ontology/ProvenanceFormalModel.html"><cite>Provenance Formal Model</cite></a>. 2011, Work in progress. URL: <a href="http://dvcs.w3.org/hg/prov/raw-file/default/ontology/ProvenanceFormalModel.html">http://dvcs.w3.org/hg/prov/raw-file/default/ontology/ProvenanceFormalModel.html</a>
-</dd><dt id="bib-PROV-PAQ">[PROV-PAQ]</dt><dd>Graham Klyne and Paul Groth <a href="http://dvcs.w3.org/hg/prov/raw-file/tip/paq/provenance-access.html"><cite>Provenance Access and Query</cite></a>. 2011, Work in progress. URL: <a href="http://dvcs.w3.org/hg/prov/raw-file/tip/paq/provenance-access.html">http://dvcs.w3.org/hg/prov/tip/paq/provenance-access.html</a>
+</dd><dt id="bib-PROV-O">[PROV-O]</dt><dd>Satya Sahoo and Deborah McGuinness <a href="http://www.w3.org/TR/prov-o/"><cite>Provenance Formal Model</cite></a>. 2011, Work in progress. URL: <a href="http://www.w3.org/TR/prov-o/">http://www.w3.org/TR/prov-o/</a>
+</dd><dt id="bib-PROV-PAQ">[PROV-PAQ]</dt><dd>Graham Klyne and Paul Groth <a href="http://dvcs.w3.org/hg/prov/raw-file/tip/paq/prov-aq.html"><cite>Provenance Access and Query</cite></a>. 2011, Work in progress. URL: <a href="http://dvcs.w3.org/hg/prov/raw-file/tip/paq/prov-aq.html">http://dvcs.w3.org/hg/prov/raw-file/tip/paq/prov-aq.html</a>
</dd><dt id="bib-PROV-PRIMER">[PROV-PRIMER]</dt><dd>Yolanda Gil and Simon Miles <a href="http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html"><cite>Prov Model Primer</cite></a>. 2011, Work in progress. URL: <a href="http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html">http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html</a>
</dd><dt id="bib-PROV-SEMANTICS">[PROV-SEMANTICS]</dt><dd>James Cheney <a href="http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman"><cite>Formal Semantics Strawman</cite></a>. 2011, Work in progress. URL: <a href="http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman">http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman</a>
</dd></dl></div></div></body></html>
--- a/paq/prov-aq.html Wed Dec 14 14:22:45 2011 +0000
+++ b/paq/prov-aq.html Wed Dec 14 16:03:51 2011 +0000
@@ -44,7 +44,7 @@
};
var respecConfig = {
// specification status (e.g. WD, LCWD, NOTE, etc.). If in doubt use ED.
- specStatus: "FPWD-NOTE",
+ specStatus: "ED",
// the specification's short name, as in http://www.w3.org/TR/short-name/
shortName: "prov-aq",
--- a/primer/Primer.html Wed Dec 14 14:22:45 2011 +0000
+++ b/primer/Primer.html Wed Dec 14 16:03:51 2011 +0000
@@ -112,32 +112,33 @@
</ul>
<p>
- The provenance of digital objects represents their origins. The PROV-DM is a
- proposed standard to represent provenance records, which reflect the entities
+ The <i>provenance</i> of digital objects represents their origins. The PROV-DM is a
+ proposed standard to represent provenance records, which contain <i>assertions</i> about the entities
and activities involved in producing and delivering or otherwise influencing a
given object. By knowing the provenance of an object, we can make determinations
about how to use it. Provenance records can be used for many purposes, such as
understanding how data was collected so it can be meaningfully used, determining
ownership and rights over an object, making judgments about information to
- determine whether to trust it, verifying that the activity used to obtain a
+ determine whether to trust it, verifying that the process and steps used to obtain a
result complies with given requirements, and reproducing how something it was generated.
</p>
<p>
As a standard for provenance, PROV-DM accommodates all those different uses
- of provenance. However, different people may have different perspectives on provenance,
- and as a result different types of information might be captured in a provenance record.
+ of provenance. Different people may have different perspectives on provenance,
+ and as a result different types of information might be captured in provenance records.
One perspective might focus on <i>agent-centered provenance</i>, that is, what entities
were involved in generating or manipulating the information in question. For example,
in the provenance of a picture in a news article we might capture the photographer who
took it, the person that edited it, and the newspaper that published it. A second perspective
might focus on <i>object-centered provenance</i>, by tracing the origins of portions of a
document to other documents. An example is having a web page that was assembled from content
- from a news article, quotes of interviews with experts, and a graph that plots data from a
+ from a news article, quotes of interviews with experts, and a chart that plots data from a
government agency. A third perspective one might take is on <i>process-centered provenance</i>,
capturing the actions and steps taken to generate the information in question. For example, a
- graph may have been generated by invoking a service to retrieve data from a database, and then
- extracting certain statistics from the data using some statistics package.
+ chart may have been generated by invoking a service to retrieve data from a database, then
+ extracting certain statistics from the data using some statistics package, and finally
+ processing these results with a graphing tool.
</p>
<p>
@@ -147,16 +148,23 @@
</p>
<p>
- A comprehensive overview of requirements, use cases, prior research, and proposed
+ For general background on provenance,
+ comprehensive overviews of requirements, use cases, prior research, and proposed
vocabularies for provenance are available from the
<a href="http://www.w3.org/2005/Incubator/prov/XGR-prov/">Final Report of the W3C Provenance Incubator Group</a>.
- The document contains three general scenarios
+ That document contains three general scenarios
that may help identify the provenance aspects of your planned applications and
help plan the design of your provenance system.
</p>
<p>
+ The next section gives an introductory overview of PROV-DM using simple examples.
+ The following section shows how the formal ontology PROV-O can be used to represent the PROV-DM assertions
+ as RDF triples. The document also contains frequently asked questions, and an appendix giving example
+ snippets of the PROV-DM Abstract Syntax Notation (ASN).
For a detailed description of PROV-DM, please refer to the
- <a href="http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html">PROV Data Model and Abstract Syntax Notation Document</a>.
+ <a href="http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html">PROV Data Model and Abstract Syntax Notation document</a>.
+ For a detailed description of PROV-O, refer to the
+ <a href="http://dvcs.w3.org/hg/prov/raw-file/default/ontology/ProvenanceFormalModel.html">PROV Ontology Model and Formal Semantics document</a>.
</p>
</section>
@@ -165,22 +173,36 @@
<p><i>This section provides an intuitive explanation of the concepts in PROV-DM.
As with the rest of this document, it should be treated as a starting point for
- understanding the model, and not normative in itself. The model specification
- provides the precise definitions and constraints to be followed in using PROV-DM.</i></p>
+ understanding the model, and not normative in itself. The PROV-DM model specification
+ provides precise definitions and constraints to be used.</i></p>
+
+<p>
+The following ER diagram provides a high-level overview of the <strong>structure of PROV-DM records</strong>.
+The diagram is the same one that appears in the
+<a href="http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html">PROV Data Model and Abstract Syntax Notation document</a>,
+but note that this primer document only describes some of the terms shown in the diagram.
+</p>
+
+<div style="text-align: center;">
+ <img src="overview.png" alt="PROV-DM overview"/>
+</div>
<section>
<h3>Entities</h3>
<p>
- In PROV-DM, the things that you may ask the provenance of are called <i>entities</i>.
- An entity’s provenance may refer to many other entities. For example, a document D is
- an entity whose provenance refers to other entities such as a graph inserted into D,
- the dataset that was used to create that graph, or the author of the document.
+ In PROV-DM, the things that one may ask the provenance of are called <i>entities</i>.
+ Examples of such entities are a web page, a chart, and a spellchecker.
</p>
<p>
- Entities are described and identified by their attributes, may be more
- or less specific, and may be described from different perspectives. For example,
- document D, the second version of document D, and document D as stored on my file system,
+ An entity’s provenance may refer to many other entities. For example, a document D is
+ an entity whose provenance refers to other entities such as a chart inserted into D,
+ the dataset that was used to create that chart, or the author of the document.
+ </p>
+ <p>
+ Entities may be described from different perspectives that may be more or less specific. For example,
+ document D as stored in my file system, the second version of document D after someone edited it,
+ and D as an evolving document,
are three distinct entities for which we may describe the provenance. They
may all be perspectives on the same thing in the world (document D may exist only
in its second version and on my file system), but are <i>characterized</i> in
@@ -202,14 +224,16 @@
<h3>Activities</h3>
<p>
- While entities are static aspects in the world (things), <i>activities</i> are
- dynamic aspects (actions, processes, etc.)
- An activity is something that has either occurred or is still
- taking place. Most importantly, activities are how entities come into
- existence, often making use of previously existing entities to achieve this.
+ Activities are how entities come into
+ existence and how their attributes change to become new entities,
+ often making use of previously existing entities to achieve this.
For example, if the second version of document D was generated
by a translation from the first version of the document in another language,
then this translation is an activity.
+ An activity may have either already occurred or be still
+ taking place when a new entity is generated.
+ While entities are static aspects in the world (things), <i>activities</i> are
+ dynamic aspects (actions, processes, etc.).
</p>
</section>
@@ -217,12 +241,17 @@
<h3>Use and Generation</h3>
<p>
- Every entity is created by an activity, which is called the <i>generation</i> of the entity.
+ Activities <i>generate</i> new entities.
For example, writing a document brings the document into existence, while
revising the document brings a new version into existence.
+ </p>
+ <p>
Activities also make <i>use</i> of entities. For example, revising a document
to fix spelling mistakes uses the original version of the document as well
- as a list of corrections. In PROV-DM, assertions can be made to state that
+ as a list of corrections.
+ </p>
+ <p>
+ Assertions can be made in a provenance record to state that
particular activities used or generated particular entities.
</p>
</section>
@@ -234,176 +263,76 @@
An agent is a type of entity that takes an active role in an activity such
that it can be assigned some degree of responsibility for the activity taking
place. An agent can be a person, a piece of software, or an inanimate object.
- In PROV-DM, agents are a kind of entity, and it is therefore possible to
- associate provenance with agents. Consider a graph displaying some statistics
+ Several agents can be associated with an activity.
+ Consider a chart displaying some statistics
regarding crime rates over time in a linear regression. To represent the
- provenance of a that graph, we could state that the person who created the
- graph was an agent involved in its creation, and that the software used to
- create the graph was also an agent involved in that activity. We
- can also represent the provenance of that software and the agents involved in
- that, such as the vendor of that software.
+ provenance of that chart, we could state that the person who created the
+ chart was an agent involved in its creation, and that the software used to
+ create the chart was also an agent involved in that activity.
+ </p>
+ <p>
+ Since agents are a kind of entity, it is possible to
+ associate provenance records with the agents themselves.
+ In the running example, we
+ can also represent the provenance of the software used to create the chart, and specify the agents involved in
+ producing that software, such as the vendor.
</p>
</section>
- <!--section>
- <h3>Accounts</h3>
-
- <p>An intuitive overview of how to think about accounts in PROV-DM.</p>
- </section -->
-
<section>
<h3>Roles</h3>
<p>
- A role is a description of the function or part an entity
- played in an activity. In PROV-DM data, roles are qualifying, application-specific,
- information about the relationship between an entity and an activity, whether
- that is how an activity used an entity, generated an entity, or was controlled by an agent.
+ A <i>role</i> is a description of the function or the part that an entity
+ played in an activity. Roles specify
+ the relationship between an entity and an activity, that is,
+ how an activity used or generated an entity. Roles also specify how agents are
+ involved in an activity, qualifying their participation in the activity or
+ specifying which agents controlled it.
For example, an agent may play the role of "editor" in an activity that uses
- one entity in the role of "document to be edited" and another in the role of "edits
- to be made", to generate a further entity in the role of "edited document".
+ one entity in the role of "document to be edited" and another in the role of
+ "addition to be made to the document", to generate a further entity in the role of "edited document".
+ Roles are application specific.
</p>
<!--p>Roles are intended as an extension point in the model; it is expected users will define and use custom role taxonomies. Role interpretation is application specific.</p -->
</section>
<section>
- <h3>Revisions</h3>
+ <h3>Revisions and Derivation</h3>
<p>
- A single resource, such as a document, may go through multiple revisions
- (also called versions and other comparable terms) over time. Between revisions,
- several changes may have taken place to the resource, each possibly controlled
- by different agents. The result of each revision is, in PROV-DM terms, an entity,
+ A given entity, such as a document, may go through multiple <i>revisions</i>
+ (also called versions and other comparable terms) over time. Between revisions,
+ one or more attributes of the entity may change.
+ The result of each revision is a new entity,
and PROV-DM allows one to relate those entities by making an assertion that
one is a revision of another.
</p>
+ <p>
+ When one entity's existence, content, characteristics and so on are
+ at least partly due to another entity, then we say that the former is
+ <i>derived</i> from the latter. For example, one document may contain
+ material copied from another,
+ and a chart is derived from the data that is used to create it.
+ </p>
+ <p>
+ There are different kinds of derivation expressible in PROV-DM. For
+ example, the data may be normalized before creating the chart.
+ In PROV-DM terms, we say that the chart <i>was derived from</i>
+ the normalized data and <i>was eventually derived from</i> the original data.
+ </p>
</section>
- <section>
- <h3>Complementarity</h3>
- <p>
- As described above, entities can be described from different perspectives,
- by being characterized by different attributes. For example, "document D",
- "the second version of document D" and "document D as stored on my filesystem"
- are different entities
- because they are characterized in different ways. However, for some period of time
- they may all refer to the same thing in the world, e.g. for a while the copy of
- D on my filesystem <i>was</i> the second version.
- </p>
- <p>
- In PROV-DM, we say there is <i>complementarity</i> between one entity and another
- if, in some period of time, they have the same or compatible characterization.
- So, both "the second version of document D" and "document D as stored on my filesystem"
- are complements of "document D", because they are both characterized by being
- document D, but with specific additional attributes.
- If, at some point in time, a version of D stored on my filesystem is the second one, then
- "document D as stored on my filesystem" and "the second version of document D" are
- complements of each other.
- </p>
- <!-- p>
- Several asserted entities could be characterizing the same thing, in
- particular when entities are asserted by different <em>accounts</em> or over
- different time periods. If two such entities have <em>overlapping
- lifespans</em>, and the first entity have some <em>attributes</em> that
- have not been asserted (and not necessarily always true) for the second entity,
- then the first entity is said to be <em>complementing</em> the second
- entity, that is the first entity helps form a more detailed
- description of the second entity, at least for the duration of the
- overlapping lifespan.
- </p>
- <p>
- In addition, if <code>:A prov:wasComplementOf :B</code>, then of all the
- attributes of the entity <code>:A</code> which can be <em>mapped</em> to
- <em>compatible</em> attributes of <code>:B</code> MUST be <em>matching</em>
- for the continuous duration of the overlap of <code>:A</code> and
- <code>:B</code>'s lifespans.
- It is out of scope for PROV to specify or assert the nature of
- the <em>compatibility mapping</em> and <em>matching</em>, the exact
- interpretation of these is left to the asserter of
- <code>wasComplementOf</code>
- </p>
- <p>
- If <code>:B</code> also have some attributes which
- are not asserted (or not always true) about <code>:A</code>,
- then this MAY be asserted using the
- inverse relation <code>:B prov:wasComplementOf :A</code>. If two entities
- both complement each other in this manner, both MUST have some
- attributes the other does not have, although those attributes MAY
- not have been asserted in the provenance. Note that the
- <em>lack</em> of such an inverse assertion does not neccessarily
- mean that <code>:B</code> did not have any additional attributes
- for <code>:A</code> in the timespan, only that this has not
- been asserted.
- </p>
- <p>
- In the simplest case, both entites are described using the same
- attributes, in which case <em>matching</em> means the values SHOULD
- literally be the same (matching by identity). On the other hand an
- attribute like <code>ex1:speed_in_mph</code> can be <em>mapped</em> to
- a compatible <code>ex2:speed_in_kmh</code> attribute. Not all
- attributes might be mappable in both directions, for instance
- <code>ex1:city</code> to <code>ex2:country</code>, but not vice
- versa.
- </p>
- <p>
- Note that it is out of scope for PROV to assert or explain any
- mapping of compatible attributes. This is merely a conclusion
- that can be drawn from the assertion that the two entities both
- described the same thing in the overlapping time spans. Also note
- that asserting a complementary relationship does not detail how the
- two entity timespans overlap, this could be anything from
- complete one-to-one match (where all attributes are always true for
- both entities) to merely touching overlaps.
- </p -->
- </section>
-
- <section>
- <h3>Derivation</h3>
-
- <p>
- When one entity's existence, content, characteristics and so on are
- at least partly due to another entity, then we say that the former is
- derived from the latter. For example, one document may contain
- material copied from another, a child is derived from his/her
- ancestors, and a page displayed in a browser is derived from the same
- page on the web server from which it was downloaded, as well as from
- the designer's original sketches of what the page would look like.
- </p>
- <p>
- There are different kinds of derivation expressible in PROV-DM.
- Consider the case of the page in the browser above. It is derived from
- the designer's sketch in the strictest sense, i.e. if the sketch had
- been different so would the page. On the other hand, there are
- entities that are part of the page's history but which did not inform
- the content of that page, i.e. the page would have been the same even
- if the earlier entity changed. For example, on creating the original
- draft of the page, the designer may have included a banner image
- saying "DRAFT - FOR REVIEW ONLY". This banner was not part of the
- sketch, nor part of the published page downloaded to the browser, but
- was part of the page's history, and while not affecting the browsed
- page's content may have been a factor in its existence. Finally, in
- some cases, we may be able to say not only that one entity was derived
- from another, but also how it was derived, i.e. by what activity.
- For example, the page in the browser is derived from the
- page on the web server because a download activity sent the bytes of
- the latter across an HTTP connection to the browser client.
- </p>
- <p>
- In PROV-DM terms, we say that the page in the browser <i>was eventually
- derived from</i> the sketch, <i>depended on</i> the banner image, and <i>was derived
- from</i> the page on the web server due to the download activity.
- </p>
- </section>
</section>
<section>
- <h2>Worked Examples</h2>
+ <h2>Examples of Use of the PROV-O Ontology</h2>
<p>In the following sections, we show how PROV-DM can be used to model
provenance in specific examples.</p>
- <p>We include examples of how the formal ontology
+ <p>We include examples of how the formal ontology PROV-O
can be used to represent the PROV-DM assertions as RDF triples.
These are shown using the Turtle notation. In
the latter depictions, the namespace prefix <b>prov</b> denotes
@@ -411,33 +340,36 @@
denote terms specific to the example.</p>
<p>We also provide a representation of the examples in the Abstract
- Syntax Model used in the conceptual model document. The full ASM data is
+ Syntax Model (ASM) used in the conceptual model document. The full ASM data
+ for the examples in this section is
included in the appendix.</p>
<section>
<h3>Entities</h3>
<p>
- An online newspaper publishes an article making using of data (GovData) provided through a government portal, in England.
- The article includes a chart based on GovData, with data values aggregated by
- regions of the country.
+ An online newspaper publishes an article with a chart about crime statistics, making use of data (GovData) provided through a government portal.
+ The article includes a chart based on the data, with data values aggregated by
+ geographical regions.
</p>
<p>
- A blogger, Betty, looking at the chart, spots what she thinks to be an error.
- Betty retrieves the provenance of the chart, to determine from where the facts presented derive.
+ A blogger, Betty, looking at the article, spots what she thinks to be an error in the chart.
+ Betty retrieves the provenance record of the article, which describes how it was created.
</p>
- <p>The Prov data includes the assertions:</p>
+ <p>Betty would find the following assertions about entities in the provenance record:</p>
<pre class="turtle example">
- ex1:dataSet1 a prov:Entity .
+ ex1:newspaper1 a prov:Entity .
+ ex1:article1 a prov:Entity .
ex1:regionList1 a prov:Entity .
ex1:aggregate1 a prov:Entity .
ex1:chart1 a prov:Entity .
</pre>
<p>
- These statements, in order, assert that the original data set is an entity (<code>ex1:dataSet1</code>),
- the list of regions
- (<code>ex1:regionList1</code>) is an entity, the data aggregated by region is an entity (<code>ex1:aggregate1</code>),
- and the chart (<code>ex1:chart1</code>) is an entity.
+ These statements, in order, assert that there is a newspaper (<code>ex1:newspaper1</code>) and an article (<code>ex1:article1</code>),
+ that the original data set is an entity (<code>ex1:dataSet1</code>),
+ that there is a list of regions
+ (<code>ex1:regionList1</code>) that is an entity, that the data aggregated by region is an entity (<code>ex1:aggregate1</code>),
+ and that the chart (<code>ex1:chart1</code>) is an entity.
</p>
</section>
@@ -446,7 +378,7 @@
<h3>Activities</h3>
<p>
- Further, the Prov data asserts that there was
+ Further, the provenance record asserts that there was
an activity (<code>ex1:compiled</code>) denoting the compilation of the
chart from the data set.
</p>
@@ -454,8 +386,8 @@
ex1:compiled a prov:Activity .
</pre>
<p>
- The provenance also includes reference to the steps involved in compilation,
- aggregating the data by region and generating the chart graphic.
+ The provenance record also includes reference to the more specific steps involved in this compilation,
+ which are first aggregating the data by region and then generating the chart graphic.
</p>
<pre class="turtle example">
ex1:aggregated a prov:Activity .
@@ -467,13 +399,13 @@
<h3>Use and Generation</h3>
<p>
- Finally, the Prov data asserts the key events that connected the above
+ Finally, the provenance record asserts the key relations among the above
entities and activities, i.e. the use of an entity by an activity,
or the generation of an entity by an activity.
</p>
<p>
- For example, the data below states that the aggregation activity
- (<code>ex1:aggregated</code>) used the data set, that it used the list of
+ For example, the assertions below state that the aggregation activity
+ (<code>ex1:aggregated</code>) used the original data set, that it used the list of
regions, and that the aggregated data was generated by this activity.
</p>
<pre class="turtle example">
@@ -490,7 +422,7 @@
ex1:chart1 prov:wasGeneratedBy ex1:illustrated .
</pre>
- <!-- p>
+ <!--p>
For example, the provenance declares the event (of type <code>prov:Usage</code>)
where the aggregation activity used the GovData data set, and the event
(of type <code>prov:Generation</code>) where the same activity generated
@@ -521,12 +453,13 @@
ex1:illustrated prov:qualifiedGeneration ex1:chart1Generation .
ex1:aggregate1Usage prov:entity ex1:aggregate1 .
ex1:chart1Generation prov:entity ex1:chart1 .
- </pre -->
+ </pre>
<p>
From this information Betty can see that
the mistake could have been in the original data set or else was introduced
in the compilation activity, and sets out to discover which.
</p>
+ </p -->
</section>
@@ -534,18 +467,18 @@
<h3>Agents</h3>
<p>
- Digging deeper, Betty wants to know who compiled the chart. Betty sees
- that both the aggregation and chart creation activities were controlled
- by the Derek.
+ Digging deeper, Betty wants to know who compiled the chart.
+ Betty sees that Derek was involved in both the aggregation and
+ chart creation activities:
</p>
<pre class="turtle example">
- ex1:aggregated prov:wasControlledBy ex1:derek .
- ex1:illustrated prov:wasControlledBy ex1:derek .
+ ex1:aggregated prov:wasAssociatedWith ex1:derek .
+ ex1:illustrated prov:wasAssociatedWith ex1:derek .
</pre>
<p>
The record for Derek provides the
following information, of which the first line is a PROV-O statement that
- Derek is a (PROV-DM) agent.
+ Derek is an agent, followed by statements about general properties of Derek.
</p>
<pre class="turtle example">
ex1:derek a prov:Agent ;
@@ -555,16 +488,6 @@
</pre>
</section>
- <!-- section>
- <h3>Accounts</h3>
-
- <p><i>Suggested example:</i> The analyst provides his own record of how he compiled GovData to create
- the chart, which provides more detail than in the newspaper's provenance data.
- Specifically, the analysts account separates compilation into two stages: aggregating
- data by region and then producing the graphic. Therefore, there are two separate
- accounts of the same events.</p>
- </section -->
-
<section>
<h3>Roles</h3>
@@ -579,7 +502,7 @@
should be aggregated.
</p>
<p>
- The above information is described as roles in the provenance data. The aggregation
+ The above information is described as roles in the provenance records. The aggregation
activity involved entities in four roles: the data to be aggregated (<code>ex1:dataToAggregate</code>),
the regions to aggregate by (<code>ex1:regionsToAggregateBy</code>), the
resulting aggregated data (<code>ex1:aggregatedData</code>), and the
@@ -594,10 +517,10 @@
<p>
In addition to the simple facts that the aggregation activity used, generated or
was controlled by entities/agents as described in the sections above, the
- provenance data contains more details of <i>how</i> these entities and agents
- were involved, i.e. the roles they played. For example, the data below states
+ provenance record contains more details of <i>how</i> these entities and agents
+ were involved, i.e. the roles they played. For example, the assertions below state
that the aggregation activity (<code>ex1:aggregated</code>) included the usage
- of the GovData data set (<code>ex1:dataSet1</code>) in the role of the data
+ of the government data set (<code>ex1:dataSet1</code>) in the role of the data
to be aggregated (<code>ex1:dataToAggregate</code>).
</p>
<pre class="turtle example">
@@ -634,213 +557,43 @@
</section>
<section>
- <h3>Revision</h3>
+ <h3>Revision and Derivation</h3>
<p>
After looking at the detail of the compilation activity, there appears
- to be nothing wrong, so Betty concludes the error is in GovData. She contacts
+ to be nothing wrong, so Betty concludes the error is in the government dataset.
+ She looks at the characterization of the dataset <code>ex1:dataSet1</code>,
+ and sees that it is missing data from one of the zipcodes in the area. She contacts
the government, and a new version of GovData is created, declared to be the
- next revision of the data by Edith. The provenance data now includes a statement
- that the new data set, <code>ex1:dataSet2</code> is a new revision of the
+ next revision of the data by Edith. The provenance record of this new dataset,
+ <code>ex1:dataSet2</code>, states that it is a revision of the
old data set, <code>ex1:dataSet1</code>.
</p>
<pre class="turtle example">
ex1:dataSet2 prov:wasRevisionOf ex1:dataSet1 .
</pre>
- </section>
-
- <section>
- <h3>Complementarity</h3>
-
- <p>Betty lets Derek know that a new revision of the data set exists,
- and he looks at the provenance of the new data to understand what he needs to
- re-analyze. </p>
- <p>In addition to specifying that
- <code>ex1:dataSet2</code> is a new revision of
- <code>ex1:dataSet1</code>, the provenance from DataGov also
- asserts that both of these entities were a <em>complement of</em>
- another entity <code>ex1:dataSet</code>.
- </p>
- <pre class="turtle example">
- ex1:dataSet1 prov:wasComplementOf ex1:dataSet .
- ex1:dataSet2 prov:wasComplementOf ex1:dataSet .
- </pre>
- <!--
- <pre class="asn example">
- wasComplementOf(ex1:dataSet1, ex1:dataSet)
- wasComplementOf(ex1:dataSet2, ex1:dataSet)
- </pre>
- -->
- <p>
- This assertion means that <code>ex1:dataSet1</code> at some point shared
- its characterizing attributes with <code>ex1:dataSet</code>, and the same for
- <code>ex2:dataSet2</code>. Thus the <em>entity</em>
- <code>ex1:dataSet1</code> did at some point represent the same
- thing as characterized by the entity <code>ex1:dataSet</code>. The same is
- true for <code>ex1:dataSet2</code>, though not necessarily at the
- same point in time.
- </p>
- <!-- p>
- The term <em>was complement of</em> here means that the
- <code>ex1:dataSet1</code>
- provide additional details that adds to the details of
- <code>ex1:dataSet</code> (complementing it), and that both of these
- entities represented the same thing.
- Characterizing attributes of <code>ex1:dataSet</code> are from this
- asserted to have been <em>compatible</em> with the properties of
- <code>ex1:dataSet1</code> and <code>ex1:dataSet2</code>.
- <em>Compatible</em> here means that some kind of mapping can be
- established between the attributes, they don't neccessarily have to
- match directly.
- </p -->
- <p>
- Derek then looks at the characterization of the generalized data set
- (<code>ex1:dataSet</code>) to find the attributes shared with the first
- and second versions of the data set. The assertions below give the generalized
- data set's attributes: it is of type <code>ex1:DataSet</code>, it covers
- three named regions, it was created by <code>ex1:DataGov</code>, and
- has a given title.
- </p>
- <pre class="example turtle">
- ex1:dataSet a ex1:DataSet ;
- ex1:regions ( ex1:North, ex1:NorthWest, ex1:East ) ;
- dc:creator ex1:DataGov ;
- dc:title "Regional incidence dataset 2011" .
- </pre>
- <p>
- As <code>ex1:dataSet1</code> and <code>ex1:dataSet2</code> complement
- <code>ex1:dataSet</code>,
- Derek can deduce from the above attributes that both the former had
- these same attributes at some point, i.e.
- the creator <code>ex1:DataGov</code> and so on. Derek compares the above
- assertions to the
- attributes of <code>ex1:dataSet1</code>.
- </p>
- <pre class="example turtle">
- ex1:dataSet1 a ex1:DataSet ;
- ex1:postCodes ( "N1", "N2", "NW1", "E1", "E2" ) ;
- ex1:totalIncidents 141 ;
- dc:creator ex1:DataGov ;
- dc:title "Regional incidence dataset 2011" .
- </pre>
<p>
- Shared characterizing attributes are not necessarily represented in
- the serialized assertions of different entities. For example, the creator
- and title are exactly the same for <code>ex1:dataSet</code> and <code>ex1:dataSet1</code>,
- but the regions covered by the data set are described in a different way:
- "regions" for <code>ex1:dataSet</code> and "postCodes" for <code>ex1:dataSet1</code>.
- Whether these are equivalent is a domain-specific judgment.
- We can also see that, while <code>ex1:dataSet1</code> complements <code>ex1:dataSet</code>,
- the inverse is not true. <code>ex1:dataSet1</code> is more specific, because
- it has a "totalIncidents" attribute specific to that version of the data set.
- </p>
- <!-- p>
- Derek sees that the creator and title are directly mappable and
- equal between these entities. He also knows (from his region
- aggregation method) that the <code>ex1:postCodes</code> <code>N1</code> and
- <code>N2</code> are in the
- region <code>ex1:North</code>, and so on, and can confirm that although
- this regional characterisation of the data is not expressed
- using the same attributes in the two entities, they are <em>compatible</em>.
- </p>
- <p>Derek notes that <code>ex1:totalIncidents</code> is not stated
- for <code>ex1:dataSet</code>, and not mappable to any of the
- other existing attributes. Thus this could be one of the
- complementing attributes that makes <code>ex1:dataSet1</code>
- more specific than <code>ex1:dataSet</code>.
-
- Derek can from the assertion <code>ex1:dataSet1
- prov:wasComplementOf ex1:dataSet</code>
- see that <code>ex1:dataSet</code>
- did have 141 incidents when its characterization interval
- overlapped that of <code>ex1:dataSet1</code>, but not neccessarily
- throughout its lifetime. Note that in this example the provenance
- assertions are not providing any direct description of the
- characterization interval of the entities.
- </p>
- <p>
- Due to the open world assumption (more
- information might be added later) he can not conclude
- from this alone that <code>ex1:dataSet</code> at any point did
- <strong>not</strong> have 141 incidents. He therefore does not know
- for sure that <code>ex1:totalIncidents</code> is a complementing
- attribute which <code>ex1:dataSet</code> does not have in its
- characterisation.
- </p>
- <p>
- Derek finally compares the newer revision
- <code>ex1:dataSet2</code> with
- <code>ex1:dataSet</code>:
- </p>
- <pre class="example turtle">
- ex1:dataSet2 a ex1:DataSet ;
- ex1:postCodes ( "N1", "N2", "NW1", "NW2", "E1", "E2" ) ;
- ex1:totalIncidents 158 ;
- dc:creator ex1:DataGov ;
- dc:title "Regional incidence dataset 2011" .
- </pre>
- <p>
- In this revision, the new postcode <kbd>NW2</kbd> appears, this is still
- <em>compatible</em> with the region <code>ex1:NorthWest</code>
- of <code>ex1:dataSet</code>
- On the other hand, the attribute <code>prov:totalIncidents</code> have gone up to 158.
- </p>
- <p>
- From the <code>prov:wasComplementOf</code> assertion Derek knows that
- <code>ex1:dataSet2</code> also provides additional attributes for
- <code>ex1:dataSet</code>, but because the total incidents can't
- both be 141 and 158, the attribute <code>ex1:totalIncidents</code>
- is a complementing attribute, and changes over the
- characterisation interval (lifespan) of <code>ex1:dataSet</code>,
- and is thus not one of its characterising attributes. He also now
- knows that <code>ex1:dataSet</code> is a common characterisation
- of the dataset that spans (parts of) both revisions. It has
- however not been asserted explicitly that the
- <code>ex1:dataSet</code> is a somewhat more general
- characterisation, just that it allows mutability on the
- <code>prov:totalIncidents</code> attribute and overlapped (parts
- of) the timespans of the two revisions.
- </p>
- <p>
- From this Derek concludes that he can still use the regions North,
- North West and East in the diagram layout, but as the
- <code>ex1:totalIncidents</code> differ, something in the
- raw data has changed. He can't from this provenance assertion
- alone tell if that is merely from the addition of the post code
- NW2, or if data for the other post codes have changed as well.
- Derek decides to redo the aggregation by region using
- <code>ex1:dataSet2</code> and regenerate the
- graphics using the same layout.
- </p -->
- </section>
-
- <section>
- <h3>Derivation</h3>
-
- <p>
- Derek creates a new chart based on the revised data,
+ Derek notices that there is a new dataset available and creates a new chart based on the revised data,
using the same compilation activity as before. Betty checks the article again at a
later point, and wants to know if it is based on the old or new GovData.
- She sees three new assertions about derivation in the provenance data, plus
+ She sees two new assertions about derivation in the provenance data, plus
an assertion about how the new chart was generated.
</p>
<pre class="example turtle">
- ex1:chart2 prov:dependedOn ex1:dataSet2 .
ex1:chart2 prov:wasEventuallyDerivedFrom ex1:dataSet2 .
ex1:chart2 prov:wasDerivedFrom ex1:dataSet2 .
ex1:chart2 prov:wasGeneratedBy ex1:compiled2 .
</pre>
<p>
- She interprets these assertions as follows. The first says that the new chart included,
- somewhere in the history of its creation, the revised data set.
- The second says further that the new chart is as it because of the revised
+ She interprets these assertions as follows. The first says that the new chart
+ is as it is because of the revised
data set, i.e. there is an explicit influence of the data on the chart.
- Finally, the third and fourth assertions together say further that it was
+ Finally, the second and third assertions together say further that it was
the activity <code>ex1:compiled2</code> that derived the new chart
from the revised data set.
</p>
</section>
- </section>
+
<section>
<h2>Frequently asked questions</h2>
@@ -865,10 +618,11 @@
<section>
<h3>Activities</h3>
<pre class="example asn">
- activity(ex1:compiled,compilation_step).
+ activity(ex1:compiled).
activity(ex1:aggregated).
activity(ex1:illustrated).
</pre>
+ <!--
<p>
In the first assertion above, 'compilation_step' is an optional reference to the 'recipe' that describes
what the 'compiled' activity did. The interpretation of its name,
@@ -877,6 +631,7 @@
<p>
In the second assertion, optional 'recipe' has been omitted.
</p>
+ -->
<!--PM comment: here readers will be confused by the processExecutiion / activity disconnect!
also this does not show start/end times, optional attributes. At least one example would be useful-->
</section>
@@ -922,31 +677,12 @@
</section>
<section>
- <h3>Revision</h3>
+ <h3>Revision and Derivation</h3>
<pre class="example asn">
wasRevisionOf(ex1:dataSet2, ex1:dataSet1).
</pre>
- </section>
-
- <section>
- <h3>Complementarity</h3>
- <pre class="example asn">
- entity(ex1:dataSet, [ type="ex1:DataSet", ex1:regions ="(ex1:North, ex1:NorthWest, ex1:East)",
- dc:creator="ex1:DataGov", dc:title="Regional incidence dataset 2011" ]).
- wasComplementOf(dataSet1, dataSet).
- wasComplementOf(dataSet2, dataSet).
-
- entity(ex1:dataSet1, [ type="ex1:DataSet", ex1:postCodes="( 'N1', 'N2', 'NW1', 'E1', 'E2' ) ",
- ex1:totalIncidents = "141", dc:creator = " ex1:DataGov",
- dc:title = "Regional incidence dataset 2011" ]).
- </pre>
- </section>
-
- <section>
- <h3>Derivation</h3>
<pre class="example asn">
- dependedOn(ex1:chart2, ex1:dataSet2).
wasEventuallyDerivedFrom(ex1:chart2, ex1:dataSet2).
wasDerivedFrom(ex1:chart2, ex1:dataSet2).
wasGeneratedBy(ex1:chart2, ex1:compiled2).
Binary file primer/overview.png has changed