--- a/primer/Primer.html Thu Feb 23 14:29:37 2012 +0000
+++ b/primer/Primer.html Thu Feb 23 14:50:21 2012 +0000
@@ -35,7 +35,8 @@
var respecConfig = {
// specification status (e.g. WD, LCWD, NOTE, etc.). If in doubt use ED.
- specStatus: "FPWD-NOTE",
+ //specStatus: "FPWD-NOTE",
+ specStatus: "ED",
// the specification's short name, as in http://www.w3.org/TR/short-name/
shortName: "prov-primer",
@@ -172,7 +173,7 @@
understanding how data was collected so it can be meaningfully used, determining
ownership and rights over an object, making judgments about information to
determine whether to trust it, verifying that the process and steps used to obtain a
- result complies with given requirements, and reproducing how something it was generated.
+ result complies with given requirements, and reproducing how something was generated.
</p>
<p>
@@ -256,7 +257,8 @@
Entities may be described from different perspectives that may be more or less specific. For example,
document D as stored in my file system, the second version of document D after someone edited it,
and D as an evolving document,
- are three distinct entities for which we may describe the provenance. They
+ are three distinct entities for which we may describe the provenance.
+ <!-- They
may all be perspectives on the same thing in the world (document D may exist only
in its second version and on my file system), but are <i>characterized</i> in
different ways by being described using different <i>attributes</i> (version, location, and
@@ -269,7 +271,7 @@
second version, and so assertions about who reviewed that entity apply only
to the document as it is in its second version. When the document becomes
the third version, a new entity exists (the third version of D) and the
- provenance assertions about who reviewed the second version do not apply.
+ provenance assertions about who reviewed the second version do not apply.-->
</p>
</section>
@@ -315,13 +317,16 @@
<p>
An agent is a type of entity that takes an active role in an activity such
that it can be assigned some degree of responsibility for the activity taking
- place. An agent can be a person, a piece of software, or an inanimate object.
+ place. An agent can be a person, a piece of software, an inanimate object, an organization, or
+ any other entity that may be ascribed responsibility.
Several agents can be associated with an activity.
Consider a chart displaying some statistics
regarding crime rates over time in a linear regression. To represent the
provenance of a that chart, we could state that the person who created the
chart was an agent involved in its creation, and that the software used to
- create the chart was also an agent involved in that activity.
+ create the chart was also an agent involved in that activity.
+ An agent may be acting on behalf of others, e.g. an employee on behalf of their
+ organization, and we can express such chains of responsibility in the provenance.
</p>
<p>
Since agents are a kind of entity, it is therefore possible to
@@ -368,12 +373,6 @@
material copied from another,
and a chart is derived from the data that is used to create it.
</p>
- <p>
- There are different kinds of derivation expressible in PROV-DM. For
- example, the data may be normalized before creating the chart.
- In PROV-DM terms, we say that the chart <i>was derived from</i>
- the normalized data and <i>was eventually derived from</i> the original data.
- </p>
</section>
@@ -382,15 +381,17 @@
<section>
<h2>Examples of Use of the PROV-O Ontology</h2>
- <p>In the following sections, we show how PROV-DM can be used to model
- provenance in specific examples.</p>
-
- <p>We include examples of how the formal ontology PROV-O
+ <p>
+ In the following sections, we show how PROV-DM can be used to model
+ provenance in a specific example.
+ </p>
+ <p>
+ We include samples of how the formal ontology PROV-O
can be used to represent the PROV-DM assertions as RDF triples.
These are shown using the Turtle notation. In
the latter depictions, the namespace prefix <b>prov</b> denotes
- terms from the Prov ontology, while <b>ex1</b>, <b>ex2</b>, etc.
- denote terms specific to the example.</p>
+ terms from the Prov ontology, while <b>ex</b> denotes terms specific to the example.
+ </p>
<p>We also provide a representation of the examples in the Abstract
Syntax Model ASM used in the conceptual model document. The full ASM data
@@ -411,18 +412,18 @@
</p>
<p>Betty would find the following assertions about entities in the provenance record:</p>
<pre class="turtle example">
- ex1:newspaper1 a prov:Entity .
- ex1:article1 a prov:Entity .
- ex1:regionList1 a prov:Entity .
- ex1:aggregate1 a prov:Entity .
- ex1:chart1 a prov:Entity .
+ ex:newspaper1 a prov:Entity .
+ ex:article1 a prov:Entity .
+ ex:regionList1 a prov:Entity .
+ ex:aggregate1 a prov:Entity .
+ ex:chart1 a prov:Entity .
</pre>
<p>
- These statements, in order, assert that there is a newspaper (<code>ex1:newspaper1</code>) and an article (<code>ex1:article1</code>),
- that the original data set is an entity (<code>ex1:dataSet1</code>),
+ These statements, in order, assert that there is a newspaper (<code>ex:newspaper1</code>) and an article (<code>ex:article1</code>),
+ that the original data set is an entity (<code>ex:dataSet1</code>),
there is a list of regions
- (<code>ex1:regionList1</code>) that is an entity, that the data aggregated by region is an entity (<code>ex1:aggregate1</code>),
- and that the chart (<code>ex1:chart1</code>) is an entity.
+ (<code>ex:regionList1</code>) that is an entity, that the data aggregated by region is an entity (<code>ex:aggregate1</code>),
+ and that the chart (<code>ex:chart1</code>) is an entity.
</p>
</section>
@@ -432,19 +433,19 @@
<p>
Further, the provenance record asserts that there was
- an activity (<code>ex1:compiled</code>) denoting the compilation of the
+ an activity (<code>ex:compiled</code>) denoting the compilation of the
chart from the data set.
</p>
<pre class="turtle example">
- ex1:compiled a prov:Activity .
+ ex:compiled a prov:Activity .
</pre>
<p>
The provenance record also includes reference to the more specific steps involved in this compilation,
which are first aggregating the data by region and then generating the chart graphic.
</p>
<pre class="turtle example">
- ex1:aggregated a prov:Activity .
- ex1:illustrated a prov:Activity .
+ ex:aggregated a prov:Activity .
+ ex:illustrated a prov:Activity .
</pre>
</section>
@@ -458,62 +459,22 @@
</p>
<p>
For example, the assertions below state that the aggregation activity
- (<code>ex1:aggregated</code>) used the original data set, that it used the list of
+ (<code>ex:aggregated</code>) used the original data set, that it used the list of
regions, and that the aggregated data was generated by this activity.
</p>
<pre class="turtle example">
- ex1:aggregated prov:used ex1:dataSet1 ;
- prov:used ex1:regionList1 .
- ex1:aggregate1 prov:wasGeneratedBy ex1:aggregated .
+ ex:aggregated prov:used ex:dataSet1 ;
+ prov:used ex:regionList1 .
+ ex:aggregate1 prov:wasGeneratedBy ex:aggregated .
</pre>
<p>
- Similarly, the chart graphic creation activity (<code>ex1:illustrated</code>)
+ Similarly, the chart graphic creation activity (<code>ex:illustrated</code>)
used the aggregated data, and the chart was generated by this activity.
</p>
<pre class="turtle example">
- ex1:illustrated prov:used ex1:aggregate1 .
- ex1:chart1 prov:wasGeneratedBy ex1:illustrated .
- </pre>
-
- <!--p>
- For example, the provenance declares the event (of type <code>prov:Usage</code>)
- where the aggregation activity used the GovData data set, and the event
- (of type <code>prov:Generation</code>) where the same activity generated
- the data aggregated by region.
- </p>
- <pre class="turtle example">
- ex1:dataSet1Usage a prov:Usage .
- ex1:aggregate1Generation a prov:Generation .
+ ex:illustrated prov:used ex:aggregate1 .
+ ex:chart1 prov:wasGeneratedBy ex:illustrated .
</pre>
- <p>
- To describe these events, the provenance says within which activity
- they occur and what entity is used or generated.
- </p>
- <pre class="turtle example">
- ex1:aggregated prov:qualifiedUsage ex1:dataSet1Usage .
- ex1:aggregated prov:qualifiedGeneration ex1:aggregate1Generation .
- ex1:dataSet1Usage prov:entity ex1:dataSet1 .
- ex1:aggregate1Generation prov:entity ex1:aggregate1 .
- </pre>
- <p>
- Comparable events are described for the activity of generating the chart image
- from the aggregated data.
- </p>
- <pre class="turtle example">
- ex1:aggregate1Usage a prov:Usage .
- ex1:chart1Generation a prov:Generation .
- ex1:illustrated prov:qualifiedUsage ex1:aggregate1Usage .
- ex1:illustrated prov:qualifiedGeneration ex1:chart1Generation .
- ex1:aggregate1Usage prov:entity ex1:aggregate1 .
- ex1:chart1Generation prov:entity ex1:chart1 .
- </pre>
- <p>
- From this information Betty can see that
- the mistake could have been in the original data set or else was introduced
- in the compilation activity, and sets out to discover which.
- </p>
- </p -->
-
</section>
<section>
@@ -525,19 +486,31 @@
chart creation activities:
</p>
<pre class="turtle example">
- ex1:aggregated prov:wasControlledBy ex1:derek .
- ex1:illustrated prov:wasControlledBy ex1:derek .
+ ex:aggregated prov:wasAssociatedWith ex:derek .
+ ex:illustrated prov:wasAssociatedWith ex:derek .
</pre>
<p>
The record for Derek provides the
- following information, of which the first line is a PROV-O statement that
- Derek is an agent, followed by statements about general properties of Derek.
+ following information, of which the first lines are PROV-O statements that
+ Derek is an agent, specifically a person, followed by (non-PROV) statements
+ giving general properties of Derek.
</p>
<pre class="turtle example">
- ex1:derek a prov:Agent ;
- a foaf:Person ;
- foaf:givenName "Derek"^^xsd:string ;
- foaf:mbox <mailto:dererk@example.org> .
+ ex:derek a prov:Agent ;
+ a foaf:Person ;
+ foaf:givenName "Derek"^^xsd:string ;
+ foaf:mbox <mailto:derek@example.org> .
+ </pre>
+ <p>
+ Derek works as part of an organization, Chart Generators, and so the provenance
+ declares that he acts on their behalf. Note that the organization is itself
+ an agent.
+ </p>
+ <pre class="turtle example">
+ ex:derek prov:actedOnBehalfOf ex:chartgen .
+ ex:chartgen a prov:Agent ;
+ a prov:Organization ;
+ foaf:name "Chart Generators" .
</pre>
</section>
@@ -547,64 +520,64 @@
<p>
For Betty to understand where the error lies, she needs to have more detailed
information on how entities have been used in, participated in, and generated
- by activities. Betty has determined that <code>ex1:aggregated</code> used
- entities <code>ex1:regionList1</code> and <code>ex1:dataSet1</code>, but she does not
+ by activities. Betty has determined that <code>ex:aggregated</code> used
+ entities <code>ex:regionList1</code> and <code>ex:dataSet1</code>, but she does not
know what function these entities played in the processing. Betty
- also knows that <code>ex1:derek</code> controlled the activities, but she does
+ also knows that <code>ex:derek</code> controlled the activities, but she does
not know if Derek was the analyst responsible for determining how the data
should be aggregated.
</p>
<p>
The above information is described as roles in the provenance records. The aggregation
- activity involved entities in four roles: the data to be aggregated (<code>ex1:dataToAggregate</code>),
- the regions to aggregate by (<code>ex1:regionsToAggregateBy</code>), the
- resulting aggregated data (<code>ex1:aggregatedData</code>), and the
- analyst doing the aggregation (<code>ex1:analyst</code>).
+ activity involved entities in four roles: the data to be aggregated (<code>ex:dataToAggregate</code>),
+ the regions to aggregate by (<code>ex:regionsToAggregateBy</code>), the
+ resulting aggregated data (<code>ex:aggregatedData</code>), and the
+ analyst doing the aggregation (<code>ex:analyst</code>).
</p>
<pre class="turtle example">
- ex1:dataToAggregate a prov:Role .
- ex1:regionsToAggregateBy a prov:Role .
- ex1:aggregatedData a prov:Role .
- ex1:analyst a prov:Role .
+ ex:dataToAggregate a prov:Role .
+ ex:regionsToAggregateBy a prov:Role .
+ ex:aggregatedData a prov:Role .
+ ex:analyst a prov:Role .
</pre>
<p>
In addition to the simple facts that the aggregation activity used, generated or
was controlled by entities/agents as described in the sections above, the
provenance record contains more details of <i>how</i> these entities and agents
were involved, i.e. the roles they played. For example, the assertions below state
- that the aggregation activity (<code>ex1:aggregated</code>) included the usage
- of the government data set (<code>ex1:dataSet1</code>) in the role of the data
- to be aggregated (<code>ex1:dataToAggregate</code>).
+ that the aggregation activity (<code>ex:aggregated</code>) included the usage
+ of the government data set (<code>ex:dataSet1</code>) in the role of the data
+ to be aggregated (<code>ex:dataToAggregate</code>).
</p>
<pre class="turtle example">
- ex1:aggregated prov:hadQualifiedUsage [ a prov:Usage ;
- prov:hadQualifiedEntity ex1:dataSet1 ;
- prov:hadRole ex1:dataToAggregate ] .
+ ex:aggregated prov:involved [ a prov:Usage ;
+ prov:entity ex:dataSet1 ;
+ prov:hadRole ex:dataToAggregate ] .
</pre>
<p>
This can then be distinguished from the same activity's usage of the list of
regions because the roles played are different.
</p>
<pre class="turtle example">
- ex1:aggregated prov:hadQualifiedUsage [ a prov:Usage ;
- prov:hadQualifiedEntity ex1:regionList1 ;
- prov:hadRole ex1:regionsToAggregateBy ] .
+ ex:aggregated prov:involved [ a prov:Usage ;
+ prov:entity ex:regionList1 ;
+ prov:hadRole ex:regionsToAggregateBy ] .
</pre>
<p>
Similarly, the provenance includes assertions that the same activity was
- controlled in a particular way (<code>ex1:analyst</code>) by Derek, and that
- the entity <code>ex1:aggregate1</code> took the role of the aggregated
+ controlled in a particular way (<code>ex:analyst</code>) by Derek, and that
+ the entity <code>ex:aggregate1</code> took the role of the aggregated
data in what the activity generated.
</p>
<pre class="turtle example">
- ex1:aggregated
- prov:hadQualifiedControl [ a prov:Control ;
- prov:hadQualifiedEntity ex1:derek ;
- prov:hadRole ex1:analyst
+ ex:aggregated
+ prov:involved [ a prov:Association ;
+ prov:entity ex:derek ;
+ prov:hadRole ex:analyst
] ;
- prov:hadQualifiedGeneration [ a prov:Generation ;
- prov:hadQualifiedEntity ex1:aggregate1 ;
- prov:hadRole ex1:aggregatedData
+ prov:involved [ a prov:Generation ;
+ prov:entity ex:aggregate1 ;
+ prov:hadRole ex:aggregatedData
] .
</pre>
</section>
@@ -615,36 +588,25 @@
<p>
After looking at the detail of the compilation activity, there appears
to be nothing wrong, so Betty concludes the error is in the government dataset.
- She looks at the characterization of the dataset <code>ex1:dataSet1</code>,
+ She looks at the characterization of the dataset <code>ex:dataSet1</code>,
and sees that it is missing data from one of the zipcodes in the area. She contacts
the government, and a new version of GovData is created, declared to be the
next revision of the data by Edith. The provenance record of this new dataset,
- <code>ex1:dataSet2</code>, states that it is a revision of the
- old data set, <code>ex1:dataSet1</code>.
+ <code>ex:dataSet2</code>, states that it is a revision of the
+ old data set, <code>ex:dataSet1</code>.
</p>
<pre class="turtle example">
- ex1:dataSet2 prov:wasRevisionOf ex1:dataSet1 .
+ ex:dataSet2 prov:wasRevisionOf ex:dataSet1 .
</pre>
<p>
Derek notices that there is a new dataset available and creates a new chart based on the revised data,
using the same compilation activity as before. Betty checks the article again at a
later point, and wants to know if it is based on the old or new GovData.
- She sees two new assertions about derivation in the provenance data, plus
- an assertion about how the new chart was generated.
+ She sees a new assertion stating that the new chart is derived from the new dataset.
</p>
<pre class="example turtle">
- ex1:chart2 prov:wasEventuallyDerivedFrom ex1:dataSet2 .
- ex1:chart2 prov:wasDerivedFrom ex1:dataSet2 .
- ex1:chart2 prov:wasGeneratedBy ex1:compiled2 .
+ ex:chart2 prov:wasDerivedFrom ex:dataSet2 .
</pre>
- <p>
- She interprets these assertions as follows. The first says that the new chart
- is as it because of the revised
- data set, i.e. there is an explicit influence of the data on the chart.
- Finally, the third and fourth assertions together say further that it was
- the activity <code>ex1:compiled2</code> that derived the new chart
- from the revised data set.
- </p>
</section>
</section>
@@ -661,19 +623,19 @@
<section>
<h3>Entities</h3>
<pre class="example asn">
- entity(ex1:dataSet1).
- entity(ex1:regionList1).
- entity(ex1:aggregate1).
- entity(ex1:chart1).
+ entity(ex:dataSet1).
+ entity(ex:regionList1).
+ entity(ex:aggregate1).
+ entity(ex:chart1).
</pre>
</section>
<section>
<h3>Activities</h3>
<pre class="example asn">
- activity(ex1:compiled).
- activity(ex1:aggregated).
- activity(ex1:illustrated).
+ activity(ex:compiled).
+ activity(ex:aggregated).
+ activity(ex:illustrated).
</pre>
<!--
<p>
@@ -692,24 +654,24 @@
<section>
<h3>Use and Generation</h3>
<pre class="example asn">
- used(ex1:aggregated, ex1:dataSet1).
- used(ex1:aggregated, ex1:regionList1).
- wasGeneratedBy(ex1:aggregate1, ex1:aggregated).
+ used(ex:aggregated, ex:dataSet1).
+ used(ex:aggregated, ex:regionList1).
+ wasGeneratedBy(ex:aggregate1, ex:aggregated).
- used(ex1:illustrated, ex1:aggregate1).
- wasGeneratedBy(ex1:chart1, ex1:illustrated).
+ used(ex:illustrated, ex:aggregate1).
+ wasGeneratedBy(ex:chart1, ex:illustrated).
</pre>
</section>
<section>
<h3>Agents</h3>
<pre class="example asn">
- entity(ex1:derek, [ type="foaf:Person", foaf:givenName = "Derek",
+ entity(ex:derek, [ type="foaf:Person", foaf:givenName = "Derek",
foaf:mbox= "<mailto:derek@example.org>"]).
- agent(ex1:derek).
+ agent(ex:derek).
- wasControlledBy(ex1:aggregated, ex1:derek).
- wasControlledBy(ex1:illustrated, ex1:derek).
+ wasControlledBy(ex:aggregated, ex:derek).
+ wasControlledBy(ex:illustrated, ex:derek).
</pre>
</section>
@@ -720,8 +682,8 @@
relations. Thus, the entire Turtle example in sec. 3.5 is rendered as follows:
</p>
<pre class="example asn">
- used(ex1:aggregated, ex1:dataSet1, [ prov:role = "dataToAggregate"]).
- used(ex1:aggregated, ex1:regionList1, [ prov:role = "regionsToAggregteBy"]).
+ used(ex:aggregated, ex:dataSet1, [ prov:role = "dataToAggregate"]).
+ used(ex:aggregated, ex:regionList1, [ prov:role = "regionsToAggregateBy"]).
</pre>
<p>
In the first assertion above, note that this adds a "role" attribute to the first 'used' assertion of Ex. 3.
@@ -732,13 +694,13 @@
<section>
<h3>Revision and Derivation</h3>
<pre class="example asn">
- wasRevisionOf(ex1:dataSet2, ex1:dataSet1).
+ wasRevisionOf(ex:dataSet2, ex:dataSet1).
</pre>
<pre class="example asn">
- wasEventuallyDerivedFrom(ex1:chart2, ex1:dataSet2).
- wasDerivedFrom(ex1:chart2, ex1:dataSet2).
- wasGeneratedBy(ex1:chart2, ex1:compiled2).
+ wasEventuallyDerivedFrom(ex:chart2, ex:dataSet2).
+ wasDerivedFrom(ex:chart2, ex:dataSet2).
+ wasGeneratedBy(ex:chart2, ex:compiled2).
</pre>
</section>
</section>
@@ -750,4 +712,17 @@
</p>
</section>
+ <section class="appendix">
+ <h2>Changes Since First Public Working Draft</h2>
+ <ul>
+ <li>Removed details about "things" and attributes from intuition on entities.</li>
+ <li>Removed discussion of "eventually derived from" from intuition on derivation.</li>
+ <li>Removed wasEventuallyDerivedFrom from example, and simplified example of derivation.</li>
+ <li>Revised language and namespace prefix (ex1 changed to ex) to talk about a single worked example.</li>
+ <li>Updated wasControlledBy to wasAssociatedWith.</li>
+ <li>Changed (Qualified)Involvement classes and associated relations to match current ontology.</li>
+ <li>Added actedOnBehalfOf in intuition and example.</li>
+ </ul>
+ </section>
+
</body></html>