Added to the primer: PROV-O examples for attribution, plans, specialization and alternate
author Simon Miles <simon.miles@kcl.ac.uk>
Sat, 31 Mar 2012 18:07:51 +0100
changeset 2132 62fc7d26a0be
parent 2131 729472032d31
child 2133 cfcc7534ceaf
child 2140 05887cb4a756
primer/Primer.html
--- a/primer/Primer.html	Fri Mar 30 22:48:59 2012 -0400
+++ b/primer/Primer.html	Sat Mar 31 18:07:51 2012 +0100
@@ -457,21 +457,21 @@
     </p>
     <p>
      A blogger, Betty, looking at the article, spots what she thinks to be an error in the chart.
-     Betty retrieves the provenance record of the article, how it was created.
+     Betty retrieves a record of the provenance of the article, describing how it was created.
     </p>
-    <p>Betty would find the following descriptions about entities in the provenance record:</p>
+    <p>Betty finds the following descriptions of entities in the provenance:</p>
     <pre class="turtle example">
-     ex:article1     a prov:Entity ; dcterms:title "Crime rises in cities" .
+     ex:article      a prov:Entity ; dcterms:title "Crime rises in cities" .
      ex:dataset1     a prov:Entity .
-     ex:regionList1  a prov:Entity .
-     ex:composition1 a prov:Entity .
+     ex:regionList   a prov:Entity .
+     ex:composition  a prov:Entity .
      ex:chart1       a prov:Entity .
    </pre>
     <p>
-     These statements, in order, describe that there was an article (<code>ex:article1</code>),
+     These statements, in order, describe that there was an article (<code>ex:article</code>),
      an original data set (<code>ex:dataSet1</code>),
-     a list of regions (<code>ex:regionList1</code>), 
-     data aggregated by region (<code>ex:composition1</code>), 
+     a list of regions (<code>ex:regionList</code>), 
+     data aggregated by region (<code>ex:composition</code>), 
      and a chart (<code>ex:chart1</code>), and that each is an entity.
      Any entity may have attributes not specific to provenance, such as the title
      of the article, expressed using <code>dcterms:title</code> above.
@@ -515,15 +515,15 @@
     </p>
     <pre class="turtle example">
      ex:compose      prov:used           ex:dataSet1 ;
-                     prov:used           ex:regionList1 .
-     ex:composition1 prov:wasGeneratedBy ex:compose .
+                     prov:used           ex:regionList .
+     ex:composition  prov:wasGeneratedBy ex:compose .
     </pre>
     <p>
      Similarly, the chart graphic creation activity (<code>ex:illustrated</code>)
      used the composed data, and the chart was generated by this activity.
     </p>
     <pre class="turtle example">
-     ex:illustrate prov:used           ex:composition1 .
+     ex:illustrate prov:used           ex:composition .
      ex:chart1     prov:wasGeneratedBy ex:illustrate .
     </pre>
    </section>
@@ -550,7 +550,7 @@
      ex:derek a prov:Agent ;
               a prov:Person ;
               foaf:givenName "Derek"^^xsd:string ;
-              foaf:mbox      &lt;mailto:dererk@example.org&gt; .
+              foaf:mbox      &lt;mailto:derek@example.org&gt; .
     </pre>
     <p>
      Derek works as part of an organization, Chart Generators, and so the provenance
@@ -563,6 +563,13 @@
                  a prov:Organization ;
                  foaf:name "Chart Generators Inc" .
     </pre>
+    <p>
+     Finally, there is an explicit statement in the provenance that the chart was
+     attributed to Derek.
+    </p>
+    <pre class="turtle example">
+     ex:chart1 prov:wasAttributedTo ex:derek .
+    </pre>
    </section>
 
    <section>
@@ -570,16 +577,16 @@
 
     <p>
      For Betty to understand where the error lies, she needs to have more detailed 
-     information on how entities have been used in, participated in, and generated 
+     information on how entities have been used in and generated 
      by activities.  Betty has determined that <code>ex:compose</code> used 
-     entities <code>ex:regionList1</code> and <code>ex:dataSet1</code>, but she does not 
+     entities <code>ex:regionList</code> and <code>ex:dataSet1</code>, but she does not 
      know what function these entities played in the processing.  Betty 
-     also knows that <code>ex:derek</code> controlled the activities, but she does 
+     also knows that <code>ex:derek</code> was associated with the activities, but she does 
      not know if Derek was the analyst responsible for determining how the data 
      should be composed.
     </p>
     <p>
-     The above information is described as roles in the provenance records. The composition
+     The above information is described as roles in the provenance. The composition
      activity involved entities in four roles: the data to be composed (<code>ex:dataToCompose</code>),
      the regions to aggregate by (<code>ex:regionsToAggregateBy</code>), the
      resulting composed data (<code>ex:composedData</code>), and the
@@ -592,8 +599,8 @@
      ex:analyst              a prov:Role .
     </pre>
     <p>
-     In addition to the simple facts that the composition activity used, generated or
-     was controlled by entities/agents as described in the sections above, the
+     In addition to the simple facts that the composition activity used, generated or
+     was associated with entities/agents as described in the sections above, the
      provenance record contains more details of <i>how</i> these entities and agents
      were involved, i.e. the roles they played. For example, the descriptions below state
      Examples in the sections above show descriptions of the simple facts that the
@@ -608,13 +615,15 @@
      The
      provenance record can contain more details of <i>how</i> these entities and agents
      were involved in the activity. One example is the roles the entities played.
+     To do this, PROV-O refers to <i>qualified usage</i>, <i>qualified generation</i>, etc.,
+     which are descriptions consisting of several statements about how use, generation, etc. took place.
      For example, the descriptions below state
      that the composition activity (<code>ex:compose</code>) included the usage
      of the government data set (<code>ex:dataSet1</code>) in the role of the data
      to be composed (<code>ex:dataToCompose</code>).
     </p>
     <pre class="turtle example">
-     ex:compose prov:qualifiedUsage [ a prov:Usage ;
+     ex:compose prov:qualifiedUsage [ a prov:Usage ;
                    prov:entity  ex:dataSet1 ;
                    prov:hadRole ex:dataToCompose ] .
     </pre>
@@ -623,26 +632,28 @@
      regions because the roles played are different.
     </p>
     <pre class="turtle example">
-     ex:compose prov:qualifiedUsage [ a prov:Usage ;
-                   prov:entity  ex:regionList1 ;
-                   prov:hadRole ex:regionsToAggregateBy ] .
+     ex:compose  prov:qualifiedUsage [
+                   a  prov:Usage ;
+                   prov:entity   ex:regionList ;
+                   prov:hadRole  ex:regionsToAggregateBy ] .
     </pre>
     <p>
      Similarly, the provenance includes descriptions that the same activity was
      controlled in a particular way (<code>ex:analyst</code>) by Derek, and that
-     the entity <code>ex:composition1</code> took the role of the composed
+     the entity <code>ex:composition</code> took the role of the composed
      data in what the activity generated.
     </p>
     <pre class="turtle example">
-     ex:compose
-        prov:qualifiedAssociation [ a prov:Association ;
-            prov:entity  ex:derek ;
-            prov:hadRole ex:analyst
-        ] ;
-        prov:qualifiedGeneration [ a prov:Generation ;
-            prov:entity  ex:composition1 ;
-            prov:hadRole ex:composedData
-        ] .
+     ex:compose  prov:qualifiedAssociation [
+                   a  prov:Association ;
+                   prov:agent    ex:derek ;
+                   prov:hadRole  ex:analyst
+     ] .
+     ex:composition prov:qualifiedGeneration [
+                        a prov:Generation ;
+                        prov:activity  ex:compose ;
+                        prov:hadRole   ex:composedData
+     ] .
     </pre>
    </section>
 
@@ -652,15 +663,16 @@
     <p>
      After looking at the detail of the compilation activity, there appears
      to be nothing wrong, so Betty concludes the error is in the government dataset. 
-     She looks at the characterization of the dataset <code>ex:dataSet1</code>, 
+     She looks at the dataset <code>ex:dataSet1</code>, 
      and sees that it is missing data from one of the zipcodes in the area.  She contacts
      the government, and a new version of GovData is created, declared to be the
-     next revision of the data by Edith. The provenance record of this new dataset,
+     next revision of the data. The provenance record of this new dataset,
      <code>ex:dataSet2</code>, states that it is a revision of the
      old data set, <code>ex:dataSet1</code>.
     </p>
     <pre class="turtle example">
-     ex:dataSet2 prov:wasRevisionOf ex:dataSet1 .
+     ex:dataSet2 a prov:Entity ;
+                 prov:wasRevisionOf ex:dataSet1 .
     </pre>
     <p>
      Derek notices that there is a new dataset available and creates a new chart based on the revised data, 
@@ -669,11 +681,104 @@
      She sees a new description stating that the new chart is derived from the new dataset.
     </p>
     <pre class="example turtle">
-     ex:chart2 prov:wasDerivedFrom ex:dataSet2 .
+     ex:chart2 a prov:Entity ;
+               prov:wasDerivedFrom ex:dataSet2 .
     </pre>
    </section>
   </section>
 
+   <section>
+    <h3>Plans</h3>
+
+    <p>
+     Betty then wishes to know whether the new data set correctly addresses
+     the error that existed before. The provenance of the new dataset,
+     <code>ex:dataSet2</code>, describes not only who performed the corrections,
+     Edith, but also what instructions she followed in doing so (in PROV terms, the plan).
+     First, the correction activity (<code>ex:correct</code>), the person who corrected
+     it, Edith (<code>ex:edith</code>), and the correction instructions (<code>ex:corrections</code>)
+     are described.</p>
+    <pre class="turtle example">
+     ex:correct     a prov:Activity .
+     ex:edith       a prov:Agent, prov:Person .
+     ex:corrections a prov:Plan .
+    </pre>
+    <p>
+     The connection between them is expressed in PROV-O using a qualified association giving details of
+     how Edith was associated with the correction activity,
+     including that she adopted the above corrections plan.
+    </p>
+    <pre class="turtle example">
+     ex:correct prov:qualifiedAssociation [
+                    prov:agent   ex:edith ;
+                    prov:hadPlan ex:corrections
+                ] .
+     ex:dataSet2 prov:wasGeneratedBy ex:correct .
+    </pre>
+   </section>
+  
+   <section>
+    <h3>Time</h3>
+    
+    <p>
+     
+    </p>
+   </section>
+   
+   <section>
+    <h3>Alternate Entities and Specialization</h3>
+    
+    <p>
+     Before noticing anything wrong with the government data, Betty had already
+     posted a blog entry about the article. The blog entry had its own published
+     provenance, stating that it quoted from the article. This was expressed
+     using a PROV property, <code>wasQuotedFrom</code>, which is a kind of
+     derivation.
+    </p>
+    <pre class="turtle example">
+     ex:blogEntry a prov:Entity ;
+                  prov:wasQuotedFrom ex:article .
+    </pre>
+    <p>
+     The newspaper, from past experience, anticipated that there could be revisions
+     to the article, and so created identifiers for both the article in general
+     (<code>ex:article</code>) and the first version of the article (<code>ex:articleV1</code>),
+     allowing both to be referred to as entities in provenance data. The article
+     discussed the GovData data set, and so the provenance data published by the
+     newspaper asserts that the first version of the article was derived from that data set.
+    </p>
+    <pre class="turtle example">
+     ex:articleV1 a prov:Entity ;
+                  prov:wasDerivedFrom ex:dataSet1 .
+    </pre>
+    <p>
+     Without some way to know entities <code>ex:article</code> and <code>ex:articleV1</code>
+     are related, anyone looking at Betty's and the newspaper's PROV data above would
+     not know that the blog entry was written about an article derived from GovData.
+     Therefore, the newspaper also describes the connection between the two: that
+     the first version of the article is a specialization of the article in general.
+    </p>
+    <pre class="turtle example">
+     ex:articleV1 prov:specializationOf ex:article .
+    </pre>
+    <p>
+     Later, after the data set is corrected and a new chart generated, a new version
+     of the article is created, <code>ex:articleV2</code>. To ensure that those
+     consulting the provenance of <code>ex:articleV2</code> understand that it
+     is connected with the provenance of <code>ex:article</code> and <code>ex:articleV1</code>,
+     the newspaper describes how these entities are related.
+    </p>
+    <pre class="turtle example">
+     ex:articleV2 prov:specializationOf ex:article .
+     ex:articleV2 prov:alternateOf      ex:articleV1 .
+    </pre>
+    <p>
+     Here, <code>alternateOf</code> expresses that the first and second versions
+     are specializations of the same thing (the article).
+    </p>
+    
+   </section>
+  
   <section class="appendix">
    <h2>PROV-N Examples</h2>
    <p>
@@ -684,8 +789,8 @@
     <h3>Entities</h3>
     <pre class="example asn">
      entity(ex:dataSet1).
-     entity(ex:regionList1).
-     entity(ex:aggregate1).
+     entity(ex:regionList).
+     entity(ex:aggregate).
      entity(ex:chart1).
     </pre>
    </section>
@@ -715,10 +820,10 @@
     <h3>Use and Generation</h3>
     <pre class="example asn">
      used(ex:aggregated, ex:dataSet1).
-     used(ex:aggregated, ex:regionList1).
-     wasGeneratedBy(ex:aggregate1, ex:aggregated).
+     used(ex:aggregated, ex:regionList).
+     wasGeneratedBy(ex:aggregate, ex:aggregated).
 
-     used(ex:illustrated, ex:aggregate1).
+     used(ex:illustrated, ex:aggregate).
      wasGeneratedBy(ex:chart1, ex:illustrated).
     </pre>
    </section>
@@ -743,7 +848,7 @@
     </p>
     <pre class="example asn">
      used(ex:aggregated, ex:dataSet1,    [ prov:role = "dataToAggregate"]).
-     used(ex:aggregated, ex:regionList1, [ prov:role = "regionsToAggregteBy"]).
+     used(ex:aggregated, ex:regionList, [ prov:role = "regionsToAggregteBy"]).
     </pre>
     <p>
      In the first description above, note that this adds a "role" attribute to the first 'used' description of Ex. 3.
@@ -790,6 +895,9 @@
     <li>Included description of attribution in intuition section on agents and responsibility.</li>
     <li>Changed from ASN to PROV-N</li>
     <li>Updated examples to latest PROV-O terms</li>
+    <li>Added PROV-O examples for attribution </li>
+    <li>Added PROV-O examples for plans, adoptedPlan </li>
+    <li>Added PROV-O examples for specialization and alternate </li>
    </ul>
   </section>