prov: changeset 3025:1f3c4e5b7a41

--- a/model/comments/wd6-Graham.txt	Mon May 28 17:30:44 2012 +0100
+++ b/model/comments/wd6-Graham.txt	Mon May 28 23:10:32 2012 +0100
@@ -20,13 +20,19 @@
   > 2.  So my proposals focus more on explaining how the concepts work
   > together and not repeating the actual definitions.
 
-
+I am in favour of keeping definitions in section 2, to make it self-contained.
+Paul, Paolo, what do you think?
+Do we seek a WG resolution?
 
   > 
   > As I reflect on what I've read, I think it might be worth linking each
   > of the core structure concepts to the corresponding subsection in
   > section 5.  This would provide a quick-and-easy route from the
   > structural overview to the corresponding details.
+
+This can be done. 
+Suggestion: at the end of each definition, add a link to the corresponding subsection in section 5.
+
   > 
   > Detailed comments follow.
   > 
@@ -41,6 +47,9 @@
   > to form assessments about its quality, reliability or trustworthiness.
   > PROV-DM is the conceptual data model that forms a basis for the W3C
   > provenance (PROV) family of specifications.  ...  ]]
+
+Done.
+
   > 
   > Otherwise it looks pretty reasonable.
   > 
@@ -49,6 +58,9 @@
   > 
   > Para 2: "We consider" -> "We present"
   > 
+
+Done.
+
   > Para 3: "The PROV data model" - this is first use in the body of the
   > text, and should be defined (what's "PROV"?).  Suggest the *previous*
   > paragraph starts thus:
@@ -70,9 +82,10 @@
   > statement - precursor provenance models and CIDOC-CRM are examples I
   > have used - OPM, OPMV, Provenir, PML all use broadly similar
   > structures)
+
   > 
 
-Goof point, we should cite http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings
+Good point, I now cite http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings
 
   > Para 4 and list: I would have the derivations component immediately
   > follow on from entities and activities (or folded in with those).
@@ -82,6 +95,9 @@
   component 2: derivation
   component 3: agent/responsibility.
 
+Paul, Paolo, put this up for a vote?
+Issue: tracedTo (currently with derivation, should be moved to new component 3)
+
 
   > 
   > Para 5 and 6: I think these should be run together.  I find that para
@@ -99,15 +115,22 @@
   > correctness, trustworthiness, etc.  This is addressed in a companion
   > specification [PROV-CONSTRAINTS] by proposing formal constraints on
   > the way that provenance descriptions are related to the things they
-  > describe (such as the use of attributes, temporal information and
+  > describe (such as the use of attributes, temporal information, and
   > specialization of entities), and additional conclusions that are valid
   > to infer if those constraints are satisfied.  ]]
   > 
+
+Agreed, it's better to drop the distinction between simple provenance ... and enrichment.
+I mostly keep the suggested text.
+
   > 
   > == Section 2 ==
   > 
   > "catering for more advanced uses..." - I would suggest "catering for more specific uses...".
   > 
+
+Done.
+
   > 
   > == Section 2.1 ==
   > 
@@ -124,18 +147,45 @@
   > activities provide key information for assessing the reliability and
   > trustworthiness of the result.
   > ]]
+
+Good first sentence. It's in.
+
+I am not convinced by the second part, the "provenance trace", and
+then the "annotations" for assessing reliability etc. I don't think we should
+focus on what provenance is used for.
+
   > 
   > Figure 1 is a great improvement over previous incarnations, largely by
   > virtue of the coloured boxes, but I think it could be more effective
   > and appealing.  I attach a proposed alternative (graffle and png)
   > which follows the style of diagrams used in the examples.
+
+Thanks. 
+I prefer to use UML diagrams for "schema" level diagrams. 
+The notation ellipse/rectangle/pentagon is good for "instance" level diagrams.
+
+
   > 
   > I think there's an inconsistency between the diagram (figure 1) and
   > table (Table 2): relations on the diagram use values from the "Name"
   > column of the table, but types use values from the "Concepts" column.
   > 
 
-I am aware of this. Not sure how to address this.
+I am aware of this. It shows up in a couple of other places.
+
+How do we write 
+  wasDerivedFrom(e2,e1,[prov:type='prov:WasRevisionOf'])
+
+Is it prov:WasRevisionOf or prov:wasRevisionOf?
+
+Should relation names be capitalized, e.g. WasDerivedFrom,
+while their PROV-N form is not, e.g. wasDerivedFrom(e2,e1)?
+
+Proposal: In all diagrams, capitalize relations.
+          In Table 2, capitalize all entries in column 'Name'.
+
+
+Paolo?
 
   > I think it's a little confusing that there are named "concepts" and
   > (sometimes) different names for the types and relations.  This is
@@ -148,6 +198,16 @@
   > this arrangement, I think table 2 is redundant.
 
 
+I think we want to keep both 'wasGeneratedBy/WasGeneratedBy' and 'Generation'.
+The latter is the concept, the former the relation.
+
+I would be reluctant to replace 'wasGeneratedBy' by 'Generation' in
+Figure 1.  Likewise, I would be reluctant to say "WasGeneratedBy is the
+completion of production of a new entity ..."
+
+So, the solution above, with capitalization of relations, would address
+the concern.
+
 
   > 
   > 
@@ -164,6 +224,7 @@
   > Provenance describes /entities/, which are both generated and used by /activities/.
   > 
   > While the main anticipated use of provenance is to describe entities
+why say this?
   > that are digital artifacts, it is not constrained from describing
   > other kinds of thing. Thus, an entity may be a broad diversity of
   > notions, including digital objects such as a file or web page,
@@ -178,11 +239,18 @@
   > 
   > Activities are (time-bounded) processes that consume or generate
   > entities; they are the mechanisms by which entities are created and
+
+I don't feel like we should paraphrase definitions of terms.
+
+
   > used in the creation of further entities.  Just as entities cover a
   > broad range of notions, activities can cover a broad range of
   > processes, commonly related to information processing, but also
   > covering broader notions like driving a car from Boston to Cambridge.
   > 
+
+Inserted/rephrased the above.
+
   > <example 2 (activities> here>
   > 
   > Provenance is concerned with activities that create a new state of
@@ -207,6 +275,11 @@
   > <example 3 here>
   > 
   > <example 4 here>
+
+The suggested text above says 'Usage is relationship' ... 'Usage is considered to occur'.
+So, it mixes 'data model construct' and 'concept'.
+I tried to stay with definitions of concepts here.
+
   > 
   > One might reasonably ask what entities are used and consumed by
   > driving a car from Boston to Cambridge.  This is answered by
@@ -225,20 +298,35 @@
   > page reporting a traffic violation involving that car.  This breadth
   > of provenance allows descriptions of interactions between physical and
   > digital artifacts.
+
+Good!
+
+Can I check that you meant "in this case a car in Boston may be a
+ different ENTITY from a car in Cambridge"  ... and not ARTIFACT?
+
   > 
   > <I added a fair amount of explanatory text here, because I think that
   > the whole issue of breadth of interpretation begs some explanation.>
+
+Yes, good idea.
+
   > 
   > Communication is the generation of an entity by an activity and its
   > subsequent usage by another activity.
   > 
   > <skip definition - it just repeats and is covered later>
   > 
+
+See above.
+
   > 
   > Example 5 here; I might also add to this: the activity of purchasing a
   > car in Boston could be informed by the the activity of its being
   > designed in Japan> ]]
   > 
+
+Yes.
+
   > 
   > == After section 2.1.1 ==
   > 
@@ -250,6 +338,9 @@
   > (e.g. the weather report W was derived from meteorological datasets X,
   > Y and Z; my Ford car was derived from a VW design).
   > 
+
+I would propose to swap 2.1.2 and 2.1.3
+
   > Thus, following on from the proposed revised 2.1.1:
   > 
   > [[
@@ -260,6 +351,12 @@
   > editing a document, and also extends more broadly to a canvas used for
   > creating a painting, transporting a work of art from London to New
   > York, or melting ice to produce water.
+
+I don't like 'Derivation is the generation ...',
+because derivation may have started well before the actual generation.
+
+As indicated above, I find it challenging to paraphrase the definitions.
+
   > 
   > While the basic idea is quite simple, the concept of derivation can be
   > tricky: implicit is the notion that the generated object was affected
@@ -280,7 +377,11 @@
   > <Again, I've added bit of text here, because I think it's part of the
   > orientation that's needed to avoid misunderstandings like the one I
   > exhibited in the last teleconference.>
-  > 
+
+I like it; I have added it after a few tweaks.
+
+
+
   > 
   > == Section 2.1.2 ==
   > 
@@ -288,6 +389,8 @@
   > bare, and doesn't really put it into a context of provenance usage.
   > I'm also uneasy about describing software agents as having
   > responsibility.
+
+Agreed, the intro was bare.
   > 
   > You say "An agent may be a particular type of entity or activity" -
   > can an agent *really* be an activity?  While I would shop short of
@@ -310,20 +413,32 @@
   > believed than a claim by a new student; a calculation performed by an
   > established software library may be more reliable than by a one-off
   > program.
+
+I reused this intro, but put the word 'responsible' in it.
   > 
   > In provenance terms, an /agent/ is a person or entity that can
   > initiate, control or otherwise bear responsibility for an activity.
+
+Again, I want to avoid paraphrasing the definition.
+
   > 
   > <example 6 here>
   > 
   > An /association/ of an activity with an agent indicates that the agent
   > had some role in the activity.
   > 
+
+An entity used by an activity also has some role.
+The WG agreed that agent association implies responsibility.
+
+
   > <example 8 here>
   > 
   > An /attribution/ of an entity to an agent means that the entity was
   > generated by some (possibly unknown) activity that was associated with
   > the agent.
+
+
   > 
   > <example 7 here>
   > 
@@ -431,3 +546,286 @@
   > 
   > #g
 -- 
+
+  > 
+  > On 25/05/2012 11:16, Luc Moreau wrote:
+  > > Hi Graham,
+  > >
+  > > I have produced an updated version of the prov-dm document for
+  > > you to go through.
+  > >
+  > > http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120525/prov-dm.html
+  > 
+  > Continuing (my comments up to section 2 sent previously...)
+  > 
+  > 
+  > == Section 3 ==
+  > 
+  > (Not reviewed in detail)
+  > 
+  > == Section 4 ==
+  > 
+  > (Not reviewed in detail)
+  > 
+  > == Section 5 ==
+  > 
+  > I'd still like to see derivations immediately follow entities/activities.  :)
+  > 
+  > Figure 4: I don't know what this is trying to tell me.
+  > 
+  > What is the significance of vertical stacking and/or horizontal alignment?
+  > 
+  > Figure 5: uses UML, but at one point I was looking for cardinality indicators (1:1, 1:N, N:M, etc.)
+  > 
+  > == Section 5.1.1 ==
+  > 
+  > Definition of entity seems a bit clumsy, but not problematic.
+  > 
+  > == Section 5.1.2 ==
+  > 
+  > An activity may be "... associated ..." with an entity?  Seems like an
+  > unfortunate overloading of terminology.  Suggest drop "being
+  > associated with" ... it's not trying to be an exhaustive list.
+  > 
+  > "An activity is not an entity" - I think you said otherwise further up (see previous email).
+  > 
+  > == Section 5.1.3 ==
+  > 
+  > "This entity did not exist before..." jarred a little for me.  I'd suggest "was not available for use before ..."
+  > 
+  > == Section 5.1.4 ==
+  > 
+  > == Section 5.1.5 ==
+  > 
+  > It seems a little odd that the definition leads on the exchange of an
+  > entity, which is not identified.  Suggest:
+  > 
+  > "/Communication/ is the exchange of information by two activities, in
+  > the form of an unspecified entity, one activity using information
+  > generated by the other"
+  > 
+  > As I recall, this assertion implies the existence of an entity or
+  > entity-derivation-entity chain from one to the other.  Hmmm... Is the
+  > intent here that the entity is always passed directly from one
+  > activity to the other, rather than possibly indirectly via intervening
+  > activities?  The phrasing suggests this, but the example not so much
+  > (what bureaucratic process could be so simple?).
+  > 
+  > == Section 5.1.6 ==
+  > 
+  > The phrase "that initiated the activity" suggests to me that the
+  > trigger is an agent (i.e. capable of action).  Maybe "that set off..."
+  > ?
+  > 
+  > 
+  > == Section 5.2.1 ==
+  > 
+  > I still feel uneasy about defining agency solely in terms of
+  > "responsibility", particularly when software agents are included.  I
+  > think the defining feature of an agent is that they are capable of
+  > unsupervised action; that they can directly initiate an activity
+  > without the direct involvement of any other agent at the time.
+  > 
+  > == Section 5.2.3 ==
+  > 
+  > As above.
+  > 
+  > I also find, looking at the text, that it's not clear if more than one
+  > agent can be associated with an activity.  The example makes it clear
+  > that this is allowed, but the phrasing "assignment of responsibility"
+  > suggests otherwise to me.
+  > 
+  > == Section 5.2.4 ==
+  > 
+
+  > The use of responsibility here is rather more in line with my
+  > expectation, and not, I think, fully consistent with the preceding
+  > uses.
+  > 
+  > == Section 5.3.1 ==
+  > 
+  > "A derivation is a transformation of an entity into another, an update
+  > of an entity, resulting in a new one, or based on an entity, the
+  > construction of another."
+  > 
+  > I'm having trouble parsing this, especially the final clause "the
+  > construction of another" - is there a missing or misplaced "or"?
+  > 
+  > I'm finding "transformation" seems rather a limiting term to define
+  > derivation.  If I get an idea from reading a document, and then write
+  > a new document around that idea, I think there's derivation there, but
+  > not transformation.
+  > 
+  > Here's a stab at an alternative:
+  > 
+  > [[
+  > A /derivation/ is the construction of a new entity using
+  > information obtained directly or indirectly from another pre-existing
+  > entity.  ]]
+  > 
+  > 
+  > == Section 5.3.2 ==
+  > 
+  > So many "revise"-based terms in one short definition!
+  > 
+  > Maybe:
+  > "A revision is a derivation for which the resulting entity is a
+  > revised version of some original.  The implication here is that
+  > the resulting entity contains substantial content from the
+  > original."
+  > 
+  > 
+  > == Section 5.3.3 ==
+  > 
+  > The definition of quotation makes it seem like an activity, not a relationship between entities.
+  > 
+  > Maybe:
+  > 
+  > "A quotation is a form of derivation in which the new entity contains
+  > a verbatim copy of some or all of the original entity's content."
+  > 
+  > 
+  > == Section 5.3.4 ==
+  > 
+  > 
+  > == Section 5.3.5 ==
+  > 
+  > What's a "responsibility" relation - I don't see that defined anywhere.
+  > 
+  > I'm completely unclear what is meant by this relation that isn't already covered by derivation.
+  > 
+  > 
+  > I might /guess/ that it's meant to allow the "entity" concerned to be
+  > an agent in the overall process, but I find that confusing given that
+  > agents are separately identified core concepts.
+  > 
+  > What's the requirement for this relation?  Do we really need it?
+  > 
+  > 
+  > == Section 5.4.1 ==
+  > 
+  > 
+  > "In particular, the lifetime of the specialized entity contains that
+  > of any specialization."  I find the term "specialized entity" is
+  > unclear, making the sentence harder to read.  Suggest: "In particular,
+  > the lifetime of the more general entity contains that of any
+  > specialization."
+  > 
+  > 
+  > I know this is formally a topic for CONSTRAINTS, but I think it would
+  > help to give an indication of whether a thing can be considered a
+  > specialization of itself.
+  > 
+  > 
+  > == Section 5.4.2 ==
+  > 
+  > 
+  > In example 41, you claim "They are both specialization of an
+  > (unspecified) entity."  If this is true, shouldn't it be part of the
+  > definition of alternate?
+  > 
+  > 
+  > == section 5.5 ==
+  > 
+  > s/depict/depicts/
+  > 
+  > 
+  > == Section 5.5.1 ==
+  > 
+  > "provenance description" is not defined, but appears here as a key
+  > element in a definition.  (See previous comment - maybe in other
+  > email)
+  > 
+  > 
+  > == Section 5.5.2 ==
+  > 
+  > 
+  > I found the combination of section title and phrasing difficult to
+  > understand. I think the title and first sentence might read:
+  > 
+  > [[
+  > 5.5.2 Bundle type
+  > 
+  > A bundle is a named set of provenance descriptions.  The bundle name
+  > may be declared to be an entity of type prov:Bundle, and the bundle
+  > may thus have its own provenance description.  ]]
+  > 
+  > Reflecting further on this, what need is served by introducing the
+  > bundle type?  Provenance is just data.  If the Bundle constructor is
+  > present, then naming it as an entity is enough - the bundle type
+  > provides no new information.  If the bundle constructor is not
+  > present, WHY does it matter that the entity described is provenance?
+  > 
+  > 
+  > == Section  5.5.3 ==
+  > 
+  > The description of provenance locator is very confusing.  Particularly the bit:
+  > 
+  > "id: an identifier for a provenance locator;"
+  > 
+  > Looking at the examples, I think this should be "the identifier of
+  > an entity for which provenance is provided".  Or maybe the id is
+  > optional, and should be shown as "id;" in the PROV-N example.
+  > 
+  > My sense is that, especially as this is motivated by PROV-AQ, there are just too many identifiers floating around.
+  > 
+  > Why not just:
+  > 
+  >   hasProvenanceIn(subject, bundle)
+  > 
+  > Where subject is the URI of an entity, and bundle is the URI of a provenance bundle with information about that entity.
+  > 
+  > I've raised this as a separate issue, as I think it needs discussion.
+  > 
+  > 
+  > == Section 5.6 ==
+  > 
+  > I'm skipping review of collections, as I don't really think they
+  > belong as an integral part of the PROV model, but I know others in
+  > the WG feel differently. Thus, I'm not well-placed to judge how it
+  > meets expectations.
+  > 
+  > 
+  > == Section 5.7 ==
+  > 
+  > 
+  > Why are namespace declarations and qualified names part of the
+  > provenance data model, rather than part of the PROV-N specification?
+  > It seems to me that most of this is syntactic artifact.
+  > 
+  > 
+  > == Section 5.7.4 ==
+  > 
+  > I'm finding that the attribute identifiers seem to be somewhat buried
+  > in what is otherwise syntactic detail.  I think that section 5.7.4
+  > would be more appropriately presented as an immediate subsection of
+  > section 5 (e.g. 5.7, with the rest of section 5.7 becoming 5.8)
+  > 
+  > == Section 5.7.5 ==
+  > 
+  > Similar comments to above.
+  > 
+  > 
+  > == Section 6 ==
+  > 
+  > 
+  > "The PROV namespace declares a set of reserved attributes catering for
+  > extensibility: prov:type, prov:role, prov:location."  I think these
+  > should be the first port of call for PROV extensions, and as such
+  > should feature more prominently, and their use discussed more fully,
+  > rather than as an afterthought to "attribute-value lists"
+  > 
+
+  > The second bullet point already does this for prov:type. I think
+  > this should come first, followed by a similar entry for prov:role.
+  > 
+  > 
+  > == Section 7 ==
+  > 
+  > I'm not sure this section is needed.  My preference would be to drop it.
+  > 
+  > But if that's not considered good form, I'd like to revisit this when
+  > the PROV-CONSTRAINTS document settles (especially its introduction and
+  > abstract).  I think they should be conveying the same message.
+  > 
+  >

--- a/model/diff.html	Mon May 28 17:30:44 2012 +0100
+++ b/model/diff.html	Mon May 28 23:10:32 2012 +0100
@@ -972,12 +972,16 @@
 
     <div id="abstract" class="introductory section"><h2>Abstract</h2>
 <p>
-PROV-DM, the PROV <span class="insert">conceptual </span>data model, is a data model for provenance that describes
+<span class="delete">PROV-DM, the PROV</span><span class="insert">Provenance is information about entities, activities, and people,
+involved in producing a piece of</span> data <span class="delete">model,</span><span class="insert">or thing, which can be used
+ to form assessments about its quality, reliability or trustworthiness.
+PROV-DM</span> is <span class="delete">a</span><span class="insert">the conceptual</span> data model <span class="insert">that forms a basis </span>for <span class="insert">the </span><acronym title="World Wide Web Consortium"><span class="insert">W3C</span></acronym>
+provenance <span class="delete">that describes
 the entities, people and activities involved in
-producing a piece of data or thing. 
-PROV-DM <span class="insert">distinguishes core structures, forming the essence of provenance descriptions, from
-extended structures catering for more advanced uses of provenance. 
-PROV-DM </span>is <span class="delete">structured</span><span class="insert">organized</span> in six components,<span class="insert"> respectively</span> dealing with: 
+producing a piece of data or thing.</span><span class="insert">(PROV) family of specifications.
+PROV-DM distinguishes core structures, forming the essence of provenance descriptions, from
+extended structures catering for more advanced uses of provenance.</span> 
+PROV-DM is <span class="delete">structured</span><span class="insert">organized</span> in six components,<span class="insert"> respectively</span> dealing with: 
 (1) entities and activities, and the time at which they were created, used, or ended;
 (2) agents bearing responsibility for entities that were generated and activities that happened;
 (3) derivations of entities from entities;
@@ -1067,8 +1071,9 @@
 
 
 <p>
-We
-consider a <span class="delete">core</span><span class="insert">generic</span> data model for provenance that allows  domain and application specific representations of provenance to be translated into such a data model and  <em>interchanged</em> between systems.
+We<span class="delete">
+consider a core</span><span class="insert"> present the PROV data model,
+a generic</span> data model for provenance that allows  domain and application specific representations of provenance to be translated into such a data model and  <em>interchanged</em> between systems.
 Thus, heterogeneous systems can export their native provenance into such a core data model, and applications that need to make sense of provenance <span class="delete">in heterogeneous systems </span>can then import it,
 process it, and reason over it.</p>
 
@@ -1077,20 +1082,22 @@
 
 
 
-<p><span class="delete">A set of specifications, referred</span><span class="insert">
-The PROV data model distinguishes </span><em><span class="insert">core structures</span></em><span class="insert"> from
+<p><span class="delete">A set of specifications, referred to as the</span><span class="insert">
+The</span> PROV <span class="delete">family of specifications, define the</span><span class="insert">data model distinguishes </span><em><span class="insert">core structures</span></em><span class="insert"> from
 </span><em><span class="insert">extended structures</span></em><span class="insert">: core structures form the essence of
-provenance descriptions, and are commonly found in various
-domain-specific vocabularies. Extended structures enhance and refine core
-structures with more expressive capabilities</span> to <span class="delete">as the PROV family of specifications, define the various aspects
-that are necessary to achieve this vision in an interoperable
+provenance descriptions, and are commonly found in</span> various<span class="delete"> aspects
+</span><span class="insert">
+domain-specific vocabularies </span>that <span class="delete">are necessary to achieve this vision in an interoperable
 way:</span>
 
-<span class="delete">A data model</span><span class="insert">cater</span> for <span class="delete">provenance, which is presented in three documents:
+<span class="delete">A data model for provenance, which is presented in three documents:
 </span>
-<span class="delete"> PROV-DM (part I): the provenance data model, informally described (this document);
-</span><span class="delete"> PROV-CONSTRAINTS (part II): constraints underpinning the data model [</span><span class="delete">PROV-CONSTRAINTS</span><span class="delete">];
-</span><span class="delete"> PROV-N (part III): a notation to express instances of that data model for human consumption [</span><span class="delete">PROV-N</span><span class="delete">];
+<span class="delete"> PROV-DM (part I): the</span><span class="insert">deal
+with</span> provenance <span class="delete">data model, informally described (this document);
+</span><span class="delete"> PROV-CONSTRAINTS (part II): constraints underpinning the data model</span><span class="insert">or similar kinds of information</span> [<cite><span class="delete">PROV-CONSTRAINTS</span><a class="bibref" rel="biblioentry" href="#bib-Mappings"><span class="insert">Mappings</span></a></cite><span class="delete">];
+</span><span class="delete"> PROV-N (part III): a notation</span><span class="insert">].
+Extended structures enhance and refine core
+structures with more expressive capabilities</span> to <span class="delete">express instances of that data model</span><span class="insert">cater</span> for <span class="delete">human consumption [</span><span class="delete">PROV-N</span><span class="delete">];
 </span> 
 
 <span class="delete">PROV-O: the PROV ontology, an OWL-RL ontology allowing the mapping of PROV to RDF [</span><span class="delete">PROV-O</span><span class="delete">];</span>
@@ -1126,15 +1133,20 @@
 provenance types and relations, without specific concern for how they are applied.
 With these, it becomes possible to write useful provenance descriptions, and publish or embed them <span class="delete">along side</span><span class="insert">alongside</span> the data they relate to. </p>
 
-<p>However, if something about which provenance is expressed is subject to change, then it is challenging to express its provenance precisely (e.g. the data from which a daily weather report is derived  changes from day to day).
- To address this challenge, <span class="delete">a </span><span class="delete">refinement</span><span class="insert">it</span> is proposed to enrich simple provenance, with <span class="delete">extra</span><span class="insert">refined</span> descriptions that  help qualify the specific subject of provenance and provenance itself, with attributes and temporal information, intended to satisfy a comprehensive set of constraints.  These aspects are covered in the companion specification [<cite><a class="bibref" rel="biblioentry" href="#bib-PROV-CONSTRAINTS">PROV-CONSTRAINTS</a></cite>].
+<p>However, if something about which provenance is expressed is subject to change, then it is challenging to express its provenance precisely (e.g. the data from which a daily weather report is derived  changes from day to day).<span class="insert">
+This is addressed in a companion</span>
+ <span class="delete">To address this challenge, a </span><span class="delete">refinement</span><span class="delete"> is proposed to enrich simple provenance, with extra descriptions that  help qualify the specific subject of provenance and provenance itself, with attributes and temporal information, intended to satisfy a comprehensive set of constraints.  These aspects are covered in the companion </span>specification [<cite><a class="bibref" rel="biblioentry" href="#bib-PROV-CONSTRAINTS">PROV-CONSTRAINTS</a></cite><span class="delete">].</span><span class="insert">] by proposing formal constraints on
+ the way that provenance descriptions are related to the things they
+ describe (such as the use of attributes, temporal information and
+ specialization of entities), and additional conclusions that are valid
+ to infer.</span>
 </p>
 
 
 <div id="structure-of-this-document" class="section"> 
 <h3><span class="secno">1.1 </span>Structure of this Document</h3>
 
-<p><a href="#section-prov-overview">Section 2</a> provides<span class="insert"> an overview of the PROV Data Model,</span>  <span class="delete">starting points for the PROV Data Model, listing a</span><span class="insert">distinguishing a core</span> set of types and  relations, <span class="delete">which allows users to make initial</span><span class="insert">commonly found in</span> provenance <span class="delete">descriptions.</span><span class="insert">descriptions, from extended structures catering for advanced uses. It also introduces a modular organization of the data model in components. </span></p>
+<p><a href="#section-prov-overview">Section 2</a> provides<span class="insert"> an overview of the PROV Data Model,</span>  <span class="delete">starting points for the PROV Data Model, listing a</span><span class="insert">distinguishing a core</span> set of types and  relations, <span class="delete">which allows users to make initial</span><span class="insert">commonly found in</span> provenance <span class="delete">descriptions.</span><span class="insert">descriptions, from extended structures catering for more specific uses. It also introduces a modular organization of the data model in components. </span></p>
 
 <p><a href="#prov-notation"><span class="insert">Section 3</span></a><span class="insert"> overviews the Provenance Notation used to illustrate examples of provenance descriptions.</span></p>
 
@@ -1222,21 +1234,35 @@
 
 
 <div id="core-structures" class="section"> 
-<h3><span class="secno">2.1 </span><span class="delete">Entity and Activity</span>
-
-
-<span class="delete">Things we want to describe  the provenance of are called </span><span class="delete">entities</span><span class="delete"> in PROV. The term "things" encompasses a broad diversity of notions, including digital objects such as a file or web page, 
-physical things such as a building or a printed book, or a car as well as abstract concepts and ideas. </span>
+<h3><span class="secno">2.1 </span><span class="delete">Entity</span><span class="insert">PROV Core Structures</span></h3>
+
+<p><span class="insert">At its core, provenance describes the use</span> and <span class="delete">Activity</span>
+
+
+<span class="delete">Things we want to describe  the provenance of are called </span><span class="insert">production of
+</span><em>entities</em> <span class="insert">by </span><em><span class="insert">activities</span></em><span class="insert">, which may be 
+controlled or influenced </span>in<span class="delete"> PROV. The term "things" encompasses a broad diversity of notions, including digital objects such as a file or web page, 
+physical things such as a building or a printed book, or a car as well as abstract concepts</span><span class="insert">
+various ways by </span><em><span class="insert">agents</span></em><span class="insert">.  These core types</span> and <span class="delete">ideas. </span><span class="insert">their relationships
+are illustrated
+by
+the UML diagram of </span><a href="#prov-core-structures-top"><span class="insert">Figure 1</span></a><span class="insert">.</span></p>
+
+
+<div style="text-align: center; ">
+  <figure style="max-width: 70%; " id="prov-core-structures-top">
 
 
 <span class="delete">
-   An </span><span class="delete">entity</span><span class="delete"> is a physical, digital, conceptual, or other kind of thing; entities may be real or imaginary. </span>
+   An </span><span class="delete">entity</span><span class="delete"> is a physical, digital, conceptual, or other kind of thing; entities may be real or imaginary. </span><img src="uml/essentials.png" alt="PROV Core Structures" style="max-width: 70%; ">
+<div class="figcaption" id="prov-core-structures"><span class="insert">Figure 1: PROV Core Structures</span></div>
 
 
 
 
 <span class="delete">An entity may be the document at URI </span><span class="delete">http://www.bbc.co.uk/news/science-environment-17526723</span><span class="delete">, a file in a file system, a car, or an idea.</span>
-
+  </figure>
+</div>
 
 
 
@@ -1300,16 +1326,9 @@
 
 
  
-<span class="delete">2.3 </span><span class="delete">Agents, Attribution, Association, and Responsibility</span><span class="insert">PROV Core Structures</span></h3>
-
-<p>The <span class="delete">motivation for introducing</span><span class="insert">core of PROV consists of essential provenance structures commonly found in provenance descriptions.
-It is summarized graphically by
-the UML diagram of </span><a href="#prov-core-structures-top"><span class="insert">Figure 1</span></a><span class="insert">,
-illustrating</span>  <span class="delete">agents in the model is</span><span class="insert">three types (entity, activity, and agent) and how they relate</span> to <span class="delete">express the agent's responsibility for activities that happened and entities that were generated.</span><span class="insert">each other.  In the core of PROV, all associations are binary.</span> </p>
-
-
-<div style="text-align: center; ">
-  <figure style="max-width: 70%; " id="prov-core-structures-top">
+<span class="delete">2.3 </span><span class="delete">Agents, Attribution, Association, and Responsibility</span>
+
+<p>The <span class="delete">motivation for introducing  agents</span><span class="insert">concepts found</span> in the <span class="delete">model is to express the agent's responsibility for activities that happened and entities that were generated. </span>
 
 
 <span class="delete">
@@ -1324,11 +1343,11 @@
 
 
 <span class="delete">
-Software for checking the use of grammar in a document may be defined as an agent of a document preparation activity, and at the same time one can describe its provenance, including for instance the vendor and the version history. 
-A site selling books on the Web, the services involved in the processing of orders, and the companies hosting them are also agents.
+Software for checking the use of grammar</span><span class="insert">core of PROV are introduced</span> in <span class="delete">a document may be defined as an agent of a document preparation activity, and at the same time one can describe its provenance, including for instance the vendor and the version history. 
+A site selling books on the Web, the services involved</span><span class="insert">the rest of this section.
+They are summarized</span> in<span class="delete"> the processing of orders, and the companies hosting them are also agents.
 </span>
-<img src="uml/essentials.png" alt="PROV Core Structures" style="max-width: 70%; ">
-<div class="figcaption" id="prov-core-structures"><span class="insert">Figure 1: PROV Core Structures</span></div>
+
 
 
 <span class="delete">Agents may adopt sets of actions or steps to achieve their goals. This is captured by the notion of plan. </span>
@@ -1361,21 +1380,19 @@
 
 
 <span class="delete">A blog post can be attributed to an author, a mobile phone to its manufacturer.</span>
-  </figure>
-</div>
-
-<p><span class="delete">
-Agents</span><span class="insert">The concepts found in the core of PROV</span> are <span class="delete">defined as having some kind of responsibility for activities. In some  
+
+
+<span class="delete">
+Agents are defined as having some kind of responsibility for activities. In some  
 cases, those activities reflect the execution of a plan that was  
-designed</span><span class="insert">introduced</span> in <span class="delete">advance to guide the execution.  Thus,
+designed in advance to guide the execution.  Thus,
 a plan may also be linked to an activity.  </span>
 
 
 
 
 
-<span class="delete">   An activity </span><span class="delete">association</span><span class="delete"> is an assignment of responsibility to an agent for an activity, indicating that the agent had a role</span><span class="insert">the rest of this section.
-They are summarized</span> in<span class="delete"> the activity. It further allows for a plan to be specified, which is the plan intended by the agent to achieve some goals in the context of this activity. </span>
+<span class="delete">   An activity </span><span class="delete">association</span><span class="delete"> is an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity. It further allows for a plan to be specified, which is the plan intended by the agent to achieve some goals in the context of this activity. </span>
 
 
 
@@ -1449,7 +1466,7 @@
 
 <span class="delete">So far, we have introduced a series of concepts underpinning provenance.   PROV-DM  is a conceptual data model consisting of types and relations between these.</span>  <a href="#overview-types-and-relations">Table 2</a><span class="delete"> shows how provenance concepts can be mapped to types and relations in PROV-DM: the</span><span class="insert">, where they are categorized as
  type or relation.
- The</span> first column lists <span class="delete">concepts introduced in this section,</span><span class="insert">concepts,</span> the second column indicates whether a concept maps to a type or a relation, whereas the third column contains the corresponding name.    Names of relations have a verbal form in the past tense to express what happened in the past, as opposed to what may or will happen. 
+ The</span> first column lists <span class="delete">concepts introduced in this section,</span><span class="insert">concepts,</span> the second column indicates whether a concept maps to a type or a relation, whereas the third column contains the corresponding name.    Names of relations have a verbal form in the past tense to express what happened in the past, as opposed to what may or will happen.<span class="insert"> In the core of PROV, all relations are binary.</span> 
 </p>
 
 
@@ -1519,7 +1536,14 @@
 
 
 <p>
-<span class="glossary-ref"><span class="insert">   An </span><span class="dfn"><span class="insert">activity</span></span><span class="insert">  is something that occurs over a period of time and acts upon or with entities;  it may include consuming, processing, transforming, modifying, relocating, using, generating, or being associated with entities.  </span></span><span class="insert"> Activities that operate on digital entities may for example move, copy, or duplicate them.
+<span class="glossary-ref"><span class="insert">   An </span><span class="dfn"><span class="insert">activity</span></span><span class="insert">  is something that occurs over a period of time and acts upon or with entities;  it may include consuming, processing, transforming, modifying, relocating, using, generating, or being associated with entities.  </span></span><span class="insert"> 
+Just as entities cover a broad range of notions, 
+activities can cover a broad range of
+notions:
+information processing activities
+ may for example move, copy, or duplicate  digital entities;
+ physical activities can include
+ driving a car from Boston to Cambridge.
 </span></p>
 
 
@@ -1529,11 +1553,11 @@
 </div>
 
 <p><span class="delete">Figure 1</span><span class="delete"> illustrates the three types (entity, activity,</span><span class="insert">Activities</span> and <span class="delete">agent)</span><span class="insert">entities are associated with each other in two different ways: activities utilize entities</span> and <span class="delete">how they relate</span><span class="insert">activities  produce entities. The act of utilizing or producing an entity may have a duration.  
- The term 'generation' refers</span> to <span class="delete">each</span><span class="insert">the completion of the act of producing; likewise, the term 'usage' refers to the beginning of the act of utilizing entities. Thus, we define the following concepts of generation and usage. </span></p>
+ The term 'generation' refers</span> to <span class="delete">each other.  At</span><span class="insert">the completion of the act of producing; likewise, the term 'usage' refers to the beginning of the act of utilizing entities. Thus, we define the following concepts of generation and usage. </span></p>
 
 <p>
 </p><div class="glossary-ref">
-   <span class="dfn"><span class="insert">Generation</span></span><span class="insert"> is the completion of production of a new entity by an activity. This entity did not exist before generation and becomes available for usage after this generation. </span></div>
+   <span class="dfn"><span class="insert">Generation</span></span><span class="insert"> is the completion of production of a new entity by an activity. This entity did not exist before generation and becomes available for usage after</span> this <span class="delete">stage, all relations</span><span class="insert">generation. </span></div>
 
 
 <p>
@@ -1544,10 +1568,39 @@
 
 
 <div class="anexample conceptexample" id="generation-example" count="3">
-<p><span class="insert">Examples of generation are the completed creation of a file by a
+<p><span class="insert">Examples of generation</span> are <span class="delete">shown to be binary.  Definitions of </span><span class="delete">Section 4</span><span class="delete"> reveal that some relations, while  involving two primary elements, are n-ary. </span><span class="insert">the completed creation of a file by a
 program, the completed creation of a linked data set, and the completed
 publication of a new version of a document.
-</span></p></div>
+</span></p>
+
+
+
+  
+  
+<span class="delete">Figure 1: Simplified  Overview of PROV-DM</span>
+  
+</div>
+
+<p><span class="delete">Figure 1</span><span class="insert">
+One might reasonably ask what entities are used and generated by
+driving a car from Boston to Cambridge.  This</span> is <span class="insert">answered by
+considering that a single artifact may
+correspond to several entities; in this case, a car in Boston may be a
+different entity from a car in Cambridge.  
+Thus, among other things,
+an entity "car in Boston" would be used, and a new entity "car in
+Cambridge" would be generated by this activity of driving.  The
+provenance trace of the car might include: designed in Japan,
+manufactured in Korea, shipped to Boston USA, purchased by customer,
+driven to Cambridge, serviced by engineer in Cambridge, etc., all of
+which might be important information when deciding whether or </span>not <span class="insert">it
+represents a sensible second-hand purchase.  Or some of it might
+alternatively be relevant when trying to determine the truth of a web
+page reporting a traffic violation involving that car.  This breadth
+of provenance allows descriptions of interactions between physical and
+digital artifacts.
+</span></p>
+
 
 
 
@@ -1562,7 +1615,7 @@
 
 <p>
 </p><div class="glossary-ref">
-   <span class="dfn"><span class="insert">Communication</span></span><span class="insert"> is the exchange of an entity by two activities, one activity using the entity generated by the</span> other. </div>
+   <span class="dfn"><span class="insert">Communication</span></span><span class="insert"> is the exchange of an entity by two activities, one activity using the entity generated by the other. </span></div>
 
 
 
@@ -1572,6 +1625,9 @@
 <p><span class="insert">
 The activity of writing a celebrity article was informed by (a
 communication instance) the activity of intercepting voicemails.
+ The activity of purchasing a
+ car in Boston can be informed by the activity of its being
+ designed in Japan.
 </span></p></div>
 
 
@@ -1582,7 +1638,15 @@
 <div id="section-agents-attribution-association-delegation" class="section"> 
 <h4><span class="secno"><span class="insert">2.1.2 </span></span><span class="insert">Agents and Responsibility</span></h4>
 
-<p><span class="insert">The motivation for introducing</span>  <span class="delete">At this stage, all relations</span><span class="insert">agents in the model is to express the agent's responsibility for activities that happened and entities that were generated. </span></p>
+<p><span class="insert">For many purposes, a key consideration
+ for deciding whether something is reliable and/or trustworthy is
+ knowing who or what </span><em><span class="insert">was responsible</span></em><span class="insert"> for its production.  Data published by
+ a respected independent organization may be considered more
+ trustworthy than that from a lobby organization; a claim by a
+ well-known scientist with an established track record may be more
+ readily believed than a claim by a new student; a calculation performed by an
+ established software library may be more reliable than one performed by a one-off
+ program.</span></p>
 
 <p>
 <span class="glossary-ref"><span class="insert">
@@ -1598,36 +1662,29 @@
 <div class="anexample conceptexample" id="agent-example" count="6">
 <p><span class="insert">
 Software for checking the use of grammar in a document may be defined as an agent of a document preparation activity;  one can also describe its provenance, including for instance the vendor and the version history. 
-A site selling books on the Web, the services involved in the processing of orders, and the companies hosting them</span> are <span class="delete">shown</span><span class="insert">also agents.
+A site selling books on the Web, the services involved in the processing of orders, and the companies hosting them are also agents.
 </span></p>
 </div>
 
 
 
 
-<p><span class="insert">Agents can be related</span> to <span class="insert">entities, activities, and other agents.</span></p>  
+<p><span class="insert">Agents can be related to entities, activities, and other agents.</span></p>  
 
 <div class="glossary-ref">   <span class="dfn"><span class="insert">Attribution</span></span><span class="insert"> is the ascribing of an entity to an agent. </span></div>
 
 <div class="anexample conceptexample" id="attribution-example" count="7">
-<p><span class="insert">A blog post can </span>be <span class="delete">binary.  Definitions of </span><span class="delete">Section 4</span><span class="delete"> reveal that some relations, while  involving two primary elements, are n-ary. </span><span class="insert">attributed to an author, a mobile phone to its manufacturer.</span></p>
-
-
-
-  
-  
-<span class="delete">Figure 1: Simplified  Overview of PROV-DM</span>
-  
+<p><span class="insert">A blog post can be attributed to an author, a mobile phone to its manufacturer.</span></p>
 </div>
 
-<p><span class="delete">Figure 1</span><span class="insert">
+<p><span class="insert">
 Agents are defined as having some kind of responsibility for activities. </span></p>
 
 
 
 
 <p>
-<span class="glossary-ref"><span class="insert">   An activity </span><span class="dfn"><span class="insert">association</span></span> is <span class="insert">an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity.  </span></span>
+<span class="glossary-ref"><span class="insert">   An activity </span><span class="dfn"><span class="insert">association</span></span><span class="insert"> is an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity.  </span></span>
 </p>
 
 <div class="anexample conceptexample" id="association-example" count="8">
@@ -1643,7 +1700,7 @@
 
 <p>
 <span class="glossary-ref">
-   <span class="dfn"><span class="insert">Delegation</span></span><span class="insert"> is the assignment of authority to an agent (by itself or by another agent)  to carry out a specific activity as a delegate or representative, while the agent that it represents remains responsible for the outcome of the delegated work. </span></span><span class="insert"> The nature of this relation is intended to be broad,  including contractual relation, but also altruistic initiative by the representative agent. </span></p>
+   <span class="dfn"><span class="insert">Delegation</span></span><span class="insert"> is the assignment of authority to an agent (by itself or by another agent)  to carry out a specific activity as a delegate or representative, while the agent that it represents remains responsible for the outcome of the delegated work. </span></span><span class="insert"> The nature of this relation is </span>intended to be <span class="delete">complete:</span><span class="insert">broad,  including contractual relation, but also altruistic initiative by the representative agent. </span></p>
 
 
 
@@ -1651,9 +1708,9 @@
 <div class="anexample conceptexample" id="responsibility-example" count="9">
 <p><span class="insert">A student publishing a web page describing an academic
 department could result in both the student and the department being
-agents associated with the activity.  It may </span>not <span class="delete">intended</span><span class="insert">matter which actual
-student published a web page, but it may matter significantly that the department
-told the student</span> to <span class="delete">be complete: it only illustrates</span><span class="insert">put up the web page.  
+agents associated with the activity.  It may not matter which actual
+student published a web page, but</span> it <span class="delete">only illustrates</span><span class="insert">may matter significantly that the department
+told the student to put up the web page.  
 </span></p>
 </div>
 </div>
@@ -1663,19 +1720,42 @@
 
 
 
-<p><span class="insert">Activities utilize entities and produce entities. In some cases, utilizing an entity influences the creation of another in some way. This notion of 'influence' is captured by derivations, defined as follows.</span></p>
+<p><span class="insert">Activities utilize entities and produce entities. In some cases, utilizing an entity influences the creation of another in some way. This notion of 'influence' is captured by derivations, defined as follows.
+</span></p>
 
 <p>
 <span class="glossary-ref"><span class="insert">   A </span><span class="dfn"><span class="insert">derivation</span></span>  <span class="delete">types</span><span class="insert">is a transformation of an entity into another, an update of an entity, resulting in a new one, or based on an entity, the construction of another.</span></span>
 
 
 
+
+
 </p><div class="anexample conceptexample" id="derivation-example" count="10">
 <p><span class="insert">Examples of derivation include  the transformation of a relational table into a
-linked data set, the transformation of a canvas into a painting, the transportation of a work of art from London to New York,</span> and <span class="delete">relations introduced in this section (</span><span class="delete">Section 2</span><span class="delete">), exploited in the example discussed in </span><span class="delete">Section 3</span><span class="delete">, and explained in detail in </span><span class="delete">Section 4</span><span class="delete">.
-Names of relations depicted in </span><span class="delete">Figure 1</span><span class="delete"> 
-are listed in
-the third column of </span><span class="delete">Table 2</span><span class="delete">. These names are part of a textual notation to write instances of the PROV data model, which we introduce in the next section. </span><span class="insert">a physical transformation such as the melting of ice into water.</span></p>
+linked data set, the transformation of a canvas into a painting, the transportation of a work of art from London to New York,</span> and <span class="delete">relations introduced </span><span class="insert">a physical transformation such as the melting of ice into water.</span></p>
+</div>
+
+
+<p><span class="insert">
+While the basic idea is simple, the concept of derivation can be quite
+subtle: implicit is the notion that the generated entity was affected
+</span>in <span class="delete">this section (</span><span class="delete">Section 2</span><span class="delete">), exploited in the example discussed in </span><span class="delete">Section 3</span><span class="delete">,</span><span class="insert">some way by the used entity.  
+
+If an artifact
+was used by an activity that also generated a new artifact, it does not always follow
+that the second artifact was derived from the first.  In the activity
+of creating a painting, an artist may have mixed some paint that was
+never actually applied to the canvas: the painting would typically
+not be considered a derivation from the unused paint.  
+
+PROV does not attempt to specify the conditions under which derivations
+exist; rather, derivation is considered to have been determined by unspecified means. 
+Thus, while a chain of usage</span> and <span class="delete">explained in detail in </span><span class="delete">Section 4</span><span class="delete">.
+Names of relations depicted in </span><span class="delete">Figure 1</span><span class="insert">generation is necessary for a
+derivation to hold between entities, it is not sufficient; some
+form of influence occurring during the activities involved is also needed.</span> 
+<span class="delete">are listed in
+the third column of </span><span class="delete">Table 2</span><span class="delete">. These names are part of a textual notation to write instances of the PROV data model, which we introduce in the next section. </span></p>
 
 
 
@@ -1684,8 +1764,6 @@
 
 </div>
 
-</div>
-
 <div id="section-extended-structures" class="section"> 
 <h3><span class="secno"><span class="delete">2.6 </span><span class="delete">PROV-N: </span><span class="insert">2.2 </span></span><span class="insert">PROV Extended Structures</span></h3>
 
@@ -2932,6 +3010,9 @@
 </li><li><span class="attribute" id="association.attributes">attributes</span>: an <em class="rfc2119" title="optional">optional</em> set (<span class="name">attrs</span>) of attribute-value pairs representing additional information about this association of this activity with this agent.</li>
 </ul></div>
 
+<p><span class="insert">While each of </span><a href="#association.id"><span class="attribute"><span class="insert">id</span></span></a><span class="insert">, </span><a href="#association.agent"><span class="attribute"><span class="insert">agent</span></span></a><span class="insert">,  </span><a href="#association.plan"><span class="attribute"><span class="insert">plan</span></span></a><span class="insert">, and  </span><a href="#association.attributes"><span class="attribute"><span class="insert">attributes</span></span></a><span class="insert"> is </span><em class="rfc2119" title="optional"><span class="insert">optional</span></em><span class="insert">, at least one of them </span><em class="rfc2119" title="must"><span class="insert">must</span></em><span class="insert"> be present.</span></p>
+
+
 <div class="anexample" id="anexample-wasAssociatedWith" count="31">
 <p>In the following example, a designer <span class="insert">agent </span>and an operator <span class="delete">agents</span><span class="insert">agent</span> are associated with an activity. The designer's goals are achieved by a workflow <span class="name">ex:wf</span>, described as an entity of type <span class="name"><a href="#concept-plan" class="internalDFN">plan</a></span>.   </p>
 <pre class="codeexample">activity(ex:a, <span class="delete">[prov:type="workflow execution"])</span><span class="insert">[ prov:type="workflow execution" ])</span>
@@ -4214,12 +4295,10 @@
 <h5><span class="secno"><span class="delete">4.7.4.3</span><span class="insert">5.7.4.3</span> </span>prov:role</h5>
 
 
-
-
-<p><span class="glossary-ref">   The attribute <dfn id="concept-role" title="role"><span class="name">prov:role</span></dfn>  denotes the function of an entity with respect to an activity, in the context of a <span class="delete">usage, generation,
- association,</span><a href="#concept-usage" class="internalDFN"><span class="insert">usage</span></a><span class="insert">, </span><a href="#concept-generation" class="internalDFN"><span class="insert">generation</span></a><span class="insert">,</span>  <span class="delete">start,</span><a href="#concept-activityAssociation" class="internalDFN"><span class="insert">association</span></a><span class="insert">,  </span><a href="#concept-start" class="internalDFN"><span class="insert">start</span></a><span class="insert">,</span> and  <span class="delete">end. </span><a href="#concept-end" class="internalDFN"><span class="insert">end</span></a><span class="insert">. </span></span></p>
-
-<p>
+<p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-role"><span class="insert">role</span></dfn><span class="insert"> is the function of an entity with respect to an activity, in the context of a </span><a href="#concept-usage" class="internalDFN"><span class="insert">usage</span></a><span class="insert">, </span><a href="#concept-generation" class="internalDFN"><span class="insert">generation</span></a><span class="insert">, </span><a href="#concept-activityAssociation" class="internalDFN"><span class="insert">association</span></a><span class="insert">,  </span><a href="#concept-start" class="internalDFN"><span class="insert">start</span></a><span class="insert">, and  </span><a href="#concept-end" class="internalDFN"><span class="insert">end</span></a><span class="insert">. </span></span></p>
+
+<p><span class="delete">The attribute </span><span class="delete">prov:role</span><span class="delete">  denotes the function of an entity with respect to an activity, in the context of a usage, generation,
+ association,  start, and  end. </span>
 The attribute <span class="name">prov:role</span> is allowed to occur multiple times in a list of attribute-value pairs. The value associated with a <span class="name">prov:role</span> attribute <em class="rfc2119" title="must">must</em> be a PROV-DM <a title="value" href="#concept-value" class="internalDFN">Value</a>.</p>
 
 <div class="anexample" id="anexample-role" count="56">
@@ -4319,7 +4398,7 @@
 <div id="term-attribute-provenance-uri" class="section">
 <h5><span class="secno"><span class="insert">5.7.4.6 </span></span><span class="insert">prov:provenance-uri</span></h5>
 
-<p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-provenance-uri"><span class="insert">provenance-URI</span></dfn><span class="insert">  the IRI denoting some provenance information. </span></span><span class="insert"> (See  </span><a href="http://www.w3.org/TR/prov-aq/#dfn-provenance-uri"><span class="insert">Provenance-URI</span></a><span class="insert"> in [</span><cite><a class="bibref" rel="biblioentry" href="#bib-PROV-AQ"><span class="insert">PROV-AQ</span></a></cite><span class="insert">].) </span></p>
+<p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-provenance-uri"><span class="insert">provenance-URI</span></dfn><span class="insert"> is the IRI denoting some provenance information. </span></span><span class="insert"> (See  </span><a href="http://www.w3.org/TR/prov-aq/#dfn-provenance-uri"><span class="insert">Provenance-URI</span></a><span class="insert"> in [</span><cite><a class="bibref" rel="biblioentry" href="#bib-PROV-AQ"><span class="insert">PROV-AQ</span></a></cite><span class="insert">].) </span></p>
 
 <p><span class="insert"> The attribute </span><dfn title="provenance-uri-attribute" id="dfn-provenance-uri-attribute"><span class="name"><span class="insert">prov:provenance-uri</span></span></dfn><span class="insert"> provides
   an </span><em class="rfc2119" title="optional"><span class="insert">optional</span></em> <a href="#concept-provenance-uri" class="internalDFN"><span class="insert">provenance-URI</span></a><span class="insert">.</span></p>
@@ -4347,7 +4426,7 @@
 
 <p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-provenance-service"><span class="insert">provenance service</span></dfn><span class="insert">   is a service that provides provenance information or a </span><a href="#concept-provenance-uri" class="internalDFN"><span class="insert">provenance-URI</span></a><span class="insert"> given an </span><a href="#dfn-identifier" class="internalDFN"><span class="insert">identifier</span></a><span class="insert">. </span></span><span class="insert"> (See </span><a href="http://www.w3.org/TR/prov-aq/#dfn-provenance-service"><span class="insert">provenance service</span></a><span class="insert"> in [</span><cite><a class="bibref" rel="biblioentry" href="#bib-PROV-AQ"><span class="insert">PROV-AQ</span></a></cite><span class="insert">].)</span></p>
 
-<p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-service-uri"><span class="insert">service-URI</span></dfn><span class="insert">  the IRI of a </span><a href="#concept-provenance-service" class="internalDFN"><span class="insert">provenance service</span></a><span class="insert">. </span></span><span class="insert"> (See </span><a href="http://www.w3.org/TR/prov-aq/#dfn-service-uri"><span class="insert">Service-URI</span></a><span class="insert"> in [</span><cite><a class="bibref" rel="biblioentry" href="#bib-PROV-AQ"><span class="insert">PROV-AQ</span></a></cite><span class="insert">].)</span></p>
+<p><span class="glossary-ref"><span class="insert">   A </span><dfn id="concept-service-uri"><span class="insert">service-URI</span></dfn><span class="insert"> is the IRI of a </span><a href="#concept-provenance-service" class="internalDFN"><span class="insert">provenance service</span></a><span class="insert">. </span></span><span class="insert"> (See </span><a href="http://www.w3.org/TR/prov-aq/#dfn-service-uri"><span class="insert">Service-URI</span></a><span class="insert"> in [</span><cite><a class="bibref" rel="biblioentry" href="#bib-PROV-AQ"><span class="insert">PROV-AQ</span></a></cite><span class="insert">].)</span></p>
 
 
 <p><span class="insert"> The </span>attribute <dfn title="service-uri-attribute" id="dfn-service-uri-attribute"><span class="name"><span class="delete">prov:value</span><span class="delete"> is </span><span class="insert">prov:service-uri</span></span></dfn><span class="insert"> provides
@@ -4589,6 +4668,7 @@
 </dd><dt id="bib-URI">[URI]</dt><dd>T. Berners-Lee; R. Fielding; L. Masinter. <a href="http://www.ietf.org/rfc/rfc3986.txt"><cite>Uniform Resource Identifiers (URI): generic syntax.</cite></a> January 2005. Internet RFC 3986. URL: <a href="http://www.ietf.org/rfc/rfc3986.txt">http://www.ietf.org/rfc/rfc3986.txt</a> 
 </dd><dt id="bib-XMLSCHEMA-2">[XMLSCHEMA-2]</dt><dd>Paul V. Biron; Ashok Malhotra. <a href="http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/"><cite>XML Schema Part 2: Datatypes Second Edition.</cite></a> 28 October 2004. W3C Recommendation. URL: <a href="http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/">http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/</a> 
 </dd></dl></div><div id="informative-references" class="section"><h3><span class="secno">B.2 </span>Informative references</h3><dl class="bibliography"><dt id="bib-Logic">[Logic]</dt><dd>W. E. Johnson <a href="http://www.ditext.com/johnson/intro-3.html"><cite>Logic: Part III</cite></a>.1924. URL: <a href="http://www.ditext.com/johnson/intro-3.html">http://www.ditext.com/johnson/intro-3.html</a>
+</dd><dt id="bib-Mappings"><span class="insert">[Mappings]</span></dt><dd><span class="insert">Satya Sahoo and Paul Groth and Olaf Hartig and Simon Miles and Sam Coppens and James Myers and Yolanda Gil and Luc Moreau and Jun Zhao and Michael Panzer and Daniel Garijo </span><a href="http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings"><cite><span class="insert">Provenance Vocabulary Mappings</span></cite></a><span class="insert">. August 2010 URL: </span><a href="http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings"><span class="insert">http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings</span></a>
 </dd><dt id="bib-PROV-AQ">[PROV-AQ]</dt><dd>Graham Klyne and Paul Groth (eds.) Luc Moreau, Olaf Hartig, Yogesh Simmhan, James Meyers, Timothy Lebo, Khalid Belhajjame, and Simon Miles <a href="http://www.w3.org/TR/prov-aq/"><cite>Provenance Access and Query</cite></a>. 2011, Working Draft. URL: <a href="http://www.w3.org/TR/prov-aq/">http://www.w3.org/TR/prov-aq/</a>
 </dd><dt id="bib-PROV-CONSTRAINTS">[PROV-CONSTRAINTS]</dt><dd>James Cheney, Paolo Missier, and Luc Moreau (eds.) <a href="http://www.w3.org/TR/prov-constraints/"><cite>Constraints of the PROV Data Model</cite></a>. 2011, Working Draft. URL: <a href="http://www.w3.org/TR/prov-constraints/">http://www.w3.org/TR/prov-constraints/</a>
 </dd><dt id="bib-PROV-N">[PROV-N]</dt><dd>Luc Moreau and Paolo Missier (eds.) <a href="http://www.w3.org/TR/prov-n/"><cite>PROV-N: The Provenance Notation</cite></a>. 2011, Working Draft. URL: <a href="http://www.w3.org/TR/prov-n/">http://www.w3.org/TR/prov-n/

--- a/model/prov-dm.html	Mon May 28 17:30:44 2012 +0100
+++ b/model/prov-dm.html	Mon May 28 23:10:32 2012 +0100
@@ -177,6 +177,13 @@
           "version 2.0, 2005 "+
           "URL: <a href=\"http://www.omg.org/spec/UML/2.0/Superstructure/PDF/\">http://www.omg.org/spec/UML/2.0/Superstructure/PDF/</a>",
 
+        "Mappings":
+          "Satya Sahoo and Paul Groth and Olaf Hartig and Simon Miles and Sam Coppens and James Myers and Yolanda Gil and Luc Moreau and Jun Zhao and Michael Panzer and Daniel Garijo "+
+          "<a href=\"http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings\"><cite>Provenance Vocabulary Mappings</cite></a>. "+
+          "August 2010 "+
+          "URL: <a href=\"http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings\">http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings</a>",
+
+
 
       };
       var respecConfig = {
@@ -276,9 +283,11 @@
 
     <section id="abstract">
 <p>
-PROV-DM, the PROV conceptual data model, is a data model for provenance that describes
-the entities, people and activities involved in
-producing a piece of data or thing. 
+Provenance is information about entities, activities, and people
+involved in producing a piece of data or thing, which can be used
+ to form assessments about its quality, reliability or trustworthiness.
+PROV-DM is the conceptual data model that forms a basis for the W3C
+provenance (PROV) family of specifications.
 PROV-DM distinguishes core structures, forming the essence of provenance descriptions, from
 extended structures catering for more advanced uses of provenance. 
 PROV-DM is organized in six components, respectively dealing with: 
@@ -374,8 +383,8 @@
 
 
 <p>
-We
-consider a generic data model for provenance that allows  domain and application specific representations of provenance to be translated into such a data model and  <em>interchanged</em> between systems.
+We present the PROV data model,
+a generic data model for provenance that allows  domain and application specific representations of provenance to be translated into such a data model and  <em>interchanged</em> between systems.
 Thus, heterogeneous systems can export their native provenance into such a core data model, and applications that need to make sense of provenance can then import it,
 process it, and reason over it.</p>
 
@@ -404,7 +413,9 @@
 The PROV data model distinguishes <em>core structures</em> from
 <em>extended structures</em>: core structures form the essence of
 provenance descriptions, and are commonly found in various
-domain-specific vocabularies. Extended structures enhance and refine core
+domain-specific vocabularies that deal
+with provenance or similar kinds of information [[Mappings]].
+Extended structures enhance and refine core
 structures with more expressive capabilities to cater for more
 advanced uses of provenance.
 The  PROV data model, comprising both core and extended structures, is a domain-agnostic model, but with clear extensibility points allowing further domain-specific and
@@ -429,14 +440,19 @@
 With these, it becomes possible to write useful provenance descriptions, and publish or embed them alongside the data they relate to. </p>
 
 <p>However, if something about which provenance is expressed is subject to change, then it is challenging to express its provenance precisely (e.g. the data from which a daily weather report is derived  changes from day to day).
- To address this challenge, it is proposed to enrich simple provenance, with refined descriptions that  help qualify the specific subject of provenance and provenance itself, with attributes and temporal information, intended to satisfy a comprehensive set of constraints.  These aspects are covered in the companion specification [[PROV-CONSTRAINTS]].
+This is addressed in a companion
+ specification [[PROV-CONSTRAINTS]] by proposing formal constraints on
+ the way that provenance descriptions are related to the things they
+ describe (such as the use of attributes, temporal information and
+ specialization of entities), and additional conclusions that are valid
+ to infer.
 </p>
 
 
 <section id="structure-of-this-document"> 
 <h3>Structure of this Document</h3>
 
-<p><a href="#section-prov-overview">Section 2</a> provides an overview of the PROV Data Model,  distinguishing a core set of types and  relations, commonly found in provenance descriptions, from extended structures catering for advanced uses. It also introduces a modular organization of the data model in components. </p>
+<p><a href="#section-prov-overview">Section 2</a> provides an overview of the PROV Data Model,  distinguishing a core set of types and  relations, commonly found in provenance descriptions, from extended structures catering for more specific uses. It also introduces a modular organization of the data model in components. </p>
 
 <p><a href="#prov-notation">Section 3</a> overviews the Provenance Notation used to illustrate examples of provenance descriptions.</p>
 
@@ -503,10 +519,13 @@
 <section id='core-structures'> 
 <h1>PROV Core Structures</h1>
 
-<p>The core of PROV consists of essential provenance structures commonly found in provenance descriptions.
-It is summarized graphically by
-the UML diagram of <a href="#prov-core-structures-top">Figure 1</a>,
-illustrating  three types (entity, activity, and agent) and how they relate to each other.  In the core of PROV, all associations are binary. </p>
+<p>At its core, provenance describes the use and production of
+<em>entities</em> by <em>activities</em>, which may be 
+controlled or influenced in
+various ways by <em>agents</em>.  These core types and their relationships
+are illustrated
+by
+the UML diagram of <a href="#prov-core-structures-top">Figure 1</a>.</p>
 
 
 <div style="text-align: center; ">
@@ -520,7 +539,7 @@
 <p>The concepts found in the core of PROV are introduced in the rest of this section.
 They are summarized in  <a href="#overview-types-and-relations">Table 2</a>, where they are categorized as
  type or relation.
- The first column lists concepts, the second column indicates whether a concept maps to a type or a relation, whereas the third column contains the corresponding name.    Names of relations have a verbal form in the past tense to express what happened in the past, as opposed to what may or will happen. 
+ The first column lists concepts, the second column indicates whether a concept maps to a type or a relation, whereas the third column contains the corresponding name.    Names of relations have a verbal form in the past tense to express what happened in the past, as opposed to what may or will happen. In the core of PROV, all relations are binary. 
 </p>
 
 
@@ -595,7 +614,14 @@
 
 
 <p>
-<span class="glossary-ref" data-ref="glossary-activity"  data-withspan="true"></span> Activities that operate on digital entities may for example move, copy, or duplicate them.
+<span class="glossary-ref" data-ref="glossary-activity"  data-withspan="true"></span> 
+Just as entities cover a broad range of notions, 
+activities can cover a broad range of
+notions:
+information processing activities
+ may for example move, copy, or duplicate  digital entities;
+ physical activities can include
+ driving a car from Boston to Cambridge.
 </p>
 
 
@@ -625,6 +651,27 @@
 publication of a new version of a document.
 </div>
 
+<p>
+One might reasonably ask what entities are used and generated by
+driving a car from Boston to Cambridge.  This is answered by
+considering that a single artifact may
+correspond to several entities; in this case, a car in Boston may be a
+different entity from a car in Cambridge.  
+Thus, among other things,
+an entity "car in Boston" would be used, and a new entity "car in
+Cambridge" would be generated by this activity of driving.  The
+provenance trace of the car might include: designed in Japan,
+manufactured in Korea, shipped to Boston USA, purchased by customer,
+driven to Cambridge, serviced by engineer in Cambridge, etc., all of
+which might be important information when deciding whether or not it
+represents a sensible second-hand purchase.  Alternatively, some of it
+might be relevant when trying to determine the truth of a web
+page reporting a traffic violation involving that car.  This breadth
+of provenance allows descriptions of interactions between physical and
+digital artifacts.
+</p>
+
+
 
 
 <div class="anexample conceptexample" id="usage-example">
@@ -648,6 +695,9 @@
 <p>
 The activity of writing a celebrity article was informed by (a
 communication instance) the activity of intercepting voicemails.
+ The activity of purchasing a
+ car in Boston can be informed by the activity of its being
+ designed in Japan.
 </div>
 
 
@@ -658,7 +708,15 @@
 <section id="section-agents-attribution-association-delegation"> 
 <h2>Agents and Responsibility</h2>
 
-<p>The motivation for introducing  agents in the model is to express the agent's responsibility for activities that happened and entities that were generated. </p>
+<p>For many purposes, a key consideration
+ for deciding whether something is reliable and/or trustworthy is
+ knowing who or what <em>was responsible</em> for its production.  Data published by
+ a respected independent organization may be considered more
+ trustworthy than that from a lobby organization; a claim by a
+ well-known scientist with an established track record may be more
+ readily believed than a claim by a new student; a calculation performed by an
+ established software library may be more reliable than one performed by a one-off
+ program.</p>
 
 <p>
 <span class="glossary-ref" data-ref="glossary-agent"  data-withspan="true">
@@ -736,18 +794,43 @@
 
 
 
-<p>Activities utilize entities and produce entities. In some cases, utilizing an entity influences the creation of another in some way. This notion of 'influence' is captured by derivations, defined as follows.</p>
+<p>Activities utilize entities and produce entities. In some cases, utilizing an entity influences the creation of another in some way. This notion of 'influence' is captured by derivations, defined as follows.
+</p>
 
 <p>
 <span class="glossary-ref" data-ref="glossary-derivation"  data-withspan="true"></span>
 
 
 
+
+
 <div class="anexample conceptexample" id="derivation-example">
 <p>Examples of derivation include  the transformation of a relational table into a
 linked data set, the transformation of a canvas into a painting, the transportation of a work of art from London to New York, and a physical transformation such as the melting of ice into water.</p>
 </div>
 
+
+<p>
+While the basic idea is simple, the concept of derivation can be quite
+subtle: implicit is the notion that the generated entity was affected
+in some way by the used entity.  
+
+If an artifact
+was used by an activity that also generated a new artifact, it does not always follow
+that the second artifact was derived from the first.  In the activity
+of creating a painting, an artist may have mixed some paint that was
+never actually applied to the canvas: the painting would typically
+not be considered a derivation from the unused paint.  
+<!-- The provenance
+model does not attempt to define what constitutes derivation; rather,
+it is considered to be something that is asserted, having been
+determined by unspecified means. -->
+PROV does not attempt to specify the conditions under which derivations
+exist; rather, derivation is considered to have been determined by unspecified means. 
+Thus, while a chain of usage and generation is necessary for a
+derivation to hold between entities, it is not sufficient; some
+form of influence occurring during the activities involved is also needed. 
+</p>
 </section>
 
 </section>

author	Luc Moreau <l.moreau@ecs.soton.ac.uk>
	Mon, 28 May 2012 23:10:32 +0100
changeset 3025	1f3c4e5b7a41
parent 3024	454b2a25df6e
child 3026	299a9ecd8e26

model/comments/wd6-Graham.txt
model/diff.html
model/prov-dm.html