changes following yolanda's email
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Fri, 16 Dec 2011 12:36:02 +0000
changeset 1279 b767e2123b0b
parent 1278 38fcd3e5095d
child 1280 e956dee51080
model/ProvenanceModel.html
model/comments/comments-from-yolanda.txt
--- a/model/ProvenanceModel.html	Thu Dec 15 18:38:34 2011 +0000
+++ b/model/ProvenanceModel.html	Fri Dec 16 12:36:02 2011 +0000
@@ -82,8 +82,8 @@
  
           // if there is a previously published draft, uncomment this and set its YYYY-MM-DD date
           // and its maturity status
-          previousPublishDate:  "2011-10-18",
-          previousMaturity:  "FPWD",
+          previousPublishDate:  "2011-12-15",
+          previousMaturity:  "WD",
  
           // if there a publicly available Editor's Draft, this is the link
           edDraftURI:           "http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html",
@@ -192,7 +192,7 @@
 </h2> 
 
 <p> 
-For the purpose of this specification, provenance is defined as a record that describes the people,
+For the purpose of this specification, <dfn>provenance</dfn> is defined as a record that describes the people,
 institutions, entities, and activities, involved in producing,
 influencing, or delivering a piece of data or a thing in the world.
 In particular, the provenance of information is crucial in deciding
@@ -333,9 +333,9 @@
 <p>We do not assume that any characterization is more important than any other, and in fact, it is possible to describe the processing that occurred for the report to be commissioned, for individual versions to be created, for those versions to be published at the given URL, etc., each via a different entity that characterizes the report appropriately.</p>
 
 <p>In the world, <dfn id="concept-activity">activities</dfn> involve
-entities in multiple ways: they consume them, they process them, they
-transform them, they modify them, they change them, they relocate
-them, they use them, they generate them, they are controlled by them,
+entities in multiple ways:  consuming them,  processing them, 
+transforming them,  modifying them,  changing them,  relocating
+them,  using them,  generating them, being controlled by them,
 etc.</p>
 
 
@@ -344,6 +344,14 @@
 
 <p> Even software agents can be assigned some responsibility for the effects they have in the world, so for example if one is using a Text Editor and one's laptop crashes, then one would say that the Text Editor was responsible for crashing the laptop.  If one invokes a service to buy a book, that service can be considered responsible for drawing funds from one's bank to make the purchase (the company that runs the service and the web site would also be responsible, but the point here is that we assign some measure of responsibility to software as well).  So when someone models software as an agent for an activity in our model, they mean the agent has some responsibility for that activity.
 </p>
+
+<p>PROV-DM considers agents as a type of entity so that the model can be
+ used to represent the provenance of the agents themselves.  For
+ example, a spellchecker software may be an agent of a document
+ preparation activity, but itself can have a provenance record that
+ states who its vendor is.</p>
+
+
 <p> In this specification, the qualifier 'identifiable' is implicit whenever a reference is made to an activity, agent, or an entity.</p>
 
 </section>
@@ -530,13 +538,13 @@
 <p>This specification includes a grammar for PROV-ASN expressed using the Extended  Backus-Naur Form (EBNF) notation.</p>
 
 <div class="grammar">
-<p> Each rule in the grammar defines one symbol, in the form:</p>
+<p> Each production rule (or <dfn>production</dfn>, for short) in the grammar defines one non-terminal symbol, in the form:</p>
 <p>
 <span class="nonterminal">E</span>&nbsp;::= <em>expression</em>
 </p>
 
 
-Within the expression on the right-hand side ofa rule, the follwoing expressions are used to match strings of one or more characters:
+Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:
 <ul>
 <li> 
 <span class="nonterminal">E</span>: matches term satisfying rule for symbol E.
@@ -573,6 +581,7 @@
 
 <div style="text-align: center;">
   <img src="overview.png" alt="PROV-DM overview"/>
+<figcaption>PROV-DM overview</figcaption>
 </div>
 
 <div class="note"> Overview diagram does not represent the sub-relations -- proposal to use a UML notation instead of ER.</div>
@@ -599,6 +608,10 @@
 The attributes <a title="prov:role">role</a> and <a title="prov:type">type</a> are pre-defined.  
 </p>
 
+<p>
+In addition to the kinds of record introduced in the overview figure, PROV-DM also features a notion of <a title="account record">Account Record</a> that allows attribution of provenance records to be expressed.
+</p>
+
 
 <p>The set of relations presented here forms a core, which is further extended with additional relations, defined in Section <a href="#common-relations">Common Relations</a>.</p>
 
@@ -657,7 +670,7 @@
 <p>
 <a>Event</a> evt4: David edits (a3) file /share/crime.txt as follows.</p>
 <pre>
-There was a lot of crime in London and New-York last month.
+There was a lot of crime in London and New York last month.
 </pre>
 <p>
 We denote the revised file e3.
@@ -720,7 +733,9 @@
 
 <p>
 Generation Records (described in <a href="#record-Generation">Section Generation</a>) represent the <a>event</a> at which a file is created in a specific form. Attributes are used to describe the modalities according to which a given entity is generated by a given activity.  The interpretation of attributes is application specific. Illustrations of such attributes for the scenario are: no attribute is provided for <span class="name">e0</span>;
-<span class="name">e2</span> was generated by the editor's  save function;  <span class="name">e4</span> can be found on the smtp port, in the attachment section of the mail message; <span class="name">e6</span> was produced on the standard output of <span class="name">a5</span>. Two identifiers <span class="name">g1</span> and <span class="name">g2</span> identify the generation records referenced in derivations introduced below.</p>
+<span class="name">e2</span> was generated by the editor's  save function;  <span class="name">e4</span> can be found on the smtp port, in the attachment section of the mail message; <span class="name">e6</span> was produced on the standard output of <span class="name">a5</span>. 
+Sometimes, it is necessary to refer to generation records in other records. For those cases, we introduce
+ identifiers such as <span class="name">g1</span> and <span class="name">g2</span> to  identify the generation records; these identifiers are used in derivations introduced below to reference those specific records.</p>
 <pre>
 wasGeneratedBy(e0, a0)
 wasGeneratedBy(e1, a0, [ex:fct="create"])
@@ -737,7 +752,8 @@
 Usage Records (described in <a href="#record-Usage">Section Usage</a>) represent the <a>event</a> by which a file is read by an activity. 
 
 Likewise, attributes describe the modalities according to which the various entities are used by activities.  Illustrations of such attributes are: 
-<span class="name">e1</span> is used in the context of  <span class="name">a1</span>'s <span class="name">load</span> functionality; <span class="name">e2</span> is used by <span class="name">a2</span> in the context of its attach functionality; <span class="name">e3</span> is used on the standard input by <span class="name">a5</span>. Two identifiers <span class="name">u1</span> and <span class="name">u2</span> identify the Usage records referenced in derivations introduced below.</p>
+<span class="name">e1</span> is used in the context of  <span class="name">a1</span>'s <span class="name">load</span> functionality; <span class="name">e2</span> is used by <span class="name">a2</span> in the context of its attach functionality; <span class="name">e3</span> is used on the standard input by <span class="name">a5</span>. 
+Sometimes, it is also necessary to refer to usage records in other records. For those cases, we introduce identifiers such as <span class="name">u1</span> and <span class="name">u2</span> to identify the usage records; these identifiers are used in derivations introduced below to reference those specific records.</p>
 <pre>
 used(a1,e1,[ex:fct="load"])
 used(a3,e2,[ex:fct="load"])
@@ -804,7 +820,7 @@
 <p>
 Provenance assertions can be <em>illustrated</em> graphically. The illustration is not intended to represent all the details of the model, but it is intended to show the essence of a set of provenance assertions.  Therefore, it cannot be seen as an alternate notation for expressing provenance.</p>
 
-<p>The graphical illustration takes the form of a graph. Entities, activities and agents are represented as nodes, with oval, rectangular, and half-hexagonal shapes, respectively.  Usage, Generation, Derivation, Activity Association, and Complementarity are represented as directed edges.</p>
+<p>The graphical illustration takes the form of a graph. Entities, activities and agents are represented as nodes, with oval, rectangular, and pentagonal shapes, respectively.  Usage, Generation, Derivation, Activity Association, and Complementarity are represented as directed edges.</p>
 
 <p>Entities are laid out according to the ordering of their generation event.  We endeavor to show time progressing from left to right.  This means that edges for Usage, Generation and Derivation typically point from right to left.</p>
 
@@ -839,13 +855,43 @@
       
 <h3>Record</h3>
 
-<p>PROV-DM consists of a set of constructs, referred to as <em>records</em>, to formulate representations of the world and constraints that must be satisfied by them.</p>
+<p>This specification introduced <a>provenance</a> as a record
+describing the people, institutions, entities, and activities,
+involved in producing, influencing, or delivering a piece of data or a
+thing in the world. PROV-DM is a data model defining the structure and
+meaning of such records.</p>
+
+
+<p>Concretely, PROV-DM consists of a set of constructs to formulate representations of the world and constraints that must be satisfied by them.
+A PROV-DM construct is referred to as a <em>record</em>, a body of information about something which is of interest from a provenance viewpoint.   PROV-DM records may be asserted directly or may be inferred from others.  
+</p>
+
+<p>
+PROV-DM records are typed and can be among the following types, introduced one by one in this section:
+<a>entity record</a>,
+<a>activity record</a>,
+<a>agent record</a>,
+<a>note record</a>,
+<a>generation record</a>,
+<a>usage record</a>,
+<a>derivation record</a>,
+<a>activity association record</a>,
+<a>responsibility record</a>,
+<a>start record</a>,
+<a>end record</a>,
+<a>complementarity record</a>, 
+<a>annotation record</a>, and
+<a>account record</a>.
+</p>
+
 
 <p>
 Furthermore,  PROV-DM includes a "house-keeping construct", a record container,
- used to wrap PROV-DM records and facilitate their interchange.</p>
-
-<p>In PROV-ASN, such representations of the world MUST be conformant with the toplevel production <span class="nonterminal">record</span> of the grammar. These <span class="nonterminal">record</span>s are grouped in three categories:
+ used to wrap PROV-DM records and facilitate their interchange.
+Hence, by creating a set of PROV-DM records and packaging them into a record container, 
+one forms a provenance record. </p>
+
+<p>In PROV-ASN, such representations of the world MUST be conformant with the toplevel <a>production</a> <span class="nonterminal">record</span> of the grammar. These <span class="nonterminal">record</span>s are grouped in three categories:
 <span class="nonterminal">elementRecord</span> (see section <a href="#record-element">Element</a>),
 <span class="nonterminal">relationRecord</span>  (see section <a href="#record-relation">Relation</a>), and
 <span class="nonterminal">accountRecord</span> (see section <a href="#record-Account">Account</a>).</p>
@@ -950,7 +996,7 @@
 
 Further considerations:
 <ul>
-<li>If an asserter wishes to characterize an entity  with the same attribute-value pairs over several intervals, then they are required to assert multiple entity records, each with its own identifier (so as to allow potential dependencies between the various entity records to be expressed).  </li>
+<li>If an asserter wishes to characterize an entity  with the same attribute-value pairs over several intervals, then they are required to create multiple entity records (either by direct assertion or by inference), each with its own identifier (so as to allow potential dependencies between the various entity records to be expressed).  </li>
 
 <li>There is no assumption that the set of attributes is complete and that the attributes are independent/orthogonal of each other.</li>
 
@@ -1010,16 +1056,12 @@
 <span class="name">activity</span>
 <span class="name">(</span>
 <span class="nonterminal">identifier</span>
-<span class="optional"><span class="name">,</span>
-<!--
-<span class="nonterminal">recipeLink</span> </span>
 <span class="name">,</span>
--->
 <span class="optional"><span class="nonterminal">time</span></span>
 <span class="name">,</span>
 <span class="optional"><span class="nonterminal">time</span></span>
 <span class="nonterminal">optional-attribute-values</span>
-<span class="name">)</span><br/>
+<span class="name">)</span>
 </div>
 
 <div class="anexample">
@@ -1053,6 +1095,7 @@
 'continuant' and 'occurrent' in logic [[Logic]].</p>
 
 
+
 </section> 
 
 <section id="record-Agent">
@@ -1128,6 +1171,8 @@
 the record <span class="name">agent(e,attrs)</span> also holds.
 </div>
 
+<div class='issue'> Shouldn't we allow for entities (not agent) to be associated with an activity?  Should we delete the constraint association-agent? <a href="http://www.w3.org/2011/prov/track/issues/203">ISSUE-203</a>.</div>
+
 </section>
 
    <section id="record-note"> 
@@ -1159,14 +1204,16 @@
 
 <div class="anexample">
 <p>
-The following note record</p>
+The following note record consists of a set of application-specific attribute-value pairs, intended
+to help the rendering of the record it is associated with, by
+specifying its color and its position on the screen.</p>
 <pre class="codeexample">
 note(ann1,[ex:color="blue", ex:screenX=20, ex:screenY=30])
+hasAnnotation(g1,ann1)
 </pre>
-<p>consists of a set of application-specific attribute-value pairs, intended
-to help the rendering of the record it is associated with, by
-specifying its color and its position on the screen.  In this example,
-these attribute-value pairs do not constitute a representation of something
+<p>The note record is associated with a record <span class="name">g1</span> previously introduced (<a title="annotation record">hasAnnotation</a> is 
+discussed in Section <a href="#record-annotation">Annotation Record</a>).  In this example,
+the attribute-value pairs do not constitute a representation of something
 in the world; they are just used to help render provenance.
 </p>
 </div>
@@ -1680,7 +1727,7 @@
 </div>
 
 <div class="anexample">
-In the following example, a programmer, a researcher and a funder agents are asserted.  The porgrammer and researcher are associated with a workflow activity.  The programmer acts on behalf of the researcher (delegation) encoding the commands specified by the researcher; the researcher acts on behalf of the funder, who has an contractual agreement with the researcher.
+In the following example, programmer, researcher, and funder agents are asserted.  The programmer and researcher are associated with a workflow activity.  The programmer acts on behalf of the researcher (delegation), encoding the commands specified by the researcher; the researcher acts on behalf of the funder, who has a contractual agreement with the researcher. The terms 'delegation' and 'contract' used in this example are domain specific.
 <pre class="codeexample">
 activity(a,[prov:type="workflow"])
 agent(ag1,[prov:type="programmer"])
@@ -1702,7 +1749,7 @@
 <p>In PROV-DM, a <dfn id="dfn-Derivation">derivation record</dfn> is a representation that some entity is transformed from, created from, or affected by another entity in the world.  </p>
 
 
-<p>Examples of derivation include the transformation of a canvas into a painting, the transportation of a person from London to New-York, the transformation of a relational table into a linked data set, and the melting of ice into water.</p>
+<p>Examples of derivation include the transformation of a canvas into a painting, the transportation of a person from London to New York, the transformation of a relational table into a linked data set, and the melting of ice into water.</p>
 
 
 <p>According to <a href="#conceptualization">Section Conceptualization</a>, for an entity to be transformed from, created from, or affected by another in some way, there must be some underpinning activities performing the necessary actions resulting in such a derivation.  
@@ -1742,8 +1789,11 @@
 
 <p> We note that the fourth theoretical case of a precise derivation, where the number of activities is not known or asserted cannot occur. </p>
 
-
-<p>The three kinds of derivation records are successively introduced.  To minimize the number of relation types in PROV-DM, we introduce a PROV-DM reserved attribute <span class="name">steps</span>, which allows us to distinguish the various derivation types. </p>
+<p>In order to represent the number of activities in a derivation, we introduce a PROV-DM attribute <span class="name">steps</span>, which can take two possible values.  
+When <span class="name">prov:steps="1"</span>, derivation is due to one activity; when <span class="name">prov:steps="n"</span>, the number of activities is not known.</p>
+
+
+<p>The three kinds of derivation records are successively introduced.  Making use of the attribute <span class="name">steps</span>, we can distinguish the various derivation types.</p>
 
 <p>A <dfn>precise-1 derivation record</dfn>, written <span class="name">wasDerivedFrom(id, e2, e1, a, g2, u1, attrs)</span> in PROV-ASN, contains:</p>
 <ul>
@@ -2311,6 +2361,10 @@
 <section id="record-Account">
 <h3>Account Record</h3>
 
+<p>It is common for multiple provenance records to co-exist. For instance, when emailing
+ a file, there could be a provenance record kept by the mail client,
+ and another by the mail server. Such provenance records may provide different explanations about something happening in the world, because they are created by different parties or observed by different witnesses. A given party could also create multiple provenance records about an execution, to capture different levels of detail, targeted at different end-users: the programmer of an experiment may be interested in a detailed log of execution, while the scientists may focus more on the scientific-level description.   Given that multiple provenance records can co-exist, it is important to know who asserted these records. </p>
+
 <p>In PROV-DM, an <dfn id="dfn-Account">account record</dfn> is a wrapper of records with a dual purpose:  </p> 
 <ul>
 <li> It is the mechanism by which attribution of provenance can be asserted; it allows asserters to bundle up their assertions, and assert suitable attribution;
@@ -2417,7 +2471,7 @@
           wasGeneratedBy(e0,a1,[ex:fct="create"])     
           ... )
 </pre>
-<p>with identifier <span class="name">ex:acc2</span>, containing assertions by asserter by <span class="name">http://example.org/asserter2</span> stating that the entity represented by entity record identified by <span class="name">e0</span> was generated by an activity represented by activity record identified by <span class="name">a1</span> instead of <span class="name">a0</span> in the previous account <span class="name">ex:acc0</span>.  If accounts <span class="name">ex:acc0</span> and <span class="name">ex:acc2</span> are merged together, the resulting set of records violates <a href="#generation-unicity">generation-unicity</a>.</p>
+<p>with identifier <span class="name">ex:acc2</span>, containing assertions by asserter <span class="name">http://example.org/asserter2</span> stating that the entity represented by entity record identified by <span class="name">e0</span> was generated by an activity represented by activity record identified by <span class="name">a1</span> instead of <span class="name">a0</span> in the previous account <span class="name">ex:acc0</span>.  If accounts <span class="name">ex:acc0</span> and <span class="name">ex:acc2</span> are merged together, the resulting set of records violates <a href="#generation-unicity">generation-unicity</a> if the two activities <span class="name">a0</span> and <span class="name">a1</span> are distinct.</p>
 </div>
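The merge case described in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: generation records are reduced to hypothetical `(entity, activity)` pairs, the helper name `generation_unicity_violations` is invented here, and distinct identifiers are assumed to denote distinct activities (the condition the text above makes explicit).

```python
from collections import defaultdict


def generation_unicity_violations(generation_records):
    """Map each entity generated by more than one activity to those activities.

    `generation_records` is an iterable of (entity_id, activity_id) pairs,
    a simplified stand-in for wasGeneratedBy records (hypothetical encoding).
    """
    generators = defaultdict(set)
    for entity_id, activity_id in generation_records:
        generators[entity_id].add(activity_id)
    # Only entities with two or more distinct generating activities are reported.
    return {e: sorted(acts) for e, acts in generators.items() if len(acts) > 1}


# Generation records for e0 as stated in the two accounts of the example.
acc0 = [("e0", "a0")]
acc2 = [("e0", "a1")]

# Merging the accounts is modelled here as plain concatenation.
merged = acc0 + acc2
print(generation_unicity_violations(merged))
```

Each account taken alone yields no violation; only the merged set reports `e0`, which is why keeping the accounts separate lets both asserters' views co-exist.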
 
 <p>Account records constitute a scope for record identifiers. Since accounts can be nested,  scopes can also be nested; thus, the scope of record identifiers should be understood in the context of such nested scopes.  When a record with an identifier occurs directly within an account, then its identifier denotes this record in the scope of this account, except in sub-accounts where records with the same identifier occur. </p>
@@ -2450,7 +2504,7 @@
 <section id="RecordContainer">
 <h4>Record Container</h4>
 
-<p>A <dfn id="dfn-RecordContainer">record container</dfn> is a house-keeping construct of PROV-DM, also capable of bundling PROV-DM records. A record container is not a record, but can be exploited to return assertions in response to a request for the provenance of something ([[PROV-PAQ]]). </p> 
+<p>A <dfn id="dfn-RecordContainer">record container</dfn> is a house-keeping construct of PROV-DM, also capable of bundling PROV-DM records. A record container is the root of a provenance record and can be exploited to package up PROV-DM records in response to a request for the provenance of something ([[PROV-PAQ]]). Given that a record container is the root of a provenance record, it is not defined as a PROV-DM record (production <span class="nonterminal">record</span>), since otherwise it could appear arbitrarily nested inside accounts.</p> 
 
 <p>
 
@@ -2876,7 +2930,9 @@
 <section id="record-traceability">
 <h3>Traceability Record</h3>
 
-<p> A <dfn id="dfn-Traceability">traceability record</dfn> states the existence of  a  "dependency path" between two entities, indicating that one entity can be shown to be in the lineage of another, and may have influenced it in some way. This relation is transitive. </p>
+<p> We often want to know who or what may have had some influence, whether direct or indirect, on a given entity, or who may, directly or not, bear some responsibility for a given outcome.  Hence, we may want to infer such a notion from an existing set of PROV-DM records.   Conversely, we may know that such influence or responsibility exists without knowing its actual details. Thus, we may also want to assert such a notion. </p>
+
+<p> A <dfn id="dfn-Traceability">traceability record</dfn> states the existence of  a  "dependency path" between two entities, indicating that one entity can be shown to be in the lineage of another, and may have influenced it, or may bear some responsibility for it, in some way. A traceability record subsumes derivation, activity association, and responsibility, and is defined to be transitive.  </p>  
 
 <p> A traceability record, written <span class="name">tracedTo(id,e2,e1,attrs)</span> in PROV-ASN:</p>
 <ul>
@@ -3030,6 +3086,7 @@
 
 <div style="text-align: center;">
 <img src="informedByNonTransitive.png" alt="non transitivity of wasInformedBy" />
+
 </div>
 
 <p>
@@ -3178,7 +3235,21 @@
 <li><em>agent</em>: an identifier <span class="name">ag</span> of an agent whom the entity is attributed to;</li>
 <li><em>attributes</em>: an OPTIONAL set <span class="name">attrs</span> of attribute-value pairs to further describe this record.</li>
 </ul>
-<p>Attribution models the notion of an activity generating an entity identified by <span class="name">e</span> being controlled by an agent <span class="name">ag</span>, which takes responsibility for generating <span class="name">e</span>. Formally, this is expressed as the following necessary condition.</p>
+<p>Attribution models the notion of an activity generating an entity identified by <span class="name">e</span> being associated with an agent <span class="name">ag</span>, which takes responsibility for generating <span class="name">e</span>. Formally, this is expressed as the following necessary condition.</p>
+
+
+<div class='constraint' id='attribution-implication'>
+<span class='conditional'>If</span>
+<span class="name">wasAttributedTo(e,ag)</span> holds for some identifiers
+<span class="name">e</span> and <span class="name">ag</span>,  
+<span class='conditional'>then</span> there exists an activity identified by <span class="name">pe</span> such that the following statements hold:
+<pre>
+activity(pe,t1,t2,attr1)
+wasGeneratedBy(e,pe)
+wasAssociatedWith(pe,ag,attr2)
+</pre>
+for some sets of attribute-value pairs <span class="name">attr1</span> and  <span class="name">attr2</span>, time <span class="name">t1</span>, and <span class="name">t2</span>.
+</div>
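The attribution-implication constraint above can be read as an expansion rule. The following Python sketch shows one possible reading, under assumptions: records are encoded as plain tuples (a hypothetical encoding, not PROV-ASN syntax), the helper name `expand_attribution` is invented here, and the activity identifier is derived from the entity identifier purely for illustration. Times and attribute sets are left out because the constraint only states that some exist.

```python
def expand_attribution(entity_id, agent_id):
    """Return the records implied by wasAttributedTo(entity_id, agent_id).

    The constraint says an activity exists; the identifier chosen for it
    here ("pe_" + entity id) is an assumption of this sketch and would need
    to be fresh within the enclosing account in a real implementation.
    """
    pe = f"pe_{entity_id}"
    return [
        ("activity", pe),                       # activity(pe,t1,t2,attr1)
        ("wasGeneratedBy", entity_id, pe),      # wasGeneratedBy(e,pe)
        ("wasAssociatedWith", pe, agent_id),    # wasAssociatedWith(pe,ag,attr2)
    ]


for record in expand_attribution("e", "ag"):
    print(record)
```

The expansion is one-directional: the constraint is a necessary condition, so the three records follow from an attribution record, but the sketch does not infer attribution from them.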
 
 <p>In PROV-ASN, an attribution record's text matches the <span class="nonterminal">attributionRecord</span> production of the grammar.</p>
 
@@ -3193,18 +3264,6 @@
 <span class="name">)</span> 
 </div>
 
-<div class='constraint' id='attribution-implication'>
-<span class='conditional'>If</span>
-<span class="name">wasAttributedTo(e,ag)</span> holds for some identifiers
-<span class="name">e</span> and <span class="name">ag</span>,  
-<span class='conditional'>then</span> there exists an activity identified by <span class="name">pe</span> such that the following statements hold:
-<pre>
-activity(pe,t1,t2,attr1)
-wasGenerateBy(e,pe)
-wasAssociatedWith(pe,ag,attr2)
-</pre>
-for some sets of attribute-value pairs <span class="name">attr1</span> and  <span class="name">attr2</span>, time <span class="name">t1</span>, and <span class="name">t2</span>.
-</div>
 </section>
 
 
@@ -3262,8 +3321,7 @@
 
 <section>
 <h3>Summary Record</h3>
-<p>A <dfn>summary record</dfn> represents that an entity is a synopsis or abbreviation of another entity. A summary record is compliant with the 
-<span class="nonterminal">summaryRecord</span> production.</p>
+<p>A <dfn>summary record</dfn> represents that an entity (expected to be a document) is a synopsis or abbreviation of another entity (also expected to be a document). </p>
 
 
 
@@ -3274,6 +3332,11 @@
 <li><em>attributes</em>: an OPTIONAL set <span class="name">attrs</span> of attribute-value pairs to further describe this record.</li>
 </ul>
 
+<p>
+<span class="name">wasSummaryOf</span> is a strict sub-relation of <span class="name">wasDerivedFrom</span>.
+</p>
+
+
 <p>In PROV-ASN, a summary record's text matches the <span class="nonterminal">summaryRecord</span> production of the grammar.</p>
 
 <div class='grammar'>
@@ -3287,9 +3350,6 @@
 <span class="name">)</span> 
 </div>
 
-<p>
-<span class="name">wasSummaryOf</span> is a strict sub-relation of <span class="name">wasDerivedFrom</span>.
-</p>
 
 
 </section>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/comments-from-yolanda.txt	Fri Dec 16 12:36:02 2011 +0000
@@ -0,0 +1,304 @@
+ > All,
+ > 
+ > I went over the PROV-DM and the PROV-O documents, and have some comments.
+ > 
+ 
+ >  My first comment is that overall the documents read reasonably well,
+ >  orders of magnitude better than the version that was released a
+ >  couple of months ago.
+ > 
+ > I have some suggestions, a few that could be easily and immediately
+ > done, others that should probably wait for the next round of
+ > revisions.  Some easy edits that I think would improve the readability
+ > of the document.  Others may seem to me like easy edits but perhaps
+ > you think deserve further discussion, I could not easily discern this
+ > for some of the items.
+ > 
+ > My comments:
+ > 
+ > 1) Section 2.1.1: The sentence "In the world, activities involve
+ > entities in multiple ways: they consume them, they process them, they
+ > transform them, they modify them, they change them, they relocate
+ > them, they use them, they generate them, they are controlled by them,
+ > etc." could be improved by stating it as: "In the world, activities
+ > involve entities in multiple ways: consuming them, processing them,
+ > transforming them, modifying them, changing them, relocating them,
+ > using them, generating them, being controlled by them, etc.
+
+Done.
+
+ > 
+ > 2) Section 2.1.1: I'd add a sentence at the end of the description of
+ > agent to say why it is considered a subclass of entity, something like
+ > "PROV-DM considers agents as a type of entity so that the model can be
+ > used to represent the provenance of the agents themselves.  For
+ > example, a spellchecker software may be an agent of a document
+ > preparation activity, but itself can have a provenance record that
+ > states who its vendor is."
+
+Yes, good; added.
+
+ > 
+ > 3) At the beginning of section 3, the notion of a "record" is
+ > introduced.  I get an idea of what is meant by record, but I don't
+ > think it is well motivated.  OWL does not have "records" but it can be
+ > used to state assertions about classes and objects, so why do we need
+ > this notion of record.  Also, what is there raises several questions
+ > that may or may not have the following answers: "A provenance record
+ > is composed of a set of entity records, a set of activity records, a
+ > set of agent records, a set of generation records, (and so on).  An
+ > entity record is a type of provenance record (and so are the others).
+ > A provenance record can have in turn its own provenance record, where
+ > it would be considered an entity."
+
+Introduction of Section 5.1 was changed.
+
+ > 
+ > 4) Section 4.1: It took me a couple of backs and forths to realize
+ > that e0 is a type while e1...e6 are instances.  I'd suggest to rename
+ > e0 to be crime-file, or cf, or something like that.
+
+e0 is not a type; e0 describes the file.
+Nothing done here, but we may want to revisit the example.
+
+ > 
+ > 5) Section 4.2: The examples of Activity Records I think would be more
+ > clear if they had "edit" instead of "add-crime-in-London" and "edit"
+ > instead of "edit-London-New-York".
+
+recipe link was removed, but I added some type attribute with the suggested types.
+
+ > 
+ > 6) Section 4.2: In the examples of Generation Records, I did not
+ > understand g1 and g2 at all.
+
+I have edited the text.
+
+ > 
+ > 7) Section 5.1: The terms "account", "production", and "record
+ > container" pop up out of nowhere.  They should be introduced and
+ > motivated a bit.  They should also be related to the notion of
+ > "record" better than they are now, this is not very clear.  I suspect
+ > there might be plans to discuss this aspect of the model further in
+ > the WG.
+
+I have made sure production is defined earlier (+ link added).
+
+Intro of 5.1 was changed. We are just listing here the kind of records that this section defines.
+
+I have added in overview section.  
+
+  In addition to the kinds of record introduced in the overview
+  figure, PROV-DM also features a notion of <a title="account
+  record">Account Record</a> that allows attribution of provenance
+  records to be expressed.
+
+There is here some brief motivation for the provenance container.
+
+
+ > 
+ > 8) Section 5.2.1: I would change the sentence "If an asserter wishes
+ > to characterize an entity with the same attribute-value pairs over
+ > several intervals, then they are required to assert multiple entity
+ > records, each with its own identifier (so as to allow potential
+ > dependencies between the various entity records to be expressed)." to
+ > clarify the asserting so it says something like: "If an asserter
+ > wishes to characterize an entity with the same attribute-value pairs
+ > over several intervals, then they are required to directly assert or
+ > create axioms to infer assertions for multiple entity records, each
+ > with its own identifier (so as to allow potential dependencies between
+ > the various entity records to be expressed).".
+
+
+
+Axioms have not been defined in this document.
+So, I suggest the following instead:
+
+ ... then they are required to CREATE multiple entity records (EITHER BY DIRECT ASSERTION OR BY INFERENCE)
+
+
+ > 
+ > 9) Section 5.2.3: The examples of agents could include a spellchecker
+ > agent, just to show a bit of diversity in what we consider to be
+ > agents.
+
+
+It would be nice to do this in section 4 too.
+TODO.
+
+ > 
+ > 10) Section 5.2.4: The example of the note record should show a link
+ > to some provenance record, ideally one that would have been shown as
+ > an example in section 5.1 (maybe the g1 and g2 that I mentioned in
+ > point 5 above).
+ > 
+Yes, updated.
+
+ > 11) Section 5.3.3: We argue that we don't want to get into
+ > "responsibility".  But we introduce the term "responsible" and
+ > "subordinate".  I suggest we refer to them as "represented-agent" and
+ > "representing-agent" instead.  Also, the section is titled
+ > "Responsibility Record", so that will be confusing, maybe "Delegation
+ > Record" would be better.
+
+I take the point, but Delegation is not better, and seems too specific.
+I think, more generally, we need to revisit exactly how we pitch responsibility.
+See thread ISSUE-203.
+
+ > 
+ > 12) Section 5.3.3: The example uses the terms "delegation" and
+ > "contract".  Perhaps useful to mention that these are domain terms to
+ > make clear that they are not part of the model.
+
+OK, done.
+
+ > 
+ > 13) Section 5.3.3.1: The sentence "To promote take-up, PROV-DM offers
+ > a mild version of responsibility in the form of a relation to
+ > represent when an agent acted on another agent's behalf." is a bit of
+ > an awkward way to introduce this, so I'd replace it by "The definition
+ > of agent mentions that an agent is a type of entity that can be
+ > assigned some degree of responsibility for an activity.  In many
+ > situations, the creators of a provenance record may not have the
+ > authority to ascribe responsibility to the various agents that they
+ > know are involved in the activity.  For example, the developer of a
+ > provenance service using PROV-DM could say that a student and his
+ > advisor were both involved in creating a dataset, but might not be in
+ > a position to know who has actual responsibility for the dataset.
+ > Responsibility often has legal connotations that could deter
+ > developers and users of PROV-DM from stating responsibility assertions
+ > in provenance records.  To address this, PROV-DM offers a mild version
+ > of responsibility in the form of a relation to represent when an agent
+ > acted on another agent's behalf.".
+
+TODO, as we gather more feedback on responsibility.
+
+ > 
+ > 14) Section 5.3.3.2: The terms "we introduce a PROV-DM reserved
+ > attribute STEPS" is used for the first time, no idea what that means.
+ > Maybe just say "we introduce a PROV-DM attribute STEPS".  More
+ > importantly, I did not understand what steps means.
+ > 
+Text has been updated. Attribute steps was introduced first.
+
+ > 15) Section 5.3.3.3: Complementarity is very confusing, even its
+ > description in the primer was confusing to me.  And I am a planning
+ > person used to thinking about entities changing, states, fluents, etc
+ > etc.  I even wrote a survey on "Planning and Description Logics" a
+ > while back.  But this is actually a very complex area that I don't
+ > think is well understood at all.  For my money, I would say this is
+ > worth a side chat with Pat Hayes about this particular aspect of the
+ > model, to get his guidance.  He originally worked with McCarthy on the
+ > frame problem and understands very well all the different issues
+ > involved in this type of logic to reason about actions and change.
+
+All being rewritten.
+
+ > 
+ > 16) Section 5.4.1: Could use a bit of introduction to introduce why
+ > separate provenance records may be created.  For example, in emailing
+ > a file there could be a provenance record kept by the mail client,
+ > another by the SMTP server, etc.  It would also be useful to give
+ > motivating scenarios/examples for how accounts can be nested.
+ > 
+
+Added an intro to 5.4.1
+
+Nesting still needs to be debated.
+
+
+ > 17) Section 5.4.1: The example that introduces account acc2 says at
+ > the end that the result of the merge violates generation-unicity.  But
+ > if I am following this correctly, if a1 and a0 are asserted to be the
+ > same then there is no violation.  Perhaps worth clarifying this, or
+ > perhaps finding a more real-world example that really really creates a
+ > violation.  Otherwise people are going to be scared of merging
+ > provenance records, which I think is the opposite of what we want.
+
+Added:  if the two activities <span class="name">a0</span> and <span class="name">a1</span> are distinct.
+
+This whole section is to be structured in any case.
+ > 
+ > 18) Section 5.4.2: It states "A record container is not a record."  I
+ > am puzzled.  This is related to the confusion I raised in point 6.
+
+Updated intro of 5.4.2
+
+ > 
+ > 19) Section 6.2: I did not understand why the notion of "traceability
+ > record", why is it introduced, and how is it different from a
+ > "derivation record".
+ > 
+
+Added an introduction to 6.2.
+
+
+ > 20) Section 6.5: The sentence "Attribution models the notion of an
+ > activity generating an entity identified by e being controlled by an
+ > agent ag, which takes responsibility for generating e." could be
+ > perhaps replaced by "Attribution models the notion of an activity
+ > generating an entity identified by e being controlled by an agent ag."
+
+I replaced the word control. I didn't see why we should remove the responsibility bit.
+
+ > 
+ > 21) Section 6.7: I did not understand what a "summary record" is.  I
+ > am guessing we want a notion that someone can excerpt some subset of
+ > assertions from a provenance record in order to create a summary
+ > record.  Is this right?  If so, why would this apply only for entities
+ > and not for other parts of the model?  Also: why wouldn't we use
+ > PROV-DM terms to express this meta-derivation?  It would make the
+ > model easier if we did not need to add an extra notion of "summary
+ > record" as here.  Or perhaps I did not understand.
+ > 
+
+Added ... (expected to be a document)
+
+
+ > One more comment that I am pretty certain is more appropriate for future discussions of the model:
+ > 
+ > 22) Proposal for hadRoleIn: This proposal is motivated by agent being
+ > a subclass of entity.  Should there be a relation between entity and
+ > activity that subsumes (generalizes) used, generatedby, and
+ > wasAssociatedWith?  I think such a relation would allow us to state
+ > that an entity had to do with an activity but we don't yet know how
+ > exactly it was involved in the activity (eg whether it was an agent,
+ > or it was used by it, or generated by it, or...).  I would propose to
+ > call this something like <entity hadARoleIn activity>.  We should
+ > think about how this aligns with what we now call "roles" (my choice
+ > of name for this new general relation is not a coincidence), so in the
+ > examples in PROV-DM document section 5.2.3 instead of
+ > "[prov:role="sponsor"]" perhaps we could see sponsorOf as a
+ > specialization of hadRoleIn and of wasAssociatedWith.
+ > 
+ > 
+Yes, kept for later ;-)
+
+ > On a rather pedantic note, maybe "New-York" should be "New York", and
+ > that perhaps "half-hexagonal shape" should be "pentagon shape".
+
+Done
+
+ > 
+ > 
+ > Sorry for the long email...
+ > 
+
+Thanks a lot for this constructive input!
+
+Luc
+
+ > Best,
+ > 
+ > Yolanda
+ > 
+ > 
+ > 
+ > Yolanda Gil, USC/ISI
+ > +1-310-448-8794
+ > 
+ > 
+ > 
+ > 
+ > 
+