log of issues 331,332,333 in model/comments
author Paolo Missier <pmissier@acm.org>
Tue, 10 Apr 2012 15:14:33 +0100
changeset 2255 92e52847d717
parent 2254 dbb8b150f6ac
child 2256 fddd29c9e73c
model/comments/issue-331-Cheney.txt
model/comments/issue-331-Jun.txt
model/comments/issue-331-Khalid.txt
model/comments/issue-331-curt.txt
model/comments/issue-331-graham.txt
model/comments/issue-332-Cheney.txt
model/comments/issue-332-Khalid.txt
model/comments/issue-332-Simon.txt
model/comments/issue-333-Cheney.txt
model/comments/issue-333-graham.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-331-Cheney.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,62 @@
+High-level comments:
+
+* I feel that the PROV-DM document takes a long time to get to the point.  We do not see any concrete examples of PROV notation (in PROV-N or PROV-O) until the end of section 2.  Moreover, the discussion focuses on explaining the concepts in isolation rather than describing the high-level modeling problems they work together to solve.
+
+Suggestion:  Move the PROV-N section to the beginning of sec. 2 and illustrate the concepts through examples.  Or, arguably this is redundant given that the primer does more or less the same thing: perhaps, simply drop section 2 and proceed to the specification.
+
+* The main examples (sec 3.1, 3.2, 4.6) are too "meta" - why not restate them in more generic terms?  These examples about describing the WG's own activities sound a little self-centered.
+
+Given that both the primer and ontology use extended examples, why not align with one or both of them?  
+
+* I feel that the document doesn't lay things out in a logical order.  I think it would be helpful to list the basic or standard constituents first: they are currently in sections 4.3 and higher.  In particular, the fact that some attribute names are reserved is left implicit in several descriptions of examples, and not explicitly discussed in the corresponding section.  
+
+* PLEASE say somewhere prominently what the convention(s) are for optional arguments.  Some are simply omitted (e.g. initial identifiers, attribute lists) while others are replaced by "-".  Please make sure that all of the examples make sense with respect to whichever convention is in use.
+
+* Reading the document, I wondered why generation and use have time instants rather than intervals.  Why couldn't an activity use something over an interval, or generate something during an interval?  We should say why we only care about the end of generation and beginning of use.
+
+* There are a LOT of parenthetical examples, which I think stand little chance of making sense to a reader who hasn't been following the mailing list.
+
+
+Detailed comments:  (Quotes with starred substrings represent suggested edits.)
+
+Why "people" and not "agents"?
+
+Why do we say that the various aspects of the standards are necessary, rather than just appropriate?  There may be other ways of doing this.
+
+Sec 1.  "very quickly" -> "quickly"
+"extra-descriptions" -> "extra descriptions"
+"interval " -> "intervals"
+
+Section 4 provides the *definitions* of PROV-DM concepts, structured according to six components.
+
+
+2.2: "A same entity" -> "The same entity" - this happens many times
+
+2.6.  The activity in the example has the wrong number of arguments (the times are omitted, but I believe should be replaced with "-").  Also, the convention about missing arguments being written "-" is very important and should be explained somewhere prominently, with examples.  This happens many more times.
+
+3.1.  "(some of which *locate* archived email messages, available to W3C Members)."
+
+4.1.2.  The reserved attribute "type" is mentioned here.  Where is the list of all reserved attributes?  Why not list them up front as part of the preliminaries?
+
+4.1.3.  The first example in Generation: p1 and p2 should be in code font.
+
+4.2.3.  The missing id arguments to wasAssociatedWith in the examples are not marked as "-".  Happens again in 4.2.4, 4.1.5, 4.1.6, etc.  Also, many missing attribute lists are omitted without being replaced by "-".  This is a sensible convention but is not stated anywhere.
+
+4.2.4.  The examples discussed in the second paragraph are not mentioned anywhere else.  So say "For example" instead of "In the example".
+
+4.2.4.  Here and elsewhere, the term "modalities of ..." is used to describe what the attributes are for.
+
+4.2.4. "a funder agents" - case mismatch
+
+4.3.1 " And to provide a completely accurate description of the derivation" -> "To provide a more accurate ..."
+
+4.6. "extra-information" -> "extra information"
+
+4.6.  Concerning annotations, why would I want to do this instead of directly putting the x and y positions on the entity?
+
+
+4.7.4.[3,4]: Why are role and type attributes allowed to occur multiple times?  Ordinary attributes aren't (I thought).  If we want to allow multiple occurrences of attribute names, why stop with these two?
+
+
+4.7.5 "the string "abc", the string "abc" " - repeated text
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-331-Jun.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,228 @@
+--------------
+
+- Can the document be released as a next public working draft? If no, 
+what are the blocking issues?
+
+yes, it can. I see no obvious blockers.
+
+- Is the structure of the document approved?
+Yes.
+
+- Can the short name of the document be confirmed (in particular, for 
+prov-n, prov-dm-constraints, since request needs to be sent for 
+publication)?
+
+Yes, the names work for me.
+
+- If a reviewer raised some issues (closed pending review), can they 
+be closed?	
+N/A. I see no open issues from me regarding DM.
+
+- Can all concept definitions be confirmed? Specifically,
+consider ISSUE-337 on agents
+consider ISSUE-223 on entities
+
+The new definitions work for me.
+
+
+
+
+
+-------------------------
+Additional minor comments
+-------------------------
+
+Status of Documents
+-------------------------
+1. Developers seeking to retrieve or publish provenance should focus of 
+PROV-AQ.
+
+of -> on
+
+
+
+
+1.1. Structure of this document
+----------------------------------------
+
+2. Section 6 introduces the idea that constraints can be applied to the 
+PROV data model to refine provenance descriptions; these are covered in 
+the companion specification [PROV-DM-CONSTRAINTS].
+
+-> these are *further* covered in ...?
+
+
+
+
+
+
+2.2 Generation, Usage, Derivation
+--------------------------------------------
+
+1. At the beginning of section 2.2, we have the sentence: Activities and 
+entities are associated with each other in two different ways: 
+activities are consumers of entities and activities are producers of 
+entities.
+
+would it be better to say:
+
+... in two different ways: activities can be consumers of entities or 
+producers.
+
+
+
+
+2.3 Agents and other types of entities
+--------------------------------------------
+
+1. There exist no prescriptive requirement -> There exist no 
+prescriptive requirement*s*
+
+2. In section 2.3, maybe the sub-types of Agents could also be given in 
+bold, italic font when they are first introduced, as you did with 
+other concepts?
+
+
+
+
+
+2.4 Attribution, Association, and Responsibility
+--------------------------------------------
+
+
+Reading section 2.4, I felt the word "Responsibility" is becoming a bit 
+overloaded.
+
+At the beginning of section 2.3. it says: The motivation for introducing 
+agents in the model is to denote the agent's "responsibility" for 
+activities. But then in the last part of this section, responsibility is 
+used to refer to a relationship between an agent and a subordinate agent.
+
+I don't know how to fix this and I don't know how important this is. But I 
+didn't know that wasInformedBy actually reflects a kind of 
+responsibility until I read this section and related sections in the 
+rest of the document.
+
+
+
+
+
+2.5 Simplified Overview Diagram
+--------------------------------------------
+
+In section 2.5, the sentence above the table says: We note that names of 
+relations have a verbal form in the past tense to express what happened 
+in the past, as opposed to what may or will happen.
+
+But not all the definitions of the DM concepts expressed a description 
+of a past event, such as the definition of the activity or agent. Is 
+this on purpose?
+
+Furthermore, descriptions about the examples given in section 3 were not 
+expressed in past tense either, where they could have been.
+
+I feel fixing this and making it consistent might be a good example to 
+the readers, emphasizing provenance as descriptions of a past event.
+
+
+
+3.3. Attribution of Provenance
+--------------------------------------------
+Attribution of Provenance -> Attribution to Provenance?
+
+IMO, they mean different things, and I felt you meant the latter.
+
+
+
+
+4.1.3 Generation
+--------------------------------------------
+In section 4.1.3. it says that the activity in a generation is optional 
+and the last example shows how to express the time of a generation 
+without naming the activity. I wonder how this is supported in Prov-o.
+
+
+
+
+4.3.1 Derivation
+---------------------
+
+What does "modality" mean? (... added to describe modalities of derivation)
+
+
+
+
+4.5 Component 5: Collections
+------------------------------------------
+
+In the first paragraph, it says the collection component can express 
+"which member it contains at which point in time....". I am not sure 
+this is clearly explained or illustrated so far in the document. None of 
+the derivation by insertion or by deletion is associated with any time 
+information; and none of the examples in this section include any time 
+information with the collection. I think this time information is quite 
+indirectly available rather than directly supported by the collection 
+component.
+
+
+
+
+4.5.4 Membership
+------------------------------------------
+
+For the property memberOf, I was expecting to find it defined as 
+element entities being members of a collection, such as memberOf(id, 
+{(key_1, e_1), ..., (key_n, e_n)}, c, attrcs). This seems to be a 
+consistent pattern used for all the properties in DM, but I didn't do a 
+thorough check.
+
+
+The example given in this section used the following assertion:
+
+c  contains   ("k1", e1), ("k2", e2)
+
+But "contains" is not something defined in the DM. If it is merely a 
+description, then it might be expressed using a different font rather 
+than typeset?
+
+
+In the last paragraph of this section, it mentioned the immutability of 
+entities. And reading the specification of collections, I understand 
+this is the pillar of this component. However, if I understand right, 
+the immutable nature of entities is not something emphasized in the 
+first part of DM. I wonder whether this might create any confusion for 
+readers. I don't have a good suggestion for this, but this section does 
+read as specifying stronger semantics than many of the other sections.
+
+
+
+
+4.6 Component 6: Annotations
+------------------------------------------
+
+Did I miss something? The relationship between Note, Annotation and 
+Entities seems to be the only relationship that is not specified in the 
+components sections. Is this on purpose?
+
+
+
+4.7.4 Attribute
+------------------------------------------
+1. A brief discussion about the difference between prov:label and 
+prov:note? Is it a special type of Note?
+
+
+
+6. Towards a Refinement of the PROV Data Model
+------------------------------------------------------------------------------------
+
+
+Can we have a brief explanation of "partial state"?
+
+I don't quite understand this sentence: "The notion of account is 
+specified in the companion specification [PROV-DM-CONSTRAINTS], as well 
+as constraint that structurally well-formed descriptions are expected to 
+satisfy." What is it trying to say?
+
+"blundling up" -> bundling up?
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-331-Khalid.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,35 @@
+
+-------------
+- In the beginning of the working draft "How to read the PROV Family", it is said that "Developers seeking to retrieve or publish provenance should focus *on* PROV-AQ". Given the discussion that we had a few weeks ago on using a SPARQL endpoint to query provenance that is encoded using PROV-O, I would add PROV-O as well to that sentence.
+
+- Fourth public working draft -> Fifth working draft
+
+- 1.1 Structure of the document. "... which are allows users" -> "which allow users"
+
+- 2.2 Generation, Usage, Derivation
+In the definition of Usage it is said that "Before usage, the activity had not begun to consume or use this entity and could not have been affected by the entity". I note that this sentence assumes that an entity can be used only once by an activity. In practice, the same activity can use the same entity, for example with different roles.
+
+- The usage example states situations in which the usage implies that the activity consumes the entity, and others in which the entity remains intact. Would it be useful to distinguish these two kinds of usage explicitly, by specializing the usage relation? In particular, I note that the notion of consumption entails interesting properties such as the invalidation of an entity and the fact that an entity can be consumed by at most one activity.
+
+- 3.1 Illustration of PROV-DM by an example.
+I find this section hard to read, and this is not the first time I read it. I think its readability can be improved if the following comments are considered. - In the text, the first and second working drafts are referred to using identifiers that are not intuitive, tr:WD-prov-dm-201.... I am not suggesting not to use them, but to specify whether they represent the first or the second working draft, whenever they are used in the text. - The figure given at the end of Section 3.1 can be more helpful in guiding the reader if it is placed earlier in that section. - Talking about the figure, there are two arrows that link an arrow to a class; I understand their meaning, but I am not sure the reader will. - Section 3.2, giving information about the provenance from the author's point of view, seems to be simpler, and I think it would be better to start with the provenance from the author's point of view before presenting the provenance from the process point of view.
+
+- 4: PROV-DM Types and Relations
+I am not sure the notion of component helps the readability of the document. Referring to component 1, component 2, etc. in the text is not helpful. I guess the only justification for using the term component is Figure PROV-DM component, which shows dependencies between those components. That said, I don't think that figure is helpful. It is simply used to specify that one concept or relation in a component depends on one concept or relation in another component. I note also that the term component is used in the text to refer to the definition elements in PROV-N. I would therefore suggest not to use the notion of component, and rather to use headings directly, such as "Entity, Activity and their Relations", "Agent and their Responsibility", etc.
+
+- One consequence of trying to structure the model into components is that the reader will have to read the details of communication and start by activity before reaching the definitions of agent, responsibility and derivation, which are far more important for the ordinary reader. That said, I think the starting points at the beginning of the document already introduce the main concepts and relations.
+
+- 4.1.8 Start by Activity
+In the example given it is not explained why a2 was started by a1. There is an assumption that the reader will understand that a sub-workflow will be started by the parent workflow. I think this should be explicitly stated.
+
+- 4.4.1 Specialization
+In the first paragraph: "common entity" -> "common thing"
+
+- 4.5 Component 5: Collections
+I think that there is a need for defining collection here.
+Although it is stated that a collection is an entity, I feel there is a need for specifying what the members of a collection are as part of the collection specification, even when the specification of those members is optional.
+The membership relation fulfills the above requirements only partly, it is meant to specify a subset of the members that belong to a collection, not necessarily all of them.
+Therefore, I would suggest using a dedicated characterizing attribute "members" for entities that happen to be collections.
+
+For example, we can define a collection c1 as
+entity (c1, [prov:type="Collection", prov:members = {<k1,v1>,...,<kn,vn>}])
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-331-curt.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,345 @@
+General:
+
+I probably missed some discussion about this, but the '-' placeholder
+for missing optional arguments is inconsistently used in the PROV-N
+examples.  Is it always required?  I don't really care for it in
+optional arguments where the argument mapping is unambiguous,
+particularly at the beginning or end of the argument list.
+
+
+
+The examples also seem to inconsistently use the prov defined types,
+e.g. prov:Person, prov:Collection, etc.
+
+For example,
+
+   [prov:type="Collection"]
+
+   vs.
+
+   [prov:type="prov:Collection" %% xsd:QName]
+
+Is there a real distinction between these we should be highlighting,
+or should we be using them the same way in the various examples?
+
+
+
+
+Abstract
+--------
+
+"PROV-DM, the PROV data model, is a data model for
+provenance that describes the entities, people and activities involved
+in producing a piece of data or thing."
+
+   Why not use the term 'agents' instead of people in the first
+   sentence?
+
+
+"actities"
+
+   typo
+
+
+"Second, to be able to provide examples of provenance, a notation is
+used for expressing instances of PROV-DM for human consumption; the
+syntactic details of this notation are also kept in a separate
+document."
+
+   This seems awkwardly worded to me.  Here's a stab at a revision,
+   perhaps someone could reword it even better:
+
+   Second, a separate document describes a provenance notation used for
+   expressing instances of provenance for human consumption. It is used
+   in examples in this document.
+
+
+Status of This Document
+-----------------------
+
+"...a set of specifications aiming to define the various aspects..."
+
+   be optimistic!  we will hit our aim!
+
+   ...a set of specifications that define various aspects...
+
+   or even
+
+   ...a set of specifications defining various aspects...
+
+
+
+The specifications are as follows.
+
+   If it is acceptable style, I would end that with a colon, as
+   follows:
+
+
+In the list of documents, some end with , some . and some ;  I
+would do them all the same way.
+
+
+The primer is the entry point to PROV offering a pedagogical
+presentation of the provenance model.
+
+   "offering an introduction to the provenance model."
+
+
+...separating the data model, from its contraints, and the notation
+used to illustrate it.
+
+   remove commas:
+
+   ...separating the data model from its contraints and the notation
+   used to illustrate it.
+
+
+The PROV-DM release is synchronized with the release of the PROV-O,
+PROV-PRIMER, PROV-N, PROV-DM-CONSTRAINTS documents.
+
+   add "and" between last two
+
+
+We are now making clear what the entry path to the PROV family of
+specifications is.
+
+   We are now clarifying the entry path to the PROV family of
+   specifications.
+
+
+1. Introduction
+---------------
+
+...with extra-descriptions that help...
+
+   extraneous -?
+
+
+...introduction to the PROV data model by overviewing a set of concepts...
+
+    ...introduction to the PROV data model with an overview of concepts...
+
+
+2.1 Entity and Activity
+-----------------------
+
+...over a triple store, and editing a file.
+
+    change 'and' to 'or'
+
+
+2.2 Generation, Usage, Derivation
+---------------------------------
+
+In some case, the consumption...
+
+    some cases
+
+
+2.3 Agents and Other Types of Entities
+--------------------------------------
+
+Three types of agents are recognized because they are commonly
+encountered in applications making data and documents available on the
+Web: persons, software agents, and organizations.
+
+    Should those three be bolded here?  Maybe not since we aren't
+    really defining them and they are just special defined types?
+
+
+...member of the collections.
+
+   of the collection.
+
+
+This concept allows for the provenance of the collection, but also of
+its constituents to be expressed.
+
+   This concept allows for the provenance of the collection itself to
+   be expressed in addition to that of the constituents.
+
+
+Such a notion of collection corresponds to a wide variety of concrete
+data structures, such as a maps, dictionaries, or associative arrays.
+
+   I'm not certain I would describe this as a "wide variety" -- I
+   think of those as pretty much the same thing...
+
+   Perhaps just "Such a notion of collection corresponds to concrete
+   data structures such as a maps, dictionaries, or associative arrays."?
+
+
+2.5 Simplified Overview Diagram
+-------------------------------
+
+I would add a sentence somewhere in here about Agent being an Entity.
+
+Maybe here:
+
+    ...how they relate to each other. At this stage...
+
+    ...how they relate to each other.  Note that each agent is also an
+    entity, so the entity relationships can also apply to agents. At
+    this stage...
+
+
+2.6 PROV-N: The Provenance Notation
+-----------------------------------
+
+PROV-N is a notation that is designed to write instances...
+
+   PROV-N is a notation for writing instances...
+
+
+...a series of arguments in bracket.
+
+   ...a series of arguments in brackets.
+
+   (actually, I usually call them parentheses, but either is fine.)
+
+
+The bulleted list here has inconsistent spacing between bulleted
+items.
+
+
+...which always occur in first position...
+
+   ...which always occurs in the first position...
+
+
+...which occur in last position...
+
+   ...which occurs in the last position...
+
+
+   actually, I would probably just take out the 'occur' and word it
+   like this:
+
+   Most expressions have an identifier in the first position, and an
+   optional set of attribute-value pairs in the last position,
+   delimited by square brackets.
+
+
+3.1 The Process View
+--------------------
+
+...some of which locating archived email messages, available to...
+
+   ...some of which refer to archived email messages available only
+   to...
+
+
+..illustrate them with the PROV-N notation, a notation for PROV-DM
+aimed at human consumption.
+
+   I would eliminate the explanation and just say
+
+   ...illustrate them with the PROV-N notation.
+
+
+4. PROV-DM Types and Relations
+
+
+...derivations and its derivation subtypes.
+
+   remove 'its':
+
+   ...derivations and derivation subtypes.
+
+
+...somehow referring to a same thing.
+
+   referring to the same thing.
+
+
+4.1 Component 1: Entities and Activities
+----------------------------------------
+
+...and their inter-relations...
+
+   ...and their interrelations...
+
+
+Figure figure-component1 overviews the first component, with two "UML
+classes" and binary associations between them.
+
+   Figure figure-component1 uses UML to depict the first component with
+   two classes and binary associations between them.
+
+   (If you reword this figure description, make the other figure
+   descriptions match; if not, don't.)
+
+
+Associations are not just binary; indeed, Usage, Generation, Start,
+End are remarkable because they have time attributes, which are
+placeholders for time information related to provenance.
+
+   Associations are not just binary; indeed, Usage, Generation, Start,
+   End also include time attributes.
+
+
+4.1.3 Generation
+----------------
+
+...state the existence of two generations (with respective times
+2001-10-26T21:32:52 and 2001-10-26T10:00:00), at which new entities,
+identified by e1 and e2, are created by an activity, identified by a1.
+
+   ...describe the generation of new entities e1 and e2 by activity
+   a1 at respective times 2001-10-26T21:32:52 and 2001-10-26T10:00:00.
+
+
+4.1.4 Usage
+-----------
+
+...state that the activity identified by a1 used two entities
+identified by e1 and e2, at times...
+
+
+    ...state that activity a1 used entities e1 and e2 at times...
+
+
+4.3 Component 3: Derivations
+----------------------------
+
+see figure note on 4.1 above -- I don't like the verb "overviews".
+I also wouldn't say
+
+   So-called "UML association classes" are used...
+
+Just say
+
+   UML association classes are used...
+
+
+4.3.1 Derivation
+----------------
+
+The reason for optional information such as activity, generation, and
+usage to be linked to derivations is to aid analysis of provenance and
+to facilitate provenance-based reproducibility.
+
+   Optional information such as activity, generation, and usage can be
+   linked to derivations to aid analysis of provenance and to
+   facilitate provenance-based reproducibility.
+
+
+...it was passed as, if the activity...
+
+   replace , with 'or'
+
+
+4.5 Component 5: Collections
+----------------------------
+
+In many applications, it is also of interest to be able to express the
+provenance of the collection itself...
+
+   Many applications also need to express the provenance of the
+   collection itself...
+
+
+4.7.4.4 prov:type
+-----------------
+
+include Collection and EmptyCollection here?
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-331-graham.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,399 @@
+While this has many improvements over previous documents, I still feel that
+there are several respects in which the document does not really serve its
+intended purpose.
+
+Generally, I found the tone and phrasing were more akin to academic rhetoric,
+whose purpose is to persuade a peer of the truth of some proposition, than a
+technical standard whose aim should be to *specify*, *inform* and where
+necessary to *explain*.  Especially for developers who will have to use this
+material as a reference source.  Thus, I found much of what I read, particularly
+in the introductory section, had far too much justification (some of which was
+obvious, other aspects of which were just "noise") which didn't help to
+understand what was being presented, or how to use it.
+
+I also still have problems with the overall organization.  In particular, I
+(still) find the example in section 3 breaks the hoped-for flow between the
+section 2 overview (which I also now think is mis-titled) and the provenance
+expression details in section 4.  I also don't think the final two subsections
+of section 2 belong there, as they deal with provenance expression details, not
+concepts.
+
+Finally, I found many examples of unusual or awkward phrasing which I found to
+be unhelpful, confusing or in some cases just plain wrong.
+
+To summarize: if we expect the next public working draft to be nearly ready for
+last call, then I don't think this document is ready for release.
+
+Details follow.
+
+...
+
+
+== Abstract ==
+
+The phrase "derivations between entities" is strange and confusing.  I think you
+mean something like "derivation of entities from other entities".
+
+"Properties that link entities that refer to a same thing".  I think this is
+just wrong:  I don't believe that entities *refer*.  I think you mean something
+like "Properties that link entities that are based on the same thing".
+
+"collections of entities, whose provenance itself can be tracked" - this feels
+vaguely ungrammatical, and I'm not quite sure what this is trying to express.
+In any case, I'll argue later that I don't see why this is necessary as part of
+the provenance core model.  (What I'm not seeing here is anything I can
+recognize as the notion of accounts, which allow for provenance of provenance to
+be expressed.)
+
+Here, and later in the document, there are references to "natural language".  I
+believe this is a term of art that is meaningful only to those who have exposure
+to formal languages, as a way of distinguishing, and may be confusing to some
+readers.  In the abstract, I'd suggest just dropping this - the rest of the
+sentence carries the intended meaning.
+
+I'm not sure what you mean by "systematically defines".  Just "defines" would
+do, I think.
+
+== Status of this document ==
+
+The heading "how to read this document" is, I think, both patronizing and
+inaccurate.  And the following comments seem to significantly replicate the
+content of the preceding text.  I'd suggest moving descriptive material about
+the documents into the preceding text, and drop the stuff that tries to tell
+people what to read.
+
+"Fourth public working draft".  Really!!  Are we really up to 4 with this?  I
+lose count.
+
+== Introduction ==
+
+"how it should be integrated with other diverse information sources".  I find
+this phrase to be vague and unclear, and hence unhelpful.  I'd suggest dropping
+this, and changing "... help those users to make trust judgements" in the next
+sentence to read:
+
+"... help those users to decide which information to include in their analyses,
+and which to exclude."
+
+"The idea that ... a pragmatic approach is to consider ..." adds no useful
+value.  I suggest replacing all of this with "We consider ...".
+
+"the vision is that" is pure noise.  Suggest deleting this.  This whole
+paragraph seems to be an unnecessary repetition of what the previous says.
+While I sometimes think that a repeated summary can be useful, in this case I
+think it would be more helpful to simplify the preceding paragraph.
+
+The material that starts with "A set of specifications, ..." seems to be pure
+repetition of material contained in the "status of this document" - is it really
+necessary to repeat it here?
+
+The listing of "components!" seems to be greatly redundant.  Each component is
+both numbered (N) and introduced as "component N".  I think a simple numbered
+list without the "component N" tags would suffice.
+
+Two paragraphs starting with "This specification intentionally presents..." -
+these paragraphs are loaded with unnecessary self-justification.  I think a
+simpler statement along the lines of:
+
+"This specification presents the key concepts of the PROV data model and
+provenance expressions, without specific concern for how they are applied.  A
+companion document [PROV-DM-CONSTRAINTS] discusses some possible constraints on
+the application of this model, and corresponding useful inferences that may be
+available when those constraints are known to be satisfied."
+
+[[The next comment is rendered moot if the previous one is accepted...]]
+Paragraph: "However, if data changes...".  To an uninitiated reader, it is not
+at all clear what is meant by "data" here.  I'd suggest something like "If a
+thing about which provenance is expressed is subject to change, it is
+challenging to express its provenance precisely (e.g. the data from which a
+daily weather report is derived will change from day to day)."  Drop the
+reference to other metadata here - it adds nothing of value.
+
+@@(note to self) raise a separate issue about how to describe this "refinement".
+  I know I have argued for "refinement" over the idea of an "updated" or
+"modified" provenance model, but the term is still a bit vague.  I find myself
+leaning toward a notion of a "strict" interpretation of provenance that in turn
+allows certain inferences to be drawn if the supplied provenance satisfies
+certain strictness criteria (constraints).
+
+== 1.2 PROV namespace ==
+
+This section glibly introduces the notion of a "namespace" without explaining
+(or citing) what it means.
+
+"The PROV namespace is http://www.w3.org/prov#".  This is WRONG.
+http://www.w3.org/prov# is a URI, not a namespace (or, more precisely, it's a
+string that conforms to URI syntax).
+
+What should be said is something like: "The names for concepts, attributes and
+other reserved names introduced by this document belong to a namespace
+identified by the URI http://www.w3.org/prov#".
+
+And: what is the consequence of these names belonging to a namespace?  I think
+it would be appropriate to cite the corresponding XML and RDF documents that
+deal with namespace issues [1] [2].
+
+[1] http://www.w3.org/TR/REC-xml-names/
+
+[2] http://www.w3.org/TR/REC-rdf-syntax/ (sections 6.1.2, 6.1.4, etc.  These
+define how RDF/XML forms a URI-reference by appending a local name to a
+namespace URI.)
+
+== Section 2, PROV-DM starting points ==
+
+I think this section is mis-titled.
+
+I think it should be: "2. Introduction to provenance concepts", since that is
+what most of the section is about.
+
+In light of this, the final two sub-sections seem mis-placed, and I suggest they
+should be part of the early material in section 4.
+
+"... that a novice reader would write in a first instance".  Yuk!  How
+patronizing!  Also, a reference here to "natural language" (see previous).  I
+would phrase this whole paragraph thus:
+
+"This section introduces provenance concepts with informal descriptions and
+illustrative examples.  Later (section @@ref), we describe how these concepts
+are described using PROV-DM types and relations."
+
+(where @@ref should be in another section that actually deals with PROV-DM terms.)
+
+== 2.1 Entity and Activity ==
+
+"The term things encompasses..." - I find this phrasing awkward and potentially
+confusing - are we talking here about things or entities?  I suggest simply
+"These encompass ..."
+
+The final sentence is mostly noise.  Why not just "Any Web resource may be an
+entity."?
+
+"For the purpose of this specification..." is just noise.  Also, confusing
+reference to "entities" and "things".  Suggest for this para:  "An entity is a
+thing one wants to provide provenance for, which may be physical, digital,
+conceptual, or otherwise; entities may be real or imaginary."
+
+"This action can take multiple forms: ..." - this is confusing; are we talking
+about a single activity having multiple forms, or different activities having
+different forms.  I think you mean the latter, hence I suggest: "An activity is
+something that occurs over a period of time and acts upon or with entities. They
+may include consuming, processing, transforming, modifying, relocating, using,
+generating, or other associations with entities."
+
+
+== 2.2, et seq. ==
+
+I find similar issues with the wording of subsequent sections, but I haven't
+gone through every one for lack of time.  But I hope you get the general thrust
+from the above.
+
+
+== 2.3 Agents and other types of entities ==
+
+I think this exhibits poor organization of the material.  I think Agents and
+Plans are related, and suggest a sub-section for them.  Collections and accounts
+don't have any obvious relationship, and IMO should be separated.
+
+Concerning collections, it is not at all clear to me that these need to be in
+the core PROV-DM.  By including them here, you impose a particular view of
+collections that may not be appropriate  (somewhere, though I can't immediately
+find where, there is mention of a collection being a key-value map).  Domains
+that deal with collections have their own models for these, so why not let this
+be an aspect for domain-specific extension?
+
+
+I think accounts should have a section of their own, since they underpin the key
+feature of supporting provenance-of-provenance.
+
+However, I have a problem with the description "An account is an entity that
+contains a bundle of provenance descriptions."  I think that this should be "An
+account *is* an entity that is a bundle of provenance descriptions."  That is, I
+don't think the core DM needs to or should expose the notion of containment,
+since that begs more questions.
+
+== 2.4 Attribution, association and responsibility ==
+
+I find the expression of these ideas to be hopelessly muddled, and incoherent.
+In particular, it seems to be self-contradictory with respect to the notion of
+"responsibility" (also with section 2.3):
+
+"An agent is a type of entity that bears some form of responsibility for an
+activity taking place."
+"Software for checking the use of grammar in a document may be defined as an agent"
+"Agents are defined as having some kind of responsibility for activities."
+"[an association may be] an XSLT transform launched by a user ..."
+"An activity association is an assignment of responsibility to an agent for an
+activity"
+"Responsibility is the fact that an agent is accountable for ..."
+
+At heart, I think the problem here is the notion that agents are "responsible".
+  Especially when "responsibility" is later defined in terms of accountability -
+I can't see a software agent as being accountable.  I don't know how to make
+sense of this, so it's hard for me to suggest alternatives.
+
+== Section 2.5, Simplified overview diagram ==
+== Section 2.6, PROV-N ... ==
+
+See earlier comments.  These are about PROV-DM terms, not provenance concepts, so
+I don't really think they belong here.
+
+I'd move them to the start of section 4.
+
+== Section 3, Illustration... ==
+
+I *still* think the positioning of this example disrupts the logical flow from
+concepts (section 2) to PROV-DM expressions (section 4).
+
+(I haven't reviewed the content of this section.)
+
+
+== 4. PROV-DM types and relations ==
+
+The enumeration of components seems to be repetitive.  Numbered items *and*
+component numbers?  (See earlier comment.)
+
+"In the first column, one finds concept names directly linking to their English
+definition. In the second column, ...".  Why not just use column headings in the
+table?  The reference to "English" description seems redundant.
+
+"In the rest of the section, each concept and relation is defined, in English
+initially, followed by a more formal definition and some example."  Similar
+comment.  Suggest:
+"In the rest of the section, each type and relation is defined informally,
+followed by a summary of the information used to represent the concept, and
+illustrated with PROV-N examples."
+
+== 4.1.1 Entity ==
+
+"An entity is a thing one wants to provide provenance for. For the purpose of
+this specification, things can be physical, digital, conceptual, or otherwise;
+things may be real or imaginary."  confuses entities and things again.  Suggest:
+"An entity is a thing one wants to provide provenance for. It can be physical,
+digital, conceptual, or otherwise, and may be real or imaginary."
+
+"An entity, written entity(id, [attr1=val1, ...]) in PROV-N, contains:" - I
+think this is wrong - an entity does not (in general) *contain*.  Suggest:
+"An entity, written entity(id, [attr1=val1, ...]) in PROV-N, has:"
+
+"id: an identifier for an entity;" - this is redundant and potentially
+confusing.  Suggest "id: an identifier".
+
+"attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
+representing this entity's situation in the world." - I find this phrasing
+awkward and unclear.  Suggest:
+"attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
+representing additional information about this entity."
+
+== 4.1.2, et seq ==
+
+(Similar editorial comments to those for 4.1.1 Entity.  I'm not repeating them
+all now for lack of time.)
+
+
+== Section 4.1.5 Start ==
+
+I find this whole section is confusing.  Starting with:
+
+"trigger: an optional identifier (e) for the entity triggering the activity;" -
+do you really mean to allow *any* entity here, rather than just agents?
+
+Looking forward to the example, I find the idea that an email (qua entity) can
+"trigger" an activity is incoherent.  Suppose the email is drafted and never
+sent.  It still exists as an entity, but can't be said to actually *trigger*
+anything.  For me, it is the act of actually sending (or receiving) an email
+that may trigger something, not the email as a passive entity.
+
+
+== Section 4.1.6, End ==
+
+(Similar comments to those above.)
+
+
+== Section 4.1.7, Communication ==
+
+It seems strange to me, given the pattern used for other concepts/expressions,
+that the communicated entity cannot be optionally named.  I find myself
+wondering if I've understood the definition properly.
+
+
+== Section 4.2.1, Agent ==
+
+Continues the muddle about responsibility.  I don't know what it all means
+(especially when the agent is running software).  See previous comments.
+
+Awkward and unnecessary phrase "situation in the world" again.  See earlier for
+suggested phrasing.
+
+
+== Section 4.3.1 Derivation ==
+
+"A derivation is a transformation of an entity into another, a construction of
+an entity into another, or an update of an entity, resulting in a new one."
+seems ungrammatical.  Suggest:
+"A derivation is a transformation of an entity into another, a construction of
+an entity *from* another, or an update of an entity, resulting in a new one."
+
+
+== Section 4.5 Collections ==
+
+I'm not understanding why this needs to be part of the core PROV-DM, and cannot
+be handled by domain-specific notions of aggregation.
+
+The stated goal is that "it is also of interest to be able to express the
+provenance of the collection itself" - this could be done equally well with a
+domain-specific collection notion, AFAICT.
+
+See also earlier comments.
+
+
+== Section 4.6, Annotations ==
+
+I'm still not seeing why these are needed as part of the core DM. There's no
+associated inference that I am aware of, and additional information can be added
+via attributes, so I'm not seeing what useful additional expressive capability
+this affords.
+
+
+== Section 4.7.4 Attribute ==
+
+Is an attribute really just a qualified name, or is it a pair consisting of a
+qualified name and a value?
+
+
+== Section 5, Extensibility points ==
+
+This section makes little sense to me.  The obvious extensibility points of
+sub-typing and sub-properties of defined PROV-DM terms aren't mentioned.
+
+The use of new attributes seems reasonable, though it's not entirely clear how
+they act as extension points, and the mention of "perspective on the world"
+doesn't mean anything to me.
+
+I cannot see how notes, which are defined to be pretty much semantics-free, can
+be described as an extensibility point - they don't actually add any expressive
+power that I can see.
+
+The remaining points I just don't get.
+
+I think this whole notion of extensibility needs to be treated more carefully
+and comprehensively if it is to be taken seriously.  Otherwise expect developers
+to ignore this and just use extensibility options in the representation
+substrate (e.g. RDF) used.
+
+== Section 6 ==
+
+I think this section is completely redundant and out-of-place, and could be
+removed without any loss.
+
+
+=================
+Yes, it's largely a document/text quality thing - I feel it doesn't entirely lay
+things out clearly enough for its target audience, and in some cases is actively
+confusing.  This may be "editorial", but I think it's important enough to need
+addressing to move forwards towards LC.  There are a few points of substance
+(mainly stuff that feels superfluous to me), but I wouldn't be surprised to be
+a lone voice on that.
+
+I've indicated a number of specific points in the "details" part of my
+email, with suggested alternative phrasing, though there are many more (similar
+to those I detail) that I've skipped over in passing.
\ No newline at end of file
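The namespace comment on section 1.2 above says that reserved names belong to a namespace identified by a URI, and that RDF/XML forms a URI reference by appending a local name to the namespace URI. A minimal Python sketch of that appending step follows. The function name `expand` and the prefix table are hypothetical, chosen only for illustration, and the binding of the prefix `prov` to this URI is an assumption based on the examples quoted in these comments.

```python
# Namespace URI as quoted in the review comment on section 1.2.
PROV_NS = "http://www.w3.org/prov#"

def expand(qname, prefixes):
    """Expand a prefix:local qualified name by appending the local name
    to the namespace URI bound to the prefix (hypothetical helper)."""
    prefix, sep, local = qname.partition(":")
    if not sep or not local:
        raise ValueError("not a qualified name: %r" % qname)
    if prefix not in prefixes:
        raise KeyError("unbound prefix: %r" % prefix)
    return prefixes[prefix] + local

# Assumed prefix binding, for illustration only.
PREFIXES = {"prov": PROV_NS}
```

For example, `expand("prov:type", PREFIXES)` yields the namespace URI followed by `type`. The point of the sketch is only that the namespace URI identifies the namespace, while each name in it is a separate URI reference.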
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-332-Cheney.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,57 @@
+I think the document puts the cart before the horse to some degree: basic notions, such as identifiers, attributes and other elementary syntax, are buried in section 4.7, which makes it difficult for an uninformed reader to understand the earlier sections.  Similarly, the fact that PROV-N introduces syntax for accounts and expression containers is not mentioned earlier in the document, and containers are used in examples before they are ever discussed.
+
+> * Can the short name of the document be confirmed (in particular, for prov-n, 
+> prov-dm-constraints, since request needs to be sent for publication)? 
+I think PROV-N is OK as a name.  I am not sure, however, that the split between prov-n and prov-dm is working well, since there is a lot of duplicate material.  A reader new to this might not understand the difference between the notation and the data model, since the DM documents exclusively use PROV-N notation anyway.
+
+
+High-level comments:
+
+* Please define all basic syntax in a preliminary section where it will be seen before it is used, and where it can easily be found for later reference. (i.e. it's highly confusing when the RHS of a grammar rule refers to nonterminals that haven't been defined yet, and whose definitions are hard to find).
+
+* Similarly, for accounts and expression-containers, I suggest adding a sentence to the introduction that mentions these constructs, and a paragraph or two (with examples) to the design rationale section that gives examples of these constructs.  
+
+* Accounts could be discussed before expression containers in order to avoid having to redefine the grammar rule for expressionContainer.
+
+* In sec. 2. it says "PROV-N optional arguments need not be specified as long as this does not lead to ambiguity" - Is this something that implementations should check or is it a property we assert holds?  If the latter, I'm not sure I believe it - there are certainly shift-reduce conflicts in the grammar, even if it is formally unambiguous.  Allowing only one, uniform mechanism for optional arguments would IMO be better. (leading to less complex grammar rules and less guesswork on the part of the reader).
+
+
+Detailed comments:
+
+sec. 2. "so that application*s*", "arguments in bracket*s*."
+
+- In the example talking about optional attributes, is there any difference (to the meaning) between an absent attribute list and an empty one?
+
+- The sentence beginning "PROV-N exposes attributes" is ungrammatical, and the citation of PROV-DM-CONSTRAINTS in the middle makes it hard to read.
+
+sec. 3.- Please split the grammar into six nonterminals, one for each component.  Also, I strongly suggest moving the "further expressions" stuff to the beginning or to its own section before sec. 4., since it applies to sec. 5 and 6 too, and I would like sec. 3 to give a high-level summary of the grammar and explain what a "PROV-N document" is.
+
+sec. 4.  Generation, start, end and association have extra constraints that can actually be expressed using grammar rules.  This would be more precise.
+
+sec. 4.3.1.  Derivation has some optional identifiers that can be replaced by '-' but not omitted.  This contradicts the discussion of optional arguments in section 2.  Again, I'd prefer to have just one, uniform mechanism for optional arguments.
+
+sec. 4.5.3.  Membership grammar rule doesn't match the example.
+
+sec. 4.7.1.  Stray "|" in the first line of the grammar.
+
+sec. 4.7.1, example: end should be endContainer (I think).  Also, at this point a reader will not have any idea what the container business is about.  
+
+sec. 4.7.2.  Since prefixes and IRIs are used in namespaceDeclarations, I suggest talking about identifiers first.
+
+sec. 5.  The discussion implies, without saying explicitly, that containers cannot be nested (right?)
+
+sec. 6.  It is strange that a container can contain either a collection of accounts or a collection of expressions, but not a mix of both.  Also, the need to "update" expressionContainer rule suggests to me that it would be better to discuss accounts first, then containers, since we can then avoid having to change the rule mid-stream. (It would be best to avoid superseding rules, to prevent bugs where a developer misses the fact that the rule is later extended.)
+
+sec. B.1.  The reference to IRIs/RFC 3987 is duplicated (there are two different citations of the same document).
+
+
+====
+The ProvRDF and ProvXML wiki pages (along with the Formal Semantics strawman page) are cited, and I'm listed as author.
+
+While I did start these pages, and am the only contributor to the ProvXML and semantics pages, the work on the ProvRDF page should be credited to the PROV-O team (I've done relatively little since setting the page up initially).
+
+I also assume that eventually references to wiki pages will have to be replaced with something else, e.g. 
+ProvRDF -> a part of PROV-O that covers the mapping (is this planned?)
+ProvXML -> PROV-XML note
+Formal semantics strawman -> official version of PROV-SEM Note
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-332-Khalid.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,28 @@
+
+I read the PROV-N document. It is well structured and easy to follow. Before starting to read PROV-N, I was skeptical about its usefulness since the PROV-N expressions are already used in PROV-DM part1 and prov-DM constraints.
+After reading it, I changed my mind. It actually gives a complete account of the concepts and relations on the DM in a relatively small number of pages.
+Also, I believe that for people who are familiar with existing provenance models, PROV-N may be the best entry point.
+
+I have only a few minor comments on the document:
+
+2- Design Rationale for PROV-N
+* The number of examples that are given to illustrate the notation can be reduced. For example, 2 examples are used to illustrate PROV-N optional arguments. I think that at this stage, the reader does not yet know PROV-N expressions, and therefore, it would be better to use one kind of expression, e.g., derivation, and use a small number of examples.
+
+* The terms expression and expression identifier are used in this section, but they are actually introduced in the section that follows.
+
+* "While not all PROV-DM relations are not binary, they involve two primary elements". I find the use of the term primary element a bit vague, especially when presenting a notation. Instead, I would use something along the following lines. "While most PROV-DM relations are binary, such relations can be characterized (or qualified) using attributes."
+
+4 PROV-N Productions per Component
+* For several expressions, e.g. generation and start by activity, it is specified that certain expressions allowed by the grammar are not valid. Take the example of generation: the text specifies that when the expression contains information only about the entity generated, i.e., without mentioning the activity or the generation time, then the expression is not valid. Instead of doing so, and given that this document is only about a notation, I think we can reformulate the definition of the expression to exclude invalid cases. To illustrate this, consider the case of generation. The definition of generation in the PROV-N document is as follows:
+
+
+generationExpression ::= 'wasGeneratedBy' '('( ( identifier | '-' ) ',')?  eIdentifier ',' ( aIdentifier | '-' ) ',' ( time | '-' ) optional-attribute-values ')'
+
+To avoid invalid cases, the above definition can be reformulated as follows:
+
+
+generationExpression ::= 'wasGeneratedBy' '('
+                                         ( ( ( identifier | '-' ) ',' )?  eIdentifier ',' aIdentifier ',' ( time | '-' ) optional-attribute-values
+                                         |
+                                           ( ( identifier | '-' ) ',' )?  eIdentifier ',' ( aIdentifier | '-' ) ',' time optional-attribute-values )
+                                          ')'
\ No newline at end of file
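As a rough illustration of the reformulated generationExpression rule proposed above, the following Python sketch checks the property the rule is meant to enforce: at least one of the activity or the time must be given. It is not a PROV-N parser. It assumes simplified arguments with no attribute list and no commas inside arguments, and the function name `is_valid_generation` and the `ex:` identifiers are hypothetical.

```python
import re

MARKER = "-"  # stands for an omitted optional argument

def is_valid_generation(expr):
    """Return True if a simplified wasGeneratedBy(...) expression names
    an entity and gives at least one of activity or time."""
    match = re.fullmatch(r"wasGeneratedBy\((.*)\)", expr.strip())
    if match is None:
        return False
    args = [arg.strip() for arg in match.group(1).split(",")]
    if len(args) == 4:
        # Drop the optional leading identifier (or its '-' placeholder).
        args = args[1:]
    if len(args) != 3:
        return False
    entity, activity, time = args
    if entity in ("", MARKER) or "" in (activity, time):
        return False
    # The reformulated rule: activity and time cannot both be omitted.
    return activity != MARKER or time != MARKER
```

Under these assumptions, `wasGeneratedBy(ex:e1, ex:a1, -)` and `wasGeneratedBy(ex:e1, -, 2012-04-10T15:14:33)` are accepted, while `wasGeneratedBy(ex:e1, -, -)` is rejected, which is the case the original rule admits syntactically and then declares invalid in prose.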
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-332-Simon.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,24 @@
+1. Section 1.2: I suggest you also identify the "prov" namespace prefix here, as it is used in the examples in the document.
+
+2. There is apparent inconsistency of brackets. The collections expressions use curly braces {} for unordered sets, while the rest of PROV-N uses square brackets [], e.g. for attribute sets. I suggest the curly braces are more standard.
+
+3. I find the form of sentence "An X's text matches the Y production" unintuitive. I think it is clear what you mean from the context, but is "text" the right word? Maybe something like "An X is expressed in PROV-N using the Y production" would be clearer?
+
+4. Some grammar errors in Section 2:
+ - Bullet 1: "consisting a name"
+ - Bullet 1: "in bracket"
+ - Bullet 4: "identifier always occur"
+ - Bullet 4: "with optional identifier"
+
+5. I notice that PROV-N has no plan(...) construct, while PROV-O has a Plan class (also raised in my review of PROV-O). Might this cause issues translating between them?
+
+6. Considering the document as a whole, its real purpose seems a little unclear to me. Is it meant to be (a) the grammar of PROV-N, or (b) an explanation of the grammar of PROV-N? The document seems maybe too heavyweight for (a) and too minimal for (b).
+
+In more detail, if it is just intended to be a grammar for a language whose meaning is explained elsewhere (the DM spec), then why include examples, subsections, introductory boilerplate, and design rationale? That is, why not just have a file containing the EBNF?
+
+If it is supposed to explain the grammar, so helping those constructing PROV-N or reading PROV-N others have constructed, then I find it too light. To take an example from the document, if a user reads "actedOnBehalfOf(r, ag2, ag3, a, [prov:type="contract"])" in some PROV-N data, then looks up the expression in Section 4.2.4 of the PROV-N document, they will not understand what it means, and will not know which agent is acting on behalf of which other (as all the parameter names are opaque with regard to their role). They could understand from looking in the DM spec, but then the PROV-N document is not needed for explanation after all.
+
+There is one piece of explanation of expressions' meaning at the end of Section 2, that the subject of the relation precedes its object, but this seems inadequate for understanding most expressions and inconsistent with the lack of explanation of meaning elsewhere in the document.
+
+I'm not sure what to suggest to resolve this point, except to consider whether the content of the document really fits its intended use, as it wasn't entirely clear to me that this is the case now.
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-333-Cheney.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,157 @@
+> Can the document be released as a next public working draft? If no, what are
+> the blocking issues?
+I don't think it's ready.  Putting myself in the shoes of a developer, what exactly would it mean to implement the PROV-DM-CONSTRAINTS recommendation?  It isn't clear what a compliant implementation must do, may do, or should do.
+
+More concretely, there are at least four kinds of semi-formal boxes in the document:
+
+- constraints (failure to satisfy a constraint is bad?)
+- inferences (which provide additional information; do we need to check that inferences don't violate constraints?)
+- definitions (of some PROV-N assertions in terms of others; usually called constraints too)
+- "interpretations" - which seem to be inferences about event ordering mostly
+
+and it is not clear to me what it means to satisfy or apply them.
+
+> * Is the structure of the document approved?
+PROV-DM-CONSTRAINTS is not especially well-organized.  A lot of the early material is redundant given PROV-DM.
+
+What is the difference between definitional constraints and inferences, "interpretations", and the remaining miscellaneous constraints?  Why not have one big constraint section, organized analogously to the main data model description?  There are a lot of forward references to "interpretations" (and the meaning of this term, and the difference between an "interpretation" and a "constraint" or "inference", is never discussed).
+
+> * Can the short name of the document be confirmed (in particular, for prov-n,
+> prov-dm-constraints, since request needs to be sent for publication)?
+I think the names are OK, assuming this way of splitting things sticks.  However, concerning the content, there now seems to be enough overlap between PROV-DM-CONSTRAINTS and PROV-SEM that it may make sense to merge them somehow, as otherwise there will be a lot of duplication and there are parts of PROV-DM-CONSTRAINTS that don't make much sense unless we say a little about the intended meaning/semantics of the records.
+
+
+High-level comments:
+
+* Both the main DM and the constraints documents are missing a clear description of the main problems we are trying to overcome, which (in my view) include:
+
+- We are trying to deal sanely with descriptions of "things that can change over time"; much of the complexity comes from this; in the common case where things aren't changing or change slowly, much of the complexity (specifically alternate/specialization) can be avoided
+
+- different users/applications may make different (equally valid) subjective decisions about where to draw boundaries between entities, artifacts, etc. and what events to consider important
+
+- provenance needs to allow for different perspectives on the same situation, for example to allow duplicate elimination where multiple descriptions of the same entity exist (Royal Society), or to allow for disambiguation when multiple things can have similar descriptions (person in chair)
+
+- The purpose of the constraints document is (I think) to help manage this complexity by identifying constraints that we can check to determine whether a provenance record is minimally sane, defining some complex concepts in terms of other simpler ones, and identifying inferences that can be used to fill in implicit information (which might itself be relevant to checking consistency).
+
+* Most of the constraints themselves seem sensible, and I will add a section to the semantics indicating which are satisfied by the current version and if not, why.
+
+* The document doesn't adequately explain what the constraints, inferences, definitions and "interpretations" are for, or how an implementation might satisfy them.
+
+One rationalization (as suggested by Graham) may be: PROV-DM instances can be syntactically well-formed but nonsensical.  To rule this out, PROV-DM-CONSTRAINTS introduces a class of "valid" or "strict" PROV-DM data.  Valid PROV-DM has to satisfy some constraints, and in addition, supports some inferences (which must, after being applied, not violate the constraints either).
+
+However, even if this is the idea (it is never said plainly), it's not clear to me how an implementation should interpret them.  Many of the constraints are of the form "if such and such holds then something else must hold".  What are the consequences if not? Should an application reject the data?  or add the missing data?
+
+Similarly, some inferences say "if such and such holds then some other things must hold", but are written in such a way that it's not clear whether you mean "the data model instance must already have the additional records" or "we consider the additional records to be implicitly part of the model".  If the latter, this seems to imply that new ids will be created at run time for unknown entities, e.g. in Inference:wasRevision.
+
+* The status of missing values is unclear.  Moreover, in wasRevisionOf and Quotation, the missing agent is specified as meaning either no agent exists or one exists but is not identified.  This seems to indicate that we cannot make any assumption about the missing value's existence or value; I guess the default case is that we assume the missing value exists but is unknown.  But it's not clear what difference this makes to an implementation - under what circumstances would it matter?
+
+
+Detailed comments:
+
+Third, fourth and fifth paragraphs of 2.1.2 are unclear - what does "it is anticipated that" mean?  The same as "we expect that in practice"?  Or "implementations had better..."?
+
+"instantaneous event to *be* inferred"
+
+- concept of actual verification of ordering constraints is vague
+
+Sec. 2.2 - heading "Attributes in Entities and Beyond" is unspecific.  Why not just "Attributes".  The second paragraph (which is a complete sentence) is also quite convoluted.
+
+Here, as in a lot of places, the text is extremely verbose/indirect where it doesn't need to be, e.g. "It is the purpose of attributes in PROV-DM to help fix some aspect of entities" -> "Attributes in PROV-DM describe some aspects of entities".
+
+"period comprised between" -> "period between"
+
+"alternative entity" - I think you are alluding to "alternateOf", without making it clear to a reader that "alternative" refers to the PROV-DM concept.  Perhaps "An alternative entity that describes the same thing"
+
+In example: "expressed *as*:".  Also, why not an example in PROV-O or PROV-N syntax?
+
+"more important" - what is the metric for importance?  I think you are basically saying that we don't assume an absolute ground truth with respect to which we can judge correctness or completeness of descriptions.
+
+"belong to a variety of PROV-DM objects" - maybe "can be associated with" instead of "belong"?  Also, is "object" used in the same sense as in the semantics?
+
+Sec 2.3.  Last paragraph: "When this is the case, this specification defines such inferences" - I think it's more accurate to say "This specification defines some such inferences" - otherwise it sounds like we're claiming to have a complete axiomatization of possible inferences, which I don't think we do.
+
+Sec. 2.4.  Some of the problems discussed here are relevant whether or not we consider accounts.
+
+Also, "must" and "may" are used to constrain hypothetical account mechanisms.  I have no idea how to check such a constraint.
+
+What is the "set of descriptions" of an account? MUST it increase monotonically with time?
+
+Since PROV-DM doesn't specify how accounts can be handled, or provide an abstraction specifying how implementations could provide for accounts, I don't see any point in saying anything in PROV-DM-CONSTRAINTS about accounts, unless we have concrete examples in mind.
+
+sec. 2.5.  "some value SHOULD be assumed to exist" - If I understand correctly, this means that implementations can ignore this requirement if there is a strong reason to.  But I don't understand who is doing the assuming and what the effect on the implementation is.  Can an implementor satisfy this recommendation simply by saying in the documentation that she assumes missing optional values exist, or is there something that the implementation actually has to do in order to fulfil this expectation?
+
+sec. 3.1.1.  For entity, we also don't assume that the attributes uniquely identify the entity (or underlying thing), right?   Examples of the various (non)properties would help.
+
+- Also, here is the first of many forward references labeled "interpretation: ... see blah blah".  These make zero sense the first time you read through the document.
+
+Sec. 3.1.2.  "However, an activity *record*"
+
+- The bullet point under "further considerations" is useful information that could be said earlier, or in PROV-DM.
+
+Sec. 3.1.3 "This instantaneous event encompasses a description of the modalities of generation of this entity by this activity, by means of key-value pairs".  This is opaque.  Suggest:  "Generation events can have attributes that describe how the entity was generated by the activity."
+
+- Constraint unique-generation-time: isn't the activity also unique?
+
+Sec. 3.1.6.  The constraint wasInformedBy-definition is missing the identifiers on wasGeneratedBy and used.
+
+- The term information flow is used without explanation that this means "communication"
+
+Sec. 3.1.8.  Similarly, the constraint here is missing ids on wasGeneratedBy and wasStartedBy.
+
+Is the constraint really a definition?
+
+Sec. 3.2.1 has no explanation.
+
+Sec. 3.2.2 has two sentences of explanation, but there is not any context.
+
+Sec. 3.2.3 has no explanation apart from a forward reference to a later constraint.
+
+Sec. 3.3.1.  "since of e2" -> "since e2"
+
+Sec. 3.3.4 Traceability-inference: line 5 is incomplete.  Also, the section talks about "the definition of tracedTo" - do the traceability-inference rules constitute a (recursive) definition of tracedTo?  Or are they just some rules for inferring tracedTo, and there could be others that don't conform to the rules?  What is the whole definition, if these are just parts?
+
+- Why use superscripts on e in traceability-assertion?
+
+Sec. 3.4.1.  Anti-symmetry counter example:  I suggest saying that we don't assume antisymmetry because we don't assume that two different entities that happen to have the same information about the same thing are the same entity.  (However, we can easily accommodate antisymmetry in the formal semantics.)
+
+The example about the email, printed version, and thoughts is too vague to be useful.  Thoughts are not the same kind of thing as emails, which are not the same kind of thing as printouts.
+
+Sec. 3.4.2.  Appears to tacitly assume that alternateOf is defined in terms of specialization, which I believe has been revisited as a result of email discussion already.
+
+- The customerInChair example is confusing, since it seems to use the same entity id customerInChair to refer to different things at different times.
+
+Sec. 4.  I'm not sure of the rationale for unique-description-in-account.
+
+- In the example, why list both alternateOf links, since it is (at least) symmetric?
+
+Sec. 5.  "to be meaningful" -is this the same as "valid" in the sense I suggested above?  What are the consequences of failing to be meaningful (by violating some constraints)?
+
+- "that such *an* instantiated"
+
+- "the four kind of " -> "the five kinds of"
+
+- "By transitivity of generation-precedes-usage, generation of an entity precedes its invalidation" - This is ONLY true if the entity is ever used!  I think it is better to explicitly give a constraint "usage-precedes-invalidation"  dual to "generation-precedes-usage".
+
+- Just after "wasStartedByAgent-ordering" - "A similar constraints exists" -> "...constraint..."
+
+- wasAttributedWith should be wasAttributedTo.  Similarly, "attributed with" in the preceding paragraphs should be "attributed to".
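
The point above about generation and invalidation can be illustrated with a minimal sketch. It assumes that ordering constraints are modelled as (earlier, later) pairs over events and that the only available constraints are generation-precedes-usage and the suggested usage-precedes-invalidation. The event names gen(e), use(e), inv(e) and the helper functions are hypothetical, not PROV notation.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def event_ordering(entity, used):
    """Build the derivable precedence pairs for one entity's events.

    Hypothetical event names: gen(e), use(e), inv(e).
    """
    gen, use, inv = f"gen({entity})", f"use({entity})", f"inv({entity})"
    pairs = set()
    if used:
        pairs.add((gen, use))  # generation-precedes-usage
        pairs.add((use, inv))  # usage-precedes-invalidation (suggested)
    return transitive_closure(pairs), gen, inv


closure_used, gen1, inv1 = event_ordering("e1", used=True)
closure_unused, gen2, inv2 = event_ordering("e2", used=False)

# Derivable by transitivity only when a usage event exists.
gen_before_inv_when_used = (gen1, inv1) in closure_used
gen_before_inv_when_unused = (gen2, inv2) in closure_unused
```

Under these assumptions, the ordering between generation and invalidation is derivable for e1 but not for e2, which has no usage event to link the two.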
+
+Sec. 6.  The numbering of prior sections at the beginning of this section is off.
+
+- Second paragraph: seems like a long-winded and circular way of saying "We assume that each entity is generated exactly once."
+
+- "said not *to* be".  Also, is "structurally well-formed" the same as my "valid" notion?
+
+- Overall, as for the discussion of accounts earlier in sec. 2.4, I don't really see the point of talking about what happens when we merge two accounts since PROV does not specify anything about accounts.
+
+Sec. 7 - The constraints about collections (and to some extent the collections mechanisms themselves) seem preliminary and not particularly strongly motivated to me.  I agree that collections are important, but I am not sure I agree that they're well understood enough to merit standardization.
+
+- Moreover, I'm not sure I agree with collection-unique-derivation.  Suppose
+
+c1 = {}
+c2 = {a:1}, obtained by inserting (a,1) into c1
+c3 = {a:1,b:2} obtained by inserting (b,2) into c2
+
+I might want to be able to say that c3 is obtained by inserting (a,1), (b,2) into c1.  Why can't I?
+
+- Do collection-derivation relations imply alternateOf?  (And for that matter, does wasRevisionOf?)
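
The collection example above can be sketched as follows. This is only an illustration: it assumes collections behave like immutable key-value maps, reads the example as a chain c1 -> c2 -> c3, and uses a hypothetical `insert` helper that is not part of any PROV specification.

```python
def insert(collection, *pairs):
    """Return a new dict with the given (key, value) pairs added.

    The input collection is left unchanged, mirroring the idea that
    each collection state is a distinct entity.
    """
    result = dict(collection)
    result.update(pairs)
    return result


c1 = {}
c2 = insert(c1, ("a", 1))             # one insertion step from c1
c3 = insert(c2, ("b", 2))             # one insertion step from c2
c3_direct = insert(c1, ("a", 1), ("b", 2))  # both pairs inserted into c1 at once
```

Both routes yield the same contents, so under these assumptions c3 has two equally accurate derivation descriptions, which is the situation the comment says collection-unique-derivation would rule out.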
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-333-graham.txt	Tue Apr 10 15:14:33 2012 +0100
@@ -0,0 +1,99 @@
+Summary:  not ready for release.  Sorry!
+
+I've read this document up to section 2.2 and, based on what I've read, I'm not 
+sure I can see any reason for this document to exist.
+
+When we set out on this path of separating the description of constrained (or 
+"strict") provenance from "scruffy" provenance, my understanding was that:
+(1) we wished to provide an easy-to-understand provenance data model that anyone 
+could use to generate and present provenance information, and
+(2) we wished to describe a strict, or constrained, use of this model that would 
+allow certain conclusions to be validly inferred.
+
+As such, the PROV-CONSTRAINTS document needs to build upon the PROV-DM document 
+in a way that doesn't seek to invalidate things that people do based on PROV-DM 
+alone (cf. Paul's use-case about making provenance statements about his blog).
+
+Yet this is not what I see when I read the PROV-CONSTRAINTS document.  What I 
+see is a document that (a) simply repeats a lot of material that is present in 
+PROV-DM (I think familiarity with the contents of PROV-DM should be assumed for 
+readers of PROV-CONSTRAINTS), and (b) introduces new definitions that seem to 
+invalidate some usage that would be valid based on a reading of PROV-DM alone 
+(e.g. the MUST constraint in section 2.1.2, para 3).  I think it is important 
+that PROV-CONSTRAINTS MUST NOT invalidate a naive use of the provenance model. 
+In this light, I find several parts of the text I have read to be contradictory 
+(e.g. section 2.1 paras 3 & 4, or the notion that "event" underpins PROV-DM when 
+it isn't even mentioned there).
+
+The goal, as I understand it, is that when provenance statements are made in a 
+way that conforms to the stricter usage, then certain inferences become valid.
+
+In writing this, I realize that there is something that, to my knowledge, has 
+not been discussed in the WG.  If presented with some arbitrary provenance 
+information, how is an agent to know if it has been constructed with regard to 
+the strict constraints of PROV-CONSTRAINTS, or is simply a looser use of the 
+basic provenance model?  Without some way to answer this, I think the "scruffy" 
+and "strict" (for want of more evocative terms) approaches to expressing 
+provenance are destined to flounder.
+
+So, for this document to work as I understand it is intended to do, I think it 
+needs:
+(1) to start out with a much clearer articulation of its goal - I find the 
+present section 1 introduction tells me nothing that I actually need to know 
+about the role of PROV-CONSTRAINTS, and
+(2) a way to recognize when provenance statements are intended to be 
+interpreted according to the strict usage defined by PROV-CONSTRAINTS.
+
+For (1), stripping out the introductory references and repetition of PROV-DM, I 
+think something like this is needed:
+
+[[
+This specification defines a strict, or constrained, usage of the provenance 
+data model which, if followed, makes a number of conclusions commonly drawn from 
+provenance information to be logically valid inferences.  It also defines a way 
+to assert that the provenance usage conforms to this strict usage.  These 
+constraints are also reflected in the provenance formal semantics [@@ref].
+]]
+
+For (2), I don't have any definite proposal, though I can imagine some 
+approaches.  The following are intended as seeds of ideas, not definite suggestions:
+* a subproperty of prov:hasProvenance, e.g. prov:hasStrictProvenance, that 
+relates provenance to some entity.
+* a property associated with a prov:Account that indicates that the provenance 
+statements in that account can be interpreted as strict provenance
+* a property of an agent or activity associated with generating a provenance 
+account that indicates that the generation process follows strict provenance 
+constraints in generating provenance statements.
+* etc.
+
+Until these fundamental issues are addressed, I think that any further comment 
+on the content of this document would be in the league of shuffling deckchairs 
+on the Titanic.
+
+======
+
+Thinking further about my comments yesterday, it occurs to me that there are two 
+additional things that would be helpful to include:
+
+(1) in the introduction, a brief summary of the constraints introduced. 
+Offhand, I can think of two:
+- variability of an entity must be constrained so that any provenance expressed 
+is durable, or immutable.  (E.g. when we say a report was edited by X, the 
+report entity referenced must be a version that was and always will be edited by X.)
+- timing properties expressed must be consistent with common expectations; e.g. 
+that an artifact must be generated before it can be used, etc.
+I'm sure there are more, but I'm not sure that reading the document as it stands 
+would actually tell me what they are.
+
+(2) some discussion (but no more than that) of how to recast provenance that 
+does not conform to the required constraints into a form that does, and which 
+can therefore be safely combined with other provenance expressions and have 
+inferences drawn.  This discussion would help users of provenance to understand 
+that "scruffy" provenance information can be used in a formal reasoning 
+environment if it is subjected to appropriate "conditioning".  For example, if 
+Paul says that he is author of his blog post on a given date (but on other dates 
+there may be guest posts by other people), then the URI for the blog post can be 
+replaced by one that refers to a specific date.  The exact mechanism for this 
+would not be specified, but one might point to ideas like Memento, tdb: URI 
+scheme or just blog post permalinks.  For timing constraints, one might need to 
+say something about re-casting all timestamps in terms of UTC/Zulu-time.  And so on.
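
As one possible illustration of the timestamp "conditioning" mentioned above, the following sketch recasts ISO 8601 timestamps that carry an explicit offset into UTC using the Python standard library. The `to_utc` helper and the sample timestamps are assumptions made for the example; nothing here is prescribed by PROV.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> str:
    """Recast an ISO 8601 timestamp with an explicit offset into UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError("timestamp has no UTC offset; cannot normalize")
    return parsed.astimezone(timezone.utc).isoformat()


# Two hypothetical timestamps recorded in different local zones.
london = to_utc("2012-04-10T15:14:33+01:00")
new_york = to_utc("2012-04-10T10:14:33-04:00")
```

Once both values are expressed in UTC they can be compared directly, which is what ordering constraints across independently produced provenance would require.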