prov: changeset 2256:fddd29c9e73c

--- a/model/comments/issue-331-graham.txt	Tue Apr 10 15:14:33 2012 +0100
+++ b/model/comments/issue-331-graham.txt	Tue Apr 10 21:59:28 2012 +0100
@@ -1,399 +1,486 @@
-While this has many improvements over previous documents, I still feel that
-there are several respects in which the document does not really serve its
-intended purpose.
-
-Generally, I found the tone and phrasing were more akin to academic rhetoric,
-whose purpose is to persuade a peer of the truth of some proposition, than a
-technical standard whose aim should be to *specify*, *inform* and where
-necessary to *explain*.  Especially for developers who will have to use this
-material as a reference source.  Thus, I found much of what I read, particularly
-in the introductory section, had far to much justification (some of which was
-obvious, other aspects of which were just "noise") which didn't help to to
-understand what was being presented, or how to use it.
-
-I also still have problems with the overall organization.  In particular, I
-(still) find the example in section 3 breaks the hoped-for flow between the
-section 2 overview (which I also now think is mis-titled) and the provenance
-expression details in section 4.  I also don't think the final two subsections
-of section 2 belong there, as they deal with provenance expression details, not
-concepts.
-
-Finally, I found many examples of unusual or awkward phrasing which I found to
-be unhelpful, confusing or in some cases just plain wrong.
-
-To summarize: if we expect the next public working draft to be nearly ready for
-last, then I don't think this document is ready for release.
-
-Details follow.
-
-...
-
-
-== Abstract ==
-
-The phrase "derivations between entities" is strange and confusing.  I think you
-mean something like "derivation of entities from other entities".
-
-"Properties that link entities that refer to a same thing".  I think this is
-just wrong:  I don't believe that entities *refer*.  I think you mean something
-like "Properties that link entities that are based on the same thing".
-
-"collections of entities, whose provenance itself can be tracked" - this feels
-vaguely ungrammatical, and I'm not quite sure what this is trying to express.
-In any case, I'll argue later that I don;t see why this is necessary as part of
-the provenance core model.  (What I'm not seeing here is anything I can
-recognize as the notion of accounts, which allow for provenance of provenance to
-be expressed.)
-
-Here, and later in the document, there are references to "natural language".  I
-believe this is a term of art that is meaningful only to those who have exposure
-to formal languages, as a way of distinguishing, and may be confusing to some
-readers.  In the abstract, I'd suggest just dropping this - the rest of the
-sentence carries the intended meaning.
-
-I'm not sure what you mean by "systematically defines".  Just "defines" would
-do, I think.
-
-== Status of this document ==
-
-The heading "how to read this document" is, I think, both patronizing and
-inaccurate.  And the following comments seem to significantly replicate the
-content of the preceding text.  I'd suggest moving descriptive material about
-the documents into the preceding text, and drop the stuff that tries to tell
-people what to read.
-
-"Fourth public working draft".  Really!!  Are we really up to 4 with this?  I
-lose count.
-
-== Introduction ==
-
-"how it should be integrated with other diverse information sources".  I find
-this phrase to be vague and unclear, and hence unhelpful.  I'd suggest dropping
-this, and changing "... help those users to make trust judgements" in the next
-sentence to read:
-
-"... help those users to decide which information to include in their analyses,
-and which to exclude."
-
-"The idea that ... a pragmatiuc approach is to consider ..." add's no useful
-value.  I suggest replacing all of this with "We consider ...".
-
-"the vision is that" is pure noise.  Suggest deleting this.  This whole
-paragraph seems to be an unnecessary repetition of what the previous says.
-While I sometimes think that a repeated summary can be useful, in this case I
-think it would be more helpful to simplify the preceding paragraph.
-
-The material that starts with "A set of specifications, ..." seems to be pure
-repetition of material contained in the "status of this document" - is it really
-necessary to repeat it here?
-
-The listing of "components!" seems to be greatly redundant.  Each component is
-both numbered (N) and introduced as "component N".  I think a simple numbered
-list without the "component N" tags would suffice.
+  > While this has many improvements over previous documents, I still feel that
+  > there are several respects in which the document does not really serve its
+  > intended purpose.
+  > 
+  > Generally, I found the tone and phrasing were more akin to academic rhetoric,
+  > whose purpose is to persuade a peer of the truth of some proposition, than a
+  > technical standard whose aim should be to *specify*, *inform* and where
+  > necessary to *explain*.  Especially for developers who will have to use this
+  > material as a reference source.  Thus, I found much of what I read, particularly
+  > in the introductory section, had far to much justification (some of which was
+  > obvious, other aspects of which were just "noise") which didn't help to to
+  > understand what was being presented, or how to use it.
 
-Two paragraphs starting with "This specification intentionally presents..." -
-these paragraphs are loaded with unnecessary self-justification.  I think a
-simpler statement along the lines of:
-
-"This specification presents the key concepts of the PROV data model and
-provenance expressions, without specific concern for how they are applied.  A
-companion document [PROV-DM-CONSTRAINTS] discusses some possible constraints on
-the application of this model, and corresponding useful inferences that may be
-available when those constraints are known to be satisfied."
-
-[[The next comment is rendered moot if the previous one is accepted...]]
-Paragraph: "However, if data changes...".  To an uninitiated reader, it is not
-at all clear what is meant by "data" here.  I'd suggest something like "If a
-thing about which provenance is expressed is subject to change, it is
-challenging to express its provenance precisely (e.g. the data from which a
-daily weather report is derived will change from day to day)."  Drop the
-reference to other metadata here - it adds nothing of value.
-
-@@(note to self) raise a separate issue about how to describe this "refinement".
-  I know I have argued for "refinement" over the idea of an "updated" or
-"modified" provenance model, but the term is still a bit vague.  I find myself
-leaning toward a notion of a "strict" interpretation of provenance that in turn
-allows certain inferences to be drawn if the supplied provenance satisfies
-certain strictness criteria (constraints).
-
-== 1.2 PROV namespace ==
-
-This section glibly introduces the notion of a "namespace" without explaining
-(or citing) what it means.
-
-"The PROV namespace is http://www.w3.org/prov#".  This is WRONG.
-http://www.w3.org/prov# is a URI, not a namespace (or, more precisely, it's a
-string that conforms to URI syntax).
-
-What should be said is something like: "The names for concepts, attributes and
-other reserved names introduced by this document belong to a namespace
-identified by the URI http://www.w3.org/prov#".
-
-And: what is the consequence of these names belonging to a namespace?  I think
-it would be appropriate to cite the corresponding XML and RDF documents that
-deal with namespace issues [1] [2].
-
-[1] http://www.w3.org/TR/REC-xml-names/
+Can you be specific about which justification is obvious or noise? Are
+they all listed below?
 
-[2] http://www.w3.org/TR/REC-rdf-syntax/ (sections 6.1.2, 6.1.4, etc.  These
-define how RDF/XML forms a URI-reference by appending a local name to a
-namespace URI.)
-
-== Section 2, PROV-DM staring points ==
-
-I think this section is mis-titled.
-
-I think it should be: "2. Introduction to provenance concepts", since that is
-what most of the section is about.
-
-In light of this, the final two sub-sections seem mis-placed, and I suggest they
-should be part of the early material in section 4.
-
-"... that a novice reader would write in a first instance".  Yuk!  How
-patronizing!  Also, a reference here to "natural language" (see previous).  I
-would phrase this whole paragraph thus:
-
-"This section introduces provenance concepts with informal descriptions and
-illustrative examples.  Later (section @@ref), we describe how these concepts
-are described using PROV-DM types and relations."
-
-(where @@ref should be in another section that actually deals with PROV-DM terms.)
+  > 
+  > I also still have problems with the overall organization.  In particular, I
+  > (still) find the example in section 3 breaks the hoped-for flow between the
+  > section 2 overview (which I also now think is mis-titled) and the provenance
+  > expression details in section 4.  I also don't think the final two subsections
+  > of section 2 belong there, as they deal with provenance expression details, not
+  > concepts.
 
-== 2.1 Entity and Activity ==
-
-"The term things encompasses..." - I find this phrasing awkward and potentially
-confusing - are we talking here about things or entities?  I suggest simply
-"These encompass ..."
-
-The final sentence is mostly noise.  Why not just "Any Web resource may be an
-entity."?
+Comment noted. Supportive comments also noted.
 
-"For the purpose of this specification..." is just noise.  Also, confusing
-reference to "entities" and "things".  Suggest for this para:  "An entity is a
-thing one wants to provide provenance for, which may be physical, digital,
-conceptual, or otherwise; entities may be real or imaginary."
+  > 
+  > Finally, I found many examples of unusual or awkward phrasing which I found to
+  > be unhelpful, confusing or in some cases just plain wrong.
 
-"This action can take multiple forms: ..." - this is confusing; are we talking
-about a single activity having multiple forms, or different activities having
-different forms.  I think you mean the latter, hence I suggest: "An activity is
-something that occurs over a period of time and acts upon or with entities. They
-may include consuming, processing, transforming, modifying, relocating, using,
-generating, or other associations with entities."
+Are they all listed below?
 
-
-== 2.2, et seq. ==
+  > 
+  > To summarize: if we expect the next public working draft to be nearly ready for
+  > last, then I don't think this document is ready for release.
+  > 
+  > Details follow.
+  > 
+  > ...
+  > 
+  > 
+  > == Abstract ==
+  > 
+  > The phrase "derivations between entities" is strange and confusing.  I think you
+  > mean something like "derivation of entities from other entities".
+  > 
+  > "Properties that link entities that refer to a same thing".  I think this is
+  > just wrong:  I don't believe that entities *refer*.  I think you mean something
+  > like "Properties that link entities that are based on the same thing".
 
-I find similar issues with the wording of subsequent sections, but I haven't
-gone through every one for lack of time.  But I hope you get the general thrust
-from the above.
+This questions the definition of specialization that was agreed over email.
+
+  > 
+  > "collections of entities, whose provenance itself can be tracked" - this feels
+  > vaguely ungrammatical, and I'm not quite sure what this is trying to express.
+  > In any case, I'll argue later that I don;t see why this is necessary as part of
+  > the provenance core model.  (What I'm not seeing here is anything I can
+  > recognize as the notion of accounts, which allow for provenance of provenance to
+  > be expressed.)
+
+It was agreed we would not address it for WD5, unless all other issues are solved.
+
+  > 
+  > Here, and later in the document, there are references to "natural language".  I
+  > believe this is a term of art that is meaningful only to those who have exposure
+  > to formal languages, as a way of distinguishing, and may be confusing to some
+  > readers.  In the abstract, I'd suggest just dropping this - the rest of the
+  > sentence carries the intended meaning.
+
+?
+
+  > 
+  > I'm not sure what you mean by "systematically defines".  Just "defines" would
+  > do, I think.
+  > 
+
+OK, to do.
+
+  > == Status of this document ==
+  > 
+  > The heading "how to read this document" is, I think, both patronizing and
+  > inaccurate.  And the following comments seem to significantly replicate the
+  > content of the preceding text.  I'd suggest moving descriptive material about
+  > the documents into the preceding text, and drop the stuff that tries to tell
+  > people what to read.
+ 
+?????????
+
+  > 
+  > "Fourth public working draft".  Really!!  Are we really up to 4 with this?  I
+  > lose count.
+  > 
+
+Yes: 4 public + 1 internal.
+
+  > == Introduction ==
+  > 
+  > "how it should be integrated with other diverse information sources".  I find
+  > this phrase to be vague and unclear, and hence unhelpful.  I'd suggest dropping
+  > this, and changing "... help those users to make trust judgements" in the next
+  > sentence to read:
+  > 
+  > "... help those users to decide which information to include in their analyses,
+  > and which to exclude."
+
+This text comes from the incubator final report.
+It's crucial to me to keep the term trust. 
+
+Unclear that changes need to be made.
+
+  > 
+  > "The idea that ... a pragmatiuc approach is to consider ..." add's no useful
+  > value.  I suggest replacing all of this with "We consider ...".
+
+Can be dropped.  Was in the charter.
+
+  > 
+  > "the vision is that" is pure noise.  Suggest deleting this.  This whole
+  > paragraph seems to be an unnecessary repetition of what the previous says.
+  > While I sometimes think that a repeated summary can be useful, in this case I
+  > think it would be more helpful to simplify the preceding paragraph.
+
+To consider.
+
+  > 
+  > The material that starts with "A set of specifications, ..." seems to be pure
+  > repetition of material contained in the "status of this document" - is it really
+  > necessary to repeat it here?
 
 
-== 2.3 Agents and other types of entities ==
-
-I think this exhibits poor organization of the material.  I think Agents and
-Plans are related, and suggest a sub-section for them.  Collections and accounts
-don't have any obvious relationship, and IMO should be separated.
-
-Concerning collections, it is not at all clear to me that these need to be in
-the core PROV-DM.  By including them here, you impose a particular view of
-collections that may not be appropriate  (somewhere, though I can't immediately
-find where, there is mention of a collection being a key-value map).  Domains
-that deal with collections have their own models for these, so why not let this
-be an aspect for domain-specific extension?
-
-
-I think accounts should have a section of their own, since they underpin the key
-feature of supporting provenance0-of-provenance.
-
-However, I have a problem with the description "An account is an entity that
-contains a bundle of provenance descriptions."  I think that this should be "An
-account *is* an entity that is a bundle of provenance descriptions."  That is, I
-don't think the core DM needs to or should expose the notion of containment,
-since that begs more questions.
-
-== 2.4 Attribution, association and responsibility ==
-
-I find the expression of these ideas to be hopelessly muddled, and incoherent.
-In particular, it seems to be self-contradictory with respect to the notion of
-"responsibility" (also with section 2.3):
-
-"An agent is a type of entity that bears some form of responsibility for an
-activity taking place."
-"Software for checking the use of grammar in a document may be defined as an agent"
-"Agents are defined as having some kind of responsibility for activities."
-"[an association may be] an XSLT transform launched by a user ..."
-"An activity association is an assignment of responsibility to an agent for an
-activity"
-"Responsibility is the fact that an agent is accountable for ..."
-
-At heart, I think the problem here is the notion that agents are "responsible".
-  Especially when "responsibility" is later defined in terms of accountability -
-I can't see a software agent as being accountable.  I don't know how to make
-sense of this, so it's hard for me to suggest alternatives.
-
-== Section 2.5, Simplified overview diagram ==
-== Section 2.6, PROV-N ... ==
-
-See earlier comments.  These is about PROV-DM terms, not provenance concepts, so
-I don't really think they belong here.
+  > 
+  > The listing of "components!" seems to be greatly redundant.  Each component is
+  > both numbered (N) and introduced as "component N".  I think a simple numbered
+  > list without the "component N" tags would suffice.
 
-I'd move them to start start of section 4.
-
-== Section 3, Illustration... ==
-
-I *still* think the positioning of this example disrupts the logical flow from
-concepts (section 2) to PROV-DM expressions (section 4).
-
-(I haven't reviewed the content of this section.)
-
-
-== 4. PROV-DM types and relations ==
-
-The enumeration of components seems to be repetitive.  Numbered items *and*
-component numbers?  (See earlier comment.)
-
-"In the first column, one finds concept names directly linking to their English
-definition. In the second column, ...".  Why not just use column headings in the
-table?  The reference to "English" description seems redundant.
+OK :-)
 
-"In the rest of the section, each concept and relation is defined, in English
-initially, followed by a more formal definition and some example."  Similar
-comment.  Suggest:
-"In the rest of the section, each type and relation is defined informally,
-followed by a summary of the information used to represent the concept, and
-illustrated with PROV-N examples."
-
-== 4.1.1 Entity ==
-
-"An entity is a thing one wants to provide provenance for. For the purpose of
-this specification, things can be physical, digital, conceptual, or otherwise;
-things may be real or imaginary."  confuses entities and things again.  Suggest:
-"An entity is a thing one wants to provide provenance for. It can be physical,
-digital, conceptual, or otherwise, and may be real or imaginary."
+  > 
+  > Two paragraphs starting with "This specification intentionally presents..." -
+  > these paragraphs are loaded with unnecessary self-justification.  I think a
+  > simpler statement along the lines of:
+  > 
+  > "This specification presents the key concepts of the PROV data model and
+  > provenance expressions, without specific concern for how they are applied.  A
+  > companion document [PROV-DM-CONSTRAINTS] discusses some possible constraints on
+  > the application of this model, and corresponding useful inferences that may be
+  > available when those constraints are known to be satisfied."
+  > 
 
-"An entity, written entity(id, [attr1=val1, ...]) in PROV-N, contains:" - I
-think this is wrong - an entity does not (in general) *contain*.  Suggest:
-"An entity, written entity(id, [attr1=val1, ...]) in PROV-N, has:"
-
-"id: an identifier for an entity;" - this is redundant and potentially
-confusing.  Suggest "id: an identifier".
+To consider.
 
-"attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
-representing this entity's situation in the world." - I find this phrasing
-awkward and unclear.  Suggest:
-"attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
-representing additional nformation about this entity."
+  > [[The next comment is rendered moot if the previous one is accepted...]]
+  > Paragraph: "However, if data changes...".  To an uninitiated reader, it is not
+  > at all clear what is meant by "data" here.  I'd suggest something like "If a
+  > thing about which provenance is expressed is subject to change, it is
+  > challenging to express its provenance precisely (e.g. the data from which a
+  > daily weather report is derived will change from day to day)."  Drop the
+  > reference to other metadata here - it adds nothing of value.
+  > 
+To consider.
 
-== 4.1.2, et seq ==
+  > @@(note to self) raise a separate issue about how to describe this "refinement".
+  >   I know I have argued for "refinement" over the idea of an "updated" or
+  > "modified" provenance model, but the term is still a bit vague.  I find myself
+  > leaning toward a notion of a "strict" interpretation of provenance that in turn
+  > allows certain inferences to be drawn if the supplied provenance satisfies
+  > certain strictness criteria (constraints).
 
-(Similar editorial comments to those for 4.1.1 Entity.  I'm not repeating them
-all now for lack of time.)
+Nothing to do.
+
+  > 
+  > == 1.2 PROV namespace ==
+  > 
+  > This section glibly introduces the notion of a "namespace" without explaining
+  > (or citing) what it means.
+  > 
+  > "The PROV namespace is http://www.w3.org/prov#".  This is WRONG.
+  > http://www.w3.org/prov# is a URI, not a namespace (or, more precisely, it's a
+  > string that conforms to URI syntax).
+  > 
+  > What should be said is something like: "The names for concepts, attributes and
+  > other reserved names introduced by this document belong to a namespace
+  > identified by the URI http://www.w3.org/prov#".
+
+Merge sections 1.2 and 1.3 into "Conventions".
+
+One of the conventions is the listing of the namespace URIs and prefixes we use here.
+Add a reference to the namespace section for an explanation of namespaces in the context of PROV-DM.
+
+  > 
+  > And: what is the consequence of these names belonging to a namespace?  I think
+  > it would be appropriate to cite the corresponding XML and RDF documents that
+  > deal with namespace issues [1] [2].
+  > 
+  > [1] http://www.w3.org/TR/REC-xml-names/
+  > 
+  > [2] http://www.w3.org/TR/REC-rdf-syntax/ (sections 6.1.2, 6.1.4, etc.  These
+  > define how RDF/XML forms a URI-reference by appending a local name to a
+  > namespace URI.)
+
+We discuss namespaces in section 4.7.1.
+
+  > 
+  > == Section 2, PROV-DM staring points ==
+  > 
+  > I think this section is mis-titled.
+  > 
+  > I think it should be: "2. Introduction to provenance concepts", since that is
+  > what most of the section is about.
+  > 
+  > In light of this, the final two sub-sections seem mis-placed, and I suggest they
+  > should be part of the early material in section 4.
+  > 
+
+To consider. I think section 3 is in the right place. The structure of
+section 2 does not need to be changed.
 
 
-== Section 4.1.5 Start ==
-
-I find this whole section is confusing.  Starting with:
-
-"trigger: an optional identifier (e) for the entity triggering the activity;" -
-do you really mean to allow *any* entity here, rather than just agents?
-
-Looking forward to the example, I find the idea that an email (qua entity) can
-"trigger" an activity is incoherent.  Suppose the email is drafted and never
-sent.  It still exists as an entity, but can't be said to actually *trigger*
-anything.  For me, it is the act of actually sending (or receiving) an email
-that may trigger something, not the email as a passive entity.
-
-
-== Section 4.1.6, End ==
-
-(Similar comments to those above.)
-
-
-== Section 4.1.7, Communication ==
+  > "... that a novice reader would write in a first instance".  Yuk!  How
+  > patronizing!  Also, a reference here to "natural language" (see previous).  I
+  > would phrase this whole paragraph thus:
+  > 
+  > "This section introduces provenance concepts with informal descriptions and
+  > illustrative examples.  Later (section @@ref), we describe how these concepts
+  > are described using PROV-DM types and relations."
 
-It seems strange to me, given the pattern used for other concepts/expressions,
-that the communicated entity cannot be optionally named.  I find myself
-wondering if I've understood the definition properly.
-
-
-== Section 4.2.1, Agent ==
-
-Continues the muddle about responsibility.  I don't know what it all means
-(especially when the agent is running software).  See previous comments.
+Adopt some of these changes. Still keep the forward pointer to 2.5 and 2.6.
+  > 
+  > (where @@ref should be in another section that actually deals with PROV-DM terms.)
+  > 
+  > == 2.1 Entity and Activity ==
+  > 
+  > "The term things encompasses..." - I find this phrasing awkward and potentially
+  > confusing - are we talking here about things or entities?  I suggest simply
+  > "These encompass ..."
 
-Awkward and unnecessary phrase "situation in the world" again.  See earlier for
-suggested phrasing.
-
+OK
+  > 
+  > The final sentence is mostly noise.  Why not just "Any Web resource may be an
+  > entity."?
 
-== Section 4.3.1 Derivation ==
+Therefore it's not noise, is it?
 
-"A derivation is a transformation of an entity into another, a construction of
-an entity into another, or an update of an entity, resulting in a new one."
-seems ungrammatical.  Suggest:
-"A derivation is a transformation of an entity into another, a construction of
-an entity *from* another, or an update of an entity, resulting in a new one."
+Is it that it *is* an entity, or that it *can be regarded* as an entity?
 
 
-== Section 4.5 Collections ==
-
-I'm not understanding why this needs to be part of the core PROV-DM, and cannot
-be habdled by domain specific notions of aggregation.
-
-The stated goal is that "it is also of interest to be able to express the
-provenance of the collection itself" - this could be done equally well with a
-domain-specific collection notion, AFAICT.
-
-See also earlier comments.
-
+  > 
+  > "For the purpose of this specification..." is just noise.  Also, confusing
+  > reference to "entities" and "things".  Suggest for this para:  "An entity is a
+  > thing one wants to provide provenance for, which may be physical, digital,
+  > conceptual, or otherwise; entities may be real or imaginary."
 
-== Section 4.6, Annotations ==
+Adopt Simon's definition here.
 
-I'm still not seeing why these are needed as part of the core DM. There's no
-associated inference that I am aware of, and additional information can be added
-via attributes, so I'm not seeing what useful additional expressive capability
-this affords.
+  > 
+  > "This action can take multiple forms: ..." - this is confusing; are we talking
+  > about a single activity having multiple forms, or different activities having
+  > different forms.  I think you mean the latter, hence I suggest: "An activity is
+  > something that occurs over a period of time and acts upon or with entities. They
+  > may include consuming, processing, transforming, modifying, relocating, using,
+  > generating, or other associations with entities."
 
-
-== Section 4.7.4 Attribute ==
-
-Is an attribute really just a qualified name, or is it a pair consisting of a
-qualified name and a value?
+I don't like the "They" here.
 
 
-== Section 5, Extensibility points ==
-
-This section makes little sense to me.  The obvious extensibility points of
-sub-typing and sub-properties of defined PROV-DM terms isn't mentioned.
-
-The use of new attributes seems reasonable, though it's not entirely clear how
-they act as extension points, and the mention of "perspective on the world"
-doesn't mean anything to me.
-
-I cannot see how notes, which are defined to be pretty much semantics-free, can
-be described as an extensibility point - they don't actually add any expressive
-power that I can see.
-
-The remaining points I just don't get.
+  > 
+  > 
+  > == 2.2, et seq. ==
+  > 
+  > I find similar issues with the wording of subsequent sections, but I haven't
+  > gone through every one for lack of time.  But I hope you get the general thrust
+  > from the above.
 
-I think this whole notion of extensibility needs to be treated more carefully
-and comprehensively if it is to be taken seriously.  Otherwise expect developers
-to ignore this and just use extensibility options in the representation
-substrate (e.g. RDF) used.
-
-== Section 6 ==
-
-I think this section is completely redundant and out-of-place, and could be
-removed without any loss.
-
+Please provide explicit comments.
 
-=================
-Yes, it's largely a document/text quality thing - I feel it doesn't entirely lay
-things out clearly enough for its target audience, and in some cases is actively
-confusing.  This may be "editorial", but I think it's important enough to need
-addressing to move forwards towards LC.  There are a few points of substance
-(mainly stuff that feels superfluous to me), but I wouldn't be surprised to be
-lone voice on that.
-
-I've indicated a number of specific points points in the "details" part of my
-email, with suggested alternative phrasing, though there are many more (similar
-to those I detail) that I've skipped over in passing.
\ No newline at end of file
+  > 
+  > 
+  > == 2.3 Agents and other types of entities ==
+  > 
+  > I think this exhibits poor organization of the material.  I think Agents and
+  > Plans are related, and suggest a sub-section for them.  Collections and accounts
+  > don't have any obvious relationship, and IMO should be separated.
+  > 
+  > Concerning collections, it is not at all clear to me that these need to be in
+  > the core PROV-DM.  By including them here, you impose a particular view of
+  > collections that may not be appropriate  (somewhere, though I can't immediately
+  > find where, there is mention of a collection being a key-value map).  Domains
+  > that deal with collections have their own models for these, so why not let this
+  > be an aspect for domain-specific extension?
+  > 
+  > 
+  > I think accounts should have a section of their own, since they underpin the key
+  > feature of supporting provenance0-of-provenance.
+  > 
+  > However, I have a problem with the description "An account is an entity that
+  > contains a bundle of provenance descriptions."  I think that this should be "An
+  > account *is* an entity that is a bundle of provenance descriptions."  That is, I
+  > don't think the core DM needs to or should expose the notion of containment,
+  > since that begs more questions.
+  > 
+  > == 2.4 Attribution, association and responsibility ==
+  > 
+  > I find the expression of these ideas to be hopelessly muddled, and incoherent.
+  > In particular, it seems to be self-contradictory with respect to the notion of
+  > "responsibility" (also with section 2.3):
+  > 
+  > "An agent is a type of entity that bears some form of responsibility for an
+  > activity taking place."
+  > "Software for checking the use of grammar in a document may be defined as an agent"
+  > "Agents are defined as having some kind of responsibility for activities."
+  > "[an association may be] an XSLT transform launched by a user ..."
+  > "An activity association is an assignment of responsibility to an agent for an
+  > activity"
+  > "Responsibility is the fact that an agent is accountable for ..."
+  > 
+  > At heart, I think the problem here is the notion that agents are "responsible".
+  >   Especially when "responsibility" is later defined in terms of accountability -
+  > I can't see a software agent as being accountable.  I don't know how to make
+  > sense of this, so it's hard for me to suggest alternatives.
+  > 
+  > == Section 2.5, Simplified overview diagram ==
+  > == Section 2.6, PROV-N ... ==
+  > 
+  > See earlier comments.  These is about PROV-DM terms, not provenance concepts, so
+  > I don't really think they belong here.
+  > 
+  > I'd move them to start start of section 4.
+  > 
+  > == Section 3, Illustration... ==
+  > 
+  > I *still* think the positioning of this example disrupts the logical flow from
+  > concepts (section 2) to PROV-DM expressions (section 4).
+  > 
+  > (I haven't reviewed the content of this section.)
+  > 
+  > 
+  > == 4. PROV-DM types and relations ==
+  > 
+  > The enumeration of components seems to be repetitive.  Numbered items *and*
+  > component numbers?  (See earlier comment.)
+  > 
+  > "In the first column, one finds concept names directly linking to their English
+  > definition. In the second column, ...".  Why not just use column headings in the
+  > table?  The reference to "English" description seems redundant.
+  > 
+  > "In the rest of the section, each concept and relation is defined, in English
+  > initially, followed by a more formal definition and some example."  Similar
+  > comment.  Suggest:
+  > "In the rest of the section, each type and relation is defined informally,
+  > followed by a summary of the information used to represent the concept, and
+  > illustrated with PROV-N examples."
+  > 
+  > == 4.1.1 Entity ==
+  > 
+  > "An entity is a thing one wants to provide provenance for. For the purpose of
+  > this specification, things can be physical, digital, conceptual, or otherwise;
+  > things may be real or imaginary."  confuses entities and things again.  Suggest:
+  > "An entity is a thing one wants to provide provenance for. It can be physical,
+  > digital, conceptual, or otherwise, and may be real or imaginary."
+  > 
+  > "An entity, written entity(id, [attr1=val1, ...]) in PROV-N, contains:" - I
+  > think this is wrong - an entity does not (in general) *contain*.  Suggest:
+  > "An entity, written entity(id, [attr1=val1, ...]) in PROV-N, has:"
+  > 
+  > "id: an identifier for an entity;" - this is redundant and potentially
+  > confusing.  Suggest "id: an identifier".
+  > 
+  > "attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
+  > representing this entity's situation in the world." - I find this phrasing
+  > awkward and unclear.  Suggest:
+  > "attributes: an optional set of attribute-value pairs ((attr1, val1), ...)
+  > representing additional information about this entity."
+  > 
+  > == 4.1.2, et seq ==
+  > 
+  > (Similar editorial comments to those for 4.1.1 Entity.  I'm not repeating them
+  > all now for lack of time.)
+  > 
+  > 
+  > == Section 4.1.5 Start ==
+  > 
+  > I find this whole section is confusing.  Starting with:
+  > 
+  > "trigger: an optional identifier (e) for the entity triggering the activity;" -
+  > do you really mean to allow *any* entity here, rather than just agents?
+  > 
+  > Looking forward to the example, I find the idea that an email (qua entity) can
+  > "trigger" an activity is incoherent.  Suppose the email is drafted and never
+  > sent.  It still exists as an entity, but can't be said to actually *trigger*
+  > anything.  For me, it is the act of actually sending (or receiving) an email
+  > that may trigger something, not the email as a passive entity.
+  > 
+  > 
+  > == Section 4.1.6, End ==
+  > 
+  > (Similar comments to those above.)
+  > 
+  > 
+  > == Section 4.1.7, Communication ==
+  > 
+  > It seems strange to me, given the pattern used for other concepts/expressions,
+  > that the communicated entity cannot be optionally named.  I find myself
+  > wondering if I've understood the definition properly.
+  > 
+  > 
+  > == Section 4.2.1, Agent ==
+  > 
+  > Continues the muddle about responsibility.  I don't know what it all means
+  > (especially when the agent is running software).  See previous comments.
+  > 
+  > Awkward and unnecessary phrase "situation in the world" again.  See earlier for
+  > suggested phrasing.
+  > 
+  > 
+  > == Section 4.3.1 Derivation ==
+  > 
+  > "A derivation is a transformation of an entity into another, a construction of
+  > an entity into another, or an update of an entity, resulting in a new one."
+  > seems ungrammatical.  Suggest:
+  > "A derivation is a transformation of an entity into another, a construction of
+  > an entity *from* another, or an update of an entity, resulting in a new one."
+  > 
+  > 
+  > == Section 4.5 Collections ==
+  > 
+  > I'm not understanding why this needs to be part of the core PROV-DM, and cannot
+  > be handled by domain-specific notions of aggregation.
+  > 
+  > The stated goal is that "it is also of interest to be able to express the
+  > provenance of the collection itself" - this could be done equally well with a
+  > domain-specific collection notion, AFAICT.
+  > 
+  > See also earlier comments.
+  > 
+  > 
+  > == Section 4.6, Annotations ==
+  > 
+  > I'm still not seeing why these are needed as part of the core DM. There's no
+  > associated inference that I am aware of, and additional information can be added
+  > via attributes, so I'm not seeing what useful additional expressive capability
+  > this affords.
+  > 
+  > 
+  > == Section 4.7.4 Attribute ==
+  > 
+  > Is an attribute really just a qualified name, or is it a pair consisting of a
+  > qualified name and a value?
+  > 
+  > 
+  > == Section 5, Extensibility points ==
+  > 
+  > This section makes little sense to me.  The obvious extensibility points of
+  > sub-typing and sub-properties of defined PROV-DM terms aren't mentioned.
+  > 
+  > The use of new attributes seems reasonable, though it's not entirely clear how
+  > they act as extension points, and the mention of "perspective on the world"
+  > doesn't mean anything to me.
+  > 
+  > I cannot see how notes, which are defined to be pretty much semantics-free, can
+  > be described as an extensibility point - they don't actually add any expressive
+  > power that I can see.
+  > 
+  > The remaining points I just don't get.
+  > 
+  > I think this whole notion of extensibility needs to be treated more carefully
+  > and comprehensively if it is to be taken seriously.  Otherwise expect developers
+  > to ignore this and just use extensibility options in the representation
+  > substrate (e.g. RDF) used.
+  > 
+  > == Section 6 ==
+  > 
+  > I think this section is completely redundant and out-of-place, and could be
+  > removed without any loss.
+  > 
+  > 
+  > =================
+  > Yes, it's largely a document/text quality thing - I feel it doesn't entirely lay
+  > things out clearly enough for its target audience, and in some cases is actively
+  > confusing.  This may be "editorial", but I think it's important enough to need
+  > addressing to move forwards towards LC.  There are a few points of substance
+  > (mainly stuff that feels superfluous to me), but I wouldn't be surprised to be
+  > a lone voice on that.
+  > 
+  > I've indicated a number of specific points in the "details" part of my
+  > email, with suggested alternative phrasing, though there are many more (similar
+  > to those I detail) that I've skipped over in passing.
author	Luc Moreau <l.moreau@ecs.soton.ac.uk>
	Tue, 10 Apr 2012 21:59:28 +0100
changeset 2256	fddd29c9e73c
parent 2255	92e52847d717
child 2257	23b61163609e