inserted issues raised by Graham
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Fri, 29 Jul 2011 10:28:01 +0100
changeset 71 1149ff27d7d2
parent 70 5e9acc97a1a5
child 72 6e206fb281fe
model/comments-from-graham.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments-from-graham.txt	Fri Jul 29 10:28:01 2011 +0100
@@ -0,0 +1,349 @@
+  > With reference to:
+  > http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html
+  > Retrieved at about 17:30 on 28-Jul-2011
+  > 
+  > As promised, I've taken a tilt at reviewing the model draft.  I must
+  > say, I've found it to be really hard going - many of the notions
+  > described are not making sense to me, and the language used sometimes
+  > seems to be unnecessarily obscure.
+  > 
+  > After a mammoth session going through this, I really don't have the
+  > time or energy to split my comments out into separate issues.  I think
+  > many of them are purely editorial in nature, and as such could be
+  > cleaned up relatively easily. There are some substantive comments that
+  > I may separate out as formal issues later, but I'm rather hoping that
+  > won't be needed.
+  > 
+
+PROV-ISSUE-63
+
+  > My comments follow:
+  > 
+  > 
+  > 3.1 Notation used is obscure.  What does [...[ mean?  Should be explained.
+  > 
+  > For a general audience, examples based on Unix command shell commands
+  > are probably not very helpful.
+  > 
+  > What is "characterized entity represented by the file"?  As this is an
+  > example, just say "crime statistics" - would that be a correct
+  > interpretation?
+  > 
+  > 
+  > 3.2 where did 'e0' come from? - it's not mentioned in 3.1.  What is it intended to denote?
+  > 
+  > The "agent" statements are completely impenetrable to me.
+  > 
+  > How is the notation to be interpreted?  It looks a bit like some kind
+  > of deviant Prolog, but either I've forgotten some of the basic
+  > constructs, or it's not entirely clear how the deviant bits are meant
+  > to be interpreted.
+  > 
+  > 
+  > 3.3 graphical representation: could be very useful, and would be much
+  > easier to follow if the illustration included a key
+  > 
+  > What does it mean for an agent to be linked to a BOB as opposed to a
+  > process execution (cf. Alice and e0).
+  > 
+
+
+PROV-ISSUE-62
+
+  > 
+  > 4. About the Provenance Language
+  > 
+  > Introduction of "characterized entities" - if this is something that
+  > really needs to be said, I think it needs to be clarified.  I spent
+  > some time thinking about these two sentences, trying to work out if
+  > they could ever be completely correct, or just not understanding what
+  > they are intended to convey:
+  > 
+  > [[ Furthermore, this specification is concerned with characterized
+  > entities, that is, entities and their situation in the world, as
+  > perceived by their asserters.
+  > 
+  > In the rest of the document, we are concerned with the representation
+  > of such entities; their situation in the world will be represented
+  > using sets of attributes.  ]]
+  > 
+  > Why "characterized entities" as opposed to "perceived entities"?
+  > What's the important distinction here?
+  > 
+  > The only interpretation I've found that makes sense to me is that the
+  > document is concerning itself with entities that are characterized by
+  > the values of some bounded set of attributes.  But that
+  > interpretation, if correct, is not obvious to me from the wording
+  > here.
+  > 
+  > 
+  > "PIL is a language by which representations of the world can be
+  > expressed using terms that are drawn from a controlled vocabulary. "
+  > I'm not sure how to interpret this.  Does this "controlled vocabulary"
+  > include, for example, numbers? Is this controlled vocabulary expected
+  > to be the complete set of terms used in PIL expressions?
+  > 
+  > 
+  > "These representations are relative to an asserter, and in that sense
+  > constitute assertions about the world."  What is this trying to say?
+  > I think you might mean something like:
+  > 
+  > "These representations are relative to the context of an asserter, and
+  > in that sense constitute perceptions about the world."  which ties
+  > back to the earlier statement about "as perceived by their asserters".
+  > 
+  > "All assertions in PIL SHOULD be interpreted as a record of what has
+  > happened, as opposed to what may or will happen."  I feel we should
+  > find a way to strengthen this SHOULD to a MUST, but comments from
+  > earlier discussions make this tricky to get right.  Maybe:
+  > 
+  > "All assertions in PIL MUST be interpreted as a record of what has
+  > happened or been observed in some context, as opposed to what might
+  > happen or potential observations."  In this, I am using the reference
+  > to a context to provide just enough wiggle-room for description in
+  > future or imagined contexts.
+  > 
+  > "This specification does not prescribe the means by which assertions
+  > are made, for example on the basis of observations, inferences, or any
+  > other means."
+  > The phrasing "... assertions are made" here is jarring, if not
+  > confusing - I would think that assertions are made in PIL for the
+  > purposes of this spec. Suggest "... how assertions are arrived at,
+  > ..."
+  > 
+  > "The language introduces a notion of "provenance container", which
+  > provides a default scope for assertions."  The term "container" here
+  > is suggestive of a physical or logical encapsulation, which I don't
+  > think is meant.  How about "provenance context"?
+  > 
+  > [[ ... The model may define additional scoping rules for
+  > assertions. Identifiers can safely be used within that
+  > scope. Optionally, identifiers can be exported so that they can be
+  > used outside their default scope. The language does not prescribe the
+  > mechanisms by which identifiers are generated.  ]]
+  > 
+  > This spec is describing a data model, *not* a language.  It says so at
+  > the top.  As such I think it's entirely inappropriate to start
+  > defining linguistic constructs such as identifiers and scoping.
+  > Assuming the actual language used will be RDF, I'm not seeing how what
+  > you describe will be possible.
+  > 
+  > "In this specification, when an assertion is defined to refer to
+  > another assertion about something, it does so by means of that thing's
+  > identifier."  I don't understand what this is trying to say.
+  > 
+  > 
+
+ISSUE-60
+  > 5.1 BOB
+  > 
+  > "A BOB represents an identifiable characterized entity."
+  > 
+  > What does it mean to be "characterized" here?  What does this tell us?
+  > What does it mean to not be "characterized"?  If this refers to the
+  > attribute-based assertions mentioned earlier, does this mean that if
+  > there are no such assertions, an entity cannot be a "BOB"?
+  > 
+  > [[ A BOB assertion is about a characterized entity, whose situation in
+  > the world is variant. A BOB assertion is made at a particular point
+  > and is invariant, in the sense that all the attributes are assigned a
+  > value as part of that assertion.  ]]
+  > 
+  > This section is, according to its heading, about "BOB".  But this is
+  > defining a different concept, so shouldn't this be in a separate
+  > section?
+  > 
+  > It seems to me that what we're talking about here is a "provenance
+  > assertion". I think it would be clearer to just describe that, e.g.
+  > [[ A provenance assertion is about an entity, whose situation in the
+  > world is generally assumed to be variable.  ]]
+  > 
+  > I either don't understand or don't agree with the second part of that
+  > description.  The notion of assigning values as part of an assertion
+  > seems wrong to me (I think the notion of constraining attributes is
+  > the job of the IVP-of relation).  I would expect something like:
+  > 
+  > [[ A provenance assertion is made at a particular point and is
+  > invariant, in the sense that the attributes it mentions do not change
+  > for the entity concerned.  ]]
+  > 
+  > [[ A BOB assertion must describe a characterized entity over a
+  > continuous time interval in the world (which may collapse into a
+  > single instant). Characterizing an entity over multiple time intervals
+  > requires multiple BOB assertions, each with its own identifier. Some
+  > attributes may retain their values across multiple assertions.  ]]
+  > This constraint seems rather unnecessary, and maybe
+  > counter-productive.
+  > 
+  > Suppose we want to describe the collective observations of a
+  > particular telescope when pointed at a particular region of the sky.
+  > This might actually consist of a (possibly unknown) number of disjoint
+  > time-segments caused by the rotation of the earth and other factors. I
+  > can't see any clear benefit in being forced to treat these
+  > observation-sets as distinct entities.
+  > 
+  > [[ There is no assumption that the set of attributes is complete and
+  > that the attributes are independent/orthogonal of each other.  ]] I
+  > don't see this adding any useful information here.  Remove?
+
+
+No issue raised
+
+  > 5.2 Process Execution
+  > 
+  > Thinking about today's teleconference (28 July) and reading this, I'm
+  > seeing the key distinction between Entity and Process execution being
+  > like the philosophical distinction between continuants (endurant) and
+  > occurrents (perdurant)
+  > (http://en.wikipedia.org/wiki/Formal_ontology#Common_terms_in_formal_ontologies)
+
+
+ISSUE-59
+  > 5.3 Generation
+  > 
+  > "characterized entitity" is clumsy - suggest just "entity" (or
+  > whatever term is selected for "BOB").
+  > 
+  > If I had not previously read about OPM, I'd be completely confused by
+  > the introduction of "role" here.  Following the hyperlink here does
+  > not help at all.
+  > 
+  > [[ Given an assertion isGeneratedBy(x,pe,r) or
+  > isGeneratedBy(x,pe,r,t), the activity denoted by pe and the entities
+  > used by pe determine values of some of x's attributes.  ]] I've no idea
+  > what this is trying to say.
+  > 
+  > 
+
+ISSUE-64
+  > 5.4 Use
+  > 
+  > Same problem with 'role' as above.
+  > 
+  > [[ A reference to a given BOB may appear in multiple use assertions
+  > that refer to a given process execution, but each of those use
+  > assertions must have a distinct role.  ]] In light of the above, this
+  > seems nonsensical to me.
+  > 
+  > [[ Given an assertion uses(pe,x,r) or uses(pe,x,r,t), at least one
+  > value of x's attributes is a pre-condition for the activity denoted by
+  > pe to terminate.  ]]
+  > 
+  > As written this doesn't make sense - a value of an attribute being a
+  > precondition seems like a type error to me.  I think you mean
+  > something like availability of an attribute value.  But even that is
+  > hard to follow.  Suggest simplifying this to just:
+  > 
+  > [[ Given an assertion uses(pe,x,r) or uses(pe,x,r,t), existence of x
+  > is a pre-condition for the activity denoted by pe to terminate.  ]]
+  > 
+  > 
+
+
+ISSUE-56
+  > 5.5 Derivation
+  > 
+  > [[ Given an assertion isDerivedFrom(B,A), one can infer that the use
+  > of characterized entity denoted by A precedes the generation of the
+  > characterized entity denoted by B.  ]] 
+  > Where does this notion of "use" come from in the absence of some
+  > referenced activity?
+  > 
+  > Concerning transitivity of derivation:
+  > 
+  > Suppose:
+  > A has attributes a0, a1
+  > B having attributes b0, b1 is derived from A, with b0 being dependent on a0
+  > C having attributes c0, c1, is derived from B with c1 being dependent on b1
+  > 
+  > So none of the attributes of C can be said to be directly or
+  > indirectly dependent on attributes of A, which by the given definition
+  > is a requirement for derivation of C from A.  Thus, as defined,
+  > derivation cannot be transitive.
+  > 
+  > I don't really know if derivation should or should not be transitive,
+  > but the above seems to me like a problem of spurious
+  > over-specification.  My suggestion for now would be to focus on what
+  > really matters and see what logical properties fall out later.
+
+
+ISSUE-57 
+  > 5.8 IVP of
+  > 
+  > The revised
+  > (w.r.t. http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions#IVP_of)
+  > treatment of IVP-of, and relabeling as "complement-of" completely
+  > overturns my understanding of what this was intended to capture. I
+  > understood the whole point of A IVP-of B was intended to capture the
+  > notion that A denotes a contextually constrained form of the entity
+  > denoted by B.  I don't see what useful purpose this relation serves.
+  > 
+  > From a practical perspective, given the asymmetric nature of IVP-of
+  > (as was) it is easy to express the effect of complement-of in RDF by
+  > introducing a new entity node.  But I see no way of constructing the
+  > strict constraining role of IVP using complement-of.
+
+
+ISSUE-58
+  > 5.9 Time
+  > 
+  > [[
+  > Time is defined according to [ISO8601].
+  > ]]
+  > 
+  > I don't think it is appropriate for an open standard to be normatively
+  > dependent on a standard that is available only on payment of a charge
+  > for access.  In this case, we could make reference to the XML Schema
+  > datatypes, which would also require us to think about my next point...
+  > 
+  > As far as I'm aware, ISO 8601 covers both points in time and time
+  > intervals.  As such a bare reference to ISO 8601 is not really an
+  > adequate definition: which do we want?  I suspect
+  > http://www.w3.org/TR/xmlschema-2/#dateTime.
+
+
+No issue raised
+
+  > 
+  > 
+  > 5.10 Recipe Link
+  > 
+  > I don't see what useful purpose this serves.
+  > 
+  > 
+  > 5.11 Role
+  > 
+  > I can't completely follow the description given.
+  > 
+  > 
+  > 5.13 Ordering of Processes
+  > 
+  > This section confusingly changes the style of presentation from
+  > sections dedicated to specific concepts to a vague discussion of
+  > possible relationships between things.
+  > 
+
+ISSUE-61
+
+  > 5.14 Revision
+  > 
+  > This seems to be just a different form of Derivation that happens to
+  > mention an agent.  I'm not sure why I'd choose one over the other.
+  > 
+  > I think this may be unnecessary - would not a similar effect be
+  > achieved by having a process execution of "revision" that uses b1,
+  > generates b2 and is controlled by ag (possibly with role "revise"?).
+  > 
+  > 
+  > 5.16 Provenance Container
+  > 
+  > It's not clear what this is intended to be (maybe unsurprising, since
+  > the definition is absent).  But it looks as if it's intended to be a
+  > syntactical kind of thing, which I feel is out of place in a data
+  > model description (especially if we're expecting to use RDF to
+  > represent the data).  The next version of RDF will probably formally
+  > define named graphs - I'm not seeing what additional definition would
+  > be needed here.
+  > 
+  > 
+  >