--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments-from-graham.txt Fri Jul 29 10:28:01 2011 +0100
@@ -0,0 +1,349 @@
+ > With reference to:
+ > http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html
+ > Retrieved at about 17:30 on 28-Jul-2011
+ >
+ > As promised, I've taken a tilt at reviewing the model draft. I must
+ > say, I've found it to be really hard going - many of the notions
+ > described are not making sense to me, and the language used sometimes
+ > seems to be unnecessarily obscure.
+ >
+ > After a mammoth session going through this, I really don't have the
+ > time or energy to split my comments out into separate issues. I think
+ > many of them are purely editorial in nature, and as such could be
+ > cleaned up relatively easily. There are some substantive comments that
+ > I may separate out as formal issues later, but I'm rather hoping that
+ > won't be needed.
+ >
+
+PROV-ISSUE-63
+
+ > My comments follow:
+ >
+ >
+ > 3.1 Notation used is obscure. What does [...[ mean? Should be explained.
+ >
+ > For a general audience, examples based on Unix command shell commands
+ > are probably not very helpful.
+ >
+ > What is "characterized entity represented by the file"? As this is an
+ > example, just say "crime statistics" - would that be a correct
+ > interpretation?
+ >
+ >
+ > 3.2 where did 'e0' come from? - it's not mentioned in 3.1. What is it intended to denote?
+ >
+ > The "agent" statements are completely impenetrable to me.
+ >
+ > How is the notation to be interpreted? It looks a bit like some kind
+ > of deviant Prolog, but either I've forgotten some of the basic
+ > constructs, or it's not entirely clear how the deviant bits are meant
+ > to be interpreted.
+ >
+ >
+ > 3.3 graphical representation: could be very useful, and would be much
+ > easier to follow if the illustration included a key.
+ >
+ > What does it mean for an agent to be linked to a BOB as opposed to a
+ > process execution (cf. Alice and e0).
+ >
+
+
+PROV-ISSUE-62
+
+ >
+ > 4. About the Provenance Language
+ >
+ > Introduction of "characterized entities" - if this is something that
+ > really needs to be said, I think it needs to be clarified. I spent
+ > some time thinking about these two sentences, trying to work out if
+ > they could ever be completely correct, or just not understanding what
+ > they are intended to convey:
+ >
+ > [[ Furthermore, this specification is concerned with characterized
+ > entities, that is, entities and their situation in the world, as
+ > perceived by their asserters.
+ >
+ > In the rest of the document, we are concerned with the representation
+ > of such entities; their situation in the world will be represented
+ > using sets of attributes. ]]
+ >
+ > Why "characterized entities" as opposed to "perceived entities"?
+ > What's the important distinction here?
+ >
+ > The only interpretation I've found that makes sense to me is that the
+ > document is concerning itself with entities that are characterized by
+ > the values of some bounded set of attributes. But that
+ > interpretation, if correct, is not obvious to me from the wording
+ > here.
+ >
+ >
+ > "PIL is a language by which representations of the world can be
+ > expressed using terms that are drawn from a controlled vocabulary. "
+ > I'm not sure how to interpret this. Does this "controlled vocabulary"
+ > include, for example, numbers? Is this controlled vocabulary expected
+ > to be the complete set of terms used in PIL expressions?
+ >
+ >
+ > "These representations are relative to an asserter, and in that sense
+ > constitute assertions about the world." What is this trying to say?
+ > I think you might mean something like:
+ >
+ > "These representations are relative to the context of an asserter, and
+ > in that sense constitute perceptions about the world." which ties
+ > back to the earlier statement about "as perceived by their asserters".
+ >
+ > "All assertions in PIL SHOULD be interpreted as a record of what has
+ > happened, as opposed to what may or will happen." I feel we should
+ > find a way to strengthen this SHOULD to a MUST, but comments from
+ > earlier discussions make this tricky to get right. Maybe:
+ >
+ > "All assertions in PIL MUST be interpreted as a record of what has
+ > happened or been observed in some context, as opposed to what might
+ > happen or potential observations." In this, I am using the reference
+ > to a context to provide just enough wiggle-room for description in
+ > future or imagined contexts.
+ >
+ > "This specification does not prescribe the means by which assertions
+ > are made, for example on the basis of observations, inferences, or any
+ > other means."
+ > The phrasing "... assertions are made" here is jarring, if not
+ > confusing - I would think that assertions are made in PIL for the
+ > purposes of this spec. Suggest "... how assertions are arrived at,
+ > ..."
+ >
+ > "The language introduces a notion of "provenance container", which
+ > provides a default scope for assertions." The term "container" here
+ > is suggestive of a physical or logical encapsulation, which I don't
+ > think is meant. How about "provenance context"?
+ >
+ > [[ ... The model may define additional scoping rules for
+ > assertions. Identifiers can safely be used within that
+ > scope. Optionally, identifiers can be exported so that they can be
+ > used outside their default scope. The language does not prescribe the
+ > mechanisms by which identifiers are generated. ]]
+ >
+ > This spec is describing a data model, *not* a language. It says so at
+ > the top. As such I think it's entirely inappropriate to start
+ > defining linguistic constructs such as identifiers and scoping.
+ > Assuming the actual language used will be RDF, I'm not seeing how what
+ > you describe will be possible.
+ >
+ > "In this specification, when an assertion is defined to refer to
+ > another assertion about something, it does so by means of that thing's
+ > identifier." I don't understand what this is trying to say.
+ >
+ >
+
+ISSUE-60
+ > 5.1 BOB
+ >
+ > "A BOB represents an identifiable characterized entity."
+ >
+ > What does it mean to be "characterized" here? What does this tell us?
+ > What does it mean to not be "characterized"? If this refers to the
+ > attribute-based assertions mentioned earlier, does this mean that if
+ > there are no such assertions, an entity cannot be a "BOB"?
+ >
+ > [[ A BOB assertion is about a characterized entity, whose situation in
+ > the world is variant. A BOB assertion is made at a particular point
+ > and is invariant, in the sense that all the attributes are assigned a
+ > value as part of that assertion. ]]
+ >
+ > This section is, according to its heading, about "BOB". But this is
+ > defining a different concept, so shouldn't this be in a separate
+ > section?
+ >
+ > It seems to me that what we're talking about here is a "provenance
+ > assertion". I think it would be clearer to just describe that, e.g.
+ > [[ A provenance assertion is about an entity, whose situation in the
+ > world is generally assumed to be variable. ]]
+ >
+ > I either don't understand or don't agree with the second part of that
+ > description. The notion of assigning values as part of an assertion
+ > seems wrong to me (I think the notion of constraining attributes is
+ > the job of the IVP-of relation). I would expect something like:
+ >
+ > [[ A provenance assertion is made at a particular point and is
+ > invariant, in the sense that the attributes it mentions do not change
+ > for the entity concerned. ]]
+ >
+ > [[ A BOB assertion must describe a characterized entity over a
+ > continuous time interval in the world (which may collapse into a
+ > single instant). Characterizing an entity over multiple time intervals
+ > requires multiple BOB assertions, each with its own identifier. Some
+ > attributes may retain their values across multiple assertions. ]]
+ > This constraint seems rather unnecessary, and maybe
+ > counter-productive.
+ >
+ > Suppose we want to describe the collective observations of a
+ > particular telescope when pointed at a particular region of the sky.
+ > This might actually consist of a (possibly unknown) number of disjoint
+ > time-segments caused by the rotation of the earth and other factors. I
+ > can't see any clear benefit in being forced to treat these
+ > observation-sets as distinct entities.
+ >
+ > [[ There is no assumption that the set of attributes is complete and
+ > that the attributes are independent/orthogonal of each other. ]] I
+ > don't see this adding any useful information here. Remove?
+
+
+No issue raised
+
+ > 5.2 Process Execution
+ >
+ > Thinking about today's teleconference (28 July) and reading this, I'm
+ > seeing the key distinction between Entity and Process execution being
+ > like the philosophical distinction between continuants (endurant) and
+ > occurrents (perdurant)
+ > (http://en.wikipedia.org/wiki/Formal_ontology#Common_terms_in_formal_ontologies)
+
+
+ISSUE-59
+ > 5.3 Generation
+ >
+ > "characterized entity" is clumsy - suggest just "entity" (or
+ > whatever term is selected for "BOB").
+ >
+ > If I had not previously read about OPM, I'd be completely confused by
+ > the introduction of "role" here. Following the hyperlink here does
+ > not help at all.
+ >
+ > [[ Given an assertion isGeneratedBy(x,pe,r) or
+ > isGeneratedBy(x,pe,r,t), the activity denoted by pe and the entities
+ > used by pe dermine values of some of x's attributes. ]] I've no idea
+ > what this is trying to say.
+ >
+ >
+
+ISSUE-64
+ > 5.4 Use
+ >
+ > Same problem with 'role' as above.
+ >
+ > [[ A reference to a given BOB may appear in multiple use assertions
+ > that refer to a given process execution, but each of those use
+ > assertions must have a distinct role. ]] In light of the above, this
+ > seems nonsensical to me.
+ >
+ > [[ Given an assertion uses(pe,x,r) or uses(pe,x,r,t), at least one
+ > value of x's attributes is a pre-condition for the activity denoted by
+ > pe to terminate. ]]
+ >
+ > As written this doesn't make sense - a value of an attribute being a
+ > precondition seems like a type error to me. I think you mean
+ > something like availability of an attribute value. But even that is
+ > hard to follow. Suggest simplifying this to just:
+ >
+ > [[ Given an assertion uses(pe,x,r) or uses(pe,x,r,t), existence of x
+ > is a pre-condition for the activity denoted by pe to terminate. ]]
+ >
+ >
+
+
+ISSUE-56
+ > 5.5 Derivation
+ >
+ > [[ Given an assertion isDerivedFrom(B,A), one can infer that the use
+ > of characterized entity denoted by A precedes the generation of the
+ > characterized entity denoted by B. ]]
+ > Where does this notion of "use" come from in the absence of some
+ > referenced activity?
+ >
+ > Concerning transitivity of derivation:
+ >
+ > Suppose:
+ > A has attributes a0, a1
+ > B having attributes b0, b1 is derived from A, with b0 being dependent on a0
+ > C having attributes c0, c1, is derived from B with c1 being dependent on b1
+ >
+ > So none of the attributes of C can be said to be directly or
+ > indirectly dependent on attributes of A, which by the given definition
+ > is a requirement for derivation of C from A. Thus, as defined,
+ > derivation cannot be transitive.
+ >
+ > I don't really know if derivation should or should not be transitive,
+ > but the above seems to me like a problem of spurious
+ > over-specification. My suggestion for now would be to focus on what
+ > really matters and see what logical properties fall out later.
+
+
+ISSUE-57
+ > 5.8 IVP of
+ >
+ > The revised
+ > (w.r.t. http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions#IVP_of)
+ > treatment of IVP-of, and relabeling as "complement-of" completely
+ > overturns my understanding of what this was intended to capture. I
+ > understood the whole point of A IVP-of B was intended to capture the
+ > notion that A denotes a contextually constrained form of the entity
+ > denoted by B. I don't see what useful purpose this relation serves.
+ >
+ > From a practical perspective, given the asymmetric nature of IVP-of
+ > (as was) it is easy to express the effect of complement-of in RDF by
+ > introducing a new entity node. But I see no way of constructing the
+ > strict constraining role of IVP using complement-of.
+
+
+ISSUE-58
+ > 5.9 Time
+ >
+ > [[
+ > Time is defined according to [ISO8601].
+ > ]]
+ >
+ > I don't think it is appropriate for an open standard to be normatively
+ > dependent on a standard that is available only on payment of a charge
+ > for access. In this case, we could make reference to the XML Schema
+ > datatypes, which would also require us to think about my next point...
+ >
+ > As far as I'm aware, ISO 8601 covers both points in time and time
+ > intervals. As such a bare reference to ISO 8601 is not really an
+ > adequate definition: which do we want? I suspect
+ > http://www.w3.org/TR/xmlschema-2/#dateTime.
+
+
+No issue raised
+
+ >
+ >
+ > 5.10 Recipe Link
+ >
+ > I don't see what useful purpose this serves.
+ >
+ >
+ > 5.11 Role
+ >
+ > I can't completely follow the description given.
+ >
+ >
+ > 5.13 Ordering of Processes
+ >
+ > This section confusingly changes the style of presentation from
+ > sections dedicated to specific concepts to a vague discussion of
+ > possible relationships between things.
+ >
+
+ISSUE-61
+
+ > 5.14 Revision
+ >
+ > This seems to be just a different form of Derivation that happens to
+ > mention an agent. I'm not sure why I'd choose one over the other.
+ >
+ > I think this may be unnecessary - would not a similar effect be
+ > achieved by having a process execution of "revision" that uses b1,
+ > generates b2 and is controlled by ag (possibly with role "revise"?).
+ >
+ >
+ > 5.16 Provenance Container
+ >
+ > It's not clear what this is intended to be (maybe unsurprising, since
+ > the definition is absent). But it looks as if it's intended to be a
+ > syntactical kind of thing, which I feel is out of place in a data
+ > model description (especially if we're expecting to use RDF to
+ > represent the data). The next version of RDF will probably formally
+ > define named graphs - I'm not seeing what additional definition would
+ > be needed here.
+ >
+ >
+ >