prov-c reviews
author Luc Moreau <l.moreau@ecs.soton.ac.uk>
Mon, 06 Aug 2012 10:01:52 +0100
changeset 4271 6f5c41637b4b
parent 4270 5bba99e1e3b1
child 4272 7ad138a1a291
model/comments/issue-459-paul.txt
model/comments/issue-459-simon.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-459-paul.txt	Mon Aug 06 10:01:52 2012 +0100
@@ -0,0 +1,136 @@
+   > 
+   > 
+   > Hi prov-constraints editors:
+   > 
+   > This is my review of the constraints draft for last call. Sorry for
+   > the delay, I wanted to make sure that I could implement each type of
+   > constraint. I'm reviewing
+   > http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-constraints-20120723/prov-constraints.html
+   > 
+   > First, thanks for all your hard work. The document is precise and the
+   > approach is systematic. I have more detailed comments below. Answering
+   > the questions posed in
+   > http://lists.w3.org/Archives/Public/public-prov-wg/2012Jul/0346.html -
+   > 
+   > 1.  Is PROV-CONSTRAINTS ready to be released as a last call working
+   > draft (modulo editorial issues and resolution to the below issues)?
+   > 
+   > Yes, but there are some major editorial things that need to be done to
+   > help implementors. Additionally, in section 6 you mention a proof in
+   > an appendix. This is technical content, so it either needs to be
+   > included or not mentioned.
+   > 
+   > 2.  Regarding ISSUE-346: Is the role, meaning, and intended use of
+   > each type of inference or constraint clear?
+   > (http://www.w3.org/2011/prov/track/issues/346)
+   > 
+   > I think each definition is now precise and clear but as I will mention
+   > in my longer comments I think there is some additional intuition
+   > necessary to help implementers.
+   > 
+   > 3.  Regarding ISSUE-451: Are there any objections to the
+   > revision-is-alternate inference?
+   > (http://www.w3.org/2011/prov/track/issues/451)
+   > 
+   > Nope
+   > 
+   > 4.  Regarding ISSUE-454: Are the rules for disjointness clear and
+   > appropriate? (http://www.w3.org/2011/prov/track/issues/454)
+   > 
+   > Yes
+   > 
+   > 5.  Regarding ISSUE-458: Should influence (and therefore all
+   > subrelations, including communication) be irreflexive, or can it be
+   > reflexive (i.e., can wasInfluencedBy(x,x) be valid)?
+   > (http://www.w3.org/2011/prov/track/issues/458)
+   > 
+   > I think this comes down to what we think the role of the constraints is.
+   > My impression is that it is to encourage implementers to be both explicit
+   > and correct in the provenance they create. In terms of the example given
+   > in the issue, I would expect that if an activity called itself you
+   > would want to identify that as two independent activities. Thus, I
+   > think it's irreflexive. Actually, maybe this is suggesting the need
+   > for a part-of relation around activities.
+   > 
+   > 5.  Are there any objections to closing other open issues on
+   > PROV-CONSTRAINTS?  They are:
+   > 
+   > - http://www.w3.org/2011/prov/track/issues/387
+   > - http://www.w3.org/2011/prov/track/issues/394
+   > - http://www.w3.org/2011/prov/track/issues/452
+   > - http://www.w3.org/2011/prov/track/issues/453
+   > 
+   > I think all these issues are addressed.
+   > 
+   > 6.  Are there any new issues concerning definitions, constraints, or inferences?
+   > No
+   > 
+   > 
+   > ==Comments==
+   > 
+   > My approach to reviewing the constraints was to attempt to implement
+   > the constraints and inferences using semantic web technologies.  You
+   > can find the beginning of the implementation at
+   > https://github.com/pgroth/prov-constraints-validator-spin . I have
+   > satisfied myself that the specification can be implemented using SPIN
+   > RDF. However, I'm not 100% certain, which is a bit of a concern.
+   > Additionally, to get things to work I had to make sure the inferences
+   > were done in one pass, which may go against what is specified in the
+   > document.
+   > 
+   > My major concern is the lack of intuition about what valid provenance
+   > is. I would describe it as follows: valid provenance identifies
+   > exactly partial states and those partial states are correctly ordered.
+   > I'm trying to implement the spec but as an implementor I need to know
+   > my broad goal when implementing these constraints.
+   > 
+   > A key thing that it took me a while to get is that I need to generate
+   > all qualified relations before applying the constraints. This is an
+   > important point because it's sometimes unclear what should be
+   > considered an inference or a constraint.
+   > 
+   > Concretely, in the Event Ordering Constraints, the constraints are
+   > expressed stating that the head of the rule leads to an assertion of
+   > precedence. But actually, the thing is that you have to assert all
+   > these precedence relations first and then check for cycles. So I
+   > guess, are these really constraints? At any rate, the notion of
+   > checking for cycles needs to be brought out more.
+   > 
+   > Overall, I think an implementor could use some examples that show the
+   > results of inference and the subsequent constraint checking and just
+   > more intuition about what valid and invalid provenance graphs look
+   > like.
+   > 
+   > ==Some comments per section==
+   > 
+   > Section 3
+   > I'm worried about the MUST in the compliance list "When determining
+   > whether two PROV instances are equivalent, an application must
+   > determine whether their normal forms are equal, as specified in
+   > section 6. Normalization, Validity, and Equivalence."
+   > 
+   > Does this imply that I have to implement this to be compatible with
+   > PROV-DM? I would use SHOULD…
+   > 
+   > Section 5.1
+   > - From an RDF perspective, do I need to worry about merging? If the
+   > assumption is that I'm provided an RDF serialization to check then no
+   > merging is necessary. I guess the question is: is merging PROV-N specific?
+   > 
+   > Section 6.1
+   > - Why do we need to talk about a hierarchy of bundles? Isn't the
+   > point just that you want a set of provenance descriptions independent
+   > of bundles?
+   > 
+   > Minor Notes:
+   > 
+   > - PROV objects or prov constructs - check the consistency on this
+   > - inconsistency with naming. Do you always want to end inference names
+   > with "-inference"? See Inference 11 (derivation-generation-use) and
+   > Inference 10 (wasEndedBy-inference)
+   > 
+   > 
+   > Thanks
+   > Paul
+   > 
+   > 
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/model/comments/issue-459-simon.txt	Mon Aug 06 10:01:52 2012 +0100
@@ -0,0 +1,179 @@
+
+
+Hello PROV-Constraints authors,
+
+Please find my review below (answers to your questions first, then detailed points).
+
+1.  Is PROV-CONSTRAINTS ready to be released as a last call working draft (modulo editorial issues and resolution to the below issues)?
+
+Yes, though I think my points C and D below are important to try to resolve before release as they directly concern the intended meaning of the constraints.
+
+2.  Regarding ISSUE-346: Is the role, meaning, and intended use of each type of inference or constraint clear? (http://www.w3.org/2011/prov/track/issues/346)
+
+Yes. In fact, all the points in my review concern the explanatory text, not the inferences/constraints themselves.
+
+3.  Regarding ISSUE-451: Are there any objections to the revision-is-alternate inference? (http://www.w3.org/2011/prov/track/issues/451)
+
+No objection. I think it is an important and potentially useful inference, and I would object to its removal.
+
+4.  Regarding ISSUE-454: Are the rules for disjointness clear and appropriate? (http://www.w3.org/2011/prov/track/issues/454)
+
+Seemed clear to me.
+
+5.  Regarding ISSUE-458: Should influence (and therefore all subrelations, including communication) be irreflexive, or can it be reflexive (i.e., can wasInfluencedBy(x,x) be valid)? (http://www.w3.org/2011/prov/track/issues/458)
+
+I think it should be allowed to be reflexive, for the reasons given in the issue. Influence is effectively between events, I believe, and the lifetimes of activities and entities involve multiple events, so it makes sense that one event in something's lifetime could be influenced by another event in the same thing's lifetime. ISSUE-458 describes this for an activity, so here's an example for an entity: I define entity my:book for a book I'm writing. It changes content as I write it, but has a consistent identity. At some point, I finish a complete draft and it is only 5 pages long, so I entitle it "The short book". One mutable attribute (length) of the entity has influenced another mutable attribute (title) of the same entity, so the entity has influenced itself.
+
+6.  Are there any objections to closing other open issues on PROV-CONSTRAINTS?  They are:
+
+http://www.w3.org/2011/prov/track/issues/387
+I have no problem with the mandatory constraint generation-uniqueness as long as it means that, for any entity, the generation event is unique and not that the generating activity is unique. It appears fine as currently stated. See point C below for more detailed comment on this issue.
+
+http://www.w3.org/2011/prov/track/issues/394
+This issue seems to be closed. Alternate should be reflexive.
+
+http://www.w3.org/2011/prov/track/issues/452
+The relation inferred should be "wasAssociatedWith(_id2;a, ag2, -, _attrs2)". As the document currently says, this does not necessarily mean that a plan does not exist, just that in that statement we have chosen not to identify it if it does exist. This does not preclude other statements on the association that do identify the plan. See related point B below.
+
+http://www.w3.org/2011/prov/track/issues/453
+I think the document is fine as it is. The identifier is the same in the original and inferred statements because they are both descriptions of the same event, just as if we had two wasGeneratedBy (for example) statements with the same ID.
+
+7.  Are there any new issues concerning definitions, constraints, or inferences? If so, please raise as new issues to be addressed before LC vote, ideally with a suggested change that would address the issue.
+
+The inferences and constraints themselves are fine, I think. See my detailed review below for more.
+
+Optional identifiers
+----------
+A. Section 4.1: It was unclear whether this section is purely about PROV-N and how the same statement can take different forms in it, or whether something syntax-independent is being stated. If the former, maybe the section title could include 'PROV-N' to clarify this. If the latter, it would be helpful to indicate what the 'optional identifiers' etc. correspond to in the DM independent of PROV-N.
+
+B. Remark under Definition 4: "In an association... the absence of a plan means: either no plan exists, or a plan exists but is not identified... Similarly, a wasDerivedFrom... that specifies an activity is not equivalent to wasDerivedFrom(id;e2,e1,-...)"
+I was not clear what you were implying here. That wasDerivedFrom with - for the activity parameter can mean that no activity existed? If so, how did the derivation occur? Or that it can mean the activity existed but is not identified? But isn't that what - means in every other relation? Why would activity in wasDerivedFrom be a special case?
+
+Unique generations
+-----------
+C. Immediately following Inference 12, the text says "the entity denoted by e2 is generated by at most one activity (see Constraint 27". The Remark below repeats this, "at most one activity could generate the entity e2."
+
+This seems wrong. Constraint 27 says that e2 is generated by only one generation event, not by only one activity. The distinction between these is important. In the primer's example, there is an activity ex:compile which is decomposed into steps ex:compose and ex:illustrate. While there is only one (implicit) generation event for entity ex:chart1, both ex:compile and ex:illustrate can be asserted to have generated the entity.
+
+Note I see nothing in the constraints (Inference 12, Constraints 27 and 28) that the primer example contradicts, and nothing intuitively invalid about the primer example. The text referred to above, however, states something more than the constraints do.
+
+Failed merges
+---------
+D. Section 5.1: I did not find it clear what the consequences of a failed merge are. If merging fails (step 3, first algorithm in section), does this mean the instance containing the hypotheses is invalid? Or does it mean that the output instance of the merge is the same as the input instance? Or both?
+
+E. Section 6 says "A normal form of a PROV instance may not exist when a uniqueness constraint fails due to merging failure." This doesn't clarify things, as it says "may not" and I'm unclear under what circumstances a failed merge means that a normal form does not exist and under what circumstances it does not.
+
+Bundles
+-------
+F. Section 6.1 seems a bit out of the blue. "The definitions [etc.]... assume a PROV instance with exactly one bundle", and then multiple bundles are handled as exactly the same number of instances. Why? Why is there a connection between number of instances and number of bundles? Why would a bundle be considered to be only one instance? I thought a bundle was an identified set of statements, allowing for provenance of provenance, which seems a distinct matter from whether a set of statements are valid. It seems fine for a user to treat one bundle as one instance if they want to, but there's no reason given why this is the general case.
+
+Misc
+------
+G. Section 2.2, paragraph 2: Is this definition of an entity (as something to provide provenance of) consistent with the current DM, PAQ etc? I thought we changed the definition because an entity is not just something with provenance but also part of other things' provenance, and that an activity can have provenance too?
+
+H. Inference 11: Can we also infer that _t1 precedes _t2? I'm not sure whether the event ordering constraints cover this, as they are inferred times.
+
+I. Inference 12: I agree this appears redundant
+
+J. Inference 18: Shouldn't the antecedent be expressed in PROV-N, "entity(e)", to be consistent with other rules?
+
+K. Section 5.1, paragraph 1: In the example merge, I wasn't clear why variables "t1" and "t2" disappeared in the merged version but "a" did not. Are the three terms existential variables or constants? If "t1" and "t2" are of a different kind to "a", can this be indicated, e.g. "_t1" instead of "t1"?
+
+L. Section 5.1, paragraph 2: What is an "unordered list"? Do you mean a list for which the order has no meaning? If so, why would you not say "set"? What is the relevant difference?
+
+Typos
+-----
+Section 1.1, paragraph 2: "Some of these ariables"
+Remark under Definition 3: What does the "also" refer to in "There are also no..."?
+Sentence above Inference 8: "activity statemen,t"
+Sentence above Constraint 45: "specalizes"
+Sentence above Constraint 46: "specalizes"
+
+Thanks,
+Simon
+
+Dr Simon Miles
+Senior Lecturer, Department of Informatics
+Kings College London, WC2R 2LS, UK
++44 (0)20 7848 1166
+
+Automatically Adapting Source Code to Document Provenance:
+http://eprints.dcs.kcl.ac.uk/1397/
+________________________________________
+From: Luc Moreau [[email protected]]
+Sent: 23 July 2012 15:09
+To: [email protected]
+Subject: Re: PROV-ISSUE-459 (prov-constraints-lc-review): PROV-CONSTRAINTS  review [prov-dm-constraints]
+
+... As some noted I shouldn't apologise for getting the document ready ...
+for the delay in doing so!
+
+Luc
+
+On 23/07/12 15:04, Luc Moreau wrote:
+> Dear all,
+>
+> Apologies in getting the prov-constraints document ready.
+> It is now available for review at
+>
+> http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-constraints-20120723/prov-constraints.html
+>
+>
+> It would be great if we can still follow the timetable we agreed on
+> the call last week.
+>
+> Regards,
+> The prov-c-team
+>
+> On 20/07/12 12:02, Provenance Working Group Issue Tracker wrote:
+>> PROV-ISSUE-459 (prov-constraints-lc-review): PROV-CONSTRAINTS review
+>> [prov-dm-constraints]
+>>
+>> http://www.w3.org/2011/prov/track/issues/459
+>>
+>> Raised by: James Cheney
+>> On product: prov-dm-constraints
+>>
+>> Hi,
+>>
+>> This issue is to capture review comments for the next draft of
+>> PROV-CONSTRAINTS, which will be released soon.
+>>
+>> Please answer the following review questions:
+>>
+>> 1.  Is PROV-CONSTRAINTS ready to be released as a last call working
+>> draft (modulo editorial issues and resolution to the below issues)?
+>>
+>> 2.  Regarding ISSUE-346: Is the role, meaning, and intended use of
+>> each type of inference or constraint clear?
+>> (http://www.w3.org/2011/prov/track/issues/346)
+>>
+>> 3.  Regarding ISSUE-451: Are there any objections to the
+>> revision-is-alternate inference?
+>> (http://www.w3.org/2011/prov/track/issues/451)
+>>
+>> 4.  Regarding ISSUE-454: Are the rules for disjointness clear and
+>> appropriate? (http://www.w3.org/2011/prov/track/issues/454)
+>>
+>> 5.  Regarding ISSUE-458: Should influence (and therefore all
+>> subrelations, including communication) be irreflexive, or can it be
+>> reflexive (i.e., can wasInfluencedBy(x,x) be valid)?
+>> (http://www.w3.org/2011/prov/track/issues/458)
+>>
+>> 5.  Are there any objections to closing other open issues on
+>> PROV-CONSTRAINTS?  They are:
+>>
+>> http://www.w3.org/2011/prov/track/issues/387
+>> http://www.w3.org/2011/prov/track/issues/394
+>> http://www.w3.org/2011/prov/track/issues/452
+>> http://www.w3.org/2011/prov/track/issues/453
+>>
+>> 6.  Are there any new issues concerning definitions, constraints, or
+>> inferences? If so, please raise as new issues to be addressed before
+>> LC vote, ideally with a suggested change that would address the issue.
+>>
+>>
+>> --James
+>>
+>>
+>>