prov-n edits
author Paolo Missier <pmissier@acm.org>
Thu, 12 Apr 2012 12:19:54 +0100
changeset 2269 0869631830d5
parent 2268 f2dc6ac32a45
child 2270 3c5d41f861ca
model/prov-n.html
--- a/model/prov-n.html	Thu Apr 12 12:19:12 2012 +0100
+++ b/model/prov-n.html	Thu Apr 12 12:19:54 2012 +0100
@@ -248,21 +248,30 @@
 <section id="purpose"> 
 <h2>Purpose of this Document and target audience</h2>
 
-This document describes PROV-N, a formal notation designed to express instances of the PROV data model in a way that is technology independent, and with human readabilty in mind. At the same time, PROV-N is a formal language, for which parsers can be implemented.
+A key goal of PROV-DM is the specification of a machine-processable data model for provenance. However, communicating provenance between humans is also important when teaching, illustrating, formalizing, and discussing provenance-related issues. 
 
-PROV-N has several uses:
+<!-- As such, representations of PROV-DM are available in RDF and XML. -->
+
+With these two requirements in mind, this document introduces PROV-N, a syntax notation designed to  write instances of the PROV-DM data model according to the following design principles:
+<ul>
+<li>Technology independence. PROV-N provides a simple syntax that can be mapped to technology-specific formats such as XML, RDF, JSON, and possibly more;
+
+<li>Human readability. PROV-N follows a functional syntax style that is meant to be easily human-readable so it can be used in illustrative examples, such as those presented in the PROV documents suite;
+
+<li>Formality. PROV-N is defined through a formal grammar amenable to use with standard parser generators.
+
+ </ul>
+  
+PROV-N has several known uses:
 <ul>
 <li> It is the notation used in the examples found in  [[PROV-DM]], as well as in the definition of PROV-DM constraints [[PROV-DM-CONSTRAINTS]]; </li>
 <li>  It is a source language for the encoding of PROV-DM instances into a variety of target languages, including amongst others  RDF [[PROV-RDF]] and XML [[PROV-XML]]; </li>
 <li> It provides the basis for a formal semantics of PROV-DM  [[PROV-SEM]], in which an interpretation is given to each element of the PROV-N language.
 </ul>
 
-This document introduces the PROV-N grammar along with examples of its usage, and a justification for the language design choices.<br/>
-Its target audience includes primarily implementors of new PROV-DM encodings, and thus in particular of PROV-N parsers. It also includes those readers of the  [[PROV-DM]] and of  [[PROV-DM-CONSTRAINTS]] documents, who are interested in the details of the formal language underpinning the notation used in the examples and in the definition of the constraints.
+This document introduces the PROV-N grammar along with examples of its usage.<br/>
+Its target audience includes developers of provenance management applications as well as implementors of new PROV-DM encodings, and thus in particular of PROV-N parsers. It also includes those readers of the  [[PROV-DM]] and of  [[PROV-DM-CONSTRAINTS]] documents who are interested in the details of the formal language underpinning the notation used in the examples and in the definition of the constraints.
 
-<!--
-<p>PROV-N was designed to be as close as possible to PROV-DM without the syntactic bias and modelling constraints that concrete technologies bring with them, e.g., XML's choice between attribute and element, RDF's reliance on triples, or JSON's usage of dictionaries. </p>
--->
 
 </section>
 
@@ -293,7 +302,7 @@
  <h3>PROV-DM Namespace</h3>
 
 
-<p>The PROV namespace is <span class="name">http://www.w3.org/ns/prov#</span>.</p>
+<p>The PROV namespace is <span class="name">http://www.w3.org/ns/prov#</span> with prefix <span class="name">prov:</span>.</p>
 
 <p> All the elements, relations, reserved names and attributes introduced in this specification belong to the PROV namespace.</p>
 </section>
@@ -312,124 +321,33 @@
 </section> 
 
 
-<section id="prov-n-rationale"> 
-<h3>Design Rationale for PROV-N</h3>
-
-<p>A key goal of PROV-DM is the specification of a machine-processable data model for provenance so that application having obtained the provenance of the resource they manipulate can reason about such provenance. As such, representations of PROV-DM are available in RDF and XML.
-</p>
-
-<p>However, communicating provenance between humans is also important when teaching, illustrating, formalizing, and discussing provenance-related issues.  To this end, PROV-N is a notation that is designed to  write instances of the PROV-DM data model in a compact textual form, without the syntactic baggage and constraints coming with a markup language such as XML or a description framework such as RDF. </p>
-
-<ul>
-<li>PROV-N adopts a <em>functional notation</em> consisting a name and a series of arguments in bracket.
-<div class="anexample">
-<pre class="codeexample" >
-wasDerivedFrom(e2, e1, a, g2, u1)
-</pre>
-</div>
-</li>
-
-<li>The interpretation of PROV-N arguments is defined according to their <em>position</em> in the list of arguments. This convention allows for a compact notation. </li>
-
-<li><p>
-PROV-N <em>optional arguments</em> need not be specified (as long as this does not lead to ambiguity).</p>
-<div class="anexample">
-<p>The activity, generation, and usage are specified in the first derivation, whereas they are not in the second.</p>
-<pre class="codeexample" >
-wasDerivedFrom(e2, e1, a, g2, u1)
-wasDerivedFrom(e2, e1)
-</pre>
-</div>
-
-<div class="anexample">
-<p>Activity <span class="name">a1</span> does not have start and end times specified, whereas <span class="name">a2</span> does</p>
-<pre class="codeexample" >
-activity(a1)
-activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
-</pre>
-</div>
-</li>
+<section id="grammar-notation"> 
+<h3>Functional-style Syntax</h3>
 
-<li><p>For cases where it is desirable to indicate which arguments have not been specified, PROV-N uses  the <em>syntactic marker</em> <span class="name">-</span> for unspecified arguments.</p>
-<div class="anexample">
-<p>The activity, generation, and usage are specified in the first derivation, whereas they are not in the second, but have been explicitly marked as such.</p>
-<pre class="codeexample" >
-wasDerivedFrom(e2, e1, a, g2, u1)
-wasDerivedFrom(e2, e1, a, -, -)
-wasDerivedFrom(e2, e1, -, -, -)
-</pre>
-</div>
-
-<div class="anexample">
-<p>Activity <span class="name">a1</span> does not have start and end times specified, but  they are been marked syntactically.</p>
-<pre class="codeexample" >
-activity(a1, -, -)
-</pre>
-</div>
-</li>
-
-<li><p>When an expression has an identifier, the identifier always occur in <em>first position</em>.   For expressions with optional identifier, it may be replaced by the syntactic marker  <span class="name">-</span>.</p>
-
-<div class="anexample">
-<p>Derivation has an optional identifier. In the first derivation, the identifier is not expressed. It is explicit in the second, and marked by a <span class="name">-</span> in the third.</p>
-<pre class="codeexample" >
-wasDerivedFrom(e2, e1)
-wasDerivedFrom(d, e2, e1)
-wasDerivedFrom(-, e2, e1)
-</pre>
-</div>
-
-<li><p>Most expressions have an optional set of attribute-value pairs, which occur in <em>last position</em>, and delimited by square brackets. </p>
-<div class="anexample">
-The first activity does not have any attributes. The second has an empty list of attributes. The third activity  has two attributes. 
-<pre class="codeexample" >
-activity(ex:a10)
-activity(ex:a10, [])
-activity(ex:a10, [ex:param1="a", ex:param2="b"])
-</pre>
-</div>
-<li id="positional-vs-named-attributes"> PROV-N exposes attributes that PROV-DM provides an interpretation for [[PROV-DM-CONSTRAINTS]] directly as positional arguments of expressions, whereas those for which PROV-DM provides no interpretation are expressed among the optional attribute-value pairs.  This latter category of attributes 
-includes
-  <span class="name">prov:label</span>,
-  <span class="name">prov:location</span>,
-  <span class="name">prov:role</span>, and
-  <span class="name">prov:type</span>.
-
-
-<li id="subject-object-order">
+<p> PROV-N adopts a functional-style syntax consisting of a relation name and an ordered list of terms.
 All PROV-DM relations involve two primary elements, the <em>subject</em> and the <em>object</em>, in this order. Furthermore, some relations also admit additional elements that further characterize them.
 <div class="anexample">
 The following expression should be read as "<span class="name">e2</span> was derived from <span class="name">e1</span>". Here <span class="name">e2</span> is the subject, and  <span class="name">e1</span> is the object.
 <pre class="codeexample" >
 wasDerivedFrom(e2, e1)
 </pre>
-In the following expression, the optional activity <span class="name">a</span> has been added to further qualify the derivation:
+  </div>
+  
+<div class="anexample">
+In the first two expressions below, the optional activity <span class="name">a</span>, and then the generation and usage identifiers, have been added to further qualify the derivation. The third expression shows an activity qualified with its optional start and end times:
 <pre class="codeexample" >
 wasDerivedFrom(e2, e1, a)
+wasDerivedFrom(e2, e1, a, g2, u1)
+activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
 </pre>
- 
 </div>
-</li>
-</ul>
-
-
-
 
 
-
-</section>
-
-<section id="grammar-notation"> 
-<h3>Grammar Notation</h3>
-
-<p>This specification includes a grammar for PROV-N expressed using the Extended  Backus-Naur Form (EBNF) notation.</p>
-
+The grammar is specified using the Extended  Backus-Naur Form (EBNF) notation.<br/>
+Each production rule (or <dfn>production</dfn>, for short) in the grammar defines one non-terminal symbol <span class="nonterminal">E</span>, in the following form:</p>
 <div class="grammar">
-<p> Each production rule (or <dfn>production</dfn>, for short) in the grammar defines one non-terminal symbol, in the form:</p>
-<p>
 <span class="nonterminal">E</span>&nbsp;::= <em>expression</em>
-</p>
-
+</div>
 
 Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:
 <ul>
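The functional-style syntax introduced in this hunk (a name followed by an ordered list of terms, with an optional bracketed attribute list in last position) can be illustrated with a minimal Python sketch. This is not the PROV-N grammar and not part of the changeset; the helper names `split_top_level` and `parse_expression` are hypothetical, and the sketch assumes single-line expressions with no nested parentheses or escaped quotes.

```python
import re

def split_top_level(s):
    """Split on commas that are not inside [...] or a double-quoted string."""
    parts, depth, in_str, cur = [], 0, False, []
    for ch in s:
        if ch == '"':
            in_str = not in_str
        elif not in_str:
            if ch == '[':
                depth += 1
            elif ch == ']':
                depth -= 1
            elif ch == ',' and depth == 0:
                parts.append(''.join(cur).strip())
                cur = []
                continue
        cur.append(ch)
    tail = ''.join(cur).strip()
    if tail:
        parts.append(tail)
    return parts

def parse_expression(text):
    """Return (name, positional_terms, attributes) for one expression.

    attributes is None when no attribute list is given and {} when an
    empty list [] is given, so the two cases stay distinguishable.
    """
    m = re.fullmatch(r'\s*([A-Za-z]\w*)\s*\((.*)\)\s*', text)
    if m is None:
        raise ValueError(f"not a functional-style expression: {text!r}")
    name, body = m.group(1), m.group(2)
    terms = split_top_level(body)
    attrs = None
    if terms and terms[-1].startswith('[') and terms[-1].endswith(']'):
        attrs = {}
        for pair in split_top_level(terms.pop()[1:-1]):
            key, _, value = pair.partition('=')
            attrs[key.strip()] = value.strip().strip('"')
    return name, terms, attrs

print(parse_expression('wasDerivedFrom(e2, e1, a, g2, u1)'))
print(parse_expression('activity(ex:a10, [ex:param1="a", ex:param2="b"])'))
```

Terms are kept as plain strings in their original positions, since the interpretation of a term depends on its position in the expression.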
@@ -455,17 +373,10 @@
 </li>
 
 </ul>
-</div>
-
-</section>
 
-<section id="prov-n-expressions"> 
-<h2>PROV-N Productions per Component</h2>
+<div class="note">this is confusing. look at http://www.w3.org/TR/owl2-syntax/#BNF_Notation for example</div>
 
-<p>A PROV-N document consists of a sequence of <a title="expression">expressions</a>, wrapped up in an <a>expression container</a> with some namespace declarations. This section focuses on the definition of <a title="expression">expressions</a>. </p>
-
-
-<p>Instances of the PROV-DM data model are expressed as PROV-N <dfn title="expression">expressions</dfn>, which have a text conformant with the toplevel <a>production</a> <span class="nonterminal">expression</span> of the grammar. </p>
+The top level nonterminal of the grammar is <span class="nonterminal">expression</span>, defined as follows.
 
 <div class='grammar'>
 <table style="background: white; border=0; ">
@@ -508,9 +419,108 @@
 </table>
 </div>
 
-<p>In the rest of the section, productions are presented for each expression, followed by small examples illustrating the syntax of expressions compliant with the presented productions. </p> 
+Each expression type,  of the form <span class="nonterminal">XExpression</span>, i.e.,  <span class="nonterminal">entityExpression</span>, <span class="nonterminal">activityExpression</span> etc., corresponds to one element X (entity, activity, etc.) of PROV-DM.
+<p>A PROV-N document consists of a collection of <a title="expression">expressions</a>, wrapped in an <a>expression container</a> with some namespace declarations, such that the text for an element X matches the corresponding <span class="nonterminal">XExpression</span> production of the grammar.
+
+</section>
+
+<section  id="prov-n-conventions">
+<h3>General grammar conventions</h3>
+
+The following conventions are introduced concerning the specification of optional terms in an expression, of default terms, and of terms whose value is not specified, either because it is not available, or because it does not apply.
 
 
+<section id="prov-n-optionals"> 
+<h3>Optional terms in a relation expression</h3>
+
+Some terms in a relation may be optional. For example:
+
+<div class="anexample">
+<pre class="codeexample" >
+wasDerivedFrom(e2, e1, a, g2, u1)
+wasDerivedFrom(e2, e1)
+</pre>
+In a derivation expression, the activity, generation, and usage are optional. They are specified in the first derivation, but not in the second.
+</div>
+
+<div class="anexample">
+<pre class="codeexample" >
+activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
+activity(a1)
+</pre>
+The start and end times of an activity are optional. They are specified in the first expression (for <span class="name">a2</span>), but not in the second (for <span class="name">a1</span>).
+</div>
+
+The general rule for optionals is that, if <em>none</em> of the optionals are used in the expression, then they are simply omitted, resulting in a simpler expression as in the examples above.<br/>
+However, it may be the case that only some of the optional terms are omitted. Because the position of the terms in the expression matters, in this case an additional marker must be used to indicate that a particular term is not available. The symbol  <span class="name">-</span> is used for this purpose.
+
+<div class="anexample">
+<p>In the first expression below, all optional terms are specified. In the second, only the last one is specified, so the marker must be used for the missing terms. In the third, no marker is necessary because all <em>remaining</em> optional terms after <span class="name">a</span> are omitted.
+
+<pre class="codeexample" >
+wasDerivedFrom(e2, e1, a, g2, u1)
+wasDerivedFrom(e2, e1, -, -, u1)
+wasDerivedFrom(e2, e1, a)
+</pre>
+</div>
+Note that the more succinct form is just shorthand for a complete expression with all the markers specified:
+<div class="anexample">
+<pre class="codeexample" >
+activity(a1)
+activity(a1, -, -)
+</pre>
+</div>
+</li>
+
+</section>
+
+<section id="prov-n-standard-terms"> 
+<h3>Relation identifiers and attributes</h3>
+
+Most expression types defined in the grammar include the use of two terms: an identifier for the relation, and a set of attribute-value pairs, delimited by square brackets. Both are optional (unless specified otherwise). By convention, the identifier is the first term in any expression type, and the  set of attribute-value pairs is the last. <br/>
+Consistent with the convention on optional terms, the  '<span class="name">-</span>' marker can be used when the identifier is not available. Additionally, the grammar rules are defined in such a way that the optional identifier can be omitted altogether with no ambiguity arising.
+
+<div class="anexample">
+<p>Derivation has an optional identifier. In the first expression, the identifier is not available. It is explicit in the second, and marked by a <span class="name">-</span> in the third.</p>
+<pre class="codeexample" >
+wasDerivedFrom(e2, e1)
+wasDerivedFrom(d, e2, e1)
+wasDerivedFrom(-, e2, e1)
+</pre>
+</div>
+
+A distinction is made between relations with no attributes, and relations that include an empty list of attributes.
+<div class="anexample">
+<p>The first activity does not have any attributes. The second has an empty list of attributes. The third activity  has two attributes. 
+<pre class="codeexample" >
+activity(ex:a10)
+activity(ex:a10, [])
+activity(ex:a10, [ex:param1="a", ex:param2="b"])
+</pre>
+</div>
+
+</section>
+
+<section id="prov-n-attributes"> 
+<h3>Optional attributes</h3>
+
+<div class="note">This looks out of place --- why is this not in DM? </div>
+
+Name-value attribute pairs are intended for arbitrary, user-defined terms that are used to qualify the relation. Amongst these, a few are defined as standard in PROV-DM. These are:
+  <span class="name">prov:label</span>,
+  <span class="name">prov:location</span>,
+  <span class="name">prov:role</span>, and
+  <span class="name">prov:type</span>.
+
+</section>
+
+
+</section>  <!-- conventions for optionals etc. -->
+
+<section id="prov-n-expressions"> 
+<h2>PROV-N Productions per Component</h2>
+
+This section introduces grammar productions for each expression type, followed by small examples illustrating the use of expressions in PROV-N. </p> 
 
 
 <section id="component1"> 
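The convention on optional terms added in this hunk (omitted trailing terms are shorthand for a complete expression with `-` markers) can be sketched in Python as follows. The `FULL_ARITY` table and the function names are assumptions made for illustration: the term counts are read off the examples in this changeset, ignore the optional identifier and attribute list, and are not taken from the grammar itself.

```python
MARKER = "-"

# Assumed number of positional terms in the complete form, based only on
# the examples in this changeset (identifier and attributes left out).
FULL_ARITY = {
    "wasDerivedFrom": 5,  # e2, e1, activity, generation, usage
    "activity": 3,        # id, start time, end time
}

def expand(name, terms):
    """Pad omitted trailing optional terms with the '-' marker."""
    arity = FULL_ARITY[name]
    if len(terms) > arity:
        raise ValueError(f"too many terms for {name}: {terms}")
    return list(terms) + [MARKER] * (arity - len(terms))

def abbreviate(terms):
    """Drop trailing '-' markers; markers before a specified term stay.

    Assumes required terms are never written as '-'.
    """
    out = list(terms)
    while out and out[-1] == MARKER:
        out.pop()
    return out

print(expand("activity", ["a1"]))
print(abbreviate(["e2", "e1", "-", "-", "u1"]))
```

Markers that precede a specified term cannot be dropped because the meaning of each term depends on its position, which is why `abbreviate` only trims from the end.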
@@ -521,7 +531,9 @@
 
 <div class="withPn">
 <p>
-An entity's text matches the <span class="nonterminal">entityExpression</span> production.
+ The <span class="nonterminal">entityExpression</span> production is used to express  entity relations:
+
+  <div class="note">only changed here to see if it works</div>
 </p>
 
 <div class='grammar'>