Revised section on provenance discovery to include optional content type parameter
author Graham Klyne
Thu, 04 Aug 2011 18:05:50 +0100
changeset 111 0068a01361c0
parent 110 a055a7987aa7
child 112 4a712b32d6fb
paq/provenance-access.html
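
The two request URI forms this changeset introduces (`Service-URI?uri=Target-URI`, optionally followed by `&type=Content-type`) can be sketched roughly as below. This is an illustrative sketch only, not part of the changeset: the function name `discovery_request_uri` is hypothetical, and the escaping policy is an assumption, since the draft text shows the target URI unescaped and does not say how reserved characters should be handled.

```python
from typing import Optional
from urllib.parse import quote


def discovery_request_uri(service_uri: str, target_uri: str,
                          content_type: Optional[str] = None) -> str:
    """Build a provenance discovery request URI (hypothetical helper)."""
    # Assumption: keep ":" and "/" literal so simple targets look like the
    # draft's examples, and percent-escape everything else (e.g. "&", "?", "#")
    # so the target cannot be confused with the service's own query parameters.
    request_uri = service_uri + "?uri=" + quote(target_uri, safe=":/")
    if content_type is not None:
        # Second form: ask for a specific format without content negotiation.
        request_uri += "&type=" + quote(content_type, safe="/")
    return request_uri


if __name__ == "__main__":
    service = "http://example.net/provenance-discovery"
    target = "http://example.info/qdata/"
    print(discovery_request_uri(service, target))
    print(discovery_request_uri(service, target, "application/json"))
```

With the first form, a client would typically state its preferred format in an HTTP `Accept:` header; with the second, the `type` parameter carries that preference in the URI itself.
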
--- a/paq/provenance-access.html	Thu Aug 04 15:58:51 2011 +0100
+++ b/paq/provenance-access.html	Thu Aug 04 18:05:50 2011 +0100
@@ -133,6 +133,10 @@
         These particular cases are selected as corresponding to primary current web protocol and data formats.
       </p>
 
+      <p class="issue">
+        The proposals so far assume that the URI of the original resource is known.  This may not always be the case (e.g. a file received by means other than web transfer), and it is desirable to have mechanisms to convey the original resource URI as well as any provenance URIs.  See <a href="http://www.w3.org/2011/prov/track/issues/46">ISSUE 46</a>.
+      </p>
+
       <section>
         <h2>Resource accessed by HTTP</h2>
         <p>
@@ -152,12 +156,9 @@
           The presence of a provenance link in an HTTP response does not preclude the possibility that other publishers may offer provenance information about the same resource.  In such cases, discovery of the additional provenance information must use other means (e.g. see <a href="#third-party-services" class="sectionRef"></a>).
         </p>
 
-        <section>
-          <h2>Open issues</h2>
-          <p>
-            Are the provenance resources indicated in this way to be considered authoritative?  I.e. if the client trusts information returned by the server (e.g. is prepared to act on inferences based on the returned data), should it also trust the provenance data, or should trust in the linked provenance data be determined separately?  If the linked data <em>is</em> to be trusted, then the data from multiple linked provenance resources MUST be consistent if it is to be meaningful.  I favour an approach whereby trust in the provenance resources is established independently, which is similar to the situation for any other resource; e.g. based on the domain that serves it, or an associated digital signature.
-          </p>
-        </section>
+        <p class="issue">
+          Are the provenance resources indicated in this way to be considered authoritative?  I.e. if the client trusts information returned by the server (e.g. is prepared to act on inferences based on the returned data), should it also trust the provenance data, or should trust in the linked provenance data be determined separately?  If the linked data <em>is</em> to be trusted, then the data from multiple linked provenance resources MUST be consistent if it is to be meaningful.  I favour an approach whereby trust in the provenance resources is established independently, which is similar to the situation for any other resource; e.g. based on the domain that serves it, or an associated digital signature.
+        </p>
 
       </section>
 
@@ -193,13 +194,9 @@
         <p class="note">
            An alternative option would be to use an HTML <code>&lt;meta&gt;</code> element to present provenance links.  The <code>&lt;Link&gt;</code> is preferred as it reflects more closely the intended goal, and has been defined with somewhat consistent applicability across HTTP, HTML and potentially RDF data.  A specification to use <code>&lt;meta&gt;</code> for this would miss this opportunity to build on the existing specification and registry.
         </p>
-        <section>
-          <h2>Open issues</h2>
-          <p>
-            @@TODO - 
-            The POWDER specification also adds: Documents MAY also include any of the attribution data from the POWDER document in meta tags. In particular, the issuedby field is likely to be useful to user agents deciding whether or not to fetch the full POWDER document. Any attribution data encoded in meta tags within an HTML document should be the same as that in the POWDER document. In case of discrepancy, the POWDER document should be taken as more authoritative.  Is there a parallel we should add here for provenance?
-          </p>
-        </section>
+        <p class="issue">
+          The POWDER specification also adds: Documents MAY also include any of the attribution data from the POWDER document in meta tags. In particular, the issuedby field is likely to be useful to user agents deciding whether or not to fetch the full POWDER document. Any attribution data encoded in meta tags within an HTML document should be the same as that in the POWDER document. In case of discrepancy, the POWDER document should be taken as more authoritative.  Is there a parallel we should add here for provenance?  I'm not seeing any compelling case for this.
+        </p>
       </section>
 
       <section>
@@ -222,17 +219,25 @@
 
     <section>
       <h2>Independent provenance discovery services</h2>
-      <p>
-        The mechanisms for provenance discovery described above have all assumed the provenance URI is being supplied by the provider of the original resource.  Where provenance information is provided independently without coordination with the original resource delivery channels (e.g. by a third party), alternative approaches must be considered.
+      <p class="pending">
+        Propose simple HTTP interface for discovery.  cf <a href="http://www.w3.org/2011/prov/track/issues/53">ISSUE 53</a>
+      </p>
+      <p class="pending">
+        <a href="http://www.w3.org/2011/prov/track/issues/53">ISSUE 53</a> suggests a simple REST protocol.
+        The proposal here is not strictly RESTful (as I understand it), because it depends to some extent on dereferencing constructed URIs rather than simply using URIs that are provided by the discovery service.  Do we care?  The alternatives I can think of seem to be more complex without so much obvious practical advantage.  The thrust of <a href="http://www.w3.org/2011/prov/track/issues/53">ISSUE 53</a> was to provide a <em>simple</em> service interface.
       </p>
       <p>
-        The mechanism described here focuses on finding the URI(s) for provenance information.  Below, <a href="#querying-provenance" class="sectionRef"></a> will consider access to provenance for which there is no separate URI.
+        <!-- The mechanisms for provenance discovery described above have all assumed the provenance URI is being supplied by the provider of the original resource. -->
+        Where provenance information is provided independently without coordination with the original resource delivery channels (e.g. by a third party), independent mechanisms for provenance discovery are needed.
+      </p>
+      <p>
+        The discovery mechanism described here focuses on finding the URI(s) for provenance information.  Below, <a href="#querying-provenance" class="sectionRef"></a> will consider access to provenance for which there is no separate URI.
       </p>
       <p>
         We assume that the requesting application has the URI of a resource for which provenance is required, and also has a URI for an independent provenance discovery service.
       </p>
       <p>
-        A service based on a simple HTTP GET operation is used to retrieve the provenance URI(s) for a resource.  In designing such a service, there are two main factors to consider:
+        A service based on a simple HTTP GET operation is used to retrieve the provenance URI(s) for a resource.  In designing such a service, the main factors to consider are:
         <ul>
           <li>The construction of the HTTP request URI</li>
           <li>The content and format(s) of the expected response</li>
@@ -250,19 +255,35 @@
             <dd>is the URI of the provenance discovery service.</dd>
             <dt><code><cite>Target-URI</cite></code></dt>
             <dd>is the URI of the resource for which provenance is required.</dd>
+            <dt><code><cite>Content-type</cite></code></dt>
+            <dd>is the desired MIME content-type of the result data.</dd>
           </dl>
         </p>
         <p>
-          Then the request URI for provenance discovery is constructed as:
+          Then the request URI for provenance discovery may be constructed as one of:
         </p>
-        <pre class="pattern">
-<code><strong><cite>Service-URI</cite></strong>?uri=<strong><cite>Target-URI</cite></strong></code></pre>
+        <code>
+          <pre class="pattern">
+<strong><cite>Service-URI</cite></strong>?uri=<strong><cite>Target-URI</cite></strong>
+<strong><cite>Service-URI</cite></strong>?uri=<strong><cite>Target-URI</cite></strong>&amp;type=<strong><cite>Content-type</cite></strong></pre>
+        </code>
         <p>
           For example, if the discovery service URI is <code>http://example.net/provenance-discovery</code> and the resource for which provenance is required is identified as <code>http://example.info/qdata/</code>, then the request URI to use for provenance discovery would be:
         </p>
         <pre class="example">
           <code>http://example.net/provenance-discovery?uri=http://example.info/qdata/</code>
         </pre>
+        <p>
+          Using the first form of request URI, the format of response data received may be determined by content negotiation using an HTTP <code>Accept:</code> header. 
+          The second form of request URI including the <code><strong><cite>Content-type</cite></strong></code> parameter requests results in the specified format without HTTP content negotiation.
+          Thus, a request URI for retrieving JSON data without HTTP content negotiation may look like this:
+        </p>
+        <pre class="example">
+<code>http://example.net/provenance-discovery?uri=http://example.info/qdata/&amp;type=application/json</code>
+        </pre>
+        <p class="issue">
+          SameAs.org also provides URIs for directly accessing the different result formats without content negotiation, by appending an extra segment to the SameAs.org service URI.  I'm reluctant to suggest this mechanism for a service with separately specified base URI, hence using the additional query parameter.
+        </p>
       </section>
       <section>
         <h2>Response content and formats</h2>
@@ -307,9 +328,10 @@
               </pre>
             </code>
           </dd>
-          <dd>Returns an RDF graph with one or more provenance URIs associated with the original resource,
+          <dt>text/turtle</dt>
+          <dd>
+            Returns an RDF graph with one or more provenance URIs associated with the original resource,
             presented as a Turtle or N3 document [[TURTLE]]:
-          <dd>
             <code>
               <pre class="example">
                 @prefix prov: &lt;http://www.w3.org/@@TBD@@#&gt; .
@@ -325,7 +347,7 @@
           </dd>
           <dt>text/plain</dt>
           <dd>
-          <dd>Returns a simple text file containing just a list of provenance URIs, one per line.  (The original resource URI is not included in the result data.;)
+            Returns a simple text file containing just a list of provenance URIs, one per line.  (The original resource URI is not included in the result data.)
             <code>
               <pre class="example">
                 http://source1.example.info/provenance/qdata/
@@ -335,9 +357,6 @@
             </code>
           </dd>
         </dl>
-        <p class="issue">
-          SameAs.org also provides URIs for directly accessing the different result formats without content negotiation, by appending an extra segment to the SameAs.org service URI.  I'm reluctant to suggest this mechanism for a service with separately specified base URI.  Maybe allow the use of an additional query parameter, e.g. <code>&amp;format=rdf</code>, etc.
-        </p>
       </section>
       <section>
         <h2>Response codes</h2>
@@ -476,6 +495,9 @@
     </section>
     <section class="appendix">
       <h2>Motivating scenario</h2>
+      <p class="pending">
+        I propose to remove this appendix on publication.
+      </p>
       <p><a href="http://www.w3.org/2011/prov/wiki/ProvenanceAccessScenario">This scenario</a> was selected by the provenance working group as a touchstone for evaluating any provenance access proposal.  This appendix evaluates the foregoing proposals against the requirements implied by that scenario.</p>
       <p>
         <ul>