Fold in Paul's new section on incremental retrieval
author Graham Klyne
Thu, 10 Nov 2011 11:24:44 +0000
changeset 867 66e416a2578b
parent 866 4edb6544d3b0
child 868 9a5f0b8dc712
paq/provenance-access.html
--- a/paq/provenance-access.html	Thu Nov 10 10:43:01 2011 +0000
+++ b/paq/provenance-access.html	Thu Nov 10 11:24:44 2011 +0000
@@ -619,7 +619,38 @@
       </section>
 
     </section>
- 
+
+<!-- ===================================================================================== -->
+
+    <section>
+      <h2>Incremental Provenance Retrieval</h2>
+      <p><a class="internalDFN">Provenance information</a> may be large. While this specification does not define how to implement scalable provenance systems, it does allow publishers to make provenance available incrementally. This section discusses two approaches to incremental provenance retrieval.
+      </p>
+
+      <section>
+        <h2>Via Web Retrieval</h2>
+        <p>Publishers are not required to publish all the provenance information associated with a given entity at a particular <a class="internalDFN">provenance-URI</a>. The amount of provenance information exposed is application dependent. However, it is possible to retrieve provenance incrementally (i.e. to walk the provenance graph) by progressively looking up provenance information using HTTP. The pattern is as follows:
+          <ol>
+            <li>For a given entity (<code>entity-uri-1</code>) retrieve its associated <code>provenance-uri-1</code> using the HTTP <code>Link</code> header (<a href="#resource-accessed-by-http" class="sectionRef"></a>)</li>
+            <li>Dereference <code>provenance-uri-1</code></li>
+            <li>Navigate the provenance information</li>
+            <li>When reaching a dead-end during navigation, that is, on encountering a reference to an entity (<code>entity-uri-2</code>) for which no provenance information is provided, repeat from Step 1 with <code>entity-uri-2</code> to find its provenance-URI. (Note: an HTTP HEAD operation may be used to obtain the <code>Link</code> headers without retrieving the entity content.)</li>
+          </ol>
+        </p>
+        <p>To reduce the overhead of multiple HTTP requests, a provenance information publisher may link entities to their associated provenance information using the <code>prov:hasProvenance</code> predicate. The same pattern then applies, except that instead of retrieving a new <code>Link</code> header field, one can immediately dereference the entity's associated provenance information.
+        </p>
+        <p>The same approach can be adopted when using the <a class="internalDFN">provenance service</a> API (<a href="#provenance-services" class="sectionRef"></a>). In that case, instead of performing an HTTP HEAD or GET against a resource, one queries the provenance service using the given <a class="internalDFN">entity-uri</a>.
+        </p>
+      </section>
+
+      <section>
+        <h2>Via Queries</h2>
+        <p>Provenance information may be made available using a SPARQL endpoint (<a href="#querying-provenance-information" class="sectionRef"></a>) [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. Using SPARQL queries, provenance can be selectively retrieved using combinations of filters and/or path queries.
+        </p>
+      </section>
+
+    </section>
+
 <!-- ===================================================================================== -->
 
     <!-- <section class="informative"> -->
@@ -709,7 +740,7 @@
         When using HTTP to access provenance information, or to determine a provenance URI, secure HTTP (https) SHOULD be used.
       </p>
       <p>
-        When retrieving a provenance URI from a document, steps SHOULD be taken to ensure the document itself is an accurate copy of the original whose author is being trusted (e.g. signature checking, or verifying its checksum aainst an author-provided secure web service). against
+        When retrieving a provenance URI from a document, steps SHOULD be taken to ensure the document itself is an accurate copy of the original whose author is being trusted (e.g. signature checking, or verifying its checksum against an author-provided secure web service).
       </p>
       <p>
         @@TODO ... privacy, access control to provenance (from Edinburgh meeting).  In particular, note that the fact that a resource is openly accessible does not mean that its provenance information should also be.
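
The four-step "Via Web Retrieval" pattern added in the first hunk can be sketched roughly as follows. This is an illustrative sketch, not part of the changeset or the specification. It assumes that provenance links are advertised with `rel="provenance"` (check the link relation the specification actually defines), and the names `provenance_uris`, `walk_provenance`, `get_link_header` and `get_provenance` are hypothetical. The two callables stand in for an HTTP HEAD request and for dereferencing and parsing a provenance document. As a simplification, the sketch looks up every entity referenced in the retrieved provenance rather than only those at a dead-end, and its Link header parsing does not handle commas or semicolons inside quoted parameter values.

```python
import re
from collections import deque
from typing import Callable, Iterable, List

# Assumption: the link relation used to advertise provenance.
PROVENANCE_REL = "provenance"

# Matches one "<target>; param=value; ..." entry of a Link header.
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def provenance_uris(link_header: str, rel: str = PROVENANCE_REL) -> List[str]:
    """Return the targets of Link entries whose rel parameter includes `rel`."""
    uris = []
    for target, params in _LINK_RE.findall(link_header):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel" and rel in value.strip('"').split():
                uris.append(target)
    return uris


def walk_provenance(
    start_entity: str,
    get_link_header: Callable[[str], str],
    get_provenance: Callable[[str], Iterable[str]],
    max_entities: int = 50,
) -> List[str]:
    """Walk the provenance graph, returning provenance URIs in visit order.

    get_link_header(entity_uri) returns the entity's Link header value
    (for example from an HTTP HEAD); get_provenance(provenance_uri)
    dereferences the provenance and returns the entity URIs it mentions.
    """
    seen_entities = set()
    seen_provenance = set()
    visited = []
    queue = deque([start_entity])
    while queue and len(seen_entities) < max_entities:
        entity = queue.popleft()
        if entity in seen_entities:
            continue
        seen_entities.add(entity)
        # Step 1: find the provenance-URI(s) for this entity.
        for prov_uri in provenance_uris(get_link_header(entity)):
            if prov_uri in seen_provenance:
                continue
            seen_provenance.add(prov_uri)
            visited.append(prov_uri)
            # Steps 2-4: dereference, navigate, and queue referenced entities.
            queue.extend(get_provenance(prov_uri))
    return visited
```

The `max_entities` bound is an arbitrary safeguard against unbounded walks; a real client would pick a limit suited to its application.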