--- a/paq/prov-aq.html Thu Apr 05 15:19:31 2012 +0200
+++ b/paq/prov-aq.html Thu Apr 05 15:20:12 2012 +0200
@@ -1,4 +1,5 @@
<!DOCTYPE html>
+
<html>
<head>
<title>PROV-AQ: Provenance Access and Query</title>
@@ -29,14 +30,20 @@
"FABIO" : "D. Shotton; S. Peroni. <a href=\"http://speroni.web.cs.unibo.it/cgi-bin/lode/req.py?req=http:/purl.org/spar/fabio#namespacedeclarations\"><cite>FaBiO, the FRBR-aligned Bibliographic Ontology.</cite></a> June 2011. URL: <a href=\"http://speroni.web.cs.unibo.it/cgi-bin/lode/req.py?req=http:/purl.org/spar/fabio#namespacedeclarations\">http://speroni.web.cs.unibo.it/cgi-bin/lode/req.py?req=http:/purl.org/spar/fabio#namespacedeclarations</a>",
"URI-template":
- "J. Gregorio; R. Fielding, ed.; M. Hadley; M. Nottingham. "+
- "<a href=\"https://datatracker.ietf.org/doc/draft-gregorio-uritemplate/\"><cite>URI Template</cite></a>. "+
- "September 2011, Work in progress. "+
- "URL: <a href=\"https://datatracker.ietf.org/doc/draft-gregorio-uritemplate/\"><cite>https://datatracker.ietf.org/doc/draft-gregorio-uritemplate/</cite></a>",
+ "J. Gregorio; R. Fielding; M. Hadley; M. Nottingham; D. Orchard. "+
+ "<a href=\"http://tools.ietf.org/html/rfc6570\"><cite>URI Template</cite></a>. "+
+ "March 2012, Internet RFC 6570. "+
+ "URL: <a href=\"http://tools.ietf.org/html/rfc6570\"><cite>http://tools.ietf.org/html/rfc6570</cite></a>",
+
+ "REST":
+ "R. Fielding. "+
+ "<a href=\"http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm\"><cite>Representational State Transfer (REST)</cite></a>. "+
+ "2000, Ph.D. dissertation. "+
+ "URL: <a href=\"http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm\">http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm</a>",
"REST-APIs":
"R. Fielding. "+
- "<a href=\"http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven\">REST APIs must be hypertext driven</a>. "+
+ "<a href=\"http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven\"><cite>REST APIs must be hypertext driven</cite></a>. "+
"October 2008 (blog post), "+
"URL: <a href=\"http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven\">http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven</a>",
@@ -58,6 +65,12 @@
"2011, Work in progress. "+
"URL: <a href=\"http://www.w3.org/TR/sparql11-service-description/\">http://www.w3.org/TR/sparql11-service-description/</a>",
+ "SPARQL-HTTP":
+ "Chimezie Ogbuji. "+
+ "<a href=\"http://www.w3.org/TR/sparql11-http-rdf-update/\"><cite>SPARQL 1.1 Graph Store HTTP Protocol</cite></a>. "+
+ "2011, Work in progress. "+
+ "URL: <a href=\"http://www.w3.org/TR/sparql11-http-rdf-update/\">http://www.w3.org/TR/sparql11-http-rdf-update/</a>",
+
};
var respecConfig = {
// specification status (e.g. WD, LCWD, NOTE, etc.). If in doubt use ED.
@@ -168,8 +181,37 @@
<section id="sotd">
This document is part of a set of specifications produced by the W3C provenance working group aiming to define interoperable interchange of provenance information in heterogeneous environments such as the Web. It describes the use of existing web mechanisms for discovery and retrieval of provenance information.
+
+ <h4>PROV Family of Specifications</h4>
+The PROV family of specifications aims to define the various aspects that are necessary to achieve the vision of interoperable
+interchange of provenance information in heterogeneous environments such as the Web.
+The specifications are as follows.
+<ul>
+<li> PROV-DM, the PROV data model for provenance;</li>
+<li> PROV-DM-CONSTRAINTS, a set of constraints applying to the PROV data model;</li>
+<li> PROV-N, a notation for provenance aimed at human consumption;</li>
+<li> PROV-O, the PROV ontology, an OWL-RL ontology allowing the mapping of PROV to RDF;</li>
+<li> PROV-AQ, the mechanisms for accessing and querying provenance (this document);</li>
+<li> PROV-PRIMER, a primer for the PROV data model;</li>
+<li> PROV-SEM, a formal semantics for the PROV data model;</li>
+<li> PROV-XML, an XML schema for the PROV data model.</li>
+</ul>
+<h4>How to read the PROV Family of Specifications</h4>
+<ul>
+<li>The primer is the entry point to PROV offering a pedagogical presentation of the provenance model.</li>
+<li>The Linked Data and Semantic Web community should focus on PROV-O defining PROV classes and properties specified in an OWL-RL ontology. For further details, PROV-DM and PROV-DM-CONSTRAINTS specify the constraints applicable to the data model, and its interpretation. PROV-SEM provides a mathematical semantics.</li>
+<li>The XML community should focus on PROV-XML defining an XML schema for PROV-DM. Further details can also be found in PROV-DM, PROV-DM-CONSTRAINTS, and PROV-SEM.</li>
+<li>Developers seeking to retrieve or publish provenance should focus on PROV-AQ.</li>
+<li>Readers seeking to implement other PROV serializations
+should focus on PROV-DM and PROV-DM-CONSTRAINTS. PROV-O, PROV-N, PROV-XML offer examples of mapping to RDF, text, and XML, respectively.</li>
+</ul>
+<h4>Second Public Working Draft</h4>
+This is the second public working draft. The changes focus on revising the provenance-service specification to provide better guidance to developers as well as introducing better naming conventions for the use of link headers in locating provenance.
+<p>
+
</section>
+
<!-- == Sect 1 =================================================================================== -->
<section>
@@ -198,42 +240,44 @@
<dd>refers to provenance represented in some fashion.</dd>
<dt><dfn>Provenance-URI</dfn></dt>
<dd>a URI denoting some <a class="internalDFN">provenance information</a>.</dd>
- <dt><dfn>Entity</dfn></dt>
- <dd>an aspect of a <a class="internalDFN">resource</a>, about which one wishes to present some <a class="internalDFN">provenance information</a>. For example, a weather report for a given date may be an aspect of a resource that is maintained as the current weather report. An entity is itself a <a class="internalDFN">resource</a>. See also [[PROV-DM]], and [[WEBARCH]] <a href="http://www.w3.org/TR/webarch/#representation-reuse">section 2.3.2</a>.</dd>
- <dt><dfn>Entity-URI</dfn></dt>
- <dd>a URI denoting an <a class="internalDFN">entity</a>, which allows that entity to be identified for the purpose of finding and expressing <a class="internalDFN">provenance information</a> (see <a href="#provenance-entities-resources" class="sectionRef"></a> for discussion)</dd>
+ <dt><dfn>Constrained resource</dfn></dt>
+ <dd>an aspect, version or instance of a <a class="internalDFN">resource</a>, about which one may wish to present some <a class="internalDFN">provenance information</a>. For example, a weather report for a given date may be an aspect of a resource that is maintained as the current weather report. A constrained resource is itself a <a class="internalDFN">resource</a>, and may have its own URI different from that of the original. See also [[PROV-DM]], and [[WEBARCH]] <a href="http://www.w3.org/TR/webarch/#representation-reuse">section 2.3.2</a>.</dd>
+ <dt><dfn>Target-URI</dfn></dt>
+ <dd>a URI denoting a <a class="internalDFN">resource</a> (including any <a class="internalDFN">constrained resource</a>), which identifies that resource for the purpose of finding and expressing <a class="internalDFN">provenance information</a> (see <a href="#provenance-entities-resources" class="sectionRef"></a> for discussion)</dd>
<dt><dfn>Provenance service</dfn></dt>
- <dd>a service that provides a <a class="internalDFN">provenance-URI</a> or <a class="internalDFN">provenance information</a> given a <a class="internalDFN">resource</a> URI or an <a class="internalDFN">entity-URI</a>.</dd>
+ <dd>a service that provides <a class="internalDFN">provenance information</a> given a <a class="internalDFN">target-uri</a>.</dd>
<dt><dfn>Service-URI</dfn></dt>
<dd>the URI of a <a class="internalDFN">provenance service</a>.</dd>
<dt><dfn>Resource</dfn></dt>
- <dd>also referred to as <dfn>web resource</dfn>: a resource as described by the Architecture of the World Wide Web [[WEBARCH]], <a href="http://www.w3.org/TR/webarch/#id-resources">section 2.2</a>. A resource may be associated with multiple <a title="Entity" class="internalDFN">entities</a> (see <a href="#provenance-entities-resources" class="sectionRef"></a> for discussion)</dd>
+ <dd>also referred to as <dfn>web resource</dfn>: a resource as described by the Architecture of the World Wide Web [[WEBARCH]], <a href="http://www.w3.org/TR/webarch/#id-resources">section 2.2</a>. A resource may be associated with multiple instances or views (<a class="internalDFN">constrained resource</a>s) with differing provenance.</dd>
</dl>
</p>
</section>
<section>
- <h2 id="provenance-entities-resources">Provenance, entities and resources</h2>
+ <h2 id="provenance-entities-resources">Provenance and resources</h2>
<p>
- Fundamentally, <a class="internalDFN">provenance information</a> is <em>about</em> <a class="internalDFN">resources</a>. In general, resources may vary over time and context. E.g., a resource describing the weather in London changes from day-to-day, or one listing restaurants near you will vary depending on your location. Provenance information, to be useful, must be persistent and not itself dependent on context. Yet we may still want to make provenance assertions about dynamic or context-dependent web resources (e.g. the weather forecast for London on a particular day may have been derived from a particular set of Meteorological Office data).
+ Fundamentally, <a class="internalDFN">provenance information</a> is <em>about</em> <a class="internalDFN">resource</a>s. In general, resources may vary over time and context. E.g., a resource describing the weather in London changes from day-to-day, or one listing restaurants near you will vary depending on your location. Provenance information, to be useful, must be persistent and not itself dependent on context. Yet we may still want to make provenance assertions about dynamic or context-dependent web resources (e.g. the weather forecast for London on a particular day may have been derived from a particular set of Meteorological Office data).
</p>
<p>
- Provenance descriptions of dynamic and context-dependent resources are possible through the notion of entities. An <a class="internalDFN">entity</a> is simply a resource (in the sense defined by [[WEBARCH]], <a href="http://www.w3.org/TR/webarch/#id-resources">section 2.2</a>) that is a contextualized view or instance of an original resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification through its lifetime. Separate URIs for each individual revision would then be <a class="internalDFN">entity-URIs</a>, denoting the specification at a particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor. Entity-URIs may use any URI scheme, and are not required to be dereferencable.
+ Provenance descriptions of dynamic and context-dependent resources are possible through a notion of constrained resources. A <a class="internalDFN">constrained resource</a> is simply a resource (in the sense defined by [[WEBARCH]], <a href="http://www.w3.org/TR/webarch/#id-resources">section 2.2</a>) that is a contextualized view or instance of some other resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification through its lifetime. Separate URIs for each individual revision would also be <a class="internalDFN">target-uri</a>s, each denoting the specification at a particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor. Target-URIs may use any URI scheme, and are not required to be dereferenceable.
</p>
<p>
- Requests for provenance about a resource may return provenance information that uses one or more entity-URIs to refer to versions of that resource. Some given provenance information may use multiple entity-URIs if there are assertions referring to the same underlying resource in different contexts. For example, provenance information describing a W3C document might include information about all revisions of the document using statements that use the different entity-URIs of the various revisions.
+ Requests for provenance about a resource may return provenance information that uses one or more target-URIs to refer to versions of that resource. Some given provenance information may use multiple target-URIs if there are assertions referring to the same underlying resource in different contexts. For example, provenance information describing a W3C document might include information about all revisions of the document using statements that use the different target-URIs of the various revisions.
</p>
<p>
- In summary, a key notion within the concepts outlined above is that <a class="internalDFN">provenance information</a> may be not universally applicable to a <a class="internalDFN">resource</a>, but may be expressed with respect to that resource in a restricted context (e.g. at a particular time). This restricted view is called an <a class="internalDFN">entity</a>, and an <a class="internalDFN">entity-URI</a> is used to refer to it within provenance information.
+ In summary, <a class="internalDFN">provenance information</a> may not be universally applicable to a <a class="internalDFN">resource</a>, but may be expressed with respect to that resource in a restricted context (e.g. at a particular time). This restriction is itself just another resource (e.g. the weather forecast for a given date as opposed to the current weather forecast), with its own URI for referring to it within provenance information.
</p>
</section>
<section>
<h2>Interpreting provenance information</h2>
- <p><a class="internalDFN">Provenance information</a> describes relationships between entities, activities and agents. As such, any given provenance information may contain information about several <a title="Entity" class="internalDFN">entities</a>. Within some provenance information, the entities thus described are identified by their <a class="internalDFN">Entity-URI</a>s.
+ <p>
+ <a class="internalDFN">Provenance information</a> describes relationships between <a class="internalDFN">resource</a>s, including activities and agents. Any given provenance information may contain information about several resources, referring to them using their <a class="internalDFN">target-uri</a>s.
</p>
- <p>When interpreting provenance information, it is important to be aware that statements about several entities may be present, and to be accordingly selective when using the information provided. (In some exceptional cases, it may be that the provenance information returned does not contain any information relating to a specific associated entity.)
+ <p>
+ Thus, when interpreting provenance information, it is important to be aware that statements about several resources may be present, and to be accordingly selective when using the information provided. (In some exceptional cases, it may be that the provenance information returned does not contain any information relating to a specific associated resource.)
</p>
</section>
@@ -243,24 +287,21 @@
<section>
<h2>Accessing provenance information</h2>
- <p>Web applications may access <a class="internalDFN">provenance information</a> in the same way as any web resource, by dereferencing its URI. Typically, this will be by performing an HTTP GET operation. Thus, any provenance information may be associated with a <a class="internalDFN">provenance-URI</a>, and may be accessed by dereferencing that URI using web mechanisms.
- </p>
<p>
- Provenance assertions are about occurring or completed activities and the entities they involve. Thus, provenance information returned at a given provenance-URI may commonly be static. But the availability of provenance information about a resource may vary (e.g. if there is insufficient storage to keep it indefinitely, or new information becomes available at a later date), so the provenance information returned at a given URI may change, provided that such change does not contradict any previously retrieved information.
- </p>
- <p>
- How much or how little provenance information is returned in response to a retrieval request is a matter for the provenance provider application. At a minimum, for as long as provenance information about an entity remains available, sufficient should be returned to enable a client application to walk the provenance graph per <a class="sectionRef" href="#incremental-provenance-retrieval"></a>.
+ Web applications may access <a class="internalDFN">provenance information</a> in the same way as any web resource, by dereferencing its URI. Thus, any provenance information may be associated with a <a class="internalDFN">provenance-URI</a>, and may be accessed by dereferencing that URI using web mechanisms.
</p>
<p>
- When publishing provenance as a web resource, the <a class="internalDFN">provenance-URI</a> should be discoverable using one or more of the mechanisms described in <a href="#locating-provenance-information" class="sectionRef"></a>.
+ How much or how little provenance information is returned in response to a retrieval request is a matter for the provenance provider application.
</p>
<p>
- If there is no URI for some particular provenance information, then alternative mechanisms may be needed. Possible mechanisms are suggested in <a href="#provenance-services" class="sectionRef"></a> and <a href="#querying-provenance-information" class="sectionRef"></a>.
+ It may be useful to provide provenance information through a service interface. A REST protocol for provenance retrieval is defined in Section <a href="#provenance-services" class="sectionRef"></a>.
</p>
- <p class="note">
- The references above to provenance assertions being non-contradictory and non-changing are made with respect to <em>correct</em> provenance information. As with any other form of information published on the web, provenance may be subject to errors and omissions, and applications should attempt to be robust in the face of such. The requirement for provenance information to be non-contradictory should not be taken as an injunction against the correction of any errors that may occur.
+ <p>
+ When publishing provenance information, the location of that information either at a URI or within a Service should be discoverable using one or more of the mechanisms described in <a href="#locating-provenance-information" class="sectionRef"></a>.
</p>
-
+ <p>
+ Some alternative practices for accessing provenance information are discussed in <a href="#best-practice" class="sectionRef"></a>.
+ </p>
</section>
<!-- == Sect 3 =================================================================================== -->
@@ -268,15 +309,16 @@
<section>
<h2>Locating provenance information</h2>
<p>
- When <a class="internalDFN">provenance information</a> is a resource that can be accessed using web retrieval, one needs to know a <a class="internalDFN">provenance-URI</a> to dereference. If this is known in advance, there is nothing more to specify. If a provenance-URI is not known then a mechanism to discover one must be based on information that is available to the would-be accessor.
+ When <a class="internalDFN">provenance information</a> is a resource that can be accessed using web retrieval, one needs to know its <a class="internalDFN">provenance-URI</a> to dereference. If this is known in advance, there is nothing more to specify. If a provenance-URI is not known then a mechanism to discover one must be based on information that is available to the would-be accessor. Likewise, provenance information may be exposed by a service. In this case, the <a class="internalDFN">service-URI</a> needs to be known.
</p>
- <p>Provenance information about a resource may be provided by several parties other than the provider of that resource, each using different provenance-URIs, and each with different concerns. (It is possible that these different parties may provide contradictory provenance information.)
+ <p>Provenance information about a resource may be provided by several parties other than the provider of that resource, each using different locations, and each with different concerns. (It is possible that these different parties may provide contradictory provenance information.)
</p>
<p>
- Once provenance information about a resource is retrieved, one also needs to know how to locate the view of that resource within that provenance information. This view is an <a class="internalDFN">entity</a> and is identified by an <a class="internalDFN">entity-URI</a>.
+ Once provenance information about a resource is retrieved, one may also need to know how to locate information about that resource within the provenance information. This may be a <a class="internalDFN">constrained resource</a> identified by a separate <a class="internalDFN">target-uri</a>.
</p>
<p>
- We start by considering mechanisms for the resource provider to indicate a <a class="internalDFN">provenance-URI</a> along with a <a class="internalDFN">entity-URI</a>. (Mechanisms that can be independent of the resource provision are discussed in <a href="#provenance-services" class="sectionRef"></a>). Three mechanisms are described here:
+ We start by considering mechanisms for the resource provider to indicate a <a class="internalDFN">provenance-URI</a> or <a class="internalDFN">Service-URI</a> along with a <a class="internalDFN">target-uri</a>.
+ Three mechanisms are described here:
<ul>
<li>The requester knows the resource URI <em>and</em> the resource is accessible using HTTP</li>
<li>The requester has a copy of a resource represented as HTML or XHTML</li>
@@ -291,15 +333,15 @@
<section>
<h2>Resource accessed by HTTP</h2>
<p>
- For a document accessible using HTTP, provenance information may be indicated using an HTTP <code>Link</code> header field, as defined by <a href="http://tools.ietf.org/html/rfc5988">Web Linking (RFC 5988)</a> [[LINK-REL]]. The <code>Link</code> header field is included in the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here).
+ For a resource accessible using HTTP, provenance information may be indicated using an HTTP <code>Link</code> header field, as defined by <a href="http://tools.ietf.org/html/rfc5988">Web Linking (RFC 5988)</a> [[LINK-REL]]. The <code>Link</code> header field is included in the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here).
</p>
<p>
- A <code>provenance</code> link relation type for referencing provenance information is registered according to the template in <a href="#iana-considerations" class="sectionRef"></a>, and may be used as shown::
- <pre class="pattern">Link: <cite>provenance-URI</cite>; rel="provenance"; anchor="<cite>entity-URI</cite>"</pre>
- When used in conjunction with an HTTP success response code (<code>2xx</code>), this HTTP header field indicates that <code><cite>provenance-URI</cite></code> is the URI of some provenance information associated with the requested resource and that the associated entity is identified within the referenced provenance information as <code><cite>entity-URI</cite></code>. (See also <a href="#interpreting-provenance-information" class="sectionRef"></a>.)
+ A <code>provenance</code> link relation type for referencing provenance information is registered according to the template in <a href="#iana-considerations" class="sectionRef"></a>, and may be used as shown:
+ <pre class="pattern">Link: <cite>provenance-URI</cite>; rel="provenance"; anchor="<cite>target-uri</cite>"</pre>
+ When used in conjunction with an HTTP success response code (<code>2xx</code>), this HTTP header field indicates that <code><cite>provenance-URI</cite></code> is the URI of some provenance information associated with the requested resource and that the associated resource is identified within the referenced provenance information as <code><cite>target-uri</cite></code>. (See also <a href="#interpreting-provenance-information" class="sectionRef"></a>.)
</p>
<p>
- If no <code>anchor</code> link is provided then the <code><cite>entity-URI</cite></code> is assumed to be the URI of the resource.
+ If no <code>anchor</code> parameter is provided then the <code><cite>target-uri</cite></code> is assumed to be the URI of the resource used in the corresponding HTTP request.
</p>
<p>
At this time, the meaning of these links returned with other HTTP response codes is not defined: future revisions of this specification may define interpretations for these.
@@ -311,21 +353,18 @@
The presence of a <code>provenance</code> link in an HTTP response does not preclude the possibility that other publishers may offer provenance information about the same resource. In such cases, discovery of the additional provenance information must use other means (e.g. see <a href="#provenance-services" class="sectionRef"></a>).
</p>
<p>
- Provenance resources indicated in this way are not guaranteed to be authoritative. Trust in the linked provenance data must be determined separately from trust in the original resource, just as in the web at large, it is a users' responsibility to determine an appropriate level of trust in any other linked resource; e.g. based on the domain that serves it, or an associated digital signature. (Ssee also <a href="#security-considerations" class="sectionRef"></a>.)
+ Provenance resources indicated in this way are not guaranteed to be authoritative. Trust in the linked provenance information must be determined separately from trust in the original resource. Just as in the web at large, it is a user's responsibility to determine an appropriate level of trust in any other linked resource; e.g. based on the domain that serves it, or an associated digital signature. (See also <a href="#security-considerations" class="sectionRef"></a>.)
</p>
<section>
<h2>Specifying Provenance Services</h2>
- <p class="pending">
- This is a new proposal. It needs to be checked as to whether it is useful. GK/PG to review nature of provenance-service-URI.
- </p>
<p>
- The document provider may indicate that provenance information about the document is provided by a <a class="internalDFN">provenance service</a>. This is done through the use of a <code>provenance-service</code> link relation type following the same pattern as above:
+ The resource provider may indicate that provenance information about the resource is provided by a <a class="internalDFN">provenance service</a>. This is done through the use of a <code>provenance-service</code> link relation type following the same pattern as above:
</p>
<pre class="pattern">
-Link: <cite>provenance-service-URI</cite>; anchor="<cite>entity-URI</cite>"; rel="provenance-service"</pre>
+Link: <cite>provenance-service-URI</cite>; anchor="<cite>target-uri</cite>"; rel="provenance-service"</pre>
<p>
- The <code>provenance-service</code> link identifies the <a class="internalDFN">service-URI</a>. Dereferencing this URI yields a service description that provides further information to enable a client to determine a <a class="internalDFN">provenance-URI</a> or retrieve <a class="internalDFN">provenance information</a> for an <a class="internalDFN">entity</a>; see <a href="#provenance-services" class="sectionRef"></a> for more details.
+ The <code>provenance-service</code> link identifies the <a class="internalDFN">service-URI</a>. Dereferencing this URI yields a service description that provides further information to enable a client to determine a <a class="internalDFN">provenance-URI</a> or retrieve <a class="internalDFN">provenance information</a> for a <a class="internalDFN">resource</a>; see <a href="#provenance-services" class="sectionRef"></a> for more details.
</p>
<p>
There may be multiple <code>provenance-service</code> link header fields, and these may appear in an HTTP response together with <code>provenance</code> link header fields (though, in simple cases, we anticipate that <code>provenance</code> and <code>provenance-service</code> link relations will not be used together).
@@ -343,7 +382,7 @@
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="provenance" href="<cite>provenance-URI</cite>">
- <link rel="anchor" href="<cite>entity-URI</cite>">
+ <link rel="anchor" href="<cite>target-uri</cite>">
<title>Welcome to example.com</title>
</head>
<body>
@@ -355,16 +394,18 @@
The <code><cite>provenance-URI</cite></code> given by the <code>provenance</code> link element identifies the provenance-URI for the document.
</p>
<p>
- The <code><cite>entity-URI</cite></code> given by the <code>anchor</code> link element specifies an specifies an identifier for the entity that may be used within the provenance information when referring to the document.
+ The <code><cite>target-uri</cite></code> given by the <code>anchor</code> link element specifies an identifier for the document that may be used within the provenance information when referring to the document.
</p>
<p>
- An HTML document header MAY include multiple "provenance" link elements, indicating a number of different provenance resources that are known to the creator of the document, each of which may provide provenance information about the document.
+ An HTML document header MAY include multiple <code>provenance</code> link elements, indicating a number of different provenance sources that are known to the creator of the document, each of which may provide provenance information about the document.
</p>
+<!--
<p>
- Likewise, the header MAY include multiple "anchor" link elements indicating that, e.g., different revisions of the document can be identified in the provenance information using the different <code><cite>entity-URIs</cite></code>.
+ Likewise, the header MAY include multiple "anchor" link elements indicating that, e.g., different revisions of the document can be identified in the provenance information using the different <code><cite>resource-URIs</cite></code>.
</p>
+-->
<p>
- If no "anchor" link element is provided then the <code><cite>entity-URI</cite></code> is assumed to be the URI of the document. It is RECOMMENDED that this convention be used only when the document is static.
+ If no "anchor" link element is provided then the <code><cite>target-uri</cite></code> is assumed to be the URI of the document. It is RECOMMENDED that this convention be used only when the document is static and has an easily-determined URI.
</p>
<section>
@@ -376,7 +417,7 @@
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="provenance-service" href="<cite>service-URI</cite>">
- <link rel="anchor" href="<cite>entity-URI</cite>">
+ <link rel="anchor" href="<cite>target-uri</cite>">
<title>Welcome to example.com</title>
</head>
<body>
@@ -384,10 +425,10 @@
</body>
</html></pre>
<p>
- The <code>provenance-service</code> link element identifies the <a class="internalDFN">service-URI</a>. Dereferencing this URI yields a service description that provides further information to enable a client to access <a class="internalDFN">provenance information</a> for an <a class="internalDFN">entity</a>; see <a href="#provenance-services" class="sectionRef"></a> for more details.
+ The <code>provenance-service</code> link element identifies the <a class="internalDFN">service-URI</a>. Dereferencing this URI yields a service description that provides further information to enable a client to access <a class="internalDFN">provenance information</a> for a <a class="internalDFN">resource</a>; see <a href="#provenance-services" class="sectionRef"></a> for more details.
</p>
<p>
- There may be multiple <code>provenance-service</code> link elements, and these MAY appear in the same document as <code>anchor</code> and <code>provenance</code> link elements (though, in simple cases, we anticipate that <code>provenance</code> and <code>provenance-service</code> link relations would not be used together).
+ There MAY be multiple <code>provenance-service</code> link elements, and these MAY appear in the same document as <code>provenance</code> link elements (though, in simple cases, we anticipate that <code>provenance</code> and <code>provenance-service</code> link relations would not be used together).
</p>
</section>
</section>
@@ -399,31 +440,35 @@
(These terms may be used to indicate provenance of other resources too, but discussion of such usage is beyond the scope of this section.)
</p>
<p>
- The RDF property <code>prov:hasProvenance</code> is defined as a relation between two resources, where the object of the property is a resource that provides provenance information about the subject resource. Multiple <code>prov:hasProvenance</code> assertions may be made about a subject resource. This property corresponds to a <a href="#registration-template-for-link-relation---provenance">"provenance" link relation</a> used with an HTTP <code>Link</code> header field, or HTML <code><link></code> element.
+ The RDF property <code>prov:hasProvenance</code> is defined as a relation between two resources, where the object of the property is a resource that provides provenance information about the subject resource. Multiple <code>prov:hasProvenance</code> assertions may be made about a subject resource. This property corresponds to a <a href="#registration-template-for-link-relation---provenance">provenance link relation</a> used with an HTTP <code>Link</code> header field, or HTML <code><link></code> element.
</p>
<p>
- Property <code>prov:hasAnchor</code> specifies an <a class="internalDFN">entity-URI</a> used in the provenance information to refer to the RDF document.
- This corresponds to use of the "anchor" parameter in an HTTP "provenance" <code>Link</code> header field, or an <a href="#registration-template-for-link-relation---anchor">"anchor" link relation</a> in an HTML <code><link></code> element, which similarly indicate a URI used by the provenance information to refer to the described document.
+ Property <code>prov:hasAnchor</code> specifies a <a class="internalDFN">target-uri</a> used in the provenance information to refer to the containing RDF document.
+ This corresponds to use of the <code>anchor</code> parameter in an HTTP provenance <code>Link</code> header field, or an <a href="#registration-template-for-link-relation---anchor">anchor link relation</a> in an HTML <code><link></code> element, which similarly indicate a URI used by the provenance information to refer to the described document.
</p>
<p>
- Property <code>prov:hasProvenanceService</code> specifies a <a class="internalDFN">service-URI</a>s associated with the RDF document for possible access to provenance information. This property corresponds to a <a href="#registration-template-for-link-relation---provenance-service">"provenance-service" link relation</a> used with an HTTP <code>Link</code> header field, or HTML <code><link></code> element.
+ Property <code>prov:hasProvenanceService</code> specifies a <a class="internalDFN">service-URI</a> associated with the RDF document for possible access to provenance information. This property corresponds to a <a href="#registration-template-for-link-relation---provenance-service">provenance-service link relation</a> used with an HTTP <code>Link</code> header field, or HTML <code><link></code> element.
</p>
- <p class="TODO">
- @@TODO: document namespace. Check naming style. Use provenance model namespace? Define as part of model?<br/>
- @@TODO: example, when vocabulary issues are settled.
- </p>
+ <pre class="example code">
+ @prefix prov: <<provns/>> .
+ @prefix dcterms: <http://purl.org/dc/terms/> .
+
+ <> dcterms:title "Welcome to example.com" ;
+ prov:hasAnchor <http://example.com/data/resource.rdf> ;
+ prov:hasProvenance <http://example.com/provenance/resource.rdf> ;
+ prov:hasProvenanceService <http://example.com/provenance-service/> .
+ :
+ (RDF data)
+ :
+ </pre>
</section>
<section>
<h2>Arbitrary data</h2>
- <p class="pending">
- We have so far decided not to try and define a common mechanism for arbitrary data, because it's not clear to us what the correct choice would be. Is this a reasonable position, or is there a real need for a generic solution for provenance discovery for arbitrary, non-web-accessible data objects?
- </p>
<p>
If a resource is represented using a data format other than HTML or RDF, and no URI for the resource is known, provenance discovery becomes trickier to achieve. This specification does not define a specific mechanism for such arbitrary resources, but this section discusses some of the options that might be considered.
</p>
<p>
- For formats which have provision for including metadata within the file (e.g. JPEG images, PDF documents, etc.), use the format-specific metadata to include a <a class="internalDFN">entity-URI</a>, <a class="internalDFN">provenance-URI</a> and/or <a class="internalDFN">service-URI</a>. Format-specific metadata provision might also be used to include <a class="internalDFN">provenance information</a> directly in the resource.
+ For formats which have provision for including metadata within the file (e.g. JPEG images, PDF documents, etc.), use the format-specific metadata to include a <a class="internalDFN">target-uri</a>, <a class="internalDFN">provenance-URI</a> and/or <a class="internalDFN">service-URI</a>. Format-specific metadata provision might also be used to include <a class="internalDFN">provenance information</a> directly in the resource.
</p>
<p>
Use a generic packaging format that can combine an arbitrary data file with a separate metadata file in a known format, such as RDF. At this time, it is not clear what format that should be, but some possible candidates are:
@@ -431,19 +476,16 @@
<li>MIME multipart/related [[RFC2387]]: both email and HTTP are based on MIME or MIME-derivatives, so this has the advantage of working well with the network transfer mechanisms discussed in the motivating scenarios considered.
</li>
<li>
- Composite object-packaging work from the digital library community, of which there are several (ORE, MPEG-21, BagIt @@refs) to name a handful. Practical implementations of these seem to commonly be based on the ZIP file format.
- </li>
- <li>
- Packaging formats along the lines of those used for shipping Java web applications or (basically, a ZIP file with a manifest and some imposed structure)
+ Composite object-packaging formats from the digital library community, such as <a href="http://www.openarchives.org/ore/">ORE</a>, <a href="http://xml.coverpages.org/mpeg21-didl.html">MPEG-21 DIDL</a> and <a href="https://wiki.ucop.edu/display/Curation/BagIt">BagIt</a>. Practical implementations of these are commonly based on the ZIP file format.
</li>
<li>
- Ongoing work in the research community (e.g. <a href="http://eprints.ecs.soton.ac.uk/21587/">Why Linked Data is Not Enough for Scientists</a>, ePub, etc.) to encapsulate data, code, annotations and metadata into a common exchangeable format.
+ Packaging formats along the lines of those used for shipping Java web applications (basically, a ZIP file with a manifest and some imposed structure)
+ </li>
+ <li>
+ Ongoing work in the research community (e.g. <a href="http://eprints.ecs.soton.ac.uk/21587/">Why Linked Data is Not Enough for Scientists</a>, <a href="http://idpf.org/epub/30/spec/epub30-overview.html">ePub</a>, etc.) to encapsulate data, code, annotations and metadata into a common exchangeable format.
</li>
</ul>
</p>
- <p class="TODO">
- Fix references in above text.
- </p>
</section>
</section>
@@ -453,120 +495,31 @@
<section>
<h2>Provenance services</h2>
<p>
- This section describes a REST API [[REST-APIs]] for a provenance service with facilities for discovery and/or retrieval of provenance information, which can be implemented independently of the original resource delivery channels (e.g. by a third party service).
- </p>
- <p>
- All service implementations must respond with a service description (<a href="#service-description" class="sectionRef"></a>) when the service URI is dereferenced.
- Service implementations may provide either discovery, retrieval or both of these services, indicated by presence of the corresponding service URI templates in the service description. Which of these services to provide is a choice for individual service implementations.
+ This section describes a REST-based protocol [[REST]] for provenance services with facilities for the retrieval of provenance information. The protocol specifies HTTP operations for retrieval of provenance information from a provenance service. It follows the approach of the SPARQL Graph Store HTTP Protocol [[SPARQL-HTTP]].
</p>
- <p>
- On the Web, the normal mechanism for retrieving information is to associate it with a URI, and dereference the URI using HTTP. This approach is enabled using the provenance discovery service mechanism: given the URI of some resource for which provenance information is required, the service returns one or more URIs from which provenance information may be obtained.
- This approach may be preferred when the URIs used for identifying provenance information are controlled by someone other than the provider of the provenance discovery service, or when there is more than one known source of provenance information.
- </p>
- <p>
- The provenance retrieval service returns provenance information directly. This mechanism may be preferred when the provenance information is not already presented directly to the web, or is stored in a database with a complex query protocol, or when the provenance service can control the URI from which provenance information is served and avoid the intermediate step of URI discovery.
+ <p>The introduction of this protocol is motivated by the following possible considerations:
+ <ul>
+ <li>the naming authority associated with the <a class="internalDFN">target-uri</a> is not the same as the service offering provenance information</li>
+ <li>multiple services have <a class="internalDFN">provenance information</a> about the same resource</li>
+ <li>the service associated with the target-uri is not accessible for adding additional information when handling retrieval requests</li>
+ <li>there is no dereferenceable <a class="internalDFN">provenance-uri</a> containing provenance information for a particular resource</li>
+ <li>provenance services may provide additional extensibility points for control over returned provenance information.</li>
+ </ul>
</p>
- <section>
- <h2>Using the provenance service API</h2>
- <p>
- This section describes general procedures for using the provenance service API. Later sections describe the resources presented by the API, and their representation using JSON. <a href="#provenance-service-format-examples" class="sectionRef"></a>gives examples of alternative representations. HTTP content negotiation mechanisms may be used to retrieve representations using formats convenient for the client application.
+ <section>
+ <h2>HTTP GET</h2>
+ <p>This protocol combines the <a class="internalDFN">target-uri</a> with the <a class="internalDFN">service-URI</a> to formulate an HTTP GET request, according to the following convention:
+ <pre class="pattern">
+ GET /provenance/service?<b>target</b>=http://www.example.com/entity HTTP/1.1
+ Host: example.com</pre>
</p>
-
- <section>
- <h2>Retrieve Provenance-URIs for a resource</h2>
- <p>
- To use the provenance service to retrieve a list of provenance-URIs for a resource, starting with the service URI (<code>service-URI</code>) and the URI of the resource or entity (<code>entity-URI</code>):
- <ol>
- <li>Dereference <code>service-URI</code> to obtain a representation of the <a class="internalDFN">service description</a>.</li>
- <li>Extract the provenance locations template from the service description.</li>
- <li>Form a <code>provenance-locations-URI</code> using the provenance locations template with <code>entity-URI</code> as a substitute for template variable <code>uri</code>.</li>
- <li>Dereference <code>provenance-locations-URI</code> to obtain a <a class="internalDFN">provenance locations resource</a> in one of the formats described below.</li>
- </ol>
- </p>
- <p>
- Any or all of the URIs in the returned provenance locations may be used to retrieve provenance information, per <a href="#accessing-provenance-information" class="sectionRef"></a>.
- </p>
- </section>
-
- <section>
- <h2>Retrieve Provenance information for a resource</h2>
- <p>
- To use the provenance service to directly retrieve provenance information for a resource, starting with the service URI (<code>service-URI</code>) and the URI of the resource or entity (<code>entity-URI</code>):
- <ol>
- <li>Dereference <code>service-URI</code> to obtain a representation of the <a class="internalDFN">service description</a>.</li>
- <li>Extract the provenance information template from the service description.</li>
- <li>Form a <code>provenance-URI</code> by using the provenance information template with <code>entity-URI</code> as a substitute
- for template variable <code>uri</code>.</li>
- <li>Dereference <code>provenance-URI</code> to obtain <a class="internalDFN">provenance information</a>.</li>
- </ol>
- </p>
- </section>
-
- </section>
-
- <section>
- <h2>Resources presented and representations used</h2>
-
- <section>
- <h2>Service description</h2>
- <p>
- A provenance <dfn>service description</dfn> describes the provenance discovery and retrieval service and, in particular, provides URI templates [[URI-template]] for URIs to access <a title="provenance locations resource" class="internalDFN">provenance locations resources</a> and/or <a class="internalDFN">provenance information</a>. Dereferencing a service URI returns a representation of such a service description. The service description MAY contain additional metadata about the service beyond that described here: API clients are expected to ignore any metadata elements they do not understand.
- </p>
- <p>
- This example shows a provenance service description using JSON format [[RFC4627]], which is presented as MIME content-type <code>application/json</code>.
- Other examples may be seen in <a href="#provenance-service-format-examples" class="sectionRef"></a>.
- </p>
- <pre class="example code">
- {
- "provenance_service_uri": "http://example.org/provenance_service/",
- "provenance_locations_template": "http://example.org/provenance_service/locations/?uri={uri}",
- "provenance_content_template": "http://example.org/provenance_service/provenance/?uri={uri}"
- }
- </pre>
- <p class="issue">
- Is there any point in including the provenance service URI here? It has been included for consistency with RDF representations, but is functionally redundant.
- </p>
- </section>
-
- <section>
- <h2>Provenance locations</h2>
- <p>
- A <dfn>provenance locations resource</dfn> enumerates one or more <a class="internalDFN">provenance-URI</a>s identifying <a class="internalDFN">provenance information</a> associated with a given resource.
- </p>
- <p>
- The examples below and in <a href="#provenance-service-format-examples" class="sectionRef"></a> are for a given resource URI <code>http://example.org/qdata/</code>, and the service description example above. Hence, the URI of the corresponding provenance locations resource would be <code>http://example.org/provenance_service/location/?uri=http%3A%2F%2Fexample.org%2Fqdata%2F</code>.
- </p>
- <p>
- This example uses JSON format [[RFC4627]], presented as MIME content type <code>application/json</code>.
- Other examples may be seen in <a href="#provenance-service-format-examples" class="sectionRef"></a>.
- </p>
- <pre class="example code">
- {
- "uri": "http://example.org/qdata/",
- "provenance": [
- "http://source1.example.org/provenance/qdata/",
- "http://source2.example.org/prov/qdata/",
- "http://source3.example.com/prov?id=qdata"
- ]
- }
- </pre>
- <p class="note">
- The template might use <code>?uri={+uri}</code> rather than just <code>?uri={uri}</code>, and thereby avoid %-escaping the <code>:</code> and <code>/</code> characters in the given URI, but this could cause difficulties for URIs containing query parameters and/or fragment identifiers. In this case, the client application would need to ensure that any such characters were %-escaped <em>before</em> being passed into a URI-template expansion processor.
- </p>
- </section>
-
- <section>
- <h2>Provenance information</h2>
- <p>
- Provenance information about a resource or resources may be returned in any format. It is recommended that the format be one defined by the Provenance Model specification [[PROV-DM]].
- </p>
- <p>
- Assuming a given resource URI <code>http://example.org/qdata/</code>, and
- using the service description example above, the provenance URI would be <code>http://example.org/provenance_service/provenance/?uri=http%3A%2F%2Fexample.org%2Fqdata%2F</code>.
- </p>
- </section>
-
+ <p>The embedded target-URI (<code>http://www.example.com/entity</code>) identifies the resource for which provenance information is to be returned. Any server that implements this protocol and receives a request URI in this form SHOULD return provenance information for the target-URI embedded in the query component, where that URI is the result of percent-decoding the value associated with the <code>target</code> key. The embedded URI MUST be an absolute URI, and the server MUST respond with a 400 Bad Request if it is not. If the supplied target-URI includes a fragment identifier, the '#' MUST be %-encoded as <code>%23</code> when constructing the value of the <code>target</code> parameter; similarly, any '&' character in the target-URI MUST be %-encoded as <code>%26</code>.
+ </p>
+ <p>If the provenance information identified in the request does not exist in the server, a 404 Not Found response code SHOULD be returned.
+ </p>
+ <p>The format of returned provenance information is not defined here, but may be established through content type negotiation using <code>Accept:</code> header fields in the HTTP request. A provenance service SHOULD be capable of returning RDF using the vocabulary defined by [[PROV-O]], in any standard RDF serialization (e.g. RDF/XML), or any other standard serialization of the Provenance Model specification [[PROV-DM]]. Services MUST identify the <code>Content-Type</code> of the information returned.
+ </p>
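<p>
  The request convention above can be sketched as follows. This is a non-normative illustration under stated assumptions: the function name <code>build_provenance_request_uri</code> is hypothetical, the service-URI is assumed to have no query component of its own, and the sketch percent-encodes <em>every</em> reserved character in the target-URI, which is stricter than the minimum required above ('#' and '&') but still decodes to the same value on the server.
</p>

```python
from urllib.parse import quote, urlsplit


def build_provenance_request_uri(service_uri, target_uri):
    """Combine a service-URI and a target-URI into a GET request URI.

    Assumption: service_uri has no query component of its own.
    """
    # The embedded URI must be absolute; a server would answer 400 otherwise.
    if not urlsplit(target_uri).scheme:
        raise ValueError("target URI must be an absolute URI: %r" % target_uri)
    # safe="" encodes all reserved characters, including '#' (%23) and '&' (%26).
    return service_uri + "?target=" + quote(target_uri, safe="")


if __name__ == "__main__":
    print(build_provenance_request_uri(
        "http://example.com/provenance/service",
        "http://www.example.com/entity#section&part",
    ))
```

<p>
  A client would then issue an HTTP GET on the resulting URI, using an <code>Accept:</code> header field to indicate the provenance formats it can process.
</p>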
</section>
<!-- <section class="informative"> -->
@@ -576,84 +529,93 @@
This specification does not define any specific mechanism for discovering provenance services. Applications may use any appropriate mechanism, including but not limited to: prior configuration, search engines, service registries, etc.
</p>
<p>
- To facilitate service discovery, we recommend that RDF publication of service descriptions uses the provenance service type <code>http://www.w3.org/ns/prov-o/ProvenanceService</code>, defined by the <a href="http://dvcs.w3.org/hg/prov/file/0abd4c442b42/ontology/components/ProvenanceService.ttl">provenance ontology</a> [[PROV-O]]. The RDF examples in <a href="#provenance-service-format-examples" class="sectionRef"></a> show this use.
+ To facilitate service discovery, we recommend that RDF publication of service descriptions uses the provenance service type <code><provns/>ProvenanceService</code>, defined by the provenance ontology [[PROV-O]].
+ The RDF service description example below in <a href="#provenance-service-description" class="sectionRef"></a> shows this use.
</p>
- <p class="TODO">
- Fix up URI used in the "provenance ontology" link above, when finalized.
+ </section>
+
+ <!-- <section class="informative"> -->
+ <section>
+ <h2>Provenance service description</h2>
+ <!--<p>The provenance service interface as described above violates REST constraints by requiring the client to know about the structure of URIs offered by the service (see [[REST-APIs]], 4th bullet point). The provenance service description mitigates this coupling by providing a mechanism for discovering the URI format to be used, starting with just the service URI.
+ %</p> -->
+ <p>Dereferencing a provenance service URI should yield a provenance service description. This is to be compatible with the constraints of [[REST-APIs]]. The provenance service description should be available as RDF (in any of its common serializations, and determined through HTTP content negotiation), and it should contain RDF statements of the form:
+ </p>
+ <pre class="pattern">
+ <<cite>service-URI</cite>> a prov:ProvenanceService ;
+ prov:provenanceUriTemplate "<cite>service-URI</cite>?target={+uri}" .</pre>
+ <p>where <cite><code>service-URI</code></cite> is the URI of the provenance service. Note that the object of the <code>prov:provenanceUriTemplate</code> statement is a literal text value, not a URI.
+ </p>
+ <p>A client may retrieve this service description and extract the associated value for <code>prov:provenanceUriTemplate</code>. This value is a string containing a URI template [[URI-template]] (level 2). A URI for the desired provenance information is obtained by expanding the URI template with the variable <code>uri</code> set to the target-URI for which provenance is required. If the target-URI contains '#' or '&', these must be %-escaped as <code>%23</code> or <code>%26</code> respectively before template expansion.
</p>
</section>
</section>
<!-- == Sect 5 =================================================================================== -->
-
+
<section>
- <h2>Querying provenance information</h2>
- <p>
- Simply identifying and retrieving provenance information as a web resource may not always meet the requirements of a particular application or service, e.g.:
- <ul>
- <li>the entity for which provenance information is required is not identified by a known URI</li>
- <li>the provenance information for an entity is not directly identified by a known URI</li>
- <li>a requirement to access provenance information for a number of distinct but related entities in a single atomic operation</li>
- <li><i>etc.</i></li>
- </ul>
- </p>
- <p>
- A provenance query service provides an alternative way to access provenance information and/or Provenance-URIs. An application will need a provenance query service URI, and some relevant information about the entity whose provenance is to be accessed.
- </p>
- <p>
- The details of a provenance query service is an implementation choice, but for interoperability between different providers and users we recommend use of SPARQL [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. The query service URI would then be the URI of a SPARQL endpoint (or, to use the SPARQL specification language, a <a href="http://www.w3.org/TR/rdf-sparql-protocol/#conformant-sparql-protocol-service">SPARQL protocol service</a>). The following subsections provide examples for what are considered to be some plausible common scenarios for using SPARQL, and are not intended to cover all possibilities.
- </p>
- <p>
- A SPARQL protocol service description may be published using the <cite>SPARQL 1.1 Service Description</cite> vocabulary [[SPARQL-SD]].
- </p>
+ <h2>Best practice</h2>
- <section>
- <h2>Find provenance-URI given entity-URI of resource</h2>
+ <section id="querying-provenance-information">
+ <h2>Using SPARQL for provenance queries</h2>
<p>
- If the requester has an <a class="internalDFN">entity-URI</a>, a simple SPARQL query may be used to return the corresponding <a class="internalDFN">provenance-URI</a>. E.g., if the original resource has a entity-URI <code>http://example.org/resource</code>,
- <code>
+ Simply identifying and retrieving provenance information as a web resource may not always meet the requirements of a particular application or service, e.g.:
+ <ul>
+ <li>the resource for which provenance information is required is not identified by a known URI</li>
+ <li>the provenance information for a resource is not directly identified by a known URI</li>
+ <li>a requirement to access provenance information for a number of distinct but related resources in a single atomic operation</li>
+ <li><i>etc.</i></li>
+ </ul>
+ </p>
+ <p>
+ A provenance query service provides an alternative way to access provenance information and/or provenance-URIs. An application will need a provenance query service URI, and some relevant information about the resource whose provenance is to be accessed.
+ </p>
+ <p>
+ The details of a provenance query service are an implementation choice, but for interoperability between different providers and users we recommend use of SPARQL [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. The query service URI would then be the URI of a SPARQL endpoint (or, to use the SPARQL specification language, a <a href="http://www.w3.org/TR/rdf-sparql-protocol/#conformant-sparql-protocol-service">SPARQL protocol service</a>). The following subsections provide examples for what are considered to be some plausible common scenarios for using SPARQL, and are not intended to cover all possibilities.
+ </p>
+ <p>
+ A SPARQL protocol service description may be published using the <cite>SPARQL 1.1 Service Description</cite> vocabulary [[SPARQL-SD]].
+ </p>
+
+ <section>
+ <h2>Find a provenance-URI given a target-uri</h2>
+ <p>
+ If the requester has a <a class="internalDFN">target-uri</a>, a simple SPARQL query may be used to return the corresponding <a class="internalDFN">provenance-URI</a>. E.g., if the original resource has a target-uri <code>http://example.org/resource</code>:
<pre class="example code">
- @prefix prov: <@@TBD>
+ PREFIX prov: <<provns/>>
SELECT ?provenance_uri WHERE
{
<http://example.org/resource> prov:hasProvenance ?provenance_uri
}
</pre>
- </code>
- </p>
- <p class="TODO">
- @@TODO: specific provenance namespace and property to be determined by the model or ontology specification?
- </p>
- </section>
+ </p>
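<p>
  As a non-normative sketch, a client could submit such a query to a SPARQL endpoint using the <code>query</code> parameter of an HTTP GET request, as defined by the SPARQL protocol. The function name <code>build_sparql_get_url</code> is hypothetical, the endpoint URI is assumed to have no query component of its own, and the provenance namespace is passed in as a parameter because this document leaves it as a placeholder.
</p>

```python
from urllib.parse import urlencode

# Characters that may not appear inside a SPARQL <IRI> reference.
_FORBIDDEN_IN_IRI = set('<>"{}|^`\\ ')


def build_sparql_get_url(endpoint_uri, target_uri, prov_ns):
    """Build a GET URL asking a SPARQL endpoint for provenance-URIs of target_uri.

    Assumptions: endpoint_uri has no query component, and prov_ns is the
    (not yet finalized) provenance namespace URI.
    """
    if any(ch in _FORBIDDEN_IN_IRI or ord(ch) < 0x20 for ch in target_uri):
        raise ValueError("target URI cannot be written as a SPARQL IRI")
    query = (
        "PREFIX prov: <%s>\n"
        "SELECT ?provenance_uri WHERE\n"
        "{\n"
        "  <%s> prov:hasProvenance ?provenance_uri\n"
        "}" % (prov_ns, target_uri)
    )
    # The SPARQL protocol carries the query text in the "query" parameter.
    return endpoint_uri + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_sparql_get_url(
        "http://example.org/sparql",
        "http://example.org/resource",
        "http://example.org/ns/prov#",  # placeholder namespace, not normative
    ))
```

<p>
  Rejecting target-URIs that cannot be written as a SPARQL IRI avoids building a malformed or unintended query from untrusted input.
</p>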
+ </section>
- <section>
- <h2>Find Provenance-URI given identifying information about a resource</h2>
- <p>
- If the requester has identifying information that is not the URI of the original resource, then they will need to construct a more elaborate query to locate an entity description and obtain its provenance-URI(s). The nature of identifying information that can be used in this way will depend upon the third party service used, further definition of which is out of scope for this specification. For example, a query for a document identified by a DOI, say <code>1234.5678</code>, using the PRISM vocabulary [[PRISM]] recommended by FaBio [[FABIO]], might look like this:
- <pre class="example code">
- @prefix prov: <@@TBD>
+ <section>
+ <h2>Find Provenance-URI given identifying information about a resource</h2>
+ <p>
+ If the requester has identifying information that is not the URI of the original resource, then they will need to construct a more elaborate query to locate a resource description and obtain its provenance-URI(s). The nature of identifying information that can be used in this way will depend upon the third party service used, further definition of which is out of scope for this specification. For example, a query for a document identified by a DOI, say <code>1234.5678</code>, using the PRISM vocabulary [[PRISM]] recommended by FaBio [[FABIO]], might look like this:
+ <pre class="example code">
+ PREFIX prov: <<provns/>>
 PREFIX prism: <http://prismstandard.org/namespaces/basic/2.0/>
SELECT ?provenance_uri WHERE
{
[ prism:doi "1234.5678" ] prov:hasProvenance ?provenance_uri
}
- </pre>
- </p>
- <p class="TODO">
- @@TODO: specific provenance namespace and property to be determined by the model specification?
- </p>
- </section>
+ </pre>
+ </p>
+ </section>
- <section>
- <h2>Obtain provenance information directly given an entity-URI of a resource</h2>
- <p>
- This scenario retrieves provenance information directly given the URI of a resource or entity, and may be useful where the provenance information has not been assigned a specific URI, or when the calling application is interested only in specific elements of provenance information.
- </p>
- <p>
- If the original resource has an entity-URI <code>http://example.org/resource</code>, a SPARQL query for provenance information might look like this:
- <pre class="example code">
- @prefix prov: <@@TBD>
+ <section>
+ <h2>Obtain provenance information directly given a target-uri</h2>
+ <p>
+ This scenario retrieves provenance information directly given the URI of a resource, and may be useful where the provenance information has not been assigned a specific URI, or when the calling application is interested only in specific elements of provenance information.
+ </p>
+ <p>
+ If the original resource has a URI <code>http://example.org/resource</code>, a SPARQL query for provenance information might look like this:
+ <pre class="example code">
+ PREFIX prov: <<provns/>>
CONSTRUCT
{
<http://example.org/resource> ?p ?v
@@ -662,48 +624,52 @@
{
<http://example.org/resource> ?p ?v
}
- </pre>
- This query essentially extracts all available properties and values available from the query service used that are directly about the specified entity, and returns them as an RDF graph. This may be fine if the service contains <em>only</em> provenance information about the indicated resource, or if the non-provenance information is also of interest. A more complex query using specific provenance vocabulary terms may be needed to selectively retrieve just provenance information when other kinds of information are also available.
- </p>
- <p class="TODO">
- @@TODO: specific provenance namespace and property to be determined by the model specification? The above query pattern assumes provenance information is included in direct properties about the entity. When an RDF provenance vocabulary is fully formulated, this may well turn out to not be the case. A better example would be one that retrieves specific provenance information when the vocabulary terms have been defined.
- </p>
+ </pre>
+ </p>
+ <p>
+ This query essentially extracts all available properties and values available from the query service used that are directly about the specified resource, and returns them as an RDF graph. This may be fine if the service contains <em>only</em> provenance information about the indicated resource, or if the non-provenance information is also of interest. A more complex query using specific provenance vocabulary terms may be needed to selectively retrieve just provenance information when other kinds of information are also available.
+ </p>
+ <p class="TODO">
+ @@TODO: specific provenance namespace and property to be determined by the model specification? The above query pattern assumes provenance information is included in direct properties about the resource. When an RDF provenance vocabulary is fully formulated, this may well turn out to not be the case. A better example would be one that retrieves specific provenance information when the vocabulary terms have been defined.
+ </p>
+ </section>
+
</section>
- </section>
-
-<!-- == Sect 6 ===================================================================================== -->
-
- <section>
- <h2>Incremental Provenance Retrieval</h2>
- <p><a class="internalDFN">Provenance information</a> may be large. While this specification does not define how to implement scalable provenance systems, it does allow for publishers to make available provenance in an incremental fashion. We now discuss two possibilities for incremental provenance retrieval.
- </p>
+ <!-- == Sect 5.2 ===================================================================================== -->
- <section>
- <h2>Via Web Retrieval</h2>
- <p>Publishers are not required to publish all the provenance information associated with a given entity at a particular <a class="internalDFN">provenance-URI</a>. The amount of provenance information exposed is application dependent. However, it is possible to incrementally retrieve (i.e. walk the provenance graph) by progressively looking up provenance information using HTTP. The pattern is as follows:
- <ol>
- <li>For a given entity (<code>entity-uri-1</code>) retrieve it's associated <code>provenance-uri-1</code> using the HTTP <code>Link</code> header (<a href="#resource-accessed-by-http" class="sectionRef"></a>)</li>
- <li>Dereference <code>provenance-uri-1</code></li>
- <li>Navigate the provenance information</li>
- <li>When reaching a dead-end during navigation, that is on encountering a reference to an entity (<code>entity-uri-2</code>) with no provided provenance information, find its provenance-URI and continue from Step 1. (Note: an HTTP HEAD operation may be used to obtain the Link headers without retrieving the entity content.)</li>
- </ol>
+ <section id="incremental-provenance-retrieval">
+ <h2>Incremental Provenance Retrieval</h2>
+ <p><a class="internalDFN">Provenance information</a> may be large. While this specification does not define how to implement scalable provenance systems, it does allow for publishers to make available provenance in an incremental fashion. We now discuss two possibilities for incremental provenance retrieval.
</p>
- <p>To reduce the overhead of multiple HTTP requests, a provenance information publisher may link entities to their associated provenance information using the <code>prov:hasProvenance</code> predicate. Thus, the same pattern above applies, except instead of having to retrieve a new <code>Link</code> header field, one can immediately dereference the entity's associated provenance.
- </p>
- <p>The same approach can be adopted when using the <a class="internalDFN">provenance service</a> API (<a href="#provenance-services" class="sectionRef"></a>). However, instead of performing an HTTP HEAD or GET against a resource one queries the provenance service using the given <a class="internalDFN">entity-uri</a>.
- </p>
- </section>
- <section>
- <h2>Via Queries</h2>
- <p>Provenance information may be made available using a SPARQL endpoint (<a href="#querying-provenance-information" class="sectionRef"></a>) [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. Using SPARQL queries, provenance can be selectively retrieved using combinations of filters and or path queries.
- </p>
+ <section>
+ <h2>Via Web Retrieval</h2>
+ <p>Publishers are not required to publish all the provenance information associated with a given resource at a particular <a class="internalDFN">provenance-URI</a>. The amount of provenance information exposed is application dependent. However, it is possible to incrementally retrieve (i.e. walk the provenance graph) by progressively looking up provenance information using HTTP. The pattern is as follows:
+ <ol>
+ <li>For a given resource (<code>target-uri-1</code>), retrieve its associated <code>provenance-uri-1</code> and its associated <code>target-uri-1</code> using a returned HTTP <code>Link:</code> header field (<a href="#resource-accessed-by-http" class="sectionRef"></a>)</li>
+ <li>Dereference <code>provenance-uri-1</code></li>
+ <li>Navigate the provenance information</li>
+ <li>When reaching a dead end during navigation, that is, on encountering a reference to a resource (<code>target-uri-2</code>) with no provided provenance information, find its provenance-URI and continue from Step 2. (Note: an HTTP HEAD request for <code>target-uri-2</code> may be used to obtain the <code>Link:</code> headers without retrieving the resource representation.)</li>
+ </ol>
+ </p>
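+ <p>As an illustrative sketch only, step 1 might look like the following HTTP exchange. The host name and URI paths are hypothetical placeholders, and the two <code>Link:</code> header fields assume the "provenance" and "anchor" link relations registered in the IANA considerations section of this document:
+ </p>
+ <pre>
+HEAD /target-uri-1 HTTP/1.1
+Host: example.org
+
+HTTP/1.1 200 OK
+Link: &lt;http://example.org/provenance-uri-1&gt;; rel="provenance"
+Link: &lt;http://example.org/target-uri-1&gt;; rel="anchor"
+ </pre>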
+ <p>To reduce the overhead of multiple HTTP requests, provenance information publishers are encouraged to link entities to their associated provenance information using the <code>prov:hasProvenance</code> predicate. The same pattern then applies, except that instead of having to retrieve a new <code>Link</code> header field, one can immediately access the resource's associated provenance.
+ </p>
+ <p>The same approach can be adopted when using the <a class="internalDFN">provenance service</a> API (<a href="#provenance-services" class="sectionRef"></a>). However, instead of performing an HTTP HEAD or GET against a resource one queries the provenance service using the given <a class="internalDFN">target-uri</a>.
+ </p>
+ </section>
+
+ <section>
+ <h2>Via SPARQL Queries</h2>
+ <p>Provenance information may be made available using a SPARQL endpoint (<a href="#querying-provenance-information" class="sectionRef"></a>) [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. Using SPARQL queries, provenance can be selectively retrieved using combinations of filters and/or path queries.
+ </p>
+ </section>
+
</section>
</section>
-<!-- ===================================================================================== -->
+<!-- ==== Section 6 ===================================================================================== -->
<section>
<h2>IANA considerations</h2>
@@ -722,7 +688,7 @@
</dd>
<dt>Description:</dt>
<dd>
- the resource identified by target IRI of the link provides provenance information about the entity identified by the context link
+ the resource identified by target IRI of the link provides provenance information about the resource identified by the context link
</dd>
<dt>Reference:</dt>
<dd>
@@ -741,9 +707,6 @@
</section>
<section>
<h2>Registration template for link relation: "anchor"</h2>
- <p class="pending">
- The name "anchor" has been used for the link relation name, despite the corresponding URI being described as an entity-URI. This terminology has been chosen to align with usage in the description of the HTTP <code>Link</code> header field, per <a href="http://tools.ietf.org/html/rfc5988#section-5.2">RFC 5988</a>.
- </p>
<p>
<dl>
<dt>Relation Name:</dt>
@@ -752,7 +715,7 @@
</dd>
<dt>Description:</dt>
<dd>
- when used in conjunction with a "provenance" link, the resource identified by target IRI of the link is an entity for which provenance information may be provided. This may be used, for example, to isolate relevant information from a referenced document that contains provenance information for several entities.
+ when used in conjunction with a "provenance" link, the resource identified by target IRI of the link is one for which provenance information may be provided. This may be used, for example, to isolate relevant information from a referenced document that contains provenance information for several entities.
</dd>
<dt>Reference:</dt>
<dd>
@@ -798,12 +761,12 @@
</section>
</section>
-<!-- ===================================================================================== -->
+<!-- ==== Section 7 ===================================================================================== -->
<section>
<h2>Security considerations</h2>
<p>
- Provenance is central to establishing trust in data. If provenance information is corrupted, it may lead agents (human or software) to draw inappropriate and possibly harmful conclusions. Therefore, care is needed to ensure that the integrity of provenance data is maintained.
+ Provenance is central to establishing trust in data. If provenance information is corrupted, it may lead agents (human or software) to draw inappropriate and possibly harmful conclusions. Therefore, care is needed to ensure that the integrity of provenance information is maintained.
</p>
<p>
When using HTTP to access provenance information, or to determine a provenance URI, secure HTTP (https) SHOULD be used.
@@ -827,10 +790,46 @@
Many thanks to Robin Berjon for making our lives so much easier with his cool <a href="http://dev.w3.org/2009/dap/ReSpec.js/documentation.html">ReSpec</a> tool.
</p>
</section>
-
+
+<!-- ===================================================================================== -->
+
+ <section class='appendix'>
+ <h2>Names added to prov: namespace</h2>
+ <p>
+ This specification defines the following additional names in the provenance namespace.
+ </p>
+ <p>
+ The provenance namespace URI is <provns/>.
+ </p>
+ <p>
+ <table>
+ <tr>
+ <th>name</th><th>Description</th><th>Definition ref</th>
+ </tr>
+ <tr>
+ <td>ProvenanceService</td><td>Class for a service described by a provenance service description</td><td>...</td>
+ </tr>
+ <tr>
+ <td>hasAnchor</td><td>Indicates anchor URI for a potentially dynamic resource instance</td><td>...</td>
+ </tr>
+ <tr>
+ <td>hasProvenance</td><td>Relates a resource to its provenance</td><td>...</td>
+ </tr>
+ <tr>
+ <td>hasProvenanceService</td><td>Relates a resource to a provenance service</td><td>...</td>
+ </tr>
+ <tr>
+ <td>provenanceUriTemplate</td><td>Relates a provenance service to a URI template string for constructing provenance-URIs</td><td>...</td>
+ </tr>
+ </table>
+ </p>
+ </section>
+
+
<!-- ===================================================================================== -->
- <section class='appendix'>
+ <!--
+<section class='appendix'>
<h2>Provenance service format examples</h2>
<p>
In <a href="#provenance-services" class="sectionRef"></a>, the provenance service description was represented as a JSON-formatted document. As noted, HTTP content negotiation MAY be enabled to retrieve the document in alternative formats. This appendix provides examples of service description document represented using RDF Turtle and XML syntaxes, and XML.
@@ -934,56 +933,9 @@
</section>
</section>
+ -->
<!-- ===================================================================================== -->
-<!--
- <section class="appendix">
- <h2>Motivating scenario</h2>
- <p class="pending">
- I propose to replace this appendix with text based on Yogesh's walk-through of the scenario, renaming to be something like "Motivating scenario and examples"
- </p>
- <p><a href="http://www.w3.org/2011/prov/wiki/ProvenanceAccessScenario">This scenario</a> was selected by the provenance working group as a touchstone for evaluating any provenance access proposal. This appendix evaluates the foregoing proposals against the requirements implied by that scenario.</p>
- <p>
- <ul>
- <li>Obtaining the document D: for the purpose of this analysis, it is assumed that the access to the document is either from a known Web URI, or the document is available as HTML or RDF (the primary web standards for documents and data). The mechanisms here are in principle applicable to other document forms of a per-format basis.
- <ul>
- <li>D1, D2: use the HTTP <code>Link:</code> header. Any server providing the document may provide this information. Different servers might offer links to different provenance sources.</li>
- <li>D3: information provided as an image with a known URI, but from a non-provenance-aware source. The image URI can be used as a key to access a third party provenance discovery service.
- <li>D4, D6, D7, D8: information provided as an image, without a known web location. At the very least, some mechanism, not specified here, is needed to identify the image provided. In the case of an email attachment, it is possible (but not guaranteed) that the email message MIME wrapper specifies a URI for the image, which can be used as a key. Some image formats support embedded metadata which might be used for this purpose. <em>(Arbitrary data files could be wrapped in a package, say MIME multipart/related [[RFC2387]], that could include additional metadata. Image files could be wrapped in a minimal HTML document. It is not clear to me at this stage that a single mechanism is appropriate for all situations)</em>.</li>
- <li>D5: HTML email. Depending on how the HTML is constructed, the HTML header could include <code><link></code> elements.</li>
- </ul>
- </li>
- <li>Lacking identification or in-band metadata, some independent identification of the thing represented by an available mechanism is required. <em>I think this is unavoidable</em></li>
- <li>Enacting the "Oh yeah?" feature
- <ul>
- <li>W: once a URI for provenance information has been determined, accessing it using a web browser or other web client software should be straightforward. If the provenance is accessible via a third party query service, that may be less straightforward.</li>
- <li>E: this scenario seems to envisage a wholesale overhaul of email client software, which seems unlikely. If a URI for provenance can be provided, the natural way to access it would be via a web client of some kind, which might be a browser or other software.</li>
- <li>S: this scenario effectively calls for this: given an arbitrary data resource, implement a general purpose application to discover, retrieve and analyze provenance about that resource. At the present time, this is a matter for experimental development, which could be based substantially on the mechanisms described for provenance discovery and access via third party services.</li>
- </ul>
- </li>
- <li>I: Accessing the provenance
- <ul>
- <li>W: a web client needs one or more URIs for provenance information, and/or URI(s) for a provenance query service and sufficient additional information about the resource to formulate an effective query. They may also need access information that can be used to assess (or help a user assess) the trustworthiness of provenance of information obtained, (which could be more provenance information)</li>
- <li>E: an email client is a passive receiver of information, so asking one to retrieve provenance information is a perverse expectation. There have been some attempts to standardize email protocols that interact with the email sender but such mechanisms have not been significantly deployed in practice. This case can be viewed as a variation on the shell-client case (S) below. If all provenance information is sent <em>with</em> the original content using standard email mechanisms (MIME multipart, etc.) then the email client may use that (or hand it off to a helper application) as the basis for provenance-based analysis or presentation.</li>
- <li>S: command shell or other local application. This is the general case for provenance access. Given some arbitrary information, what does a provenance-aware application need to access the required provenance information? It may employ any of the mechanisms described above.</li>
- </ul>
- </li>
- </ul>
- </p>
- <section>
- <h2>Gap analysis</h2>
- <p>
- There are clearly a number of capabilities needed for a provenance-aware application that are not covered by the mechanisms described above. But most of these amount to implementation details and decisions for a particular application, and as such are beyond the scope of this document to specify.
- </p>
- <p>
- One feature not covered above that might be a candidate for specification is a common format for a data package that combines original content along with provenance-related metadata or data. At this stage, it is not clear what format that might take, but some possible candidates are discussed in <a href="#arbitrary-data" class="sectionRef"></a>.
- In any case, it seems to me that a specification that is specific for provenance to the exclusion of other metadata is unlikely to obtain traction, as provenance is just part of a wider landscape of information quality, trust, preservation and more.
- </p>
- </section>
- </section>
-
--->
-
</body>
</html>