PROV-AQ: Provenance Access and Query

Abstract

This document specifies how to use standard Web protocols, including HTTP, to obtain information about the provenance of Web resources. We describe both simple access mechanisms for locating provenance information associated with web pages or resources, and provenance query services for more complex deployments. This is part of the larger W3C Prov provenance framework.

1. Introduction

The Provenance Data Model [PROV-DM] and Provenance Ontology [PROV-O] specifications define how to represent provenance information in the World Wide Web.

This note describes how existing web mechanisms may be used to locate, retrieve and query provenance information.

1.1 Concepts

In defining the specification below, we make use of the following concepts.

Provenance information: refers to provenance represented in some fashion.
Provenance-URI: a URI denoting some provenance information.
Entity: an aspect of a resource, about which one wishes to present some provenance information. For example, a weather report for a given date may be an aspect of a resource that is maintained as the current weather report. An entity is itself a resource. See also [PROV-DM], and [WEBARCH] section 2.3.2.
Entity-URI: a URI denoting an entity, which allows that entity to be identified for the purpose of finding and expressing provenance information (see section 1.2 for discussion).
Provenance service: a service that provides a provenance-URI or provenance information given a resource URI or an entity-URI.
Service-URI: the URI of a provenance service.
Resource: also referred to as a web resource; a resource as described by the Architecture of the World Wide Web [WEBARCH], section 2.2. A resource may be associated with multiple entities (see section 1.2 for discussion).

1.2 Provenance, entities and resources

Fundamentally, provenance information is about resources. In general, resources may vary over time and context. E.g., a resource describing the weather in London changes from day-to-day, or one listing restaurants near you will vary depending on your location. Provenance information, to be useful, must be persistent and not itself dependent on context. Yet we may still want to make provenance assertions about dynamic or context-dependent web resources (e.g. the weather forecast for London on a particular day may have been derived from a particular set of Meteorological Office data).

Provenance descriptions of dynamic and context-dependent resources are possible through the notion of entities. An entity is simply a web resource that is a contextualized view or instance of an original web resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification through its lifetime. Separate URIs for each individual revision would then be entity-URIs, denoting the specification at a particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor. Entity-URIs may use any URI scheme, and are not required to be dereferenceable.

Requests for provenance about a resource may return provenance information that uses one or more entity-URIs to refer to versions of that resource. Some given provenance information may use multiple entity-URIs if there are assertions referring to the same underlying resource in different contexts. For example, provenance information describing a W3C document might include information about all revisions of the document using statements that use the different entity-URIs of the various revisions.

In summary, a key notion within the concepts outlined above is that provenance information may not be universally applicable to a resource, but may be expressed with respect to that resource in a restricted context (e.g. at a particular time). This restricted view is called an entity, and an entity-URI is used to refer to it within provenance information.

1.3 Interpreting provenance information

Provenance information describes relationships between entities, activities and agents. As such, any given provenance information may contain information about several entities. Within some provenance information, the entities thus described are identified by their Entity-URIs.

When interpreting provenance information, it is important to be aware that statements about several entities may be present, and to be accordingly selective when using the information provided. (In some exceptional cases, it may be that the provenance information returned does not contain any information relating to a specific associated entity.)

2. Accessing provenance information

Web applications may access provenance information in the same way as any web resource, by dereferencing its URI. Typically, this will be by performing an HTTP GET operation. Thus, any provenance information may be associated with a provenance-URI, and may be accessed by dereferencing that URI using normal web mechanisms.

Provenance assertions are about pre-determined activities involving entities; as such, they are not dynamic. Thus, provenance information returned at a given provenance-URI may commonly be static. But the availability of provenance information about a resource may vary (e.g. if there is insufficient storage to keep it indefinitely, or new information becomes available at a later date), so the provenance information returned at a given URI may change, provided that such change does not contradict any previously retrieved information.

How much or how little provenance information is returned in response to a retrieval request is a matter for the provenance provider application. At a minimum, for as long as provenance information about an entity remains available, sufficient information should be returned to enable a client application to walk the provenance graph per section 6. Incremental Provenance Retrieval.

When publishing provenance as a web resource, the provenance-URI should be discoverable using one or more of the mechanisms described in section 3. Locating provenance information.

If there is no URI for some particular provenance information, then alternative mechanisms may be needed. Possible mechanisms are suggested in section 4. Provenance services and section 5. Querying provenance information.

3. Locating provenance information

When provenance information is a resource that can be accessed using normal web retrieval, one needs to know a provenance-URI to dereference. If this is known in advance, there is nothing more to specify. If a provenance-URI is not known then a mechanism to discover one must be based on information that is available to the would-be accessor.

Provenance information may be provided by several parties other than the provider of the original resource, each using different provenance-URIs, and each with different concerns. (It is possible that these different parties may provide contradictory provenance information.)

Once provenance information is retrieved, one also needs to know how to locate the view of that resource within that provenance information. This view is an entity and is identified by an entity-URI.

We start by considering mechanisms for the resource provider to indicate a provenance-URI along with an entity-URI. (Mechanisms that can be independent of the resource provision are discussed in section 4. Provenance services). Three mechanisms are described here:

The requester knows the resource URI and the resource is accessible using HTTP
The requester has a copy of a resource represented as HTML or XHTML
The requester has a copy of a resource represented as RDF (including the range of possible RDF syntaxes, such as HTML with embedded RDFa)

These particular cases are selected as corresponding to primary current web protocol and data formats. Finally, in section 3.4 Arbitrary data, we discuss the case of a resource in an unspecified format which has been provided by some means other than HTTP.

The mechanisms specified for use with HTTP and HTML are similar to those proposed by POWDER [POWDER-DR] (sections 4.1.1 and 4.1.3).

3.1 Resource accessed by HTTP

For a document accessible using HTTP, provenance information may be indicated using an HTTP Link header field, as defined by Web Linking (RFC 5988) [LINK-REL]. The Link header field is included in the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here).

A provenance link relation type for referencing provenance information is registered according to the template in section 7. IANA considerations, and may be used as shown:

Link: <provenance-URI>; rel="provenance"; anchor="entity-URI"

When used in conjunction with an HTTP success response code (2xx), this HTTP header field indicates that provenance-URI is the URI of some provenance information associated with the requested resource and that the associated entity is identified as entity-URI. (See also section 1.3 Interpreting provenance information.)

If no anchor parameter is provided then the entity-URI is assumed to be the URI of the resource.
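The following is a minimal, non-normative sketch of how a client might extract provenance links from a Link header field value. It assumes the RFC 5988 syntax in which the target URI is enclosed in angle brackets, and applies the default described above when no anchor is given. The function name provenance_links and the regular expressions are illustrative only; they do not handle every corner of the Link header grammar (for example, escaped quotes inside parameter values).

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
# One parameter: ;name="quoted value" or ;name=token
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def provenance_links(link_header, resource_uri):
    """Return (provenance-URI, entity-URI) pairs from a Link header value.

    A link without an anchor parameter is taken to describe the
    requested resource itself, so resource_uri is used as the entity-URI.
    """
    pairs = []
    for target, raw_params in _LINK_VALUE.findall(link_header):
        params = {}
        for name, quoted, token in _PARAM.findall(raw_params):
            params[name.lower()] = quoted or token
        # rel may list several space-separated relation types.
        if "provenance" in params.get("rel", "").split():
            pairs.append((target, params.get("anchor", resource_uri)))
    return pairs
```

Because the relation types are compared as whole words, a link with rel="provenance-service" is not mistaken for a provenance link.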

At this time, the meaning of these links returned with other HTTP response codes is not defined: future revisions of this specification may define interpretations for these.

An HTTP response may include multiple provenance link header fields, indicating a number of different provenance resources that are known to the responding server, each providing provenance information about the accessed resource.

The presence of a provenance link in an HTTP response does not preclude the possibility that other publishers may offer provenance information about the same resource. In such cases, discovery of the additional provenance information must use other means (e.g. see section 4. Provenance services).

Provenance resources indicated in this way are not guaranteed to be authoritative. Trust in the linked provenance data must be determined separately from trust in the original resource, just as, in the web at large, it is a user's responsibility to determine an appropriate level of trust in any other linked resource; e.g. based on the domain that serves it, or an associated digital signature. (See also section 8. Security considerations.)

3.1.1 Specifying Provenance Services

This is a new proposal. It needs to be checked as to whether it is useful. GK/PG to review nature of provenance-service-URI.

The document provider may indicate that provenance information about the document is provided by a provenance service. This is done through the use of a provenance-service link relation type following the same pattern as above:

Link: <provenance-service-URI>; anchor="entity-URI"; rel="provenance-service"

The provenance-service link identifies the service-URI. Dereferencing this URI yields a service description that provides further information to enable a client to determine a provenance-URI or retrieve provenance information for an entity; see section 4. Provenance services for more details.

There may be multiple provenance-service link header fields, and these may appear in the same document as provenance links (though, in simple cases, we anticipate that provenance and provenance-service link relations will not be used together).

3.2 Resource represented as HTML

For a document presented as HTML or XHTML, without regard for how it has been obtained, provenance information may be associated with a resource by adding a <link> element to the HTML <head> section. Two new link relation types for referencing provenance information are registered according to the template in section 7. IANA considerations, and may be used as shown:

  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <link rel="provenance" href="provenance-URI">
        <link rel="anchor" href="entity-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
        ...
     </body>
  </html>

The provenance-URI given by the provenance link element identifies the provenance-URI for the document.

The entity-URI given by the anchor link element specifies an identifier for the presented document view, which may be used within the provenance information when referring to this document.

An HTML document header may include multiple "provenance" link elements, indicating a number of different provenance resources that are known to the creator of the document, each of which may provide provenance information about the document.

Likewise, the header may include multiple "anchor" link elements indicating that, e.g., different revisions of the document can be identified in the provenance information using the different entity-URIs.

If no "anchor" link element is provided then the entity-URI is assumed to be the URI of the document. It is recommended that this convention be used only when the document is static.
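As a non-normative illustration, the sketch below uses Python's standard html.parser module to collect the "provenance" and "anchor" link elements from a document, falling back to the document URI when no "anchor" link is present. The class and function names are hypothetical, and the sketch does not restrict itself to links inside the <head> section.

```python
from html.parser import HTMLParser


class ProvenanceLinkParser(HTMLParser):
    """Collect href values of <link rel="provenance"> and <link rel="anchor">."""

    def __init__(self):
        super().__init__()
        self.provenance_uris = []
        self.entity_uris = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated link relation types.
        rels = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if href is None:
            return
        if "provenance" in rels:
            self.provenance_uris.append(href)
        if "anchor" in rels:
            self.entity_uris.append(href)


def find_provenance(html_text, document_uri):
    """Return (provenance-URIs, entity-URIs) declared by an HTML document."""
    parser = ProvenanceLinkParser()
    parser.feed(html_text)
    # With no "anchor" link, the entity-URI is assumed to be the document URI.
    entity_uris = parser.entity_uris or [document_uri]
    return parser.provenance_uris, entity_uris
```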

3.2.1 Specifying Provenance Services

The document creator may specify that the provenance information about the document is provided by a provenance service. This is done through the use of a third link relation type following the same pattern as above:

  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <link rel="provenance-service" href="service-URI">
        <link rel="anchor" href="entity-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
        ...
     </body>
  </html>

The provenance-service link element identifies the service-URI. Dereferencing this URI yields a service description that provides further information to enable a client to access provenance information for an entity; see section 4. Provenance services for more details.

There may be multiple provenance-service link elements, and these may appear in the same document as anchor and provenance link elements (though, in simple cases, we anticipate that provenance and provenance-service link relations would not be used together).

3.3 Resource represented as RDF

If a resource is represented as RDF (in any of its recognized syntaxes, including RDFa), it may contain references to its own provenance using additional RDF statements.

For this purpose a new RDF property, prov:hasProvenance, is defined as a relation between two resources, where the object of the property is a resource that provides provenance information about the subject resource. Multiple prov:hasProvenance assertions may be made about a subject resource.

Another new RDF property, prov:hasAnchor, is defined to allow the RDF content to specify one or more entity-URIs of the RDF document for the purpose of provenance information (similar to the use of the "anchor" link relation in HTML).

@@TODO: document namespace. Check naming style. Use provenance model namespace? Define as part of model?
@@TODO: example, when vocabulary issues are settled.

3.4 Arbitrary data

We have so far decided not to try and define a common mechanism for arbitrary data, because it's not clear to us what the correct choice would be. Is this a reasonable position, or is there a real need for a generic solution for provenance discovery for arbitrary, non-web-accessible data objects?

If a resource is represented using a data format other than HTML or RDF, and no URI for the resource is known, provenance discovery becomes trickier to achieve. This specification does not define a specific mechanism for such arbitrary resources, but this section discusses some of the options that might be considered.

For formats which have provision for including metadata within the file (e.g. JPEG images, PDF documents, etc.), use the format-specific metadata to include an entity-URI, provenance-URI and/or service-URI. Format-specific metadata provision might also be used to include provenance information directly in the resource.

Use a generic packaging format that can combine an arbitrary data file with a separate metadata file in a known format, such as RDF. At this time, it is not clear what format that should be, but some possible candidates are:

MIME multipart/related [RFC2387]: both email and HTTP are based on MIME or MIME-derivatives, so this has the advantage of working well with the network transfer mechanisms discussed in the motivating scenarios considered.
Composite object-packaging work from the digital library community, of which there are several examples (ORE, MPEG-21 and BagIt @@refs, to name a handful). Practical implementations of these seem commonly to be based on the ZIP file format.
Packaging formats along the lines of those used for shipping Java web applications (basically, a ZIP file with a manifest and some imposed structure)
Ongoing work in the research community (e.g. Why Linked Data is Not Enough for Scientists, ePub, etc.) to encapsulate data, code, annotations and metadata into a common exchangeable format.

Fix references in above text.

4. Provenance services

This section describes a REST API [REST-APIs] for a provenance service with facilities for discovery and/or retrieval of provenance information, which can be implemented independently of the original resource delivery channels (e.g. by a third party service).

All service implementations must respond with a service description (section 4.2.1 Service description) when the service URI is dereferenced. Service implementations may provide either discovery, retrieval or both of these services, indicated by presence of the corresponding service URI templates in the service description. Which of these services to provide is a choice for individual service implementations.

On the Web, the normal mechanism for retrieving information is to associate it with a URI, and dereference the URI using normal retrieval mechanisms. This approach is enabled using the provenance discovery service mechanism: given the URI of some resource for which provenance information is required, the service returns one or more URIs from which provenance information may be obtained. This approach may be preferred when the provenance service cannot specify the form of URIs used for identifying provenance information, or when there may be more than one source of provenance information known to the provenance service.

The provenance retrieval service returns provenance information directly. This mechanism may be preferred when the provenance information is not already presented directly to the web, or is stored in a database with a complex query protocol, or when the provenance service can control the URI from which provenance information is served and avoid the intermediate step of URI discovery.

4.1 Using the provenance service API

This section describes general procedures for using the provenance service API. Later sections describe the resources presented by the API, and their representation using JSON. Section B. Provenance service format examples gives examples of alternative representations. Normal HTTP content negotiation mechanisms may be used to retrieve representations using formats convenient for the client application.

4.1.1 Retrieve Provenance-URIs for a resource

To use the provenance service to retrieve a list of provenance-URIs for a resource, starting with the service URI (service-URI) and the URI of the resource or entity (entity-URI):

Dereference service-URI to obtain a representation of the service description.
Extract the provenance locations template from the service description.
Use the provenance locations template with entity-URI for template variable uri to form provenance-locations-URI.
Dereference provenance-locations-URI to obtain a provenance locations resource in one of the formats described below.

Any or all of the URIs in the returned provenance locations may be used to retrieve provenance information, per section 2. Accessing provenance information.
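The procedure in this section might be sketched as follows. This is an illustration rather than a reference implementation: the function names are hypothetical, the JSON member names are taken from the examples in section 4.2, and the template expansion handles only a single {uri} variable with simple string expansion rather than the full URI template syntax. The fetch_json argument is injected so that the logic can be exercised without network access.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def expand_uri_template(template, uri):
    """Expand the single variable {uri}, %-escaping every reserved character."""
    return template.replace("{uri}", quote(uri, safe=""))


def http_fetch_json(uri):
    """Dereference uri, asking for a JSON representation."""
    request = Request(uri, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def provenance_locations(service_uri, entity_uri, fetch_json=http_fetch_json):
    """Return the list of provenance-URIs the service knows for entity_uri."""
    # Dereference the service-URI to obtain the service description.
    description = fetch_json(service_uri)
    # Extract the provenance locations template.
    template = description["provenance_locations_template"]
    # Form the provenance-locations-URI from the template and entity-URI.
    locations_uri = expand_uri_template(template, entity_uri)
    # Dereference it to obtain the provenance locations resource.
    locations = fetch_json(locations_uri)
    return locations["provenance"]
```

Direct retrieval of provenance information (section 4.1.2) follows the same shape, using the provenance information template in place of the provenance locations template.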

4.1.2 Retrieve Provenance information for a resource

To use the provenance service to directly retrieve provenance information for a resource, starting with the service URI (service-URI) and the URI of the resource or entity (entity-URI):

Dereference service-URI to obtain a representation of the service description.
Extract the provenance information template from the service description.
Use the provenance information template with entity-URI for template variable uri to form provenance-URI.
Dereference provenance-URI to obtain provenance information.

4.2 Resources presented and representations used

4.2.1 Service description

A provenance service description describes the provenance discovery and retrieval service and, in particular, provides URI templates [URI-template] for URIs to access provenance locations resources and/or provenance information. Dereferencing the service URI returns a representation of this service description. The service description may contain additional metadata about the service beyond that described here: API clients are expected to ignore any metadata elements they do not understand.

This example shows a provenance service description using JSON format [RFC4627], which is presented as MIME content-type application/json. Other examples may be seen in section B. Provenance service format examples.

{
  "provenance_service_uri":         "http://example.org/provenance_service/",
  "provenance_locations_template":  "http://example.org/provenance_service/locations/?uri={uri}",
  "provenance_content_template":    "http://example.org/provenance_service/provenance/?uri={uri}"
}

Is there any point in including the provenance service URI here? It has been included for consistency with RDF representations, but is functionally redundant.

4.2.2 Provenance locations

A provenance locations resource enumerates one or more provenance-URIs identifying provenance information associated with a given resource.

The examples below and in section B. Provenance service format examples are for a given resource URI http://example.org/qdata/. Using the service description example above, the URI of the provenance locations resource would be http://example.org/provenance_service/locations/?uri=http%3A%2F%2Fexample.org%2Fqdata%2F.

This example uses JSON format [RFC4627], presented as MIME content type application/json. Other examples may be seen in section B. Provenance service format examples.

{
  "uri": "http://example.org/qdata/",
  "provenance": [
    "http://source1.example.org/provenance/qdata/",
    "http://source2.example.org/prov/qdata/",
    "http://source3.example.com/prov?id=qdata"
  ]
}

The template might use ?uri={+uri} rather than just ?uri={uri}, and thereby avoid %-escaping the : and / characters in the given URI, but this could cause difficulties for URIs containing query parameters and/or fragment identifiers. In this case, the client application would need to ensure that any such characters were %-escaped before being passed into a URI-template expansion processor.
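The difficulty can be seen in a small sketch that approximates the two expansion styles with urllib.parse.quote and then parses the resulting request URI as a service would. The helper names are hypothetical, the service URI is the one from the example above, and the reserved-expansion helper is only an approximation of {+uri} (it passes every "%" through unchanged).

```python
from urllib.parse import parse_qs, quote, urlsplit

TEMPLATE_PREFIX = "http://example.org/provenance_service/locations/?uri="


def expand_simple(uri):
    """Approximate {uri}: only unreserved characters are left unescaped."""
    return TEMPLATE_PREFIX + quote(uri, safe="")


def expand_reserved(uri):
    """Approximate {+uri}: reserved characters are also left unescaped."""
    return TEMPLATE_PREFIX + quote(uri, safe=":/?#[]@!$&'()*+,;=%")


def received_uri(request_uri):
    """The value of the uri query parameter as the service would see it."""
    return parse_qs(urlsplit(request_uri).query)["uri"][0]


target = "http://example.org/qdata/?page=2&sort=asc#results"

# Simple expansion round-trips the whole URI.
print(received_uri(expand_simple(target)))
# Reserved expansion loses everything from the first "&" or "#" onwards.
print(received_uri(expand_reserved(target)))
```

With reserved expansion, "&sort=asc" is read as a second query parameter of the service request and "#results" as its fragment, so the service receives a truncated resource URI.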

4.2.3 Provenance information

Provenance information about a resource or resources may be returned in any format. It is recommended that the format be one defined by the Provenance Model specification [PROV-DM].

Assuming a given resource URI http://example.org/qdata/, and using the service description example above, the provenance URI would be http://example.org/provenance_service/provenance/?uri=http%3A%2F%2Fexample.org%2Fqdata%2F.

4.3 Provenance service discovery

This specification does not define any specific mechanism for discovering provenance services. Applications may use any appropriate mechanism, including but not limited to: prior configuration, search engines, service registries, etc.

5. Querying provenance information

Simply identifying and retrieving provenance information as a web resource may not always meet the requirements of a particular application or service, e.g.:

the entity for which provenance information is required is not identified by a known URI
the provenance information for an entity is not directly identified by a known URI
provenance information for a number of distinct but related entities needs to be accessed in a single atomic operation
etc.

A provenance query service provides an alternative way to access provenance information and/or Provenance-URIs. An application will need a provenance query service URI, and some relevant information about the entity whose provenance is to be accessed.

The details of a provenance query service are an implementation choice, but for interoperability between different providers and users we recommend use of SPARQL [RDF-SPARQL-PROTOCOL] [RDF-SPARQL-QUERY]. The query service URI would then be the URI of a SPARQL endpoint (or, to use the SPARQL specification language, a SPARQL protocol service). The following subsections provide examples for what are considered to be some plausible common scenarios for using SPARQL, and are not intended to cover all possibilities.

5.1 Find provenance-URI given entity-URI of resource

If the requester has an entity-URI, a simple SPARQL query may be used to return the corresponding provenance-URI. E.g., if the original resource has an entity-URI http://example.org/resource,

  PREFIX prov: <@@TBD>
  SELECT ?provenance_uri WHERE
  {
    <http://example.org/resource> prov:hasProvenance ?provenance_uri
  }

@@TODO: specific provenance namespace and property to be determined by the model or ontology specification?
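For illustration only, a client might build such a query and the corresponding SPARQL protocol request URI as sketched below. The query is sent as the "query" parameter of an HTTP GET, as defined by the SPARQL protocol, and is written with the PREFIX keyword of SPARQL syntax. Since the provenance namespace is still to be determined, it is passed in as a parameter; the function names, and any namespace or endpoint URI supplied by a caller, are assumptions.

```python
from urllib.parse import urlencode


def provenance_uri_query(entity_uri, prov_namespace):
    """Build a SELECT query for the provenance-URIs of entity_uri."""
    # Reject characters that cannot appear inside <...> in SPARQL, so that
    # the entity-URI cannot break out of the IRI reference.
    if any(c in entity_uri for c in '<>"{}|^`\\ '):
        raise ValueError("entity-URI is not usable as a SPARQL IRI reference")
    return (
        f"PREFIX prov: <{prov_namespace}>\n"
        "SELECT ?provenance_uri WHERE\n"
        "{\n"
        f"  <{entity_uri}> prov:hasProvenance ?provenance_uri\n"
        "}"
    )


def sparql_get_uri(endpoint_uri, query):
    """Form the request URI for a SPARQL protocol query using HTTP GET."""
    return endpoint_uri + "?" + urlencode({"query": query})
```

The resulting URI can be dereferenced with any HTTP client; the response format is chosen through normal content negotiation.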

5.2 Find Provenance-URI given identifying information about a resource

If the requester has identifying information that is not the URI of the original resource, then they will need to construct a more elaborate query to locate an entity description and obtain its provenance-URI(s). The nature of identifying information that can be used in this way will depend upon the third party service used, further definition of which is out of scope for this specification. For example, a query for a document identified by a DOI, say 1234.5678, using the PRISM vocabulary [PRISM] recommended by FaBio [FABIO], might look like this:

PREFIX prov: <@@TBD>
PREFIX prism: <http://prismstandard.org/namespaces/basic/2.0/>
SELECT ?provenance_uri WHERE
{
  [ prism:doi "1234.5678" ] prov:hasProvenance ?provenance_uri
}

@@TODO: specific provenance namespace and property to be determined by the model specification?

5.3 Obtain provenance information directly given an entity-URI of a resource

This scenario retrieves provenance information directly given the URI of a resource or entity, and may be useful where the provenance information has not been assigned a specific URI, or when the calling application is interested only in specific elements of provenance information.

If the original resource has an entity-URI http://example.org/resource, a SPARQL query for provenance information might look like this:

PREFIX prov: <@@TBD>
CONSTRUCT
{
  <http://example.org/resource> ?p ?v
}
WHERE
{
  <http://example.org/resource> ?p ?v
}

This query essentially extracts all available properties and values available from the query service used that are directly about the specified entity, and returns them as an RDF graph. This may be fine if the service contains only provenance information about the indicated resource, or if the non-provenance information is also of interest. A more complex query using specific provenance vocabulary terms may be needed to selectively retrieve just provenance information when other kinds of information are also available.

@@TODO: specific provenance namespace and property to be determined by the model specification? The above query pattern assumes provenance information is included in direct properties about the entity. When an RDF provenance vocabulary is fully formulated, this may well turn out to not be the case. A better example would be one that retrieves specific provenance information when the vocabulary terms have been defined.

6. Incremental Provenance Retrieval

Provenance information may be large. While this specification does not define how to implement scalable provenance systems, it does allow for publishers to make available provenance in an incremental fashion. We now discuss two possibilities for incremental provenance retrieval.

6.1 Via Web Retrieval

Publishers are not required to publish all the provenance information associated with a given entity at a particular provenance-URI. The amount of provenance information exposed is application dependent. However, it is possible to incrementally retrieve (i.e. walk the provenance graph) by progressively looking up provenance information using HTTP. The pattern is as follows:

For a given entity (entity-uri-1) retrieve its associated provenance-uri-1 using the HTTP Link header (section 3.1 Resource accessed by HTTP)
Dereference provenance-uri-1
Navigate the provenance information
When reaching a dead-end during navigation, that is on encountering a reference to an entity (entity-uri-2) with no provided provenance information, find its provenance-URI and continue from Step 1. (Note: an HTTP HEAD operation may be used to obtain the Link headers without retrieving the entity content.)
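The pattern above can be sketched as a breadth-first walk. This is a simplified, non-normative illustration: the two callables are hypothetical helpers supplied by the caller (for example, one built on the Link header lookup of section 3.1 and one that parses retrieved provenance information), and the sketch looks up a provenance-URI for every referenced entity rather than only for those at a dead end.

```python
from collections import deque


def walk_provenance(start_entity, find_provenance_uri, fetch_entities):
    """Walk the provenance graph starting from start_entity.

    find_provenance_uri(entity_uri) returns a provenance-URI, or None
    if none can be found.
    fetch_entities(provenance_uri) dereferences the provenance-URI and
    returns the entity-URIs mentioned in the provenance information.
    Returns a dict mapping each visited entity-URI to its provenance-URI.
    """
    visited = {}
    fetched = set()
    pending = deque([start_entity])
    while pending:
        entity = pending.popleft()
        if entity in visited:
            continue
        # Find the provenance-URI associated with this entity.
        provenance_uri = find_provenance_uri(entity)
        visited[entity] = provenance_uri
        if provenance_uri is None or provenance_uri in fetched:
            continue
        fetched.add(provenance_uri)
        # Dereference and navigate; queue any entities not yet seen.
        for referenced in fetch_entities(provenance_uri):
            if referenced not in visited:
                pending.append(referenced)
    return visited
```

Tracking visited entities and fetched provenance-URIs keeps the walk finite even when provenance information refers back to entities already seen.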

To reduce the overhead of multiple HTTP requests, a provenance information publisher may link entities to their associated provenance information using the prov:hasProvenance predicate. Thus, the same pattern above applies, except instead of having to retrieve a new Link header field, one can immediately dereference the entity's associated provenance.

The same approach can be adopted when using the provenance service API (section 4. Provenance services). However, instead of performing an HTTP HEAD or GET against a resource one queries the provenance service using the given entity-uri.

6.2 Via Queries

Provenance information may be made available using a SPARQL endpoint (section 5. Querying provenance information) [RDF-SPARQL-PROTOCOL] [RDF-SPARQL-QUERY]. Using SPARQL queries, provenance can be selectively retrieved using combinations of filters and/or path queries.

7. IANA considerations

This document requests registration of new link relations, per section 6.2.1 of RFC 5988.

@@TODO The following templates should be completed and submitted to link-relations@ietf.org:

7.1 Registration template for link relation: "provenance"

Relation Name: provenance
Description: the resource identified by target IRI of the link provides provenance information about the entity identified by the context link
Reference: @@this spec, @@provenance-model-spec
Notes: ...
Application Data: ...

7.2 Registration template for link relation: "anchor"

The name "anchor" has been used for the link relation name, despite the corresponding URI being described as an entity-URI. This terminology has been chosen to align with usage in the description of the HTTP Link header field, per RFC 5988.

Relation Name: anchor
Description: when used in conjunction with a "provenance" link, the resource identified by target IRI of the link is an entity for which provenance information may be provided. This may be used, for example, to isolate relevant information from a referenced document that contains provenance information for several entities.
Reference: @@this spec, @@provenance-model-spec
Notes: ...
Application Data: ...

7.3 Registration template for link relation: "provenance-service"

Relation Name: provenance-service
Description: the resource identified by target URI of the link is a provenance service per section 4. Provenance services of this specification.
Reference: @@this spec, @@provenance-model-spec
Notes: ...
Application Data: ...

8. Security considerations

Provenance is central to establishing trust in data. If provenance information is corrupted, it may lead agents (human or software) to draw inappropriate and possibly harmful conclusions. Therefore, care is needed to ensure that the integrity of provenance data is maintained.

When using HTTP to access provenance information, or to determine a provenance URI, secure HTTP (https) should be used.

When retrieving a provenance URI from a document, steps should be taken to ensure the document itself is an accurate copy of the original whose author is being trusted (e.g. signature checking, or verifying its checksum against an author-provided secure web service).
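The two checks above might be sketched as follows. This is only an illustration under stated assumptions: the choice of SHA-256 is ours (this document does not name a digest algorithm), the expected digest is assumed to have been obtained separately from an author-provided secure service, and the function names are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import urlsplit


def is_https(uri):
    """True if the URI uses the https scheme."""
    return urlsplit(uri).scheme.lower() == "https"


def checksum_matches(document_bytes, expected_sha256_hex):
    """Compare the document's SHA-256 digest with the expected hex digest.

    SHA-256 is an assumption made for this sketch. The comparison uses
    hmac.compare_digest to avoid leaking timing information.
    """
    actual = hashlib.sha256(document_bytes).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.strip().lower())
```

A client would typically run `checksum_matches` on the retrieved document before extracting a provenance-URI from it, and then apply `is_https` to that provenance-URI before dereferencing it.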

@@TODO ... privacy, access control to provenance (note to self: discussed in Edinburgh linked data provenance workshop). In particular, note that the fact that a resource is openly accessible does not mean that its provenance information should also be.

A. Acknowledgements

The editors acknowledge the contribution and review from members of the provenance working group.

Many thanks to Robin Berjon for making our lives so much easier with his cool ReSpec tool.

B. Provenance service format examples

In section 4 (Provenance services), the provenance service description was represented as a JSON-formatted document. As noted, HTTP content negotiation may be enabled to retrieve the document in alternative formats. This appendix provides examples of the service description document represented using RDF Turtle, RDF/XML and plain XML syntaxes.

B.1 RDF Turtle example of service description

This example uses the RDF Turtle format [TURTLE], presented as MIME content-type text/turtle.

@prefix provds: <@@TBD@@#> .
<http://example.org/provenance_service/> a provds:Service_description ;
  provds:provenance_locations_template       "http://example.org/provenance_service/locations/?uri={uri}" ;
  provds:provenance_content_template     "http://example.org/provenance_service/provenance/?uri={uri}"
  .

The provenance URI templates are encoded in RDF as plain string literals, not as resource URIs.
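Because the templates are plain strings, a client has to substitute the entity-URI into them itself. The following is a minimal sketch of that step, assuming the only template variable is {uri} and that its value is percent-encoded in full before substitution. The function name is hypothetical.

```python
from urllib.parse import quote


def expand_template(template, entity_uri):
    """Substitute a fully percent-encoded entity-URI for {uri}.

    safe="" makes quote() encode reserved characters such as ':', '/',
    '?' and '=' so that the entity-URI survives as a single query value.
    """
    return template.replace("{uri}", quote(entity_uri, safe=""))


content_uri = expand_template(
    "http://example.org/provenance_service/provenance/?uri={uri}",
    "http://example.org/qdata/",
)
```

A general-purpose URI Template library would be the more robust choice if the service description used more than this single variable.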

@@TODO: finalize URIs in the above example.

B.2 RDF/XML example of service description

This is essentially the same as the Turtle example above, but encoded in RDF/XML [RDF-SYNTAX-GRAMMAR], and presented as MIME content-type application/rdf+xml.

<rdf:RDF
  xmlns:rdf    = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs   = "http://www.w3.org/2000/01/rdf-schema#"
  xmlns:provds = "@@TBD@@#"
>
  <provds:Service_description rdf:about="http://example.org/provenance_service/">
    <provds:provenance_locations_template>http://example.org/provenance_service/locations/?uri={uri}</provds:provenance_locations_template>
    <provds:provenance_content_template>http://example.org/provenance_service/provenance/?uri={uri}</provds:provenance_content_template>
  </provds:Service_description>
</rdf:RDF>

@@TODO: finalize URIs in the above example.

B.3 Plain XML example of service description

@@TODO: provide example and schema

B.4 RDF Turtle example of provenance locations

This example uses the RDF Turtle format [TURTLE], presented as MIME content type text/turtle.

@prefix prov: <@@TBD@@#> .
<http://example.org/qdata/> a prov:Entity ;
  prov:hasProvenance  <http://source1.example.org/provenance/qdata/> ;
  prov:hasProvenance  <http://source2.example.org/prov/qdata/> ;
  prov:hasProvenance  <http://source3.example.com/prov?id=qdata>
  .

NOTE: The namespace URI used here for the provenance properties is different from that used in the service description. It is anticipated that this namespace will be defined as part of the provenance model. If it is not, then a property name should be allocated in the provenance discovery service namespace.

@@TODO: revise to conform with Provenance Model vocabulary; review URIs

B.5 RDF/XML example of provenance locations

This is essentially the same as the Turtle example above, but encoded in RDF/XML [RDF-SYNTAX-GRAMMAR], and presented with MIME content type application/rdf+xml.

<rdf:RDF
  xmlns:rdf    = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs   = "http://www.w3.org/2000/01/rdf-schema#"
  xmlns:prov   = "@@TBD@@#"
>
  <prov:Entity rdf:about="http://example.org/qdata/">
    <prov:hasProvenance  rdf:resource="http://source1.example.org/provenance/qdata/" />
    <prov:hasProvenance  rdf:resource="http://source2.example.org/prov/qdata/" />
    <prov:hasProvenance  rdf:resource="http://source3.example.com/prov?id=qdata" />
  </prov:Entity>
</rdf:RDF>

@@TODO: revise to conform with Provenance Model vocabulary
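To show how a client might read such a document, the following is a minimal sketch that collects the provenance-URIs listed for a given entity. Since the prov namespace is still to be decided, the sketch uses a placeholder namespace URI of our own invention. It also assumes the document has exactly the shape of the example above, and the function name is hypothetical. A real client should use a full RDF parser, because the same graph can be serialized in RDF/XML in many other ways.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder only: the real prov namespace URI is not yet defined.
PROV_NS = "http://example.org/prov-placeholder#"


def provenance_uris(rdfxml_text, entity_uri):
    """List the hasProvenance targets given for entity_uri."""
    root = ET.fromstring(rdfxml_text)
    uris = []
    for node in root.iter("{%s}Entity" % PROV_NS):
        # Skip descriptions of other entities in the same document.
        if node.get("{%s}about" % RDF_NS) != entity_uri:
            continue
        for prop in node.findall("{%s}hasProvenance" % PROV_NS):
            target = prop.get("{%s}resource" % RDF_NS)
            if target:
                uris.append(target)
    return uris
```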

B.6 Plain XML example of provenance locations

@@TODO: provide example and schema

PROV-AQ: Provenance Access and Query

W3C Working Draft 12 December 2011
