This document describes the use of existing mechanisms for accessing and querying provenance data about resources on the web.

Introduction

@@TODO Introductory text

Concepts

In defining the specification below, we make use of the following concepts.

Provenance information
refers to provenance represented in some fashion.
Provenance-URI
a URI denoting some provenance information.
Context
an entity, or aspect of a resource, about which one wishes to present some provenance information.
Context-URI
a URI denoting a context, which allows that context to be isolated in some provenance information (see "Provenance, context and resources" for discussion).
Provenance service
a service that provides a provenance-URI or provenance information given a resource URI or a context-URI.
Service-URI
the URI of a provenance service.
Resource
also referred to as web resource: a resource as described by the Architecture of the World Wide Web [[WEBARCH]], section 2.2. A resource may be associated with multiple contexts (see "Provenance, context and resources" for discussion).

The terms context and context-URI are chosen to align with terminology used in describing the HTTP link header (http://tools.ietf.org/html/rfc5988#section-5.2) - does this terminology work in the current (ahem) context? See also next section.

Provenance, context and resources

This section has been drafted to address a number of concerns: (a) to avoid previous use of "Target" for the topic of a provenance assertion (cf. http://www.w3.org/2011/prov/track/issues/74), and (b) to clarify the use of different resources as views on a dynamic or variable subject of provenance.

Fundamentally, provenance information is about resources. In general, resources may vary over time and context: e.g., a resource describing the weather in London changes from day-to-day, or one listing restaurants near you will vary depending on your location. Provenance information, to be useful, must be persistent and not itself dependent on context. Yet we may still want to make provenance assertions about dynamic or context-varying web resources (e.g. the weather forecast for London on a particular day may have been derived from a particular set of Meteorological Office data).

Provenance descriptions of dynamic and context-dependent resources are possible through the notion of contexts. A context is simply a web resource that is a contextualized view or instance of an original web resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification through its lifetime. Separate URIs for each individual revision would then be context-URIs, denoting the specification at a particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor.

In summary, a key notion within the concepts outlined above is that provenance information may not be universally applicable to a resource, but may be described with respect to a restricted view of that resource (e.g. the resource at a particular time). This restricted view is termed a context, and a context-URI allows one to refer to that context within the provenance information. The context-URI used to describe this restricted view of a resource is also related to the resource itself, and requests for provenance about that resource may return provenance information that uses one or more context-URIs to refer to it. Some given provenance information may use multiple context-URIs if there are provenance assertions referring to the same underlying resource in different contexts. For example, a provenance resource describing a W3C document might include information about all revisions of the document using statements that use the different context-URIs for the revisions.

Accessing provenance information

A general expectation is that web applications may access provenance information in the same way as any web resource, by dereferencing its URI. Typically, this will be by performing an HTTP GET operation. Thus, any provenance information may be associated with a provenance-URI, and may be accessed by dereferencing that URI using normal web mechanisms.

This specification thus recommends that if a publisher wishes to make provenance information available, it is published as a normal web resource, and provision is made for the provenance-URI to be discoverable using one or more of the mechanisms described in "Locating provenance information".

This presumption of using web retrieval to access provenance information does not preclude use of other mechanisms. In particular, alternative mechanisms may be needed if there is no URI associated with some particular provenance information. Possible mechanisms are suggested in "Provenance services" and "Querying provenance information".

Locating provenance information

On the presumption that provenance information is a resource that can be accessed using normal web retrieval, one needs to know a provenance-URI to dereference. The provenance-URI may be known in advance, in which case there is nothing more to specify. If a provenance-URI is not known, then a mechanism to discover one must be based on information that is available to the would-be accessor. We also wish to allow that provenance information could be provided by parties other than the provider of the original resource. Indeed, provenance information for a resource may be provided by several different parties, at different URIs, each with different concerns. It is quite possible that different parties may provide contradictory provenance information.

Once provenance information is retrieved, one needs to know how to identify the view of the resource within that provenance information. This view is known as the context and is identified by a context-URI.

We start by considering mechanisms for the resource provider to indicate a provenance-URI along with a context-URI. (Mechanisms that can be independent of the resource provision are discussed in "Provenance services".) Three mechanisms are described here: one for a resource accessed by HTTP, one for a resource presented as HTML, and one for a resource presented as RDF.

These particular cases are selected as corresponding to the primary current web protocol and data formats. Finally, in "Arbitrary data", we discuss the case of a resource in an unspecified format which has been provided by some means other than HTTP.

The mechanisms specified for use with HTTP and HTML are similar to those proposed by POWDER [[POWDER-DR]] (sections 4.1.1 and 4.1.3).

Resource accessed by HTTP

The link relation indicating a context-URI has been called "anchor" (as opposed to, say, "context"), following terminology used for the HTTP Link header field.

Pick one or allow either of the following options for indicating the context-URI?: I am inclined to specify separate "provenance" and "anchor" relation types, as that approach will be more consistent with the HTML and RDF options described below. If we do this, does it make sense to retain "anchor" as the link relation type?

For a document accessible using HTTP, provenance information may be indicated using an HTTP Link header field, as defined by Web Linking (RFC 5988) [[LINK-REL]]. The Link header field is included in the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here). Two new link relation types for referencing provenance information are registered according to the templates in "IANA considerations", and may be used as shown:

              Link: <provenance-URI>; rel="provenance"
              Link: <context-URI>; rel="anchor"
or
              Link: <provenance-URI>; rel="provenance"; anchor="context-URI"
When used in conjunction with an HTTP success response code (2xx), this HTTP header indicates that provenance-URI is the URI of some provenance information associated with the requested resource and that the associated context is identified as context-URI.

[Yogesh]: I believe there is no guarantee that the provenance-URI will provide provenance information about the context-URI. Suggest we use *should* rather than (implicitly) *must* to state that the returned provenance-uri should have provenance information about the resource view identified by the context-uri.

I think I see your point, but I am concerned that making that possibility explicit here might be confusing for a reader. I wonder if this would be better served by a new sub-section in sect 2 about interpreting provenance information?

If no anchor link is provided then the context-URI is assumed to be the URI of the resource.
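
As an illustration only, a client might extract these links from the Link header field values of a response roughly as follows. This is a minimal sketch, not a complete RFC 5988 parser: it assumes the RFC 5988 syntax in which each target URI is enclosed in angle brackets, it does not handle commas inside quoted parameter values, and the function name provenance_links is hypothetical.

```python
import re

# One link-value: <URI-Reference> followed by its ";param" list (up to the next comma).
_LINK = re.compile(r'<([^>]*)>([^,]*)')
# One link parameter, with or without quotes around the value.
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";]*)"?')

def provenance_links(link_headers, resource_uri):
    """Return (provenance_uris, context_uris) found in Link header field values."""
    provenance, contexts = [], []
    for header in link_headers:
        for target, params in _LINK.findall(header):
            attrs = {name.lower(): value for name, value in _PARAM.findall(params)}
            rels = attrs.get("rel", "").split()
            if "provenance" in rels:
                provenance.append(target)
                # Second form: context-URI carried in the anchor parameter.
                if "anchor" in attrs:
                    contexts.append(attrs["anchor"])
            if "anchor" in rels:
                # First form: context-URI carried by a separate "anchor" link.
                contexts.append(target)
    if not contexts:
        # No anchor given: the context-URI defaults to the resource URI.
        contexts.append(resource_uri)
    return provenance, contexts
```

The caller passes every Link field value from the response together with the URI that was requested, so that the default context-URI can be applied when no anchor is present.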

At this time, the meaning of these links returned with other HTTP response codes is not defined: future revisions of this specification may define interpretations for these.

An HTTP response MAY include multiple provenance link headers, indicating a number of different provenance resources that are known to the responding server, each providing provenance information about the accessed resource. Likewise, an HTTP response MAY include multiple anchor link headers, that indicate the resource may have provenance information associated with all of the indicated context-URIs.

The presence of a provenance link in an HTTP response does not preclude the possibility that other publishers may offer provenance information about the same resource. In such cases, discovery of the additional provenance information must use other means (e.g. see "Provenance services").

Are the provenance resources indicated in this way to be considered authoritative? I.e. if the client trusts information returned by the server (e.g. is prepared to act on inferences based on the returned data), should it also trust the provenance data, or should trust in the linked provenance data be determined separately? If the linked data is to be trusted, then the data from multiple linked provenance resources MUST be consistent if it is to be meaningful. I favour an approach whereby trust in the provenance resources is established independently, which is similar to the situation for any other resource; e.g. based on the domain that serves it, or an associated digital signature.

Resource presented as HTML

Addresses ISSUE 46 by adding "anchor" link-relation.

For a document presented as HTML or XHTML, without regard for how it has been obtained, provenance information may be associated with a resource by adding a <link> element to the HTML <head> section. Two new link relation types for referencing provenance information are registered according to the templates in "IANA considerations", and may be used as shown:

  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <link rel="provenance" href="provenance-URI">
        <link rel="anchor" href="context-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
        ...
     </body>
  </html>
            

The provenance link element gives the provenance-URI for the document.

The context-URI given by the anchor link element specifies an identifier for the presented document view, which may be used within the provenance information when referring to this document.

An HTML document header MAY include multiple "provenance" link elements, indicating a number of different provenance resources that are known to the creator of the document, each of which may provide provenance information about the document.

Likewise, the header MAY include multiple "anchor" link elements indicating that, e.g., different revisions of the document can be identified in the provenance information using the different context-URIs.

If no "anchor" link element is provided then the context-URI is assumed to be the URI of the document. It is RECOMMENDED that this only be done when the document is static.
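
The following sketch suggests how a client might collect these link elements from an HTML document using only the Python standard library. The class and function names are hypothetical, and for brevity the sketch does not check that the link elements appear inside the <head> section.

```python
from html.parser import HTMLParser

class ProvenanceLinkParser(HTMLParser):
    """Collect href values of <link> elements with rel "provenance" or "anchor"."""

    def __init__(self):
        super().__init__()
        self.links = {"provenance": [], "anchor": []}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel may hold several space-separated relation types.
        for rel in (attr.get("rel") or "").lower().split():
            if rel in self.links and href:
                self.links[rel].append(href)

def html_provenance_links(html_text, document_uri):
    """Return the provenance-URIs and context-URIs declared by a document."""
    parser = ProvenanceLinkParser()
    parser.feed(html_text)
    links = parser.links
    if not links["anchor"]:
        # No "anchor" link: the context-URI defaults to the document URI.
        links["anchor"] = [document_uri]
    return links
```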

Proposing to remove the following Note:

See Appendix A. Notes on Using the Link Header with the HTML4 Format of RFC 5988 for further notes about using link relation types in HTML.

Specifying Provenance Services

This is a new proposal. It needs to be checked as to whether it is useful. GK/PG to review nature of provenance-service-URI.

- Any reason why the provenance service URI relation has not been added to the HTTP Web Linking section as a new relation type? Is it just to finish discussions about the relation before migrating its use to HTTP Web Linking?

This is a new section, pending wider review. It's a fairly radical change from what I did before, so I guess I was waiting to see if people were happy with the general approach, before fully integrating it.

The document creator may specify that the provenance information about the document is provided by a provenance service. This is done through the use of a third link relation type following the same pattern as above:

  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <link rel="provenance-service" href="service-URI">
        <link rel="anchor" href="context-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
        ...
     </body>
  </html>
              

The provenance-service link element identifies the service-URI. Dereferencing this URI yields a service description that provides further information to enable a client to determine a provenance-URI for a context; see "Provenance services" for more details. There may be multiple provenance-service link elements, and these MAY appear in the same document as anchor and provenance link elements (though, in simple cases, we anticipate that provenance and provenance-service link relations would not both be used).

An alternative option would be to use an HTML <meta> element to present provenance links. The <link> element is preferred as it reflects more closely the intended goal, and has been defined with somewhat consistent applicability across HTTP, HTML and potentially RDF data. A specification to use <meta> for this would miss this opportunity to build on the existing specification and registry.

Resource presented as RDF

If a resource is presented as RDF (in any of its recognized syntaxes, including RDFa), it may contain references to its own provenance using additional RDF statements.

For this purpose a new RDF property, prov:hasProvenance, is defined as a relation between two resources, where the object of the property is a resource that provides provenance information about the subject resource. Multiple prov:hasProvenance assertions may be made about a subject resource.

Another new RDF property, prov:hasContext, is defined to allow the RDF content to specify one or more context-URIs of the RDF document for the purpose of provenance information (similar to the use of the "anchor" link relation in HTML).

@@TODO: needs to be completed.

Discussion:

The containing RDF resource is the subject. For RDF documents, this is sometimes written as an empty URI-reference; e.g.
<rdf:Description rdf:about="">
  <prov:hasProvenance rdf:resource="(provenance_URI)"/>
</rdf:Description>
(If publishing the RDF in a named graph, then use the URI of the graph.)

@@TODO: example

@@TODO: document namespace. Check naming style. Use provenance model namespace? Define as part of model?

Arbitrary data

We have so far decided not to try and define a common mechanism for arbitrary data, because it's not clear to us what the correct choice would be. Is this a reasonable position, or is there a real need for a generic solution for provenance discovery for arbitrary, non-web-accessible data objects?

If a resource is presented using a data format other than HTML or RDF, and no URI for the resource is known, provenance discovery becomes trickier to achieve. This specification does not define a specific mechanism for such arbitrary resources, but this section discusses some of the options that might be considered.

For formats which have provision for including metadata within the file (e.g. JPEG images, PDF documents, etc.), use the format-specific metadata to include a context-URI, provenance-URI and/or service-URI. Format-specific metadata provision might also be used to include provenance information directly in the resource.

Use a generic packaging format that can combine an arbitrary data file with a separate metadata file in a known format, such as RDF. At this time, it is not clear what format that should be, but some possible candidates are:

Provenance services

Propose simple HTTP interface for discovery. cf ISSUE 53. This should be properly RESTful, per http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven. Have I properly interpreted the principles indicated here?

This section describes a REST API [[REST-APIs]] for a provenance service with facilities for discovery and/or retrieval of provenance information, which can be implemented independently of the original resource delivery channels (e.g. by a third party service).

All service implementations must respond with a service description (see "Service description") when the service URI is dereferenced. Service implementations may provide discovery, retrieval, or both of these services, indicated by the presence of the corresponding service URI templates in the service description. Which of these services to provide is a choice for individual service implementations.

On the Web, the normal mechanism for retrieving information is to associate it with a URI, and dereference the URI using normal retrieval mechanisms. This approach is enabled using the provenance discovery service mechanism: given the URI of some resource for which provenance information is required, the service returns one or more URIs from which provenance information may be obtained. This approach may be preferred when the provenance service cannot specify the form of URIs used for identifying provenance information, or when there may be more than one source of provenance information known to the provenance service.

The provenance retrieval service returns provenance information directly. This mechanism may be preferred when the provenance information is not already presented directly to the web, or is stored in a database with a complex query protocol, or when the provenance service can control the URI from which provenance information is served and avoid the intermediate step of URI discovery.

Using the provenance service API

This section describes general procedures for using the provenance service API. Later sections describe the resources presented by the API, and their representations using JSON. The appendix "Provenance service format examples" gives examples of alternative representations. Normal HTTP content negotiation mechanisms may be used to retrieve representation formats convenient for the client application.

Retrieve Provenance-URIs for a resource

To use the provenance service to retrieve a list of provenance-URIs for a resource, starting with the service URI (service-URI) and the URI of the resource or context (context-URI):

  1. Dereference service-URI to obtain a representation of the service description.
  2. Extract the provenance locations template from the service description.
  3. Use the provenance locations template with context-URI for template variable uri to form provenance-locations-URI.
  4. Dereference provenance-locations-URI to obtain a provenance locations resource in one of the formats described below.

Any or all of the URIs in the returned provenance locations may be used to retrieve provenance information, per "Accessing provenance information".
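
The four steps above might be sketched as follows. This is an illustration under stated assumptions rather than a reference client: it assumes the JSON representations and key names used in the examples later in this document (provenance_locations_template and provenance), it handles only the simple {uri} template variable by string substitution instead of using a full URI-template processor, and the function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

def fetch_json(uri):
    """Dereference a URI, asking for a JSON representation."""
    request = Request(uri, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)

def discover_provenance_uris(service_uri, context_uri, fetch=fetch_json):
    """Return the provenance-URIs that a provenance service lists for context_uri."""
    # Step 1: dereference the service-URI to obtain the service description.
    description = fetch(service_uri)
    # Step 2: extract the provenance locations template.
    template = description["provenance_locations_template"]
    # Step 3: substitute the %-escaped context-URI for the template variable uri.
    locations_uri = template.replace("{uri}", quote(context_uri, safe=""))
    # Step 4: dereference the resulting URI to obtain the provenance locations.
    locations = fetch(locations_uri)
    return locations["provenance"]
```

The fetch parameter is there only so that the procedure can be exercised without network access, for example by passing a function that returns canned service descriptions.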

Retrieve Provenance information for a resource

To use the provenance service to directly retrieve provenance information for a resource, starting with the service URI (service-URI) and the URI of the resource or context (context-URI):

  1. Dereference service-URI to obtain a representation of the service description.
  2. Extract the provenance information template from the service description.
  3. Use the provenance information template with context-URI for template variable uri to form provenance-URI.
  4. Dereference provenance-URI to obtain provenance information as described by the Provenance Model specification [[PROV-MODEL]].

Resources presented and representations used

Service description

A provenance service description describes the provenance discovery and retrieval service and, in particular, provides URI templates [[URI-template]] for URIs to access provenance locations resources and/or provenance information. Dereferencing the service URI returns a representation of this service description. The service description MAY contain additional metadata about the service beyond that described here: API clients are expected to ignore any metadata elements they do not understand.

This example shows a provenance service description using JSON format [[RFC4627]], which is presented as MIME content-type application/json. Other examples may be seen in the appendix "Provenance service format examples".

              {
                "provenance_service_uri":         "http://example.info/provenance_service/",
                "provenance_locations_template":  "http://example.info/provenance_service/locations/?uri={uri}",
                "provenance_content_template":    "http://example.info/provenance_service/provenance/?uri={uri}"
              }
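
As a rough sketch of how a client could interpret such a description, the following reads the two template keys shown in the example above and ignores anything else. A missing template is taken to mean that the corresponding service is not offered. The function name service_capabilities is hypothetical.

```python
import json

def service_capabilities(description_text):
    """Map each optional service to its URI template, or None if not offered."""
    description = json.loads(description_text)
    # Only the keys understood by this client are read; any additional
    # metadata elements in the description are ignored.
    return {
        "discovery": description.get("provenance_locations_template"),
        "retrieval": description.get("provenance_content_template"),
    }
```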
            

Is there any point in including the provenance service URI here? It has been included for consistency with RDF representations, but is functionally redundant.

Provenance locations

A provenance locations resource enumerates one or more provenance-URIs identifying provenance information associated with a given resource.

The examples below and in the appendix "Provenance service format examples" are for a given resource URI http://example.info/qdata/. Using the service description example above, the URI of the provenance locations resource would be http://example.info/provenance_service/locations/?uri=http%3A%2F%2Fexample.info%2Fqdata%2F.

This example uses JSON format [[RFC4627]], presented as MIME content type application/json. Other examples may be seen in the appendix "Provenance service format examples".

              {
                "uri": "http://example.info/qdata/",
                "provenance": [
                  "http://source1.example.info/provenance/qdata/",
                  "http://source2.example.info/prov/qdata/",
                  "http://source3.example.com/prov?id=qdata"
                ]
              }
            

The template might use ?uri={+uri} rather than just ?uri={uri}, and thereby avoid %-escaping the : and / characters in the given URI, but this could cause difficulties for URIs containing query parameters and/or fragment identifiers. In this case, the client application would need to ensure that any such characters were %-escaped before being passed into a URI-template expansion processor.
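
The difference between the two template forms can be illustrated with a small sketch. It approximates the two expansions with urllib.parse.quote rather than a real URI-template processor (in particular, it always escapes a literal % character), and the resource URI with a query and fragment is a made-up example.

```python
from urllib.parse import quote, urlsplit

# RFC 3986 reserved characters, which {+uri} expansion leaves unescaped.
RESERVED = ":/?#[]@!$&'()*+,;="

def expand_simple(value):
    """Approximate {uri}: escape everything except unreserved characters."""
    return quote(value, safe="")

def expand_reserved(value):
    """Approximate {+uri}: also leave reserved characters unescaped."""
    return quote(value, safe=RESERVED)

BASE = "http://example.info/provenance_service/locations/?uri="
# Hypothetical resource URI with a query and fragment, for illustration only.
target = "http://example.info/qdata/?rev=2#summary"

simple_uri = BASE + expand_simple(target)
reserved_uri = BASE + expand_reserved(target)

# With {uri}, the whole target URI stays inside the query component.
print(urlsplit(simple_uri).fragment)    # empty string
# With {+uri}, the target's fragment becomes the fragment of the service
# URI itself, so it would not be sent to the provenance service.
print(urlsplit(reserved_uri).fragment)  # summary
```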

Provenance information

Provenance information about a resource or resources may be returned in any format. It is recommended that the format be one defined by the Provenance Model specification [[PROV-MODEL]].

Assuming a given resource URI http://example.info/qdata/, and using the service description example above, the provenance URI would be http://example.info/provenance_service/provenance/?uri=http%3A%2F%2Fexample.info%2Fqdata%2F.

Querying provenance information

This section proposes use of SPARQL queries to address requirements that are not covered by the simple retrieval and discovery services proposed above.

There are circumstances where simply identifying and retrieving provenance information as a web resource may not best fit the requirements of a particular application or service, e.g.:

For such circumstances, a provenance query service provides an alternative way to access provenance information and/or Provenance-URIs.

We assume that the requesting application has the URI of a provenance query service, and some information about the resource for which provenance information is required that can be used as the basis for a query. A query service is potentially a very general capability that can, in principle, subsume the provenance discovery service described in "Provenance services", but which may be more complex to deploy and use for simple provenance discovery cases.

The details of a provenance query service are an implementation choice, to be agreed between provider and users of the service, but for ease of interoperability between different providers and users we recommend use of SPARQL [[RDF-SPARQL-PROTOCOL]] [[RDF-SPARQL-QUERY]]. The query service URI would then be the URI of a SPARQL endpoint (or, to use the SPARQL specification language, a SPARQL protocol service). A query service can potentially be used in many different ways, limited only by the available information and capabilities of the SPARQL query language; the following subsections provide examples for what are considered to be some plausible common scenarios.

Find provenance-URI given context-URI of resource

If the requester has a context-URI for the original resource, they might issue a simple SPARQL query for the URI(s) of any associated provenance information; e.g., if the original resource has a context-URI http://example.org/resource:

              PREFIX prov: <@@TBD>
              SELECT ?provenance_uri WHERE
              {
                <http://example.org/resource> prov:hasProvenance ?provenance_uri
              }
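
A query of this kind could be sent to a SPARQL endpoint using the SPARQL protocol's query parameter on an HTTP GET request. The sketch below only builds the query text and the request URI; it does not contact any service. The endpoint URI, the namespace URI standing in for the undetermined prov: namespace, and the function names are all placeholders chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def provenance_query(context_uri, prov_namespace):
    """Build the SELECT query for provenance-URIs of a given context-URI."""
    # context_uri is inserted verbatim, so it must be a well-formed URI
    # containing no ">" character.
    return (
        f"PREFIX prov: <{prov_namespace}>\n"
        "SELECT ?provenance_uri WHERE\n"
        "{\n"
        f"  <{context_uri}> prov:hasProvenance ?provenance_uri\n"
        "}"
    )

def sparql_get_uri(endpoint_uri, query):
    """Form the URI for a SPARQL protocol query sent by HTTP GET."""
    return endpoint_uri + "?" + urlencode({"query": query})

# Placeholder values: neither the namespace nor the endpoint is defined here.
query = provenance_query("http://example.org/resource", "http://example.org/prov#")
request_uri = sparql_get_uri("http://example.org/sparql", query)

# The query text survives URL-encoding unchanged.
assert parse_qs(urlsplit(request_uri).query)["query"] == [query]
```

A client would then dereference request_uri with an Accept header naming a SPARQL results format, such as application/sparql-results+json, and read the ?provenance_uri bindings from the response.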
            

@@TODO: specific provenance namespace and property to be determined by the model specification?

Find Provenance-URI given identifying information about a resource

If the requester has identifying information that is not the URI of the original resource, then they will need to construct a more elaborate query to locate a context resource and obtain its provenance-URI(s). The nature of identifying information that can be used in this way will depend upon the third party service used, further definition of which is out of scope for this specification. For example, a query for a document identified by a DOI, say 1234.5678, using the PRISM vocabulary [[PRISM]] recommended by FaBio [[FABIO]], might look like this:

              PREFIX prov: <@@TBD>
              PREFIX prism: <http://prismstandard.org/namespaces/basic/2.0/>
              SELECT ?provenance_uri WHERE
              {
                [ prism:doi "1234.5678" ] prov:hasProvenance ?provenance_uri
              }
            

@@TODO: specific provenance namespace and property to be determined by the model specification?

Obtain provenance information directly given context-URI of a resource

This scenario retrieves provenance information directly given the URI of a resource, and may be useful where the provenance information has not been assigned a specific URI, or when the calling application is interested only in specific elements of provenance information.

If the original resource has a context-URI http://example.org/resource, a SPARQL query for provenance information might look like this:

              PREFIX prov: <@@TBD>
              CONSTRUCT
              {
                <http://example.org/resource> ?p ?v
              }
              WHERE
              {
                <http://example.org/resource> ?p ?v
              }
            
This query essentially extracts all properties and values available from the query service that are directly about the specified context resource, and returns them as an RDF graph. This may be fine if the service contains only provenance information about the indicated resource, or if the non-provenance information is also of interest. A more complex query using specific provenance vocabulary terms may be needed to selectively retrieve just provenance information when other kinds of information are also available.

@@TODO: specific provenance namespace and property to be determined by the model specification? The above query pattern assumes provenance information is included in direct properties about the context resource. When an RDF provenance vocabulary is formulated, this may well turn out to not be the case. A better example would probably be one that retrieves specific provenance information when the vocabulary terms have been defined.

Provenance service discovery

(How to discover provenance services. There is nothing particular about provenance in this respect, and this section will discuss some of the available options without adding any new normative specification.)

@@TODO

IANA considerations

This document requests registration of new link relations, per section 6.2.1 of RFC 5988. @@TODO At an appropriate time (??), the following templates should be submitted to link-relations@ietf.org:

Registration template for link relation: "provenance"

Relation Name:
provenance
Description:
the resource identified by target IRI of the link provides provenance information about the resource identified by the context link
Reference:
@@this spec, @@provenance-model-spec
Notes:
...
Application Data:
...

Registration template for link relation: "anchor"

The name "anchor" has been used for the link relation name, despite the corresponding URI being described as a context-URI. This terminology has been chosen to align with usage in the description of the HTTP Link header field, per RFC 5988.

Relation Name:
anchor
Description:
the resource identified by target URI of the link is a context-URI for which provenance information may be provided. This may be used, for example, to isolate relevant information from a referenced document that contains provenance information for several contexts.
Reference:
@@this spec, @@provenance-model-spec
Notes:
...
Application Data:
...

Security considerations

Provenance is central to establishing trust in data. If provenance information is corrupted, it may lead agents (human or software) to draw inappropriate and possibly harmful conclusions. Therefore, care is needed to ensure that the integrity of provenance data is maintained.

When using HTTP to access provenance information, or to determine a provenance URI, secure HTTP (https) SHOULD be used.

When retrieving a provenance URI from a document, steps SHOULD be taken to ensure the document itself is an accurate copy of the original whose author is being trusted (e.g. signature checking, or verifying its checksum against an author-provided secure web service).
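
As one possible illustration of the checksum option, a client could compare a digest of the retrieved document with a value obtained from the author over a secure channel. This specification does not name a digest algorithm; the use of SHA-256 and the function name below are assumptions made for the sketch.

```python
import hashlib
import hmac

def matches_published_digest(document_bytes, published_hex_digest):
    """Check retrieved document bytes against an author-published SHA-256 digest."""
    actual = hashlib.sha256(document_bytes).hexdigest()
    # compare_digest avoids leaking, through timing, where the values differ.
    return hmac.compare_digest(actual, published_hex_digest.lower())
```

Only if the check succeeds would the client go on to extract and dereference a provenance-URI from the document.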

@@TODO ... privacy, access control to provenance (from Edinburgh meeting). In particular, note that the fact that a resource is openly accessible does not mean that its provenance information should also be.

@@TODO ... more, probably

Acknowledgements

The editors acknowledge the contribution and review from members of the provenance working group, and in particular detailed reviews and suggestions for improvement provided by Yogesh Simmhan, Olaf Hartig, ...

Many thanks to Robin Berjon for making our lives so much easier with his cool ReSpec tool.

Provenance service format examples

In "Service description", the provenance service description was presented as a JSON-formatted document. As noted, HTTP content negotiation MAY be enabled to retrieve the document in alternative formats. This appendix provides examples of service description documents presented using RDF Turtle and RDF/XML syntaxes, and plain XML.

RDF Turtle example of service description

This example uses the RDF Turtle format [[TURTLE]], presented as MIME content-type text/turtle.

            @prefix provds: <http://www.w3.org/2011/provenance_discovery/@@TBD@@#> .
            <http://example.info/provenance_service/> a provds:Service_description ;
              provds:provenance_locations_template       "http://example.info/provenance_service/locations/?uri={uri}" ;
              provds:provenance_content_template     "http://example.info/provenance_service/provenance/?uri={uri}"
              .
          

The provenance URI templates are encoded in RDF as plain string literals, not as resource URIs.

RDF/XML example of service description

This is essentially the same as the Turtle example above, but encoded in RDF/XML [[RDF-SYNTAX-GRAMMAR]], and presented as MIME content-type application/rdf+xml.

            <rdf:RDF
              xmlns:rdf    = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:rdfs   = "http://www.w3.org/2000/01/rdf-schema#"
              xmlns:provds = "http://www.w3.org/2011/provenance_discovery/@@TBD@@#"
            >
              <provds:Service_description rdf:about="http://example.info/provenance_service/">
                <provds:provenance_locations_template>http://example.info/provenance_service/locations/?uri={uri}</provds:provenance_locations_template>
                <provds:provenance_content_template>http://example.info/provenance_service/provenance/?uri={uri}</provds:provenance_content_template>
              </provds:Service_description>
            </rdf:RDF>
          

Plain XML example of service description

@@TODO: provide example and schema

RDF Turtle example of provenance locations

This example uses the RDF Turtle format [[TURTLE]], presented as MIME content type text/turtle.

            @prefix prov: <http://www.w3.org/2011/provenance/@@TBD@@#> .
            <http://example.info/qdata/> a prov:Entity ;
              prov:hasProvenance  <http://source1.example.info/provenance/qdata/> ;
              prov:hasProvenance  <http://source2.example.info/prov/qdata/> ;
              prov:hasProvenance  <http://source3.example.com/prov?id=qdata>
              .
          

NOTE: The namespace URI used here for the provenance properties is different from that used in the service description. I am anticipating that it will be defined as part of the provenance model. If it is not defined as part of the provenance model, then a property name should be allocated in the provenance discovery service namespace.

@@TODO: revise to conform with Provenance Model vocabulary

RDF/XML example of provenance locations

This is essentially the same as the Turtle example above, but encoded in RDF/XML [[RDF-SYNTAX-GRAMMAR]], and presented with MIME content type application/rdf+xml.

            <rdf:RDF
              xmlns:rdf    = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:rdfs   = "http://www.w3.org/2000/01/rdf-schema#"
              xmlns:prov   = "http://www.w3.org/2011/provenance/@@TBD@@#"
            >
              <prov:Entity rdf:about="http://example.info/qdata/">
                <prov:hasProvenance  rdf:resource="http://source1.example.info/provenance/qdata/" />
                <prov:hasProvenance  rdf:resource="http://source2.example.info/prov/qdata/" />
                <prov:hasProvenance  rdf:resource="http://source3.example.com/prov?id=qdata" />
              </prov:Entity>
            </rdf:RDF>
          

@@TODO: revise to conform with Provenance Model vocabulary

Plain XML example of provenance locations

@@TODO: provide example and schema

Motivating scenario

I propose to replace this appendix with text based on Yogesh's walk-through of the scenario, renaming to be something like "Motivating scenario and examples"

This scenario was selected by the provenance working group as a touchstone for evaluating any provenance access proposal. This appendix evaluates the foregoing proposals against the requirements implied by that scenario.

Gap analysis

There are clearly a number of capabilities needed for a provenance-aware application that are not covered by the mechanisms described above. But most of these amount to implementation details and decisions for a particular application, and as such are beyond the scope of this document to specify.

One feature not covered above that might be a candidate for specification is a common format for a data package that combines original content along with provenance-related metadata or data. At this stage, it is not clear what format that might take, but some possible candidates are discussed in "Arbitrary data". In any case, it seems to me that a specification that is specific to provenance to the exclusion of other metadata is unlikely to obtain traction, as provenance is just part of a wider landscape of information quality, trust, preservation and more.