Reworked the description of direct HTTP query, particularly the escaping of URI special characters and the provenance formats returned. Revised the description of return codes. (Section 4.2)
--- a/paq/prov-aq.html Tue Feb 26 16:10:55 2013 +0000
+++ b/paq/prov-aq.html Tue Feb 26 17:07:58 2013 +0000
@@ -901,23 +901,31 @@
This protocol combines the <a class="internalDFN">entity-URI</a> with a supplied URI template to formulate an HTTP GET request.
</p>
<p>
- Thus, if the URI template extrtacted from the service description is <code>http://example.com/provenance/service?target={+uri}</code> and the supplied entity-URI is <code>http://www.example.com/entity123</code>, the resulting HTTP request would be:
+ Thus, if the URI template extracted from the service description is <code>http://example.com/provenance/service?target={uri}</code> and the supplied entity-URI is <code>http://www.example.com/entity123</code>, the resulting HTTP request would be:
<pre class="example code">
-GET /provenance/service?<b>target</b>=http://www.example.com/entity123 HTTP/1.1
+GET /provenance/service?<b>target</b>=http%3A%2F%2Fwww.example.com%2Fentity123 HTTP/1.1
Host: example.com
</pre>
</p>
<p>
- The embedded entity-URI (<code>http://www.example.com/entity123</code>) identifies the resource for which provenance is to be returned. Any server that implements this protocol and receives a request URI in this form SHOULD return a provenance record for the resource-URI embedded in the query component, where that URI is the result of percent-decoding the value associated with the provenance-resource key. The embedded URI MUST be an absolute URI and the server MUST respond with a 400 Bad Request if it is not. If the supplied resource-URI includes a fragment identifier, the '#' MUST be %-encoded as <code>%23</code> when constructing the provenance-URI value; similarly, any '&' character in the resource-URI must be %-encoded as <code>%26</code> [[RFC3986]].
+ Any server that implements this protocol and receives a request URI in this form SHOULD return a provenance record for the entity-URI embedded in the query component, where that URI is the result of percent-decoding [[RFC3986]] the part of the request URI corresponding to <code>{uri}</code> in the URI template. For example, in the request above, the decoded entity-URI is <code>http://www.example.com/entity123</code>. The entity-URI MUST be an absolute URI, and the server SHOULD respond with <code>400 Bad Request</code> if it is not.
</p>
<p>
- If the provenance described by the request does not exist in the server, a suitable error response code SHOULD be returned. In the absence of any security of privacy concerns about the resource, that might be <code>404 Not Found</code>. But if the existence or non-existence of a resource is considered private, and authorization failure or other error response should be returned.
+ A server SHOULD NOT offer a template containing <code>{+uri}</code> or other non-simple variable expansion options [[URI-template]] unless none of the valid entity-URIs for which it can provide provenance contains problematic characters such as <code>'#'</code> or <code>'&'</code>.
+ </p>
+ <p class="note">
+ The defined URI template expansion process [[URI-template]] generally takes care of %-escaping characters that are not permitted in URIs. However, when expanding a template with <code>{+uri}</code>, some permitted characters such as <code>'#'</code> and <code>'&'</code> are not escaped. If the supplied entity-URI contains these characters, then they may disrupt interpretation of the resulting query URI. To prevent this, <code>'#'</code> and <code>'&'</code> characters in the entity-URI may be replaced with <code>%23</code> and <code>%26</code> respectively, before performing the URI template expansion. An alternative, simpler and more reliable approach is to use <code>{uri}</code> in the URI template string, which will cause all URI-reserved characters to be %-escaped as part of the URI-template expansion, as in the example above.
</p>
<p>
- The format of the returned provenance record is not defined here, but may be established through content type negotiation using <code>Accept:</code> header fields in the HTTP request. A provenance query service SHOULD be capable of returning RDF using the vocabulary defined by [[PROV-O]], in at least one of the standard RDF serializations (e.g. RDF/XML), or any other standard serialization of the Provenance Model specification [[PROV-DM]]. Services MUST identify the <code>Content-Type</code> of the information returned.
+ If the provenance described by the request does not exist on the server, a suitable error response code SHOULD be returned. In the absence of any security or privacy concerns about the resource, that might be <code>404 Not Found</code>. But if the existence or non-existence of a resource is considered private or sensitive, an authorization failure or other error response may be returned.
</p>
<p>
- Additional URI query parameters may be used as indicated by the service description in <a href="#provenance-query-service-description" class="sectionRef"></a>.
+ The direct HTTP query service may return provenance in any available format.
+ For interoperable provenance publication, use of the PROV-O vocabulary [[PROV-O]] represented in a standardized RDF format is recommended. Where alternative formats are available, selection may be made by content negotiation, using <code>Accept:</code> header fields in the HTTP request.
+ Services MUST identify the <code>Content-Type</code> of the provenance returned.
+ </p>
+ <p>
+ Additional URI query parameters may be used as indicated by the service description in <a href="#direct-http-query-service-description" class="sectionRef"></a>.
</p>
</section>
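Reviewer note: the difference between the <code>{uri}</code> and <code>{+uri}</code> expansions described in this change can be illustrated with a minimal Python sketch. It is not part of the specification text. The function names (expand_simple, expand_reserved, decode_entity_uri) are hypothetical, the expansion handles only a single variable named uri rather than the full URI template syntax, and treating every '%' as already escaped in the reserved expansion is a simplification.

```python
from urllib.parse import quote, unquote, urlsplit

# Character classes from RFC 3986 (letters and digits are always left alone by quote()).
UNRESERVED = "-._~"
RESERVED = ":/?#[]@!$&'()*+,;="


def expand_simple(template, uri):
    """Simple expansion {uri}: percent-encode everything except unreserved characters."""
    return template.replace("{uri}", quote(uri, safe=UNRESERVED))


def expand_reserved(template, uri):
    """Reserved expansion {+uri}: reserved characters such as '#' and '&' pass through.

    Simplification: '%' is always treated as part of an existing percent-escape.
    """
    return template.replace("{+uri}", quote(uri, safe=UNRESERVED + RESERVED + "%"))


def decode_entity_uri(raw_value):
    """Server side: percent-decode the raw query value.

    Returns None when the result has no scheme, i.e. is not an absolute URI;
    a server might answer such a request with 400 Bad Request.
    """
    entity_uri = unquote(raw_value)
    return entity_uri if urlsplit(entity_uri).scheme else None


if __name__ == "__main__":
    simple = "http://example.com/provenance/service?target={uri}"
    reserved = "http://example.com/provenance/service?target={+uri}"
    print(expand_simple(simple, "http://www.example.com/entity123"))
    # With {+uri}, the '#' survives and would start a fragment in the request URI.
    print(expand_reserved(reserved, "http://www.example.com/doc#sec"))
```

With the <code>{uri}</code> template, the first call reproduces the request target shown in the revised example, and passing the raw query value through decode_entity_uri recovers the original entity-URI.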