added a lot more explanations on the protocol functioning and justification. Beautified the example code. The last SPARQL query assumes we have moved to merging rsa into cert ontology. Image improvements as suggested in weekly meeting. bblfish
author Henry Story <henry.story@bblfish.net>
Tue, 22 Nov 2011 21:59:07 +0100
branch bblfish
changeset 197 a1b91a2f2d76
parent 196 b875b3bb617e
child 198 49ee69e74cf9
spec/img/WebIDSequence-friendly.graffle
spec/img/WebIDSequence-friendly.jpg
spec/index-respec.html
Binary file spec/img/WebIDSequence-friendly.graffle has changed
Binary file spec/img/WebIDSequence-friendly.jpg has changed
--- a/spec/index-respec.html	Mon Nov 21 11:58:21 2011 -0500
+++ b/spec/index-respec.html	Tue Nov 22 21:59:07 2011 +0100
@@ -47,6 +47,7 @@
           apply:  function(c) {
                     // extend the bibliography entries
                     berjon.biblio["RFC5246"] = "T. Dierks; E. Rescorla. <a href=\"http://tools.ietf.org/html/rfc5246\"><cite>The Transport Layer Security (TLS) Protocol Version 1.2</cite></a> August 2008. Internet RFC 5246. URL: <a href=\"http://tools.ietf.org/html/rfc5246\">http://tools.ietf.org/html/rfc5246</a> ";
+                    berjon.biblio["RFC5746"] = "E. Rescorla, M. Ray, S. Dispensa, N. Oskov. <a href=\"http://tools.ietf.org/html/rfc5746\"><cite>Transport Layer Security (TLS) Renegotiation Indication Extension</cite></a> February 2010. Internet RFC 5746. URL: <a href=\"http://tools.ietf.org/html/rfc5746\">http://tools.ietf.org/html/rfc5746</a> ";
 
                     // process the document before anything else is done
                     var refs = document.querySelectorAll('adef') ;
@@ -439,7 +440,7 @@
 <code>Subject Alternative Name</code> extension with at least one URI entry identifying the <tref>Subject</tref>. 
 This URI MUST be one of the URIs with a dereferenceable secure scheme, such as https:// .   Dereferencing this URI should return a representation containing RDF data.
 For example, a certificate identifying the WebID URI <code>http://bob.example/profile#me</code> would contain the following:
-<pre>
+<pre class="example">
 X509v3 extensions:
    ...
    X509v3 Subject Alternative Name:
@@ -520,8 +521,7 @@
 
 <p>As an example to use throughout this specification here is the
 following certificate as an output of the openssl program.</p>
-<p class="example">
-<pre>
+<pre class="example">
 Certificate:
     Data:
         Version: 3 (0x2)
@@ -577,7 +577,6 @@
         45:0c:b9:48:c0:fd:ac:bc:fb:1b:c9:e0:1c:01:18:5e:44:bb:
         d8:b8
 </pre>
-</p>
 <p class="issue">Should we formally require the Issuer to be
 O=FOAF+SSL, OU=The Community of Self Signers, CN=Not a Certification Authority.
 This was discussed on the list as allowing servers to distinguish certificates
@@ -612,8 +611,7 @@
 <section class='normative'>
 <h1>Turtle</h1>
 <p>A widely used format for writing RDF graphs is the Turtle notation. </p>
-<p class="example">
-<pre>
+<pre class="example">
  @prefix cert: &lt;http://www.w3.org/ns/auth/cert#&gt; .
  @prefix rsa: &lt;http://www.w3.org/ns/auth/rsa#&gt; .
  @prefix foaf: &lt;http://xmlns.com/foaf/0.1/&gt; .
@@ -646,14 +644,12 @@
      rsa:public_exponent 65537 ;
      ] .
 </pre>
-</p>
 </section>
 <section>
 <h1>RDFa HTML notation</h1>
 <p>There are many ways of writing out the above graph using RDFa in
 html. Here is just one example.</p>
-<p class="example">
-<pre>
+<pre class="example">
 &lt;html xmlns="http://www.w3.org/1999/xhtml"
       xmlns:cert="http://www.w3.org/ns/auth/cert#"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
@@ -702,7 +698,6 @@
 &lt;/body&gt;
 &lt;/html&gt;
 </pre>
-</p>
 <p>If a WebID provider would rather prefer not to mark up his data in RDFa, but
 just provide a human readable format for users and have the RDF graph appear
 in a machine readable format such as RDF/XML then he MAY publish the link from
@@ -725,7 +720,7 @@
 object notation or in relational databases. Parsers for it are also widely
 available.</p>
 
-<pre>
+<pre class="example">
 &lt;?xml version=&quot;1.0&quot;?&gt;
 &lt;rdf:RDF
  xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
@@ -786,13 +781,13 @@
 <img width="90%" src="img/WebIDSequence-friendly.jpg">
 <p>The steps in detail are as follows:</p>
 <ol>
-<li><tref>Bob</tref>'s <tref>Client</tref> MUST open a TLS [[!RFC5246]] connection with the server which authenticates itself using well known TLS mechanisms. This MAY be done as the first part of an HTTPS connection [[!HTTP-TLS]]. Once the TLS session is established, which means that the <tref>Server</tref> was authenticated by the client, then it is possible for the application layer protocol to get started. </li>
-<li>If the protocol is HTTP then the client can request an HTTP GET, PUT, POST or DELETE  action on a resource for example. The <tref>Guard</tref> can then intercept that request and by checking some access control rules determine if the client needs authentication. We will consider the case here where the client does need to be authenticated.</li>
+<li><tref>Bob</tref>'s <tref>Client</tref> MUST open a TLS [[!RFC5246]] connection with the server which authenticates itself using well known TLS mechanisms. This MAY be done as the first part of an HTTPS connection [[!HTTP-TLS]].</li>
+<li>Once the Transport Layer Security [TLS] has been set up, the application protocol exchange can start. If the protocol is HTTP then the client can request an HTTP GET, PUT, POST, DELETE, ... action on a resource as detailed by [[!HTTP11]]. The <tref>Guard</tref> can then intercept that request and by checking some access control rules determine if the client needs authentication. We will consider the case here where the client does need to be authenticated.</li>
 <li>The Guard MUST requests the client to authenticate itself using public key cryptography by signing a token with its private key and have the Client send its Certificate. This has been carefully defined in the TLS protocol and can be summarised by the following steps:
 <ol>
 <li>The guard requests of the TLS agent that it make a Certificate Request to the client. The TLS layer does this. Because the WebID protocol does not rely on Certificate Authorities to verify the contents of the <tref>Certificate</tref>, the TLS Agent can ask for any Certificate from the Client. More details in <a href="requesting-the-client-certificate">Requesting the Client Certificate</a></li>
 <li>The Client asks Bob to choose a certificate if the choice has not been automated. We will assume that Bob does choose a <tref>WebID Certificate</tref> and sends it to the client.</li>
-<li>The <tref>TLS Agent</tref> MUST verify that the client is indeed in posession of the private key. What is important here is that the TLS Agent need not know the Issuer of the Certificate, or need not have any trust relation with the Issuer. Indeed if the TLS Layer could verify the signature of the Isser and trusted the statements it signed, then step 4 and 5 would not be needed - other than perhaps as a double check to verify that the key was still valid.</li>
+<li>The <tref>TLS Agent</tref> MUST verify that the client is indeed in possession of the private key. What is important here is that the TLS Agent need not know the Issuer of the Certificate, nor have any trust relation with the Issuer. Indeed, if the TLS Layer could verify the signature of the Issuer and trusted the statements it signed, then steps 4 and 5 would not be needed, other than perhaps as a way to verify that the key was still valid.</li>
 <li>The <tref>WebID Certificate</tref> is then passed on to the <tref>Guard</tref> with the proviso that the WebIDs still needs to be verified.</li>
 </ol>
 </li>
@@ -821,139 +816,104 @@
 <h2>Initiating a TLS Connection</h2>
 
 <p>Standard SSLv3 and TLSv1 and upwards can be used to establish the connection between
-the Client and the TLS Agent listening on the Service's port.</p>
+the Client and the TLS Agent listening on the Service's port. </p>
+<p class="note">Many servers allow a simple form of TLS client-side authentication to be set up when configuring a <tref>TLS Agent</tref>: they permit the agent to be authenticated in WANT or NEED mode.
+If the client sends a certificate, then neither of these has an impact on the <tref>WebID Verification</tref> steps (4) and (5).
+Nevertheless, from a user interaction perspective both of these are problematic, as they either force (NEED) or ask (WANT) the user to authenticate himself even if the resource he wishes to interact with is public and requires no authentication.
+People don't usually feel comfortable authenticating to a web site on the basis of a certificate alone.
+They prefer human readable text and detailed error messages, which the HTTP layer can deliver.
+
+It is better to move the authentication to the application layer <tref>Guard</tref> as it has a lot more information about the application state. 
+Please see the <a href="http://www.w3.org/2005/Incubator/webid/wiki/">WebID Wiki</a> for implementation pointers in different programming languages and platforms to learn about how this can be done and to share your experience.</p>
 </section>
 
 <section class='normative'>
 <h2>Connecting at the Application Layer</h2>
 
-<p>Once the TLS connection has started the application layer protocol can get going. It is always possible during communication at the application layer for communication parameters to be set at the TLS layer, such as requesting a client certificate as described in the following section. This permits some very flexible authentiation interaction, as the Guard can find out exactly what the abilities of the client are in order to work out what type of TLS client authentication it can ask for.</p>
-<p>If the protocol permits it, the Client can let the Application layer, and especially the <tref>Guard</tref> know that the client can authenticate with a WebID Certificate, and even if it wishes to do so. This may be useful both to allow the Server to know that it can request the client certificate, and also in order to allow Robots that may find it a lot more convenient to be authenticated automatically. 
+<p>Once the TLS connection has been set up, the application layer protocol interaction can start.
+This could be an HTTP GET request on the protected resource, for example.</p>
+<p>If the protocol permits it, the Client can let the Application layer, and especially the <tref>Guard</tref>, know that the client can authenticate with a WebID Certificate, and even whether it wishes to do so. This may be useful both to allow the Server to know that it can request the client certificate, and to make life easier for Robots, which may find it a lot more convenient to be authenticated at the TLS layer.
 </p>
+<p class="issue">Bergi proposed a header for HTTP which could do this. Please summarise it. </p>
 </section>
 
 
 <section class='normative'>
 <h2>Requesting the Client Certificate</h2>
 
-<p>TLS allows the server to request a Certificate from the Client using the <code>CertificateRequest</code> message [section 7.4.4] of TLS v1.1 [[RFC5246]]. Since WebID TLS authentication does not rely on CA's signing the certificate to verify the WebID Claims made therein, the Server does not need to restrict the certificate it receives by the CA's they were signed by. It can therefore leave the  <code>certificate_authorities</code> field blank in the request. Most programming languages permit this option to be set. </p>
-</section>
+<p>TLS allows the server to request a Certificate from the Client using the <code>CertificateRequest</code> message (section 7.4.4) of TLS v1.2 [[!RFC5246]].  Since WebID TLS authentication does not rely on CAs signing the certificate to verify the WebID Claims made therein, the Server does not need to restrict the certificates it receives to those signed by particular CAs. It can therefore leave the <code>certificate_authorities</code> field empty in the request. </p>
 <p class="note">From our experience leaving the certificate_authorities field empty leads to the correct behavior on all browsers and all TLS versions.</p>
-<p>If the Client does not send a certificate, because either it does not have one or it does not wish to send one, other authentication procedures can be pursued from OpenID, OAuth, BrowserID, etc... as these occur at the Application Layer, which has not yet been accessed.</p>
-<p>As far as possible it is important for the server to request the client certificate in <code>WANT</code> mode, not in <code>NEED</code> mode. If the request is made in <code>NEED</code> mode then connections will be broken off if the client does not send a certificate. Unless the server can be confident that the client has a certificate - which it may be because it advertised that in some other way to the server - then it should try to avoid making the request in <code>NEED</code> mode. In most browsers this leads to a very bad user experience.  Luckily only few browsers require <code>Need</code> mode for the client to send a certificate.
+<p class="note">A security issue with TLS renegotiation was discovered in 2009, and an IETF fix was proposed in [[!RFC5746]] which is widely implemented.</p>
+<p>If the Client does not send a certificate, because either it does not have one or it does not wish to send one, other authentication procedures can be pursued at the application layer with protocols such as OpenID, OAuth, BrowserID, etc... </p>
+<p>As far as possible it is important for the server to request the client certificate in <code>WANT</code> mode, not in <code>NEED</code> mode. 
+If the request is made in <code>NEED</code> mode then connections will be broken off if the client does not send a certificate. 
+This will break the connection at the application protocol layer, and so will lead to a very bad user experience.  The server should therefore avoid doing this unless it can be confident that the client has a certificate, which it may be if the client advertised that to the server in some other way.
 </p>
 <p class="issue">Is there some normative spec about what NEED and WANT refer to?</p>
 
+</section>
+<section class='normative'>
+<h2>Verifying the WebIDs</h2>
+<p>The <tref>Verification Agent</tref> is given a list of WebIDs associated with a public key. It needs to verify that the agent identified by that WebID is indeed the agent that controls the private key of the given public key. It does this by looking up the definition of the WebID. A WebID is a URI, and its meaning can be had by dereferencing it using the protocol indicated in its scheme. </p>
+<p>If we first consider WebIDs with fragment identifiers, we can explain the logic of this as follows. As is explained in the RFC defining URIs [[!RFC3986]]:</p>
+<blockquote>
+The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information.  
+The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. 
+[...]
+The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource.
+</blockquote>
+<p>In order therefore to know the meaning of a WebID containing a fragment identifier, one needs to dereference the resource referred to without the fragment identifier.
+This resource will describe the referent of the WebID in some way. 
+If it says that the referent of the WebID is the agent that controls the private key of the given public key, then this is a  definite description that can be considered to be a definition of the WebID: it gives its meaning.
+</p>
+<p>The trust that can be had in that statement is therefore the trust that one can have in one's having received the correct representation of the document that defined that WebID. 
+An https WebID will therefore be a lot more trustworthy than an http WebID, by a factor of the likelihood of man-in-the-middle attacks.</p>
+<p>Once that is proven, the trust one can have in the agent at the end of the TLS connection being the referent of the WebID is related to the trust one has in the cryptography, and the likelihood that the private key could have been stolen.</p>
+<p class="issue">Add explanation for URI with redirect.</p>
 <section class='normative'>
 <h2>Processing the WebID Profile</h2>
 
-<p>A <tref>Verification Agent</tref> MUST be able to process documents in
-RDF/XML [[!RDF-SYNTAX-GRAMMAR]] and XHTML+RDFa [[!XHTML-RDFA]].
-A server responding to a <tref>WebID Profile</tref> request SHOULD be able
-to deliver at least RDF/XML or RDFa.
-The <tref>Verification Agent</tref> MUST set the Accept-Header to request
-<code>application/rdf+xml</code> with a higher priority than
-<code>text/html</code> and <code>application/xhtml+xml</code>. If the server
-answers such a request with an HTML representation of the resource, this SHOULD
-describe the WebID Profile with RDFa.
+<p>The Verification Agent therefore needs to fetch the document, unless it has a valid copy in its cache. The <tref>Verification Agent</tref> MUST be able to process documents in RDF/XML [[!RDF-SYNTAX-GRAMMAR]] and RDFa in XHTML and HTML [[!RDFA-CORE]]. The result of this processing should be a queryable graph of RDF relations, as explained in the next section.</p>
+<p class="note">
+It is suggested that the <tref>Verification Agent</tref> should set the Accept header to request <code>application/rdf+xml</code> with a higher priority than <code>text/html</code> and <code>application/xhtml+xml</code>.  The reason is that it is quite likely that many sites will produce HTML without RDFa markup and leave the graph to the pure RDF formats.
 </p>
-
-<p class="issue">This section will explain how a Verification Agent extracts
-semantic data describing the identification credentials from a WebID Profile.</p>
 </section>
 
 <section class='normative'>
 <h2>Verifying the WebID is identified by that public key</h2>
 
 <p>
-There are number of different ways to check that the public key given in the
-X.509 certificate against the one provided by the <tref>WebID Profile</tref> or
-another trusted source, the essence is checking that the graph of relations in
-the Profile contains a pattern of relations.
+There are a number of different ways to check the public key given in the X.509 certificate against the one provided by the <tref>WebID Profile</tref>, but the simplest way to explain it is to say that they all have to be equivalent to the following SPARQL query.
 </p>
-<p>Assuming the public key is an RSA key, and that its modulus is
-"9D79BFE2498..." and exponent "65537" then the following SPARQL query could
-be used:
+<p>Assuming the public key is an RSA key, and that its modulus is "9D79BFE2498..." and exponent "65537" then the following query should be used:
 </p>
 <pre class='example'>
-PREFIX cert: &lt;http://www.w3.org/ns/auth/cert#&gt;
-PREFIX rsa: &lt;http://www.w3.org/ns/auth/rsa#&gt;
+PREFIX : &lt;http://www.w3.org/ns/auth/cert#&gt;
+PREFIX xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt;
 ASK {
-   &lt;http://example.org/webid#public&gt; cert:key [
-      rsa:modulus  "9D79BFE2498..."^^cert:hex;
-      rsa:public_exponent "65537"^^cert:int;
+   &lt;http://bob.example/webid#public&gt; :key [
+      :modulus  "9D79BFE2498..."^^xsd:hexBinary;
+      :exponent 65537;
    ] .
 }
 </pre>
-<p>If the query returns true, then the graph has validated the associated
-public key with the WebID.</p>
-<p>The above requires the sparql endpoint (or the underlying triple store
-to be able to do inferencing on dataytypes. This is because the numerical
-values may be expressed with different xsd and cert datatypes which must all
-be supported by <tref>VerificationAgent</tref>s. The cert datatypes allow
-the numerical expression to be spread over a number of lines, or contain
-arbitrary characters such as "9D ☮ 79 ☮ BF ☮ E2 ☮ F4 ☮ 98 ☮..." . The datatype
-itself need not necessarily be expressed in cert:hex, but could use a number of
-xsd integer datatype notations, cert:int or future base64 notations.
-</p>
-<p class="issue">Should we define the base64 notation?</p>
-<p>If the SPARQL endpoint doesn't provide a literal inferencing engine, then
-the modulus should be extracted from the graph, normalised into a big integer
-(integers without an upper bound), and compared with the values given in the
-public key certificate. After replacing the <code>?webid</code> variable in the
-following query with the required value the <tref>Verifying Agent</tref> can
-query the Profile Graph with</p>
-<pre class='example'>
-PREFIX cert: &lt;http://www.w3.org/ns/auth/cert#&gt;
-PREFIX rsa: &lt;http://www.w3.org/ns/auth/rsa#&gt;
-SELECT ?m ?e
-WHERE {
-   ?webid cert:key [
-        rsa:modulus ?m ;
-        rsa:public_exponent ?e ;
-   ] .
-}
-</pre>
-<p>Here the verification agent must check that one of the answers for ?m and ?e
-matches the integer values of the modulus and exponent given in the public key
-in the certificate.</p>
 <p class="issue"> The public key could be a DSA key. We need to add an ontology
-for DSA too. What other cryptographic ontologies should we add?</p>
+for DSA too.</p>
 
 </section>
-
+</section>
 <section class='normative'>
 <h2>Authorization</h2>
 
-<p class="issue">This section will explain how a Verification Agent may
-use the information discovered via a WebID URI to determine if one should
-be able to access a particular resource. It will explain how a Verification
-Agent can use links to other RDFa documents to build knowledge about the
-given WebID.</p>
-
-</section>
-
-<section class='normative'>
-<h2>Secure Communication</h2>
-
-<p class="issue">This section will explain how an Identification Agent and
-a Verification Agent may communicate securely using a set of verified
-identification credentials.</p>
-
-<p>
-If the <tref>Verification Agent</tref> has verified that the
-<tref>WebID Profile</tref> is owned by the <tref>Identification Agent</tref>,
-the <tref>Verification Agent</tref> SHOULD use the verified
-<tref>public key</tref> contained in the <tref>Identification Certificate</tref>
-for all TLS-based communication with the <tref>Identification Agent</tref>.
-This ensures that both the <tref>Verification Agent</tref> and the
-<tref>Identification Agent</tref>
-are communicating in a secure manner, ensuring cryptographically protected
-privacy for both sides.
+<p>The Authorization step may  be as simple as just allowing everybody read access. The authentication phase may then just have been useful in order to gain some extra information from the <tref>WebID Profile</tref> in order to personalise a site.</p>
+<p>Once the <tref>Guard</tref> has a WebID he can do a lookup in a database to see if the agent is allowed the required access to the given resource. 
+Up to this point we are not much more advanced than with a user name and password, except that the user did not have to create an account on Alice's server to identify himself and that the server has some claimed attributes to personalise the site for the requestor.
+But the interesting thing about such a WebID is that because it is a global linkable URI, one can build webs of trust that can be crawled the same way the web can be crawled: by following links from one document to another.
+It is therefore possible to have very flexible access control rules, where access to parts of the space of the user's machine is given to friends and to those friends' friends (FOAF), as stated by them at their own domains.
+It is even possible to allow remote agents to define their own access control rules for parts of the machine's namespace.
+There are too many possibilities to list them all here.
 </p>
 
 </section>
-
 </section>
 
 <section class='normative'>
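
Note on the key check described in "Verifying the WebID is identified by that public key": the ASK query in this changeset amounts to comparing the modulus and exponent of the certificate's RSA key with the key(s) listed in the WebID Profile. The following is a minimal Python sketch of that comparison, not part of the changeset. It assumes the (modulus, exponent) pairs have already been extracted from the profile graph by some RDF library. The function names hex_to_int and key_matches are hypothetical, and tolerating whitespace and ':' separators is an assumption made for convenience with openssl-style output, not something the spec text requires.

```python
import re


def hex_to_int(hex_text: str) -> int:
    """Parse a hexadecimal modulus, ignoring whitespace and ':' separators.

    Assumption: the input may come from openssl-style output such as
    "9d:79:bf:e2"; strict xsd:hexBinary would not contain separators.
    """
    cleaned = re.sub(r"[\s:]", "", hex_text)
    return int(cleaned, 16)


def key_matches(cert_modulus: int, cert_exponent: int, profile_keys) -> bool:
    """Return True if any profile key equals the certificate's RSA key.

    profile_keys is an iterable of (modulus_hex, exponent) pairs, assumed
    to have been read from the WebID Profile graph beforehand.
    """
    for modulus_hex, exponent in profile_keys:
        same_modulus = hex_to_int(modulus_hex) == cert_modulus
        same_exponent = int(exponent) == cert_exponent
        if same_modulus and same_exponent:
            return True
    return False
```

A Verification Agent built along these lines would call key_matches once per WebID claimed in the certificate, passing the keys found in that WebID's profile, and treat a False result as a failed verification of that WebID.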