Taken Pat's comment into account, as well as Peter's and Guus' suggestion
author azimmerm
Thu, 12 Dec 2013 21:59:15 +0100
changeset 1561 d693d3307563
parent 1560 cc0e0a0cd166
child 1563 a7d28fbe2a5f
child 1575 e2f6ab7bc482
rdf-dataset/index.html
--- a/rdf-dataset/index.html	Thu Dec 12 09:50:50 2013 -0800
+++ b/rdf-dataset/index.html	Thu Dec 12 21:59:15 2013 +0100
@@ -30,7 +30,7 @@
 
 "DELBRU-ET-AL-2008" : "Renaud Delbru, Axel Polleres, Giovanni Tummarello, Stefan Decker. <cite>Context Dependent Reasoning for Semantic Documents in Sindice.</cite> In Proceedings of the 4th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS). Karlsruhe, Germany, 2008.",
 
-"CARROLL-ET-AL-2005" : "Jeremy Carroll, Pat Hayes, " },
+"NAMED-GRAPH" : "Jeremy Carroll, Pat Hayes, etc" },
           
          // the specification's short name, as in http://www.w3.org/TR/short-name/
           shortName:            "rdf-datasets",
@@ -40,7 +40,7 @@
           // subtitle   :  "an excellent document",
 
           // if you wish the publication date to be other than today, set this
-          publishDate:  "2013-09-17",
+          publishDate:  "2013-12-04",
 
           // if the specification's copyright date is a range of years, specify
           // the start date here:
@@ -65,7 +65,7 @@
 
           // if you want to have extra CSS, append them to this list
           // it is recommended that the respec.css stylesheet be kept
-          extraCSS:             ["http://dvcs.w3.org/hg/rdf/raw-file/default/ReSpec.js/css/respec.css"],
+          //extraCSS:             ["http://dvcs.w3.org/hg/rdf/raw-file/default/ReSpec.js/css/respec.css"],
 
           // editors, add as many as you like
           // only "name" is required
@@ -134,23 +134,27 @@
   <body>
 
 <section id="abstract">
-  <p>RDF defines the concept of RDF datasets, a structure composed of a distinguished RDF graph and zero or more named graphs, being pairs comprising an IRI or blank node and an RDF graph. While RDF graphs have a formal model-theoretic semantics that determines what arrangements of the world make an RDF graph true, no agreed formal semantics exists for RDF datasets. This document presents the issues to be addressed when defining a formal semantics for datasets, as they have been discussed in the RDF 1.1 Working Group, and specify several semantics in terms of model theory, each corresponding to a certain design choice for RDF datasets.</p>
+  <p>RDF defines the concept of RDF datasets, a structure composed of a distinguished RDF graph and zero or more named graphs, being pairs comprising an IRI or blank node and an RDF graph. While RDF graphs have a formal model-theoretic semantics that determines what arrangements of the world make an RDF graph true, no agreed formal semantics exists for RDF datasets. This document presents some issues to be addressed when defining a formal semantics for datasets, as they have been discussed in the RDF&nbsp;1.1 Working Group, and specifies several semantics in terms of model theory, each corresponding to a certain design choice for RDF datasets.</p>
+</section>
+
+<section id="sotd">
+  <p>This document is intended to be published as a Working Group note.</p>
 </section>
 
 <section id="sec-introduction">
     <h2 id="introduction">Introduction</h2>
 
-    <p>The <a href="http://www.w3.org/TR/rdf11-concepts/">Resource Description Framework (RDF)</a> version 1.1 defines the concept of RDF datasets, a notion introduced first by the SPARQL specification [[RDF-SPARQL-QUERY]].  An RDF dataset is defined as a collection of <a title="RDF graph">RDF graphs</a> where all but one are <a title="named graph">named graphs</a> associated with an <a>IRI</a> or <a>blank node</a> (the <a>graph name</a>), and the unnamed default graph [[RDF11-CONCEPTS]].  Given that RDF is a data model equiped with a formal semantics [[RDF11-MT]], it is natural to try and define what the semantics of datasets should be.</p>
+    <p>The <a href="http://www.w3.org/TR/rdf11-concepts/">Resource Description Framework (RDF)</a> version 1.1 defines the concept of RDF datasets, a notion introduced first by the SPARQL specification [[RDF-SPARQL-QUERY]].  An RDF dataset is defined as a collection of <a title="RDF graph">RDF graphs</a> where all but one are <a title="named graph">named graphs</a> associated with an <a>IRI</a> or <a>blank node</a> (the <a>graph name</a>), and the unnamed default graph [[RDF11-CONCEPTS]].  Given that RDF is a data model equipped with a formal semantics [[RDF11-MT]], it is natural to try and define what the semantics of datasets should be.</p>
 
-    <p>The RDF 1.1 Working Group was initially chartered to provide such semantics in its recommendation:</p>
+    <p>The RDF&nbsp;1.1 Working Group was initially chartered to provide such semantics in its recommendation:</p>
     <blockquote cite="http://www.w3.org/2011/01/rdf-wg-charter">
         <h5>Required features</h5>
         <ul><li id="ng">Standardize a model and semantics for multiple graphs and graphs stores [...]</li></ul>
     </blockquote>
 
-	<p>However, discussions within the Working Group revealed that very different assumptions were currently existing among practitioners, who are using RDF datasets with their own intuition of the meaning of the datasets.  Defining the semantics of RDF datasets requires an understanding of the two following issues:</p>
+	<p>However, discussions within the Working Group revealed that very different assumptions currently exist among practitioners, who are using RDF datasets with their own intuition of the meaning of datasets.  Defining the semantics of RDF datasets requires an understanding of the following two issues:</p>
 	<ul>
-		<li>what the graph names (IRI or blank node) denote;</li>
+		<li>what the graph names (IRI or blank node) denote, or what are the constraints on what the names can possibly denote;</li>
 		<li>how the triples in the named graph influence the meaning of the dataset.</li>
 	</ul>
 	
@@ -161,16 +165,20 @@
 		<li>it denotes a supergraph of the graph inside the pair;</li>
 		<li>it denotes a container for the RDF graph, that is, a mutable element;</li>
 		<li>it denotes the information resource that can be obtained by dereferencing the graph name, when it is an IRI and if such resource exists;</li>
-		<li>it denotes an arbitrary resource that is constrained to be in a special relationship with the graph inside the pair;</li>
-		<li>it denotes an unconstrained resource.</li>
+		<li>it denotes an arbitrary resource that is constrained to be in a special relationship (for instance, <code>ex:hasGraph</code>) with the graph inside the pair;</li>
+		<li>it denotes the deductive closure of the graph inside the pair;</li>
+		<li>it denotes an arbitrary resource that is in a special relation with the deductive closure, or with a superset of the graph;</li>
+		<li>it denotes an unconstrained resource;</li>
+		<li>etc.</li>
 	</ul>
+	<p>Even with an intuitive understanding of what the truth of an RDF dataset should be, the precise model-theoretic formalization can be subject to many variations.</p>
 	
 	<p>Possible choices for the meaning of the triples in the named graphs include:</p>
 	<ul>
 		<li>all the triples in the named graphs and default graphs contribute to the truth of the dataset in the same way triples contribute to the truth of a single graph;</li>
 		<li>the triples of the named graphs are considered part of the knowledge of the default graph;</li>
-		<li>different named graphs indicate different "contexts", or different "worlds", and the triples inside a named graph are assumed to be true in the associated context only; in this case, the default graph can be interpreted as yet another context, or be considered as a "global context" which must hold in all contexts;</li>
-		<li>the named graphs are considered as "hypothetical graphs" which bear the same consequences as their RDF graphs, but they do not participate in the truth of the dataset; this is similar to the "context" option above but it allows a graph to contain contradictions without making the dataset contradictory;</li>
+		<li>different named graphs indicate different “contexts”, or different “worlds”, and the triples inside a named graph are assumed to be true in the associated context only; in this case, the default graph can be interpreted as yet another context, or be considered as a “global context” which must hold in all contexts, or again as metadata about the contexts;</li>
+		<li>the named graphs are considered as “hypothetical graphs” which bear the same consequences as their RDF graphs, but they do not participate in the truth of the dataset; this is similar to the “context” option above but it allows a graph to contain contradictions without making the dataset contradictory;</li>
 		<li>the triples are merely quoted without any indication of what they mean; they do not participate in the truth of a dataset.</li>
 	</ul>
 	
@@ -183,8 +191,8 @@
 
 	<p>We first take a look at existing specifications that could shed a light on how the semantics of datasets should be defined. There are three important documents that closely relate to the issue:</p>
 	<ul>
-		<li>the RDF semantics, as standardised in 2004 [[RDF-MT]] and its revision in 2013 [[RDF11-MT]];</li>
-		<li>the article <i>Named Graphs</i> by Carrol et al., which first introduced the term "named graph" and contains a section on formal semantics;</li>
+		<li>the RDF semantics, as standardized in 2004 [[RDF-MT]] and its revision in 2013 [[RDF11-MT]];</li>
+		<li>the article <i>Named Graphs</i> by Carroll et al. [[NAMED-GRAPH]], which first introduced the term “named graph” and contains a section on formal semantics;</li>
 		<li>the SPARQL specification [[RDF-SPARQL-QUERY]], which defines RDF datasets and how to query them.</li>
 	</ul>
 	
@@ -193,23 +201,23 @@
 		
 		<!--<p class="issue">Part of what follows is somewhat subjective.</p>-->
 		
-		<p>The first version of RDF semantics defined the meaning of a set of RDF graphs: <q cite="http://www.w3.org/TR/rdf-mt/#entail">a set of graphs can be treated as equivalent to its merge, that is, a single graph, as far as the model theory is concerned</q>. The new version indicates that a set of RDF graphs can be either interpreted as its union or as its merge.</p>
-		<p>So, a first intuition could be that an RDF dataset, being presented as a collection of graph, should mean exactly what the set of its named graphs and default graph means. However, this completely leaves out the meaning of graph names, which could be valuable indicators for the truth of a dataset.</p>
-		<p>Formally, the semantics of RDF defines a notion of interpretation for a set of triples (i.e., an RDF graph), which then can extend to a set of RDF graphs. A dataset is neither a set of triples nor a set of RDF graphs. It is a set of <em>pairs</em> (name,graph) together with a distinguished RDF graph. Consequently, defining interpretation and entailement for RDF datasets would require at least an extension of the RDF semantics.</p>
+		<p>As described in RDF 1.1 Semantics, a set of RDF graphs can be interpreted as either the union of the graphs or as their merge ([[RDF11-MT]], Technical note, Section&nbsp;5.2).</p>
+		<p>So, a first intuition could be that an RDF dataset, being presented as a collection of graphs, should mean exactly what the set of its named graphs and default graph means. However, this completely leaves out the potential meaning of graph names, which could be valuable indicators for the truth of a dataset.</p>
+		<p>Formally, the semantics of RDF defines a notion of interpretation for a set of triples (i.e., an RDF graph), which then can extend to a set of RDF graphs. A dataset is neither a set of triples nor a set of RDF graphs. It is a set of <em>pairs</em> (name,graph) together with a distinguished RDF graph, and the RDF semantics does not itself specify a meaning for these pairs.</p>
 		<p>Conceptually, it is problematic since one of the reasons for separating triples into distinct (named) graphs is to avoid propagating the knowledge of one graph to the entire triple base. Sometimes, contradicting graphs need to coexist in a store. Sometimes named graphs are not endorsed by the system as a whole, they are merely quoted.</p>
 	</section>
 	
 	<section id="sec-named-graph-paper">
 		<h3 id="named-graph">The Named Graphs paper</h3>
 
-		<p>In Carrol et al., a named graph is simply defined as a pair comprising an IRI and an RDF graph. The notion of RDF interpretation is extended to named graphs by saying that the graph IRI in the pair must denote the pair itself. This non-ambiguously answers the question of what the graph IRI denotes. This can then be used to define a proper dataset semantics, as shown in Section 3.3.</p>
+		<p>In Carroll et al. [[NAMED-GRAPH]], a named graph is defined as a pair comprising an IRI and an RDF graph. The notion of RDF interpretation is extended to named graphs by saying that the graph IRI in the pair must denote the pair itself. This unambiguously answers the question of what the graph IRI denotes. This can then be used to define proper dataset semantics, as shown in Section&nbsp;3.3. Note that it is deliberate that the graph IRI is forced to denote the pair rather than the RDF graph. This is done in order to differentiate two occurrences of the same RDF graph that could have been published at different times, or authored by different people. A simple reference to the RDF graph would simply identify a mathematical set, which is the same wherever it occurs.</p>
 	</section>
 	
 	<section id="sec-sparql">
 		<h3 id="sparql">The SPARQL specification</h3>
 
-		<p>RDF 1.1 defines the notion of RDF dataset  identically to SPARQL, which introduced it first. So, in order to understand the semantics of dataset, it is worth looking at how SPARQL uses datasets. SPARQL defines what are answers to queries posed against a dataset, but it never defines the notions that are key to a model theoretic formal semantics: it neither presents interpretations nor entailment. Still, it is worth noticing that a ASK query that only contains a basic graph pattern without variables yields the same result as asking whether the RDF graph in the query is entailed by the default graph. Based on this observation, one may extrapolate that a ASK query containing no variables and only GRAPH graph patterns would yield the same result as dataset entailment.</p>
-		<p>This can be used as a guide for formalizing the semantics of datasets, as can be seen in Section 3.7.</p>
+		<p>RDF&nbsp;1.1 borrows the notion of RDF dataset from the SPARQL specification [[RDF-SPARQL-QUERY]], with the notable difference that RDF&nbsp;1.1 allows graph names to be blank nodes. So, in order to understand the semantics of datasets, it is worthwhile looking at how SPARQL uses datasets. SPARQL defines what answers to queries posed against a dataset are, but it never defines the notions that are key to a model-theoretic formal semantics: it neither presents interpretations nor entailment. Still, it is worth noticing that an ASK query that only contains a basic graph pattern without variables yields the same result as asking whether the RDF graph in the query is entailed by the default graph. Based on this observation, one may extrapolate that an ASK query containing no variables and only <code>GRAPH</code> graph patterns would yield the same result as dataset entailment.</p>
+		<p>This can be used as a guide for formalizing the semantics of datasets, as can be seen in Section&nbsp;3.7.</p>
 	</section>
 	
 </section>
@@ -219,7 +227,7 @@
 
 	<h2 id="formal-definitions">Formal definitions</h2>
 
-	<p>This section presents the different options proposed, together with their formal definitions. We include each time a discussion of the merrits of the choice, and some properties.</p>
+	<p>This section presents the different options proposed, together with their formal definitions. For each option, we discuss the merits of the choice and some of its properties.</p>
 	<p>Each subsection here describes the option informally, before presenting the formal definitions. As far as the formal part is concerned, one has to be familiar with the definitions given in RDF Semantics. We rely a lot on the notion of interpretation and entailment, which are key in model theory.</p>
 	<p>All proposed options share some commonalities:</p>
 	<ul>
@@ -227,18 +235,21 @@
 		<li>they define notions of interpretation and entailment in function of the corresponding notions in RDF Semantics.</li>
 	</ul>
 
-	<p>In fact, the dependency on RDF semantics is such that most of the dataset semantics below reuse RDF semantics as a black box.  The purpose of a formal semantics for datasets is to determine under what circumstances a dataset can be said to be true or false.  The formalisation below indicates that the truth of an RDF dataset can be determined in function of the truth of an RDF graph, no matter how the latter is determined.  Therefore, instead of defining a precise definition of RDF graph interpretations and entailment, we use the more abstract notion of <a>entailment regime</a>.  In fact, RDF Semantics does not define a single formal semantics, but multiple ones, depending on what standard vocabularies are endorsed by an application.  Consequently, we will parameterize most of the definitions below with an unspecified entailment regime <var>E</var>.  RDF 1.1 defines the following entailment regimes: simple entailment, D-entailment, RDF-entailment, RDFS-entailment.  Additionally, OWL defines two other entailment regimes, based on the OWL 2 direct semantics [[OWL2-DIRECT-SEMANTICS]] and the OWL 2 RDF-based semantics [[OWL2-RDF-BASED-SEMANTICS]].</p>
+	<p>The first item above reflects the indication given in [[RDF11-MT]] with respect to dataset semantics: <q cite="http://www.w3.org/TR/rdf11-mt/#rdf-datasets">a dataset SHOULD be understood to have at least the same content as its default graph</q>.</p>
+	<p>The dependency on RDF semantics is such that most of the dataset semantics below reuse RDF semantics as a black box.  More precisely, it is not necessary to be specific about how the truth of RDF graphs is defined as long as there is a notion of interpretation that determines the truth of a set of triples.  In fact, RDF Semantics does not define a single formal semantics, but multiple ones, depending on what standard vocabularies are endorsed by an application (such as the RDF, RDFS, XSD vocabularies).  Consequently, we parameterize most of the definitions below with an unspecified entailment regime <var>E</var>.  RDF&nbsp;1.1 defines the following entailment regimes: simple entailment, D-entailment, RDF-entailment, RDFS-entailment.  Additionally, OWL defines two other entailment regimes, based on the OWL&nbsp;2 direct semantics [[OWL2-DIRECT-SEMANTICS]] and the OWL&nbsp;2 RDF-based semantics [[OWL2-RDF-BASED-SEMANTICS]].</p>
 	<p>For an entailment regime <var>E</var>, we will say <var>E</var>-interpretation, <var>E</var>-entailment, <var>E</var>-equivalence, <var>E</var>-consistency to describe the notions of interpretations, entailment, equivalence and consistency associated with the regime <var>E</var>. Similarly, we will use the terms dataset-interpretation, dataset-entailment, dataset-equivalence, dataset-consistency for the corresponding notions in dataset semantics.</p>
 
 	<section>
 		<h3 id="no-meaning">Named graphs have no meaning</h3>
+
 		<p>The simplest semantics defines an interpretation of a dataset as an RDF interpretation of the default graph. The dataset is true, according to the interpretation, if and only if the default graph is true. In this case, any datasets that have equivalent default graphs are dataset-equivalent.</p>
-		<p>This means that the named graphs in a dataset are irrelevent to determining the truth of a dataset. Therefore, arbitrary modifications of the named graphs in a graph store always yield an equivalent dataset, according to this semantics.</p>
+		<p>This means that the named graphs in a dataset are irrelevant to determining the truth of a dataset. Therefore, arbitrary modifications of the named graphs in a graph store always yield a logically equivalent dataset, according to this semantics.</p>
 		<h4 id="f1" class="formal">Formalization</h4>
 		<p>Considering an entailment regime <var>E</var>, a dataset-interpretation with respect to <var>E</var> is an <var>E</var>-interpretation. Given an interpretation <var>I</var> and a dataset <var>D</var> having default graph <var>G</var> and named graphs <var>NG</var>, <var>I(D)</var> is true if and only if <var>I(G)</var> is true.</p>
 
-		<h4 id="ex1" class="ex">Examples of entailement and non-entailments</h4>
+		<h4 id="ex1" class="ex">Examples of entailments and non-entailments</h4>
 		<p>Consider the following dataset:</p>
+		
 		<pre class="example">{ :s  :p  :o . }
 :g1 { :a  :b  :c }</pre>
 		<p>does not dataset-entail:</p>
@@ -256,17 +267,17 @@
  :g1  :created  "2013-09-17"^^xsd:date .}
 :g1 { :x  :y  :z }</pre>
 
-		<h4 id="p1" class="prop">Properties of this dataset semantics</h4>
-		<p>Assuming this semantics is convenient since it merely ignores named graphs in a dataset. As a result, datasets can be simply treated as regular RDF graphs by extracting the default graph. Named graphs can still be used to preserve useful information, but it bares no more meaning than a commentary in a program source code.</p>
-		<p>The obvious disadvantage is that, since named graphs are completely disregarded, there is no added value in using RDF datasets rather than regular RDF graphs.</p>
+		<h4 id="p1" class="prop">Properties of this dataset semantics</h4>
+		<p>Assuming this semantics is convenient since it merely ignores named graphs in a dataset for any reasoning task. As a result, datasets can be simply treated as regular RDF graphs by extracting the default graph. Named graphs can still be used to preserve useful information, but they bear no more meaning than a comment in program source code.</p>
+		<p>The obvious disadvantage is that, since named graphs are completely disregarded in terms of meaning, there is no guarantee that any information intended to be conveyed by the named graphs is preserved by inference.</p>
 	</section>
 
 	<section>
 		<h3 id="union">Default graph as union or as merge</h3>
-		<p>It is sometimes assumed that named graphs are simply a convenient way of sorting the triples but all the triples participte in a united knowledge base that takes the place of the default graph.  More precisely, a dataset is considered to be true if all the triples in all the graphs, named or default, are true together.  This description allows two formalizations of dataset semantics, depending on how blank nodes spanning several named graphs are treated.</p>
+		<p>It is sometimes assumed that named graphs are simply a convenient way of sorting the triples but all the triples participate in a united knowledge base that takes the place of the default graph.  More precisely, a dataset is considered to be true if all the triples in all the graphs, named or default, are true together.  This description allows two formalizations of dataset semantics, depending on how blank nodes spanning several named graphs are treated. Indeed, if one blank node appears in several named graphs, it may be intentional, to indicate the existence of only one thing across the graphs, in which case union is appropriate. If the sharing of blank nodes is incidental, merge is also an applicable solution.</p>
 
 		<h4 id="f2-1" class="formal">Formalization: first version</h4>
-		<p>We define a dataset-interpretation with respect to an entailment regime <var>E</var> as an <var>E</var>-interpretation. Given a dataset-interpretation <var>I</var> and a dataset <var>D</var> having default graph <var>G</var> and named fgraphs <var>NG</var>, <var>I(D)</var> is true if and only if <var>I(G)</var> is true and for all <var>ng</var> in <var>NG</var>, <var>I(ng)</var> is true.</p>
+		<p>We define a dataset-interpretation with respect to an entailment regime <var>E</var> as an <var>E</var>-interpretation. Given a dataset-interpretation <var>I</var> and a dataset <var>D</var> having default graph <var>G</var> and named graphs <var>NG</var>, <var>I(D)</var> is true if and only if <var>I(G)</var> is true and for all <var>ng</var> in <var>NG</var>, <var>I(ng)</var> is true.</p>
 		<p>This is equivalent to <var>I(D)</var> is true if <var>I(H)</var> is true where <var>H</var> is the <a>merge</a> of all the RDF graphs, named or default, appearing in <var>D</var>.</p>
 		
 		<h4 id="f2-2" class="formal">Formalization: second version</h4>
@@ -287,21 +298,22 @@
 
 		<h4 id="p2" class="prop">Properties of this dataset semantics</h4>
 		<p>This semantics allows one to partition the triples of an RDF graph into multiple named graphs for easier data management, yet retaining the meaning of the overall RDF graph. Note that this choice of semantics does not impact the way graph names are interpreted: it is possible to further constrain the graph names to denote the RDF graph associated with it, or other possible constraints. The possible interpretations of graph names, and their consequences, are presented in the next sections.</p>
-		<p>This semantics is implicitely assumed by existing graph store implementations. The OWLIM RDF database management system implements reasoning techniques over RDF datasets that materialize inferred statements into the database [[citation needed]]. This is done by taking the union of the graphs in the named graphs, applying standard entailment regimes over this RDF graph and putting the inferred triples into the default graph.</p>
-		<p>The main drawback of this dataset semantics is that all triples in the named graphs contribute to a global knowledge that must be consistent. In situations where named graphs are used to store RDF graphs obtained from various sources on the open Web, inconsistencies or contradictions can easily occur. Notably, Web crawlers of search engines harvest all RDF documents, and it is known as a fact that the Web contains documents serializing inconsistent RDF graphs as well as documents that are mutually contradicting yet consistent on their own.</p>
+		<p>This semantics is implicitly assumed by existing graph store implementations. The OWLIM RDF database management system implements reasoning techniques over RDF datasets that materialize inferred statements into the database [[citation needed]]. This is done by taking the union of the graphs in the named graphs, applying standard entailment regimes over this RDF graph and putting the inferred triples into the default graph.</p>
+		<p>This dataset semantics makes all triples in the named graphs contribute to a global knowledge, thus making the whole dataset inconsistent whenever two graphs are mutually contradictory. In situations where named graphs are used to store RDF graphs obtained from various sources on the open Web, inconsistencies or contradictions can easily occur. Notably, Web crawlers of search engines harvest all RDF documents, and it is a known fact that the Web contains documents serializing inconsistent RDF graphs as well as documents that are mutually contradictory yet consistent on their own. In this case, this semantics can be seen as problematic.</p>
 	</section>
 
 	<section>
 		<h3 id="naming">The graph name denotes the named graph or the graph</h3>
-		<p>It is common to use the graph name as a way to identify the RDF graph inside the named graphs, or rather, to identify a particular occurence of the graph. This allows one to describe the graph or the graph source in triples. For instance, one may want to say who is the creator of a particular occurence of a graph. Assuming this semantics for graph names amounts to say that each named graph pair is an assertion that sets the <a>referent</a> of the graph name to be the associated graph.</p>
-		<p>Intutively, this semantics can be seen as quoting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob  :is  :smart}</code> has to be understood as <q>Alice said: "Bob is smart"</q> which does not entail <q>Alice said: "Bob is intelligent"</q> because Alice did not use the word "intelligent", even though "smart" and "intelligent" can be understood as equivalent.</p>
+		<p>It is common to use the graph name as a way to identify the RDF graph inside the named graphs, or rather, to identify a particular occurrence of the graph. This allows one to describe the graph or the graph source in triples. For instance, one may want to say who the creator of a particular occurrence of a graph is. Assuming this semantics for graph names amounts to saying that each named graph pair is an assertion that sets the <a>referent</a> of the graph name to be the associated graph or named graph pair.</p>
+		<p class="issue">The following paragraph refers to speech and asserting, while dataset semantics never refers to such notions. This may be confusing.</p>
+		<p>Intuitively, this semantics can be seen as quoting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob  :is  :smart}</code> has to be understood as <q>Alice said: “Bob is smart”</q> which does not entail <q>Alice said: “Bob is intelligent”</q> because Alice did not use the word “intelligent”, even though “smart” and “intelligent” can be understood as equivalent.</p>
 
 		<h4 id="f3" class="formal">Formalization</h4>
 		<p>We reuse the notation presented in [[RDF11-MT]]:</p>
 		<blockquote>Suppose I is an interpretation and A is a mapping from a set of blank nodes to the universe IR of I. Define the mapping [I+A] to be I on names, and A on blank nodes on the set: [I+A](x)=I(x) when x is a name and [I+A](x)=A(x) when x is a blank node; and extend this mapping to triples and RDF graphs using the rules given above for ground graphs.</blockquote>
 		<p>A dataset-interpretation <var>I</var> with respect to an entailment regime <var>E</var> is an <var>E</var>-interpretation extended to named graphs and datasets as follows:</p>
 		<ul><li>if <var>(n,g)</var> is a named graph where the graph name is an IRI, then <var>I(n,g)</var> is true if and only if <var>I(n)</var> = <var>(n,g)</var>.</li>
-		<li>if <var>D</var> is a dataset comprising default graph <var>DG</var> and named graphs <var>NG</var>, then <var>I(D)</var> is true if and only if there exists a mapping from bnodes to the universe <var>IR</var> of <var>I</var> such that <var>[I+A](DG)</var> is true and for all named graph <var>(n,g)</var> in <var>NG</var>, <var>[I+A](n)</var> = <var>(n,g)</var>.</li>
+		<li>if <var>D</var> is a dataset comprising default graph <var>DG</var> and named graphs <var>NG</var>, then <var>I(D)</var> is true if and only if there exists a mapping <var>A</var> from blank nodes to the universe <var>IR</var> of <var>I</var> such that <var>[I+A](DG)</var> is true and for every named graph <var>(n,g)</var> in <var>NG</var>, <var>[I+A](n)</var> = <var>(n,g)</var>.</li>
 		</ul>
 
 		<h4 id="ex3" class="ex">Examples</h4>
@@ -324,26 +336,26 @@
 		<pre class="example">{ :age  rdfs:range  xsd:integer .
 :me  :age  :g1 . }  # default graph
 :g1 { :s  :p  :o }</pre>
-		<p>The graph name can be used in triples to attached metadata (here <code>:entains</code> is a custom term that does not enforce a formal constraint, so it is up to the implementation to decide how to treat it):</p>
-		<pre class="example">{ :g1  :published  "2023-08-26"^^xsd:date .
- :g1  :entails  :g2 .}
+		<p>The graph name can be used in triples to attach metadata (here <code>:hasNextVersion</code> is a custom term that does not enforce a formal constraint, so it is up to the implementation to decide how to treat it):</p>
+		<pre class="example">{ :g1  :published  "2013-08-26"^^xsd:date .
+ :g1  :hasNextVersion  :g2 .}
 :g1 { :s1  :p1  :o1 .
       :s2  :p2  :o2 }
 :g2 { :s1  :p1  :o1 }</pre>
 		
 		<h4 id="p3" class="prop">Properties of this dataset semantics</h4>
-		<p>There are important implications with this semantics. First, the presence of blank nodes as graph names can be problematic because a named graph entails an infinity of other named graphs where only the graph name is changed to a different blank node. Second, graph names have to be handled almost like literals. Unlike other IRIs or blank nodes, their denotation is strictly fixed, like literals are. Therefore, any entailment regime that recognizes datatypes and use this semantics has to be able to distinguish graphs from, e.g., integers and strings. Combined with RDFS semantics, it can lead to inconsistencies, as in the last example above.</p>
+		<p>This semantics has important implications. A named graph pair can only entail itself or, if the graph name is a blank node, a pair that is structurally equivalent. Graph names have to be handled almost like literals: unlike other IRIs or blank nodes, their denotation is strictly fixed, as is the denotation of literals. This means that graph IRIs may clash with constraints on datatypes, as in the example above.</p>
 		<p>A variant of this dataset semantics imposes that the graph name denotes the RDF graph itself, rather than the pair. This means that two occurrences of the same graph in different named graph pairs actually identify the same thing. Thus, the graph names associated with the same RDF graphs are interchangeable in any triple in this case.</p>
 	</section>
 
 	<section>
 		<h3 id="context">Each named graph defines its own context</h3>
-		<p>Named graphs in RDF datasets are sometimes used to delimit a context in which the triples of the named graphs are true. From the truth of these triples, it is possible to infer knowledge that it is convenient to make part of the named graph. An example of such situation occurs when one wants to keep track of the evolution of the data with time. Another example is when one wants to allow different view points to be expressed and reasoned with, without creating a conflict or inconsistency. By having inferences done at the named graph level, one can prevent for instance that triples comming from untrusted parties are influenceing trusted knowledge. Yet it does not disallow reasoning with and drawing conclusions from untrusted information.</p>
-		<p>Intutively, this semantics can be seen as interpreting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob  :is  :smart}</code> has to be understood as <q>Alice said that Bob is smart</q> which entails <q>Alice said that Bob is intelligent</q> because the two sentences mean the same thing. Neither sentences mean that Alice used these actual words.</p>
+		<p>Named graphs in RDF datasets are sometimes used to delimit a context in which the triples of the named graphs are true. The truth of the named graph pair then follows from the truth of these triples according to the graph semantics. An example of such a situation occurs when one wants to keep track of the evolution of facts over time. Another example is when one wants to allow different viewpoints to be expressed and reasoned with, without creating a conflict or inconsistency. By having inferences done at the named graph level, one can, for instance, prevent triples coming from untrusted parties from influencing trusted knowledge. Yet this does not disallow reasoning with and drawing conclusions from untrusted information.</p>
+		<p>Intuitively, this semantics can be seen as interpreting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob  :is  :smart}</code> has to be understood as <q>Alice said that Bob is smart</q>, which entails <q>Alice said that Bob is intelligent</q> because that is what Alice means, whether she used the term “smart”, “intelligent”, or “bright”. Neither sentence implies that Alice used these actual words.</p>
 
 		<h4 id="f4" class="formal">Formalization</h4>
 		<p class="issue">This does not take into account blank nodes as graph names.</p>
-		<p>There are several possible formalization of this. One way is to interpret the graph name as denoting a graph that represents all that is true in the context of the named graph. In this case, a dataset-interpretation with respect to an entailment regime <var>E</var> is an <var>E</var>-interpretation such that:</p>
+		<p>There are several possible formalizations of this. One way is to interpret the graph name as denoting a graph, and a named graph pair is true if this graph entails the graph inside the pair. In this case, a dataset-interpretation with respect to an entailment regime <var>E</var> is an <var>E</var>-interpretation such that:</p>
 		<ul>
 			<li>for each named graph pair <var>ng</var> = <var>(n,G)</var>, <var>I(ng)</var> is true if <var>I(n)</var> is an RDF graph and <var>E</var>-entails <var>G</var>;</li>
 			<li>for a dataset <var>D</var> = <var>(DG,NG)</var>, <var>I(D)</var> is true if <var>I(DG)</var> is true and for all named graph <var>ng</var> in <var>NG</var>, <var>I(ng)</var> is true;
@@ -362,7 +374,7 @@
 		<p>but does not RDFS-dataset-entail:</p>
 		<pre class="example">{ }
 :g2 { :chadHurley  rdf:type  :GoogleEmployee }</pre>
-		<p>With this semantics too, graph names can be used in triples:</p>
+		<p>Graph names used in triples that express metadata do not necessarily generate inconsistency:</p>
 		<pre class="example">{ :g1  :validAfter  "2006"^^xsd:gYear .
  :g1  :published  "2013-08-26"^^xsd:date .
  :g2  :validAt  "2005"^^xsd:gYear .}
@@ -372,34 +384,40 @@
 		<p>(here, <code>:validAfter</code> and <code>:validAt</code> are custom terms that do not enforce a formal constraint, but may be used internally for, e.g., checking the temporal validity of triples in the named graph).</p>
 
 		<h4 id="p4" class="prop">Properties of this dataset semantics</h4>
-		<p>This semantics assumes that the truth of named graphs is preserved when replacing the RDF graphs inside named graphs with equivalent graphs. This means in particular, that one can normalise literals and still preserve the truth of a named graph. This means too that standard RDF inferences that can be drawn from the RDF graphs inside named graphs can be added to the graph associated with the graph name without impacting the truth of the RDF dataset.</p>
+		<p>This semantics assumes that the truth of named graphs is preserved when replacing the RDF graphs inside named graphs with equivalent graphs. This means in particular, that one can normalize literals and still preserve the truth of a named graph. This means too that standard RDF inferences that can be drawn from the RDF graphs inside named graphs can be added to the graph associated with the graph name without impacting the truth of the RDF dataset.</p>
 		<p>While this semantics does not guarantee that reasoning with RDF datasets will preserve the exact triples of an original dataset, it is semantically valid to store both the original and any entailed datasets.</p>
 		<p>An example implementation of such a context-based semantics is Sindice [[DELBRU-ET-AL-2008]].</p>
 		
-		<h4 id="v4" class="other">Variants this dataset semantics</h4>
+		<h4 id="v4" class="other">Variants of this dataset semantics</h4>
 		<p>There are several variants of this type of dataset semantics:</p>
 		<ul>
 			<li>The default graph is interpreted as universal truth, that is, for a named graph <var>(n,G)</var>, <var>I(n)</var> <var>E</var>-entails the default graph.</li>
-			<li>The graph name does not denote an RDF graph but a resource associated with an RDF graph. This is similar to saying that the name is interpreted as the intension of the graph, and the actual RDF graph is its extension.</li>
+			<li>The graph name does not denote an RDF graph but a resource associated with an RDF graph.</li>
 			<li>Each named graph could be associated with a distinct <var>E</var>-interpretation and impose all interpretations to be true for their corresponding graph, in order for the dataset to be true.</li>
 		</ul>
 	</section>
 
-	<!--<section>
-		<h3>Each named graph defines its own "context"</h3>
-		<p>Sometimes, the separation of triples into different named graphs is used to indicate truth in different contexts. Each graph describes a "world".</p>
-		<p>In substance, the formalization says that each RDF graph in a dataset is interpreted separately.  This models the fact that different RDF graphs may hold in different contexts.  This way, graphs that have been put in different "named graph pairs" can contradict with each other without making the dataset inconsistent.</p>
-	
-		<h4 class="formal">Formalization</h4>
-		<p>For any entailment regime <var>E</var>, let <var>K(E)</var> be the set of all <var>E</var>-interpretations. A dataset-interpretation with respect to an entailment regime <var>E</var> is a pair <var>(IG,Con)</var> where IG is an <var>E</var>-interpretation and <var>Con</var> is a partial mapping from to </var>K(E)</var>.</p>
-		<p>The truth of a dataset for a dataset-interpretation I = (IG,Con) is defined as follows:</p>
+	<section>
+		<h3 id="boxdataset">Named graphs are in a particular relationship with what the graph name dereferences to</h3>
+		<p>In accordance with linked data principles, an IRI may be assumed to reference the document that is obtained by dereferencing it. If the document contains an RDF graph, it can be assumed that the graph in the named graph is in a special relationship (such as equality or entailment) with this RDF graph.</p>
+		<p>In such a case, the truth of an RDF dataset depends on the state of the Web, and the same dataset may entail different statements at different times.</p>
+
+		<h4 id="hbox" class="formal">Formalization</h4>
+		<p>Let <var>d</var> be the function that maps an IRI to an RDF graph that can be obtained from dereferencing the IRI. For an IRI <var>u</var>, <var>d(u)</var> is empty when dereferencing returns an error or a document that does not encode an RDF graph.</p>
+		<p>A dataset-interpretation <var>I</var> with respect to an entailment regime <var>E</var> is an <var>E</var>-interpretation such that:</p>
 		<ul>
-			<li>for a named graph pair ng = (n,G), I(ng) is true if Con(n) is defined Con(n)(G) is true;</li>
-			<li>for a dataset D = (DG,G), I(D) is true if IG(G) is true and for all named graph ng in NG, I(ng) is true;
-			<li>I(D) is false otherwise.</li>
+			<li>for a named graph pair <var>ng</var> = <var>(n,G)</var>, <var>I(ng)</var> is true if <var>G</var> equals (respectively, is a subgraph of, is entailed by) <var>d(n)</var>;</li>
+			<li>for a dataset <var>D</var> = <var>(DG,NG)</var>, <var>I(D)</var> is true if <var>I(DG)</var> is true and, for every named graph <var>ng</var> in <var>NG</var>, <var>I(ng)</var> is true;</li>
+			<li><var>I(D)</var> is false otherwise.</li>
 		</ul>
-		<p>Following standard definitions, we say that a dataset D1 entails a dataset D2 if all dataset-interpretation I that makes D1 true also makes D2 true.</p>
-	</section>-->
+
+		<h4 id="ex4" class="ex">Examples</h4>
+		<p>Entailments in this semantics depend not only on the content of a dataset but also on the content of the Web and the ability of a reasoner to access this content. Moreover, the entailments vary depending on whether the considered relation is “equals”, “subgraph of”, or “entailed by”.</p>
+		<p>For instance, if the reasoner is offline, then the dereferencing function <var>d</var> in the previous definition always returns an empty graph. In this case, if the relation is “equals” or “subgraph of”, only empty named graphs can be true; if the relation is “entailed by”, then only named graphs containing axiomatic triples can be true. In general, if the relationship is “equals”, named graphs do not provide extra entailments.</p>
+
+		<h4 id="pbox" class="prop">Properties of this dataset semantics</h4>
+		<p>The distinguishing characteristic of this dataset semantics is the fact that a single RDF dataset can lead to different entailments, depending on the state of the Web. This can be seen as a feature for systems that need to be in line with what is found online, but is a drawback for systems that must retain consistency even when they go offline.</p>
+	</section>
 
 	<!--<section>
 		<h3>Named graphs as contexts, and the default graph is universal truth</h3>
@@ -416,7 +434,7 @@
 		<p>This semantics is extending the semantics of RDF rather than simply reusing it.</p>
 
 		<h4 id="f5" class="formal">Formalization</h4>
-		<p>A quad-interpretration is a tuple <var>(IR,IP,IEXT,IS,IL,LV)</var> where <var>IR</var>, <var>IP</var>, <var>IS</var>, <var>IL</var> and <var>LV</var> are defined as in RDF and <var>IEXT</var> is a mapping from <var>IP</var> into the powerset of <var>IR &times; IR union IR &times; IR &times; IR</var>.</p>
+		<p>A quad-interpretation is a tuple <var>(IR,IP,IEXT,IS,IL,LV)</var> where <var>IR</var>, <var>IP</var>, <var>IS</var>, <var>IL</var> and <var>LV</var> are defined as in RDF and <var>IEXT</var> is a mapping from <var>IP</var> into the powerset of <var>IR &times; IR union IR &times; IR &times; IR</var>.</p>
 
 		<p>Since this option modifies the notion of simple-interpretation, which is the basis for all <var>E</var>-interpretations in any entailment regime E, it is not clear how it can be extended to arbitrary entailment regimes. For instance, does the following quad set:</p>
 		<pre class="example">:a  rdf:type  :c  :x .
@@ -425,16 +443,16 @@
 		<pre class="example">:a  rdf:type  :d  :x .</pre>
 		
 		<h4 id="p5" class="prop">Properties of this dataset semantics</h4>
-		<p>With this semantics, all inferences that are valid with normal RDF triples are preserved, but it is necessary to extend RDFS in order to accomodate for ternary relations. There are several existing proposal that extends this quad semantics by dealing with a specific "dimension", such as time, uncertainty, provenance. For instance, temporal RDF [[TEMPORAL-RDF]] use the fourth element to denote a time frame, and reasoning can be performed per time frame. Special semantic rules allow one to combine triples in overlapping time frames. Fuzzy RDF [[FUZZY-RDF]] extends the semantics to deal with uncertainty. stRDF [[ST-RDF]] extends temporal RDF to deal with spatial information. Annotated RDF [[ANNOTATED-RDF]] generalizes the previous proposals.</p>
+		<p>With this semantics, all inferences that are valid with normal RDF triples are preserved, but it is necessary to extend RDFS in order to accommodate ternary relations. There are several existing proposals that extend this quad semantics by dealing with a specific “dimension”, such as time, uncertainty, or provenance. For instance, temporal RDF [[TEMPORAL-RDF]] uses the fourth element to denote a time frame, and reasoning can be performed per time frame. Special semantic rules allow one to combine triples in overlapping time frames. Fuzzy RDF [[FUZZY-RDF]] extends the semantics to deal with uncertainty. stRDF [[ST-RDF]] extends temporal RDF to deal with spatial information. Annotated RDF [[ANNOTATED-RDF]] generalizes the previous proposals.</p>
 	</section>
 
 	<section>
 		<h3 id="quote">Quoted graphs</h3>
-		<p>Quoted graphs are a way to associate information to a specific RDF graph without constraining the relationship between a graph name and the graph associated with it in a dataset. An RDF graph is "quoted" by using a literal having a lexical form that is a syntactic expression of the graph. For instance:</p>
+		<p>Quoted graphs are a way to associate information to a specific RDF graph without constraining the relationship between a graph name and the graph associated with it in a dataset. An RDF graph is “quoted” by using a literal having a lexical form that is a syntactic expression of the graph. For instance:</p>
 		<pre class="example">{ :g  :quotes  ":a  :b  []"^^:turtle . }
 :g { :b  rdf:type  rdf:Property .
  :a  :b  _:x . }</pre>
-		<p>This technique allows one to assume a dataset semantics of contexts (as in Section 3.4) and still preserve an initial version of a graph. However, quoting big graphs may be combursome and would require a custom datatype to be recognized.</p>
+		<p>This technique allows one to assume a dataset semantics of contexts (as in Section 3.4) and still preserve an initial version of a graph. However, quoting big graphs may be cumbersome and would require a custom datatype to be recognized.</p>
 
 	</section>
 
@@ -456,7 +474,7 @@
     GRAPH :g2 { :y  rdf:type  :d }
 }</pre>
 		<p>would answer <code>false</code>.</p>
-		<p>This can lead to a classification of dataset semantics in terms of whether they are compatible with SPARQL ASK queries or not. It can be noted that a semantics where each named graph defines its own context is "SPARQL-ASK-compatible", while a semantics where the graph name denotes the graph or named graph is not compatible in this sense.</p>
+		<p>This can lead to a classification of dataset semantics in terms of whether they are compatible with SPARQL ASK queries or not. It can be noted that a semantics where each named graph defines its own context is “SPARQL-ASK-compatible”, while a semantics where the graph name denotes the graph or named graph is not compatible in this sense.</p>
 	</section>
 </section>
 
@@ -464,7 +482,7 @@
 	<h2>Declaring the intended semantics</h2>
 	
 	<p>The RDF Working Group did not define a formal semantics for a multiple graph data model because none of the semantics presented above could obtain consensus. Choosing any one of these proposals would have gone against some deployed implementations. Therefore, the Working Group discussed the possibility of defining several semantics, among which an implementation could choose, and of providing the means to declare which semantics is adopted.</p>
-	<p>This was not retained eventually, because of the lack of experience, and potentially the lack of utility, so there is no definite option for this. Nonetheless, for completeness, we describe here possible solutions.</p>
+	<p>This was not retained eventually, because of the lack of experience, so there is no definite option for this. Nonetheless, for completeness, we describe here possible solutions.</p>
 
 	<h3 id="using-vocab">Using vocabularies</h3>
 	<p>A dataset can be described in RDF using vocabularies like voiD [[VOID]] and the SPARQL service description vocabulary [[SPARQL11-SERVICE-DESCRIPTION]]. VoiD is used to describe how a collection of RDF triples is organized in a web site or across web sites, giving information about the size of the datasets, the location of the dump files, the IRI of the query endpoints, and so on. The notion of dataset in voiD is used as a more informal and broader concept than RDF dataset. However, an RDF dataset and the graphs in it can be described as voiD datasets, and the information can be completed with SPARQL service description.</p>
@@ -488,8 +506,10 @@
 <section class="appendix informative" id="changes">
   <h2>Changes</h2>
   <ul>
+	<li>2013-12-12:  Added the semantics for Sandro's box dataset.</li>
+	<li>2013-12-04:  Addressed Pat’s comments.</li>
 	<li>2013-09-17:  All sections revised and document completed. Many improvements.</li>
-	<li>2013-01-28:  Initial editor's draft.</li>
+	<li>2013-01-28:  Initial editor’s draft.</li>
   </ul>
 </section>