--- a/rdf-dataset/index.html Fri Feb 21 15:38:42 2014 +0000
+++ b/rdf-dataset/index.html Fri Feb 21 18:34:07 2014 +0100
@@ -221,6 +221,27 @@
<p>The first item above reflects the indication given in [[RDF11-MT]] (Section <a href="http://www.w3.org/TR/rdf11-mt/#rdf-datasets">"RDF Datasets"</a>) with respect to dataset semantics: <q>a dataset SHOULD be understood to have at least the same content as its default graph</q>.</p>
<p>The dependency on RDF semantics is such that most of the dataset semantics below reuse RDF semantics as a black box. More precisely, it is not necessary to be specific about how truth of RDF graphs is defined as long as there is a notion of interpretation that determines the truth of a set of triples. In fact, RDF Semantics does not define a single formal semantics, but multiple ones, depending on what standard vocabularies are endorsed by an application (such as the RDF, RDFS, XSD vocabularies). Consequently, we parameterize most of the definitions below with an unspecified entailment regime <var>E</var>. RDF 1.1 defines the following entailment regimes: simple entailment, D-entailment, RDF-entailment, RDFS-entailment. Additionally, OWL defines two other entailment regimes, based on the OWL 2 direct semantics [[OWL2-DIRECT-SEMANTICS]] and the OWL 2 RDF-based semantics [[OWL2-RDF-BASED-SEMANTICS]].</p>
<p>For an entailment regime <var>E</var>, we will say <var>E</var>-interpretation, <var>E</var>-entailment, <var>E</var>-equivalence, <var>E</var>-consistency to describe the notions of interpretations, entailment, equivalence and consistency associated with the regime <var>E</var>. Similarly, we will use the terms dataset-interpretation, dataset-entailment, dataset-equivalence, dataset-consistency for the corresponding notions in dataset semantics.</p>
+ <p>This document provides examples in TriG [[TRIG]] and assumes that the following prefixes are defined:</p>
+ <table class="simple">
+ <caption>Namespace prefixes and IRIs used in this document</caption>
+ <tr>
+ <th>Namespace prefix</th>
+ <th>Namespace IRI</th>
+ </tr>
+ <tr>
+ <td><code>rdf</code></td>
+ <td><code>http://www.w3.org/1999/02/22-rdf-syntax-ns#</code></td>
+ </tr>
+ <tr><td><code>rdfs</code></td>
+ <td><code>http://www.w3.org/2000/01/rdf-schema#</code></td>
+ </tr>
+ <tr><td><code>xsd</code></td>
+ <td><code>http://www.w3.org/2001/XMLSchema#</code></td>
+ </tr>
+ <tr><td><code>ex</code></td>
+ <td><code>http://example.org/voc#</code></td>
+ </tr>
+ </table>
<section>
<h3 id="no-meaning">Named graphs have no meaning</h3>
@@ -233,22 +254,22 @@
<h4 id="ex1" class="ex">Examples of entailment and non-entailments</h4>
<p>Consider the following dataset:</p>
- <pre class="example">{ :s :p :o . }
-:g1 { :a :b :c }</pre>
+ <pre class="example">{ ex:s ex:p ex:o . }
+ex:g1 { ex:a ex:b ex:c }</pre>
<p>does not dataset-entail:</p>
- <pre class="example">{ :s :p :o .
-:a :b :c .}</pre>
+ <pre class="example">{ ex:s ex:p ex:o .
+ ex:a ex:b ex:c .}</pre>
<p>but dataset-entails:</p>
<pre class="example">{} # empty default graph
-:g2 { :x :y :z }</pre>
+ex:g2 { ex:x ex:y ex:z }</pre>
<p>Since graph names are not particularly constrained, one can use them in triples, for instance:</p>
- <pre class="example">{ :g1 :author :Bob .
- :g1 :created "2013-09-17"^^xsd:date .}
-:g1 { :a :b :c }</pre>
+ <pre class="example">{ ex:g1 ex:author ex:Bob .
+ ex:g1 ex:created "2013-09-17"^^xsd:date .}
+ex:g1 { ex:a ex:b ex:c }</pre>
<p>but it would dataset-entail:</p>
- <pre class="example">{ :g1 :author :Bob .
- :g1 :created "2013-09-17"^^xsd:date .}
-:g1 { :x :y :z }</pre>
+ <pre class="example">{ ex:g1 ex:author ex:Bob .
+ ex:g1 ex:created "2013-09-17"^^xsd:date .}
+ex:g1 { ex:x ex:y ex:z }</pre>
<h4 id="p1" class="pro">Properties of this dataset semantics</h4>
<p>Assuming this semantics is convenient since it merely ignores named graphs in a dataset for any reasoning task. As a result, datasets can be simply treated as regular RDF graphs by extracting the default graph. Named graphs can still be used to preserve useful information, but it bears no more meaning than a commentary in a program source code.</p>
@@ -269,15 +290,15 @@
<h4 id="e2" class="ex">Examples</h4>
<p>Consider the following dataset:</p>
- <pre class="example">{ :s :p :o . } # default graph
-:g1 { :a :b :c }</pre>
+ <pre class="example">{ ex:s ex:p ex:o . } # default graph
+ex:g1 { ex:a ex:b ex:c }</pre>
<p>dataset-entails:</p>
- <pre class="example">{ :s :p :o .
-:a :b :c .}</pre>
+ <pre class="example">{ ex:s ex:p ex:o .
+ ex:a ex:b ex:c .}</pre>
<p>If the entailment regime <var>E</var> is RDFS with the recognized datatype <code>xsd:integer</code>, then the following RDF dataset is RDFS-dataset-inconsistent:</p>
<pre class="example">{ } # empty default graph
-:g1 { :age rdfs:range xsd:integer . }
-:g2 { :bob :age "twenty" .}</pre>
+ex:g1 { ex:age rdfs:range xsd:integer . }
+ex:g2 { ex:bob ex:age "twenty" .}</pre>
<h4 id="p2" class="prop">Properties of this dataset semantics</h4>
<p>This semantics allows one to partition the triples of an RDF graph into multiple named graphs for easier data management, yet retaining the meaning of the overall RDF graph. Note that this choice of semantics does not impact the way graph names are interpreted: it is possible to further constrain the graph names to denote the RDF graph associated with it, or other possible constraints. The possible interpretations of graph names, and their consequences, are presented in the next sections.</p>
@@ -288,7 +309,7 @@
<section>
<h3 id="naming">The graph name denotes the named graph or the graph</h3>
<p>It is common to use the graph name as a way to identify the RDF graph inside the named graphs, or rather, to identify a particular occurrence of the graph. This allows one to describe the graph or the graph source in triples. For instance, one may want to say who the creator of a particular occurrence of a graph is. Assuming this semantics for graph names amounts to say that each named graph pair is an assertion that sets the <a href="http://www.w3.org/TR/rdf11-concepts/#dfn-referent">referent</a> of the graph name to be the associated graph or named graph pair.</p>
- <p>Intuitively, this semantics can be seen as quoting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob :is :smart}</code> has to be understood as <q>Alice said: “Bob is smart”</q> which does not entail <q>Alice said: “Bob is intelligent”</q> because Alice did not use the word “intelligent”, even though “smart” and “intelligent” can be understood as equivalent. Note, however, that this analogy is only valid insofar as it can provide an intuition of this type of semantics, but the formalization does not actually refer to speech and the act of asserting.</p>
+ <p>Intuitively, this semantics can be seen as quoting the RDF graphs inside the named graphs. In this sense, <code>ex:alice {ex:bob ex:is ex:smart}</code> has to be understood as <q>Alice said: “Bob is smart”</q> which does not entail <q>Alice said: “Bob is intelligent”</q> because Alice did not use the word “intelligent”, even though “smart” and “intelligent” can be understood as equivalent. Note, however, that this analogy is only valid insofar as it can provide an intuition of this type of semantics, but the formalization does not actually refer to speech and the act of asserting.</p>
<h4 id="f3" class="formal">Formalization</h4>
<p>In order to be consistent with RDF model theory, blank nodes used as graph names are treated like existential variables. Consequently, their semantics is formalized according to the same notation presented in [[RDF11-MT]]:</p>
@@ -301,29 +322,29 @@
<h4 id="ex3" class="ex">Examples</h4>
<p>Consider the following dataset:</p>
<pre class="example">{ } # empty default graph
-:g1 { :a :b :c }
-:g2 { :x :y :z }</pre>
+ex:g1 { ex:a ex:b ex:c }
+ex:g2 { ex:x ex:y ex:z }</pre>
<p>dataset-entails:</p>
<pre class="example">{ }
-_:b { :a :b :c }
-:g2 { :x :y :z }</pre>
+_:b { ex:a ex:b ex:c }
+ex:g2 { ex:x ex:y ex:z }</pre>
<p>but does not dataset-entail:</p>
<pre class="example">{ }
-:g1 { [] :b :c }
-:g2 { :x :y :z }</pre>
+ex:g1 { [] ex:b ex:c }
+ex:g2 { ex:x ex:y ex:z }</pre>
<p>nor:</p>
<pre class="example">{ }
-:g1 { }</pre>
+ex:g1 { }</pre>
<p>If the entailment regime <var>E</var> is RDFS with the recognized datatype <code>xsd:integer</code>, then the following RDF dataset is RDFS-dataset-inconsistent:</p>
- <pre class="example">{ :age rdfs:range xsd:integer .
-:me :age :g1 . } # default graph
-:g1 { :s :p :o }</pre>
- <p>The graph name can be used in triples to attached metadata (here <code>:hasNextVersion</code> is a custom term that does not enforce a formal constraint, so it is up to the implementation to decide how to treat it):</p>
- <pre class="example">{ :g1 :published "2013-08-26"^^xsd:date .
- :g1 :hasNextVersion :g2 .}
-:g1 { :s1 :p1 :o1 .
- :s2 :p2 :o2 }
-:g2 { :s1 :p1 :o1 }</pre>
+ <pre class="example">{ ex:age rdfs:range xsd:integer .
+ ex:me ex:age ex:g1 . } # default graph
+ex:g1 { ex:s ex:p ex:o }</pre>
+ <p>The graph name can be used in triples to attach metadata (here <code>ex:hasNextVersion</code> is a custom term that does not enforce a formal constraint, so it is up to the implementation to decide how to treat it):</p>
+ <pre class="example">{ ex:g1 ex:published "2013-08-26"^^xsd:date .
+ ex:g1 ex:hasNextVersion ex:g2 .}
+ex:g1 { ex:s1 ex:p1 ex:o1 .
+ ex:s2 ex:p2 ex:o2 }
+ex:g2 { ex:s1 ex:p1 ex:o1 }</pre>
<h4 id="p3" class="prop">Properties of this dataset semantics</h4>
<p>There are important implications with this semantics. In this case, a named graph pair can only entail itself or a graph that is structurally equivalent if the graph name is a blank node. Graph names have to be handled almost like literals. Unlike other IRIs or blank nodes, their denotation is strictly fixed, like literals are. This means that graph IRIs may possibly clash with constraints on datatypes, as in the example above.</p>
@@ -333,7 +354,7 @@
<section>
<h3 id="context">Each named graph defines its own context</h3>
<p>Named graphs in RDF datasets are sometimes used to delimit a context in which the triples of the named graphs are true. From the truth of these triples according to the graph semantics, follows the truth of the named graph pair. An example of such situation occurs when one wants to keep track of the evolution of facts with time. Another example is when one wants to allow different viewpoints to be expressed and reasoned with, without creating a conflict or inconsistency. By having inferences done at the named graph level, one can prevent for instance that triples coming from untrusted parties are influencing trusted knowledge. Yet it does not disallow reasoning with and drawing conclusions from untrusted information.</p>
- <p>Intuitively, this semantics can be seen as interpreting the RDF graphs inside the named graphs. In this sense, <code>:alice {:bob :is :smart}</code> has to be understood as <q>Alice said that Bob is smart</q> which entails <q>Alice said that Bob is intelligent</q> because it is what Bob means, whether he used the term “smart”, “intelligent”, or “bright”. Neither sentence implies that Alice used these actual words.</p>
+ <p>Intuitively, this semantics can be seen as interpreting the RDF graphs inside the named graphs. In this sense, <code>ex:alice {ex:bob ex:is ex:smart}</code> has to be understood as <q>Alice said that Bob is smart</q> which entails <q>Alice said that Bob is intelligent</q> because it is what Alice means, whether she used the term “smart”, “intelligent”, or “bright”. Neither sentence implies that Alice used these actual words.</p>
<h4 id="f4" class="formal">Formalization</h4>
<p>There are several possible formalizations of this leading to similar entailments. One way is to interpret the graph name as denoting a graph, and a named graph pair is true if this graph entails the graph inside the pair. In this case, a dataset-interpretation with respect to an entailment regime <var>E</var> is an <var>E</var>-interpretation such that:</p>
@@ -346,23 +367,23 @@
<h4 id="ex4" class="ex">Examples</h4>
<p>Consider the following dataset:</p>
<pre class="example">{ } # empty default graph
-:g1 { :YoutubeEmployee rdfs:subClassOf :GoogleEmployee .
-:steveChen rdf:type :YoutubeEmployee . }
-:g2 { :chadHurley rdf:type :YoutubeEmployee }</pre>
+ex:g1 { ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee .
+ ex:steveChen rdf:type ex:YoutubeEmployee . }
+ex:g2 { ex:chadHurley rdf:type ex:YoutubeEmployee }</pre>
<p>RDFS-dataset-entails:</p>
<pre class="example">{ }
-:g1 { :steveChen rdf:type :GoogleEmployee }</pre>
+ex:g1 { ex:steveChen rdf:type ex:GoogleEmployee }</pre>
<p>but does not RDFS-dataset-entail:</p>
<pre class="example">{ }
-:g2 { :chadHurley rdf:type :GoogleEmployee }</pre>
+ex:g2 { ex:chadHurley rdf:type ex:GoogleEmployee }</pre>
<p>Graph names used in triples that express metadata do not necessarily generate inconsistency:</p>
- <pre class="example">{ :g1 :validAfter "2006"^^xsd:gYear .
- :g1 :published "2013-08-26"^^xsd:date .
- :g2 :validAt "2005"^^:xsd:gYear .}
-:g1 { :YoutubeEmployee rdfs:subClassOf :GoogleEmployee .
-:steveChen rdf:type :YoutubeEmployee . }
-:g2 { :chadHurley rdf:type :YoutubeEmployee }</pre>
- <p>(here, <code>:validAfter</code> and <code>:validAt</code> are custom terms that do not enforce a formal constraint, but may be used internally for, e.g., checking the temporal validity of triples in the named graph).</p>
+ <pre class="example">{ ex:g1 ex:validAfter "2006"^^xsd:gYear .
+ ex:g1 ex:published "2013-08-26"^^xsd:date .
+ ex:g2 ex:validAt "2005"^^xsd:gYear .}
+ex:g1 { ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee .
+ ex:steveChen rdf:type ex:YoutubeEmployee . }
+ex:g2 { ex:chadHurley rdf:type ex:YoutubeEmployee }</pre>
+ <p>(here, <code>ex:validAfter</code> and <code>ex:validAt</code> are custom terms that do not enforce a formal constraint, but may be used internally for, e.g., checking the temporal validity of triples in the named graph).</p>
<h4 id="p4" class="prop">Properties of this dataset semantics</h4>
<p>This semantics assumes that the truth of named graphs is preserved when replacing the RDF graphs inside named graphs with equivalent graphs. This means in particular, that one can normalize literals and still preserve the truth of a named graph. This means too that standard RDF inferences that can be drawn from the RDF graphs inside named graphs can be added to the graph associated with the graph name without impacting the truth of the RDF dataset.</p>
@@ -418,10 +439,10 @@
<p>A quad-interpretation is a tuple (<var>IR</var>,<var>IP</var>,<var>IEXT</var>,<var>IS</var>,<var>IL</var>,<var>LV</var>) where <var>IR</var>, <var>IP</var>, <var>IS</var>, <var>IL</var> and <var>LV</var> are defined as in RDF and <var>IEXT</var> is a mapping from <var>IP</var> into the powerset of <var>IR</var> × <var>IR</var> union <var>IR</var> × <var>IR</var> × <var>IR</var>.</p>
<p>Since this option modifies the notion of simple-interpretation, which is the basis for all <var>E</var>-interpretations in any entailment regime E, it is not clear how it can be extended to arbitrary entailment regimes. For instance, does the following quad set:</p>
- <pre class="example">:a rdf:type :c :x .
-:c rdfs:subClassOf :d :x .</pre>
+ <pre class="example">ex:a rdf:type ex:c ex:x .
+ex:c rdfs:subClassOf ex:d ex:x .</pre>
<p>RDFS-dataset-entails:</p>
- <pre class="example">:a rdf:type :d :x .</pre>
+ <pre class="example">ex:a rdf:type ex:d ex:x .</pre>
<h4 id="p5" class="prop">Properties of this dataset semantics</h4>
<p>With this semantics, all inferences that are valid with normal RDF triples are preserved, but it is necessary to extend RDFS in order to accommodate for ternary relations. There are several existing proposals that extend this quad semantics by dealing with a specific “dimension”, such as time, uncertainty, provenance. For instance, temporal RDF [[TEMPORAL-RDF]] uses the fourth element to denote a time frame and thus allow reasoning to be performed per time frame. Special semantic rules allow one to combine triples in overlapping time frames. Fuzzy RDF [[FUZZY-RDF]] extends the semantics to deal with uncertainty. stRDF [[ST-RDF]] extends temporal RDF to deal with spatial information. Annotated RDF [[ANNOTATED-RDF]] generalizes the previous proposals.</p>
@@ -430,9 +451,9 @@
<section>
<h3 id="quote">Quoted graphs</h3>
<p>Quoted graphs are a way to associate information to a specific RDF graph without constraining the relationship between a graph name and the graph associated with it in a dataset. An RDF graph is “quoted” by using a literal having a lexical form that is a syntactic expression of the graph. For instance:</p>
- <pre class="example">{ :g :quotes ":a :b []"^^:turtle . }
-:g { :b rdf:type rdf:Property .
- :a :b _:x . }</pre>
+ <pre class="example">{ ex:g ex:quotes "ex:a ex:b []"^^ex:turtle . }
+ex:g { ex:b rdf:type rdf:Property .
+ ex:a ex:b _:x . }</pre>
<p>This technique allows one to assume a dataset semantics of contexts (as in Section 3.4) and still preserve an initial version of a graph. However, quoting big graphs may be cumbersome and would require a custom datatype to be recognized.</p>
</section>
@@ -442,17 +463,17 @@
<p>There is a strong relationship between SPARQL ASK queries with an entailment regime [[SPARQL11-ENTAILMENT]] and inferences in the regime. If an ASK query does not contain variables and its WHERE clause only contains a basic graph pattern, then the query can be seen as an RDF graph. If such a graph query <var>Q</var> returns <code>true</code> when issued against an RDF graph <var>G</var> with entailment regime <var>E</var>, then <var>G</var> <var>E</var>-entails <var>Q</var>. If it returns <code>false</code>, then <var>G</var> does not <var>E</var>-entail <var>Q</var>.</p>
<p>A dataset semantics can also be compared to what ASK queries return when they do not contain variables but may contain basic graph patterns or graph graph patterns. For instance, consider the dataset:</p>
<pre class="example">{ }
-:g1 { :x rdf:type :c .
- :c rdfs:subClassOf :d . }
-:g2 { :y rdf:type :c . }</pre>
+ex:g1 { ex:x rdf:type ex:c .
+ ex:c rdfs:subClassOf ex:d . }
+ex:g2 { ex:y rdf:type ex:c . }</pre>
<p>Then the query:</p>
<pre class="example">ASK WHERE {
- GRAPH :g1 { :x rdf:type :d }
+ GRAPH ex:g1 { ex:x rdf:type ex:d }
}</pre>
<p>with RDFS entailment regime would answer <code>true</code>, but the query:</p>
<pre class="example">ASK WHERE {
- GRAPH :g1 { :x rdf:type :d }
- GRAPH :g2 { :y rdf:type :d }
+ GRAPH ex:g1 { ex:x rdf:type ex:d }
+ GRAPH ex:g2 { ex:y rdf:type ex:d }
}</pre>
<p>would answer <code>false</code>.</p>
<p>This can lead to a classification of dataset semantics in terms of whether they are compatible with SPARQL ASK queries or not. It can be noted that a semantics where each named graph defines its own context is “SPARQL-ASK-compatible”, while a semantics where the graph name denotes the graph or named graph is not compatible in this sense.</p>
@@ -496,6 +517,7 @@
<section class="appendix" id="changes">
<h2>Changes since the first public working draft of 17 December 2013</h2>
<ul>
+ <li>2014-02-21: Defined prefixes and updated TriG examples.</li>
<li>2014-02-19: Updated references to [[SPARQL11-QUERY]].</li>
<li>2014-02-19: Removed Issue 2, adding support for blank nodes in the formalization of the semantics.</li>
<li>2014-02-19: Removed Issue 1 and added a sentence to mitigate the issue.</li>