The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web.
This document defines a textual syntax for RDF called Turtle that allows an RDF graph to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. Turtle provides levels of compatibility with the existing N-Triples format as well as the triple pattern syntax of the SPARQL W3C Recommendation.
Turtle is already a reasonably settled serialization of RDF. Many implementations of Turtle already exist, and we are hoping for feedback from those implementers and from others who decide that now is a good time to support Turtle. A few rough edges still need polishing, including better alignment with the SPARQL triple patterns. The working group does not expect to make any large changes to the existing syntax.
This document defines Turtle, the Terse RDF Triple Language, a concrete syntax for RDF as defined in the RDF Concepts and Abstract Syntax ([[!RDF-CONCEPTS]]) W3C Recommendation. Turtle is an extension of N-Triples ([[!N-TRIPLES]]) carefully taking the most useful and appropriate things added from Notation 3 ([[N3]]) while staying within the RDF model.
ISSUE-4: A future version of this document is expected to define N-Triples.
The Turtle grammar for triples is a subset of the SPARQL Query Language for RDF [[RDF-SPARQL-QUERY]] grammar for TriplesBlock. The two grammars share production and terminal names where possible.
This section is informative. The Turtle Syntax and Turtle Grammar sections formally define the language.
A Turtle document allows writing down an RDF graph in a compact
textual form. It consists of a sequence of directives, triple-generating
statements or blank lines. Comments may be given after a #
that is not part of another lexical token
and continue to the end of the line.
Simple triples are a sequence of (subject, predicate, object) terms, separated by whitespace and terminated by '.' after each triple. This corresponds to N-Triples [[N-TRIPLES]].
There are three types of RDF Term: Internationalized Resource Identifiers (IRIs for short), literals and blank nodes.
IRIs are written enclosed in '<' and '>' and may be absolute RDF URI References or relative to the current base IRI (described below).
IRIs may also be abbreviated by using Turtle's @prefix directive, which declares a short prefix name for a long prefix shared by many IRIs. This is useful for the many RDF vocabularies whose terms are all defined under a common namespace-like IRI.
While @prefix works somewhat like XML namespaces, the restrictions from XML QNames do NOT apply: leg:3032571 is a perfectly fine prefixed name.
Once a prefix such as @prefix foo: <http://example.org/ns#> is defined, any later mention of an IRI in the document may use a prefixed name that starts with foo: to stand for the longer IRI. For example, the prefixed name foo:bar is shorthand for the IRI http://example.org/ns#bar.
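The expansion described above can be sketched in a few lines of Python. This is an informative illustration only, not part of the grammar; the function name `expand_prefixed_name` and the dictionary-based prefix map are assumptions made for the example.

```python
def expand_prefixed_name(name, prefixes):
    """Expand a prefixed name such as 'foo:bar' into a full IRI string.

    `prefixes` is a hypothetical map from prefix name to namespace IRI,
    standing in for the declarations made by @prefix directives.
    """
    # Split at the first ':' only; everything after it is the local part.
    prefix, sep, local = name.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {name!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    # The IRI is the namespace IRI concatenated with the local part.
    return prefixes[prefix] + local


prefixes = {"foo": "http://example.org/ns#"}
print(expand_prefixed_name("foo:bar", prefixes))  # http://example.org/ns#bar
```

Because the local part is simply concatenated, no XML QName restrictions come into play, which is why a local part that starts with a digit is acceptable.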
Literals are written either using double-quotes when they do not
contain linebreaks like "simple literal"
or
"""long literal"""
when they may contain linebreaks.
Literals have either a language suffix or a datatype IRI
but not both. Languages are indicated by appending the simple
literal with @
and the language tag. Datatype IRIs
similarly append ^^
followed by any legal IRI form (full
or prefixed) as described above to give the datatype IRI. Literals
may be written without either a language tag or a datatype IRI as a
shortcut for a literal with the type xsd:string
.
The "That Seventies Show"
above is equivalent to "That Seventies Show"^^xsd:string
.
ISSUE-12 The RDF Working Group is currently examining a simplification of RDF which considers plain literals with no language tag to be literals with a datatype xsd:string
.
Blank nodes are written as _:BLANK_NODE_LABEL to provide a blank node from the given BLANK_NODE_LABEL. A generated blank node may also be made with [], which is useful to provide the subject of RDF triples for each pair from the predicateObjectList or the root of the collection.
Literals, prefixed names and IRIs may also contain escapes to encode surrounding syntax, non-printable characters and Unicode characters by codepoint number (although these may also be given directly, encoded as UTF-8). The character escapes are:
\t (U+0009, tab)
\n (U+000A, linefeed)
\r (U+000D, carriage return)
\" (U+0022, double quote - only allowed inside strings)
\> (U+003E, greater than - only allowed inside IRI_REFs)
\\ (U+005C, backslash)
\uHHHH or \UHHHHHHHH for writing Unicode characters by hexadecimal codepoint, where H is a single hexadecimal digit.
See the String escapes section for full details.
ISSUE 67 The inclusion of escape sequences in prefixed names is undecided.
The current base IRI may be altered in a Turtle document using the
@base
directive. It allows further abbreviation of
IRIs but is usually for simplifying the IRIs in the data, where
the prefix directives are for vocabularies that describe the data.
Whenever this directive appears, it defines the base IRI against which all relative IRIs are resolved. That includes IRIs, qualified names, prefix directives and later base directives.
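As an informative sketch, relative IRI resolution can be approximated with Python's standard `urllib.parse.urljoin`. The helper name `resolve` and the example.org IRIs are assumptions for illustration, and `urljoin` is used here as a stand-in for the resolution algorithm the specification references rather than as a normative implementation.

```python
from urllib.parse import urljoin


def resolve(base_iri, reference):
    """Resolve a possibly relative IRI reference against the current base IRI."""
    return urljoin(base_iri, reference)


# Hypothetical starting base IRI for the document.
base = "http://example.org/data/"

# A relative IRI such as <item1> is resolved against the current base.
print(resolve(base, "item1"))      # http://example.org/data/item1

# A later directive such as @base <sub/> is itself resolved against the previous base.
base = resolve(base, "sub/")
print(base)                        # http://example.org/data/sub/
print(resolve(base, "../other"))   # http://example.org/data/other
```

Absolute IRIs pass through unchanged, so the same helper can be applied to every IRI a parser encounters.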
The token a is equivalent to the IRI <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>.
Decimal integers may be written directly and correspond to the XML Schema Datatype xsd:integer in both syntax and datatype IRI.
Decimal floating point double/fixed precision numbers may be written directly and correspond to the XML Schema Datatype xsd:double in both syntax and datatype IRI.
Decimal floating point arbitrary precision numbers may be written directly and correspond to the XML Schema Datatype xsd:decimal in both syntax and datatype IRI.
Booleans may be written directly as true or false and correspond to the XML Schema Datatype xsd:boolean in both syntax and datatype IRI.
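The mapping from these bare tokens to datatype IRIs can be sketched as follows. This is informative only: the regular expressions are assumptions that approximate the numeric lexical forms and are not the normative grammar productions, and the name `shorthand_datatype` is invented for the example.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed lexical patterns for the abbreviated literal forms (illustrative only).
_PATTERNS = [
    ("integer", re.compile(r"[+-]?[0-9]+")),
    ("decimal", re.compile(r"[+-]?[0-9]*\.[0-9]+")),
    ("double", re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+")),
    ("boolean", re.compile(r"true|false")),
]


def shorthand_datatype(token):
    """Return the datatype IRI for a bare numeric or boolean token, else None."""
    for local_name, pattern in _PATTERNS:
        # fullmatch ensures the whole token has the lexical form, not just a prefix.
        if pattern.fullmatch(token):
            return XSD + local_name
    return None


for token in ["-5", "4.2", "1.3e2", "true"]:
    print(token, shorthand_datatype(token))
```

In each case the lexical form of the resulting literal is the token itself; only the datatype IRI is inferred.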
The ,
symbol may be used to repeat the subject and
predicate of triples that only differ in the object RDF term.
The ;
symbol may be used to repeat the subject of
triples that vary only in predicate and object RDF terms.
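The effect of the two abbreviations can be sketched as a small expansion routine. This is an informative illustration; the function name `expand_statement`, the nested-list input shape and the ex:/foaf: terms are assumptions for the example, with terms held as plain strings.

```python
def expand_statement(subject, predicate_object_list):
    """Expand one abbreviated statement into a list of full triples.

    `predicate_object_list` is a list of (predicate, [objects]) pairs:
    pairs correspond to the parts separated by ';' and the objects within
    a pair correspond to the terms separated by ','.
    """
    triples = []
    for predicate, objects in predicate_object_list:
        for obj in objects:
            # Every object yields one triple that repeats the subject and predicate.
            triples.append((subject, predicate, obj))
    return triples


# Models: ex:alice foaf:knows ex:bob , ex:carol ; foaf:name "Alice" .
for triple in expand_statement(
    "ex:alice",
    [("foaf:knows", ["ex:bob", "ex:carol"]), ("foaf:name", ['"Alice"'])],
):
    print(*triple, ".")
```

The output is three simple triples, one per object, which is the form the abbreviations stand for.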
An RDF Collection may be abbreviated using a sequence of RDF Terms enclosed in ( ) brackets. Whitespace may be used to separate them, as usual. This format provides a blank node at the start of the RDF Collection, which may be used in further abbreviations.
See section Collections for the details on the long form of the generated triples.
Turtle is a language for an RDF graph, a set of RDF triples. An RDF graph is composed of URI references (now interpreted as IRIs), literals and blank nodes.
The Turtle syntax for IRIs is identical to that of SPARQL Query, including the use of
prefix
and base
directives, though these are spelled
@prefix
and @base
respectively in Turtle.
Per RFC3986 section 5.1.1 [[!RFC3986]], parsing begins with a context-defined In-Scope Base URI. Each @base directive sets a new In-Scope Base URI, relative to the previous one. @prefix directives map a local name to an IRI, also resolved against the current In-Scope Base URI. Subsequent @prefix directives may re-map the same local name.
Turtle IRI syntax, including relative IRI resolution, is defined by SPARQL Query section 4.1.1 (noting the different spellings of the PREFIX
and BASE
keywords).
Example (test-30.ttl) with document base IRI http://www.w3.org/2001/sw/DataAccess/df1/tests/
encodes the following N-Triples (test-30.out):
The Turtle syntax for literals and blank nodes is defined by SPARQL Query section 4.1.2 and SPARQL Query section 4.1.4 respectively.
A Turtle document is a Unicode [[!UNICODE]] character string encoded in UTF-8. Only Unicode codepoints in the range U+0 to U+10FFFF inclusive are allowed.
White space (production ws) is used to separate two tokens which would otherwise be (mis-)recognized as one token.
Comments in Turtle take the form of '#', outside an IRI_REF or strings, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.
Turtle strings and IRIs can use \
-escape sequences to
represent Unicode code points.
The following table describes all the escapes allowed inside a string or IRI_REF:
Escape | Unicode code point |
---|---|
'\u' HEX HEX HEX HEX | A Unicode codepoint in the range U+0 to U+FFFF inclusive corresponding to the encoded hexadecimal value. |
'\U' HEX HEX HEX HEX HEX HEX HEX HEX | A Unicode codepoint in the range U+10000 to U+10FFFF inclusive corresponding to the encoded hexadecimal value. |
'\t' | U+0009 |
'\n' | U+000A |
'\r' | U+000D |
'\"' (inside string) | U+0022 |
'\>' (inside IRI_REF only) | U+003E |
'\\' | U+005C |
where HEX is a hexadecimal character:
HEX ::= [0-9] | [A-F] | [a-f]
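A minimal sketch of the unescaping step, written against the table above, is shown here for illustration. The function name `unescape` is an assumption, and the sketch deliberately simplifies: it does not distinguish between strings and IRI_REFs, so it accepts both \" and \> everywhere, and it leaves unrecognised backslash sequences untouched rather than reporting an error.

```python
import re

# Single-character escapes from the table, keyed by the character after the backslash.
_SIMPLE = {"t": "\t", "n": "\n", "r": "\r", '"': '"', ">": ">", "\\": "\\"}

# A backslash followed by uHHHH, UHHHHHHHH, or one of the simple escape characters.
_ESCAPE = re.compile(r'\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|[tnr">\\])')


def unescape(text):
    """Replace the escape sequences in `text` with the code points they denote."""

    def replace(match):
        body = match.group(1)
        if body[0] in "uU":
            # Numeric escape: the hexadecimal digits give the code point.
            # chr() raises ValueError for values above U+10FFFF.
            return chr(int(body[1:], 16))
        return _SIMPLE[body]

    return _ESCAPE.sub(replace, text)


print(unescape(r"caf\u00E9"))  # café
```

Scanning left to right in a single pass means an escaped backslash is consumed before the characters after it are examined, so the input `\\u0041` yields a backslash followed by `u0041` rather than the letter A.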
The EBNF used here is defined in XML 1.0 (Third Edition) [[!EBNF-NOTATION]]. Production labels consisting of a number and a final 's', e.g. [60s], refer to the production with that number in the SPARQL Query Language for RDF grammar [[RDF-SPARQL-QUERY]].
There are known formatting issues with the table form of the grammar. Please see turtle.bnf for the exact grammar.
The RDF Concepts and Abstract Syntax ([[!RDF-CONCEPTS]]) specification defines three types of RDF Term:
RDF URI References (here called IRIs),
literals and
blank nodes.
Literals are composed of a lexical form and an optional language tag or datatype IRI.
An extra type, prefix, is used during parsing to map string identifiers to namespace IRIs.
This section maps a string conforming to the grammar in section 4.4 to a set of triples by mapping strings matching productions and lexical tokens to RDF terms or their components (e.g. language tags, lexical forms of literals). Some productions change the parser state (base or prefix declarations).
Parsing Turtle requires a state of five items:
baseURI: When the base production is reached, the second rule argument, IRI_REF, is the base URI used for relative IRI resolution (test: base1 base2).
namespaces: The second and third rule arguments (PNAME_NS and IRI_REF) in the prefixID production assign a namespace name (IRI_REF) for the prefix (PNAME_NS). Outside of a prefixID production, any PNAME_NS is substituted with the namespace (test: prefix1 escapedNamespace1). Note that the prefix may be an empty string, per the PNAME_NS production: (PN_PREFIX)? ":" (test: default1).
bnodeLabels: A mapping from string to blank node label.
curSubject: The curSubject is bound to the subject production.
curPredicate: The curPredicate is bound to the verb production. If the token matched was "a", curPredicate is bound to the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type (test: type).
This table maps productions and lexical tokens to RDF terms or components of RDF terms listed in section 5:
production | type | procedure |
---|---|---|
IRI_REF | IRI | The characters between "<" and ">" are unescaped¹ to form the unicode string of the IRI. Relative IRI resolution is performed per SPARQL Query section 4.1.1. |
PNAME_NS | prefix | The potentially empty unicode string matching the first argument of the rule is a key into the namespaces map. |
PNAME_LN | IRI | A prefix is identified by the first argument, PNAME_NS . The namespaces map has a corresponding namespace . The unicode string of the IRI is formed by concatenating this namespace and the second argument, PN_LOCAL . Relative IRI resolution is performed per SPARQL Query section 4.1.1. |
STRING_LITERAL1 | lexical form | The characters between the outermost "'"s are unescaped¹ to form the unicode string of a lexical form. |
STRING_LITERAL2 | lexical form | The characters between the outermost '"'s are unescaped¹ to form the unicode string of a lexical form. |
STRING_LITERAL_LONG1 | lexical form | The characters between the outermost "'''"s are unescaped¹ to form the unicode string of a lexical form. |
STRING_LITERAL_LONG2 | lexical form | The characters between the outermost '"""'s are unescaped¹ to form the unicode string of a lexical form. |
LANGTAG | language tag | The characters following the "@" form the unicode string of the language tag. |
RDFLiteral | literal | The literal has a lexical form of the first rule argument (String ) and either a language tag of LANGTAG or a datatype IRI of IRIref , depending on which rule matched the input. |
INTEGER | literal | The literal has a lexical form of the input string, and a datatype of xsd:integer. |
DECIMAL | literal | The literal has a lexical form of the input string, and a datatype of xsd:decimal. |
DOUBLE | literal | The literal has a lexical form of the input string, and a datatype of xsd:double. |
BooleanLiteral | literal | The literal has a lexical form of "true" or "false", depending on which matched the input, and a datatype of xsd:boolean. |
BLANK_NODE_LABEL | blank node | The string matching the second argument, PN_LOCAL , is a key in bnodeLabels. If there is no corresponding blank node in the map, one is allocated. |
ANON | blank node | A blank node is generated. |
blankNodePropertyList | blank node | A blank node is generated. Note the rules for blankNodePropertyList in the next section. |
collection | blank node | A blank node is generated. Note the rules for collection in the next section. |
¹ Section 3.3 defines a mapping from escaped unicode strings to unicode strings. The following lexical tokens are unescaped to produce unicode strings: IRI_REF, STRING_LITERAL1, STRING_LITERAL2, STRING_LITERAL_LONG1 and STRING_LITERAL_LONG2.
A Turtle document defines an RDF graph composed of a set of RDF triples.
Each object N
in the document produces an RDF triple: curSubject
curPredicate
N
.
Beginning the blankNodePropertyList
production records the curSubject
and curPredicate
, and sets curSubject
to a novel blank node
B
.
Finishing the blankNodePropertyList
production restores curSubject
and curPredicate
.
The node produced by matching blankNodePropertyList
is the blank node B
.
Beginning the collection
production records the curSubject
and curPredicate
, sets curSubject
to a novel blank node
Bhead
and sets curSubject
and curPredicate
to Bhead
and rdf:first
respectively.
Each object O
in collection
allocates a novel blank node
Bn
, creates an additional triple curSubject rdf:rest Bn
. and sets curSubject
to Bn
.
Finishing the collection production creates an additional triple curSubject rdf:rest rdf:nil . and restores curSubject and curPredicate.
The node produced by matching collection is the blank node Bhead.
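The triples that result from a collection can be sketched as below. This is an informative illustration that builds the resulting rdf:first / rdf:rest chain directly instead of tracking curSubject and curPredicate; the function name `collection_triples`, the string representation of terms and the `_:bN` blank node labels are assumptions for the example.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(items, labels=None):
    """Return (head, triples) for a collection of already-parsed terms.

    `labels` is an optional iterator that supplies fresh blank node labels.
    """
    if labels is None:
        # Hypothetical label scheme: _:b0, _:b1, ...
        labels = (f"_:b{i}" for i in count())
    if not items:
        # An empty collection is represented by rdf:nil and adds no triples.
        return RDF + "nil", []
    nodes = [next(labels) for _ in items]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        triples.append((node, RDF + "first", item))
        # Link to the next list node, or close the list with rdf:nil.
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "rest", rest))
    return nodes[0], triples


head, triples = collection_triples(['"apple"', '"banana"'])
print(head)
for triple in triples:
    print(triple)
```

The returned head is the node that takes the place of the collection wherever it appeared as a subject or object.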
The following informative example shows the semantic actions performed when parsing this Turtle document with an LALR(1) parser:
Map the prefix ericFoaf to the IRI http://www.w3.org/People/Eric/ericP-foaf.rdf#.
Map a second prefix to the IRI http://xmlns.com/foaf/0.1/.
Assign curSubject the IRI http://www.w3.org/People/Eric/ericP-foaf.rdf#ericP.
Assign curPredicate the IRI http://xmlns.com/foaf/0.1/givenName.
Emit an RDF triple: <...rdf#ericP> <.../givenName> "Eric" .
Assign curPredicate the IRI http://xmlns.com/foaf/0.1/knows.
Emit an RDF triple: <...rdf#ericP> <.../knows> <...who/dan-brickley> .
Emit an RDF triple: <...rdf#ericP> <.../knows> _:1 .
Save curSubject and reassign to the blank node _:1.
Save curPredicate.
Assign curPredicate the IRI http://xmlns.com/foaf/0.1/mbox.
Emit an RDF triple: _:1 <.../mbox> <mailto:timbl@w3.org> .
Restore curSubject and curPredicate to their saved values (<...rdf#ericP>, <.../knows>).
Emit an RDF triple: <...rdf#ericP> <.../knows> <http://getopenid.com/amyvdh> .
This example is a Turtle translation of example 7 in the RDF/XML Syntax specification (example1.ttl):
An example of an RDF collection of two literals.
which is short for (example2.ttl):
An example of two identical triples containing literal objects containing newlines, written in plain and long literal forms. Assumes that line feeds in this document are #xA. (example3.ttl):
As indicated by the grammar, a collection can be either a subject or an object. This subject or object will be the novel blank node for the first object, if the collection has one or more objects, or rdf:nil
if the collection is empty.
For example,
is syntactic sugar for (noting that the blank nodes b0
, b1
and b2
do not occur anywhere else in the RDF graph):
RDF collections can be nested and can involve other syntactic forms:
(1 [:p :q] ( 2 ) ) .
is syntactic sugar for:
The IRI that identifies the Turtle language is:
http://www.w3.org/ns/formats/Turtle
The XML (Namespace name, Local name) pair that identifies
the Turtle language is:
Namespace: http://www.w3.org/ns/formats/Turtle
Local name: turtle
The suggested namespace prefix is ttl
(informative)
which would make this ttl:turtle
as an XML QName.
Previous versions of Turtle specified http://www.w3.org/2008/turtle#turtle as the IRI for the Turtle language. This change aligns Turtle with the identifiers for RDF/XML, N3, POWDER, etc.
Systems conforming to Turtle MUST pass all the following test cases:
Passing these tests means:
test-n.ttl tests MUST generate equivalent RDF triples to those given in the corresponding test-n.out N-Triples file.
bad-n.ttl tests MUST NOT generate RDF triples.
The media type of Turtle is text/turtle.
The content encoding of Turtle content is always UTF-8. Charset
parameters on the mime type are required until such time as the
text/
media type tree permits UTF-8 to be sent without a
charset parameter. See B. Internet Media
Type, File Extension and Macintosh File Type for the media type
registration form.
Turtle is related to a number of other languages.
ISSUE-4: A future version of this document is expected to define N-Triples. By default, the RDF WG will specify N-Triples to allow UTF-8 characters in IRIs, literals and blank node identifiers. Readers with an opinion about whether or not N-Triples should be ASCII-only may wish to comment.
All N-Triples files are valid Turtle documents. Turtle adds the following syntax to N-Triples:
@base directive for setting a base IRI
@prefix directive for assigning namespace prefixes
,
;
[]
rdf:type shorthand a
()s
xsd:integer
xsd:double
xsd:decimal
xsd:boolean
Turtle is similar to and inspired by the more powerful Notation 3 (N3). Please see the most recent Notation3 specification for comparison with Turtle.
RDF/XML ([[RDF-SYNTAX-GRAMMAR]]) has certain restrictions imposed by XML and the use of XML Namespaces that prevent it encoding all RDF graphs (some predicate URIs are forbidden and XML 1.0 forbids encoding some Unicode codepoints). These restrictions do not apply to Turtle.
The SPARQL Query Language for RDF (SPARQL) [[RDF-SPARQL-QUERY]] uses a Turtle style syntax for its TriplesBlock production. This production differs from the Turtle language in that:
SPARQL permits variables (?name or $name) in any part of the triple.
For further information see the Syntax for IRIs and SPARQL Grammar sections of the SPARQL query document [[RDF-SPARQL-QUERY]].
The Internet Media Type / MIME Type for Turtle is "text/turtle".
It is recommended that Turtle files have the extension ".ttl" (all lowercase) on all platforms.
It is recommended that Turtle files stored on Macintosh HFS file systems be given a file type of "TEXT".
The information that follows has been submitted to the IESG for review, approval, and registration with IANA.
charset: this parameter is required when transferring non-ASCII data. If present, the value of charset is always UTF-8.
This work was described in the paper New Syntaxes for RDF, which discusses other RDF syntaxes and the background to Turtle (submitted to WWW2004, where it is referred to as N-Triples Plus).
This work was started during the Semantic Web Advanced Development Europe (SWAD-Europe) project funded by the EU IST-7 programme IST-2001-34732 (2002-2004) and further development supported by the Institute for Learning and Research Technology at the University of Bristol, UK (2002-Sep 2005).
Changes since the last publication of this document, W3C Turtle Submission 2008-01-14. See the Previous changelog for further information.
true
and false
.ex:first.name
.ex:7tm
.