The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web.

This document defines a textual syntax for RDF called Turtle that allows an RDF graph to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. Turtle provides levels of compatibility with the existing N-Triples format as well as the triple pattern syntax of the SPARQL W3C Recommendation.

The following are the non-editorial changes since last publication:

Introduction

This document defines Turtle, the Terse RDF Triple Language, a concrete syntax for RDF ([[!RDF-CONCEPTS]]).

A Turtle document is a textual representation of an RDF graph. The following Turtle document describes the relationship between Green Goblin and Spiderman.

The Turtle grammar for triples is a subset of the SPARQL Query Language for RDF [[RDF-SPARQL-QUERY]] grammar for TriplesBlock. The two grammars share production and terminal names where possible.

Comments may be given after a # that is not part of another lexical token and continue to the end of the line.

The Turtle Grammar and Parsing sections define the construction of an RDF graph from a Turtle document.

Triples in Turtle

A Turtle document allows writing down an RDF graph in a compact textual form. An RDF graph is made up of triples consisting of a subject, predicate and object.

Simple Triples

The simplest triple statement is a sequence of (subject, predicate, object) terms, separated by whitespace and terminated by '.' after each triple.


			

Predicate Lists

Often the same subject will be referenced by a number of predicates. The predicateObjectList production matches a series of predicates and objects, separated by ;, following a subject. This expresses a series of RDF Triples with that subject and each predicate and object allocated to one triple. Thus, the ; symbol is used to repeat the subject of triples that vary only in predicate and object RDF terms.

These two examples are equivalent ways of writing the triples about Spiderman.

Object Lists

As with predicates, objects are often repeated with the same subject and predicate. The objectList production matches a series of objects, separated by ,, following a subject and predicate. This expresses a series of RDF Triples with that subject and predicate and each object allocated to one triple. Thus, the , symbol is used to repeat the subject and predicate of triples that differ only in the object RDF term.

These two examples are equivalent ways of writing Spiderman's name in two languages.
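The two abbreviations can also be read as a mechanical expansion. The following Python sketch is only an illustration and not part of the Turtle definition: it assumes the terms have already been tokenized into strings, and the function name expand_predicate_object_list and the sample terms are hypothetical.

```python
def expand_predicate_object_list(subject, predicate_object_list):
    """Expand ';' and ',' abbreviations into simple triples.

    predicate_object_list is a list of (predicate, [object, ...]) pairs.
    Each pair stands for one ';'-separated entry and each object for one
    ','-separated entry.
    """
    triples = []
    for predicate, objects in predicate_object_list:
        for obj in objects:
            # One triple per object, repeating subject and predicate.
            triples.append((subject, predicate, obj))
    return triples


# Hypothetical terms, standing for the abbreviated form:
#   <#s> <#p1> <#o1> , <#o2> ;
#        <#p2> "v" .
triples = expand_predicate_object_list(
    "<#s>",
    [("<#p1>", ["<#o1>", "<#o2>"]), ("<#p2>", ['"v"'])],
)
for triple in triples:
    print(" ".join(triple), ".")
```

Each printed line is one simple triple, so the abbreviated and expanded forms describe the same graph.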

RDF Terms in Turtle

There are three types of RDF Term defined in RDF Concepts: IRIs (Internationalized Resource Identifiers), literals and blank nodes. Turtle provides a number of ways of writing each.

IRIs

IRIs may be written as relative or absolute IRIs or prefixed names. Relative and absolute IRIs are enclosed in '<' and '>' and may contain numeric escape sequences (described below). For example, <http://example.org/#green-goblin>.

The token a in the predicate position of a Turtle triple represents the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type .

The following Turtle document contains examples of all the different ways of writing IRIs in Turtle.

Prefixed Names in Turtle

A prefixed name is a prefix label and a local part, separated by a colon ":". A prefixed name is turned into an IRI by concatenating the IRI associated with the prefix and the local part. The @prefix directive associates a prefix label with an IRI. Subsequent @prefix directives may re-map the same prefix label.

To write http://www.perceive.net/schemas/relationship/enemyOf using a prefixed name:

  1. Define a prefix label for the vocabulary IRI http://www.perceive.net/schemas/relationship/ as rel
  2. Then write rel:enemyOf which is equivalent to writing <http://www.perceive.net/schemas/relationship/enemyOf>
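The concatenation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name expand_prefixed_name and the dictionary standing in for the prefix declarations are hypothetical, and reserved character escapes in the local part are not handled.

```python
def expand_prefixed_name(prefixed_name, namespaces):
    """Turn 'label:local' into an IRI string by concatenation."""
    # The prefix label ends at the first colon.
    prefix, colon, local = prefixed_name.partition(":")
    if not colon:
        raise ValueError("not a prefixed name: %r" % prefixed_name)
    if prefix not in namespaces:
        raise KeyError("undeclared prefix label: %r" % prefix)
    return namespaces[prefix] + local


# Plays the role of:
#   @prefix rel: <http://www.perceive.net/schemas/relationship/> .
namespaces = {"rel": "http://www.perceive.net/schemas/relationship/"}

iri = expand_prefixed_name("rel:enemyOf", namespaces)
print(iri)
```

Because a later @prefix directive may re-map a label, assigning to the same dictionary key again models re-declaration.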

Prefixed names are a superset of XML QNames. They differ in that the local part of prefixed names may include:

Relative IRIs

Relative IRIs are resolved with base IRIs as per Uniform Resource Identifier (URI): Generic Syntax [RFC3986] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) are performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of Internationalized Resource Identifiers (IRIs) [RFC3987].

The @base directive defines the Base IRI used to resolve relative IRIs per RFC3986 section 5.1.1, "Base URI Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating Entity", defines how the In-Scope Base IRI may come from an encapsulating document, such as a SOAP envelope with an xml:base directive or a MIME multipart document with a Content-Location header. The "Retrieval URI" identified in 5.1.3, "Base URI from the Retrieval URI", is the URL from which a particular Turtle document was retrieved. If none of the above specifies the Base URI, the default Base URI (section 5.1.4, "Default Base URI") is used. Each @base directive sets a new In-Scope Base URI, relative to the previous one.
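The way successive @base directives build on one another can be sketched in Python. This is an approximation, not a normative algorithm: it assumes that urllib.parse.urljoin from the standard library is an adequate stand-in for the RFC3986 section 5.2 resolution for these simple http IRIs, and the class name BaseResolver and the example IRIs are hypothetical.

```python
from urllib.parse import urljoin


class BaseResolver:
    """Tracks the in-scope base IRI while a document is read."""

    def __init__(self, initial_base):
        # For example, the retrieval URI of the document.
        self.base = initial_base

    def set_base(self, iri_ref):
        # A new @base is itself resolved against the previous base.
        self.base = urljoin(self.base, iri_ref)

    def resolve(self, iri_ref):
        return urljoin(self.base, iri_ref)


resolver = BaseResolver("http://one.example/")
resolver.set_base("two/")                 # @base <two/> .
a = resolver.resolve("subject")           # <subject>
b = resolver.resolve("../other")          # <../other>
c = resolver.resolve("#frag")             # <#frag>
resolver.set_base("http://another.example/")
d = resolver.resolve("subject")
print(a, b, c, d, sep="\n")
```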

RDF Literals

Literals are used to identify values such as strings, numbers and dates.

Literals in Turtle have a lexical form followed by a language tag, a datatype IRI, or neither. The representation of the lexical form consists of a delimiting ", a sequence of characters matching the regular expression [^\"\\\n\r] or numeric escape sequence or string escape sequence, and a final delimiting ". The corresponding RDF lexical form is the characters between the delimiting "s, after processing any escape sequences. If present, the language tag is preceded by a @. If there is no language tag, there may be a datatype IRI, preceded by ^^. The datatype IRI in Turtle may be written using an absolute IRI, a relative IRI or a prefixed name. If there is no datatype IRI and no language tag, the datatype is xsd:string.

Other Lexical Representations in Turtle

Besides the " delimited literal there are three other representations of the lexical form:

  • Literals delimited by ', which contain escape characters and characters matching the pattern [^'\\\n\r].
  • Literals delimited by """, which permit up to two "s, as well as \r and \n.
  • Literals delimited by ''', which permit up to two 's, as well as \r and \n.

Representing Numbers in Turtle

Numbers can be written with a lexical form and datatype (for example, "-5.0"^^xsd:decimal), but Turtle also has an abbreviated syntax for writing integer values, arbitrary-precision decimal values, double-precision floating point values and boolean values.

| Data Type | Abbreviated | Lexical | Description |
| --- | --- | --- | --- |
| xsd:integer | -5 | "-5"^^xsd:integer | Integer values may be written as an optional sign and a series of digits. Integers match the regular expression "[+-]?[0-9]+". |
| xsd:decimal | -5.0 | "-5.0"^^xsd:decimal | Arbitrary-precision decimals may be written as an optional sign, zero or more digits, a decimal point and one or more digits. Decimals match the regular expression "[+-]?[0-9]*\.[0-9]+". |
| xsd:double | 4.2E9 | "4.2E9"^^xsd:double | Double-precision floating point values may be written as an optionally signed mantissa with an optional decimal point, the letter "e" or "E", and an optionally signed integer exponent. The exponent matches the regular expression "[+-]?[0-9]+" and the mantissa one of these regular expressions: "[+-]?[0-9]+\.[0-9]+", "[+-]?\.[0-9]+" or "[+-]?[0-9]+". |
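The regular expressions in the table can be combined into a small classifier. The Python sketch below is illustrative only: the function name numeric_datatype is hypothetical, the whole token is assumed to have been isolated already, and the third mantissa form is assumed to admit one or more digits.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?[0-9]*\.[0-9]+")
# Mantissa alternatives, then 'e' or 'E', then a signed integer exponent.
DOUBLE = re.compile(
    r"[+-]?(?:[0-9]+\.[0-9]+|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+"
)


def numeric_datatype(token):
    """Return the datatype implied by an abbreviated numeric token."""
    # fullmatch ensures the whole token matches, not just a prefix.
    if DOUBLE.fullmatch(token):
        return "xsd:double"
    if DECIMAL.fullmatch(token):
        return "xsd:decimal"
    if INTEGER.fullmatch(token):
        return "xsd:integer"
    return None  # not an abbreviated number


for token in ["-5", "-5.0", "4.2E9", "abc"]:
    print(token, numeric_datatype(token))
```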

Representing Booleans in Turtle

Boolean values may be written as either true or false (case-sensitive) and represent RDF literals with the datatype xsd:boolean.

RDF Blank Nodes

RDF blank nodes in Turtle are expressed as _: followed by a blank node label which is a series of name characters. The characters in the label are built upon PN_CHARS_BASE, liberalized to allow:

A fresh RDF blank node is allocated for each unique blank node label in a document. Repeated use of the same blank node label identifies the same RDF blank node.

Nesting Unlabeled Blank Nodes in Turtle

In Turtle, fresh RDF blank nodes are also allocated when matching the production blankNodePropertyList and the terminal ANON. Both of these may appear in the subject or object position of a triple (see the Turtle Grammar). That subject or object is a fresh RDF blank node. This blank node also serves as the subject of the triples produced by matching the predicateObjectList production embedded in a blankNodePropertyList. The generation of these triples is described in Predicate Lists. Blank nodes are also allocated for Collections in Turtle (below).

The Turtle grammar allows blankNodePropertyLists to be nested. In this case, each inner [ establishes a new subject blank node which reverts to the outer node at the ], and serves as the current subject for predicate object lists.

The use of predicateObjectList within a blankNodePropertyList is a common idiom for representing a series of properties of a node.

Abbreviated:

Corresponding simple triples:

Collections in Turtle

RDF provides a Collection [[RDF-MT]] structure for lists of RDF nodes. The Turtle syntax for Collections is a possibly empty list of RDF terms enclosed by (). This collection represents an rdf:first/rdf:rest list structure with the sequence of objects of the rdf:first statements being the order of the terms enclosed by ().

The (…) syntax MUST appear in the subject or object position of a triple (see the Turtle Grammar). The blank node at the head of the list is the subject or object of the containing triple.
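The rdf:first/rdf:rest structure that a collection stands for can be sketched in Python. This is an illustration rather than a parser: the function name expand_collection and the blank node labels _:b0, _:b1, ... are hypothetical, the terms are assumed to be pre-tokenized strings, and an empty collection is taken to be represented by rdf:nil.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def expand_collection(items):
    """Return (head, triples) for a collection of already-parsed terms."""
    if not items:
        # An empty collection contributes no triples of its own.
        return RDF_NIL, []
    nodes = ["_:b%d" % i for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        triples.append((nodes[i], RDF_FIRST, item))
        # The last node points at rdf:nil, the others at the next node.
        rest = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples


head, triples = expand_collection(['"apple"', '"banana"'])
print("head:", head)
for triple in triples:
    print(" ".join(triple), ".")
```

The returned head is the term that takes the place of the collection in the containing triple.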

Turtle Grammar

A Turtle document is a Unicode [[!UNICODE]] character string encoded in UTF-8. Only Unicode codepoints in the range U+0 to U+10FFFF inclusive are allowed.

White Space

White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. Rule names below in capitals indicate where white space is significant; these form a possible choice of terminals for constructing a Turtle parser.

White space is significant in terminal IRIREF and the production String.

Comments

Comments in Turtle take the form of '#', outside an IRIREF or String, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.

Escape Sequences

There are three forms of escapes used in Turtle documents:

| | numeric escapes | string escapes | reserved character escapes |
| --- | --- | --- | --- |
| IRIs, used as RDF terms or as in @prefix or @base declarations | yes | no | no |
| local names | no | no | yes |
| Strings | yes | yes | no |

%-encoded sequences are in the character range for IRIs and are explicitly allowed in local names. These appear as a '%' followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing. A term written as <http://a.example/%66oo-bar> in Turtle designates the IRI http://a.example/%66oo-bar and not IRI http://a.example/foo-bar. A term written as ex:%66oo-bar with a prefix @prefix ex: <http://a.example/> also designates the IRI http://a.example/%66oo-bar.
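The difference between numeric escapes, which are processed, and %-encoded sequences, which are left alone, can be sketched in Python. This is a minimal illustration for the characters between '<' and '>' of an IRI: the function name unescape_numeric is hypothetical, and string escapes such as an escaped backslash are deliberately not handled.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
_NUMERIC_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_numeric(text):
    """Replace numeric escapes with the code points they denote."""

    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        # chr() raises ValueError above U+10FFFF.
        return chr(int(hex_digits, 16))

    return _NUMERIC_ESCAPE.sub(replace, text)


# %66 is not an escape sequence, so it is kept as three characters.
kept = unescape_numeric(r"http://a.example/%66oo-bar")
# \u0066 is a numeric escape for "f" and is replaced.
decoded = unescape_numeric(r"http://a.example/\u0066oo-bar")
print(kept)
print(decoded)
```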

Grammar

The EBNF used here is defined in XML 1.0 [[!EBNF-NOTATION]]. Production labels consisting of a number and a final 's', e.g. [60s], reference the production with that number in the SPARQL Query Language for RDF grammar [[RDF-SPARQL-QUERY]].

Parsing

The RDF Concepts and Abstract Syntax ([[!RDF-CONCEPTS]]) specification defines three types of RDF Term: IRIs, literals and blank nodes. Literals are composed of a lexical form and an optional language tag or datatype IRI. An extra type, prefix, is used during parsing to map string identifiers to namespace IRIs. This section maps a string conforming to the grammar in section 4.4 to a set of triples by mapping strings matching productions and lexical tokens to RDF terms or their components (e.g. language tags, lexical forms of literals). Some productions change the parser state (base or prefix declarations).

Parser State

Parsing Turtle requires a state of four items:

RDF Term Constructors

This table maps productions and lexical tokens to RDF terms or components of RDF terms listed in section 5:

| production | type | procedure |
| --- | --- | --- |
| IRIREF | IRI | The characters between "<" and ">" are unescaped¹ to form the Unicode string of the IRI. Relative IRI resolution is performed per SPARQL Query section 4.1.1. |
| PNAME_NS | prefix | The potentially empty Unicode string matching the first argument of the rule is a key into the namespaces map. |
| PNAME_LN | IRI | A prefix is identified by the first argument, PNAME_NS. The namespaces map has a corresponding namespace. The Unicode string of the IRI is formed by concatenating this namespace and the second argument, PN_LOCAL. Relative IRI resolution is performed per SPARQL Query section 4.1.1. |
| STRING_LITERAL1 | lexical form | The characters between the outermost "'"s are unescaped¹ to form the Unicode string of a lexical form. |
| STRING_LITERAL2 | lexical form | The characters between the outermost '"'s are unescaped¹ to form the Unicode string of a lexical form. |
| STRING_LITERAL_LONG1 | lexical form | The characters between the outermost "'''"s are unescaped¹ to form the Unicode string of a lexical form. |
| STRING_LITERAL_LONG2 | lexical form | The characters between the outermost '"""'s are unescaped¹ to form the Unicode string of a lexical form. |
| LANGTAG | language tag | The characters following the "@" form the Unicode string of the language tag. |
| RDFLiteral | literal | The literal has a lexical form of the first rule argument (String) and either a language tag of LANGTAG or a datatype IRI of IRIref, depending on which rule matched the input. |
| INTEGER | literal | The literal has a lexical form of the input string, and a datatype of xsd:integer. |
| DECIMAL | literal | The literal has a lexical form of the input string, and a datatype of xsd:decimal. |
| DOUBLE | literal | The literal has a lexical form of the input string, and a datatype of xsd:double. |
| BooleanLiteral | literal | The literal has a lexical form of "true" or "false", depending on which matched the input, and a datatype of xsd:boolean. |
| BLANK_NODE_LABEL | blank node | The string matching the second argument, PN_LOCAL, is a key in bnodeLabels. If there is no corresponding blank node in the map, one is allocated. |
| ANON | blank node | A blank node is generated. |
| blankNodePropertyList | blank node | A blank node is generated. Note the rules for blankNodePropertyList in the next section. |
| collection | blank node | A blank node is generated. Note the rules for collection in the next section. |

¹ Escape Sequences defines a mapping from escaped Unicode strings to Unicode strings. The following lexical tokens are unescaped to produce Unicode strings: IRIREF, STRING_LITERAL1, STRING_LITERAL2, STRING_LITERAL_LONG1 and STRING_LITERAL_LONG2.

RDF Triples Constructors

A Turtle document defines an RDF graph composed of a set of RDF triples. Each object N in the document produces an RDF triple: curSubject curPredicate N .

Beginning the blankNodePropertyList production records the curSubject and curPredicate, and sets curSubject to a novel blank node B. Finishing the blankNodePropertyList production restores curSubject and curPredicate. The node produced by matching blankNodePropertyList is the blank node B.
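The record-and-restore behaviour described for blankNodePropertyList can be sketched in Python. This is a simplified model, not a parser: the class name TripleBuilder, its method names and the blank node labels are hypothetical, and the calls below are written out by hand in the order a parser would be assumed to make them.

```python
class TripleBuilder:
    """Holds curSubject and curPredicate and emits triples for objects."""

    def __init__(self):
        self.triples = []
        self.cur_subject = None
        self.cur_predicate = None
        self._saved = []      # recorded (curSubject, curPredicate) pairs
        self._counter = 0

    def emit_object(self, node):
        # Each object N produces: curSubject curPredicate N .
        self.triples.append((self.cur_subject, self.cur_predicate, node))

    def begin_blank_node_property_list(self):
        # Record the current state, then switch to a novel blank node B.
        self._saved.append((self.cur_subject, self.cur_predicate))
        node = "_:b%d" % self._counter
        self._counter += 1
        self.cur_subject = node
        return node

    def end_blank_node_property_list(self):
        # Restore the state recorded when the production began.
        self.cur_subject, self.cur_predicate = self._saved.pop()


# Hand-written event sequence for:  :a :knows [ :name "Bob" ] .
builder = TripleBuilder()
builder.cur_subject = ":a"
builder.cur_predicate = ":knows"
b = builder.begin_blank_node_property_list()
builder.cur_predicate = ":name"
builder.emit_object('"Bob"')
builder.end_blank_node_property_list()
builder.emit_object(b)   # B is the object of the outer triple
for triple in builder.triples:
    print(" ".join(triple), ".")
```

Using a stack for the recorded state is a design choice that lets nested blankNodePropertyLists restore the correct outer subject.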

Beginning the collection production records the curSubject and curPredicate, and sets curSubject and curPredicate to a novel blank node Bhead and rdf:first respectively. Each object O in collection allocates a novel blank node Bn, creates an additional triple curSubject rdf:rest Bn . and sets curSubject to Bn. Finishing the collection production creates an additional triple curSubject rdf:rest rdf:nil . and restores curSubject and curPredicate. The node produced by matching collection is the blank node Bhead.

Parsing Example

The following informative example shows the semantic actions performed when parsing this Turtle document with an LALR(1) parser:

Examples

This example is a Turtle translation of example 7 in the RDF/XML Syntax specification (example1.ttl):

An example of an RDF collection of two literals.

which is short for (example2.ttl):

An example of two identical triples containing literal objects containing newlines, written in plain and long literal forms. Assumes that line feeds in this document are #xA. (example3.ttl):

As indicated by the grammar, a collection can be either a subject or an object. This subject or object will be the novel blank node for the first object, if the collection has one or more objects, or rdf:nil if the collection is empty.

For example,

is syntactic sugar for (noting that the blank nodes b0, b1 and b2 do not occur anywhere else in the RDF graph):

RDF collections can be nested and can involve other syntactic forms:

@prefix :  .
(1 [:p :q] ( 2 ) ) .

is syntactic sugar for:

Identifiers for the Turtle Language

The IRI that identifies the Turtle language is:
http://www.w3.org/ns/formats/Turtle

This specification defines conformance criteria for:

A conforming Turtle document is a Unicode string that conforms to the grammar and additional constraints defined in Turtle Grammar, starting with the turtleDoc production. A Turtle document serializes an RDF graph.

A conforming Turtle parser is a system capable of reading Turtle documents on behalf of an application. It makes the serialized RDF graph, as defined in Parsing, available to the application, usually through some form of API.

This specification does not define how Turtle parsers handle non-conforming input documents.

Media Type and Content Encoding

The media type of Turtle is text/turtle. The content encoding of Turtle content is always UTF-8. Charset parameters on the MIME type are required until such time as the text/ media type tree permits UTF-8 to be sent without a charset parameter. See B. Internet Media Type, File Extension and Macintosh File Type for the media type registration form.

Turtle in HTML

HTML ([[!HTML5]]) script tags can be used to embed data blocks in documents. Turtle can be easily embedded in HTML this way.

<script type="text/turtle">
@prefix dc: <http://purl.org/dc/terms/> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .

<http://books.example.com/works/45U8QJGZSQKDH8N> a frbr:Work ;
     dc:creator "Wil Wheaton"@en ;
     dc:title "Just a Geek"@en ;
     frbr:realization <http://books.example.com/products/9780596007683.BOOK>,
         <http://books.example.com/products/9780596802189.EBOOK> .

<http://books.example.com/products/9780596007683.BOOK> a frbr:Expression ;
     dc:type <http://books.example.com/product-types/BOOK> .

<http://books.example.com/products/9780596802189.EBOOK> a frbr:Expression ;
     dc:type <http://books.example.com/product-types/EBOOK> .
</script>
        

Turtle content should be placed in a script tag with the type attribute set to text/turtle. < and > symbols do not need to be escaped inside of script tags. The character encoding of the embedded Turtle MUST match the HTML document's encoding.

XHTML

Like JavaScript, Turtle authored for HTML (text/html) can break when used in XHTML (application/xhtml+xml). The solution is the same one used for JavaScript.

<script type="text/turtle">
# <![CDATA[
@prefix frbr: <http://purl.org/vocab/frbr/core#> .

<http://books.example.com/works/45U8QJGZSQKDH8N> a frbr:Work .
# ]]>
</script>
        

When embedded in XHTML, Turtle data blocks MUST be enclosed in CDATA sections. Those CDATA markers MUST be in Turtle comments. If the character sequence "]]>" occurs in the document, it MUST be escaped using numeric escape sequences (\u005d\u005d\u003e). This will also make Turtle safe in polyglot documents served as both text/html and application/xhtml+xml.

Displaying Examples

It is possible to display the contents of script tags containing Turtle for use in examples or other guides using Cascading Style Sheets Selectors Level 3 ([[SELECT]]).

script[type='text/turtle'] {
  display:block;
  white-space: pre;
  font-family: monospace;
}

However, this creates issues with polyglot documents. If you wish to display Turtle from script tags you SHOULD only use text/html.

Parsing Turtle in HTML

There are no syntactic or grammar differences between parsing Turtle that has been embedded and normal Turtle documents. Each script data block is considered to be its own Turtle document. @prefix and @base declarations MUST NOT affect other data blocks. All Turtle data blocks in an HTML document share the same document base URI as the HTML document. The HTML lang attribute or XHTML xml:lang attribute has no effect on the parsing of the data blocks.

Turtle compared

Turtle is related to a number of other languages.

Turtle compared to Notation 3

Turtle is similar to and inspired by Notation 3 (N3). Please see the most recent Notation3 specification for comparison with Turtle.

Turtle compared to RDF/XML

RDF/XML ([[RDF-SYNTAX-GRAMMAR]]) has certain restrictions imposed by XML and the use of XML Namespaces that prevent it encoding all RDF graphs (some predicate URIs are forbidden and XML 1.0 forbids encoding some Unicode codepoints). These restrictions do not apply to Turtle.

Turtle compared to SPARQL

The SPARQL Query Language for RDF (SPARQL) [[RDF-SPARQL-QUERY]] uses a Turtle style syntax for its TriplesBlock production. This production differs from the Turtle language in that:

  1. SPARQL permits RDF Literals as the subject of RDF triples (per Last Call draft)
  2. SPARQL permits variables (?name or $name) in any part of the triple
  3. Turtle allows prefix and base declarations anywhere outside of a triple. In SPARQL, they are only allowed in the Prologue (at the start of the SPARQL query).
  4. SPARQL uses case insensitive keywords, except for a. Turtle's prefix and base declarations are case sensitive.
  5. true and false are case insensitive in SPARQL and case sensitive in Turtle. TrUe is not a valid boolean value in Turtle.

For further information see the Syntax for IRIs and SPARQL Grammar sections of the SPARQL query document [[RDF-SPARQL-QUERY]].

Internet Media Type, File Extension and Macintosh File Type

Contact:
Ian Davis
See also:
How to Register a Media Type for a W3C Specification
Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

The Internet Media Type / MIME Type for Turtle is "text/turtle".

It is recommended that Turtle files have the extension ".ttl" (all lowercase) on all platforms.

It is recommended that Turtle files stored on Macintosh HFS file systems be given a file type of "TEXT".

The information that follows has been submitted to the IESG for review, approval, and registration with IANA.

Type name:
text
Subtype name:
turtle
Required parameters:
None
Optional parameters:
charset — this parameter is required when transferring non-ASCII data. If present, the value of charset is always UTF-8.
Encoding considerations:
The syntax of Turtle is expressed over code points in Unicode [[!UNICODE]]. The encoding is always UTF-8 [[!UTF-8]].
Unicode code points may also be expressed using an \uXXXX (U+0 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-F].
Security considerations:
Turtle is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [[!RFC3023]] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning.
Turtle is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (e.g. PGP encryption, MD5 sum validation, password-protected compression) may also be used on Turtle documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
Turtle can express data which is presented to the user, for example, RDF Schema labels. Application rendering strings retrieved from untrusted Turtle documents must ensure that malignant strings may not be used to mislead the reader. The security considerations in the media type registration for XML ([[!RFC3023]] section 10) provide additional guidance around the expression of arbitrary data and markup.
Turtle uses IRIs as term identifiers. Applications interpreting data expressed in Turtle should address the security issues of Internationalized Resource Identifiers (IRIs) [[!RFC3987]] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [[!RFC3986]] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Any person or application that is writing or interpreting data in Turtle must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching of similar characters can be found in Unicode Security Considerations [[UNISEC]] and Internationalized Resource Identifiers (IRIs) [[RFC3987]] Section 8.
Interoperability considerations:
There are no known interoperability issues.
Published specification:
This specification.
Applications which use this media type:
No widely deployed applications are known to use this media type. It may be used by some web services and clients consuming their data.
Additional information:
Magic number(s):
Turtle documents may have the strings '@prefix' or '@base' (case dependent) near the beginning of the document.
File extension(s):
".ttl"
Base URI:
The Turtle '@base <IRIref>' term can change the current base URI for relative IRIrefs that are used sequentially later in the document.
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
The Turtle specification is the product of the RDF WG. The W3C reserves change control over this specification.

Acknowledgements

This work was described in the paper New Syntaxes for RDF, which discusses other RDF syntaxes and the background to Turtle (submitted to WWW2004, where it is referred to as N-Triples Plus).

This work was started during the Semantic Web Advanced Development Europe (SWAD-Europe) project funded by the EU IST-7 programme IST-2001-34732 (2002-2004), with further development supported by the Institute for Learning and Research Technology at the University of Bristol, UK (2002-Sep 2005).

Changes

Changes since the last publication of this document, W3C Turtle Submission 2008-01-14. See the Previous changelog for further information.