This document defines N-Triples, a concrete syntax for RDF [[!RDF11-CONCEPTS]]. N-Triples is an easy to parse line-based subset of Turtle [[!TURTLE]].
The syntax is a revised version of N-Triples as originally defined in the RDF Test Cases [[!RDF-TESTCASES]] document. Its original intent was for writing test cases, but it has proven to be popular as an exchange format for RDF data.
An N-Triples document contains no parsing directives.
N-Triples triples are a sequence of RDF terms representing the subject, predicate and object of an RDF Triple. These may be separated by white space (spaces U+0020 or tabs U+0009). This sequence is terminated by a '.' and a new line (optional at the end of a document).
N-Triples triples are also Turtle simple triples, but Turtle includes other representations of RDF terms and abbreviations of RDF Triples. When parsed by a Turtle parser, data in the N-Triples format will produce exactly the same triples as when parsed by an N-Triples parser.
The RDF graph represented by an N-Triples document contains exactly each triple matching the N-Triples triple production.
The simplest triple statement is a sequence of (subject, predicate, object) terms, separated by whitespace and terminated by '.' after each triple.
IRIs may be written only as absolute IRIs.
IRIs are enclosed in '<' and '>' and may contain numeric escape sequences (described below). For example <http://example.org/#green-goblin>.
Literals are used to identify values such as strings, numbers, dates.
Literals (Grammar production Literal) have a lexical form followed by a language tag, a datatype IRI, or neither.
The representation of the lexical form consists of an initial delimiter " (U+0022), a sequence of permitted characters, numeric escape sequences or string escape sequences, and a final delimiter. Literals may not contain the characters ", LF, CR except in their escaped forms. In addition, '\' (U+005C) may not appear in any quoted literal except as part of an escape sequence.
The corresponding RDF lexical form is the characters between the delimiters, after processing any escape sequences.
If present, the language tag is preceded by a '@' (U+0040).
If there is no language tag, there may be a datatype IRI, preceded by '^^' (U+005E U+005E). If there is no datatype IRI and no language tag, it is a simple literal and the datatype is http://www.w3.org/2001/XMLSchema#string.
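The literal rules above can be illustrated with a small serializer. This is a minimal sketch rather than a reference implementation: the names `escape_lexical` and `format_literal` are hypothetical, only the four characters that the text says must be escaped are handled, and the language tag and datatype IRI are assumed to be already well formed.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def escape_lexical(text: str) -> str:
    """Escape the characters that may not appear unescaped in a literal."""
    # Backslash goes first so the escapes added below are not escaped again.
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def format_literal(lexical: str, language: str = None, datatype: str = None) -> str:
    """Serialize a literal as a quoted lexical form plus an optional suffix."""
    if language is not None and datatype is not None:
        raise ValueError("a literal has a language tag or a datatype IRI, not both")
    quoted = '"' + escape_lexical(lexical) + '"'
    if language is not None:
        return quoted + "@" + language
    if datatype is not None and datatype != XSD_STRING:
        return quoted + "^^<" + datatype + ">"
    # No suffix: a simple literal, whose datatype is xsd:string.
    return quoted


print(format_literal("That Seventies Show", language="en"))
print(format_literal("1", datatype="http://www.w3.org/2001/XMLSchema#integer"))
```

Omitting the suffix for xsd:string is a design choice in this sketch; writing the datatype IRI explicitly would denote the same literal.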
RDF blank nodes in N-Triples are expressed as _: followed by a blank node label which is a series of name characters.
The characters in the label are built upon PN_CHARS_BASE, liberalized as follows:
- The characters _ and [0-9] may appear anywhere in a blank node label.
- The character . may appear anywhere except the first or last character.
- The characters -, U+00B7, U+0300 to U+036F and U+203F to U+2040 are permitted anywhere except the first character.
A fresh RDF blank node is allocated for each unique blank node label in a document. Repeated use of the same blank node label identifies the same RDF blank node.
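The positional rules for blank node labels can be sketched as a regular expression. This is an illustration under a loudly stated simplification: ASCII letters stand in for the full PN_CHARS_BASE range, so many valid non-ASCII labels are rejected. The name `is_valid_label` is hypothetical.

```python
import re

# Simplifying assumption: only ASCII letters stand in for PN_CHARS_BASE.
BASE = "A-Za-z"
# '-', U+00B7, U+0300 to U+036F and U+203F to U+2040: anywhere but first.
EXTRA = "\\-\u00B7\u0300-\u036F\u203F-\u2040"

FIRST = f"[{BASE}_0-9]"            # '_' and digits may appear anywhere
MIDDLE = f"[{BASE}_0-9{EXTRA}.]"   # '.' is allowed only between other characters
LAST = f"[{BASE}_0-9{EXTRA}]"      # everything except '.'

LABEL_RE = re.compile(f"{FIRST}(?:{MIDDLE}*{LAST})?")


def is_valid_label(label: str) -> bool:
    """Check the label part of a blank node, i.e. the text after '_:'."""
    return LABEL_RE.fullmatch(label) is not None


for candidate in ["alice", "1abc", "a.b", ".a", "a.", "-a", "a-"]:
    print(candidate, is_valid_label(candidate))
```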
application/n-triples
\b and \f for backspace and form feed
This section defines a canonical form of N-Triples which has less variability in layout. The grammar for the language is the same. Implementers are encouraged to produce this form.
Canonical N-Triples has the following additional constraints on layout:
- The whitespace following subject, predicate, and object MUST be a single space (U+0020). All other locations that allow whitespace MUST be empty.
- HEX MUST use only uppercase letters ([A-F]).
- Characters MUST NOT be represented by UCHAR.
- Within STRING_LITERAL_QUOTE, only the characters U+0022, U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR MUST NOT be used for characters that are allowed directly in STRING_LITERAL_QUOTE.
This specification defines conformance criteria for N-Triples documents, Canonical N-Triples documents, and N-Triples parsers.
A conforming N-Triples document is a Unicode string that conforms to the grammar and additional constraints defined in this specification, starting with the ntriplesDoc production. An N-Triples document serializes an RDF graph.
A conforming Canonical N-Triples document is an N-Triples document that follows the additional constraints of Canonical N-Triples.
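Two of the Canonical N-Triples layout constraints can be sketched in code: the single space after each term, and uppercase hexadecimal digits in numeric escapes. This is a partial illustration only. The names `uppercase_hex` and `canonical_line` are hypothetical, the three terms are assumed to be already-serialized tokens, and the remaining canonical constraints are not checked or enforced here.

```python
import re

# A backslash followed by a numeric escape body or by any single character.
# Matching the single-character case keeps an escaped backslash ("\\\\")
# from being misread as the start of a numeric escape.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.)", re.DOTALL)


def uppercase_hex(token: str) -> str:
    """Rewrite the hex digits of numeric escapes in a token to uppercase."""
    def fix(match):
        body = match.group(1)
        if len(body) > 1:  # numeric escape: keep the u/U marker as written
            body = body[0] + body[1:].upper()
        return "\\" + body
    return _ESCAPE.sub(fix, token)


def canonical_line(subject: str, predicate: str, obj: str) -> str:
    """Join three serialized terms with the canonical single-space layout."""
    terms = [uppercase_hex(term) for term in (subject, predicate, obj)]
    return " ".join(terms) + " .\n"


print(canonical_line("<http://example.org/s>", "<http://example.org/p>", '"caf\\u00e9"'), end="")
```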
A conforming N-Triples parser is a system capable of reading N-Triples documents on behalf of an application. It makes the serialized RDF graph, as defined in this specification, available to the application, usually through some form of API.
The IRI that identifies the N-Triples language is:
http://www.w3.org/ns/formats/N-Triples
The media type of N-Triples is application/n-triples.
The content encoding of N-Triples is always UTF-8.
See N-Triples Media Type for the media type registration form.
N-Triples has historically been provided with other media types. N-Triples may also be provided as text/plain. When used in this way, N-Triples MUST use the escaped form of any character outside US-ASCII. As N-Triples is a subset of Turtle, an N-Triples document MAY also be provided as text/turtle. In both of these cases the document is not an N-Triples document, as an N-Triples document is only provided as application/n-triples.
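The requirement to escape characters outside US-ASCII when serving data as text/plain can be sketched as follows. This is a minimal illustration: the name `escape_non_ascii` is hypothetical, and the function is meant for the content of IRIs and literals, where numeric escapes are permitted, not for whole lines.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every character outside US-ASCII with a numeric escape."""
    pieces = []
    for char in text:
        code_point = ord(char)
        if code_point < 0x80:
            pieces.append(char)                      # US-ASCII stays as written
        elif code_point <= 0xFFFF:
            pieces.append(f"\\u{code_point:04X}")    # four hex digits
        else:
            pieces.append(f"\\U{code_point:08X}")    # eight hex digits
    return "".join(pieces)


print(escape_non_ascii('"café"'))
```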
An N-Triples document is a Unicode [[!UNICODE]] character string encoded in UTF-8. Unicode code points only in the range U+0 to U+10FFFF inclusive are allowed.
White space (tab U+0009 or space U+0020) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. White space is significant in the production STRING_LITERAL_QUOTE.
Comments in N-Triples take the form of '#', outside an IRIREF or STRING_LITERAL_QUOTE, and continue up to, and excluding, the end of line (EOL), or end of file if there is no end of line after the comment marker. Comments are treated as white space.
The EBNF used here is defined in XML 1.0 [[!EBNF-NOTATION]].
Escape sequence rules are the same as Turtle [[TURTLE]]. However, as only the STRING_LITERAL_QUOTE production is allowed, new lines in literals MUST be escaped.
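Escape processing for a literal body can be sketched as below. This is an illustration under stated assumptions: the name `unescape` is hypothetical, and the set of single-character escapes is the one Turtle defines for ECHAR (t, b, n, r, f, double quote, single quote and backslash).

```python
import re

# Single-character string escapes, as defined for ECHAR in Turtle.
ECHAR = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
         '"': '"', "'": "'", "\\": "\\"}

# Numeric escapes are tried first; the last group catches every other
# backslash, including a trailing one with nothing after it.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.?))", re.DOTALL)


def unescape(text: str) -> str:
    """Turn the text between the quotes of a literal into its lexical form."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        char = match.group(3)
        if char not in ECHAR:
            raise ValueError(f"invalid escape sequence: \\{char}")
        return ECHAR[char]
    return _ESCAPE.sub(replace, text)


print(unescape("caf\\u00E9"))
```

Scanning left to right with one pattern means an escaped backslash is consumed as a unit, so the text `\\u0041` yields a backslash followed by `u0041` rather than the letter A.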
Parsing N-Triples requires a state of one item:
- bnodeLabels: A mapping from string to blank node.
This table maps productions and lexical tokens to RDF terms or components of RDF terms:
production | type | procedure |
---|---|---|
IRIREF | IRI | The characters between "<" and ">" are taken, with escape sequences unescaped, to form the Unicode string of the IRI. |
STRING_LITERAL_QUOTE | lexical form | The characters between the outermost '"'s are taken, with escape sequences unescaped, to form the Unicode string of a lexical form. |
LANGTAG | language tag | The characters following the @ form the Unicode string of the language tag. |
literal | literal | The literal has a lexical form of the first rule argument, STRING_LITERAL_QUOTE, and either a language tag of LANGTAG or a datatype IRI of iri, depending on which rule matched the input. If the LANGTAG rule matched, the datatype is rdf:langString and the language tag is LANGTAG. If neither a language tag nor a datatype IRI is provided, the literal has a datatype of xsd:string. |
BLANK_NODE_LABEL | blank node | The string after '_:' is a key in bnodeLabels. If there is no corresponding blank node in the map, one is allocated. |
An N-Triples document defines an RDF graph composed of a set of RDF triples. The triple production produces a triple defined by the terms constructed for subject, predicate and object.
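The parsing procedure, including the bnodeLabels state, can be sketched as a small line-oriented parser. This is an illustration, not a conforming parser: the class and function names are hypothetical, blank node labels are limited to ASCII characters, language tags are matched loosely, and terms are represented as plain tuples.

```python
import re

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
ECHAR = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
         '"': '"', "'": "'", "\\": "\\"}
ESCAPE_RE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))")

WS = r"[ \t]*"
IRIREF = r'<(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'
# Simplifying assumption: ASCII-only blank node labels.
BNODE = r"_:[A-Za-z0-9_](?:[A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?"
LITERAL = (r'"(?:[^"\\\n\r]|\\.)*"(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*|\^\^'
           + IRIREF + r")?")
TRIPLE_RE = re.compile(
    WS + "(" + IRIREF + "|" + BNODE + ")"
    + WS + "(" + IRIREF + ")"
    + WS + "(" + IRIREF + "|" + BNODE + "|" + LITERAL + ")"
    + WS + r"\." + WS + r"(?:#.*)?")


class BlankNode:
    """A blank node; two distinct instances are two distinct nodes."""


def unescape(text):
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        if match.group(3) not in ECHAR:
            raise ValueError("invalid escape sequence")
        return ECHAR[match.group(3)]
    return ESCAPE_RE.sub(replace, text)


class NTriplesParser:
    def __init__(self):
        self.bnode_labels = {}  # the bnodeLabels state: label -> BlankNode

    def term(self, token):
        if token.startswith("<"):
            return ("iri", unescape(token[1:-1]))
        if token.startswith("_:"):
            label = token[2:]
            if label not in self.bnode_labels:  # allocate on first use
                self.bnode_labels[label] = BlankNode()
            return self.bnode_labels[label]
        end = token.rindex('"')  # the suffix can never contain a quote
        lexical, suffix = unescape(token[1:end]), token[end + 1:]
        if suffix.startswith("@"):
            return ("literal", lexical, RDF_LANG_STRING, suffix[1:])
        if suffix.startswith("^^"):
            return ("literal", lexical, unescape(suffix[3:-1]), None)
        return ("literal", lexical, XSD_STRING, None)

    def parse(self, text):
        graph = set()
        for line in re.split(r"[\r\n]+", text):
            stripped = line.strip(" \t")
            if not stripped or stripped.startswith("#"):
                continue  # blank line or comment-only line
            match = TRIPLE_RE.fullmatch(line)
            if match is None:
                raise ValueError(f"not a valid triple: {line!r}")
            graph.add(tuple(self.term(token) for token in match.groups()))
        return graph


parser = NTriplesParser()
triples = parser.parse('_:a <http://example.org/p> "caf\\u00E9"@fr . # note\n'
                       "_:a <http://example.org/q> _:b .\n")
print(len(triples), len(parser.bnode_labels))
```

Keeping bnodeLabels on the parser instance means one instance corresponds to one document, so a label reused within the document maps to the same node while a second document parsed with a fresh instance gets fresh nodes.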
The editor of the RDF 1.1 edition acknowledges valuable contributions from Gregg Kellogg, Eric Prud'hommeaux, Dave Beckett, David Robillard, Gregory Williams, Pat Hayes, Richard Cyganiak, Henry S. Thompson, Peter Ansell, Evan Patton and David Booth.
This specification is a product of extended deliberations by the members of the RDF Working Group. It draws upon the earlier specification in RDF Test Cases, edited by Dave Beckett.
No substantive changes.
The Internet Media Type / MIME Type for N-Triples is "application/n-triples".
It is recommended that N-Triples files have the extension ".nt" (all lowercase) on all platforms.
It is recommended that N-Triples files stored on Macintosh HFS file systems be given a file type of "TEXT".
The information that follows will be submitted to the IESG for review, approval, and registration with IANA.