Introduction

A JSON-LD document is a representation of a directed graph. A single directed graph can have many different serializations, each expressing exactly the same information. Developers typically work with trees, represented as JSON objects. While mapping a graph to a tree can be done, the layout of the end result must be specified in advance. A Frame can be used by a developer on a JSON-LD document to specify a deterministic layout for a graph.

How to Read this Document

This document is a detailed specification for a serialization of Linked Data in JSON. The document is primarily intended for the following audiences:

Authors who want to query JSON-LD documents to create representations more appropriate for a given use case.
Software developers that want to implement processors and APIs for JSON-LD.

To understand the basics in this specification you must first be familiar with JSON, which is detailed in [[!RFC4627]]. You must also understand the JSON-LD Syntax [[!JSON-LD]], which is the base syntax used by all of the algorithms in this document, and the JSON-LD API [[!JSON-LD-API]]. To understand the API and how it is intended to operate in a programming environment, it is useful to have working knowledge of the JavaScript programming language [[ECMA-262]] and WebIDL [[!WEBIDL]]. To understand how JSON-LD maps to RDF, it is helpful to be familiar with the basic RDF concepts [[!RDF-CONCEPTS]].

General Terminology

The intent of the Working Group and the Editors of this specification is to eventually align terminology used in this document with the terminology used in the RDF Concepts document to the extent to which it makes sense to do so. In general, if there is an analogue to terminology used in this document in the RDF Concepts document, the preference is to use the terminology in the RDF Concepts document.

The following is an explanation of the general terminology used in this document:

JSON object: An object structure is represented as a pair of curly brackets surrounding zero or more name-value pairs. A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.
array: An array is represented as square brackets surrounding zero or more values that are separated by commas.
string: A string is a sequence of zero or more Unicode (UTF-8) characters, wrapped in double quotes, using backslash escapes (if necessary). A character is represented as a single character string.
number: A number is similar to that used in most programming languages, except that the octal and hexadecimal formats are not used and that leading zeros are not allowed.
true and false: Values that are used to express one of two possible boolean states.
null: The null value is used within JSON-LD to ignore or reset values.
keyword: A JSON key that is specific to JSON-LD, specified in the JSON-LD Syntax specification [[!JSON-LD]] in the section titled Syntax Tokens and Keywords.
context: A set of rules for interpreting a JSON-LD document as specified in The Context of the [[JSON-LD]] specification.
IRI: An Internationalized Resource Identifier as described in [[!RFC3987]].
Linked Data: A set of documents, each containing a representation of a linked data graph.
linked data graph or dataset: An unordered labeled directed graph, where nodes are IRIs or Blank Nodes, or other values. A linked data graph is a generalized representation of an RDF graph as defined in [[!RDF-CONCEPTS]].
named graph: A linked data graph that is identified by an IRI.
graph name: The IRI identifying a named graph.
default graph: When executing an algorithm, the graph where data should be placed if a named graph is not specified.
node: A piece of information that is represented in a linked data graph.
node definition: A JSON object used to represent a node and one or more properties of that node. A JSON object is a node definition if it does not contain the keys @value, @list or @set and it has one or more keys other than @id.
node reference: A JSON object used to reference a node having only the @id key.
blank node: A node in the linked data graph that does not contain a de-referenceable identifier because it is either ephemeral in nature or does not contain information that needs to be linked to from outside of the linked data graph. A blank node is assigned an identifier starting with the prefix _:.
property: The IRI label of an edge in a linked data graph.
subject: A node in a linked data graph with at least one outgoing edge, related to an object node through a property.
object: A node in a linked data graph with at least one incoming edge.
quad: A piece of information that contains four items: a subject, a property, an object, and a graph name.
literal: An object expressed as a value such as a string, number or in expanded form.
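
The definitions of node definition and node reference above amount to two small predicates over a JSON object. The following is a minimal sketch in Python, assuming JSON objects are held as language-native dicts; the function names are hypothetical and not part of any API defined here.

```python
# Keys whose presence means a JSON object is a value, list, or set wrapper
# rather than a node definition.
VALUE_KEYS = {"@value", "@list", "@set"}


def is_node_reference(obj):
    """Return True if obj is a JSON object whose only key is @id."""
    return isinstance(obj, dict) and set(obj) == {"@id"}


def is_node_definition(obj):
    """Return True if obj has none of @value, @list, @set and at least
    one key other than @id."""
    if not isinstance(obj, dict):
        return False
    if VALUE_KEYS.intersection(obj):
        return False
    return any(key != "@id" for key in obj)
```

Under these definitions an object such as {"@id": "http://example.org/a"} is a node reference but not a node definition, because it has no key other than @id.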

Contributing

There are a number of ways that one may participate in the development of this specification:

Technical discussion typically occurs on the public mailing list: public-linked-json@w3.org
Public teleconferences are held on Tuesdays at 15:00 UTC in the second and fourth weeks of each month.
Specification bugs and issues should be reported in the issue tracker.
Source code for the specification can be found on GitHub.
The #json-ld IRC channel is available for real-time discussion on irc.freenode.net.

Algorithms

All algorithms described in this section are intended to operate on language-native data structures. That is, serialization to a text-based JSON document is not required as input or output for any of these algorithms, and language-native data structures MUST be used where applicable.

Syntax Tokens and Keywords

This specification adds a number of keywords to the ones defined in the [[!JSON-LD]] specification:

@default: Used in Framing to set the default value for an output property when the framed node definition does not include such a property.
@explicit: Used in Framing to override the value of the explicit inclusion flag within a specific frame.
@omitDefault: Used in Framing to override the value of the omit default flag within a specific frame.
@embed: Used in Framing to override the value of the object embed flag within a specific frame.
@null: Used in Framing when a value of null should be returned, which would otherwise be removed when Compacting.

All JSON-LD tokens and keywords are case-sensitive.
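
To illustrate how these keywords might appear together, the sketch below writes a frame as a language-native Python dict. The ex: vocabulary, its IRIs, and the property names are invented for illustration, and the placement of @default and @omitDefault inside the frame of an individual property is an assumption about the intended usage rather than something this section states.

```python
# Hypothetical frame; the ex: vocabulary is made up for this example.
frame = {
    "@context": {"ex": "http://example.org/vocab#"},
    "@type": "ex:Library",
    # Only include properties that are named in the frame.
    "@explicit": True,
    # Refer to contained books by @id instead of embedding them.
    "ex:contains": {"@type": "ex:Book", "@embed": False},
    # Value to output when a matched node has no ex:location.
    "ex:location": {"@default": "unknown"},
    # Leave ex:founded out entirely when a matched node lacks it.
    "ex:founded": {"@omitDefault": True},
}

# Keywords are case-sensitive: "@Embed" would not be treated as @embed.
FRAMING_KEYWORDS = {"@default", "@explicit", "@omitDefault", "@embed", "@null"}

# Collect the framing keywords used at the top level and one level down.
used = {
    key
    for value in [frame, *frame.values()]
    if isinstance(value, dict)
    for key in value
    if key in FRAMING_KEYWORDS
}
```

Here used ends up containing @default, @embed, @explicit and @omitDefault.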

Algorithm Terms

active subject: the currently active subject that the processor should use when processing.
active property: the currently active property that the processor should use when processing. The active property is represented in the original lexical form, which is used for finding coercion mappings in the active context.
active object: the currently active object that the processor should use when processing.
active context: a context that is used to resolve terms while the processing algorithm is running. The active context is the context contained within the processor state.
compact IRI: a compact IRI has the form of a prefix and a suffix and is used to express an IRI without needing to define a separate term definition for each IRI contained within a common vocabulary identified by the prefix.
local context: a context that is specified within a JSON object, specified via the @context keyword.
processor state: the processor state, which includes the active context, active subject, and active property. The processor state is managed as a stack with elements from the previous processor state copied into a new processor state when entering a new JSON object.
JSON-LD input: The JSON-LD data structure that is provided as input to the algorithm.
JSON-LD output: The JSON-LD data structure that is produced as output by the algorithm.
term: A term is a short word defined in a context that MAY be expanded to an IRI.
prefix: A prefix is a term that expands to a vocabulary base IRI. It is typically used along with a suffix to form a compact IRI to create an IRI within a vocabulary.
language-tagged string: A language-tagged string is a literal without a datatype, including a language. See language-tagged string in [[!RDF-CONCEPTS]].
typed literal: A typed literal is a literal with an associated IRI which indicates the literal's datatype. See typed literal in [[!RDF-CONCEPTS]].
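
The relationship between term, prefix and compact IRI defined above can be sketched as a small expansion function. This is a simplified illustration, not the IRI expansion algorithm of the JSON-LD API: it assumes the active context is a flat dict mapping terms and prefixes directly to IRI strings, and the function name is hypothetical.

```python
def expand_iri(value, context):
    """Expand a term or compact IRI against a flat term -> IRI mapping.

    Values that cannot be expanded are returned unchanged.
    """
    if ":" not in value:
        # A plain term: expand it if the context defines it.
        return context.get(value, value)
    prefix, suffix = value.split(":", 1)
    # Leave blank node identifiers (_:b0) and IRIs such as http://...
    # untouched.
    if prefix == "_" or suffix.startswith("//"):
        return value
    if prefix in context:
        # Compact IRI: concatenate the prefix IRI and the suffix.
        return context[prefix] + suffix
    return value


context = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "name": "http://xmlns.com/foaf/0.1/name",
}
```

With this context, both the term name and the compact IRI foaf:name expand to http://xmlns.com/foaf/0.1/name.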

Framing

Framing is the process of taking a JSON-LD document, which expresses a graph of information, and applying a specific graph layout (called a Frame).

Framing makes use of the Node Map Generation algorithm to place each object defined in the JSON-LD document into a flat list of objects, allowing them to be operated upon by the framing algorithm.

Framing Algorithm Terms

input frame: the initial frame provided to the framing algorithm.
framing context: a context containing a map of embeds, the object embed flag, the explicit inclusion flag and the omit default flag.
map of embeds: a map that tracks if a subject is to be embedded in the output of the Framing Algorithm; it maps a subject @id to a parent JSON object and property or parent array.
object embed flag: a flag specifying that objects should be directly embedded in the output, instead of being referred to by their IRI.
explicit inclusion flag: a flag specifying that for properties to be included in the output, they must be explicitly declared in the framing context.
omit default flag: a flag specifying that properties that are missing from the JSON-LD input, but present in the input frame, should be omitted from the output.
map of flattened subjects: a map of subjects that is the result of the Node Map Generation algorithm.
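
One possible in-memory shape for the framing context described by these terms is sketched below. The class and field names are hypothetical; the default values mirror the initial values that the Framing Algorithm section assigns when it creates the framing context.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class FramingContext:
    # Map of embeds: subject @id -> {"parent": ..., "property": ...}.
    # Starts as None (null) and is reset to an empty map for each
    # top-level match.
    embeds: Optional[Dict[str, Dict[str, Any]]] = None
    # Object embed flag.
    embed: bool = True
    # Explicit inclusion flag.
    explicit: bool = False
    # Omit default flag.
    omit_default: bool = False
    # Map of flattened subjects: subject @id -> node definition.
    subjects: Dict[str, dict] = field(default_factory=dict)


state = FramingContext(subjects={"http://example.org/a": {"@id": "http://example.org/a"}})
```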

Framing Algorithm

This algorithm is a work in progress. Presently, it only works for documents without named graphs.

Currently, framing only allows node definitions to be selected based on @type matching or on duck typing for included properties. It allows value properties to be matched explicitly by defining the property and excluding things that are undefined, but it does not allow the types of the selected values to be specified more precisely. Allowing this is currently being discussed.

The framing algorithm takes a JSON-LD input (expanded input) and an input frame (expanded frame) that have been expanded according to the Expansion Algorithm, and a number of options, and produces JSON-LD output.

Create framing context using null for the map of embeds, the object embed flag set to true, the explicit inclusion flag set to false, and the omit default flag set to false along with map of flattened subjects set to the @merged property of the result of performing the Node Map Generation algorithm on expanded input. Also create results as an empty array.

Invoke the recursive algorithm using framing context (state), the map of flattened subjects (subjects), expanded frame (frame), results as parent, and null as active property.

The following series of steps is the recursive portion of the framing algorithm:

Validate frame.
Create a set of matched subjects by filtering subjects, checking each entry in the map of flattened subjects against frame:
1. If frame has a @type property containing one or more IRIs, match any node definition with a @type property including any of those IRIs.
2. Otherwise, if frame has a @type property containing only an empty JSON object, match any node definition with a @type property, regardless of the actual values.
3. Otherwise, match if the node definition contains all of the non-keyword properties in frame.
Get values for embedOn and explicitOn by looking in frame for the keys @embed and @explicit, using the current values for object embed flag and explicit inclusion flag from state if they are not found.
For each id and subject from the set of matched subjects, ordered by id:
1. If the active property is null, set the map of embeds in state to an empty map.
2. Initialize output with @id and id.
3. Initialize embed with parent set to parent and property set to active property.
4. If embedOn is true, and id is in map of embeds from state:
  1. Set existing to the value of id in map of embeds and set embedOn to false.
  2. If existing has a parent which is an array containing a JSON object with @id equal to id, element has already been embedded and can be overwritten, so set embedOn to true.
  3. Otherwise, existing has a parent which is a node definition. Set embedOn to true if any of the items in parent property is a node definition or node reference for id because the embed can be overwritten.
  4. If embedOn is true, existing is already embedded but can be overwritten, so Remove Embedded Definition for id.
5. If embedOn is false, add output to parent by either appending to parent if it is an array, or appending to active property in parent otherwise.
6. Otherwise:
  1. Add embed to map of embeds for id.
  2. Process each property and value in the matched subject, ordered by property:
    1. If property is a keyword, add property and a copy of value to output and continue with the next property from subject.
    2. If property is not in frame:
      1. If explicitOn is false, Embed values from subject in output using subject as element and property as active property.
      2. Continue to next property.
    3. Process each item from value as follows:
      1. If item is a JSON object with the key @list, then create a JSON object named list with the key @list and the value of an empty array. Append list to property in output. Process each listitem in the @list array as follows:
        
        If listitem is a node reference, process listitem recursively using this algorithm, passing a new map of subjects that contains the @id of listitem as the key and the node definition from the original map of flattened subjects as the value. Pass the first value from frame for property as frame, list as parent, and @list as active property.
        
        Otherwise, append a copy of listitem to @list in list.
      2. If item is a node reference, process item recursively using this algorithm, passing a new map as subjects that contains the @id of item as the key and the node definition from the original map of flattened subjects as the value. Pass the first value from frame for property as frame, output as parent, and property as active property.
        Passing a node reference doesn't work if this map is used recursively. Presently pass node definition from original map of flattened subjects.
      3. Otherwise, append a copy of item to active property in output.
  3. Process each property and value in frame, where property is not a keyword, ordered by property:
    1. Set property frame to the first item in value or a newly created JSON object if value is empty.
    2. Skip to the next property in frame if property is in output, or if property frame contains @omitDefault which is true, or if it does not contain @omitDefault but the value of omit default flag is true.
    3. Set the value of property in output to a new JSON object with a property @preserve and a value that is a copy of the value of @default in frame if it exists, or the string @null otherwise.
  4. Add output to parent. If parent is an array, append output, otherwise append output to active property in parent.
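
The subject-matching step near the top of the recursive algorithm (the three numbered matching rules) can be sketched as follows. This is a minimal illustration, assuming expanded input where @type holds an array of IRI strings and the frame's @type holds either IRIs or a single empty JSON object; the function names are hypothetical, and treating any key that starts with @ as a keyword is a simplification.

```python
def is_keyword(key):
    # Simplification: treat every key starting with "@" as a keyword.
    return key.startswith("@")


def matches_frame(node, frame):
    """Return True if a node definition matches frame."""
    frame_types = frame.get("@type")
    if frame_types:
        if frame_types == [{}]:
            # Wildcard: any node definition that has a @type matches.
            return "@type" in node
        # Match if the node has any of the listed type IRIs.
        node_types = node.get("@type", [])
        return any(t in node_types for t in frame_types)
    # Duck typing: the node must contain every non-keyword property
    # named in the frame.
    return all(prop in node for prop in frame if not is_keyword(prop))


def filter_subjects(subjects, frame):
    """Return the matched subjects as a dict ordered by id."""
    return {
        subject_id: node
        for subject_id, node in sorted(subjects.items())
        if matches_frame(node, frame)
    }
```

For example, a frame of {"@type": [{}]} selects every entry in the map of flattened subjects that carries a @type, while a frame without @type selects entries by the properties they contain.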

At the completion of the recursive algorithm, results will contain the top-level node definitions.
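
The pass over the frame's own properties, which fills in defaults for properties missing from the output, might look like the sketch below. It assumes that each frame property holds an array (expanded form), that @omitDefault is a plain boolean, and that @default is read from the property frame; the text leaves the last point open, so treat it as an interpretation. The function name is hypothetical.

```python
import copy


def add_defaults(output, frame, omit_default_flag):
    """Add @preserve placeholders to output for frame properties it lacks."""
    for prop in sorted(frame):
        if prop.startswith("@"):
            continue  # keywords are not processed in this pass
        value = frame[prop]
        # First item of the frame value, or a new JSON object if empty.
        property_frame = value[0] if value else {}
        # A local @omitDefault overrides the omit default flag.
        omit = property_frame.get("@omitDefault", omit_default_flag)
        if prop in output or omit:
            continue
        # Use the string "@null" as the marker when no @default is given.
        default = property_frame.get("@default", "@null")
        output[prop] = {"@preserve": copy.deepcopy(default)}
    return output
```

The @preserve wrapper keeps the placeholder alive through compaction; it is unwrapped in the final steps of the framing algorithm.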

The final two steps of the framing algorithm require results to be compacted according to the Compaction Algorithm by using the context provided in the input frame. If the frame has no context, compaction is performed with an empty context (not a null context). The compaction result MUST use the @graph keyword at the top-level, even if the context is empty or if there is only one element to put in the @graph array. Subsequently, replace all key-value pairs where the key is @preserve with the value from the key-value pair. If the value from the key-value pair is @null, replace the value with null. If, after replacement, an array contains only the value null, remove the value, leaving an empty array. The resulting value is the final JSON-LD output.
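
The @preserve replacement described in this paragraph can be sketched as a recursive walk over the compacted result. This is an illustrative reading, assuming that a JSON object containing the key @preserve is replaced as a whole by the preserved value and that only an array holding exactly one null is emptied; the function name is hypothetical.

```python
def restore_preserved(value):
    """Replace @preserve wrappers with their values after compaction."""
    if isinstance(value, list):
        items = [restore_preserved(item) for item in value]
        # An array left holding only null becomes an empty array.
        return [] if items == [None] else items
    if isinstance(value, dict):
        if "@preserve" in value:
            preserved = value["@preserve"]
            # The string "@null" stands in for a real null.
            return None if preserved == "@null" else preserved
        return {key: restore_preserved(item) for key, item in value.items()}
    return value


compacted = {
    "@graph": [
        {
            "@id": "http://example.org/lib",
            "location": {"@preserve": "unknown"},
            "founded": {"@preserve": "@null"},
            "contains": [{"@preserve": "@null"}],
        }
    ]
}
final_output = restore_preserved(compacted)
```

In this example location becomes the string unknown, founded becomes null, and contains becomes an empty array.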

Remove Embedded Definition

This algorithm replaces an already embedded node definition with a node reference. It then recursively removes any entries in the map of embeds that had the removed node definition in their parent chain.

About as clear as mud

The current behaviour avoids embedding the same data multiple times in the result, which makes it difficult to work with the output. A proposal to change this to "aggressive re-embedding" is currently being discussed.

The algorithm is invoked with a framing context and subject id id.

Find embed from map of embeds for id.
Let parent and property be from embed.
If parent is an array, replace the node definition that matches id with a node reference. If parent is a JSON object, replace the node definition for property that matches id with a node reference.
Remove dependents for id in map of embeds by scanning the map for entries with parent that have an @id of id, removing that definition from the map, and then removing the dependents for the parent id recursively by repeating this step. This step will terminate when there are no more embed entries containing the removed node definition's @id in their parent chain.
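
The four steps above can be sketched as two small functions. The sketch assumes that the map of embeds is a dict from subject @id to an entry with parent and property keys, and that a property of a parent JSON object holds an array of values; all names are hypothetical.

```python
def remove_dependents(embeds, subject_id):
    """Drop embed entries whose parent is the node with the given @id,
    then repeat for each entry that was dropped."""
    for dependent_id in list(embeds):
        entry = embeds.get(dependent_id)
        if entry is None:
            continue  # already removed by a recursive call
        parent = entry["parent"]
        if isinstance(parent, dict) and parent.get("@id") == subject_id:
            del embeds[dependent_id]
            remove_dependents(embeds, dependent_id)


def remove_embedded_definition(embeds, subject_id):
    """Replace the embedded node definition for subject_id with a node
    reference and forget everything that was embedded beneath it."""
    embed = embeds[subject_id]
    parent, prop = embed["parent"], embed["property"]
    # The values to scan are the parent itself if it is an array,
    # otherwise the values of the property in the parent object.
    values = parent if isinstance(parent, list) else parent[prop]
    for index, item in enumerate(values):
        if isinstance(item, dict) and item.get("@id") == subject_id:
            values[index] = {"@id": subject_id}
    remove_dependents(embeds, subject_id)
```

The recursion in remove_dependents ends once no remaining entry has a removed node in its parent chain, which matches the termination condition stated in the last step.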

Embed Values

This algorithm recursively embeds property values in node definition output, given a framing context, input node definition element, active property, and output.

For each item in active property of element:
1. If item is a JSON object with the key @list, then create a new JSON object with a key @list and a value of an empty array and add it to output, appending if output is an array, and appending to active property otherwise. Recursively call this algorithm passing item as element, @list as active property, and the new array as output. Continue to the next item.
2. If item is a node reference:
  1. If map of embeds does not contain an entry for the @id of item:
    1. Initialize embed with output as parent and active property as property and add to map of embeds.
    2. Initialize a new node definition o to act as the embedded node definition.
    3. For each property and value in the expanded definition for item in subjects:
      1. Add property and a copy of value to o if property is a keyword.
      2. Otherwise, recursively call this algorithm passing value as element, property as active property and o as output.
  2. Set item to o.
3. If output is an array, append a copy of item, otherwise append a copy of item to active property in output.
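
A rough sketch of the Embed Values steps follows. It makes two interpretive choices that the text does not settle: the recursive call for a non-keyword property passes the node definition itself as element (since the loop reads active property from element), and an item whose @id is already in the map of embeds is left as a node reference. The framing context is modelled as a dict with embeds and subjects keys, and all names are hypothetical.

```python
import copy


def add_output(output, prop, value):
    """Append value to output if it is an array, else to output[prop]."""
    if isinstance(output, list):
        output.append(value)
    else:
        output.setdefault(prop, []).append(value)


def embed_values(state, element, prop, output):
    for item in element[prop]:
        # Lists: create an empty @list in the output and recurse into it.
        if isinstance(item, dict) and "@list" in item:
            new_list = {"@list": []}
            add_output(output, prop, new_list)
            embed_values(state, item, "@list", new_list["@list"])
            continue
        # Node references: embed the full definition once.
        if isinstance(item, dict) and set(item) == {"@id"}:
            node_id = item["@id"]
            if node_id not in state["embeds"]:
                state["embeds"][node_id] = {"parent": output, "property": prop}
                embedded = {}
                node = state["subjects"][node_id]
                for key, value in node.items():
                    if key.startswith("@"):
                        embedded[key] = copy.deepcopy(value)
                    else:
                        embed_values(state, node, key, embedded)
                item = embedded
        # Everything else, and the result of the step above, is copied.
        add_output(output, prop, copy.deepcopy(item))
```

Because each @id is recorded in the map of embeds before its properties are visited, a cycle in the graph ends in a node reference instead of recursing without bound.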

The Application Programming Interface

JsonLdProcessor

Callbacks

JsonLdCallback

Data Structures

JsonLdOptions