--- a/spec/latest/index.html Sun Aug 14 12:44:49 2011 -0400
+++ b/spec/latest/index.html Sun Aug 14 19:27:20 2011 -0400
@@ -2226,6 +2226,14 @@
<section>
<h3>Normalization Algorithm Terms</h3>
<dl>
+ <dt><tdef>label</tdef></dt>
+ <dd>
+ The subject IRI associated with a graph node. The subject IRI is expressed
+ using a key-value pair in a <tref>JSON object</tref> where the key is
+ <code>@subject</code> and the value is a string that is an IRI or
+ a <tref>JSON object</tref> containing the key <code>@iri</code> and
+ a value that is a string that is an IRI.
+ </dd>
<dt><tdef>list of expanded nodes</tdef></dt>
<dd>
A list of all nodes in the <tref>JSON-LD input</tref> graph containing no
@@ -2238,25 +2246,6 @@
second nodes or values being examined in an algorithm. The names are
merely used to refer to each input value to a comparison algorithm.
</dd>
- <dt><tdef>labeling prefix</tdef></dt>
- <dd>
- A random base string that starts with the characters <code>_:</code>, is
- not used by any other node's subject IRI in the <tref>JSON-LD input</tref>,
- and does not start with the characters
- <code>_:c14n</code>. The prefix is used to temporarily name nodes during
- the normalization algorithm in a way that doesn't collide with the
- names that already exist as well as the names that will be generated by the
- normalization algorithm. To prevent further collisions, an incrementing
- <tdef>labeling counter</tdef> is appended to the base string. For
- example, <code>_:j8r3k</code> is a proper <tref>labeling prefix</tref>.
- </dd>
- <dt><tdef>deterministic labeling prefix</tdef></dt>
- <dd>
- A base string with the value <code>_:c14n</code>. The prefix is used to
- deterministically name nodes during the normalization algorithm. The
- <tdef>deterministic labeling counter</tdef> is appended to the base string
- to ensure that there are no collisions.
- </dd>
<dt><tdef>renaming counter</tdef></dt>
<dd>
A counter that is used during the
@@ -2278,77 +2267,139 @@
</section>
<section>
+<h3>Normalization State</h3>
+
+<p>When performing the steps required by the normalization algorithm,
+it is helpful to track the many pieces of information in a
+data structure called the <tdef>normalization state</tdef>. The information
+contained in the <tref>normalization state</tref> is described below.</p>
+
+<dl>
+ <dt><tdef>labeling prefix</tdef></dt>
+ <dd>
+ A random base string that starts with the characters <code>_:</code>, is
+ not used by any other node's <tref>label</tref> in the <tref>JSON-LD input</tref>,
+ and does not start with the characters
+ <code>_:c14n</code>. The prefix is used to temporarily name nodes during
+ the normalization algorithm in a way that doesn't collide with the
+ names that already exist as well as the names that will be generated by the
+ normalization algorithm. To prevent further collisions, an incrementing
+ <tref>labeling counter</tref> is appended to the base string. For
+ example, <code>_:j8r3k</code> is a proper <tref>labeling prefix</tref>.
+ </dd>
+ <dt><tdef>deterministic labeling prefix</tdef></dt>
+ <dd>
+ A base string with the value <code>_:c14n</code>. The prefix is used to
+ deterministically name nodes during the normalization algorithm. The
+ <tref>deterministic labeling counter</tref> is appended to the base string
+ to ensure that there are no collisions.
+ </dd>
+ <dt><tdef>labeling counter</tdef></dt>
+ <dd>
+ A counter that is used to re-label nodes that are not labeled, or that
+ conflict with the canonicalization naming scheme.
+ </dd>
+ <dt><tdef>deterministic labeling counter</tdef></dt>
+ <dd>
+ A counter this is used to re-label nodes in a deterministic manner for
+ the normalization output.
+ </dd>
+ <dt><tdef>map of flattened nodes</tdef></dt>
+ <dd>
+ A map containing a representation of all nodes in the graph where the
+ key is a node <tref>label</tref> and the value is a single
+ <tref>JSON object</tref> that has no nested sub-objects
+ and has had all properties for the same node merged into a single
+ <tref>JSON object</tref>.
+ </dd>
+ <dt><tdef>incoming map</tdef></dt>
+ <dd>
+ Relates graph node IRIs to every other node that refers to them in the
+ graph. For example, if a node <tref>alpha</tref> refers to a
+ node <tref>beta</tref> via a property, the key in the
+ <tref>incoming map</tref> is the <tref>label</tref> for <tref>beta</tref> and
+ the value is an array containing at least node <tref>alpha</tref>.
+ </dd>
+ <dt><tdef>outgoing map</tdef></dt>
+ <dd>
+ Relates graph node IRIs to the target nodes that they reference. For
+ example, if a node <tref>alpha</tref> refers to a node <tref>beta</tref>
+ via a property, the key in the <tref>outgoing map</tref> is the subject
+ IRI of <tref>alpha</tref> and the value is an array containing at least
+ <tref>beta</tref>.
+ </dd>
+</dl>
+
+</section>
+
+<section>
<h3>Normalization Algorithm</h3>
<p>The normalization algorithm expands the <tref>JSON-LD input</tref>,
flattens the data structure, and creates an initial set of names for all
nodes in the graph. The flattened data structure is then processed by a
node labeling algorithm in order to get a fully expanded and named list of
-nodes which is then sorted.
+nodes which is then sorted. The result is a deterministically named and
+ordered list of graph nodes.
</p>
<ol class="algorithm">
+<li>Create a <tref>normalization state</tref> and initialize all array
+ values to an empty array, all map values to an empty map, all
+ string values to an empty string and all number values to <code>0</code>.</li>
<li>Expand the <tref>JSON-LD input</tref> according to the steps in
the <a href="#expansion-algorithm">Expansion Algorithm</a> and store the
result as the <strong>expanded input</strong>.</li>
<li>Initialize the <tref>labeling prefix</tref> to a value that doesn't
-collide with any other subject IRI prefix in the graph. Initialize the
+collide with any other <tref>label</tref> prefix in the graph. Initialize the
<tref>labeling counter</tref> to 1.
-<li>Create an empty <tdef>map of expanded nodes</tdef> and recursively
-process every <tdef>expanded node</tdef> in the <strong>expanded input</strong>
-in depth-first order:
+<li>Initialize the <tref>map of flattened nodes</tref> by recursively
+processing every <tdef>expanded node</tdef> in the
+<strong>expanded input</strong> in depth-first order:
<ol class="algorithm">
<li>If the <tref>expanded node</tref> is an unlabeled node, add a
- new key-value pair
+ new key-value pair to the <tref>expanded node</tref>
where the key is <code>@subject</code> and the value is the
concatenation of the <tref>labeling prefix</tref>
- and the string value of the <tref>labeling counter</tref>.</li>
+ and the string value of the <tref>labeling counter</tref>.
+ Increment the <tref>labeling counter</tref>.</li>
<li>Add the <tref>expanded node</tref> to the
- <tref>map of expanded nodes</tref>:
+ <tref>map of flattened nodes</tref>:
<ol class="algorithm">
- <li>If the <tref>expanded node</tref>'s subject IRI is already in the
- <tref>map of expanded nodes</tref> merge all properties from the
- entry in the <tref>map of expanded nodes</tref> into the
+ <li>If the <tref>expanded node</tref>'s <tref>label</tref> is already
+ in the
+ <tref>map of flattened nodes</tref> merge all properties from the
+ entry in the <tref>map of flattened nodes</tref> into the
<tref>expanded node</tref>.</li>
<li>Go through every property associated with an array in the
<tref>expanded node</tref> and remove any duplicate IRI entries from
- the array. If resulting array only has one IRI entry, change it
+ the array. If the resulting array only has one IRI entry, change it
from an array to an object.</li>
- <li>Set the entry for the <tref>expanded node</tref>'s subject IRI
- in the <tref>map of expanded nodes</tref> to the
+ <li>Set the entry for the <tref>expanded node</tref>'s <tref>label</tref>
+ in the <tref>map of flattened nodes</tref> to the
<tref>expanded node</tref>.
</li></ol></li>
- <li>Replace the reference to the node in the
- <strong>expanded input</strong> with an object containing a single
+ <li>After exiting the recursive step, replace the reference to the
+ <tref>expanded node</tref> with an object containing a single
key-value pair where the key is <code>@iri</code> and the value is
the value of the <code>@subject</code> key in the node.</li>
</ol></li>
-<li>Create a <tdef>outgoing map</tdef> that relates graph
- node IRIs to the targets nodes that they reference. For example, if a
- node <tref>alpha</tref> refers to a node <tref>beta</tref> via a property,
- the key in the <tref>outgoing map</tref> is the subject IRI of
- <tref>alpha</tref> and the value is an array containing at least
- <tref>beta</tref>.
- <ol class="algorithm">
- <li>Add all outgoing references for every node in the graph.</li>
- </ol></li>
-<li>Create a <tdef>incoming map</tdef> that relates graph node IRIs
- to every other node that refers to them in the graph.
- For example, if a node <tref>alpha</tref> refers to a
- node <tref>beta</tref> via a property, the key in the
- <tref>incoming map</tref> is the subject IRI for <tref>beta</tref> and
- the value is an array containing at least <tref>alpha</tref>.
- <ol class="algorithm">
- <li>Add all incoming references for every node in the graph.</li>
- </ol></li>
-
-<li>For every entry in the <tref>map of expanded nodes</tref> that has a
-subject IRI that begins with <code>_:c14n</code>, relabel the subject IRI
+<li>Initialize the <tref>outgoing map</tref> by iterating over every object
+ associated with a property in the node, and if it is another node,
+ adding the association to the list of nodes associated with the current
+ node.</li>
+<li>Initialize the <tref>incoming map</tref> by iterating over every
+ node in the graph and mapping its <tref>label</tref> to the list of all
+ other nodes that refer to it.</li>
+<li>For every entry in the <tref>map of flattened nodes</tref> that has a
+<tref>label</tref> that begins with <code>_:c14n</code>, relabel the <tref>label</tref>
using the steps in the
<a href="#node-relabeling-algorithm">Node Relabeling Algorithm</a> with the
-following parameters; <tref>naming base string</tref> should be the value
-of <tref>labeling prefix</tref> and <tref>renaming counter</tref> should be the
-value of <tref>labeling counter</tref>.
+following parameters; the <tref>normalization state</tref>,
+the <tref>old label</tref> should be the value of <tref>label</tref>,
+the <tref>naming base string</tref> should be the value
+of <tref>labeling prefix</tref> and the <tref>renaming counter</tref> should be
+the value of <tref>labeling counter</tref>.
<li>Label all of the nodes that contain a <code>@subject</code> key associated
with a value starting with <code>_:</code> according to the steps in the
<a href="#deterministic-labeling-algorithm">Deterministic Labeling Algorithm</a>.
@@ -2359,24 +2410,40 @@
<section>
<h4>Node Relabeling Algorithm</h4>
-<p>The node relabeling algorithm takes a node containing an
-<tdef>old label</tdef> that is the subject IRI, a
-<tdef>naming base string</tdef>, a <tref>renaming counter</tref>, and the
-<tref>incoming map</tref> and <tref>outgoing map</tref>.
-</p>
+<p>This algorithm renames a node by generating a unique
+<tdef>new label</tdef> and updating all references to that <tref>label</tref>
+in the <tref>incoming map</tref> and the <tref>outgoing map</tref>. The
+following parameters must be given as input to the algorithm.
+
+<dl class="algorithm">
+ <dt><tref>normalization state</tdef></dt>
+ <dd>A reference to the current normalization state.</dd>
+ <dt><tdef>old label</tdef></dt>
+ <dd>The string value to use when performing look-ups on the
+ <tref>incoming map</tref> and the <tref>outgoing map</tref>.</dd>
+ <dt><tdef>naming base string</tdef></dt>
+ <dd>The base string to utilize when generating the <tref>new label</tref>.
+ </dd>
+ <dt><tdef>renaming counter</tdef></dt>
+ <dd>The counter that ensures uniqueness for all <tref>new label</tref>s that
+ are generated. This value must be passed in by reference as changes
+ to it must be reflected elsewhere in the normalization algorithm.</dd>
+</dl>
+
+<p>The node relabeling algorithm is as follows:</p>
<ol class="algorithm">
<li>Generate the <tdef>new label</tdef> by concatenating the
<tref>naming base string</tref> with the string value of the
- <tref>renaming counter</tref>. Add <code>1</code>
- to the <tref>renaming counter</tref>, ensuring to preserve the latest
- value of the counter across multiple calls to this algorithm.
+ <tref>renaming counter</tref>. Increment the <tref>renaming counter</tref>
+ by one.
+ </li>
<li>For each node that is associated with the <tref>old label</tref> in the
<tref>incoming map</tref>, update all properties that reference the
<tref>old label</tref> with the <tref>new label</tref>.</li>
<li>Change the <tref>old label</tref> keys in the <tref>incoming map</tref>
and the <tref>outgoing map</tref> to the <tref>new label</tref>.</li>
- <li>Set the node's subject IRI to the <tref>new label</tref>.</li>
+ <li>Return the <tref>new label</tref> as the result.</li>
</ol>
</section>
@@ -2384,7 +2451,7 @@
<h4>Deterministic Labeling Algorithm</h4>
<p>The deterministic labeling algorithm takes the
-<tref>map of expanded nodes</tref>
+<tref>map of flattened nodes</tref>
and produces a sorted list of deterministically named and expanded nodes from
the graph.
@@ -2393,7 +2460,7 @@
<code>1</code> and
the <tdef>sorted list of expanded nodes</tdef> to an empty array.</li>
<li>Place a reference to each of the nodes in the
- <tref>map of expanded nodes</tref> into a
+ <tref>map of flattened nodes</tref> into a
<tref>list of expanded nodes</tref>.
<li>Move every node with an IRI that does not start with <code>_:</code>
from the <tref>list of expanded nodes</tref> into the
@@ -2445,7 +2512,7 @@
<tref>outgoing serialization map</tref>.</li>
<li>For each <tref>old label</tref> in the
<tref>list of old labels</tref> get the node from the
- <tref>map of expanded nodes</tref> and relabel it according to the
+ <tref>map of flattened nodes</tref> and relabel it according to the
<a href="#node-relabeling-algorithm">Node Relabeling Algorithm</a>
with the following parameters;
<tref>naming base string</tref> is the value of the
@@ -2466,7 +2533,7 @@
<tref>incoming serialization map</tref>.</li>
<li>For each <tref>old label</tref> in the
<tref>list of old labels</tref> get the node from the
- <tref>map of expanded nodes</tref> and relabel it according to the
+ <tref>map of flattened nodes</tref> and relabel it according to the
<a href="#node-relabeling-algorithm">Node Relabeling Algorithm</a>
with the following parameters;
<tref>naming base string</tref> is the value of the
@@ -2491,7 +2558,7 @@
entry in the <tref>map of incoming serialization maps</tref>.
</li>
<li>
- Remove each node with a subject IRI that starts with
+ Remove each node with a <tref>label</tref> that starts with
<code>_:c14n</code> from the <tref>list of expanded nodes</tref> and
add it to the <tref>sorted list of expanded nodes</tref>.
</ol></li>
@@ -2625,7 +2692,7 @@
<li>Compare incoming and outgoing edges for each node, updating the
serialization maps as each node is processed:
<ol class="algorithm">
- <li>If the subject IRI for <tref>alpha</tref> does not exist in the
+ <li>If the <tref>label</tref> for <tref>alpha</tref> does not exist in the
<tref>outgoing serialization map</tref>, generate the serialization by
following the steps in the
<a href="#node-serialization-algorithm">Node Serialization Algorithm</a>.
@@ -2634,7 +2701,7 @@
the <tref>outgoing serialization map</tref>,
the <tref>serialized labels map</tref> and the
<tref>list of unserialized labels</tref> to the algorithm as inputs.
- <li>If the subject IRI for <tref>beta</tref> does not exist in the
+ <li>If the <tref>label</tref> for <tref>beta</tref> does not exist in the
<tref>outgoing serialization map</tref>, generate the serialization by
following the steps in the
<a href="#node-serialization-algorithm">Node Serialization Algorithm</a>.
@@ -2648,7 +2715,7 @@
for <tref>beta</tref> in the <tref>outgoing serialization map</tref>,
then <tref>alpha</tref> is first. If the lexicographical value is less,
then <tref>beta</tref> is first.</li>
- <li>If the subject IRI for <tref>alpha</tref> does not exist in the
+ <li>If the <tref>label</tref> for <tref>alpha</tref> does not exist in the
<tref>incoming serialization map</tref>, generate the serialization by
following the steps in the
<a href="#node-serialization-algorithm">Node Serialization Algorithm</a>.
@@ -2657,7 +2724,7 @@
the <tref>incoming serialization map</tref>,
the <tref>serialized labels map</tref> and the
<tref>list of unserialized labels</tref> to the algorithm as inputs.
- <li>If the subject IRI for <tref>beta</tref> does not exist in the
+ <li>If the <tref>label</tref> for <tref>beta</tref> does not exist in the
<tref>incoming serialization map</tref>, generate the serialization by
following the steps in the
<a href="#node-serialization-algorithm">Node Serialization Algorithm</a>.
@@ -2683,8 +2750,7 @@
a <tref>processed labels map</tref>, a
<tdef>serialization map</tdef>, a <tref>serialized labels map</tref>,
and a <tref>list of unserialized labels</tref> as inputs and generates
-deterministic serializations for all unlabeled nodes in the graph. The
-node's subject IRI will be called the <tdef>label</tdef> in this algorithm.
+deterministic serializations for all unlabeled nodes in the graph.
</p>
<ol class="algorithm">
@@ -2699,7 +2765,7 @@
passing the <tref>label</tref> and the <tref>mapping count</tref> as
parameters to the algorithm.</li>
<li>For every property in the node that references a target node with a
-subject IRI that starts with <code>_:</code>
+<tref>label</tref> that starts with <code>_:</code>
(the <tdef>target node label</tdef>):
<ol class="algorithm">
<li>Look up the <tref>target node label</tref> in the