author Manu Sporny Sun, 07 Aug 2011 16:44:00 -0400 changeset 136 b53e36625acf parent 135 b8ef51a2ab60 child 137 e69616f137d0
Fleshed out top-level normalization algorithm.
 spec/latest/index.html
```--- a/spec/latest/index.html	Sat Aug 06 23:19:42 2011 -0400
+++ b/spec/latest/index.html	Sun Aug 07 16:44:00 2011 -0400
@@ -2124,14 +2124,40 @@

<p class="issue">This algorithm is a work in progress, do not implement it.</p>

-<p></p>
+<p>Normalization is the process of taking <tref>JSON-LD input</tref> and
+performing a deterministic transformation on that input that results in all
+aspects of the graph being fully expanded and named in the
+<tref>JSON-LD output</tref>. The normalized output is generated in such a way
+that any conforming JSON-LD processor will generate identical output
+given the same input. The problem is a fairly difficult technical
+problem to solve because it requires a directed graph to be ordered into a
+set of nodes and edges in a deterministic way. This is easy to do when all of
+the nodes have unique names, but very difficult to do when some of the nodes
+are not labeled.
+</p>
+
+<p>In time, there may be more than one normalization algorithm that will need
+to be identified. For identification purposes, this algorithm is named
+<abbr title="Universal Graph Normalization Algorithm 2011">UGNA2011</abbr>.
+</p>

<section>
<h3>Normalization Algorithm Terms</h3>
<dl>
-   <dt><tdef>term1</tdef></dt>
+   <dt><tdef>list of expanded nodes</tdef></dt>
<dd>
-     definition1
+     A list of all nodes in the <tref>JSON-LD input</tref> graph containing no
+     embedded objects and having all keys and values expanded according to the
+     steps in the <a href="#expansion-algorithm">Expansion Algorithm</a>.
+   </dd>
+   <dt><tdef>naming base string</tdef></dt>
+   <dd>
+     An unlabeled node naming prefix that is not used by any other node
+     in the <tref>JSON-LD input</tref> and does not start with the characters
+     <code>c14n</code>. The prefix is used to temporarily name nodes during
+     the normalization algorithm in a way that doesn't collide with the
+     names that already exist as well as the names that will be generated by the
+     normalization algorithm.
</dd>
</dl>
</section>
@@ -2140,9 +2166,32 @@
<h3>Normalization Algorithm</h3>

<ol class="algorithm">
-<li>Expand input (resolve keywords and CURIEs, etc).</li>
-<li>Flatten input (give unique names to unnamed blank nodes, remove embeds, duplicate objects for predicates, etc.).</li>
-<li>Label the unlabeled nodes.</li>
+<li>Expand the <tref>JSON-LD input</tref> according to the steps in
+the <a href="#expansion-algorithm">Expansion Algorithm</a> and store the
+result as the <strong>expanded input</strong>.</li>
+<li>Process every object in the <strong>expanded input</strong>
+text string <code>_:c14n</code>. If a match is found, rename the subject and
+all references to the subject by concatenating <code>_:</code> with the
+<tref>naming base string</tref> and a unique identifier, such as an
+incremented counter value.
+</li>
+<li>Create an empty <tref>list of expanded nodes</tref> and recursively
+process every object in the <strong>expanded input</strong> that is not an
+expanded IRI, typed literal or language literal, in depth-first order:
+  <ol class="algorithm">
+    <li>If an object does not contain a <code>@subject</code>,
+      name it by concatenating
+      <code>_:</code> with the <tref>naming base string</tref> and a
+      unique identifier, such as an  incremented counter value.</li>
+    <li>Add the object to the <tref>list of expanded nodes<tref>.</li>
+    <li>Replace the reference to the object with the value of the
+       <code>@subject</code> in the object.</li>
+    <li>???duplicate objects for predicates???</li>
+  </ol></li>
+<li>Label all of the nodes that contain a <code>@subject</code> key associated
+with a value starting with <code>_:</code> according to the steps in the
+<a href="#node-labeling-algorithm">Node Labeling Algorithm</a>.</li>
</ol>
</section>

@@ -2150,13 +2199,30 @@
<h4>Node Labeling Algorithm</h4>

<ol class="algorithm">
-<li>Rename any blank nodes that start with 'c14n' using a temporary prefix.</li>
<li>Initialize property and reference relation serializations for all blank nodes to null.</li>
<li>Until all blank nodes are canonically named, keep sorting them according to DeepCompare and naming them according to NameNode.</li>
</ol>
</section>

<section>
+<h4>Shallow Comparison Algorithm</h4>
+
+<p>
+ShallowCompare(bnodeA, bnodeB):
+</p>
+
+<ol class="algorithm">
+  <li>The bnode with fewer properties is first.</li>
+  <li>The bnode with the alphabetically first property is first.</li>
+  <li>Do CompareObjects for each equivalent property.</li>
+  <li>The bnode with fewer references is first.</li>
+  <li>The bnode with the reference IRI (vs. bnode) is first.</li>
+  <li>The bnode with the alphabetically-first reference IRI is first.</li>
+  <li>The bnode with the alphabetically-first reference property is first.</li>
+</ol>
+</section>
+
+<section>
<h4>Deep Comparison Algorithm</h4>

<p>```