At least pretend to get nbsps right in insertText
author Aryeh Gregor <AryehGregor+gitcommit@gmail.com>
Tue, 21 Jun 2011 11:56:45 -0600
changeset 305 1658d3f45c56
parent 304 a7b50faff9b9
child 306 fa09e48f4239

Still very often fails, but works for a couple of common cases.
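
For readers who want to try the behavior without a DOM, the following is a minimal sketch of the two pieces this changeset adds: the canonical space sequence, and the space rewriting that insertText performs around the insertion point. It is a string-only model and not the committed code. It assumes the "white-space" check has already passed and ignores adjoining nodes, and the names insertIntoData, show, THREE and TWO are hypothetical helpers introduced only for this illustration.

```javascript
const NBSP = "\u00a0";
// Lookup tables for the final two or three characters; "N" stands for nbsp.
// Keys are (non-breaking start, non-breaking end) as "t"/"f".
const THREE = { ff: " N ", tf: "NN ", ft: " NN", tt: "N N" };
const TWO = { ff: "N ", tf: "N ", ft: " N", tt: "NN" };

function canonicalSpaceSequence(n, nonBreakingStart, nonBreakingEnd) {
  if (n === 0) return "";
  if (n === 1) return nonBreakingStart || nonBreakingEnd ? NBSP : " ";
  const pair = nonBreakingStart ? "N " : " N";
  let buffer = "";
  while (n > 3) {
    buffer += pair;
    n -= 2;
  }
  const key = (nonBreakingStart ? "t" : "f") + (nonBreakingEnd ? "t" : "f");
  buffer += (n === 3 ? THREE : TWO)[key];
  return buffer.replace(/N/g, NBSP);
}

// Hypothetical string-only model of the new insertText step: returns the
// text node's data after inserting `value` at `offset`.
function insertIntoData(data, offset, value) {
  const isSpace = (ch) => ch === " " || ch === NBSP;
  // Count visible spaces before the offset (collapsed ones are skipped).
  let leading = 0;
  let start = offset - 1;
  while (start >= 0 && isSpace(data[start])) {
    if (data[start] === NBSP || data[start - 1] !== " ") leading++;
    start--;
  }
  start++;
  // Count visible spaces after the offset.
  let trailing = 0;
  let end = offset;
  while (end < data.length && isSpace(data[end])) {
    if (data[end] === NBSP || data[end - 1] !== " ") trailing++;
    end++;
  }
  const initialNbsp = start === 0;
  const finalNbsp = end === data.length;
  let newLeading, newTrailing;
  if (value === " ") {
    // The inserted space becomes part of one canonical run.
    const seq = canonicalSpaceSequence(leading + trailing + 1, initialNbsp, finalNbsp);
    newLeading = seq.slice(0, leading);
    value = seq[leading];
    newTrailing = seq.slice(leading + 1);
  } else {
    newLeading = canonicalSpaceSequence(leading, initialNbsp, false);
    newTrailing = canonicalSpaceSequence(trailing, false, finalNbsp);
  }
  return data.slice(0, start) + newLeading + value + newTrailing + data.slice(end);
}

// Display helper: "~" marks an nbsp, "_" marks a regular space.
const show = (s) => s.replace(/\u00a0/g, "~").replace(/ /g, "_");

console.log(show(canonicalSpaceSequence(3, false, false))); // _~_
console.log(show(insertIntoData("foo bar", 4, " ")));       // foo~_bar
console.log(show(insertIntoData("foo\u00a0", 4, "b")));     // foo_b
```

The last call shows one of the "common cases" the description refers to: typing a letter after a trailing nbsp turns that nbsp back into a regular space. Cases that span several nodes are outside what this model covers.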
editcommands.html
implementation.js
preprocess
source.html
--- a/editcommands.html	Mon Jun 20 15:52:03 2011 -0600
+++ b/editcommands.html	Tue Jun 21 11:56:45 2011 -0600
@@ -38,7 +38,7 @@
 <body class=draft>
 <div class=head id=head>
 <h1>HTML Editing Commands</h1>
-<h2 class="no-num no-toc" id=work-in-progress-&mdash;-last-update-20-june-2011>Work in Progress &mdash; Last Update 20 June 2011</h2>
+<h2 class="no-num no-toc" id=work-in-progress-&mdash;-last-update-21-june-2011>Work in Progress &mdash; Last Update 21 June 2011</h2>
 <dl>
  <dt>Editor
  <dd>Aryeh Gregor &lt;[email protected]&gt;
@@ -2820,6 +2820,63 @@
   </ol>
 </ol>
 
+<p>The <dfn id=canonical-space-sequence>canonical space sequence</dfn> of length <var title="">n</var>, with boolean
+flags <var title="">non-breaking start</var> and <var title="">non-breaking end</var>, is
+returned by the following algorithm:
+<!-- See long comment before insertText. -->
+
+<ol>
+  <li>If <var title="">n</var> is zero, return the empty string.
+
+  <li>If <var title="">n</var> is one and both <var title="">non-breaking start</var> and
+  <var title="">non-breaking end</var> are false, return a single space (U+0020).
+
+  <li>If <var title="">n</var> is one, return a single non-breaking space (U+00A0).
+
+  <li>Let <var title="">buffer</var> be the empty string.
+
+  <li>If <var title="">non-breaking start</var> is true, let <var title="">repeated pair</var> be
+  U+00A0 U+0020.  Otherwise, let it be U+0020 U+00A0.
+
+  <li>While <var title="">n</var> is greater than three, append <var title="">repeated pair</var>
+  to <var title="">buffer</var> and subtract two from <var title="">n</var>.
+
+  <li>If <var title="">n</var> is three, append a three-<a href=http://es5.github.com/#x8.4>element</a> string to
+  <var title="">buffer</var> depending on <var title="">non-breaking start</var> and
+  <var title="">non-breaking end</var>:
+
+  <dl class=switch>
+    <dt><var title="">non-breaking start</var> and <var title="">non-breaking end</var> false
+    <dd>U+0020 U+00A0 U+0020
+
+    <dt><var title="">non-breaking start</var> true, <var title="">non-breaking end</var> false
+    <dd>U+00A0 U+00A0 U+0020
+
+    <dt><var title="">non-breaking start</var> false, <var title="">non-breaking end</var> true
+    <dd>U+0020 U+00A0 U+00A0
+
+    <dt><var title="">non-breaking start</var> and <var title="">non-breaking end</var> both true
+    <dd>U+00A0 U+0020 U+00A0
+  </dl>
+
+  <li>Otherwise, append a two-<a href=http://es5.github.com/#x8.4>element</a> string to <var title="">buffer</var> depending
+  on <var title="">non-breaking start</var> and <var title="">non-breaking end</var>:
+
+  <dl class=switch>
+    <dt><var title="">non-breaking start</var> and <var title="">non-breaking end</var> false
+    <dt><var title="">non-breaking start</var> true, <var title="">non-breaking end</var> false
+    <dd>U+00A0 U+0020
+
+    <dt><var title="">non-breaking start</var> false, <var title="">non-breaking end</var> true
+    <dd>U+0020 U+00A0
+
+    <dt><var title="">non-breaking start</var> and <var title="">non-breaking end</var> both true
+    <dd>U+00A0 U+00A0
+  </dl>
+
+  <li>Return <var title="">buffer</var>.
+</ol>
+
 
 <h3 id=allowed-children><span class=secno>7.3 </span>Allowed children</h3>
 
@@ -4246,6 +4303,8 @@
 
 <h3 id=the-delete-command><span class=secno>7.9 </span><dfn>The <code title="">delete</code> command</dfn></h3>
 
+<p class=XXX>Needs nbsp magic.
+
 <p><a href=#action>Action</a>:
 
 <ol>
@@ -4731,6 +4790,8 @@
 
 <h3 id=the-forwarddelete-command><span class=secno>7.11 </span><dfn>The <code title="">forwardDelete</code> command</dfn></h3>
 
+<p class=XXX>Needs nbsp magic.
+
 <p><a href=#action>Action</a>:
 <!-- Copy-pasted from delete, see there for comments. -->
 
@@ -5521,11 +5582,37 @@
 
 Opera 11.11 has varying behavior, like Firefox and Chrome.  Like Firefox, I
 didn't discern an obvious pattern.
+
+This was discussed: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html
+
+Unfortunately, we're stuck with this nbsp stuff, because of 1) legacy reasons,
+2) mail clients might not support CSS equivalents, 3) authors might not know to
+apply any CSS to wherever the content is eventually used.  The behavior I
+decided on to minimize the evil is as follows:
+
+* If the first and last spaces are in non-collapsing positions, two spaces is
+  nbsp+space, three is space+nbsp+space, four or more is space+nbsp followed by
+  the pattern for two less.
+* If the first space has to be an nbsp so it doesn't collapse, three is instead
+  nbsp+nbsp+space, four or more is nbsp+space followed by the pattern for two
+  less.
+* If the last space has to be nbsp, two is space+nbsp, three is
+  space+nbsp+nbsp, four or more is space+nbsp followed by the pattern for two
+  less.
+* If the first and last space must both be nbsp, two is nbsp+nbsp, three is
+  nbsp+space+nbsp, four or more is nbsp+space followed by the pattern for two
+  less.
+
+This avoids nbsp at the end of a run except where it's needed, so words won't
+appear indented if they wrap to the next line.  It avoids more than two nbsp's
+in a row, so there won't be huge chunks of space that get wrapped all at once.
+And it avoids nbsp at the beginning of a run except where it's needed or
+if there are only two spaces in the run, so words won't have to wrap
+unnecessarily.
+
+This is still a huge headache, though.
 -->
 
-<p class=XXX>We need to handle spaces/nbsps specially here.  See comment and <a href=http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html>WHATWG
-discussion</a>.
-
 <ol>
   <li><a href=#delete-the-contents>Delete the contents</a> of the <a href=#active-range>active range</a>.
   <!-- Chrome 14 dev does this even if passed the empty string. -->
@@ -5575,6 +5662,107 @@
   and that <a class=external data-anolis-spec=domcore href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-tree-child title=concept-tree-child>child</a> is a <code class=external data-anolis-spec=domcore><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#text>Text</a></code> node, set <var title="">node</var> to that <a class=external data-anolis-spec=domcore href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-tree-child title=concept-tree-child>child</a>,
   then set <var title="">offset</var> to zero.
 
+  <li>If <var title="">node</var> is a <code class=external data-anolis-spec=domcore><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#text>Text</a></code> node whose <a class=external data-anolis-spec=domcore href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-tree-parent title=concept-tree-parent>parent</a>'s <a href=http://www.w3.org/TR/CSS21/cascade.html#computed-value>computed value</a>
+  for "white-space" is neither "pre" nor "pre-wrap":
+
+  <p class=XXX>This also needs to take visually adjoining text nodes into
+  account, even if their parents are in different elements.  When inserting "a"
+  in "&lt;a href=/&gt;foo&nbsp;&lt;/a&gt;[] ", for instance, we need to convert the
+  nbsp to a regular space.  This kind of thing is just not feasible using pure
+  DOM stuff, though, so the current definition is a bad hack that will often
+  fail in real-world cases.  Suggestions for how to improve it appreciated.
+
+  <ol>
+    <li>Let <var title="">leading space</var> equal zero.
+
+    <li>Let <var title="">start offset</var> equal <var title="">offset</var> minus one.
+
+    <li>While <var title="">start offset</var> is nonnegative and the
+    <var title="">start offset</var>th <a href=http://es5.github.com/#x8.4>element</a> of <var title="">node</var>'s <code class=external data-anolis-spec=domcore title=dom-CharacterData-data><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-data>data</a></code> is a
+    space (U+0020) or non-breaking space (U+00A0):
+
+    <ol>
+      <li>If the <var title="">start offset</var>th <a href=http://es5.github.com/#x8.4>element</a> of <var title="">node</var>'s
+      <code class=external data-anolis-spec=domcore title=dom-CharacterData-data><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-data>data</a></code> is a non-breaking space (U+00A0), or the <a href=http://es5.github.com/#x8.4>element</a> before it
+      is not a space (U+0020), add one to <var title="">leading space</var>.
+
+      <li>Subtract one from <var title="">start offset</var>.
+    </ol>
+
+    <li>Add one to <var title="">start offset</var>.
+
+    <li>Let <var title="">trailing space</var> equal zero.
+
+    <li>Let <var title="">end offset</var> equal <var title="">offset</var>.
+
+    <li>While <var title="">end offset</var> is less than <var title="">node</var>'s <code class=external data-anolis-spec=domcore title=dom-CharacterData-length><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-length>length</a></code>
+    and the <var title="">end offset</var>th <a href=http://es5.github.com/#x8.4>element</a> of <var title="">node</var>'s <code class=external data-anolis-spec=domcore title=dom-CharacterData-data><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-data>data</a></code>
+    is a space (U+0020) or non-breaking space (U+00A0):
+
+    <ol>
+      <li>If the <var title="">end offset</var>th <a href=http://es5.github.com/#x8.4>element</a> of <var title="">node</var>'s
+      <code class=external data-anolis-spec=domcore title=dom-CharacterData-data><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-data>data</a></code> is a non-breaking space (U+00A0), or the <a href=http://es5.github.com/#x8.4>element</a> before it
+      is not a space (U+0020), add one to <var title="">trailing space</var>.
+      <!-- If we're between two spaces that are collapsed together, this means
+      we're effectively at the end of the collapsed run.  This shouldn't happen
+      with user-created selections, of course. -->
+
+      <li>Add one to <var title="">end offset</var>.
+    </ol>
+
+    <li>Set <var title="">initial nbsp</var> to true if <var title="">start offset</var> is 0,
+    false otherwise.
+
+    <li>Set <var title="">final nbsp</var> to true if <var title="">end offset</var> is the
+    <code class=external data-anolis-spec=domcore title=dom-CharacterData-length><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-length>length</a></code> of <var title="">node</var>, false otherwise.
+
+    <p class=XXX>These are wrong in zillions of common cases.  Per XXX above,
+    the fix here is not obvious at all.
+
+    <li>If <var title="">value</var> is a space (U+0020):
+
+    <ol>
+      <li>Let <var title="">new trailing space</var> be the <a href=#canonical-space-sequence>canonical space
+      sequence</a> of length <var title="">leading space</var> plus <var title="">trailing
+      space</var> plus one, with <var title="">non-breaking start</var> equal to
+      <var title="">initial nbsp</var> and <var title="">non-breaking end</var> equal to
+      <var title="">final nbsp</var>.
+
+      <li>Remove the first <var title="">leading space</var> <a href=http://es5.github.com/#x8.4>elements</a> from <var title="">new
+      trailing space</var>, and let <var title="">new leading space</var> be the result.
+
+      <li>Remove the first <a href=http://es5.github.com/#x8.4>element</a> from <var title="">new trailing space</var>, and
+      let <var title="">value</var> be the result.
+    </ol>
+
+    <li>Otherwise:
+
+    <ol>
+      <li>Let <var title="">new leading space</var> be the <a href=#canonical-space-sequence>canonical space
+      sequence</a> of length <var title="">leading space</var>, with
+      <var title="">non-breaking start</var> equal to <var title="">initial nbsp</var> and
+      <var title="">non-breaking end</var> equal to false.
+
+      <li>Let <var title="">new trailing space</var> be the <a href=#canonical-space-sequence>canonical space
+      sequence</a> of length <var title="">trailing space</var>, with
+      <var title="">non-breaking start</var> equal to false and <var title="">non-breaking
+      end</var> equal to <var title="">final nbsp</var>.
+    </ol>
+
+    <li>Call <code class=external data-anolis-spec=domcore title=dom-CharacterData-replaceData><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-replacedata>replaceData(<var title="">start offset</var>, <var title="">offset</var> &minus;
+    <var title="">start offset</var>, <var title="">new leading space</var>)</a></code> on <var title="">node</var>.
+
+    <li>Subtract <var title="">offset</var> from <var title="">end offset</var>, then add
+    <var title="">start offset</var> plus <var title="">leading space</var> to <var title="">end
+    offset</var>.
+
+    <li>Set <var title="">offset</var> to <var title="">start offset</var> plus <var title="">leading
+    space</var>.
+
+    <li>Call <code class=external data-anolis-spec=domcore title=dom-CharacterData-replaceData><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-replacedata>replaceData(<var title="">offset</var>, <var title="">end offset</var> &minus;
+    <var title="">offset</var>, <var title="">new trailing space</var>)</a></code> on <var title="">node</var>.
+  </ol>
+
   <li>If <var title="">node</var> is a <code class=external data-anolis-spec=domcore><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#text>Text</a></code> node:
 
   <ol>
@@ -5594,6 +5782,13 @@
   <li>If <var title="">node</var> has only one <a class=external data-anolis-spec=domcore href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-tree-child title=concept-tree-child>child</a>, which is a <a href=#collapsed-line-break>collapsed
   line break</a>, remove its <a class=external data-anolis-spec=domcore href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-tree-child title=concept-tree-child>child</a> from it.
 
+  <li>If <var title="">value</var> is a space (U+0020), set <var title="">value</var> to a
+  non-breaking space (U+00A0).
+
+  <p class=XXX>This is wrong in all sorts of cases, like
+  "foo&lt;b&gt;[]&lt;/b&gt;bar".  As above, this is hard to get right without heavy
+  CSS involvement.
+
   <li>Let <var title="">text</var> be the result of calling <code class=external data-anolis-spec=domcore title=dom-Document-createTextNode><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-document-createtextnode>createTextNode(<var title="">value</var>)</a></code> on
   the <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#context-object>context object</a>.
 
--- a/implementation.js	Mon Jun 20 15:52:03 2011 -0600
+++ b/implementation.js	Tue Jun 21 11:56:45 2011 -0600
@@ -3039,7 +3039,7 @@
 
 //@}
 
-///// Assorted block formatting algorithm commands /////
+///// Assorted block formatting command algorithms /////
 //@{
 
 function fixDisallowedAncestors(node) {
@@ -3219,6 +3219,67 @@
 	}
 }
 
+function canonicalSpaceSequence(n, nonBreakingStart, nonBreakingEnd) {
+	// "If n is zero, return the empty string."
+	if (n == 0) {
+		return "";
+	}
+
+	// "If n is one and both non-breaking start and non-breaking end are false,
+	// return a single space (U+0020)."
+	if (n == 1 && !nonBreakingStart && !nonBreakingEnd) {
+		return " ";
+	}
+
+	// "If n is one, return a single non-breaking space (U+00A0)."
+	if (n == 1) {
+		return "\xa0";
+	}
+
+	// "Let buffer be the empty string."
+	var buffer = "";
+
+	// "If non-breaking start is true, let repeated pair be U+00A0 U+0020.
+	// Otherwise, let it be U+0020 U+00A0."
+	var repeatedPair;
+	if (nonBreakingStart) {
+		repeatedPair = "\xa0 ";
+	} else {
+		repeatedPair = " \xa0";
+	}
+
+	// "While n is greater than three, append repeated pair to buffer and
+	// subtract two from n."
+	while (n > 3) {
+		buffer += repeatedPair;
+		n -= 2;
+	}
+
+	// "If n is three, append a three-element string to buffer depending on
+	// non-breaking start and non-breaking end:"
+	if (n == 3) {
+		buffer +=
+			!nonBreakingStart && !nonBreakingEnd ? " \xa0 "
+			: nonBreakingStart && !nonBreakingEnd ? "\xa0\xa0 "
+			: !nonBreakingStart && nonBreakingEnd ? " \xa0\xa0"
+			: nonBreakingStart && nonBreakingEnd ? "\xa0 \xa0"
+			: "impossible";
+
+	// "Otherwise, append a two-element string to buffer depending on
+	// non-breaking start and non-breaking end:"
+	} else {
+		buffer +=
+			!nonBreakingStart && !nonBreakingEnd ? "\xa0 "
+			: nonBreakingStart && !nonBreakingEnd ? "\xa0 "
+			: !nonBreakingStart && nonBreakingEnd ? " \xa0"
+			: nonBreakingStart && nonBreakingEnd ? "\xa0\xa0"
+			: "impossible";
+	}
+
+	// "Return buffer."
+	return buffer;
+}
+
 //@}
 
 ///// Allowed children /////
@@ -5734,6 +5795,115 @@
 			offset = 0;
 		}
 
+		// "If node is a Text node whose parent's computed value for
+		// "white-space" is neither "pre" nor "pre-wrap":"
+		if (node.nodeType == Node.TEXT_NODE
+		&& ["pre", "pre-wrap"].indexOf(getComputedStyle(node.parentNode).whiteSpace) == -1) {
+			// "Let leading space equal zero."
+			var leadingSpace = 0;
+
+			// "Let start offset equal offset minus one."
+			var startOffset = offset - 1;
+
+			// "While start offset is nonnegative and the start offsetth
+			// element of node's data is a space (U+0020) or non-breaking space
+			// (U+00A0):"
+			while (startOffset >= 0
+			&& /[ \xa0]/.test(node.data[startOffset])) {
+				// "If the start offsetth element of node's data is a
+				// non-breaking space (U+00A0), or the element before it is not
+				// a space (U+0020), add one to leading space."
+				if (node.data[startOffset] == "\xa0"
+				|| node.data[startOffset - 1] !== " ") {
+					leadingSpace++;
+				}
+
+				// "Subtract one from start offset."
+				startOffset--;
+			}
+
+			// "Add one to start offset."
+			startOffset++;
+
+			// "Let trailing space equal zero."
+			var trailingSpace = 0;
+
+			// "Let end offset equal offset."
+			var endOffset = offset;
+
+			// "While end offset is less than node's length and the end
+			// offsetth element of node's data is a space (U+0020) or
+			// non-breaking space (U+00A0):"
+			while (endOffset < node.length
+			&& /[ \xa0]/.test(node.data[endOffset])) {
+				// "If the end offsetth element of node's data is a
+				// non-breaking space (U+00A0), or the element before it is not
+				// a space (U+0020), add one to trailing space."
+				if (node.data[endOffset] == "\xa0"
+				|| node.data[endOffset - 1] !== " ") {
+					trailingSpace++;
+				}
+
+				// "Add one to end offset."
+				endOffset++;
+			}
+
+			// "Set initial nbsp to true if start offset is 0, false
+			// otherwise."
+			var initialNbsp = startOffset == 0;
+
+			// "Set final nbsp to true if end offset is the length of node,
+			// false otherwise."
+			var finalNbsp = endOffset == node.length;
+
+			// "If value is a space (U+0020):"
+			if (value == " ") {
+				// "Let new trailing space be the canonical space sequence of
+				// length leading space plus trailing space plus one, with
+				// non-breaking start equal to initial nbsp and non-breaking
+				// end equal to final nbsp."
+				var newTrailingSpace = canonicalSpaceSequence(leadingSpace + trailingSpace + 1, initialNbsp, finalNbsp);
+
+				// "Remove the first leading space elements from new trailing
+				// space, and let new leading space be the result."
+				var newLeadingSpace = newTrailingSpace.slice(0, leadingSpace);
+				newTrailingSpace = newTrailingSpace.slice(leadingSpace);
+
+				// "Remove the first element from new trailing space, and let
+				// value be the result."
+				value = newTrailingSpace[0];
+				newTrailingSpace = newTrailingSpace.slice(1);
+
+			// "Otherwise:"
+			} else {
+				// "Let new leading space be the canonical space sequence of
+				// length leading space, with non-breaking start equal to
+				// initial nbsp and non-breaking end equal to false."
+				var newLeadingSpace = canonicalSpaceSequence(leadingSpace, initialNbsp, false);
+
+				// "Let new trailing space be the canonical space sequence of
+				// length trailing space, with non-breaking start equal to
+				// false and non-breaking end equal to final nbsp."
+				var newTrailingSpace = canonicalSpaceSequence(trailingSpace, false, finalNbsp);
+			}
+
+			// "Call replaceData(start offset, offset − start offset, new
+			// leading space) on node."
+			node.replaceData(startOffset, offset - startOffset, newLeadingSpace);
+
+			// "Subtract offset from end offset, then add start offset plus
+			// leading space to end offset."
+			endOffset -= offset;
+			endOffset += startOffset + leadingSpace;
+
+			// "Set offset to start offset plus leading space."
+			offset = startOffset + leadingSpace;
+
+			// "Call replaceData(offset, end offset − offset, new trailing
+			// space) on node."
+			node.replaceData(offset, endOffset - offset, newTrailingSpace);
+		}
+
 		// "If node is a Text node:"
 		if (node.nodeType == Node.TEXT_NODE) {
 			// "Call insertData(offset, value) on node."
@@ -5759,6 +5929,12 @@
 			node.removeChild(node.firstChild);
 		}
 
+		// "If value is a space (U+0020), set value to a non-breaking space
+		// (U+00A0)."
+		if (value == " ") {
+			value = "\xa0";
+		}
+
 		// "Let text be the result of calling createTextNode(value) on the
 		// context object."
 		var text = document.createTextNode(value);
--- a/preprocess	Mon Jun 20 15:52:03 2011 -0600
+++ b/preprocess	Tue Jun 21 11:56:45 2011 -0600
@@ -20,6 +20,7 @@
     'bpoffset': '<span data-anolis-spec=domrange title=concept-boundary-point-offset>offset</span>',
     'bpposition': '<span data-anolis-spec=domrange title=concept-bp-position>position</span>',
     'cddata': '<code data-anolis-spec=domcore title=dom-CharacterData-data>data</code>',
+    'cdlength': '<code data-anolis-spec=domcore title=dom-CharacterData-length>length</code>',
     'child': '<span data-anolis-spec=domcore title=concept-tree-child>child</span>',
     'children': '<span data-anolis-spec=domcore title=concept-tree-child>children</span>',
     'collection': '<span data-anolis-spec=domcore title=concept-collection>collection</span>',
@@ -121,6 +122,7 @@
     'extend': '<code data-anolis-spec=domrange title=dom-Selection-extend>extend(\\1)</code>',
     'insertdata': '<code data-anolis-spec=domcore title=dom-CharacterData-insertData>insertData(\\1)</code>',
     'insertnode': '<code data-anolis-spec=domrange title=dom-Range-insertNode>insertNode(\\1)</code>',
+    'replacedata': '<code data-anolis-spec=domcore title=dom-CharacterData-replaceData>replaceData(\\1)</code>',
     'selcollapse': '<code data-anolis-spec=domrange title=dom-Selection-collapse>collapse(\\1)</code>',
     'setattribute': '<code data-anolis-spec=domcore title=dom-Element-setAttribute>setAttribute(\\1)</code>',
 }
--- a/source.html	Mon Jun 20 15:52:03 2011 -0600
+++ b/source.html	Tue Jun 21 11:56:45 2011 -0600
@@ -2804,6 +2804,63 @@
     </ol>
   </ol>
 </ol>
+
+<p>The <dfn>canonical space sequence</dfn> of length <var>n</var>, with boolean
+flags <var>non-breaking start</var> and <var>non-breaking end</var>, is
+returned by the following algorithm:
+<!-- See long comment before insertText. -->
+
+<ol>
+  <li>If <var>n</var> is zero, return the empty string.
+
+  <li>If <var>n</var> is one and both <var>non-breaking start</var> and
+  <var>non-breaking end</var> are false, return a single space (U+0020).
+
+  <li>If <var>n</var> is one, return a single non-breaking space (U+00A0).
+
+  <li>Let <var>buffer</var> be the empty string.
+
+  <li>If <var>non-breaking start</var> is true, let <var>repeated pair</var> be
+  U+00A0 U+0020.  Otherwise, let it be U+0020 U+00A0.
+
+  <li>While <var>n</var> is greater than three, append <var>repeated pair</var>
+  to <var>buffer</var> and subtract two from <var>n</var>.
+
+  <li>If <var>n</var> is three, append a three-[[strel]] string to
+  <var>buffer</var> depending on <var>non-breaking start</var> and
+  <var>non-breaking end</var>:
+
+  <dl class=switch>
+    <dt><var>non-breaking start</var> and <var>non-breaking end</var> false
+    <dd>U+0020 U+00A0 U+0020
+
+    <dt><var>non-breaking start</var> true, <var>non-breaking end</var> false
+    <dd>U+00A0 U+00A0 U+0020
+
+    <dt><var>non-breaking start</var> false, <var>non-breaking end</var> true
+    <dd>U+0020 U+00A0 U+00A0
+
+    <dt><var>non-breaking start</var> and <var>non-breaking end</var> both true
+    <dd>U+00A0 U+0020 U+00A0
+  </dl>
+
+  <li>Otherwise, append a two-[[strel]] string to <var>buffer</var> depending
+  on <var>non-breaking start</var> and <var>non-breaking end</var>:
+
+  <dl class=switch>
+    <dt><var>non-breaking start</var> and <var>non-breaking end</var> false
+    <dt><var>non-breaking start</var> true, <var>non-breaking end</var> false
+    <dd>U+00A0 U+0020
+
+    <dt><var>non-breaking start</var> false, <var>non-breaking end</var> true
+    <dd>U+0020 U+00A0
+
+    <dt><var>non-breaking start</var> and <var>non-breaking end</var> both true
+    <dd>U+00A0 U+00A0
+  </dl>
+
+  <li>Return <var>buffer</var>.
+</ol>
 <!-- @} -->
 
 <h3>Allowed children</h3>
@@ -4248,6 +4305,8 @@
 
 <h3><dfn>The <code title>delete</code> command</dfn></h3>
 <!-- @{ -->
+<p class=XXX>Needs nbsp magic.
+
 <p><span>Action</span>:
 
 <ol>
@@ -4734,6 +4793,8 @@
 
 <h3><dfn>The <code title>forwardDelete</code> command</dfn></h3>
 <!-- @{ -->
+<p class=XXX>Needs nbsp magic.
+
 <p><span>Action</span>:
 <!-- Copy-pasted from delete, see there for comments. -->
 
@@ -5539,12 +5600,37 @@
 
 Opera 11.11 has varying behavior, like Firefox and Chrome.  Like Firefox, I
 didn't discern an obvious pattern.
+
+This was discussed: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html
+
+Unfortunately, we're stuck with this nbsp stuff, because of 1) legacy reasons,
+2) mail clients might not support CSS equivalents, 3) authors might not know to
+apply any CSS to wherever the content is eventually used.  The behavior I
+decided on to minimize the evil is as follows:
+
+* If the first and last spaces are in non-collapsing positions, two spaces is
+  nbsp+space, three is space+nbsp+space, four or more is space+nbsp followed by
+  the pattern for two less.
+* If the first space has to be an nbsp so it doesn't collapse, three is instead
+  nbsp+nbsp+space, four or more is nbsp+space followed by the pattern for two
+  less.
+* If the last space has to be nbsp, two is space+nbsp, three is
+  space+nbsp+nbsp, four or more is space+nbsp followed by the pattern for two
+  less.
+* If the first and last space must both be nbsp, two is nbsp+nbsp, three is
+  nbsp+space+nbsp, four or more is nbsp+space followed by the pattern for two
+  less.
+
+This avoids nbsp at the end of a run except where it's needed, so words won't
+appear indented if they wrap to the next line.  It avoids more than two nbsp's
+in a row, so there won't be huge chunks of space that get wrapped all at once.
+And it avoids nbsp at the beginning of a run except where it's needed or
+if there are only two spaces in the run, so words won't have to wrap
+unnecessarily.
+
+This is still a huge headache, though.
 -->
 
-<p class=XXX>We need to handle spaces/nbsps specially here.  See comment and <a
-href=http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html>WHATWG
-discussion</a>.
-
 <ol>
   <li><span>Delete the contents</span> of the <span>active range</span>.
   <!-- Chrome 14 dev does this even if passed the empty string. -->
@@ -5594,6 +5680,107 @@
   and that [[child]] is a [[text]] node, set <var>node</var> to that [[child]],
   then set <var>offset</var> to zero.
 
+  <li>If <var>node</var> is a [[text]] node whose [[parent]]'s [[compval]]
+  for "white-space" is neither "pre" nor "pre-wrap":
+
+  <p class=XXX>This also needs to take visually adjoining text nodes into
+  account, even if their parents are in different elements.  When inserting "a"
+  in "&lt;a href=/>foo&nbsp;&lt;/a>[] ", for instance, we need to convert the
+  nbsp to a regular space.  This kind of thing is just not feasible using pure
+  DOM stuff, though, so the current definition is a bad hack that will often
+  fail in real-world cases.  Suggestions for how to improve it appreciated.
+
+  <ol>
+    <li>Let <var>leading space</var> equal zero.
+
+    <li>Let <var>start offset</var> equal <var>offset</var> minus one.
+
+    <li>While <var>start offset</var> is nonnegative and the
+    <var>start offset</var>th [[strel]] of <var>node</var>'s [[cddata]] is a
+    space (U+0020) or non-breaking space (U+00A0):
+
+    <ol>
+      <li>If the <var>start offset</var>th [[strel]] of <var>node</var>'s
+      [[cddata]] is a non-breaking space (U+00A0), or the [[strel]] before it
+      is not a space (U+0020), add one to <var>leading space</var>.
+
+      <li>Subtract one from <var>start offset</var>.
+    </ol>
+
+    <li>Add one to <var>start offset</var>.
+
+    <li>Let <var>trailing space</var> equal zero.
+
+    <li>Let <var>end offset</var> equal <var>offset</var>.
+
+    <li>While <var>end offset</var> is less than <var>node</var>'s [[cdlength]]
+    and the <var>end offset</var>th [[strel]] of <var>node</var>'s [[cddata]]
+    is a space (U+0020) or non-breaking space (U+00A0):
+
+    <ol>
+      <li>If the <var>end offset</var>th [[strel]] of <var>node</var>'s
+      [[cddata]] is a non-breaking space (U+00A0), or the [[strel]] before it
+      is not a space (U+0020), add one to <var>trailing space</var>.
+      <!-- If we're between two spaces that are collapsed together, this means
+      we're effectively at the end of the collapsed run.  This shouldn't happen
+      with user-created selections, of course. -->
+
+      <li>Add one to <var>end offset</var>.
+    </ol>
+
+    <li>Set <var>initial nbsp</var> to true if <var>start offset</var> is 0,
+    false otherwise.
+
+    <li>Set <var>final nbsp</var> to true if <var>end offset</var> is the
+    [[cdlength]] of <var>node</var>, false otherwise.
+
+    <p class=XXX>These are wrong in zillions of common cases.  Per XXX above,
+    the fix here is not obvious at all.
+
+    <li>If <var>value</var> is a space (U+0020):
+
+    <ol>
+      <li>Let <var>new trailing space</var> be the <span>canonical space
+      sequence</span> of length <var>leading space</var> plus <var>trailing
+      space</var> plus one, with <var>non-breaking start</var> equal to
+      <var>initial nbsp</var> and <var>non-breaking end</var> equal to
+      <var>final nbsp</var>.
+
+      <li>Remove the first <var>leading space</var> [[strels]] from <var>new
+      trailing space</var>, and let <var>new leading space</var> be the result.
+
+      <li>Remove the first [[strel]] from <var>new trailing space</var>, and
+      let <var>value</var> be the result.
+    </ol>
+
+    <li>Otherwise:
+
+    <ol>
+      <li>Let <var>new leading space</var> be the <span>canonical space
+      sequence</span> of length <var>leading space</var>, with
+      <var>non-breaking start</var> equal to <var>initial nbsp</var> and
+      <var>non-breaking end</var> equal to false.
+
+      <li>Let <var>new trailing space</var> be the <span>canonical space
+      sequence</span> of length <var>trailing space</var>, with
+      <var>non-breaking start</var> equal to false and <var>non-breaking
+      end</var> equal to <var>final nbsp</var>.
+    </ol>
+
+    <li>Call [[replacedata|<var>start offset</var>, <var>offset</var> &minus;
+    <var>start offset</var>, <var>new leading space</var>]] on <var>node</var>.
+
+    <li>Subtract <var>offset</var> from <var>end offset</var>, then add
+    <var>start offset</var> plus <var>leading space</var> to <var>end
+    offset</var>.
+
+    <li>Set <var>offset</var> to <var>start offset</var> plus <var>leading
+    space</var>.
+
+    <li>Call [[replacedata|<var>offset</var>, <var>end offset</var> &minus;
+    <var>offset</var>, <var>new trailing space</var>]] on <var>node</var>.
+  </ol>
+
   <li>If <var>node</var> is a [[text]] node:
 
   <ol>
@@ -5613,6 +5800,13 @@
   <li>If <var>node</var> has only one [[child]], which is a <span>collapsed
   line break</span>, remove its [[child]] from it.
 
+  <li>If <var>value</var> is a space (U+0020), set <var>value</var> to a
+  non-breaking space (U+00A0).
+
+  <p class=XXX>This is wrong in all sorts of cases, like
+  "foo&lt;b>[]&lt;/b>bar".  As above, this is hard to get right without heavy
+  CSS involvement.
+
   <li>Let <var>text</var> be the result of calling <code
   data-anolis-spec=domcore
   title=dom-Document-createTextNode>createTextNode(<var>value</var>)</code> on