Lots of research on insertText with spaces
author Aryeh Gregor <AryehGregor+gitcommit@gmail.com>
Mon, 20 Jun 2011 15:21:57 -0600
changeset 301 6583685ff9c9
parent 300 f654c59dd768
child 302 971ee4506a58
Lots of research on insertText with spaces

But no good results yet. I sent a message to the mailing list
soliciting feedback.
editcommands.html
implementation.js
inserttext2test.html
inserttexttest.html
manualtest.js
preprocess
source.html
tests.css
tests.js
--- a/editcommands.html	Sun Jun 19 15:59:41 2011 -0600
+++ b/editcommands.html	Mon Jun 20 15:21:57 2011 -0600
@@ -38,7 +38,7 @@
 <body class=draft>
 <div class=head id=head>
 <h1>HTML Editing Commands</h1>
-<h2 class="no-num no-toc" id=work-in-progress-&mdash;-last-update-19-june-2011>Work in Progress &mdash; Last Update 19 June 2011</h2>
+<h2 class="no-num no-toc" id=work-in-progress-&mdash;-last-update-20-june-2011>Work in Progress &mdash; Last Update 20 June 2011</h2>
 <dl>
  <dt>Editor
  <dd>Aryeh Gregor &lt;ayg+spec@aryeh.name&gt;
@@ -5480,38 +5480,81 @@
 
 <p><a href=#action>Action</a>:
 
+<!--
+Non-breaking space fun!  The issue: if the user hits space twice, they expect
+it to create two spaces, not collapse.  Also, if they're at the beginning or
+end of a line and hit space, again, they expect it not to collapse.  Since we
+don't want to require that all contenteditable element contents always be used
+only with white-space: pre-wrap, we need to convert to and from non-breaking
+spaces.
+
+But there's a catch: you can't just make spaces non-breaking willy-nilly,
+because that doesn't just stop the space from collapsing, it also prevents
+breaking.  (Chrome 14 dev actually cheats here: in contenteditable, it doesn't
+collapse nbsp, but breaks after it like a regular space.)  The upshot of this
+is that any nbsp needs to be followed by a space, or else it might end up at
+the beginning of a line and be visible there; and it needs to be preceded by a
+space, or else it might break a line prematurely.  How to achieve both of these
+goals when there are an even number of spaces to display is left as an exercise
+for the reader.
+
+Browsers vary greatly in how they handle all this, of course!
+
+The basic philosophy of IE9 is that if you're inserting a space, and one or
+both of the neighboring characters is a space, change the neighboring
+characters to non-breaking spaces.  This breaks if one of the neighboring
+characters is part of a run of collapsed whitespace: "foo  []bar" becomes "foo
+&nbsp; []bar", which converts one visible space to three.
+
+Firefox 6.0a2 will sometimes convert the space you're inserting to an nbsp,
+sometimes convert neighboring spaces to nbsps, and sometimes convert
+neighboring nbsps to spaces.  I cannot discern any clear pattern in when it
+chooses which, except that it seems to prefer runs of nbsp's followed by a
+single space (although not always).  I didn't find any outright bugs, except
+the inevitable ones like nbsp's sometimes being right after letters.
+
+Chrome 14 dev tries to normalize everything to look like " &nbsp; &nbsp; ...",
+alternating with space then nbsp.  Unfortunately, it does so buggily, because
+it converts collapsed spaces to nbsp's, so inserting a space before " " makes
+it into " &nbsp; &nbsp;", which changes one visible space to four (or
+arbitrarily many).
+
+Opera 11.11 has varying behavior, like Firefox and Chrome.  Like Firefox, I
+didn't discern an obvious pattern.
+-->
+
+<p class=XXX>We need to handle spaces/nbsps specially here.  See comment and <a href=http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html>WHATWG
+discussion</a>.
+
 <ol>
   <li><a href=#delete-the-contents>Delete the contents</a> of the <a href=#active-range>active range</a>.
   <!-- Chrome 14 dev does this even if passed the empty string. -->
 
-  <li>While <var title="">value</var> contains a newline (U+0010):
-
-  <ol>
-    <li>Let <var title="">subvalue</var> be the empty string.
-
-    <li>While the first 16-bit element of <var title="">value</var> is not a newline
-    (0x0010), remove the first 16-bit element from <var title="">value</var> and append
-    it to <var title="">subvalue</var>.
-
-    <li>Take the <a href=#action>action</a> for <a href=#the-inserttext-command>the <code title="">insertText</code> command</a>, with <var title="">value</var> set to
-    <var title="">subvalue</var>.
-
-    <li>Remove the first 16-bit element from <var title="">value</var>.
-
-    <li>Take the <a href=#action>action</a> for <a href=#the-insertparagraph-command>the <code title="">insertParagraph</code> command</a>.
-  </ol>
-
-  <li>If <var title="">value</var> is the empty string, abort these steps.
-
-  <p class=XXX>WebKit does magic for tabs, wrapping them in a
-  whitespace-preserving span.
-
   <p class=XXX>This doesn't work well if the input contains things that aren't
   supposed to appear in HTML, like carriage returns or nulls.  Nor is it going
   to work well if the current cursor position is in between two halves of a
   non-BMP character.  This will result in unserializability.  The current spec
   disregards this, as Chrome 14 dev does.
 
+  <li>If <var title="">value</var>'s <a href=http://es5.github.com/#x15.5.5.1>length</a> is greater than one:
+
+  <ol>
+    <li>For each <a href=http://es5.github.com/#x8.4>element</a> <var title="">el</var> in <var title="">value</var>, take the
+    <a href=#action>action</a> for <a href=#the-inserttext-command>the <code title="">insertText</code>
+    command</a>, with <var title="">value</var> equal to <var title="">el</var>.
+
+    <li>Abort these steps.
+  </ol>
+
+  <li>If <var title="">value</var> is the empty string, abort these steps.
+
+  <li>If <var title="">value</var> is a newline (U+000A), take the <a href=#action>action</a>
+  for <a href=#the-insertparagraph-command>the <code title="">insertParagraph</code> command</a> and abort
+  these steps.
+
+  <p class=XXX>WebKit also does magic for tabs, wrapping them in a
+  whitespace-preserving span.
+
   <li>Let <var title="">node</var> and <var title="">offset</var> be the <a href=#active-range>active
   range</a>'s <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range-start title=concept-range-start>start</a> <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-boundary-point-node title=concept-boundary-point-node>node</a> and <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-boundary-point-offset title=concept-boundary-point-offset>offset</a>.
 
@@ -5538,8 +5581,7 @@
     <li>Call <code class=external data-anolis-spec=domcore title=dom-CharacterData-insertData><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-insertdata>insertData(<var title="">offset</var>, <var title="">value</var>)</a></code> on
     <var title="">node</var>.
 
-    <li>Add the <a href=http://es5.github.com/#x15.5.5.1>length</a> of
-    <var title="">value</var> to <var title="">offset</var>.
+    <li>Add <var title="">value</var>'s <a href=http://es5.github.com/#x15.5.5.1>length</a> to <var title="">offset</var>.
 
     <li>Call <code class=external data-anolis-spec=domrange title=dom-Selection-collapse><a href=http://html5.org/specs/dom-range.html#dom-selection-collapse>collapse(<var title="">node</var>, <var title="">offset</var>)</a></code> on the
     <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#context-object>context object</a>'s <code class=external data-anolis-spec=domrange><a href=http://html5.org/specs/dom-range.html#selection>Selection</a></code>.
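To make the collapsing arithmetic in the nbsp comment above concrete, here is a minimal sketch that is not part of this changeset. The helper name visibleSpaces is hypothetical, and the model is deliberately simplified: it assumes white-space: normal in the middle of a line, where a run of regular spaces renders as one space and every nbsp (U+00A0) is always visible. It ignores line starts, line ends, and wrapping.

```javascript
// Hypothetical helper, not from the spec or implementation.js.
// Simplified model of white-space: normal in the middle of a line:
// a run of regular spaces renders as one space, each nbsp renders as one.
function visibleSpaces(text) {
	var count = 0;
	var inSpaceRun = false;
	for (var i = 0; i < text.length; i++) {
		var ch = text[i];
		if (ch == "\u00a0") {
			// An nbsp never collapses, and it ends any run of regular spaces.
			count++;
			inSpaceRun = false;
		} else if (ch == " ") {
			// Only the first regular space of a run is visible.
			if (!inSpaceRun) {
				count++;
			}
			inSpaceRun = true;
		} else {
			inSpaceRun = false;
		}
	}
	return count;
}

// The IE9 case from the comment: "foo  []bar" shows one space, while
// "foo &nbsp; []bar" shows three.
var before = visibleSpaces("foo  bar");
var after = visibleSpaces("foo \u00a0 bar");
```

Under these assumptions, before is 1 and after is 3, which matches the "one visible space to three" observation in the comment.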
--- a/implementation.js	Sun Jun 19 15:59:41 2011 -0600
+++ b/implementation.js	Mon Jun 20 15:21:57 2011 -0600
@@ -5687,28 +5687,16 @@
 		// "Delete the contents of the active range."
 		deleteContents(getActiveRange());
 
-		// "While value contains a newline (U+0010):"
-		while (value.indexOf("\n") != -1) {
-			// "Let subvalue be the empty string."
-			var subvalue = "";
-
-			// "While the first 16-bit element of value is not a newline
-			// (0x0010), remove the first 16-bit element from value and append
-			// it to subvalue."
-			while (value[0] != "\n") {
-				subvalue += value[0];
-				value = value.slice(1);
+		// "If value's length is greater than one:"
+		if (value.length > 1) {
+			// "For each element el in value, take the action for the
+			// insertText command, with value equal to el."
+			for (var i = 0; i < value.length; i++) {
+				commands.inserttext.action(value[i]);
 			}
 
-			// "Take the action for the insertText command, with value set to
-			// subvalue."
-			commands.inserttext.action(subvalue);
-
-			// "Remove the first 16-bit element from value."
-			value = value.slice(1);
-
-			// "Take the action for the insertParagraph command."
-			commands.insertparagraph.action();
+			// "Abort these steps."
+			return;
 		}
 
 		// "If value is the empty string, abort these steps."
@@ -5716,6 +5704,13 @@
 			return;
 		}
 
+		// "If value is a newline (U+000A), take the action for the
+		// insertParagraph command and abort these steps."
+		if (value == "\n") {
+			commands.insertparagraph.action();
+			return;
+		}
+
 		// "Let node and offset be the active range's start node and offset."
 		var node = getActiveRange().startContainer;
 		var offset = getActiveRange().startOffset;
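The control flow of the rewritten action can be tried outside the browser with a small sketch. This is an illustration only: insertTextSketch and its log parameter are hypothetical stand-ins for commands.inserttext.action, and the sketch records what would happen instead of touching the DOM. It also skips the initial "delete the contents of the active range" step.

```javascript
// Hypothetical stand-in for commands.inserttext.action: it appends a
// description of each step to `log` rather than editing a document.
function insertTextSketch(value, log) {
	// "If value's length is greater than one:" handle each 16-bit element
	// on its own, then stop.
	if (value.length > 1) {
		for (var i = 0; i < value.length; i++) {
			insertTextSketch(value[i], log);
		}
		return log;
	}

	// "If value is the empty string, abort these steps."
	if (value == "") {
		return log;
	}

	// A lone newline is handed to insertParagraph.
	if (value == "\n") {
		log.push("insertParagraph");
		return log;
	}

	// Anything else would continue to the real insertion steps.
	log.push("insert " + JSON.stringify(value));
	return log;
}

var steps = insertTextSketch("a\nb", []);
```

Here steps holds three entries: an insert of "a", an insertParagraph, and an insert of "b". Because the loop walks 16-bit elements, a non-BMP character passed as value would be split into two single-element calls in this sketch.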
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/inserttext2test.html	Mon Jun 20 15:21:57 2011 -0600
@@ -0,0 +1,35 @@
+<!doctype html>
+<meta charset=utf-8>
+<title>Manual insertText tests (space key)</title>
+<link rel=stylesheet href=tests.css>
+<p>Legend: {[ are the selection anchor, }] are the selection focus, {}
+represent an element boundary point, [] represent a text node boundary point.
+Syntax and some of the tests taken from <a
+href=http://www.browserscope.org/richtext2/test>Browserscope</a>.  data-start
+and data-end attributes also represent element boundary points, with the node
+being the element with the attribute and the offset given as the attribute
+value, for cases where HTML parsing doesn't allow text nodes.  Currently we
+don't really pay attention to reversed selections at all, so they might get
+displayed as forwards or such.
+
+<p><input type=button value="Clear cached results" onclick="clearCachedResults()">
+
+<div id=tests>
+	<input type=button value="Run tests" onclick="runTests()">
+	<table border=1><tr><th>Input<th>Spec<th>Browser<th>Same</table>
+	<p><label>New test input: <input></label> <input type=button value="Add test" onclick="addTest()">
+</div>
+
+<div id=overlay>Tap the space key repeatedly until this annoying message
+disappears!  (But not too quickly.  And don't hit any other keys or click with
+the mouse anywhere, it will mess it up.<span id=testcount>  <span></span>
+manual test(s) remain.</span>)</div>
+
+<script src=implementation.js></script>
+<script src=tests.js></script>
+<script>
+var command = "inserttext";
+var keyname = "textinsertion2";
+var globalValue = " ";
+</script>
+<script src=manualtest.js></script>
--- a/inserttexttest.html	Sun Jun 19 15:59:41 2011 -0600
+++ b/inserttexttest.html	Mon Jun 20 15:21:57 2011 -0600
@@ -1,6 +1,6 @@
 <!doctype html>
 <meta charset=utf-8>
-<title>Manual insertText tests</title>
+<title>Manual insertText tests (A key)</title>
 <link rel=stylesheet href=tests.css>
 <p>Legend: {[ are the selection anchor, }] are the selection focus, {}
 represent an element boundary point, [] represent a text node boundary point.
--- a/manualtest.js	Sun Jun 19 15:59:41 2011 -0600
+++ b/manualtest.js	Mon Jun 20 15:21:57 2011 -0600
@@ -1,6 +1,13 @@
-// We don't support non-default values for manual tests, so only strings need
-// apply.
-var tests = tests[command].filter(function(test) { return typeof test == "string"});
+// If globalValue isn't set, we don't support non-default values for manual
+// tests, so only strings need apply.
+if ("globalValue" in window) {
+	tests = tests[command].filter(function(test) {
+		return typeof test == "object"
+			&& test[0] == globalValue;
+	}).map(function(test) { return test[1] });
+} else {
+	tests = tests[command].filter(function(test) { return typeof test == "string"});
+}
 
 var testsRunning = false;
 
@@ -40,9 +47,8 @@
 	var tr = doSetup("#tests table", 0);
 	var input = document.querySelector("#tests label input");
 	var test = input.value;
-	if (command == "inserttext") {
-		// Yay hack
-		test = ["a", test];
+	if (command in defaultValues) {
+		test = ["globalValue" in window ? globalValue : defaultValues[command], test];
 	}
 	doInputCell(tr, test);
 	doSpecCell(tr, test, command, false);
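The test selection added to manualtest.js can be sketched in isolation as follows. The sampleTests array and the selectManualTests function are hypothetical; the sample only imitates the shape of the inserttext list in tests.js, and the function takes the value as a parameter instead of reading window.globalValue.

```javascript
// Hypothetical sample shaped like the inserttext list in tests.js: a bare
// string uses the command's default value, a [value, html] pair overrides it.
var sampleTests = [
	"foo[]bar",
	["\t", "foo[]bar"],
	[" ", "foo[]bar"],
	[" ", "foo []bar"]
];

// Mirrors the selection logic in the hunk above. With no value given, only
// the bare-string tests are kept; otherwise only the pairs whose first
// entry equals the value are kept, reduced to their HTML input.
function selectManualTests(allTests, value) {
	if (value === undefined) {
		return allTests.filter(function(test) {
			return typeof test == "string";
		});
	}
	return allTests.filter(function(test) {
		return typeof test == "object" && test[0] == value;
	}).map(function(test) {
		return test[1];
	});
}

var spaceTests = selectManualTests(sampleTests, " ");
var defaultTests = selectManualTests(sampleTests);
```

With this sample, spaceTests contains the two space-key inputs and defaultTests contains only the bare string, which is the split between inserttext2test.html and inserttexttest.html.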
--- a/preprocess	Sun Jun 19 15:59:41 2011 -0600
+++ b/preprocess	Mon Jun 20 15:21:57 2011 -0600
@@ -25,6 +25,7 @@
     'collection': '<span data-anolis-spec=domcore title=concept-collection>collection</span>',
     'contained': '<span data-anolis-spec=domrange>contained</span>',
     'comment': '<code data-anolis-spec=domcore>Comment</code>',
+    'compval': '<a href=http://www.w3.org/TR/CSS21/cascade.html#computed-value>computed value</a>',
     'contextobject': '<span data-anolis-spec=domrange>context object</span>',
     'descendant': '<span data-anolis-spec=domcore title=concept-tree-descendant>descendant</span>',
     'directionality': '<span data-anolis-spec=html title="the directionality">directionality</span>',
@@ -82,7 +83,9 @@
     'span': '<code data-anolis-spec=html title="the span element">span</code>',
     'startnode': '<span data-anolis-spec=domrange title=concept-range-start>start</span> <span data-anolis-spec=domrange title=concept-boundary-point-node>node</span>',
     'startoffset': '<span data-anolis-spec=domrange title=concept-range-start>start</span> <span data-anolis-spec=domrange title=concept-boundary-point-offset>offset</span>',
+    'strel': '<a href=http://es5.github.com/#x8.4>element</a>',
     'strike': '<code data-anolis-spec=html title="the strike element">strike</code>',
+    'strlen': '<a href=http://es5.github.com/#x15.5.5.1>length</a>',
     'strong': '<code data-anolis-spec=html title="the strong element">strong</code>',
     'style': '<code data-anolis-spec=html title="the style attribute">style</code>',
     'sub': '<code data-anolis-spec=html title="the sub and sup elements">sub</code>',
--- a/source.html	Sun Jun 19 15:59:41 2011 -0600
+++ b/source.html	Mon Jun 20 15:21:57 2011 -0600
@@ -5498,40 +5498,82 @@
 
 <p><span>Action</span>:
 
+<!--
+Non-breaking space fun!  The issue: if the user hits space twice, they expect
+it to create two spaces, not collapse.  Also, if they're at the beginning or
+end of a line and hit space, again, they expect it not to collapse.  Since we
+don't want to require that all contenteditable element contents always be used
+only with white-space: pre-wrap, we need to convert to and from non-breaking
+spaces.
+
+But there's a catch: you can't just make spaces non-breaking willy-nilly,
+because that doesn't just stop the space from collapsing, it also prevents
+breaking.  (Chrome 14 dev actually cheats here: in contenteditable, it doesn't
+collapse nbsp, but breaks after it like a regular space.)  The upshot of this
+is that any nbsp needs to be followed by a space, or else it might end up at
+the beginning of a line and be visible there; and it needs to be preceded by a
+space, or else it might break a line prematurely.  How to achieve both of these
+goals when there are an even number of spaces to display is left as an exercise
+for the reader.
+
+Browsers vary greatly in how they handle all this, of course!
+
+The basic philosophy of IE9 is that if you're inserting a space, and one or
+both of the neighboring characters is a space, change the neighboring
+characters to non-breaking spaces.  This breaks if one of the neighboring
+characters is part of a run of collapsed whitespace: "foo  []bar" becomes "foo
+&nbsp; []bar", which converts one visible space to three.
+
+Firefox 6.0a2 will sometimes convert the space you're inserting to an nbsp,
+sometimes convert neighboring spaces to nbsps, and sometimes convert
+neighboring nbsps to spaces.  I cannot discern any clear pattern in when it
+chooses which, except that it seems to prefer runs of nbsp's followed by a
+single space (although not always).  I didn't find any outright bugs, except
+the inevitable ones like nbsp's sometimes being right after letters.
+
+Chrome 14 dev tries to normalize everything to look like " &nbsp; &nbsp; ...",
+alternating with space then nbsp.  Unfortunately, it does so buggily, because
+it converts collapsed spaces to nbsp's, so inserting a space before " " makes
+it into " &nbsp; &nbsp;", which changes one visible space to four (or
+arbitrarily many).
+
+Opera 11.11 has varying behavior, like Firefox and Chrome.  Like Firefox, I
+didn't discern an obvious pattern.
+-->
+
+<p class=XXX>We need to handle spaces/nbsps specially here.  See comment and <a
+href=http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-June/032187.html>WHATWG
+discussion</a>.
+
 <ol>
   <li><span>Delete the contents</span> of the <span>active range</span>.
   <!-- Chrome 14 dev does this even if passed the empty string. -->
 
-  <li>While <var>value</var> contains a newline (U+0010):
-
-  <ol>
-    <li>Let <var>subvalue</var> be the empty string.
-
-    <li>While the first 16-bit element of <var>value</var> is not a newline
-    (0x0010), remove the first 16-bit element from <var>value</var> and append
-    it to <var>subvalue</var>.
-
-    <li>Take the <span>action</span> for <span>the <code
-    title>insertText</code> command</span>, with <var>value</var> set to
-    <var>subvalue</var>.
-
-    <li>Remove the first 16-bit element from <var>value</var>.
-
-    <li>Take the <span>action</span> for <span>the <code
-    title>insertParagraph</code> command</span>.
-  </ol>
-
-  <li>If <var>value</var> is the empty string, abort these steps.
-
-  <p class=XXX>WebKit does magic for tabs, wrapping them in a
-  whitespace-preserving span.
-
   <p class=XXX>This doesn't work well if the input contains things that aren't
   supposed to appear in HTML, like carriage returns or nulls.  Nor is it going
   to work well if the current cursor position is in between two halves of a
   non-BMP character.  This will result in unserializability.  The current spec
   disregards this, as Chrome 14 dev does.
 
+  <li>If <var>value</var>'s [[strlen]] is greater than one:
+
+  <ol>
+    <li>For each [[strel]] <var>el</var> in <var>value</var>, take the
+    <span>action</span> for <span>the <code title>insertText</code>
+    command</span>, with <var>value</var> equal to <var>el</var>.
+
+    <li>Abort these steps.
+  </ol>
+
+  <li>If <var>value</var> is the empty string, abort these steps.
+
+  <li>If <var>value</var> is a newline (U+000A), take the <span>action</span>
+  for <span>the <code title>insertParagraph</code> command</span> and abort
+  these steps.
+
+  <p class=XXX>WebKit also does magic for tabs, wrapping them in a
+  whitespace-preserving span.
+
   <li>Let <var>node</var> and <var>offset</var> be the <span>active
   range</span>'s [[startnode]] and [[bpoffset]].
 
@@ -5558,8 +5600,7 @@
     <li>Call [[insertdata|<var>offset</var>, <var>value</var>]] on
     <var>node</var>.
 
-    <li>Add the <a href=http://es5.github.com/#x15.5.5.1>length</a> of
-    <var>value</var> to <var>offset</var>.
+    <li>Add <var>value</var>'s [[strlen]] to <var>offset</var>.
 
     <li>Call [[selcollapse|<var>node</var>, <var>offset</var>]] on the
     [[contextobject]]'s [[selection]].
--- a/tests.css	Sun Jun 19 15:59:41 2011 -0600
+++ b/tests.css	Mon Jun 20 15:21:57 2011 -0600
@@ -37,6 +37,9 @@
 body.wbr-workaround > div > table > tbody > tr > td > div:last-child {
 	word-wrap: break-word;
 }
+body > div > table > tbody > tr > td > div:last-child {
+	white-space: pre-wrap;
+}
 /* https://bugs.webkit.org/show_bug.cgi?id=56670 */
 dfn { font-style: italic }
 /* Opera has weird default blockquote style */
--- a/tests.js	Sun Jun 19 15:59:41 2011 -0600
+++ b/tests.js	Mon Jun 20 15:21:57 2011 -0600
@@ -1795,7 +1795,6 @@
 		['', 'foo[bar]baz'],
 
 		['\t', 'foo[]bar'],
-		[' ', 'foo[]bar'],
 		['&', 'foo[]bar'],
 		['\n', 'foo[]bar'],
 		['\r', 'foo[]bar'],
@@ -1805,10 +1804,43 @@
 		['\ud800', 'foo[]bar'],
 		['\x07', 'foo[]bar'],
 
+		[' ', 'foo[]bar'],
+		[' ', 'foo []bar'],
+		[' ', 'foo[] bar'],
+		[' ', 'foo &nbsp;[]bar'],
+		[' ', 'foo []&nbsp;bar'],
+		[' ', 'foo[] &nbsp;bar'],
+		[' ', 'foo&nbsp; []bar'],
+		[' ', 'foo&nbsp;[] bar'],
+		[' ', 'foo[]&nbsp; bar'],
+		[' ', 'foo&nbsp;&nbsp;[]bar'],
+		[' ', 'foo&nbsp;[]&nbsp;bar'],
+		[' ', 'foo[]&nbsp;&nbsp;bar'],
+		[' ', 'foo []&nbsp;        bar'],
+		[' ', 'foo  []bar'],
+		[' ', 'foo []&nbsp;&nbsp; &nbsp; bar'],
+
 		[' ', '[]foo'],
+		[' ', '{}foo'],
 		[' ', 'foo[]'],
+		[' ', 'foo{}'],
 		[' ', 'foo&nbsp;[]'],
+		[' ', 'foo&nbsp;{}'],
 		[' ', 'foo&nbsp;&nbsp;[]'],
+		[' ', 'foo&nbsp;&nbsp;{}'],
+		[' ', '<b>foo[]</b>bar'],
+		[' ', 'foo[]<b>bar</b>'],
+
+		[' ', '<pre>foo[]</pre>'],
+		[' ', '<pre>[]foo</pre>'],
+		[' ', '<pre>foo []bar</pre>'],
+		[' ', '<span style=white-space:pre>foo[]</span>'],
+		[' ', '<span style=white-space:pre>[]foo</span>'],
+		[' ', '<span style=white-space:pre>foo []bar</span>'],
+		[' ', '<span style=white-space:pre-wrap>foo[]</span>'],
+		[' ', '<span style=white-space:pre-wrap>[]foo</span>'],
+		[' ', '<span style=white-space:pre-wrap>foo []bar</span>'],
+
 		['   ', 'foo[]'],
 
 		'foo[]bar',