Basic combining mark support for forwardDelete
author Aryeh Gregor <AryehGregor+gitcommit@gmail.com>
Tue, 12 Jul 2011 15:04:20 -0600
changeset 401 f00446ef68a6
parent 400 b3d61e8688c6
child 402 02076f23006c
Basic combining mark support for forwardDelete

This is good enough for now, but a proper solution is needed in the long
term.
editcommands.html
implementation.js
source.html
tests.js
--- a/editcommands.html	Tue Jul 12 14:26:22 2011 -0600
+++ b/editcommands.html	Tue Jul 12 15:04:20 2011 -0600
@@ -5830,15 +5830,30 @@
   </ol>
 
   <li>If <var title="">node</var> is a <code class=external data-anolis-spec=domcore><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#text>Text</a></code> node and <var title="">offset</var> is not
-  <var title="">node</var>'s <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-node-length title=concept-node-length>length</a>, call <code class=external data-anolis-spec=domrange title=dom-Selection-collapse><a href=http://html5.org/specs/dom-range.html#dom-selection-collapse>collapse(<var title="">node</var>,
-  <var title="">offset</var>)</a></code> on the <code class=external data-anolis-spec=domrange><a href=http://html5.org/specs/dom-range.html#selection>Selection</a></code>.  Then <a href=#delete-the-contents>delete the
-  contents</a> of the <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range title=concept-range>range</a> with <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range-start title=concept-range-start>start</a> (<var title="">node</var>,
-  <var title="">offset</var>) and <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range-end title=concept-range-end>end</a> (<var title="">node</var>, <var title="">offset</var> + 1)
-  and abort these steps.
-
-  <p class=XXX>This is wrong for combining diacritics.  It can result in a
-  diacritic on the next letter being added to the previous one when the letter
-  is deleted.  Worse, it places the cursor between a letter and its diacritic.
+  <var title="">node</var>'s <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-node-length title=concept-node-length>length</a>:
+
+  <ol>
+    <li>Call <code class=external data-anolis-spec=domrange title=dom-Selection-collapse><a href=http://html5.org/specs/dom-range.html#dom-selection-collapse>collapse(<var title="">node</var>, <var title="">offset</var>)</a></code> on the
+    <code class=external data-anolis-spec=domrange><a href=http://html5.org/specs/dom-range.html#selection>Selection</a></code>.
+
+    <li>Let <var title="">end offset</var> be <var title="">offset</var> plus one.
+
+    <li>While <var title="">end offset</var> is not <var title="">node</var>'s <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-node-length title=concept-node-length>length</a> and the
+    <var title="">end offset</var>th <a href=http://es5.github.com/#x8.4>element</a> of <var title="">node</var>'s <code class=external data-anolis-spec=domcore title=dom-CharacterData-data><a href=http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-characterdata-data>data</a></code> has
+    general category M when interpreted as a Unicode code point, add one to
+    <var title="">end offset</var>.
+    <!-- TODO: This is probably not right.  We probably want to normalize to
+    grapheme cluster boundaries, using UAX#29 or something.  We also need to
+    handle non-BMP stuff.  The idea is that if the cursor is before a character
+    that precedes a combining mark, you need to delete the combining mark too.
+    -->
+
+    <li><a href=#delete-the-contents>Delete the contents</a> of the <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range title=concept-range>range</a> with <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range-start title=concept-range-start>start</a>
+    (<var title="">node</var>, <var title="">offset</var>) and <a class=external data-anolis-spec=domrange href=http://html5.org/specs/dom-range.html#concept-range-end title=concept-range-end>end</a> (<var title="">node</var>,
+    <var title="">end offset</var>).
+
+    <li>Abort these steps.
+  </ol>
 
   <li>If <var title="">node</var> is an <a href=#inline-node>inline node</a>, abort these steps.
 
--- a/implementation.js	Tue Jul 12 14:26:22 2011 -0600
+++ b/implementation.js	Tue Jul 12 15:04:20 2011 -0600
@@ -5993,15 +5993,32 @@
 			}
 		}
 
-		// "If node is a Text node and offset is not node's length, call
-		// collapse(node, offset) on the Selection. Then delete the contents of
-		// the range with start (node, offset) and end (node, offset + 1) and
-		// abort these steps."
+		// "If node is a Text node and offset is not node's length:"
 		if (node.nodeType == Node.TEXT_NODE
 		&& offset != getNodeLength(node)) {
+			// "Call collapse(node, offset) on the Selection."
 			getActiveRange().setStart(node, offset);
 			getActiveRange().setEnd(node, offset);
-			deleteContents(node, offset, node, offset + 1);
+
+			// "Let end offset be offset plus one."
+			var endOffset = offset + 1;
+
+			// "While end offset is not node's length and the end offsetth
+			// element of node's data has general category M when interpreted
+			// as a Unicode code point, add one to end offset."
+			//
+			// TODO: Only U+0300..U+036F is handled here.  Other general
+			// category M code points and non-BMP marks are not.
+			while (endOffset != node.length
+			&& /^[\u0300-\u036f]$/.test(node.data[endOffset])) {
+				endOffset++;
+			}
+
+			// "Delete the contents of the range with start (node, offset) and
+			// end (node, end offset)."
+			deleteContents(node, offset, node, endOffset);
+
+			// "Abort these steps."
 			return;
 		}
 
--- a/source.html	Tue Jul 12 14:26:22 2011 -0600
+++ b/source.html	Tue Jul 12 15:04:20 2011 -0600
@@ -5825,15 +5825,30 @@
   </ol>
 
   <li>If <var>node</var> is a [[text]] node and <var>offset</var> is not
-  <var>node</var>'s [[nodelength]], call [[selcollapse|<var>node</var>,
-  <var>offset</var>]] on the [[selection]].  Then <span>delete the
-  contents</span> of the [[range]] with [[rangestart]] (<var>node</var>,
-  <var>offset</var>) and [[rangeend]] (<var>node</var>, <var>offset</var> + 1)
-  and abort these steps.
-
-  <p class=XXX>This is wrong for combining diacritics.  It can result in a
-  diacritic on the next letter being added to the previous one when the letter
-  is deleted.  Worse, it places the cursor between a letter and its diacritic.
+  <var>node</var>'s [[nodelength]]:
+
+  <ol>
+    <li>Call [[selcollapse|<var>node</var>, <var>offset</var>]] on the
+    [[selection]].
+
+    <li>Let <var>end offset</var> be <var>offset</var> plus one.
+
+    <li>While <var>end offset</var> is not <var>node</var>'s [[length]] and the
+    <var>end offset</var>th [[strel]] of <var>node</var>'s [[cddata]] has
+    general category M when interpreted as a Unicode code point, add one to
+    <var>end offset</var>.
+    <!-- TODO: This is probably not right.  We probably want to normalize to
+    grapheme cluster boundaries, using UAX#29 or something.  We also need to
+    handle non-BMP stuff.  The idea is that if the cursor is before a character
+    that precedes a combining mark, you need to delete the combining mark too.
+    -->
+
+    <li><span>Delete the contents</span> of the [[range]] with [[rangestart]]
+    (<var>node</var>, <var>offset</var>) and [[rangeend]] (<var>node</var>,
+    <var>end offset</var>).
+
+    <li>Abort these steps.
+  </ol>
 
   <li>If <var>node</var> is an <span>inline node</span>, abort these steps.
 
--- a/tests.js	Tue Jul 12 14:26:22 2011 -0600
+++ b/tests.js	Tue Jul 12 15:04:20 2011 -0600
@@ -1091,6 +1091,7 @@
 		'foo[]<script>bar</script>baz',
 		'fo[]&ouml;bar',
 		'fo[]o&#x308;bar',
+		'fo[]o&#x308;&#x327;bar',
 
 		'<p>foo[]</p><p>bar</p>',
 		'<p>foo[]</p>bar',
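
For illustration only, the end-offset loop from this changeset can be pulled out into a standalone function and tried against the new test case. This is a minimal sketch, not code from the repository: the names endOffsetBasic and endOffsetCategoryM are hypothetical, and the second variant assumes an engine with Unicode property escapes (ES2018), which the commit itself does not use. Both work per UTF-16 code unit, so neither addresses the non-BMP or grapheme-cluster concerns raised in the spec TODO.

```javascript
// Hypothetical helpers for illustration; neither exists in implementation.js.

// Mirrors the loop added to implementation.js: only U+0300..U+036F
// (the Combining Diacritical Marks block) counts as a combining mark.
function endOffsetBasic(data, offset) {
	var endOffset = offset + 1;
	while (endOffset != data.length
	&& /^[\u0300-\u036f]$/.test(data[endOffset])) {
		endOffset++;
	}
	return endOffset;
}

// Closer to the spec wording ("general category M"), assuming the engine
// supports Unicode property escapes. Still one UTF-16 code unit at a time,
// so marks outside the BMP are not matched.
function endOffsetCategoryM(data, offset) {
	var endOffset = offset + 1;
	while (endOffset != data.length
	&& /^\p{M}$/u.test(data[endOffset])) {
		endOffset++;
	}
	return endOffset;
}

// The new test case 'fo[]o&#x308;&#x327;bar': cursor at offset 2.
var data = "foo\u0308\u0327bar";
console.log(data.slice(0, 2) + data.slice(endOffsetBasic(data, 2))); // prints "fobar"
```

The two variants agree on the test strings in this changeset. They differ on marks outside U+0300..U+036F, such as U+20D7, which only the category-based check consumes.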