Input Method Editor API

Unofficial Draft 19 April 2013

Contributors:
Travis Leithead, Microsoft Corporation
Brendan Elliott, Microsoft Corporation
Jianfeng Lin, Microsoft Corporation

Abstract

The goal of this API proposal is to enable web developers to build IME-aware web applications that offer a better IME input experience. Examples include using IME composition information for auto-complete or search suggestions, avoiding user-interface collisions between the IME candidate window and a search suggestion list, and limiting the input mode of an input box to either katakana or hiragana.
Note

The current editor's draft on IME API indicates another goal: enabling the use of IME on non-editable elements such as <canvas>. As we raised in the discussion threads here, here and here, we don’t think using canvas to create an editor is the right way to go. Let’s discuss the issues you are trying to solve and find a better solution. Earlier versions of the editor's draft on IME API also indicated a goal of facilitating JavaScript-based IMEs, but some related functions have been removed from the current draft. We are not pursuing that goal but are not against it.

Status of This Document

This document is merely a public working draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organisation.

This document is currently an editor's draft.

1. Introduction

This section is non-normative.

This API proposal consists of:
  1. An InputMethodContext object that can be requested on an HTML element. It provides information about the current status of IME composition.
  2. CSS properties to define the positioning of the IME candidate window.
  3. An inputmode attribute to define the initial state of the IME.
See the Usage Scenarios section for how the API is intended to be used.

2. The InputMethodContext Object

2.1 The getInputContext() method

This method returns or creates (if none already exists) an InputMethodContext object on an HTML element, regardless of whether it's editable or not.
partial interface HTMLElement {
    InputMethodContext getInputContext ();
};

2.1.1 Methods

getInputContext
Returns an InputMethodContext interface associated with this element, regardless of whether it's editable or not.
No parameters.
Return type: InputMethodContext
Note
The same API exists in current editor's draft on IME API.
Note
The purpose of allowing this method on non-editable elements is not to enable editing on non-editable elements like <canvas>, but to allow developers to set up the objects and event listeners on an element that may switch between editable and non-editable, for example an email message body that switches between reading and replying modes.
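
As a non-normative illustration of the note above, the following TypeScript sketch obtains the context from an element without checking whether the element is currently editable. The `ImeContext` and `ImeCapableElement` types and the `tryGetInputContext` helper are hypothetical names introduced here; only getInputContext() comes from this proposal, and the feature check assumes that some user agents will not implement it.

```typescript
// Hypothetical structural types; only getInputContext() is taken from this proposal.
interface ImeContext {
  readonly target: unknown;
}

interface ImeCapableElement {
  // Optional, because a user agent may not implement the proposal.
  getInputContext?: () => ImeContext;
}

// Returns the element's InputMethodContext, or null where the method is missing.
// The element's editable state is deliberately not checked: the proposal allows
// the call on non-editable elements so that setup can happen once.
function tryGetInputContext(element: ImeCapableElement): ImeContext | null {
  if (typeof element.getInputContext !== "function") {
    return null;
  }
  return element.getInputContext();
}
```

A caller could run this once when the message body is created and keep using the returned object after the body switches from reading mode to replying mode.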

2.2 The InputMethodContext Interface

Note

For the composition dictionary in the current proposal, we can see that exposing IME clauses as child nodes of a text node, and making them real DOM nodes with styles, is useful for a JS-based IME, because the IME needs to tell the web application how to render the composition. But if JS IME is not a goal, are there any other scenarios that would benefit from this? If not, how about a simpler design that exposes the text being composed as a DOMString?

For the caret range, exposing the caret ranges of IME clauses is helpful if the goal is to enable JS-based IMEs, but is there any other use? We understand that web applications want to know the whole string of the tentative composition, but we are not sure in which cases they want to know how that string is divided into several parts. Another issue is that the range type only gives the start and end offsets of the composition relative to its immediate parent. A web application usually wants the offset from the beginning of the text field so that it can combine a composition alternate with the text before it to create a full text string. But for a contentEditable element the beginning of the text field can be further up the parent tree, so JavaScript code has to walk up the tree to compute the right offset.

So instead of a dictionary type for composition, we suggest compositionText, compositionStartOffset and compositionEndOffset as a simpler design. Please let us know if you have scenarios that require the other approach.

interface InputMethodContext : EventTarget {
    boolean            hasComposition ();
    readonly    attribute DOMString     compositionText;
    readonly    attribute unsigned long compositionStartOffset;
    readonly    attribute unsigned long compositionEndOffset;
    sequence<DOMString> getCompositionAlternatives ();
    readonly    attribute DOMString     locale;
    readonly    attribute HTMLElement   target;
    boolean            isCandidateWindowVisible ();
    ClientRect         getCandidateWindowClientRect ();
                attribute EventHandler  oncandidatewindowshow;
                attribute EventHandler  oncandidatewindowupdate;
                attribute EventHandler  oncandidatewindowhide;
};

2.2.1 Attributes

compositionText of type DOMString, readonly
Represents the text being composed by an IME.
compositionStartOffset of type unsigned long, readonly

Represents the starting offset of the composition relative to the target if a composition is occurring, or 0 if there is no composition in progress. For a composition occurring on an <input> or <textarea> element, compositionStartOffset is the starting offset of the composition within the target's value string. For a composition occurring on an element with the contentEditable flag set, it is the starting offset relative to the target's textContent property (textContent is a linear view of all the text under an element).

When assigned, updates the composition information of the hosting user agent.

compositionEndOffset of type unsigned long, readonly
Represents the ending offset of the composition, in a similar way to compositionStartOffset.
locale of type DOMString, readonly
Represents the locale of the current input method as a BCP-47 tag (e.g. "en-US"). The locale MAY be the empty string when inapplicable or unknown.
target of type HTMLElement, readonly
Represents the element associated with the InputMethodContext object.
Note
This does not exist in current editor's draft on IME API. We recommend it as an easy way to get back to the associated element.
oncandidatewindowshow of type EventHandler

This event should be fired immediately after the IME candidate window is set to appear, that is, immediately after the position information for the candidate window has been identified but before any related animation has started (if the user agent opens the candidate window with an animation). The event is fired synchronously, so the browser's message pump does not have a chance to run between the time the candidate window is set to appear and the time the event handler code runs. This allows web developers' positioning code to adjust the position of other UI around the candidate window before the candidate window occludes that content.

Common things among oncandidatewindowshow/update/hide events:

  1. To get a better performance, these events are of the generic "Event" interface and fired directly to the InputMethodContext object. They do not bubble or capture through the DOM tree.
  2. Event handlers for these events can use the "this" value of the handler, or event.target / event.currentTarget, to refer to the InputMethodContext object upon which the event is firing.
  3. These events are not cancellable, meaning that the appearance of the candidate window cannot be controlled by cancelling them. Web applications should use the "ime-mode" CSS property to control that.
  4. Web applications need only register for these events once per element (the handlers will remain valid for as long as the element is alive).

oncandidatewindowupdate of type EventHandler
This event should be fired synchronously after either:
  1. The IME candidate window has been identified as needing to change size (but before animating to the new position) as a result of displaying new/changed alternatives or predictions.
  2. The IME candidate window has been identified as needing to change position (but before animating to the new position) due to user-zoom, browser frame resize, or other action that changes the candidate window placement.
In either case, the event should be fired after the new size/location of the candidate window is known by the user agent.
oncandidatewindowhide of type EventHandler
This event should be fired synchronously after the candidate window is fully hidden (after the dismissal animation has ended, if there is any). The event handler code will see that the isCandidateWindowVisible() method returns false, and no ClientRect can be obtained inside this handler.
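
The three event handler attributes above could be wired together roughly as follows. This is a minimal sketch, not a normative pattern: `RectLike`, `CandidateWindowSource` and `watchCandidateWindow` are hypothetical names, and the handler type is simplified to a function without an event argument.

```typescript
// Hypothetical minimal shapes mirroring part of the InputMethodContext interface.
interface RectLike {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

interface CandidateWindowSource {
  oncandidatewindowshow: (() => void) | null;
  oncandidatewindowupdate: (() => void) | null;
  oncandidatewindowhide: (() => void) | null;
  getCandidateWindowClientRect(): RectLike;
}

// Calls onChange with the candidate window rectangle on show and update,
// and with null once the candidate window has been hidden.
function watchCandidateWindow(
  context: CandidateWindowSource,
  onChange: (rect: RectLike | null) => void
): void {
  const report = (): void => onChange(context.getCandidateWindowClientRect());
  context.oncandidatewindowshow = report;
  context.oncandidatewindowupdate = report;
  // No ClientRect can be obtained inside the hide handler, so report null instead.
  context.oncandidatewindowhide = (): void => onChange(null);
}
```

Because the events are fired directly at the InputMethodContext object and do not bubble, the handlers are assigned on the context rather than on the element.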

2.2.2 Methods

hasComposition

The hasComposition method is used to determine if there is a composition in-progress on the element associated with this context. The method returns true if a composition is in progress, false otherwise.

Note: candidate windows (such as for predictive candidate suggestions on an element) may be active/visible even if hasComposition() returns false.

Web applications should check this method prior to requesting alternates (via getCompositionAlternatives).

No parameters.
Return type: boolean
getCompositionAlternatives
Returns a copy (per call) of the current list of alternate candidate strings from the InputMethodContext object. The InputMethodContext object can produce alternates while it has a composition in progress. As soon as hasComposition() returns false, getCompositionAlternatives() will always return an empty list. The list of alternatives is not updated "live"; it is only updated between compositionupdate events.
No parameters.
Return type: sequence<DOMString>
isCandidateWindowVisible

This method reports whether the IME's candidate window UI is visible (for all kinds of candidates). It returns true just after the user agent starts to show the candidate window, and returns false after the window is fully hidden.

Note: the state of the candidate window visibility can change multiple times during a composition, and can even show after the composition completes (to display other types of candidates like predictions).

No parameters.
Return type: boolean
getCandidateWindowClientRect

Returns the bounding rectangle of the IME candidate window as a ClientRect. A web application MAY use this information to explicitly control the position of its own input-related UI elements, such as search suggestions. One example is to dock the suggestion list below the IME candidate window and give them the same width.

Note: client coordinates are in document pixels and have origin at the upper-left corner of the client area.

Note
We propose this as a replacement for setExclusionRectangle in the current editor's draft on IME API. Although setExclusionRectangle can ensure that the IME candidate window doesn't overlap specific UI that the web application doesn't want occluded (e.g. a search suggestion list), it doesn't seem able to ensure that the two UIs are laid out in a desirable way. For example, if the web application wants to render a search suggestion list that docks below the IME candidate window and aligns with it without a gap, it can't do so with setExclusionRectangle because it doesn't know where the candidate window is, how far it extends below the text field, or whether it is horizontally shifted to follow the caret position. Therefore we are proposing getCandidateWindowClientRect, together with the group of CSS properties below, to give more flexibility for UI design. Please refer to the Usage Scenarios section for examples.
No parameters.
Return type: ClientRect
Note
For enabled, setCaretRectangle() and open() in the current editor's draft on IME API, is there any usage scenario besides enabling IME for non-editable elements? Please refer to the first note for our opinion on this scenario.
Note
The confirmComposition() method in the current editor's draft on IME API appears to have been designed to enable JavaScript-based IMEs, based on the discussion here, but other functions for JavaScript-based IMEs have been removed from the draft. Are there any other scenarios where this function is still useful?
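
The guidance under hasComposition above (check for a composition before requesting alternates) might look like this in script. The `CompositionSource` type and the `readAlternatives` helper are hypothetical names used only for this sketch.

```typescript
// Hypothetical minimal shape covering the two methods used below.
interface CompositionSource {
  hasComposition(): boolean;
  getCompositionAlternatives(): string[];
}

// Returns the current alternates, or an empty list when nothing is being composed.
function readAlternatives(context: CompositionSource): string[] {
  if (!context.hasComposition()) {
    return [];
  }
  // Each call returns a copy, so the caller may keep or modify the result.
  return context.getCompositionAlternatives();
}
```

Since the list is only updated between compositionupdate events, calling this helper from a compositionupdate handler is the natural place to pick up changes.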

3. IME CSS Properties

3.1 ime property (shorthand)

Name: ime
Value: [ <ime-mode> || <ime-align> || [ <ime-width> [ / <ime-offset> ]? ] ] | auto
Initial: auto
Applies to: text fields
Inherited: no
Percentages: N/A
Media: interactive
Computed value: Same as specified value.

This property is a shorthand to combine the other IME CSS properties into a single line.
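
To make the value grammar concrete, the following sketch assembles a shorthand value from its parts. It is one reading of the grammar above, not a normative serialisation; `ImeParts` and `buildImeShorthand` are hypothetical names, and length values are passed through as plain strings without validation.

```typescript
type ImeModeValue = "auto" | "normal" | "active" | "inactive" | "disabled" | "inherit";
type ImeAlignValue = "before" | "after" | "start" | "end" | "auto";

interface ImeParts {
  mode?: ImeModeValue;
  align?: ImeAlignValue;
  width?: string;  // a <length>, "box-width" or "auto"
  offset?: string; // a <length> or "auto"; the grammar only allows it after a width
}

// Builds a value for the ime shorthand from the given longhand values.
function buildImeShorthand(parts: ImeParts): string {
  if (parts.offset !== undefined && parts.width === undefined) {
    throw new Error("ime-offset can only follow ime-width in the shorthand");
  }
  const tokens: string[] = [];
  if (parts.mode !== undefined) tokens.push(parts.mode);
  if (parts.align !== undefined) tokens.push(parts.align);
  if (parts.width !== undefined) {
    tokens.push(parts.offset !== undefined ? `${parts.width} / ${parts.offset}` : parts.width);
  }
  // With no parts given, fall back to the initial value.
  return tokens.length > 0 ? tokens.join(" ") : "auto";
}
```

For example, `buildImeShorthand({ mode: "active", align: "after", width: "box-width", offset: "4px" })` produces `active after box-width / 4px`.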

3.2 ime-align property

Name: ime-align
Value: before | after | start | end | auto
Initial: auto
Applies to: text fields
Inherited: no
Percentages: N/A
Media: interactive
Computed value: Same as specified value.

This property aligns the candidate window box relative to the element on which the IME composition is active.

before
The window is at the side that comes earlier in the block progression, for example in horizontal left-to-right writing mode it’s above the element.
after
The window is at the side that comes later in the block progression, for example in horizontal left-to-right writing mode it’s below the element.
start
The window is at the side from which text of its inline base direction will start, for example in horizontal left-to-right writing mode it’s to the left of the element.
end
The window is at the side from which text of its inline base direction will end, for example in horizontal left-to-right writing mode it’s to the right of the element.
auto
The user agent may align the candidate window in any manner (including alignment with the caret). The behavior of auto may change depending on the particular IME in use.

IME candidate lists should always be placed on screen with sufficient size to allow basic text input, so the IME may enforce a reasonable minimum size. If there isn’t sufficient screen space to align the IME candidate list on the specified side, it should fall back to the opposite side. For example, if there isn’t sufficient space before the element, it should try the after position, and finally, if there isn’t sufficient space for that either, it can fall back to its default auto behavior.
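
The fallback order described above can be sketched as a small function. This is one possible reading of the rule, written from the user agent's point of view; `resolvePlacement`, the `availableSpace` input and the single `requiredSpace` threshold are assumptions made for illustration.

```typescript
type Side = "before" | "after" | "start" | "end";
type Placement = Side | "auto";

const oppositeSide: Record<Side, Side> = {
  before: "after",
  after: "before",
  start: "end",
  end: "start",
};

// availableSpace: free screen space next to the element on each side.
// requiredSpace: the minimum size the IME needs for its candidate list.
function resolvePlacement(
  requested: Placement,
  availableSpace: Record<Side, number>,
  requiredSpace: number
): Placement {
  if (requested === "auto") return "auto";
  // First choice: the side the author asked for.
  if (availableSpace[requested] >= requiredSpace) return requested;
  // Second choice: the opposite side.
  const fallback = oppositeSide[requested];
  if (availableSpace[fallback] >= requiredSpace) return fallback;
  // Last resort: the user agent's default behavior.
  return "auto";
}
```

With 50 units of space before the element, 300 after it and a requirement of 120, a request for `before` resolves to `after`.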

3.3 ime-mode property

Note
This property is the same as the one in current W3C proposal. We put it here as part of the group of IME CSS properties we are proposing.
Name: ime-mode
Value: auto | normal | active | inactive | disabled | inherit
Initial: auto
Applies to: text fields
Inherited: no
Percentages: N/A
Media: interactive
Computed value: Same as specified value.

This property controls the state of the input method editor for text fields.

auto
No change is made to the current input method editor state. This is the default.
normal
The IME state should be normal; this value can be used in a user style sheet to override the page setting.
active
The input method editor is initially active; text entry is performed using it unless the user specifically dismisses it.
inactive
The input method editor is initially inactive, but the user may activate it if they wish.
disabled
The input method editor is disabled and may not be activated by the user.
inherit
The ime-mode of this element is inherited from its parent.

3.4 ime-width property

Name: ime-width
Value: <length> | box-width | auto
Initial: auto
Applies to: text fields
Inherited: no
Percentages: N/A
Media: interactive
Computed value: Same as specified value.

This property specifies the width of the IME candidate window when it appears.

<length>
Specifies the width of the IME candidate window using a length unit.
box-width
IME candidate window uses the same width as the text field that it associates with.
auto
The width of the IME candidate window is set by the user agent.

3.5 ime-offset property

Name: ime-offset
Value: <length> | auto
Initial: auto
Applies to: text fields
Inherited: no
Percentages: N/A
Media: interactive
Computed value: Same as specified value.

This property specifies the offset between the IME candidate window and the element that the IME is associated with.

<length>
Specifies the offset between the IME candidate window and the element using a length unit.
auto
The offset between the IME candidate window and the element is set by the user agent.

4. Inputmode Attribute

Note
The proposal for this attribute is based on the current W3C proposal. We added several more modes for East Asian languages, especially the full-width and half-width options that Japanese users would like to see preset in text boxes. For the tel, email and url modes in the existing proposal, is there any reason why developers would want to use them rather than <input type="tel|email|url">?
enum InputMode {
    "default",
    "verbatim",
    "latin",
    "latin-name",
    "latin-prose",
    "latin-full-width",
    "hiragana",
    "katakana-full-width",
    "katakana-half-width",
    "hanja",
    "hangul-full-width",
    "hangul-half-width",
    "chinese-full-width",
    "chinese-half-width"
};

partial interface Element { attribute InputMode inputmode; };

4.1 Attributes

inputmode of type InputMode
Enumeration description
default
The user agent decides which mode to use. This is also the missing value default.
verbatim
Alphanumeric Latin-script input of non-prose content, e.g. usernames, passwords, product codes.
latin
Latin-script input in the user's preferred language(s), with some typing aids enabled (e.g. text prediction). Intended for human-to-computer communications, e.g. free-form text search field.
latin-name
Latin-script input in the user's preferred language(s), with typing aids intended for entering human names enabled (e.g. text prediction from the user's contact list and automatic capitalisation at every word). Intended for situations such as customer name fields.
latin-prose
Latin-script input in the user's preferred language(s), with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for situations such as e-mails and instant messaging.
latin-full-width
Latin-script input in the user's secondary language(s), using full-width characters, with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for latin text embedded inside CJK text.
hiragana
Hiragana input for Japanese.
katakana-full-width
Katakana input using full-width characters for Japanese.
katakana-half-width
Katakana input using half-width characters for Japanese.
hanja
Hanja input for Korean.
hangul-full-width
Hangul input using full-width characters for Korean.
hangul-half-width
Hangul input using half-width characters for Korean.
chinese-full-width
Chinese input using full-width characters.
chinese-half-width
Chinese input using half-width characters.

5. Usage Scenarios

5.1 Moving search suggestions to accommodate IME candidate window placement

  1. Set the ime-align CSS property to after for the desired search box.
  2. Add a compositionstart event handler to the desired search box.
  3. In the compositionstart event handler, get the InputMethodContext object from the event's target element by calling getInputContext(), and register for the candidatewindowshow, candidatewindowupdate, and candidatewindowhide events.
  4. In the candidatewindowshow handler, get the candidate window's ClientRect by calling getCandidateWindowClientRect(), and align the search suggestions box with the ClientRect's bottom/left.
  5. Use the same handler for the candidatewindowupdate event.
  6. In the candidatewindowhide event, restore the search suggestions box to the bottom edge of the search box.
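
The positioning part of the steps above reduces to a small calculation, sketched here under the assumption that both rectangles are in client coordinates. `BoxRect`, `SuggestionPosition` and `positionSuggestions` are hypothetical names; the event wiring itself is left out.

```typescript
interface BoxRect {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

interface SuggestionPosition {
  left: number;
  top: number;
}

// candidateWindow is the rectangle from getCandidateWindowClientRect() while the
// candidate window is shown, or null after the candidatewindowhide event.
function positionSuggestions(
  searchBox: BoxRect,
  candidateWindow: BoxRect | null
): SuggestionPosition {
  // Dock below the candidate window when it is visible,
  // otherwise restore the list to the bottom edge of the search box.
  const anchor = candidateWindow !== null ? candidateWindow : searchBox;
  return { left: anchor.left, top: anchor.bottom };
}
```

The same function can serve the candidatewindowshow, candidatewindowupdate and candidatewindowhide handlers, with the caller applying the returned coordinates to the suggestion list.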

5.2 Matching the layout style of search suggestion list and IME candidate window

  1. Do the steps in the first scenario to avoid visual collision.
  2. Set ime-align style to move the IME candidate window to the same side as the search suggestion list.
  3. Set ime-width and ime-offset to match the sizes and layout distances of the search suggestion and the text input element.

5.3 Sending IME candidates to a web service to improve search suggestions

  1. Add a compositionupdate event handler to the desired search box.
  2. In the compositionupdate event handler, get the InputMethodContext object, and request the current alternatives through its getCompositionAlternatives() method.
  3. Get the current range of the composition within the editing context using the InputMethodContext object's compositionStartOffset and compositionEndOffset attributes.
  4. Prepend the text before the current composition range to each of the current candidates.
    1. If this is an input field or textarea, then get a substring of the value of the element from 0 to the start of the composition range.
    2. If this is a contenteditable element, then get a substring of the target's textContent from 0 to the start of the composition range.
  5. Append the text after the current composition range to each of the candidates.
    1. If this is an input field or textarea, then get a substring of the value of the element from the end of the composition range to the end of the value.
    2. If this is a contenteditable element, then get a substring of the target's textContent from the end of the composition range to the end of the text.
  6. XHR the current candidates to a web server/service and refresh the search suggestions with the results.
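
The string assembly in this scenario can be sketched as follows, assuming the composition range is given as start and end offsets into the field text (the element's value or the target's textContent), such as the compositionStartOffset and compositionEndOffset attributes of section 2.2. `expandCandidates` is a hypothetical helper, and the request to the web service is omitted.

```typescript
// Replaces the composition range in fieldText with each candidate in turn,
// producing one full-text string per candidate.
function expandCandidates(
  fieldText: string,
  startOffset: number,
  endOffset: number,
  alternatives: string[]
): string[] {
  const textBefore = fieldText.substring(0, startOffset);
  const textAfter = fieldText.substring(endOffset);
  return alternatives.map((candidate) => textBefore + candidate + textAfter);
}
```

For a field containing "I live in とうきょう now" with a composition from offset 10 to 15, the candidates "東京" and "トウキョウ" expand to "I live in 東京 now" and "I live in トウキョウ now".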

5.4 Presetting the input mode

  1. Set the inputmode attribute of the text input element to be the mode that you want the user to type in. For example some Japanese address input fields require the input to be in half-width katakana. By setting inputmode="katakana-half-width", the web application can tell the Japanese IME to switch to half-width katakana mode when the focus is moved to this input element.
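
In script, presetting the mode could look like the sketch below. The list of modes is copied from the InputMode enumeration in section 4; `AttributeHost` and `presetInputMode` are hypothetical names, and setting the content attribute via setAttribute() is an assumption about how the attribute would be reflected.

```typescript
// Values of the InputMode enumeration from section 4.
const INPUT_MODES = [
  "default", "verbatim", "latin", "latin-name", "latin-prose", "latin-full-width",
  "hiragana", "katakana-full-width", "katakana-half-width", "hanja",
  "hangul-full-width", "hangul-half-width", "chinese-full-width", "chinese-half-width",
] as const;

type InputMode = (typeof INPUT_MODES)[number];

// Hypothetical minimal shape of an element.
interface AttributeHost {
  setAttribute(attributeName: string, value: string): void;
}

function isInputMode(value: string): value is InputMode {
  return (INPUT_MODES as readonly string[]).indexOf(value) !== -1;
}

// Sets the inputmode attribute if the value is one of the proposed modes.
// Returns false, leaving the field untouched, for any other value.
function presetInputMode(field: AttributeHost, mode: string): boolean {
  if (!isInputMode(mode)) {
    return false;
  }
  field.setAttribute("inputmode", mode);
  return true;
}
```

For the address field example, the call would be `presetInputMode(addressField, "katakana-half-width")`, where `addressField` stands for the text input element.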

A. Acknowledgements

Many thanks to lots of people for their proposals and recommendations, some of which are incorporated into this document.

[WEBIDL] [CSS3WRITINGMODES] [DOM-LEVEL-3-EVENTS] [RFC2119]

B. References

B.1 Normative references

[CSS3WRITINGMODES]
Elika J. Etemad; Koji Ishii; Shinyu Murakami. CSS Writing Modes Module Level 3. 17 October 2010. W3C Editor's Draft. URL: http://dev.w3.org/csswg/css3-writing-modes
[DOM-LEVEL-3-EVENTS]
Travis Leithead; Jacob Rossi; Doug Schepers; Björn Höhrmann; Philippe Le Hégaret; Tom Pixley. Document Object Model (DOM) Level 3 Events Specification. 06 September 2012. W3C Working Draft. URL: http://www.w3.org/TR/2012/WD-DOM-Level-3-Events-20120906
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[WEBIDL]
Cameron McCormack. Web IDL. 19 April 2012. W3C Candidate Recommendation. URL: http://www.w3.org/TR/2012/CR-WebIDL-20120419/