W3C

CSS Text Level 3

Editor's Draft 18 October 2010

This version:
$Date: 2010/10/07 06:46:06 $ (CVS $Revision$)
Latest version:
http://www.w3.org/TR/css3-text/
Previous version:
http://www.w3.org/TR/2010/WD-css3-text-20101005/
Editors:
Elika J. Etemad (Invited Expert)
Koji Ishii (Antenna House)
Shinyu Murakami (Antenna House)

Abstract

This CSS3 module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, text decoration and text transformation.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This CSS module has been produced as a combined effort of the W3C Internationalization Activity and the Style Activity, and is maintained by the CSS Working Group. It also includes contributions made by participants in the XSL Working Group (members only).

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This Text module and a separate (upcoming) Writing Modes module replace and obsolete the May 2003 CSS3 Text Module Candidate Recommendation. Since this is a thorough overhaul of the previous CR, a list of changes has been provided.

Feedback on this draft should be posted to the (archived) public mailing list www-style@w3.org (see instructions) with [css3-text] in the subject line. You are strongly encouraged to complain if you see something stupid in this draft. The editors will do their best to respond to all feedback.

If you have implemented properties from the May 2003 CSS3 Text CR please let us know so we can take that into account as we redraft the spec. You can post to www-style (public), post to the CSS WG mailing list (Member-restricted), or email fantasai directly (personal).

The following features are at risk and may be cut from the spec during its CR period: the ‘text-outline’ property, the ‘unrestricted’ value of ‘text-wrap’, and the ‘hanging-punctuation’ and ‘punctuation-trim’ properties.

1. Introduction

[document here]

2. Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification. All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformance to CSS Text Level 3 is defined for three classes:

style sheet
A CSS style sheet.
renderer
A UA that interprets the semantics of a style sheet and renders documents that use them.
authoring tool
A UA that writes a style sheet.

A style sheet is conformant to CSS Text Level 3 if all of its declarations that use properties defined in this module have values that are valid according to the generic CSS grammar and the individual grammars of each property as given in this module.

A renderer is conformant to CSS Text Level 3 if, in addition to interpreting the style sheet as defined by the appropriate specifications, it supports all the properties defined by CSS Text Level 3 by parsing them correctly and rendering the document accordingly. However, the inability of a UA to correctly render a document due to limitations of the device does not make the UA non-conformant. (For example, a UA is not required to render color on a monochrome monitor.)

An authoring tool is conformant to CSS Text Level 3 if it writes syntactically correct style sheets, according to the generic CSS grammar and the individual grammars of each property in this module.

2.1. Partial and Experimental Implementations

UAs must treat as invalid any properties or values they do not support. Experimental implementations of a feature should support only a vendor-prefixed syntax for the property/value.
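
The rule above is what lets authors write fallbacks. The following sketch assumes a UA that supports ‘left’ but not ‘start’ for ‘text-align’. The ‘-vendor-’ prefix and the class name are placeholders for illustration, not names used by any particular implementation.

```css
/* A UA that does not support 'start' treats the second declaration
   as invalid and keeps 'left'; a UA that supports it uses 'start'. */
h1 { text-align: left; text-align: start; }

/* Hypothetical experimental implementation: the feature is exposed
   only under a vendor-prefixed name ('-vendor-' is a placeholder). */
p.caption { -vendor-text-wrap: suppress; }
```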

3. Transforming Text

3.1. Transforming Text: the ‘text-transform’ property

Name: text-transform
Value: none | capitalize | uppercase | lowercase | fullwidth | large-kana
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property transforms text for styling purposes. Values have the following meanings:

none
No effects.
capitalize
Puts all words in titlecase.
uppercase
Puts all characters in uppercase.
lowercase
Puts all characters in lowercase.
fullwidth
Puts all characters in fullwidth form. If a character does not have a corresponding fullwidth form, it is left as is. This value is typically used to typeset Latin characters and digits like ideographic characters.

The UA should try font fallback if the glyph is missing. Should it also check for fullwidth variants before falling back to the next font?

large-kana
Converts all small Kana characters to normal Kana. This value is typically used for ruby annotation text, where all small Kana should be drawn as large Kana.

Need to work on interoperability with OpenType and CSS3 Fonts ‘font-variant: ruby’.

Although limited, the case mapping process has some language dependencies. Some well known examples are Turkish and Greek. If the content language is known then any such language-specific rules must be used.

The case mapping rules for the character repertoire specified by the Unicode Standard can be found on the Unicode Consortium Web site. [UNICODE] Only characters belonging to bicameral scripts are affected.

The definition of fullwidth and halfwidth forms can be found on the Unicode Consortium Web site at [UAX11].

The following example converts the ASCII characters in abbreviations in Japanese to their fullwidth variants so that they lay out and line break like ideographs:

abbr:lang(ja) { text-transform: fullwidth; }
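
A similar sketch for ‘large-kana’ is shown below. It assumes that ruby annotation text is marked up with an rt element, as in HTML ruby markup; the h1 rule is likewise only illustrative.

```css
/* Draw small Kana in ruby annotations as normal-size Kana. */
rt:lang(ja) { text-transform: large-kana; }

/* Case mapping follows language-specific rules when the content
   language is known, e.g. for Turkish dotted and dotless i. */
h1:lang(tr) { text-transform: uppercase; }
```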

4. White Space Processing

The source text of a document often contains formatting that is not relevant to the final rendering: for example, breaking the source into segments (lines) for ease of editing or adding white space characters such as tabs and spaces to indent the source code. CSS white space processing allows the author to control interpretation of such formatting: to preserve or collapse it away when rendering the document.

Segments in the document source can be separated by a carriage return (U+000D), a linefeed (U+000A) or both (U+000D U+000A), or by some other mechanism that identifies the beginning and end of document segments, such as the SGML RECORD-START and RECORD-END tokens. If no segmentation rules are specified for the document language, each line feed (U+000A), carriage return (U+000D) and CRLF sequence (U+000D U+000A) in the text is considered a segment break. (This default rule also applies to generated content.) In CSS, each such segment break is treated as a single line feed character (U+000A).

White space processing in CSS interprets white space characters for rendering: it has no effect on the underlying document data. In the context of CSS, the document white space set is defined to be any space characters (Unicode value U+0020), tab characters (U+0009), and line feeds (U+000A).

Note that the document parser may have not only normalized segment breaks, but also collapsed other space characters or otherwise processed white space according to markup rules. Because CSS processing occurs after the parsing stage, it is not possible to restore these characters for styling. Therefore, some of the behavior specified below can be affected by these limitations and may be user agent dependent.

Control characters other than U+0009 (tab), U+000A (line feed), U+0020 (space), and U+202x (bidi formatting characters) are treated as characters to render in the same way as any normal character. Copied from CSS2.1 but this has got to be wrong.

4.1. White Space Collapsing: the ‘white-space-collapsing’ property

This section is still under discussion and may change in future drafts.

Name: white-space-collapsing
Value: collapse | discard | [ [preserve | preserve-breaks] && trim-inner ]
Initial: collapse
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

Rename to white-space-trim or white-space-adjust? white-space-collapsing has an ‘ing’ and is confusing with XSL

This property declares whether and how white space inside the element is collapsed. Values have the following meanings, which must be interpreted according to the white space processing rules:

collapse
This value directs user agents to collapse sequences of white space into a single character (or in some cases, no character).
preserve
This value prevents user agents from collapsing sequences of white space. Segment breaks are preserved as forced line breaks.
preserve-breaks
This value collapses white space as for ‘collapse’, but preserves segment breaks as forced line breaks.
discard
This value directs user agents to "discard" all white space in the element.
trim-inner
This value directs UAs to discard all white space at the beginning of a block up to and including the last line break before the first non-white-space character in the block, as well as to discard all white space at the end of a block starting with the first line break after the last non-white-space character in the block.
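
As a rough illustration of the values above, the following rules use only combinations permitted by the Value line as written in this draft. The element and class names are hypothetical.

```css
/* Drop source indentation and line breaks between the cells of a
   row, so that only the cell boxes themselves contribute to layout. */
tr { white-space-collapsing: discard; }

/* Keep spaces, tabs and segment breaks inside a code listing, but
   trim the blank run after the opening tag and before the closing
   tag of the block. */
pre.listing { white-space-collapsing: preserve trim-inner; }
```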

4.2. The White Space Processing Rules

For each inline (including anonymous inlines), white space characters are handled as follows, ignoring bidi formatting characters as if they were not there:

Then, the entire block is rendered. Inlines are laid out, taking bidi reordering into account, and wrapping as specified by the text-wrap property.

As each line is laid out,

  1. A sequence of collapsible spaces (U+0020) at the beginning of a line is removed.
  2. Each tab (U+0009) is rendered as a horizontal shift that lines up the start edge of the next glyph with the next tab stop. Tab stops occur at points that are multiples of 8 times the width of a space (U+0020) rendered in the block's font from the block's starting content edge.
  3. A sequence of collapsible spaces (U+0020) at the end of a line is removed.
  4. If spaces (U+0020) or tabs (U+0009) at the end of a line are non-collapsible but have ‘text-wrap’ set to ‘normal’ or ‘suppress’, the UA may visually collapse them.
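
The tab stop rule in step 2 can be written out as a small worked calculation. This is only a sketch: the symbols w, x and t are introduced here for illustration, and it assumes that a tab whose start already lies exactly on a tab stop advances to the following stop.

```latex
% w : width of a space (U+0020) in the block's font
% x : distance from the block's starting content edge to the
%     position where the tab begins
% t : position of the start edge of the glyph after the tab
\[
  t = 8w \left( \left\lfloor \frac{x}{8w} \right\rfloor + 1 \right)
\]
% Example: w = 4px gives tab stops at 32px, 64px, 96px, ...
% A tab beginning at x = 50px:
\[
  t = 32\,\mathrm{px} \cdot \left( \left\lfloor \tfrac{50}{32} \right\rfloor + 1 \right)
    = 32\,\mathrm{px} \cdot 2 = 64\,\mathrm{px},
  \qquad t - x = 14\,\mathrm{px}
\]
```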

4.2.1. Example of bidirectionality with white space collapsing

Consider the following markup fragment, taking special note of spaces (with varied backgrounds and borders for emphasis and identification):

<ltr>A <rtl> B </rtl> C</ltr>

where the <ltr> element represents a left-to-right embedding and the <rtl> element represents a right-to-left embedding. If the ‘white-space-collapsing’ property is set to ‘collapse’, the above processing model would result in the following:

This would leave two spaces, one after the A in the left-to-right embedding level, and one after the B in the right-to-left embedding level. This is then ordered according to the Unicode bidirectional algorithm, with the end result being:

A  BC

Note that there are two spaces between A and B, and none between B and C. This is best avoided by putting spaces outside the element instead of just inside the opening and closing tags and, where practical, by relying on implicit bidirectionality instead of explicit embedding levels.

4.2.2. Line Break Transformation Rules

When line feeds are collapsible, they are either transformed into a space (U+0020) or removed depending on the script context before and after the line break.

The script context is determined by the Unicode-given script value [UAX24] of the first character on that side of the line feed. However, characters such as punctuation that belong to the COMMON and INHERITED scripts are ignored in this check; the next character is examined instead. The UA must not examine characters outside the block and may limit its examination to as few as four characters on each side of the line feed. If the check fails to find an acceptable script value (i.e. it has hit the check limits), then the script context is neutral.

Note that the white space processing rules have already removed any tabs and spaces after the line feed before these checks take place.

Comments on how well this would work in practice would be very much appreciated, particularly from people who work with Thai and similar scripts.

4.2.3. Informative Summary of White Space Collapsing Effects

4.3. White Space and Text Wrapping Shorthand: the ‘white-space’ property

Name: white-space
Value: normal | pre | nowrap | pre-wrap | pre-line
Initial: not defined for shorthand properties
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: see individual properties

The ‘white-space’ property is a shorthand for the white-space-collapsing and text-wrap properties. Not all combinations are represented. Values have the following meanings:

normal
Sets ‘white-space-collapsing’ to ‘collapse’ and ‘text-wrap’ to ‘normal’.
pre
Sets ‘white-space-collapsing’ to ‘preserve’ and ‘text-wrap’ to ‘none’.
nowrap
Sets ‘white-space-collapsing’ to ‘collapse’ and ‘text-wrap’ to ‘none’.
pre-wrap
Sets ‘white-space-collapsing’ to ‘preserve’ and ‘text-wrap’ to ‘normal’.
pre-line
Sets ‘white-space-collapsing’ to ‘preserve-breaks’ and ‘text-wrap’ to ‘normal’.

The following informative table summarizes the behavior of various ‘white-space’ values:

            New Lines   Spaces and Tabs   Text Wrapping
normal      Collapse    Collapse          Wrap
pre         Preserve    Preserve          No wrap
nowrap      Collapse    Collapse          No wrap
pre-wrap    Preserve    Preserve          Wrap
pre-line    Preserve    Collapse          Wrap
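
To make the shorthand mapping concrete, each pair of rules below should be equivalent under the definitions in this section. The class names are hypothetical.

```css
/* Shorthand form ... */
.log   { white-space: pre-wrap; }
/* ... and the longhands it sets. */
.log   { white-space-collapsing: preserve; text-wrap: normal; }

.label { white-space: nowrap; }
.label { white-space-collapsing: collapse; text-wrap: none; }

.verse { white-space: pre-line; }
.verse { white-space-collapsing: preserve-breaks; text-wrap: normal; }
```

Combinations that have no shorthand keyword, such as ‘discard’ with any ‘text-wrap’ value, can only be written with the longhand properties.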

5. Line Breaking and Word Boundaries

For most scripts, in the absence of hyphenation a line break occurs only at word boundaries. Many writing systems use spaces or punctuation to explicitly separate words, and line break opportunities can be identified by these characters. Scripts such as Thai, Lao, and Khmer, however, do not use spaces or punctuation to separate words. Although the zero width space (U+200B) can be used as an explicit word delimiter in these scripts, this practice is not common. As a result, a lexical resource is needed to correctly identify break points in such texts.

In several other writing systems (including Chinese, Japanese, Yi, and sometimes also Korean), line break opportunities are based on syllable boundaries, not words. In these systems a line can break anywhere except between certain character combinations. Additionally, the level of strictness in these restrictions can vary with the typesetting style.

CSS does not fully define where line breaking opportunities occur; however, some controls are provided to distinguish common variations.

Floated and absolutely-positioned elements do not introduce a line breaking opportunity. The line breaking behavior of a replaced element is equivalent to that of a Latin character.

There is a question of what the default line breaking of Korean should be, and whether dictionary-based breaking is needed for typical layout (e.g. novels).

It is not clear whether this section handles Southeast Asian scripts well. Additionally, some guidance should be provided on how to break or not break Southeast Asian in the absence of a dictionary.

5.1. Line Breaking Restrictions for CJK Scripts: the ‘line-break’ property

This property specifies line break opportunities for CJK scripts.

CSS distinguishes between three levels of strictness in the rules for implicit line breaking in CJK text. The precise set of rules in effect for the strict and loose levels is up to the UA and should follow language conventions. However, this specification does recommend that:

Information on line breaking conventions can be found in [JIS4051] for Japanese, [ZHMARK] for Chinese, and [?] for Korean, and in [UAX14] for all scripts in Unicode.

Any guidance for appropriate references here would be much appreciated.

Name: line-break
Value: auto | newspaper | normal | strict | keep-all
Initial: auto
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property specifies what set of line breaking restrictions are in effect within the element. Values have the following meanings:

auto
The UA determines the set of line-breaking restrictions to use for CJK scripts, and it may vary the restrictions based on the length of the line; e.g., use a less restrictive set of line-break rules for short lines.
newspaper
Breaks CJK scripts using the least restrictive set of line-breaking rules. Typically used for short lines, such as in newspapers.
normal
Breaks CJK scripts using a normal set of line-breaking rules.
strict
Breaks CJK scripts using a more restrictive set of line-breaking rules than ‘normal’.
keep-all
Sequences of CJK characters can no longer break on implied break points. This option should only be used where the presence of word separator characters still creates line-breaking opportunities, as in Korean.
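
The values above might be used as in the following sketch. The selectors are hypothetical, and the exact rule sets behind each keyword remain up to the UA.

```css
/* Body text in Japanese: apply the more restrictive rule set. */
article:lang(ja) { line-break: strict; }

/* Narrow columns: allow the least restrictive rule set. */
.sidebar:lang(ja) { line-break: newspaper; }

/* Korean set with spaces between words: break only at the
   word separators, not between syllables. */
p:lang(ko) { line-break: keep-all; }
```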

5.2. Line Breaking Rules for non-CJK Scripts: the ‘word-break’ property

This property specifies line break opportunities for non-CJK scripts.

Name: word-break
Value: normal | break-all | hyphenate
Initial: normal
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

Values have the following meanings:

normal
Breaks non-CJK scripts according to their own rules.
break-all
Lines may break between any two grapheme clusters for non-CJK scripts. This option is used mostly in a context where the text is predominantly using CJK characters with few non-CJK excerpts and it is desired that the text be better distributed on each line.
hyphenate
Words may be broken at an appropriate hyphenation point. This requires that the user agent have a hyphenation resource appropriate to the language of the text being broken.

This value is proposed to replace the ‘hyphenate’ property currently defined in the Generated Content for Paged Media draft. Note there are other values in the editor's draft, which could be added as ‘word-break: hyphenate-all’ and ‘word-break: none’...

When shaping scripts such as Arabic are allowed to break within words due to ‘break-all’ or hyphenation, the characters must still be shaped as if the word were not broken.
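
A short sketch of how these values might be applied follows. The selectors are hypothetical, and ‘hyphenate’ only has an effect where the UA has a hyphenation resource for the language.

```css
/* Mostly-CJK text with short Latin excerpts: let the Latin runs
   break between any two grapheme clusters so lines fill evenly. */
blockquote:lang(ja) { word-break: break-all; }

/* English body text in a narrow column: allow hyphenation. */
.column:lang(en) { word-break: hyphenate; }
```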

6. Text Wrapping

Text wrapping is controlled by the ‘text-wrap’ and ‘word-wrap’ properties:

6.1. Text Wrap Settings: the ‘text-wrap’ property

Name: text-wrap
Value: normal | unrestricted | none | suppress
Initial: normal
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property specifies the mode for text wrapping. Possible values:

normal
Lines may break at allowed break points, as determined by the line-breaking rules in effect. Line breaking behavior defined for the WJ, ZW, and GL line-breaking classes in [UAX14] must be honored.
none
Lines may not break; text that does not fit within the block container overflows it.
unrestricted
Lines may break between any two grapheme clusters. Line-breaking restrictions have no effect and hyphenation does not take place. Character shaping is performed on each side of the break as if the break had not occurred. This value is for terminal-style dumb line breaking, but has no use cases.
suppress
Line breaking is suppressed within the element: the UA may only break within the element if there are no other valid break points in the line. If the text breaks, line-breaking restrictions are honored as for ‘normal’.

Regardless of the ‘text-wrap’ value, lines always break at forced breaks: for all values, line-breaking behavior defined for the BK, CR, LF, CM, NL, and SG line breaking classes in [UAX14] must be honored.

When text-wrap is set to ‘normal’ or ‘suppress’, UAs that allow breaks at punctuation other than spaces should prioritize breakpoints. For example, if breaks after slashes have a lower priority than spaces, the sequence "check /etc" will never break between the ‘/’ and the ‘e’. The UA may use the width of the containing block, the text's language, and other factors in assigning priorities. As long as care is taken to avoid such awkward breaks, allowing breaks at appropriate punctuation other than spaces is recommended, as it results in more even-looking margins, particularly in narrow measures.

6.1.1. Example of using ‘text-wrap: suppress’ in presenting a footer

The priority of breakpoints can be set to reflect the intended grouping of text.

Given the rules

footer { text-wrap: suppress; /* inherits to all descendants */ }
      

and the following markup:

<footer>
  <venue>27th Internationalization and Unicode Conference</venue>
  &#8226; <date>April 7, 2005</date> &#8226;
  <place>Berlin, Germany</place>
</footer>
      

In a narrow window the footer could be broken as

27th Internationalization and Unicode Conference •
April 7, 2005 • Berlin, Germany
      

or in a narrower window as

27th Internationalization and Unicode
Conference • April 7, 2005 •
Berlin, Germany
      

but not as

27th Internationalization and Unicode Conference • April
7, 2005 • Berlin, Germany
      

6.2. Emergency Wrapping: the ‘word-wrap’ property

Name: word-wrap
Value: normal | break-word
Initial: normal
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property specifies whether the UA may break within a word to prevent overflow when an otherwise-unbreakable string is too long to fit within the line box. It only has an effect when ‘text-wrap’ is either ‘normal’ or ‘suppress’. Possible values:

normal
Lines may break only at allowed break points.
break-word
An unbreakable "word" may be broken at an arbitrary point if there are no otherwise-acceptable break points in the line. Shaping characters are still shaped as if the word were not broken, and grapheme clusters must stay together as one unit.
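
A typical use might look like the following sketch, in which long URLs in a narrow table cell would otherwise overflow. The selectors are hypothetical.

```css
/* Break inside an over-long string only when the line offers no
   other break point; ordinary words still wrap at spaces. */
td.url { text-wrap: normal; word-wrap: break-word; }

/* No effect here: 'word-wrap' applies only when 'text-wrap' is
   'normal' or 'suppress'. */
td.id { text-wrap: none; word-wrap: break-word; }
```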

If Korean is set to keep-all, and the word doesn't fit, should it break anyway? Do we need an ‘auto’ value here for scripts where it's OK to break even if it's not ideal vs. scripts (like Latin) where breaking in the middle of a word is wrong?

7. Alignment and Justification

7.1. Text Alignment: the ‘text-align’ property

Name: text-align
Value: [start | end | left | right | center | justify | match-parent ] || <string>
Initial: start
Applies to: block containers
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value, except for ‘match-parent’ (see prose)

This property describes how inline contents of a block are horizontally aligned if the contents do not completely fill the line box. Values have the following meanings:

start
The inline contents are aligned to the start edge of the line box.
end
The inline contents are aligned to the end edge of the line box.
left
The inline contents are aligned to the left edge of the line box. In vertical text, ‘left’ aligns to the edge of the line box that would be the start edge for left-to-right text.
right
The inline contents are aligned to the right edge of the line box. In vertical text, ‘right’ aligns to the edge of the line box that would be the end edge for left-to-right text.
center
The inline contents are centered within the line box.
justify
The text is justified according to the method specified by the text-justify property.
<string>
The string must be a single character; otherwise the declaration must be ignored. When applied to a table cell, specifies a character on which all cells in its table column that also have a <string> value for ‘text-align’ will align (see the section on horizontal alignment in a column for details and an example).
match-parent
This value behaves the same as ‘inherit’ except that an inherited value of ‘start’ or ‘end’ is calculated against its parent's ‘direction’ value and results in a computed value of either ‘left’ or ‘right’.

A block of text is a stack of line boxes. In the case of ‘start’, ‘end’, ‘left’, ‘right’ and ‘center’, this property specifies how the inline boxes within each line box align within the line box: alignment is not with respect to the viewport or containing block. In the case of ‘justify’, the UA may stretch the inline boxes in addition to adjusting their positions. (See also the ‘text-justify’, ‘letter-spacing’ and ‘word-spacing’ properties.)

A keyword value may be specified in conjunction with the <string> value; if it is not given, it defaults to ‘end’. This value is used when <string> alignment is applied to boxes that are not table cells, when the alignment character appears more than once, and when the text wraps to multiple lines. Also, if the column is wide enough that the string value alone does not determine the alignment of its <string>-aligned contents, the fallback value of the first cell in the column with a <string> alignment is used to determine the aligned contents' alignment within the column. Use this value also to determine alignment with respect to the axis instead of using the "character directionality", which is not defined for punctuation...
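
The following sketch illustrates the <string> and ‘match-parent’ values as described in this section. The class names are hypothetical.

```css
/* Align a column of amounts on the decimal point; 'end' is the
   fallback keyword (written out here, although it is the default). */
td.amount { text-align: "." end; }

/* The list is left-to-right, so its 'start' means left. */
ul { direction: ltr; text-align: start; }

/* A right-to-left item with plain inheritance would resolve 'start'
   against its own direction (right). With 'match-parent', the
   inherited 'start' is resolved against the parent's direction,
   so the computed value is 'left'. */
li.rtl { direction: rtl; text-align: match-parent; }
```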

7.2. Last Line Alignment: the ‘text-align-last’ property

Name: text-align-last
Value: start | end | left | right | center | justify
Initial: start
Applies to: block containers
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property describes how the last line of a block or a line right before a forced line break is aligned when text-align is set to ‘justify’. Values have the same meaning as for text-align.
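
For example, the following sketch justifies paragraphs but centers their last lines, and any line that ends in a forced break. The class name is hypothetical.

```css
p.colophon {
  text-align: justify;
  text-align-last: center;
}
```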

7.3. Justification Method: the ‘text-justify’ property

Name: text-justify
Value: auto | [ trim || [ inter-word | inter-ideograph | inter-cluster | distribute | kashida ] ]
Initial: auto
Applies to: block containers and, optionally, inline elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property selects the justification method used when text-align is set to justify. The property applies to block containers, but the UA may (but is not required to) also support it on inline elements. It takes the following values:

auto
The UA determines the justification algorithm to follow, based on a balance between performance and adequate presentation quality.

One possible algorithm is to determine the behavior based on the language of the paragraph: the UA can then choose an appropriate value for the language, like ‘trim inter-ideograph’ for CJK, or ‘inter-word’ for English. Another possibility is to use a justification method that is a universal compromise for all scripts, e.g. the ‘distribute’ method with discrete scripts dropped to second priority.

inter-word
Justification primarily changes spacing at word separators. This value is typically used for languages that separate words using spaces, like English or (sometimes) Korean.
inter-ideograph
Justification primarily changes spacing at word separators and at inter-graphemic boundaries in scripts that use no word spaces. This value is typically used for CJK languages.
inter-cluster
Justification primarily changes spacing at word separators and at grapheme cluster boundaries in clustered scripts. This value is typically used for Southeast Asian scripts such as Thai.
distribute
Justification primarily changes spacing both at word separators and at grapheme cluster boundaries in all scripts except those in the connected and cursive groups. This value is sometimes used in e.g. Japanese, often with the ‘text-align-last’ property.
kashida
Justification primarily stretches Arabic and related scripts through the use of kashida or other calligraphic elongation.
trim
This keyword specifies that compression is preferred to expansion and enables the trimming of blank space in glyphs where allowed by typographic tradition (for example, in spaces and fullwidth punctuation glyphs). If specified alone, the exact justification algorithm is UA-defined (as for ‘auto’).
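
The values above might be chosen per language, as in this sketch. The selectors are hypothetical; the pairing of methods with languages follows the "typically used for" notes in the value descriptions.

```css
p { text-align: justify; }

p:lang(en) { text-justify: inter-word; }
p:lang(ja) { text-justify: trim inter-ideograph; }
p:lang(th) { text-justify: inter-cluster; }
p:lang(ar) { text-justify: kashida; }

/* Evenly spread a short Japanese heading across its line box,
   including its last (and only) line. */
h2:lang(ja) {
  text-align: justify;
  text-justify: distribute;
  text-align-last: justify;
}
```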

When justifying text, the user agent takes the remaining space between the ends of a line's contents and the edges of its line box, and distributes that space throughout its contents so that the contents exactly fill the line box. If the ‘letter-spacing’ and ‘word-spacing’ property values allow it, the user agent may also distribute negative space, putting more content on the line than would otherwise fit under normal spacing conditions. The exact justification algorithm is UA-dependent; however, CSS provides some general guidelines which should be followed when any justification method other than ‘auto’ is specified.

Justification affects different types of writing systems in different ways. For justification purposes, characters are grouped as follows:

block
CJK (including Hangul and half-width kana) and by extension all "wide" characters. (See [UAX11])
clustered
South-East Asian scripts that have discrete units but do not use space between words (such as Thai, Lao, Khmer, Myanmar). This category also includes the Tibetan script.
discrete
Scripts that use spaces between words and have discrete, unconnected (in print) units within words, such as Latin, Greek, Cyrillic, Hebrew.
cursive
Arabic and similar cursive scripts
connected
Devanagari and other scripts, such as Bengali and Gurmukhi, that use spaces between words and baseline connectors within words. The Ogham script also falls into this category.

Where do scripts like Tamil fit in?

The UA may enable or break optional ligatures or use other font features such as alternate glyphs to help justify the text under any method. This behavior is not defined by CSS.

CSS defines expansion opportunities as points where the justification algorithm may alter spacing within the text. These expansion opportunities fall into priority levels as defined by the justification method. Within a line, higher priority expansion opportunities should be expanded or compressed to their limits before lower priority expansion opportunities are adjusted. (Expansion and compression limits are given by the letter-spacing and word-spacing properties.)

How any remaining space is distributed once all expansion opportunities reach their limits is up to the UA. If the inline contents of a line cannot be stretched to the full width of the line box, then they must be aligned as specified by the text-align-last property (or as ‘start’ if ‘text-align-last’ is ‘justify’).
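
The priority rule can be illustrated with a small worked calculation. This is one possible reading rather than a required algorithm: it considers expansion only, assumes a single limit per priority group, and introduces the symbols S, n, l and delta purely for illustration.

```latex
% S        : space remaining on the line
% n_1, n_2 : number of 1st- and 2nd-priority expansion opportunities
% l_1, l_2 : maximum expansion allowed at each opportunity in the group
% d_1, d_2 : expansion actually applied at each opportunity
\[
  \delta_1 = \min\!\left( \frac{S}{n_1},\; l_1 \right), \qquad
  \delta_2 = \min\!\left( \frac{S - n_1 \delta_1}{n_2},\; l_2 \right)
\]
% Example: S = 12px, n_1 = 4 word spaces with l_1 = 2px,
% n_2 = 20 inter-cluster points with l_2 = 0.5px.
\[
  \delta_1 = \min(3, 2) = 2\,\mathrm{px}, \qquad
  \delta_2 = \min\!\left( \tfrac{12 - 8}{20},\; 0.5 \right) = 0.2\,\mathrm{px}
\]
% All 12px are absorbed: 4 * 2px + 20 * 0.2px = 12px.
```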

The expansion opportunity priorities for values of ‘text-justify’ are given in the table below. Space must be distributed evenly among all types of expansion opportunities in a given prioritization group, but may vary within a line due to changes in the font or letter-spacing and word-spacing values. The different types of expansion opportunities are defined as follows:

spaces
An expansion opportunity exists at spaces and other word separators. Expand as for word-spacing.
block
clustered
discrete
An expansion opportunity exists between two grapheme clusters when at least one of them belongs to the affected script group and the spacing at that point has not already been altered at a higher priority.

I'm not sure grapheme clusters are the right unit to use for some of these complex scripts...

cursive
Words may be expanded through kashida elongation or other cursive expansion processes. Kashida may be applied in discrete units or continuously, and the prioritization of kashida points is UA-dependent: for example, the UA may apply more at the end of the line. The UA should not apply kashida to fonts for which it is inappropriate. It may instead rely on other justification methods that lengthen or shorten Arabic segments (e.g. by substituting in swash forms or optional ligatures). Because elongation rules depend on the typeface style, the UA should rely on the font whenever possible rather than inserting kashida based on a font-independent ruleset. The UA should limit elongation so that, e.g. in multi-script lines a short stretch of Arabic will not be forced to soak up too much of the extra space by itself. If the UA does not support cursive elongation, then no expansion points exist between grapheme clusters of these scripts.
punctuation
An expansion opportunity exists between a pair of characters from the Unicode symbols (S*) and punctuation (P*) classes and at enabled autospace points. The default justification priority of these points depends on the justification method as defined below; however there may be additional rules controlling their justification behavior due to typographic tradition. For example, there are traditionally no expansion opportunities between consecutive EM DASH U+2014, HORIZONTAL BAR U+2015, HORIZONTAL ELLIPSIS U+2026, or TWO DOT LEADER U+2025 characters. [JIS4051] The UA may introduce additional levels of priority to handle expansion opportunities involving punctuation.
connected
No expansion opportunities occur between pairs of connected script grapheme clusters. Is this correct?
Prioritization of Expansion Points
method: inter-word inter-ideograph distribute inter-cluster kashida auto
priority: 1st 2nd 1st 2nd 1st 2nd 1st 2nd 1st 2nd 3rd 1st 2nd
spaces ¿?
block ¿?
clustered ¿?
cursive ¿?
discrete ¿?
connected
punctuation ??? ??? ¿? ¿?

The ‘auto’ column defined above is informative.

Japanese is one of the languages that prefer compression to expansion during justification. JIS X-4051 [JIS4051] defines how a text formatter can justify Japanese text. Here is one example of an interpretation of JIS X-4051 with slight modifications.

  1. If no justification is necessary, neither compression nor expansion occur.
  2. If justification is necessary, take the first line break opportunity beyond the end of line and apply the following rules (in order) to compress until it fits.
    1. Compress space characters up to the minimum value specified by ‘word-spacing’ property, or up to 1/4em.
    2. Compress fullwidth middle dot punctuations and fullwidth colon punctuations up to 1/2em, by trimming the same amount of spaces from both sides of the characters.
    3. Compress the left side of fullwidth opening punctuations and the right side of fullwidth closing punctuations up to 1/2em.
    4. Compress spaces created by ‘text-autospace’ property up to 1/8em.
  3. If the compression fails to fit the line, take the last line break opportunity before the end of line, and apply the following rules (in order) to expand until it fits.
    1. Expand space characters up to the maximum value specified by ‘word-spacing’ property, or up to 1/2em.
    2. Expand spaces created by ‘text-autospace’ property up to 1/2em.
    3. Expand all expansion opportunities as defined above in equal percent of the size of each character.

8. Spacing

The next two properties refer to the <spacing-limit> value type, which is defined as follows:

<spacing-limit>
[ normal | <length> | <percentage> ]
normal
Specifies the normal optimum/minimum/maximum spacing, as defined by the current font and/or the user agent. Normal spacing should be percentage-based. Normal minimum and maximum spacing must be based on the optimum spacing so that the minimum and maximum limits increase and decrease with changes to the optimum spacing. Normal minimum and maximum spacing may also vary according to some measure of the amount of text on a line (e.g. block width divided by font size): larger measures can accommodate tighter spacing constraints. Normal optimum/minimum/maximum spacing may also vary based on the value of the text-justify property, the element's language, and other factors.
<length> or <percentage>
Specifies extra spacing in addition to the normal spacing. Percentages are with respect to the width of the affected character. Values may be negative, but there may be implementation-dependent limits.

8.1. Word Spacing: the ‘word-spacing’ property

Name: word-spacing
Value: <spacing-limit> {1,3}
Initial: normal
Applies to: all elements
Inherited: yes
Percentages: refers to width of space (U+0020) glyph
Media: visual
Computed value: ‘normal’ or computed value or percentage

This property specifies the minimum, maximum, and optimal spacing between words. If only one value is specified, then it represents the optimal spacing and the minimum and maximum are both ‘normal’. If two values are specified, then the first represents both the optimal spacing and the minimum spacing, and the second represents the maximum spacing. If three values are specified, they represent the optimum, minimum, and maximum respectively.

If the value of the optimum or maximum spacing is less than the value of the minimum spacing, then its used value is the minimum spacing. If the optimum spacing is greater than the maximum spacing then its used value is the maximum spacing. (This substitution occurs after inheritance.)
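The value expansion and clamping rules above can be sketched as follows. This is a minimal illustration that assumes all values have already been resolved to numbers (in em); the `normal_min` and `normal_max` parameters are hypothetical stand-ins for whatever the font or UA supplies for ‘normal’. The `single_sets_all` flag covers the one difference in ‘letter-spacing’, where a single value represents all three limits.

```python
def resolve_spacing(values, normal_min=0.0, normal_max=0.0,
                    single_sets_all=False):
    """Return the used (optimum, minimum, maximum) spacing.

    `values` holds one to three numbers in specified order:
    optimum, minimum, maximum.
    """
    if len(values) == 1:
        optimum = values[0]
        if single_sets_all:
            minimum = maximum = optimum
        else:
            minimum, maximum = normal_min, normal_max
    elif len(values) == 2:
        # The first value is both the optimum and the minimum.
        optimum = minimum = values[0]
        maximum = values[1]
    elif len(values) == 3:
        optimum, minimum, maximum = values
    else:
        raise ValueError("expected one to three values")

    # A maximum below the minimum is replaced by the minimum.
    maximum = max(maximum, minimum)
    # The optimum is then clamped into the [minimum, maximum] range.
    optimum = min(max(optimum, minimum), maximum)
    return optimum, minimum, maximum


print(resolve_spacing([0.5, 0.1, 0.3]))
```

In the call shown, the optimum exceeds the maximum, so its used value becomes the maximum.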

In the absence of justification the optimal spacing must be used. The text justification process may alter the spacing from its optimum (see the text-justify property, above) but must not violate the minimum spacing limit and should also avoid exceeding the maximum.

Spacing is applied to each word-separator character left in the text after the white space processing rules have been applied and should be applied half on each side of the character.

This is correct for Ethiopian and doesn't matter for invisible spaces, but is it correct for Tibetan? Most publications seem to add space after the tsek mark during justification.

Word-separator characters include the space (U+0020), the no-break space (U+00A0), the Ethiopic word space (U+1361), the Aegean word separators (U+10100, U+10101), the Ugaritic word divider (U+1039F), and the Tibetan tsek (U+0F0B, U+0F0C).

Is this list correct?

If there are no word-separator characters, or if the word-separating character has a zero advance width (such as the zero width space U+200B), then the user agent must not create additional spacing between words. General punctuation and fixed-width spaces (such as U+3000 and U+2000 through U+200A) are not considered word-separators.
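As a rough illustration of the word-separator test described above, the following sketch checks a character against the listed code points. The `advance_width` parameter is a hypothetical input standing in for the glyph metrics a UA would obtain from the font.

```python
# Code points listed in this section as word-separator characters.
WORD_SEPARATORS = {
    0x0020,   # space
    0x00A0,   # no-break space
    0x1361,   # Ethiopic word space
    0x10100,  # Aegean word separator line
    0x10101,  # Aegean word separator dot
    0x1039F,  # Ugaritic word divider
    0x0F0B,   # Tibetan tsek
    0x0F0C,   # Tibetan non-breaking tsek
}


def takes_word_spacing(ch, advance_width=1.0):
    """True if word-spacing is applied at `ch`.

    A separator with zero advance width receives no additional spacing.
    """
    return ord(ch) in WORD_SEPARATORS and advance_width > 0


print(takes_word_spacing(" "), takes_word_spacing("\u3000"))
```

Fixed-width spaces such as U+3000 are deliberately absent from the set, matching the last sentence of the paragraph above.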

8.2. Tracking: the ‘letter-spacing’ property

Name: letter-spacing
Value: <spacing-limit>{1,3}
Initial: normal
Applies to: all elements
Inherited: yes
Percentages: refers to width of space (U+0020) glyph
Media: visual
Computed value: ‘normal’ or computed value or percentage

This property specifies the minimum, maximum, and optimal spacing between grapheme clusters. If only one value is specified, then it represents all three values. If two values are specified, then the first represents both the optimal spacing and the minimum spacing, and the second represents the maximum spacing. If three values are specified, they represent the optimum, minimum, and maximum respectively.

If the value of the optimum or maximum spacing is less than the value of the minimum spacing, then its used value is the minimum spacing. If the optimum spacing is greater than the maximum spacing then its used value is the maximum spacing. (This substitution occurs after inheritance.)

In the absence of justification the optimal spacing must be used. The text justification process may alter the spacing from its optimum (see the text-justify property, above) but must not violate the minimum spacing limit and should also avoid exceeding the maximum. Letter-spacing is applied in addition to any word-spacing. ‘normal’ optimum letter-spacing is typically zero.

A grapheme cluster is what a language user considers to be a character or a basic unit of the script. The term is described in detail in the Unicode Technical Report: Text Boundaries [UAX29]. This specification relies on the default (not tailored) rules only.

Letter-spacing must not be applied at the beginning or at the end of a line. At element boundaries, the letter spacing is given by and rendered within the innermost element that contains the boundary.

For example, given the markup

<P>a<LS>b<Z>cd</Z><Y>ef</Y></LS>g</P>

and the style sheet

LS { letter-spacing: 1em; }
Z { letter-spacing: 0.3em; }
Y { letter-spacing: 0.4em; }

the spacing would be

a[0]b[1em]c[0.3em]d[1em]e[0.4em]f[0]g
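The boundary rule can be checked against this example with a short sketch. It assumes a simplified, hypothetical model in which each character carries the path of its ancestor elements (outermost first) and element names are unique, so that the innermost element containing a boundary is the deepest ancestor shared by the two neighboring characters.

```python
def boundary_spacing(chars, specified, initial=0.0):
    """Return the letter-spacing (in em) at each boundary between characters.

    `chars` is a list of (character, ancestor path) pairs.
    `specified` maps element names to their specified letter-spacing.
    """
    result = []
    for (_, left), (_, right) in zip(chars, chars[1:]):
        # Collect the ancestors shared by both characters.
        common = []
        for a, b in zip(left, right):
            if a != b:
                break
            common.append(a)
        # letter-spacing inherits, so search from the innermost shared
        # element outward for the nearest specified value.
        value = initial
        for name in reversed(common):
            if name in specified:
                value = specified[name]
                break
        result.append(value)
    return result


# <P>a<LS>b<Z>cd</Z><Y>ef</Y></LS>g</P>
chars = [
    ("a", ("P",)),
    ("b", ("P", "LS")),
    ("c", ("P", "LS", "Z")),
    ("d", ("P", "LS", "Z")),
    ("e", ("P", "LS", "Y")),
    ("f", ("P", "LS", "Y")),
    ("g", ("P",)),
]
specified = {"LS": 1.0, "Z": 0.3, "Y": 0.4}
print(boundary_spacing(chars, specified))
```

The printed list corresponds, boundary by boundary, to the bracketed values in the spacing shown above.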

UAs may apply letter-spacing to cursive scripts. In this case, UAs should extend the space between disjoint graphemes as specified above and extend the visible connection between cursively connected graphemes by the same amount (rather than leaving a gap). The UA may use glyph substitution or other font capabilities to spread out the letters. If the UA cannot expand a cursive script without breaking the cursive connections, it should not apply letter-spacing between grapheme clusters of that script at all.

When the resulting space between two characters is not the same as the default space, user agents should not use optional ligatures.

8.3. Fullwidth Punctuation Kerning: the ‘punctuation-trim’ property

Name: punctuation-trim
Value: none | [ start || [ end | allow-end ] || adjacent ]
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

This property determines whether or not a fullwidth punctuation character should be trimmed (kerned) if it appears at the start or end of a line, or adjacent to another fullwidth punctuation character. Values are defined as follows:

none
Do not trim or kern the blank half of fullwidth opening or closing punctuation glyphs.
start
Trim (kern) the blank half of fullwidth opening punctuation at the beginning of each line.
end
Trim (kern) the blank half of fullwidth closing punctuation at the end of each line.
allow-end
Trim (kern) the blank half of fullwidth closing punctuation at the end of each line if it does not otherwise fit prior to justification.
adjacent
Trim (kern) the blank half of fullwidth opening punctuation if its previous adjacent character is a fullwidth opening punctuation, fullwidth middle dot punctuation, fullwidth closing punctuation, or ideographic space (U+3000). Trim (kern) the blank half of fullwidth closing punctuation if its next adjacent character is a fullwidth closing punctuation, fullwidth middle dot punctuation, or ideographic space (U+3000).
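The ‘adjacent’ rule can be restated as a small predicate. This is a sketch only: the class names passed as strings are hypothetical labels for the punctuation classes named in this section, and the function covers only the ‘adjacent’ value, not ‘start’, ‘end’, or ‘allow-end’.

```python
# Neighbors that trigger trimming of an opening punctuation's blank half.
BEFORE_OPENING = {"opening", "middle-dot", "closing", "ideographic-space"}
# Neighbors that trigger trimming of a closing punctuation's blank half.
AFTER_CLOSING = {"closing", "middle-dot", "ideographic-space"}


def trims_adjacent(prev_class, cur_class, next_class):
    """True if the current character is trimmed under 'adjacent'.

    Each argument is a class label, or None for any other character.
    """
    if cur_class == "opening":
        return prev_class in BEFORE_OPENING
    if cur_class == "closing":
        return next_class in AFTER_CLOSING
    return False


print(trims_adjacent("closing", "opening", None))
```

Note the asymmetry in the definition: an opening mark is trimmed after a closing mark, but a closing mark followed by an opening mark is not itself trimmed.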

Add a description of what to do if the font has a kerning pair defined. Add a description of what to do if the character is not 1em wide. Classes and Unicode code points should be re-reviewed.

The following example table lists the punctuation pairs affected by the ‘adjacent’ value. It uses halfwidth equivalents to approximate the trimming effect.

Demonstration of ‘adjacent’ punctuation trimming
Combination Sample Pair Looks Like
Opening—Opening + (
Middle Dot—Opening + (
Closing—Opening + (
Ideographic Space—Opening  +  (
Closing—Closing + )
Closing—Middle Dot + )
Closing—Ideographic Space +  ) 

In the context of this property the following definitions apply:

fullwidth opening punctuation
Includes any opening punctuation character (Unicode category Ps) that belongs to the CJK Symbols and Punctuation block (U+3000–U+303F) or is categorized as East Asian Fullwidth (F) by [UAX11]. Also includes LEFT SINGLE QUOTATION MARK (U+2018) and LEFT DOUBLE QUOTATION MARK (U+201C). When trimmed, the left (for horizontal text) or top (for vertical text) half is kerned.
fullwidth closing punctuation
Includes any closing punctuation character (Unicode category Pe) that belongs to the CJK Symbols and Punctuation block (U+3000–U+303F) or is categorized as East Asian Fullwidth (F) by [UAX11]. Also includes RIGHT SINGLE QUOTATION MARK (U+2019) and RIGHT DOUBLE QUOTATION MARK (U+201D). May also include fullwidth colon punctuation and/or fullwidth dot punctuation (see below). When trimmed, the right (for horizontal text) or bottom (for vertical text) half is kerned.
fullwidth middle dot punctuation
Includes MIDDLE DOT (U+00B7), HYPHENATION POINT (U+2027), and KATAKANA MIDDLE DOT (U+30FB). May also include fullwidth colon punctuation and/or fullwidth dot punctuation (see below).
fullwidth colon punctuation
Includes FULLWIDTH COLON (U+FF1A) and FULLWIDTH SEMICOLON (U+FF1B).
fullwidth dot punctuation
Includes IDEOGRAPHIC COMMA (U+3001), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH COMMA (U+FF0C), FULLWIDTH FULL STOP (U+FF0E).
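A possible way to derive these classes from Unicode data is sketched below using Python's standard `unicodedata` module. The class labels are hypothetical, and the sketch leaves colon and dot punctuation in their own classes rather than assigning them to the closing or middle dot class, since that choice depends on language and font. It also does not check whether the glyph is actually fullwidth in the font in use.

```python
import unicodedata

MIDDLE_DOT = {0x00B7, 0x2027, 0x30FB}
COLON = {0xFF1A, 0xFF1B}
DOT = {0x3001, 0x3002, 0xFF0C, 0xFF0E}


def _cjk_or_fullwidth(ch):
    """True for the CJK Symbols and Punctuation block or East Asian Fullwidth."""
    return (0x3000 <= ord(ch) <= 0x303F
            or unicodedata.east_asian_width(ch) == "F")


def classify(ch):
    """Return a class label for `ch`, or None if it is in no listed class."""
    cp = ord(ch)
    category = unicodedata.category(ch)
    if cp in (0x2018, 0x201C) or (category == "Ps" and _cjk_or_fullwidth(ch)):
        return "opening"
    if cp in (0x2019, 0x201D) or (category == "Pe" and _cjk_or_fullwidth(ch)):
        return "closing"
    if cp in MIDDLE_DOT:
        return "middle-dot"
    if cp in COLON:
        return "colon"
    if cp in DOT:
        return "dot"
    return None


print(classify("\u300c"), classify("\uff09"), classify("("))
```

An ASCII parenthesis has category Ps but is neither in the CJK block nor East Asian Fullwidth, so it falls outside every class.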

Fullwidth opening and closing punctuation must not be trimmed if the glyph is not actually fullwidth. A fullwidth glyph is one that has the same advance width as a typical Han character in the same font.

Whether fullwidth colon punctuation and fullwidth dot punctuation should be considered fullwidth closing punctuation or fullwidth middle dot punctuation depends on where in the glyph's box the punctuation is drawn. If the punctuation is centered, then it should be considered middle dot punctuation. If the punctuation is drawn to one side (left in horizontal text, top in vertical text) and the other half is therefore blank then the punctuation should be considered closing punctuation and trimmed accordingly.

The UA must classify fullwidth colon punctuation and fullwidth dot punctuation under either the fullwidth closing punctuation category or the fullwidth middle dot punctuation category as appropriate. The UA may rely on language conventions and the layout orientation (horizontal vs. vertical), and/or font information to determine this categorization. The UA may also add additional characters to any category as appropriate.

The following informative table summarizes language conventions for classifying fullwidth colon and dot punctuation:

colon punctuation dot punctuation
Simplified Chinese (horizontal) closing closing
Simplified Chinese (vertical) closing closing
Traditional Chinese middle dot middle dot
Korean middle dot closing
Japanese middle dot closing

Note that, for Chinese fonts at least, the author observes that the standard convention is often not followed.

8.4. Adding space: the ‘text-autospace’ property

Name: text-autospace
Value: none | [ ideograph-numeric || ideograph-alpha || ideograph-space || ideograph-parenthesis ]
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: specified value

When a run of non-ideographic or numeric characters appears inside of ideographic text, a certain amount of space is often preferred on both sides of the non-ideographic text to separate it from the surrounding ideographic glyphs. This property controls the creation of that space when rendering the text. That added width does not correspond to the insertion of additional space characters, but instead to the width increment of existing glyphs.

(A commonly used algorithm for determining this behavior is specified in JIS X-4051 [JIS4051].)

This property is additive with the word-spacing and letter-spacing [CSS21] properties. That is, the amount of spacing contributed by the ‘letter-spacing’ setting (if any) is added to the spacing created by text-autospace. The same applies to word-spacing.

The space added can be compressed or expanded during the justification process as specified in the ‘text-justify’ property.

This property applies only to the same inline element context, and can apply across elements if in the same inline element context.

Values have the following meanings:

none
No extra space is created.
ideograph-numeric
Creates 1/4em extra spacing between runs of ideographic letters and numeric glyphs.
ideograph-alpha
Creates 1/4em extra spacing between runs of ideographic letters and non-ideographic letters, such as Latin-based, Cyrillic, Greek, Arabic or Hebrew.
ideograph-space
Extends the width of the space character while surrounded by ideographs.
ideograph-parenthesis
Creates extra spacing between normal (non-wide) parentheses and ideographs.
punctuation
Creates extra non-breaking spacing around punctuation as required by language-specific typographic conventions. For example, if the element's content language is French, narrow no-break space (U+202F) and no-break space (U+00A0) should be inserted where required by French typographic guidelines.
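The ‘ideograph-numeric’ and ‘ideograph-alpha’ values can be illustrated with the sketch below. It rests on a loud assumption: because the list of ideographic letters is not given here, `is_ideograph` treats only the CJK Unified Ideographs block (U+4E00 to U+9FFF) as ideographic, which is a placeholder rather than the definition intended by this property. The other values are not modeled.

```python
def is_ideograph(ch):
    # Assumption: only the CJK Unified Ideographs block counts as ideographic.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def autospace_gaps(text, values):
    """Return (index, width in em) pairs for the extra spacing created.

    `index` is the position of the character that the spacing precedes.
    `values` is the set of enabled text-autospace keywords.
    """
    gaps = []
    for i in range(len(text) - 1):
        a, b = text[i], text[i + 1]
        if is_ideograph(a) == is_ideograph(b):
            continue  # not a boundary between ideographic and other text
        other = b if is_ideograph(a) else a
        if other.isdigit():
            if "ideograph-numeric" in values:
                gaps.append((i + 1, 0.25))
        elif other.isalpha():
            if "ideograph-alpha" in values:
                gaps.append((i + 1, 0.25))
    return gaps


print(autospace_gaps("第3章", {"ideograph-numeric"}))
```

The result lists spacing positions only; no space characters are inserted into the text, in keeping with the description of this property.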

We are considering cutting ideograph-space and ideograph-parenthesis because these two, unlike the others, exist to fix errors in the document rather than to style it. ideograph-parenthesis can also make text-justify: trim harder and more complex.

It was requested to add a value for doubling the space after periods.

Ideographic letters in this definition include the following characters.

9. Edge Effects

9.1. First Line Indentation: the ‘text-indent’ property

Name: text-indent
Value: [ <length> | <percentage> ] && [ hanging || each-line ]?
Initial: 0
Applies to: block containers
Inherited: yes
Percentages: refers to width of containing block
Media: visual
Computed value: the percentage as specified or the absolute length

This property specifies the indentation applied to lines of inline content in a block. The indent is treated as a margin applied to the start edge of the line box. Unless otherwise specified via the ‘each-line’ and/or ‘hanging’ keywords, only lines that are the first formatted line of an element are affected. For example, the first line of an anonymous block box is only affected if it is the first child of its parent element.

Values have the following meanings:

<length>
Gives the amount of the indent as an absolute length.
<percentage>
Gives the amount of the indent as a percentage of the containing block's logical width.
each-line
Indentation affects the first line of the block container as well as each line after a forced line break, but does not affect lines after a text wrap break.
hanging
Inverts which lines are affected.
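How the two keywords combine can be sketched as follows. The model is hypothetical and simplified: each line of a block is described only by how it starts, as "first" (first formatted line), "forced" (after a forced line break), or "wrap" (after a text wrap break).

```python
def indented_lines(line_starts, each_line=False, hanging=False):
    """Return one boolean per line: True if text-indent applies to it."""
    result = []
    for start in line_starts:
        affected = start == "first" or (each_line and start == "forced")
        # 'hanging' inverts which lines are affected.
        result.append(affected != hanging)
    return result


lines = ["first", "wrap", "forced", "wrap"]
print(indented_lines(lines))
print(indented_lines(lines, each_line=True, hanging=True))
```

With both keywords set, the indent applies to exactly the lines that ‘each-line’ alone would leave untouched, that is, the lines following a text wrap break.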

If ‘text-align’ is ‘start’ and ‘text-indent’ is ‘5em’ in left-to-right text with no floats present, then the first line of text will start 5em into the block:

     Since CSS1 it has been possible
to indent the first line of a block
element using the 'text-indent'
property.

Note that since the ‘text-indent’ property inherits, when specified on a block element, it will affect descendant inline-block elements. For this reason, it is often wise to specify ‘text-indent: 0’ on elements that are specified ‘display: inline-block’.

9.2. Hanging Punctuation: the ‘hanging-punctuation’ property

Name: hanging-punctuation
Value: none | [ first || last || [ allow-end | force-end ] ]
Initial: none
Applies to: block containers
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property determines whether a punctuation mark, if one is present, may be placed outside the line box at the start or at the end of a full line of text. Values have the following meanings:

first
Punctuation (specifically, opening brackets and quotes) may hang outside the start edge of the first line.
last
Punctuation (specifically, closing brackets and quotes) may hang outside the end edge of the last line.
allow-end
Punctuation (specifically, stops and commas) may hang outside the end edge of all lines if the punctuation does not otherwise fit prior to justification.
force-end
Punctuation (specifically, stops and commas) may hang outside the end edge of all lines. If justification is enabled on this line, then it will force the punctuation to hang.

In all cases only one punctuation character may hang outside the edge of the line.

Need to work on the description. Add Unicode character classes. Cover indentation as well. Check for Western use-cases. Add hyphens value?

10. Text Decoration

10.1. Line Decoration: Underline, Overline, and Strike-Through

Sync this against CSS2.1.

The following properties describe line decorations that are added to the content of an element. When specified on an inline element, such decoration affects all the boxes generated by that element; for all other elements, the decorations are propagated to an anonymous inline box that wraps all the in-flow inline children of the element, and to any block-level in-flow descendants. They are not, however, further propagated to floating and absolutely positioned descendants, nor to the contents of inline-table and inline-block descendants. The value of the text-decoration-line property on descendant elements therefore cannot have any effect on the decoration of the ancestor; to skip descendants, use the text-decoration-skip property.

By default underlines, overlines, and line-throughs are applied only to text (including white space, letter spacing, and word spacing): margins, borders, and padding are skipped. Elements containing no text, such as images, are likewise not decorated. The text-decoration-skip property can be used to modify this behavior, for example allowing inline replaced elements to be underlined or requiring that white space be skipped.

In determining the position and thickness of text decoration lines, user agents may consider the font sizes and dominant baselines of descendants, but for a given element's decoration must use the same baseline and thickness on each line. Relatively positioning a descendant moves all text decorations affecting it along with the descendant's text; it does not affect calculation of the decoration's initial position on that line. The color and line style of decorations must remain the same on all decorations applied by a given element, even if descendant elements have different color or line style values.

The following figure shows the averaging for underline:

[Figure: In the first rendering of the underlined text '1st a' with 'st' as a superscript, both the '1st' and the 'a' are rendered in a small font. In the second rendering, the 'a' is rendered in a larger font. In the third, both '1st' and 'a' are large.]

In the three fragments of underlined text, the underline is drawn consecutively lower and thicker as the ratio of large text to small text increases.

In the following style sheet and document fragment:


   blockquote { text-decoration: underline; color: blue; }
   em { display: block; }
   cite { color: fuchsia; }

   <blockquote>
    <p>
     <span>
      Help, help!
      <em> I am under a hat! </em>

      <cite> —GwieF </cite>
     </span>
    </p>
   </blockquote>

...the underlining for the blockquote element is propagated to an anonymous inline element that surrounds the span element, causing the text "Help, help!" to be blue, with the blue underlining from the anonymous inline underneath it, the color being taken from the blockquote element. The text in the em block is also underlined, as it is in an in-flow block to which the underline is propagated. The final line of text is fuchsia, but the underline underneath it is still the blue underline from the anonymous inline element.

Sample rendering of the above underline example

This diagram shows the boxes involved in the example above. The rounded aqua line represents the anonymous inline element wrapping the inline contents of the paragraph element, the rounded blue line represents the span element, and the orange lines represent the blocks.

10.1.1. Text Decoration Lines: the ‘text-decoration-line’ property

Name: text-decoration-line
Value: none | [ underline || overline || line-through ]
Initial: none
Applies to: all elements
Inherited: no (but see prose)
Percentages: N/A
Media: visual
Computed value: as specified

Specifies what line decorations, if any, are added to the element. Values have the following meanings:

none
Produces no text decoration.
underline
Each line of text is underlined.
overline
Each line of text has a line above it (i.e. on the opposite side from an underline).
line-through
Each line of text has a line through the middle.

10.1.2. Text Decoration Color: the ‘text-decoration-color’ property

Name: text-decoration-color
Value: <color>
Initial: currentColor
Applies to: all elements
Inherited: no
Percentages: N/A
Media: visual
Computed value: as specified

This property specifies the color of text decoration (underlines, overlines, and line-throughs) set on the element with text-decoration-line.

10.1.3. Text Decoration Style: the ‘text-decoration-style’ property

Name: text-decoration-style
Value: solid | double | dotted | dashed | wave
Initial: solid
Applies to: all elements
Inherited: no
Percentages: N/A
Media: visual
Computed value: as specified

This property specifies the style of the line(s) drawn for text decoration specified on the element. Values have the same meaning as for the border-style properties [CSS3BG].

10.1.4. Text Decoration Shorthand: the ‘text-decoration’ property

Name: text-decoration
Value: <text-decoration-line> || <text-decoration-color> || <text-decoration-style> || blink
Initial: none
Applies to: all elements
Inherited: no
Percentages: N/A
Media: visual
Computed value: as specified

This property is a shorthand for setting text-decoration-line, text-decoration-color, and text-decoration-style in one declaration. Omitted values are set to their initial values. A text-decoration declaration that omits both the text-decoration-color and text-decoration-style values is backwards-compatible with CSS Levels 1 and 2.

If the blink keyword is specified the text blinks (alternates between visible and invisible). Conforming user agents may simply not blink the text. Note that not blinking the text is one technique to satisfy checkpoint 3.3 of WAI-UAAG.

The following example underlines unvisited links with a solid blue underline in CSS1 and CSS2 UAs and a navy dotted underline in CSS3 UAs.


:link {
    color: blue;
    text-decoration: underline;
    text-decoration: navy dotted underline; /* Ignored in CSS1/CSS2 UAs */
}
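The expansion of the shorthand into its longhands can be sketched in Python as below. This is a simplified illustration, not a CSS parser: it splits the value on white space and assumes that any token it does not recognize is a single-token color, so functional color notations containing spaces are not handled. The "blink" key in the result is a hypothetical flag, not a property.

```python
LINES = {"underline", "overline", "line-through"}
STYLES = {"solid", "double", "dotted", "dashed", "wave"}


def expand_text_decoration(value):
    """Expand a text-decoration value into its longhands.

    Omitted values are set to their initial values.
    """
    result = {
        "text-decoration-line": "none",
        "text-decoration-color": "currentColor",
        "text-decoration-style": "solid",
        "blink": False,
    }
    lines = []
    for token in value.split():
        if token in LINES:
            lines.append(token)
        elif token == "none":
            continue  # 'none' leaves text-decoration-line at its initial value
        elif token in STYLES:
            result["text-decoration-style"] = token
        elif token == "blink":
            result["blink"] = True
        else:
            # Assumption: any other token is a color.
            result["text-decoration-color"] = token
    if lines:
        result["text-decoration-line"] = " ".join(lines)
    return result


print(expand_text_decoration("navy dotted underline"))
```

Applied to ‘underline’ alone, the color and style fall back to their initial values, which is why such a declaration behaves the same as in CSS Levels 1 and 2.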

10.1.5. Text Decoration Line Continuity: the ‘text-decoration-skip’ property

Name: text-decoration-skip
Value: none | [ images || spaces || ink || all ]
Initial: images
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property specifies what parts of the element's content any text decoration affecting the element must skip over. It controls all text decoration lines drawn by the element and also any text decoration lines drawn by its ancestors. Values have the following meanings:

none
Skip nothing: text-decoration is drawn for all text content and for inline replaced elements.
images
Skip this element if it is an inline replaced element.
spaces
Skip white space: this includes regular spaces (U+0020) and tabs (U+0009), as well as nbsp (U+00A0), ideographic space (U+3000), all fixed width spaces (such as U+2000–U+200A, U+202F and U+205F), and any adjacent letter-spacing or word-spacing.
ink
Skip over where glyphs are drawn: interrupt the decoration line to let text show through where the text decoration would otherwise cross over a glyph. The UA may also skip a small distance to either side of the glyph outline.
all
Skip over all content in this element. This value does not affect text decorations drawn by this element.

Do we need a value that doesn't skip margins and padding?

Note that this property inherits and that descendant elements can have a different setting. Therefore a child of an element with text-decoration-skip: all can cause its grandparent's underline to be drawn by specifying text-decoration-skip: none.

10.1.6. Text Underline Position: the ‘text-underline-position’ property

Name: text-underline-position
Value: auto | under | alphabetic | over
Initial: auto
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property sets the position of an underline specified on the same element: it does not affect underlines specified by ancestor elements. The underline can appear either "over" or "under" the run of text in relation to its baseline orientation. This property is typically used in vertical writing contexts such as in Japanese documents where it is often desired to have the underline appear "over" (to the right of) the affected run of text. Values have the following meanings:

auto
The user agent may use any algorithm to determine the underline's position. In horizontal line layout, the underline should be aligned as for alphabetic. In vertical line layout, if the language is set to Japanese or Korean, the underline should be aligned as for over. (This suggestion needs some refinement.)
alphabetic
The underline is aligned with the alphabetic baseline. In this case the underline is likely to cross some descenders.
under
The underline is aligned with the "bottom" (left in vertical writing) edge of the element's em-box. In this case the underline usually does not cross the descenders. This is sometimes called "accounting" underline.
over
The underline is aligned with the "top" (right in vertical writing) edge of the element's em-box. In this mode, an overline also switches sides.

10.2. Emphasis Marks

East Asian documents traditionally use small symbols next to each glyph to emphasize a run of text. For example:

Example of emphasis in Japanese appearing above the text

Accent emphasis (shown in blue for clarity) applied to Japanese text

10.2.1. Emphasis Mark Style: the ‘text-emphasis-style’ property

Name: text-emphasis-style
Value: none | [ [ filled | open ] || [ dot | circle | double-circle | triangle | sesame ] ] | <string>
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: ‘none’, a pair of keywords representing the shape and fill, or a string

This property applies emphasis marks to the element's text. Values have the following meanings:

none
No emphasis marks.
filled
The shape is filled with solid color.
open
The shape is hollow.
dot
Draw small circles as marks. The filled dot is U+2022 ‘•’, and the open dot is U+25E6 ‘◦’.
circle
Draw large circles as marks. The filled circle is U+25CF ‘●’, and the open circle is U+25CB ‘○’.
double-circle
Draw double circles as marks. The filled double-circle is U+25C9 ‘◉’, and the open double-circle is U+25CE ‘◎’.
triangle
Draw triangles as marks. The filled triangle is U+25B2 ‘▲’, and the open triangle is U+25B3 ‘△’.
sesame
Draw sesames as marks. The filled sesame is U+FE45 ‘﹅’, and the open sesame is U+FE46 ‘﹆’.
<string>
Draw the given string as marks. Authors should not specify more than one grapheme cluster in <string>. The UA may truncate or ignore strings consisting of more than one grapheme cluster.

If a shape keyword is specified but neither ‘filled’ nor ‘open’ is specified, ‘filled’ is assumed. If only ‘filled’ or ‘open’ is specified, the shape keyword computes to ‘dot’ in horizontal writing mode and ‘sesame’ in vertical writing mode.
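The keyword defaults described above can be sketched as a lookup. The function name and the `vertical` flag are hypothetical, and only the keyword forms are modeled; a <string> value is not handled.

```python
# Mark characters for each (fill, shape) pair, as listed in this section.
MARKS = {
    ("filled", "dot"): "\u2022",
    ("open", "dot"): "\u25E6",
    ("filled", "circle"): "\u25CF",
    ("open", "circle"): "\u25CB",
    ("filled", "double-circle"): "\u25C9",
    ("open", "double-circle"): "\u25CE",
    ("filled", "triangle"): "\u25B2",
    ("open", "triangle"): "\u25B3",
    ("filled", "sesame"): "\uFE45",
    ("open", "sesame"): "\uFE46",
}
FILLS = {"filled", "open"}
SHAPES = {"dot", "circle", "double-circle", "triangle", "sesame"}


def resolve_emphasis_mark(value, vertical=False):
    """Return the mark character for a keyword value, or None for 'none'."""
    tokens = value.split()
    if tokens == ["none"]:
        return None
    # A missing fill keyword defaults to 'filled'.
    fill = next((t for t in tokens if t in FILLS), "filled")
    # A missing shape keyword depends on the writing mode.
    default_shape = "sesame" if vertical else "dot"
    shape = next((t for t in tokens if t in SHAPES), default_shape)
    return MARKS[(fill, shape)]


print(resolve_emphasis_mark("open"), resolve_emphasis_mark("triangle"))
```

The two calls shown exercise both defaults: ‘open’ alone picks up the ‘dot’ shape in horizontal writing mode, and ‘triangle’ alone is treated as filled.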

The marks should be drawn using the element's font settings with its size scaled down to 50%. The UA should fall back to an appropriate font if the glyph is missing in the font. The marks may instead be synthesized by the UA.

This rendering scheme is based on the ruby algorithm and the one used in the Japanese printing industry. However, the size of glyphs varies so much that this may result in inconsistent and/or bad visuals. Any feedback on this is appreciated.

The marks are drawn once for each grapheme cluster. However, emphasis marks are not drawn for a grapheme cluster consisting of:

10.2.2. Emphasis Mark Color: the ‘text-emphasis-color’ property

Name: text-emphasis-color
Value: <color>
Initial: currentColor
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property describes the foreground color of the emphasis marks.

10.2.3. Emphasis Mark Shorthand: the ‘text-emphasis’ property

Name: text-emphasis
Value: ‘<text-emphasis-style>’ || ‘<text-emphasis-color>’
Initial: see individual properties
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: see individual properties

This property is a shorthand for setting text-emphasis-style and text-emphasis-color in one declaration. Omitted values are set to their initial values.

Note that ‘text-emphasis-position’ is not reset in this shorthand. This is because typically the shape and color vary, but the position is consistent for a particular language throughout the document. Therefore the position should inherit independently.
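The interaction between the shorthand and ‘text-emphasis-position’ described above can be sketched as follows. The choice of selectors is an assumption made for the example, not something this section prescribes.

```css
/* Selectors chosen for illustration only. */

/* The position is set once, here on the root, and inherits. */
:root { text-emphasis-position: under; }

/* Sets style and color; the inherited position 'under' is unaffected. */
em { text-emphasis: open circle red; }

/* Color omitted: text-emphasis-color is set to its initial value,
   'currentcolor'. The position is still 'under'. */
strong { text-emphasis: filled sesame; }

/* Style 'none' turns the marks off; the position is not reset. */
rt { text-emphasis: none; }
```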

10.2.4. Emphasis Mark Position: the ‘text-emphasis-position’ property

Name: text-emphasis-position
Value: over | under
Initial: over (Issue: is this the right default?)
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: as specified

This property describes where emphasis marks are drawn. Values have the following meanings:

over
Draw marks above the text in horizontal layout, to the right in vertical layout. This is the default position.
under
Draw marks below the text in horizontal layout, to the left in vertical layout.

Emphasis marks are drawn exactly as if each grapheme cluster was assigned the mark as its ruby annotation text with the ruby position given by ‘text-emphasis-position’ and the ruby alignment as centered.

The effect of emphasis marks on the line height is the same as for ruby text.
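A minimal sketch of the two position values follows. The class names are hypothetical.

```css
/* Hypothetical class names, for illustration only. */

/* Marks above the text in horizontal layout, to the right in vertical layout. */
.emph-over {
  text-emphasis-style: filled dot;
  text-emphasis-position: over;
}

/* Marks below the text in horizontal layout, to the left in vertical layout. */
.emph-under {
  text-emphasis-style: filled dot;
  text-emphasis-position: under;
}
```

Because the marks affect line height in the same way as ruby text, lines containing emphasized text may become taller unless the line height already leaves room for the marks.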

The values ‘over’ and ‘under’ are chosen over ‘before’ and ‘after’ here because in Mongolian the baseline is not aligned with the block progression.

Steve Zilles points out that these values may not make much sense if vertical text is laid out so that horizontal scripts' glyph tops point left. An alternative set of values could be [ top | bottom ] && [ left | right ], with the appropriate value used for the current writing mode. This also makes it simple for PRC Chinese text to specify the correct behavior in the UA style sheet.

Note, the preferred position of emphasis marks depends on the language. In Japanese for example, the preferred position is ‘over’. In Chinese used in the PRC, on the other hand, the preferred position is ‘under’. The informative table below summarizes the preferred emphasis mark position for Chinese and Japanese:

Preferred emphasis mark and ruby position

Language                Preferred mark position
                        Horizontal   Vertical
Japanese                over         over
Mongolian               over?        over
Chinese (Traditional)   over         over
Chinese (Simplified)    under        over

Illustrations:
Japanese: Emphasis marks appear above each emphasized character in horizontal Japanese text. Emphasis marks appear on the right of each emphasized character in vertical Japanese text.
Chinese (Simplified): Emphasis marks appear below each emphasized character in horizontal Simplified Chinese text.

10.3. Text Shadows: the ‘text-shadow’ property

Name: text-shadow
Value: none | [ <shadow>, ]* <shadow>
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: a color plus three absolute <length>s

This property accepts a comma-separated list of shadow effects to be applied to the text of the element. <shadow> is the same as defined for the ‘box-shadow’ property except that the ‘inset’ keyword is not allowed.

Should shadows be clipped to the glyph edges, like ‘box-shadow’ is clipped to the box edges, or should they be painted completely underneath the text? (This makes a difference when the text is partially transparent.)

The shadow is applied to all of the element's text as well as any text decoration applied to it. When a text outline is specified, the shadow shadows the outlined shape rather than the glyph shape.

Would it be better to apply shadows together with text decoration? That is, a descendant of an underlined element would not apply its shadow to the underline, but the underlining element, if it has shadows, would apply them to the underline of all text it underlines.

The shadow effects are applied front-to-back: the first shadow is on top. The shadows may thus overlay each other, but they never overlay the text itself. Shadow effects do not alter the size of a box, but may extend beyond its boundaries. The shadow must be painted immediately behind the element's text (in front of its background). UAs should avoid painting text shadows over text in adjacent elements belonging to the same stack level and stacking context.

The painting order of shadows defined here is the opposite of that defined in the 1998 CSS2 Recommendation.

The text-shadow property applies to both the ::first-line and ::first-letter pseudo-elements.
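The painting order described above can be illustrated with a short sketch. The selector, colors and lengths are arbitrary choices for the example.

```css
/* Selector and values chosen for illustration only. */
h1 {
  color: white;
  /* Two shadows, listed front-to-back: the sharp black shadow is painted
     on top of the larger blurred blue shadow, and both are painted
     behind the text itself. */
  text-shadow: 1px 1px 0 black,
               3px 3px 4px blue;
}

/* The 'inset' keyword is not allowed for text-shadow, so a declaration
   such as
     text-shadow: inset 1px 1px black;
   would be invalid. */
```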

10.4. Text Outlines: the ‘text-outline’ property

Name: text-outline
Value: none | [ <color> <length> <length>? | <length> <length>? <color> ]
Initial: none
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
Computed value: a color plus two absolute <length>s

This property specifies a text outline where the first length represents the outline's thickness and the second represents an optional blur radius. The outline never overlays the text itself. Its shape is the same as that obtained by applying text shadows in every radial direction, i.e. all text shadows whose offsets satisfy the equation x² + y² = thickness². The blur radius is treated the same as for ‘text-shadow’.

The Timed-Text WG had suggestions for some keywords (text-outline: normal|heavy|light;) as well as a <length> thickness. Should these be added? How would they be defined? (Maybe use (thin|medium|thick) as in border-width?)

A color value must be specified before or after the length values of the outline effect. The color value will be used as the color of the outline.

Implementations may ignore the blur radius when text outline is combined with a text shadow.
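The following sketch shows the two orderings allowed by the grammar given for this property in this draft, plus a rough approximation of an outline built from ‘text-shadow’. The class names are hypothetical, and the eight-shadow approximation is only an illustration of the circle equation, not a definition of the outline.

```css
/* Hypothetical class names, for illustration only. */

/* 2px-thick black outline, no blur; color given after the length. */
.outlined { text-outline: 2px black; }

/* Color given first; 1px thickness with a 3px blur radius. */
.glow { text-outline: yellow 1px 3px; }

/* Rough approximation of a 2px outline using eight text shadows whose
   offsets lie on the circle x² + y² = 2² (1.41² + 1.41² ≈ 4).
   A true outline corresponds to shadows in every radial direction. */
.approx {
  text-shadow:  2px 0 black, -2px 0 black,
                0 2px black,  0 -2px black,
                1.41px  1.41px black, -1.41px  1.41px black,
                1.41px -1.41px black, -1.41px -1.41px black;
}
```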

11. This section should move to CSS3 UI

11.1. Overflow Ellipsis: the ‘text-overflow’ property

Name: text-overflow
Value: clip | ellipsis | <string>
Initial: clip
Applies to: block containers
Inherited: no
Percentages: N/A
Media: visual
Computed value: as specified

This property specifies the behavior when text overflows its containing element. Values have the following meanings:

clip
Clip text as appropriate. Glyphs may be only partially rendered.
ellipsis
Render an ellipsis (U+2026) to represent clipped text.
<string>
Render the given string to represent clipped text.
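A minimal sketch of the values above follows. It assumes that the text is kept on a single line and that the box clips its overflow; this section states neither condition, so ‘white-space’ and ‘overflow’ appear here as assumptions. The class names are hypothetical.

```css
/* Hypothetical class names, for illustration only.
   Assumption: the text stays on one line and the box clips its
   overflow; neither is stated by this section. */
.title {
  width: 12em;
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;   /* clipped text is represented by U+2026 */
}

/* A custom string in place of the ellipsis character. */
.title-custom {
  width: 12em;
  white-space: nowrap;
  overflow: hidden;
  text-overflow: " [more]";
}
```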

Changes

Changes from the May 2003 CSS3 Text CR

Much of the text has been rewritten or heavily revised, so not all changes are listed here. Highlights include:

Sections relating to bidirectional and vertical text layout will be moved to a separate Writing Modes module. These features may change greatly from the last revision, but they have not been dropped. The vertical text feature, for example, will likely be based on the methods described in Unicode Technical Note #22.

The text-script property has been dropped, since it does not belong in the style layer.

Controls over kerning have been moved to the CSS Fonts Module.

The line grid properties have been removed. There is currently no plan to add them back, although a document grid feature may be added to future CSS modules.

Changes from the March 2007 CSS3 Text WD

Major changes include:

Acknowledgements

This specification would not have been possible without the help of: Ayman Aldahleh, Bert Bos, Tantek Çelik, Stephen Deach, Martin Dürst, Laurie Anna Edlund, Ben Errez, Yaniv Feinberg, Arye Gittelman, Ian Hickson, Martin Heijdra, Richard Ishida, Koji Ishii, Masayasu Ishikawa, Michael Jochimsen, Eric LeVine, Ambrose Li, Chris Lilley, Shinyu Murakami, Paul Nelson, Chris Pratley, Marcin Sawicki, Arnold Schrijver, Rahul Sonnad, Michel Suignard, Takao Suzuki, Frank Tang, Chris Thrasher, Etan Wexler, Chris Wilson, Masafumi Yabe and Steve Zilles.

Appendix A: References

Normative references

[CSS21]
Bert Bos; et al. Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification. 8 September 2009. W3C Candidate Recommendation. (Work in progress.) URL: http://www.w3.org/TR/2009/CR-CSS2-20090908
[CSS3BG]
Bert Bos; Elika J. Etemad; Brad Kemper. CSS Backgrounds and Borders Module Level 3. 12 June 2010. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2010/WD-css3-background-20100612
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[UAX11]
Asmus Freytag. East Asian Width. 23 March 2001. Unicode Standard Annex #11. URL: http://www.unicode.org/unicode/reports/tr11/tr11-8.html
[UAX14]
Asmus Freytag. Line Breaking Properties. 29 March 2005. Unicode Standard Annex #14. URL: http://www.unicode.org/unicode/reports/tr14/tr14-17.html
[UAX24]
Mark Davis. Script Names. 28 March 2005. Unicode Standard Annex #24. URL: http://www.unicode.org/unicode/reports/tr24/tr24-7.html
[UAX29]
Mark Davis. Text Boundaries. 25 March 2005. Unicode Standard Annex #29. URL: http://www.unicode.org/unicode/reports/tr29/tr29-9.html
[UNICODE]
The Unicode Consortium. The Unicode Standard. 2003. Defined by: The Unicode Standard, Version 4.0 (Boston, MA, Addison-Wesley, ISBN 0-321-18578-1), as updated from time to time by the publication of new versions. URL: http://www.unicode.org/unicode/standard/versions/enumeratedversions.html

Informative references

[JIS4051]
Formatting rules for Japanese documents (『日本語文書の組版方法』). Japanese Standards Association. 2004. JIS X 4051:2004. In Japanese
[ZHMARK]
标点符号用法 (Punctuation Mark Usage). 1995. 中华人民共和国国家标准

Appendix B: Property index

Property Values Initial Applies to Inh. Percentages Media
hanging-punctuation none | [ first || last || [ allow-end | force-end ] ] none block containers yes N/A visual
letter-spacing <spacing-limit>{1,3} normal all elements yes refers to width of space (U+0020) glyph visual
line-break auto | newspaper | normal | strict | keep-all auto all elements yes N/A visual
punctuation-trim none | [start || [ end | allow-end ] || adjacent] none all elements yes N/A visual
text-align [start | end | left | right | center | justify | match-parent ] || <string> start block containers yes N/A visual
text-align-last start | end | left | right | center | justify start block containers yes N/A visual
text-autospace none | [ ideograph-numeric || ideograph-alpha || ideograph-space || ideograph-parenthesis ] none all elements yes N/A visual
text-decoration <text-decoration-line> || <text-decoration-color> || <text-decoration-style> || blink none all elements no N/A visual
text-decoration-color <color> currentColor all elements no N/A visual
text-decoration-line none | [ underline || overline || line-through ] none all elements no (but see prose) N/A visual
text-decoration-skip none | [ images || spaces || ink || all ] images all elements yes N/A visual
text-decoration-style solid | double | dotted | dashed | wave solid all elements no N/A visual
text-emphasis ‘<text-emphasis-style>’ || ‘<text-emphasis-color>’ see individual properties all elements yes N/A visual
text-emphasis-color <color> currentcolor all elements yes N/A visual
text-emphasis-position over | under over (Issue: is this the right default?) all elements yes N/A visual
text-emphasis-style none | [ [ filled | open ] || [ dot | circle | double-circle | triangle | sesame ] ] | <string> none all elements yes N/A visual
text-indent [ <length> | <percentage> ] && [ hanging || each-line ]? 0 block containers yes refers to width of containing block visual
text-justify auto | [ trim || [ inter-word | inter-ideograph | inter-cluster | distribute | kashida ] ] auto block containers and, optionally, inline elements yes N/A visual
text-outline none | [ <color> <length> <length>? | <length> <length>? <color> ] none all elements yes N/A visual
text-overflow clip | ellipsis | <string> clip block containers no N/A visual
text-shadow none | [ <shadow>, ]* <shadow> none all elements yes N/A visual
text-transform none | capitalize | uppercase | lowercase | fullwidth | large-kana none all elements yes N/A visual
text-underline-position auto | under | alphabetic | over auto all elements yes N/A visual
text-wrap normal | unrestricted | none | suppress normal all elements yes N/A visual
white-space normal | pre | nowrap | pre-wrap | pre-line not defined for shorthand properties all elements yes N/A visual
white-space-collapsing collapse | discard | [ [preserve | preserve-breaks] && trim-inner ] collapse all elements yes N/A visual
word-break normal | break-all | hyphenate normal all elements yes N/A visual
word-spacing <spacing-limit>{1,3} normal all elements yes refers to width of space (U+0020) glyph visual
word-wrap normal | break-word normal all elements yes N/A visual

Appendix C: Default UA Stylesheet

This section is informative. It is intended to help UA developers implement a default stylesheet, but UA developers are free to ignore or change it.


/* make list items align together */
li { text-align: match-parent; }
/* disable inheritance of text-emphasis marks to ruby text:
  emphasis marks should only apply to base text */
rt { text-emphasis: none; }

:root:lang(zh-Hans) {
/* default emphasis mark position is 'under' for Chinese (Simplified) */
  text-emphasis-position: under;
}

If you find any issues, recommendations to add, or corrections, please send the information to www-style@w3.org with [css3-text] in the subject line.

Index