Abstract

This document specifies two profiles of [TTML1]: a text-only profile and an image-only profile. These profiles are intended to be used across subtitle and caption delivery applications worldwide, thereby simplifying interoperability, consistent rendering and conversion to other subtitling and captioning formats. The text profile is a superset of [ttml10-sdp-us].

This document defines extensions to [TTML1] and incorporates extensions specified in [ST2052-1] and [EBU-TT-D].

Both profiles are based on [SUBM].

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was published by the Timed Text Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-tt@w3.org (subscribe, archives) with [imsc] at the start of your email's subject. All comments are welcome.

Please see the Working Group's implementation report.

For this specification to exit the CR stage, at least 2 independent implementations of every feature defined in this specification but not already present in [TTML1] need to be documented in the implementation report. The implementation report is based on implementer-provided test results for the test suite (tests and sample content) maintained by the Working Group. The Working Group does not require that implementations are publicly available but encourages them to be so.

The Working Group expects the test suite and the implementation report to evolve significantly before the specification advances to Proposed Recommendation.

The Working Group has not identified features "at risk" for this specification.

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 August 2014 W3C Process Document.

1. Scope

This document specifies two profiles of [TTML1]: a text-only profile and an image-only profile. These profiles are intended for subtitle and caption delivery worldwide, including dialog language translation, content description, captions for deaf and hard of hearing, etc.

The text-only profile is a strict superset of [ttml10-sdp-us].

This document defines extensions to [TTML1] and incorporates extensions specified in [ST2052-1] and [EBU-TT-D].

2. Documentation Conventions

This specification uses the same conventions as [TTML1] for the specification of parameter attributes, styling attributes and metadata elements. In particular, Section 2.3 of [TTML1] specifies conventions used in the XML representation of elements.

All content of this specification that is not explicitly marked as non-normative is considered to be normative. If a section or appendix header contains the expression "non-normative", then the entirety of the section or appendix is considered non-normative.

3. Terms and Definitions

Default Region. See Section 9.3.1 at [TTML1].

Document Instance. See Section 2.2 at [TTML1].

Intermediate synchronic document. See Section 9.3.2 at [TTML1].

Presentation processor. See Section 2.2 at [TTML1].

Transformation processor. See Section 2.2 at [TTML1].

Related Media Object. See Section 2.2 at [TTML1].

4. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, SHALL, SHALL NOT, and SHOULD are to be interpreted as described in [RFC2119].

A Document Instance that conforms to a profile defined herein SHALL satisfy all normative provisions specified by the profile.

A presentation processor that conforms to a profile defined in this specification SHALL:

A transformation processor that conforms to a profile defined in this specification SHALL:

Note

The use of the term presentation processor (transformation processor) within this specification does not imply conformance to the DFXP Presentation Profile (DFXP Transformation Profile) specified in [TTML1]. In other words, it is not considered an error for a presentation processor (transformation processor) to conform to a profile defined in this specification without also conforming to the DFXP Presentation Profile (DFXP Transformation Profile).

Note

This specification does not specify presentation processor or transformation processor behavior when processing or transforming a non-conformant Document Instance.

5. Profiles

5.1 General

A Document Instance SHALL NOT conform to the Text Profile and Image Profile simultaneously.

In applications that require subtitle/caption content in image form to be simultaneously available in text form, two distinct Document Instances, one conforming to the Text Profile and the other conforming to the Image Profile, SHOULD be offered. In addition, the Text Profile Document Instance SHOULD be associated with the Image Profile Document Instance such that, when image content is encountered, assistive technologies have access to its corresponding text form. The method by which this association is made is left to each application.

Note

The ittm:altText element specified in 6.7.4 ittm:altText also allows a text equivalent string to be associated with an image, e.g. to support indexing of the content and to facilitate quality checking of the document during authoring.

Annex D. WCAG Considerations specifically discusses this specification in the context of the [WCAG20] guidelines.

5.2 Text Profile

The Text Profile consists of Sections 6. Common Constraints and 7. Text Profile Constraints.

5.3 Image Profile

The Image Profile consists of Sections 6. Common Constraints and 8. Image Profile Constraints.

6. Common Constraints

6.1 Document Encoding

A Document Instance SHALL use UTF-8 character encoding as specified in [UNICODE].

6.2 Foreign Element and Attributes

A Document Instance MAY contain elements and attributes that are neither specifically permitted nor forbidden by a profile.

6.3 Namespaces

The following namespaces (see [xml-names]) are used in this specification:

Name Prefix Value Defining Specification
XML xml http://www.w3.org/XML/1998/namespace [xml-names]
TT Parameter ttp http://www.w3.org/ns/ttml#parameter [TTML1]
TT Styling tts http://www.w3.org/ns/ttml#styling [TTML1]
TT Feature none http://www.w3.org/ns/ttml/feature/ [TTML1]
SMPTE-TT Extension smpte http://www.smpte-ra.org/schemas/2052-1/2010/smpte-tt [ST2052-1]
EBU-TT Styling ebutts urn:ebu:tt:style [EBU-TT-D]
IMSC 1.0 Styling itts http://www.w3.org/ns/ttml/profile/imsc1#styling This specification
IMSC 1.0 Parameter ittp http://www.w3.org/ns/ttml/profile/imsc1#parameter This specification
IMSC 1.0 Metadata ittm http://www.w3.org/ns/ttml/profile/imsc1#metadata This specification
IMSC 1.0 Extension none http://www.w3.org/ns/ttml/profile/imsc1/extension/ This specification
IMSC 1.0 Text Profile Designator none http://www.w3.org/ns/ttml/profile/imsc1/text This specification
IMSC 1.0 Image Profile Designator none http://www.w3.org/ns/ttml/profile/imsc1/image This specification

The namespace prefix values defined above are for convenience and document instances MAY use any prefix value that conforms to [xml-names].

The namespaces defined by this specification are mutable [namespaceState]; all undefined names in these namespaces are reserved for future standardization by the W3C.

6.4 Overflow

A Document Instance SHOULD be authored assuming strict clipping of content that falls out of region areas, regardless of the computed value of tts:overflow for the region.

Note

As specified in [TTML1], tts:overflow has no effect on the extent of the region, and hence the total normalized drawing area S(En) at 9.3 Paint Regions.

6.6 Synchronization

Each intermediate synchronic document of the Document Instance is intended to be displayed on a specific frame and removed on a specific frame of the related video object.

When mapping a media time expression M to a frame F of a related video object, e.g. for the purpose of rendering a Document Instance onto the related video object, the presentation processor SHALL map M to the frame F with the presentation time that is the closest to, but not less than, M.
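The mapping above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name media_time_to_frame is hypothetical, and the sketch assumes that frame k of the related video object has presentation time k / frame_rate seconds, with frame 0 presented at time 0.

```python
from fractions import Fraction
import math


def media_time_to_frame(m: Fraction, frame_rate: Fraction) -> int:
    """Return the index of the frame whose presentation time is the
    closest to, but not less than, the media time m (in seconds).

    Assumption (not from the specification): frame k is presented at
    k / frame_rate seconds, starting with frame 0 at time 0.
    """
    if m < 0:
        raise ValueError("media time must be non-negative")
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    # The smallest k with k / frame_rate >= m is ceil(m * frame_rate).
    # Exact rational arithmetic avoids floating-point rounding at frame edges.
    return math.ceil(m * frame_rate)
```

Exact fractions are used so that a media time falling exactly on a frame boundary maps to that frame rather than to the next one.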

Note

In a typical scenario, the same video program (the related video object) will be used for Document Instance authoring, delivery and user playback. The mapping from media time expression to related video object above allows the author to precisely associate subtitle content with video frames, e.g. around scene transitions. In circumstances where the video program is downsampled during delivery, the application can specify that, at playback, the related video object be considered the delivered video program upsampled to its original rate, thereby allowing subtitle content to be rendered at the same temporal locations at which it was authored.

6.7 Extensions

6.7.1 ittp:aspectRatio

The ittp:aspectRatio attribute allows authorial control of the mapping of the root container of a Document Instance to the related video object frame.

If present, the ittp:aspectRatio attribute SHALL conform to the following syntax:

ittp:aspectRatio
  : numerator denominator          // with int(numerator) != 0 and int(denominator) != 0
                                   // where int(s) parses string s as a decimal integer.
        
numerator | denominator
  : <digit>+

The root container of a Document Instance SHALL be mapped to the related video object frame according to the following:

  1. If ittp:aspectRatio is present, the root container SHALL be mapped to a rectangular area within the related video object such that:

    1. the ratio of the width to the height of the rectangular area is equal to ittp:aspectRatio,
    2. the center of the rectangular area is collocated with the center of the related video object frame,
    3. the rectangular area (including its boundary) is entirely within the related video object frame (including its boundary), and
    4. the rectangular area has a height or width equal to that of the related video object frame.
  2. Otherwise, the root container of a Document Instance SHALL be mapped to the related video object frame in its entirety.
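The syntax and mapping rules above can be illustrated with the following sketch. The function names parse_aspect_ratio and map_root_container are hypothetical, and the sketch assumes that the frame dimensions are expressed in units with a square aspect (so that width divided by height is the display aspect ratio of the frame).

```python
from fractions import Fraction


def parse_aspect_ratio(value: str) -> Fraction:
    """Parse an ittp:aspectRatio value such as "4 3" into a ratio."""
    parts = value.split()
    if len(parts) != 2 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("expected two decimal integers: %r" % value)
    numerator, denominator = int(parts[0]), int(parts[1])
    if numerator == 0 or denominator == 0:
        raise ValueError("numerator and denominator must be non-zero")
    return Fraction(numerator, denominator)


def map_root_container(frame_w: int, frame_h: int, aspect=None):
    """Return (x, y, width, height) of the root container within the frame.

    Assumption: frame_w / frame_h is the display aspect ratio of the frame.
    """
    if aspect is None:
        # Rule 2: the root container covers the frame in its entirety.
        return (Fraction(0), Fraction(0), Fraction(frame_w), Fraction(frame_h))
    if aspect >= Fraction(frame_w, frame_h):
        # Wider than the frame: full width, reduced height.
        width = Fraction(frame_w)
        height = width / aspect
    else:
        # Narrower than the frame: full height, reduced width.
        height = Fraction(frame_h)
        width = height * aspect
    # Rule 1.2: centers are collocated.
    return ((frame_w - width) / 2, (frame_h - height) / 2, width, height)
```

For instance, under these assumptions an ittp:aspectRatio of "4 3" on a 1920 by 1080 frame yields a 1440 by 1080 root container offset 240 units from the left edge.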

An ittp:aspectRatio attribute is considered to be significant only when specified on the tt element.

Example 2
<tt
  xmlns="http://www.w3.org/ns/ttml"
  xmlns:ttm="http://www.w3.org/ns/ttml#metadata" 
  xmlns:tts="http://www.w3.org/ns/ttml#styling"
  xmlns:ttp="http://www.w3.org/ns/ttml#parameter" 
  xmlns:ittp="http://www.w3.org/ns/ttml/profile/imsc1#parameter"
  ittp:aspectRatio="4 3"
 >
 ...
</tt>

6.7.2 ittp:progressivelyDecodable

A progressively decodable Document Instance is structured to facilitate presentation before the document is received in its entirety, and can be identified using the ittp:progressivelyDecodable attribute.

A progressively decodable Document Instance is a Document Instance that conforms to the following:

  1. no attribute or element of the TTML timing vocabulary is present within the head element;
  2. given two intermediate synchronic documents A and B of the Document Instance, with start times TA and TB, respectively, TA is not greater than TB if A includes a p element that occurs earlier in the document than any p element that B includes;
  3. no attribute of the TTML timing vocabulary is present on a descendant element of p; and
  4. no element E1 explicitly references another element E2 where the opening tag of E2 occurs after the opening tag of E1.
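As a partial illustration, conditions 1 and 3 above can be checked mechanically. The sketch below is not a complete conformance test (conditions 2 and 4 are not covered), the function name timing_placement_problems is hypothetical, and it assumes that the TTML timing vocabulary consists of the unqualified attributes begin, end, dur and timeContainer.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"
# Assumption: these attributes make up the TTML timing vocabulary.
TIMING_ATTRS = {"begin", "end", "dur", "timeContainer"}


def has_timing_attr(elem) -> bool:
    """True if the element carries any (unqualified) timing attribute."""
    return any(name in TIMING_ATTRS for name in elem.attrib)


def timing_placement_problems(doc_xml: str):
    """Return a list of violations of conditions 1 and 3 only."""
    root = ET.fromstring(doc_xml)
    problems = []
    # Condition 1: no timing vocabulary within the head element.
    head = root.find("{%s}head" % TT_NS)
    if head is not None:
        for elem in head.iter():
            if has_timing_attr(elem):
                problems.append("timing attribute within head: " + elem.tag)
    # Condition 3: no timing attribute on a descendant of p.
    for p in root.iter("{%s}p" % TT_NS):
        for desc in p.iter():
            if desc is not p and has_timing_attr(desc):
                problems.append("timing attribute on descendant of p: " + desc.tag)
    return problems
```

An empty list means only that these two conditions hold; it does not establish that the Document Instance is progressively decodable.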

If present, the ittp:progressivelyDecodable attribute SHALL conform to the following syntax:

ittp:progressivelyDecodable
  : "true"
  | "false"

An ittp:progressivelyDecodable attribute is considered to be significant only when specified on the tt element.

If not specified, the value of ittp:progressivelyDecodable SHALL be considered to be equal to "false".

A Document Instance for which the computed value of ittp:progressivelyDecodable is "true" SHALL be a progressively decodable Document Instance.

A Document Instance for which the computed value of ittp:progressivelyDecodable is "false" is neither asserted to be a progressively decodable Document Instance nor asserted not to be a progressively decodable Document Instance.

Example 3
<tt
  xmlns="http://www.w3.org/ns/ttml"
  xmlns:ttm="http://www.w3.org/ns/ttml#metadata" 
  xmlns:tts="http://www.w3.org/ns/ttml#styling"
  xmlns:ttp="http://www.w3.org/ns/ttml#parameter" 
  xmlns:ittp="http://www.w3.org/ns/ttml/profile/imsc1#parameter"
  ittp:progressivelyDecodable="true"
 >
 ...
</tt>
Note

[TTML1] specifies explicit referencing of elements identified using xml:id in the following circumstances:

  • an element in body referencing region elements. In this case, Requirement 4 above is always satisfied.
  • an element in body referencing style elements. In this case, Requirement 4 above is always satisfied.
  • a region element referencing style elements. In this case, Requirement 4 above is always satisfied.
  • a style element referencing other style elements. In this case, Requirement 4 provides an optimization of style element ordering within the head element.
  • a ttm:actor element referencing a ttm:agent element. In this case, Requirement 4 provides optimization of metadata elements ordering within the document.
  • a content element referencing ttm:agent elements using the ttm:agent attribute. In this case, Requirement 4 provides optimization of metadata elements ordering within the document.

6.7.3 itts:forcedDisplay

itts:forcedDisplay allows the processor to override the computed value of the tts:visibility attribute in conjunction with an application parameter displayForcedOnlyMode.

If and only if the value of displayForcedOnlyMode is "true", a content element with an itts:forcedDisplay computed value of "false" SHALL NOT produce any visible rendering, while still affecting layout, regardless of the computed value of tts:visibility.

The itts:forcedDisplay attribute SHALL conform to the following:

Values: false | true
Initial: false
Applies to: body, div, p, region, span
Inherited: yes
Percentages: N/A
Animatable: discrete

Annex C. Forced content (non-normative) illustrates the use of itts:forcedDisplay in an application in which a single document contains both hard of hearing captions and translated foreign language subtitles, using itts:forcedDisplay to display translation subtitles always, independently of whether the hard of hearing captions are displayed or hidden.

The presentation processor SHALL accept an optional boolean parameter called displayForcedOnlyMode, whose value MAY be set by a context external to the presentation processor. If not set, the value of displayForcedOnlyMode SHALL be assumed to be equal to "false".

The algorithm for setting the displayForcedOnlyMode parameter based on the circumstances under which the Document Instance is presented is left to the application.
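The interaction between itts:forcedDisplay, displayForcedOnlyMode and tts:visibility described in this section can be summarized by the sketch below. The function name and its parameters are hypothetical, and the inputs are assumed to be computed values already resolved for the content element.

```python
def is_rendered_visibly(visibility: str,
                        forced_display: bool,
                        display_forced_only_mode: bool = False) -> bool:
    """Decide whether a content element produces visible rendering.

    visibility: computed value of tts:visibility ("visible" or "hidden").
    forced_display: computed value of itts:forcedDisplay.
    display_forced_only_mode: application parameter, "false" when not set.
    """
    if display_forced_only_mode and not forced_display:
        # Non-forced content is suppressed regardless of tts:visibility.
        # It still takes part in layout, which this sketch does not model.
        return False
    # Otherwise tts:visibility applies as usual.
    return visibility == "visible"
```

Note that itts:forcedDisplay never makes hidden content visible: it only suppresses non-forced content when displayForcedOnlyMode is "true".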

Example 4
...
<head>
	...
	<region xml:id="r1" tts:origin="10% 2%" tts:extent="80% 10%" tts:color="white" itts:forcedDisplay="true" tts:backgroundColor="black"/>
	<region xml:id="r2" tts:origin="10% 80%" tts:extent="80% 10%" tts:color="white" tts:backgroundColor="black"/>
	...
</head>
...
<div>
	 <p region="r1" begin="1s" end="6s">Lycée</p>
		
	 <!-- the following will not appear if displayForcedOnlyMode='true' -->
	 <p region="r2" begin="4s" end="6s">Nous étions inscrits au même lycée.</p>
</div>
...
Note

As specified in [TTML1], the background of a region can be visible even if the computed value of tts:visibility equals "hidden" for all active content within. The background of a region for which itts:forcedDisplay equals "true" can therefore remain visible even if itts:forcedDisplay equals "false" for all active content elements within the region and displayForcedOnlyMode equals "true". Authors can avoid this situation, for instance, by ensuring that content elements and the regions that they are flowed into always have the same value of itts:forcedDisplay.

Note

Although itts:forcedDisplay, like all the TTML style attributes, has no defined semantics on a br content element, itts:forcedDisplay will apply to a br content element if it is either defined on an ancestor content element of the br content element or it is applied to a region element corresponding to a region that the br content element is being flowed into.

Note

It is expected that the functionality of itts:forcedDisplay will be mapped to a conditional style construct in a future revision of this specification.

6.7.4 ittm:altText

ittm:altText allows an author to provide a text string equivalent for an element, typically an image. This text equivalent MAY be used to support indexing of the content and also facilitate quality checking of the document during authoring.

The ittm:altText element SHALL conform to the following syntax:

<ittm:altText
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in the default namespace, any TT namespace or any IMSC 1.0 namespace}>
  Content: #PCDATA
</ittm:altText>

The ittm:altText element SHALL be a child of the metadata element.

8. Image Profile Constraints specifies the use of the ittm:altText element with images.

Example 5
...
<div region="r1" begin="1s" end="6s" smpte:backgroundImage="1.png">
  <metadata>
  <ittm:altText>Nous étions inscrits au même lycée.</ittm:altText>
  </metadata>
</div>
...
Note

In contrast to the common use of alt attributes in [HTML5], the content of the ittm:altText element is not intended to be displayed in place of the element if the element is not loaded. The content of the ittm:altText element can however be read and used by assistive technologies.

6.8 Region

6.8.1 Presented Region

A presented region is a temporally active region that satisfies the following conditions:

  1. the computed value of tts:opacity is not equal to "0.0"; and
  2. the computed value of tts:display is not "none"; and
  3. the computed value of tts:visibility is not "hidden"; and
  4. either (a) content is selected into the region or (b) the computed value of tts:showBackground is equal to "always" and the computed value of tts:backgroundColor has non-transparent alpha.
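The four conditions above can be written as a single predicate. In the sketch below, the RegionState class and its field names are hypothetical stand-ins for the computed style values of a region at a given time, and "non-transparent alpha" is assumed to mean an alpha component greater than zero.

```python
from dataclasses import dataclass


@dataclass
class RegionState:
    """Hypothetical snapshot of a region at a given time."""
    temporally_active: bool
    opacity: float              # computed tts:opacity
    display: str                # computed tts:display
    visibility: str             # computed tts:visibility
    has_selected_content: bool  # content is selected into the region
    show_background: str        # computed tts:showBackground
    background_alpha: int       # alpha of tts:backgroundColor, 0 to 255


def is_presented(r: RegionState) -> bool:
    """Apply conditions 1 to 4 to a temporally active region."""
    return (
        r.temporally_active
        and r.opacity != 0.0
        and r.display != "none"
        and r.visibility != "hidden"
        and (
            r.has_selected_content
            or (r.show_background == "always" and r.background_alpha > 0)
        )
    )
```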

6.8.2 Dimensions and Position

A region SHALL NOT extend beyond the root container, i.e. the intersection of the set of coordinates belonging to a region (including its boundary) and the set of coordinates belonging to the root container (including its boundary) is the set of coordinates belonging to the region (including its boundary).

No two presented regions in a given intermediate synchronic document SHALL overlap, i.e. the intersection of the sets of coordinates within each region (including its boundary) is empty.

6.8.3 Maximum number

The number of presented regions in a given intermediate synchronic document SHALL NOT be greater than 4.
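The geometric constraints of 6.8.2 and the limit of 6.8.3 can be checked together, as in the following sketch. It is an illustration under stated assumptions: regions are hypothetical (x, y, width, height) tuples expressed as fractions of the root container, and regions are treated as closed sets, following the literal "including its boundary" wording, so two regions that merely share an edge are reported as overlapping.

```python
from fractions import Fraction
from itertools import combinations

MAX_PRESENTED_REGIONS = 4


def within_root(region) -> bool:
    """True if the region lies entirely inside the unit root container."""
    x, y, w, h = region
    return x >= 0 and y >= 0 and x + w <= 1 and y + h <= 1


def overlaps(a, b) -> bool:
    """True if two closed rectangles share at least one coordinate."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax <= bx + bw and bx <= ax + aw and ay <= by + bh and by <= ay + ah


def region_problems(presented):
    """Check the presented regions of one intermediate synchronic document."""
    problems = []
    if len(presented) > MAX_PRESENTED_REGIONS:
        problems.append("more than %d presented regions" % MAX_PRESENTED_REGIONS)
    for i, region in enumerate(presented):
        if not within_root(region):
            problems.append("region %d extends beyond the root container" % i)
    for (i, a), (j, b) in combinations(enumerate(presented), 2):
        if overlaps(a, b):
            problems.append("regions %d and %d overlap" % (i, j))
    return problems
```

The containment rule applies to all regions, not only presented ones; the sketch applies it to whichever regions it is given.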

6.9 Hypothetical Render Model

Any sequence of consecutive intermediate synchronic documents SHALL be reproducible without error by the Hypothetical Render Model specified in Section 9. Hypothetical Render Model.

6.10 Features

Unless specified otherwise, a Document Instance SHALL conform to the following:

Feature Provisions
Relative to the TT Feature namespace
#animation MAY be used.
#cellResolution MAY be used.
#clockMode SHALL NOT be used.
#content MAY be used.
#core MAY be used.
#display-block MAY be used.
#display-inline MAY be used.
#display-region MAY be used.
#display MAY be used.
#dropMode SHALL NOT be used.
#extent-region MAY be used, with the following additional constraints:
  • The tts:extent attribute SHALL be present on all region elements.
#extent-root MAY be used, with the following additional constraints:
  • If the document includes any length value that uses the px expression, tts:extent SHALL be present on the tt element.
#extent MAY be used.
#frameRate MAY be used, with the following additional constraints:
  • If the document includes any clock time expression that uses the frames term or any offset time expression that uses the f metric, the ttp:frameRate attribute SHALL be present on the tt element.
#frameRateMultiplier MAY be used.
#layout MAY be used.
#length-cell SHALL NOT be used other than to specify the value of ebutts:linePadding.
#length-integer MAY be used.
#length-negative SHALL NOT be used.
#length-percentage MAY be used.
#length-pixel MAY be used.
#length-positive MAY be used.
#length-real MAY be used.
#length MAY be used.
#markerMode SHALL NOT be used.
#metadata MAY be used.
#opacity MAY be used.
#origin MAY be used.
#overflow MAY be used.
#pixelAspectRatio SHALL NOT be used.
#presentation MAY be used.
#profile MAY be used.
#showBackground MAY be used.
#structure MAY be used.
#styling-chained MAY be used.
#styling-inheritance-content MAY be used.
#styling-inheritance-region MAY be used.
#styling-inline MAY be used.
#styling-nested MAY be used.
#styling-referential MAY be used.
#styling MAY be used.
#subFrameRate SHALL NOT be used.
#tickRate MAY be used, with the following additional constraints:
  • ttp:tickRate SHALL be present on the tt element if the document contains any time expression that uses the t metric.
#timeBase-clock SHALL NOT be used.
#timeBase-media SHALL be used. NOTE: [TTML1] specifies that the default timebase is "media" if ttp:timeBase is not specified on tt.
#timeBase-smpte SHALL NOT be used.
#time-clock-with-frames MAY be used.
#time-clock MAY be used.
#time-offset-with-frames MAY be used.
#time-offset-with-ticks MAY be used.
#time-offset MAY be used.
#timeContainer MAY be used.
#timing MAY be used, with the following additional constraints:
  • all time expressions within a Document Instance SHOULD use the same syntax, either clock-time or offset-time; and
  • for any content element that contains br elements or text nodes or a smpte:backgroundImage attribute, the begin and end attributes SHOULD be specified on the content element or at least one of its ancestors.
#transformation MAY be used.
#unicodeBidi MAY be used.
#visibility-block MAY be used.
#visibility-inline MAY be used.
#visibility-region MAY be used.
#visibility MAY be used.
#writingMode-horizontal-lr MAY be used.
#writingMode-horizontal-rl MAY be used.
#writingMode-horizontal MAY be used.
#writingMode MAY be used.
#zIndex MAY be used.
Extension Provisions
Relative to the IMSC 1.0 Extension namespace
#aspectRatio MAY be used.
#forcedDisplay MAY be used.
#progressivelyDecodable MAY be used.
#altText MAY be used.
Note

As specified in [TTML1], a #time-offset-with-frames expression is translated to a media time M according to M = 3600 · hours + 60 · minutes + seconds + (frames ÷ (ttp:frameRateMultiplier · ttp:frameRate)).
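The formula in the note above can be evaluated exactly with rational arithmetic, as in the sketch below. The function name is hypothetical, and ttp:frameRateMultiplier is assumed to have already been converted from its "numerator denominator" form into a single ratio.

```python
from fractions import Fraction


def frames_expression_to_media_time(hours: int, minutes: int, seconds: int,
                                    frames: int, frame_rate: int,
                                    frame_rate_multiplier=Fraction(1)) -> Fraction:
    """Compute M = 3600*hours + 60*minutes + seconds
    + frames / (frameRateMultiplier * frameRate), in seconds."""
    effective_rate = Fraction(frame_rate) * Fraction(frame_rate_multiplier)
    if effective_rate <= 0:
        raise ValueError("effective frame rate must be positive")
    return 3600 * hours + 60 * minutes + seconds + Fraction(frames) / effective_rate
```

For example, with ttp:frameRate="30" and ttp:frameRateMultiplier="1000 1001", 15 frames past the one-second mark correspond to 1 + 15 × 1001 / 30000 seconds, i.e. 3001/2000 s.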

7. Text Profile Constraints

7.1 Profile Designator

This profile is associated with the following profile designator:

Profile Name Profile Designator
IMSC 1.0 Text http://www.w3.org/ns/ttml/profile/imsc1/text
Note

As specified in 6.10 Features, the presence of the ttp:profile attribute is not required by this profile. The profile designator specified above is intended to be generally used to signal conformance of a Document Instance to the profile. The details of such signaling depend on the application, and can, for instance, use metadata structures out-of-band of the Document Instance.

7.3 Reference Fonts

The flow of text within a region depends on the dimensions of individual glyphs and the spacing (kerning) between them. The following allows, for instance, region extents to be set such that text flows without clipping.

When processing glyphs that match the combinations of computed font family and code point listed in A. Reference Fonts, e.g. during layout, a presentation processor or transformation processor SHALL use glyph metrics equal to the metrics of the specified reference font, unless the glyph is not defined by the reference font.

Note

Implementations can use fonts other than those specified in A. Reference Fonts. Two fonts with equal metrics can have a different appearance, but flow identically.

7.4 Features

The Document Instance SHALL conform to the following table:

Feature Provisions
Relative to the TT Feature namespace
#backgroundColor-block MAY be used.
#backgroundColor-inline MAY be used.
#backgroundColor-region MAY be used.
#backgroundColor MAY be used.
#bidi MAY be used.
#color MAY be used, with the following additional constraints:
  • The initial value of tts:color SHALL be "white". NOTE: This is consistent with [ST2052-1].
#direction MAY be used.
#displayAlign MAY be used.
#extent-region The tts:extent attribute when applied to a region element SHALL use px units or "percentage" representation, and SHALL NOT use em units.
#fontFamily-generic MAY be used, with the following additional constraints:
  • A tts:fontFamily of either "monospaceSerif" or "proportionalSansSerif" SHOULD be specified for all presented text content. A tts:fontFamily of "default" SHALL be equivalent to "monospaceSerif".
#fontFamily-non-generic MAY be used.
#fontFamily MAY be used.
#fontSize-anamorphic SHALL NOT be used.
#fontSize-isomorphic MAY be used.
#fontSize MAY be used.
#fontStyle-italic MAY be used.
#fontStyle-oblique MAY be used.
#fontStyle MAY be used.
#fontWeight-bold MAY be used.
#fontWeight MAY be used.
#length-em MAY be used.
#lineBreak-uax14 MAY be used.
#lineHeight MAY be used, with the following additional constraints:
  • An explicit <length> SHOULD be specified as there is no uniform implementation of the "normal" value at the time of this writing.
#nested-div MAY be used.
#nested-span MAY be used.
#origin The tts:origin attribute SHALL use px units or "percentage" representation, and SHALL NOT use em units.
#padding-1 MAY be used.
#padding-2 MAY be used.
#padding-3 MAY be used.
#padding-4 MAY be used.
#padding MAY be used.
#textAlign-absolute MAY be used.
#textAlign-relative MAY be used.
#textAlign MAY be used.
#textDecoration-over MAY be used.
#textDecoration-through MAY be used.
#textDecoration-under MAY be used.
#textDecoration MAY be used.
#textOutline-blurred SHALL NOT be used.
#textOutline-unblurred MAY be used.
#textOutline MAY be used, with the following additional constraints:
  • If specified, the border thickness SHALL be 10% or less of the associated font size.
#wrapOption MAY be used.
#writingMode-vertical MAY be used.
Extension Provisions
Relative to the SMPTE-TT Extension Namespace
#image SHALL NOT be used.
Relative to the IMSC 1.0 Extension namespace
#linePadding MAY be used, with the addition of the following provisions.

If used, the attribute ebutts:linePadding:
  • MAY be specified on elements region, body, div and p in addition to style; and
  • SHALL NOT apply to elements other than region, body, div and p.
#multiRowAlign MAY be used, with the addition of the following provisions.

If used, the attribute ebutts:multiRowAlign:
  • MAY be specified on elements region, body, div and p in addition to style;
  • SHALL NOT apply to elements other than p; and
  • SHALL be inherited.
Note

In contrast to this specification, [EBU-TT-D] specifies that the attributes ebutts:linePadding and ebutts:multiRowAlign are allowed only on the style element.

8. Image Profile Constraints

8.1 Profile Designator

This profile is associated with the following profile designator:

Profile Name Profile Designator
IMSC 1.0 Image http://www.w3.org/ns/ttml/profile/imsc1/image
Note

As specified in 6.10 Features, the presence of the ttp:profile attribute is not required by this profile. The profile designator specified above is intended to be generally used to signal conformance of a Document Instance to the profile. The details of such signaling depend on the application, and can, for instance, use metadata structures out-of-band of the Document Instance.

8.2 Presented Image

8.2.1 Definition

A presented image is a div element with a smpte:backgroundImage attribute that does not extend beyond a presented region.

8.2.2 Number per Region

In a given intermediate synchronic document, there SHALL be at most one presented image per presented region.

8.3 div element

If a smpte:backgroundImage attribute is applied to a div element:

Note

In [TTML1], tts:extent and tts:origin do not apply to div elements. In order to individually position multiple div elements, each div can be associated with a distinct region with the desired tts:extent and tts:origin.

8.4 Features

The features included in a Document Instance SHALL conform to the table below:

Feature Provisions
Relative to the TT Feature namespace
#bidi SHALL NOT be used.
#color SHALL NOT be used.
#content The p, span and br elements SHALL NOT be present.
#direction SHALL NOT be used.
#displayAlign SHALL NOT be used.
#fontFamily SHALL NOT be used.
#fontSize SHALL NOT be used.
#fontStyle SHALL NOT be used.
#fontWeight SHALL NOT be used.
#length-em SHALL NOT be used.
#lineBreak-uax14 SHALL NOT be used.
#lineHeight SHALL NOT be used.
#nested-div SHALL NOT be used.
#nested-span SHALL NOT be used.
#padding SHALL NOT be used.
#textAlign SHALL NOT be used.
#textDecoration SHALL NOT be used.
#textOutline SHALL NOT be used.
#wrapOption SHALL NOT be used.
#writingMode-vertical SHALL NOT be used.
Extension Provisions
Relative to the SMPTE-TT Extension namespace
#image smpte:backgroundImage MAY be used.
smpte:backgroundImageHorizontal and smpte:backgroundImageVertical SHALL NOT be used.
smpte:image SHALL NOT be used.

9. Hypothetical Render Model

9.1 Overview

This Section specifies the Hypothetical Render Model illustrated in Fig. 1 Hypothetical Render Model.

The purpose of the model is to limit Document Instance complexity. It is not intended as a specification of the processing requirements for implementations. For instance, while the model defines a glyph buffer for the purpose of limiting the number of glyphs displayed at any given point in time, it neither requires the implementation of such a buffer, nor models the sub-pixel character positioning and anti-aliased glyph rendering that can be used to produce text output.

Fig. 1 Hypothetical Render Model

The model operates on successive intermediate synchronic documents obtained from an input Document Instance, and uses a simple double buffering model: while an intermediate synchronic document En is being painted into Presentation Buffer Pn (the "back buffer" of the model), the previous intermediate synchronic document En-1 is available for display in Presentation Buffer Pn-1 (the "front buffer" of the model).

The model specifies a (hypothetical) time required for completely painting an intermediate synchronic document as a proxy for complexity. Painting includes drawing region backgrounds, rendering and copying glyphs, and decoding and copying images. Complexity is then limited by requiring that painting of intermediate synchronic document En completes before the end of intermediate synchronic document En-1.

Whenever applicable, constraints are specified relative to root container dimensions, allowing subtitle sequences to be authored independently of related video object resolution.

To enable scenarios where the same glyphs are used in multiple successive intermediate synchronic documents, e.g. to convey a CEA-608/708-style roll-up (see [CEA-608] and [CEA-708]), the Glyph Buffers Gn and Gn-1 store rendered glyphs across intermediate synchronic documents, allowing glyphs to be copied into the Presentation Buffer instead of rendered, a more costly operation.

Similarly, Decoded Image Buffers Dn and Dn-1 store decoded images across intermediate synchronic documents, allowing images to be copied into the Presentation Buffer instead of decoded.

9.2 General

The Presentation Compositor SHALL render in Presentation Buffer Pn each successive intermediate synchronic document En using the following steps in order:

  1. clear the pixels, except for the first intermediate synchronic document E0, for which the pixels of P0 SHALL be assumed to have been cleared;
  2. paint, according to stacking order, all background pixels for each region;
  3. paint all pixels for background colors associated with text or image subtitle content; and
  4. paint the text or image subtitle content.

The Presentation Compositor SHALL start rendering En:

The duration DUR(En) for painting an intermediate synchronic document En in the Presentation Buffer Pn SHALL be:

DUR(En) = S(En) / BDraw + DURT(En) + DURI(En)

where

The contents of the Presentation Buffer Pn SHALL be transferred instantaneously to Presentation Buffer Pn-1 at the presentation time of intermediate synchronic document En, making the latter available for display.

Note

It is possible for the contents of Presentation Buffer Pn-1 to never be displayed. This can happen if Presentation Buffer Pn is copied twice to Presentation Buffer Pn-1 between two consecutive video frame boundaries of the related video object.

It SHALL be an error for the Presentation Compositor to fail to complete painting pixels for En before the presentation time of En.

Unless specified otherwise, the following table SHALL specify values for IPD and BDraw.

Parameter | Initial value
Initial Painting Delay (IPD) | 1 s
Normalized background drawing performance factor (BDraw) | 12 s⁻¹

Note

BDraw effectively sets a limit on filling regions. For example, assuming that the root container is ultimately rendered at 1920×1080 resolution, a BDraw of 12 s⁻¹ would correspond to a fill rate of 1920 × 1080 × 12 s⁻¹ = 23.7 × 2^20 pixels s⁻¹.
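The arithmetic in this note can be reproduced with a short sketch. The function name and its parameters are illustrative assumptions and do not appear in this specification; only the value of BDraw and the 1920×1080 example come from the text above.

```python
# Fill rate implied by BDraw for a given rendered resolution (illustrative only).
BDRAW = 12  # normalized background drawing performance factor, in s^-1


def fill_rate_pixels_per_second(width_px, height_px, bdraw=BDRAW):
    """Return the fill rate in pixels per second.

    The root container area in pixels is multiplied by BDraw, i.e. the
    number of times per second the whole root container can be filled.
    """
    return width_px * height_px * bdraw


rate = fill_rate_pixels_per_second(1920, 1080)
print(rate)                    # 24883200
print(round(rate / 2**20, 1))  # 23.7
```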

Note

IPD effectively sets a limit on the complexity of any given intermediate synchronic document.
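As a minimal sketch of the painting budget described in this section, the following computes DUR(En) from its three terms and checks it against the presentation time of En. The names PaintCost, dur and completes_in_time are hypothetical. The rendering start time is passed in by the caller rather than derived here, and treating completion exactly at the presentation time as acceptable is an assumption of the sketch.

```python
from dataclasses import dataclass

BDRAW = 12.0  # s^-1, initial value of BDraw


@dataclass
class PaintCost:
    s: float      # S(En), total normalized drawing area
    dur_t: float  # DURT(En), seconds spent painting text
    dur_i: float  # DURI(En), seconds spent painting images


def dur(cost, bdraw=BDRAW):
    """DUR(En) = S(En) / BDraw + DURT(En) + DURI(En), in seconds."""
    return cost.s / bdraw + cost.dur_t + cost.dur_i


def completes_in_time(render_start, cost, presentation_time):
    """True if painting En finishes no later than its presentation time.

    render_start (seconds) is a hypothetical input supplied by the caller.
    """
    return render_start + dur(cost) <= presentation_time


# One region background over a cleared root container, plus some text.
cost = PaintCost(s=1.08, dur_t=0.05, dur_i=0.0)
print(round(dur(cost), 3))                 # 0.14
print(completes_in_time(2.0, cost, 3.0))   # True
print(completes_in_time(2.9, cost, 3.0))   # False
```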

9.3 Paint Regions

The total normalized drawing area S(En) for intermediate synchronic document En SHALL be

S(En) = CLEAR(En) + PAINT(En)

where CLEAR(E0) = 0 and CLEAR(En | n > 0) = 1, i.e. the root container in its entirety.

Note

To ensure consistency of the Presentation Buffer, a new intermediate synchronic document requires clearing of the root container.

PAINT(En) SHALL be the normalized area to be painted for all regions that are used in intermediate synchronic document En according to:

PAINT(En) = ∑Ri∈Rp NSIZE(Ri) ∙ NBG(Ri)

where Rp SHALL be the set of presented regions in the intermediate synchronic document En.

NSIZE(Ri) SHALL be given by:

NSIZE(Ri) = (width of Ri ∙ height of Ri) ÷ (root container height ∙ root container width)

NBG(Ri) SHALL be the total number of tts:backgroundColor attributes associated with the given region Ri in the intermediate synchronic document. A tts:backgroundColor attribute is associated with a region when it is explicitly specified (either as an attribute in the element, or by reference to a declared style) in the following circumstances:

Even if a specified tts:backgroundColor is the same as specified on the nearest ancestor content element or animation element, specifying any tts:backgroundColor SHALL require an additional fill operation for all region pixels.
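The computation of S(En) can be sketched as follows, under stated assumptions. The Region class and the function names are hypothetical, and NBG(Ri) is taken as an input because counting the associated tts:backgroundColor attributes requires walking the document. The example region occupies 80% by 10% of the root container and has one associated background color.

```python
from dataclasses import dataclass


@dataclass
class Region:
    width: float   # same unit as the root container width
    height: float  # same unit as the root container height
    nbg: int       # NBG(Ri): number of associated tts:backgroundColor attributes


def nsize(region, root_width, root_height):
    """NSIZE(Ri): region area normalized to the root container area."""
    return (region.width * region.height) / (root_height * root_width)


def paint(regions, root_width, root_height):
    """PAINT(En): sum of NSIZE(Ri) * NBG(Ri) over the presented regions."""
    return sum(nsize(r, root_width, root_height) * r.nbg for r in regions)


def total_area(n, regions, root_width, root_height):
    """S(En) = CLEAR(En) + PAINT(En), with CLEAR(E0) = 0 and CLEAR(En) = 1 for n > 0."""
    clear = 0 if n == 0 else 1
    return clear + paint(regions, root_width, root_height)


presented = [Region(width=80, height=10, nbg=1)]
print(round(total_area(0, presented, 100, 100), 2))  # 0.08
print(round(total_area(1, presented, 100, 100), 2))  # 1.08
```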

9.4 Paint Images

The Presentation Compositor SHALL paint into the Presentation Buffer Pn all visible pixels of presented images of intermediate synchronic document En.

For each presented image, the Presentation Compositor SHALL either:

Two images SHALL be identical if and only if they reference the same encoded image source.

The duration DURI(En) for painting images of an intermediate synchronic document En in the Presentation Buffer SHALL be as follows:

DURI(En) = ∑Ii ∈ Ic NRGA(Ii) / ICpy + ∑Ij ∈ Id NSIZ(Ij) / IDec

where

NRGA(Ii) is the Normalized Image Area of presented image Ii and SHALL be equal to:

NRGA(Ii) = (width of Ii ∙ height of Ii) ÷ (root container height ∙ root container width)

NSIZ(Ii) SHALL be the number of pixels of presented image Ii.

The contents of the Decoded Image Buffer Dn SHALL be transferred instantaneously to Decoded Image Buffer Dn-1 at the presentation time of intermediate synchronic document En.

The total size occupied by images stored in Decoded Image Buffers Dn or Dn-1 SHALL be the sum of their Normalized Image Area.

The size of Decoded Image Buffers Dn or Dn-1 SHALL be the Normalized Decoded Image Buffer Size (NDIBS).

Unless specified otherwise, the following table SHALL specify ICpy, IDec, and NDIBS.

Parameter | Initial value
Normalized image copy performance factor (ICpy) | 6
Image Decoding rate (IDec) | 1 × 2^20 pixels s⁻¹
Normalized Decoded Image Buffer Size (NDIBS) | 0.9885
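The DURI(En) formula can be illustrated with a small sketch. It assumes, based on the parameter names, that Ic is the set of images copied from the Decoded Image Buffer and Id is the set of images that are decoded. The function names and the (width, height) tuple representation are hypothetical. The example uses a 240×40 pixel image in a 640×480 pixel root container.

```python
ICPY = 6.0        # normalized image copy performance factor
IDEC = 1 * 2**20  # image decoding rate, in pixels per second
NDIBS = 0.9885    # Normalized Decoded Image Buffer Size


def nrga_image(width_px, height_px, root_width_px, root_height_px):
    """Normalized Image Area: image area divided by root container area."""
    return (width_px * height_px) / (root_height_px * root_width_px)


def dur_i(copied, decoded, root_width_px, root_height_px):
    """DURI(En) in seconds; copied and decoded are lists of (width_px, height_px)."""
    copy_time = sum(
        nrga_image(w, h, root_width_px, root_height_px) for w, h in copied
    ) / ICPY
    decode_time = sum(w * h for w, h in decoded) / IDEC  # NSIZ is a pixel count
    return copy_time + decode_time


def fits_in_buffer(images, root_width_px, root_height_px):
    """True if the summed Normalized Image Area does not exceed NDIBS."""
    total = sum(nrga_image(w, h, root_width_px, root_height_px) for w, h in images)
    return total <= NDIBS


image = (240, 40)
print(round(dur_i([], [image], 640, 480), 5))  # 0.00916 (decoded)
print(round(dur_i([image], [], 640, 480), 5))  # 0.00521 (copied)
print(fits_in_buffer([image], 640, 480))       # True
```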

9.5 Paint Text

For each glyph displayed in intermediate synchronic document En, the Presentation Compositor SHALL:

Two glyphs are identical if and only if the following [TTML1] styles are identical:

Fig. 2 Example of Presentation Compositor Behavior for Text Rendering

The duration DURT(En) for painting the text of an intermediate synchronic document En in the Presentation Buffer is as follows:

DURT(En) = ∑Gi ∈ Gr NRGA(Gi) / Ren(Gi) + ∑Gj ∈ Gc NRGA(Gj) / GCpy

where

Gr and Gc SHALL include only glyphs in presented regions and SHALL NOT include a [UNICODE] Code Point if it does not result in a change to presentation, e.g. the Code Point is ignored.

The Normalized Rendered Glyph Area NRGA(Gi) of a glyph Gi SHALL be equal to:

NRGA(Gi) = (fontSize of Gi as percentage of root container height)²

The contents of the Glyph Buffer Gn SHALL be copied instantaneously to Glyph Buffer Gn-1 at the presentation time of intermediate synchronic document En.

The total size occupied by the glyphs stored in Glyph Buffers Gn or Gn-1 SHALL be the sum of their Normalized Rendered Glyph Area.

The size of Glyph Buffers Gn and Gn-1 SHALL be the Normalized Glyph Buffer Size (NGBS).

Unless specified otherwise, the following table SHALL specify GCpy, Ren and NGBS, and SHALL apply to all supported font styles (including provision of outline border).

Parameter | Initial value
Normalized glyph copy performance factor (GCpy) | 12
Text rendering performance factor Ren(Gi) if Gi is not a CJK Unified Ideograph as specified in [UNICODE] | 1.2
Text rendering performance factor Ren(Gi) if Gi is a CJK Unified Ideograph as specified in [UNICODE] | 0.6
Normalized Glyph Buffer Size (NGBS) | 1

Note

NRGA(Gi) does not take into account glyph decorations (e.g. underline), glyph effects (e.g. outline) or actual glyph aspect ratio. An implementation can determine its actual buffer size needs based on worst-case glyph size complexity.
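The DURT(En) formula can be sketched as follows. This is an illustration under assumptions, not a normative algorithm: Gr is taken to be the set of glyphs that are rendered and Gc the set copied from the Glyph Buffer, the font size is expressed as a fraction of the root container height (0.05 for 5%), and CJK Unified Ideographs are detected through their Unicode character names. The caller is expected to have already excluded code points that do not change the presentation. All function names are hypothetical.

```python
import unicodedata

GCPY = 12.0      # normalized glyph copy performance factor
REN_OTHER = 1.2  # Ren(Gi) when Gi is not a CJK Unified Ideograph
REN_CJK = 0.6    # Ren(Gi) when Gi is a CJK Unified Ideograph


def is_cjk_unified_ideograph(ch):
    """Heuristic based on the Unicode character name of a single character."""
    return unicodedata.name(ch, "").startswith("CJK UNIFIED IDEOGRAPH")


def nrga_glyph(font_size_fraction):
    """Normalized Rendered Glyph Area: the squared relative font size."""
    return font_size_fraction ** 2


def dur_t(rendered, copied):
    """DURT(En) in seconds; both arguments are lists of (char, font_size_fraction)."""
    render_time = sum(
        nrga_glyph(size) / (REN_CJK if is_cjk_unified_ideograph(ch) else REN_OTHER)
        for ch, size in rendered
    )
    copy_time = sum(nrga_glyph(size) for _, size in copied) / GCPY
    return render_time + copy_time


latin = [(ch, 0.05) for ch in "Lorem"]
print(round(dur_t(latin, []), 5))  # 0.01042 (five glyphs rendered)
print(round(dur_t([], latin), 5))  # 0.00104 (five glyphs copied)
```

Copying the same five glyphs is ten times cheaper than rendering them in this example, which is the ratio of GCpy to Ren for non-CJK glyphs.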

A. Reference Fonts

Computed Font Family | Code Points | Reference Font
monospaceSerif | All code points specified in B. Recommended Character Sets | http://www.microsoft.com/typography/fonts/family.aspx?FID=10 (Courier New)
proportionalSansSerif | All code points specified in B. Recommended Character Sets, excluding the code points defined for Semitic languages alone. | http://www.microsoft.com/typography/fonts/family.aspx?FID=8 (Arial) or http://www.linotype.com/en/526/Helvetica-family.html (Helvetica)

Note

proportionalSansSerif is not used in practice for Hebrew and Arabic captions and subtitles.

C. Forced content (non-normative)

Fig. 3 below illustrates the use of forced content, i.e. itts:forcedDisplay and displayForcedOnlyMode. The content with itts:forcedDisplay="true" is the French translation of the "High School" sign. The content with itts:forcedDisplay="false" consists of French subtitles capturing a voiceover.

Fig. 3 Illustration of the use of itts:forcedDisplay

When the user selects French as the playback language but does not select French subtitles, displayForcedOnlyMode is set to "true", causing the display of the sign translation, which is useful to any French speaker, but hiding the voiceover subtitles as the voiceover is heard in French.

If the user selects French as the playback language and also selects French subtitles, e.g. if the user is hard-of-hearing, displayForcedOnlyMode is set to "false", causing the display of both the sign translation and the voiceover subtitles.

The algorithm for setting the displayForcedOnlyMode parameter and selecting the appropriate combination of subtitle and audio tracks depends on the application.
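The two scenarios above reduce to a simple visibility rule, sketched below. The function name is hypothetical and the sketch covers only the behavior described in this annex.

```python
def is_displayed(forced_display, display_forced_only_mode):
    """Return True if content is shown.

    Content is hidden only when the processor is in forced-only mode
    and the content is not marked with itts:forcedDisplay="true".
    """
    return forced_display or not display_forced_only_mode


# French audio without French subtitles: displayForcedOnlyMode is "true".
print(is_displayed(True, True))    # True  (sign translation)
print(is_displayed(False, True))   # False (voiceover subtitles)

# French audio with French subtitles: displayForcedOnlyMode is "false".
print(is_displayed(True, False))   # True
print(is_displayed(False, False))  # True
```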

D. WCAG Considerations

In order to meet the guidelines in [WCAG20], the following considerations apply.

Guideline 1.1 of [WCAG20] recommends that an implementation provide text alternatives for all non-text content. In the context of this specification, this text alternative is intended primarily to support users of the subtitles who cannot see images. Since the images of an Image Profile Document Instance usually represent subtitle or caption text, the guidelines for authoring text equivalent strings given at Images of text of [HTML5] are appropriate.

Thus, for each subtitle in an Image Profile Document Instance, text equivalent content in a Text Profile Document Instance SHOULD be written so that it conveys all essential content and fulfills the same function as the corresponding subtitle image. In the context of subtitling and captioning, this content will be (as a minimum) the verbatim equivalent of the image without précis or summarization. However, the author MAY add extra information to the text equivalent string in cases where styling is applied to the text image with a deliberate connotation, as a functional replacement for the applied style.

For instance, in subtitling and captioning, italics can be used to indicate an off screen speaker context (for example a voice from a radio). An author can choose to include this functional information in the text equivalent; for example, by including the word "Radio: " before the image equivalent text. Note that images in an Image Profile Document Instance that are intended for use as captions, i.e. intended for a hard of hearing audience, might already include this functional information in the rendered text.

Guideline 1.1 of [WCAG20] also recommends that accessible text alternatives must be "programmatically determinable." This means that the text must be able to be read and used by the assistive technologies (and the accessibility features in browsers) that people with disabilities use. It also means that the user must be able to use their assistive technology to find the alternative text (that they can use) when they land on the non-text content (that they can't use).

E. Sample Document Instance (non-normative)

The following sample Document Instances conform to the Text and Image Profiles, respectively. These samples are for illustration only, and are intended neither to capture current or future practice, nor to exercise all normative prose contained in this specification.

Example 10
<?xml version="1.0" encoding="UTF-8"?>
<tt
 xml:lang="en" xmlns="http://www.w3.org/ns/ttml"
 xmlns:ttm="http://www.w3.org/ns/ttml#metadata" 
 xmlns:tts="http://www.w3.org/ns/ttml#styling"
 xmlns:ttp="http://www.w3.org/ns/ttml#parameter" 
 xmlns:ittp="http://www.w3.org/ns/ttml/profile/imsc1#parameter"
 ittp:aspectRatio="4 3">
	
	<head>
		<layout>
			<region xml:id="area1" tts:origin="10% 10%" tts:extent="80% 10%" tts:backgroundColor="black" tts:displayAlign="center" tts:color="red"/>
		</layout>
	</head>
	<body>
		<div>
			<p region="area1" begin="0s" end="6s">Lorem ipsum dolor sit amet.</p>
		</div>
	</body>
</tt>
Example 11
<?xml version="1.0" encoding="UTF-8"?>
<tt
 xmlns="http://www.w3.org/ns/ttml"
 xmlns:ttm="http://www.w3.org/ns/ttml#metadata" 
 xmlns:tts="http://www.w3.org/ns/ttml#styling"
 xmlns:ttp="http://www.w3.org/ns/ttml#parameter" 
 xmlns:smpte="http://www.smpte-ra.org/schemas/2052-1/2010/smpte-tt"
 xmlns:itts="http://www.w3.org/ns/ttml/profile/imsc1#styling"
 tts:extent="640px 480px"
 ttp:frameRate="25"
 xml:lang="fr">
	
	<head>
		<layout>
			<region xml:id="region1" tts:origin="120px 410px" tts:extent="240px 40px" tts:showBackground="whenActive"/>
			<region xml:id="region2" tts:origin="120px 20px" tts:extent="240px 40px" tts:showBackground="whenActive"/>
		</layout>
	</head>
	<body>
		<div region="region1" begin="00:00:01:00" end="00:00:02:00" smpte:backgroundImage="1.png"/>
		<div region="region1" begin="00:00:03:20" end="00:00:04:12" smpte:backgroundImage="2.png"/>
		<div region="region2" itts:forcedDisplay="true" begin="00:00:03:20" end="00:00:04:12" smpte:backgroundImage="3.png"/>
	</body>
</tt>
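
The time expressions in Example 11 use the hours:minutes:seconds:frames form, with frames counted at the declared ttp:frameRate of 25. A minimal sketch of the conversion to seconds follows. The function name is hypothetical, and the sketch assumes that no frame rate multiplier or sub-frame component is in use.

```python
def clock_time_to_seconds(expr, frame_rate):
    """Convert an hh:mm:ss:ff time expression to seconds.

    Assumes exactly four colon-separated integer fields and a plain
    integer frame rate (no multiplier, no sub-frames).
    """
    hours, minutes, seconds, frames = (int(part) for part in expr.split(":"))
    return hours * 3600 + minutes * 60 + seconds + frames / frame_rate


print(round(clock_time_to_seconds("00:00:01:00", 25), 3))  # 1.0
print(round(clock_time_to_seconds("00:00:03:20", 25), 3))  # 3.8
print(round(clock_time_to_seconds("00:00:04:12", 25), 3))  # 4.48
```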

F. Extensions

F.1 General

The following sections define extension designations, expressed as relative URIs (fragment identifiers) relative to the IMSC 1.0 Extension Namespace base URI.

F.2 #progressivelyDecodable

A transformation processor supports the #progressivelyDecodable feature if it recognizes and is capable of transforming values of the ittp:progressivelyDecodable attribute.

A presentation processor supports the #progressivelyDecodable feature if it implements presentation semantic support for values of the ittp:progressivelyDecodable attribute.

F.3 #aspectRatio

A transformation processor supports the #aspectRatio feature if it recognizes and is capable of transforming values of the ittp:aspectRatio attribute.

A presentation processor supports the #aspectRatio feature if it implements presentation semantic support for values of the ittp:aspectRatio attribute.

F.4 #forcedDisplay

A transformation processor supports the #forcedDisplay feature if it recognizes and is capable of transforming values of the itts:forcedDisplay attribute.

A presentation processor supports the #forcedDisplay feature if it implements presentation semantic support for values of the itts:forcedDisplay attribute.

F.5 #altText

A transformation processor supports the #altText feature if it recognizes and is capable of transforming values of the ittm:altText element.

A presentation processor supports the #altText feature if it implements presentation semantic support for values of the ittm:altText element.

F.6 #linePadding

A transformation processor supports the #linePadding feature if it recognizes and is capable of transforming values of the ebutts:linePadding attribute specified in [EBU-TT-D].

A presentation processor supports the #linePadding feature if it implements presentation semantic support for values of the ebutts:linePadding attribute specified in [EBU-TT-D].

F.7 #multiRowAlign

A transformation processor supports the #multiRowAlign feature if it recognizes and is capable of transforming values of the ebutts:multiRowAlign attribute specified in [EBU-TT-D].

A presentation processor supports the #multiRowAlign feature if it implements presentation semantic support for values of the ebutts:multiRowAlign attribute specified in [EBU-TT-D].

G. References

G.1 Normative references

[CLDR]
Unicode Consortium. The Common Locale Data Repository Project
[EBU-TT-D]
European Broadcasting Union (EBU). Tech 3380, EBU-TT-D Subtitling Distribution Format Version 1.0
[MHP]
ETSI TS 101 812 V1.3.1, Digital Video Broadcasting (DVB); Multimedia Home
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[ST2052-1]
SMPTE ST 2052-1, Timed Text Format (SMPTE-TT). URL: https://www.smpte.org/standards
[TTML1]
Glenn Adams. Timed Text Markup Language 1 (TTML1) (Second Edition). 24 September 2013. W3C Recommendation. URL: http://www.w3.org/TR/ttml1/
[UNICODE]
The Unicode Standard. URL: http://www.unicode.org/versions/latest/
[WCAG20]
Ben Caldwell; Michael Cooper; Loretta Guarino Reid; Gregg Vanderheiden et al. Web Content Accessibility Guidelines (WCAG) 2.0. 11 December 2008. W3C Recommendation. URL: http://www.w3.org/TR/WCAG20/
[xml-names]
Tim Bray; Dave Hollander; Andrew Layman; Richard Tobin; Henry Thompson et al. Namespaces in XML 1.0 (Third Edition). 8 December 2009. W3C Recommendation. URL: http://www.w3.org/TR/xml-names

G.2 Informative references

[CEA-608]
Line-21 Data Services, ANSI/CEA Standard.
[CEA-708]
Digital Television (DTV) Closed Captioning, ANSI/CEA Standard.
[HTML5]
Ian Hickson; Robin Berjon; Steve Faulkner; Travis Leithead; Erika Doyle Navara; Edward O'Connor; Silvia Pfeiffer. HTML5. 28 October 2014. W3C Recommendation. URL: http://www.w3.org/TR/html5/
[SUBM]
World Wide Web Consortium (W3C). TTML Text and Image Profiles for Internet Media Subtitles and Captions (Member Submission, 07 June 2013)
[namespaceState]
Norman Walsh. The Disposition of Names in an XML Namespace. 29 March 2006. W3C Working Draft. URL: http://www.w3.org/TR/namespaceState/
[ttml10-sdp-us]
Glenn Adams; Monica Martin; Sean Hayes. TTML Simple Delivery Profile for Closed Captions (US). 5 February 2013. W3C Note. URL: http://www.w3.org/TR/ttml10-sdp-us/