W3C

Encrypted Media Extensions

W3C Editor's Draft 14 April 2014

This Version:
http://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html
Latest Published Version:
http://www.w3.org/TR/encrypted-media/
Latest editor's draft:
http://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html
Previous Versions:
http://www.w3.org/TR/2014/WD-encrypted-media-20140218/
http://www.w3.org/TR/2013/WD-encrypted-media-20130510/
http://www.w3.org/TR/2013/WD-encrypted-media-20131022/
Editors:
David Dorwin, Google, Inc.
Adrian Bateman, Microsoft Corporation
Mark Watson, Netflix, Inc.
Bug/Issue lists:
Bugzilla, Tracker
Discussion list:
public-html-media@w3.org
Test Suite:
None yet

Abstract

This proposal extends HTMLMediaElement providing APIs to control playback of protected content.

The API supports use cases ranging from simple clear key decryption to high value video (given an appropriate user agent implementation). License/key exchange is controlled by the application, facilitating the development of robust playback applications supporting a range of content decryption and protection technologies.

This specification does not define a content protection or Digital Rights Management system. Rather, it defines a common API that may be used to discover, select and interact with such systems as well as with simpler content encryption systems. Implementation of Digital Rights Management is not required for compliance with this specification: only the simple clear key system is required to be implemented as a common baseline.

The common API supports a simple set of content encryption capabilities, leaving application functions such as authentication and authorization to page authors. This is achieved by requiring content protection system-specific messaging to be mediated by the page rather than assuming out-of-band communication between the encryption system and a license or other server.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Implementers should be aware that this specification is not stable. Implementers who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the mailing list mentioned below and take part in the discussions.

This document was published by the HTML working group as an Editor's Draft. Please submit comments regarding this document by using the W3C's public bug database with the product set to HTML WG and the component set to Encrypted Media Extensions. If you cannot access the bug database, submit comments to public-html-media@w3.org (subscribe, archives) and arrangements will be made to transpose the comments to the bug database. All feedback is welcome.

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Note: It is an open issue whether and how the spec should do more to encourage/ensure CDM-level interop. See Bug 20944.

Note: This specification contains sections for describing security and privacy considerations. These sections are not final and review is welcome.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

This section is non-normative.

This proposal allows JavaScript to select content protection mechanisms, control license/key exchange, and implement custom license management algorithms. It supports a wide range of use cases without requiring client-side modifications in each user agent for each use case. This also enables content providers to develop a single application solution for all devices. A generic stack implemented using the proposed APIs is shown below. This diagram shows an example flow: other combinations of API calls and events are possible.

A generic stack implemented using the proposed APIs

1.1. Definitions

Text in this font and color is non-normative.

1.1.1. Content Decryption Module (CDM)

The Content Decryption Module (CDM) is a generic term for the client component that provides the functionality, including decryption, for one or more Key Systems.

Implementations may or may not separate the implementations of CDMs or treat them as separate from the user agent. This is transparent to the API and application. A user agent may support one or more CDMs.

1.1.2. Key System

A Key System is a generic term for a decryption mechanism and/or content protection provider. Key System strings provide unique identification of a Key System. They are used by the user agent to select the Content Decryption Modules and identify the source of a key-related event. Simple Decryption Key Systems are supported by all user agents. User agents may also provide additional CDMs with corresponding Key System strings.

A Key System string is always a reverse domain name, for example "com.example.somesystem". Key System strings are compared using case-sensitive matching. It is recommended that CDMs use simple lower-case ASCII key system strings.

Within a given system ("somesystem" in the example), subsystems may be defined as determined by the key system provider. For example, "com.example.somesystem.1" and "com.example.somesystem.1_5". Key System providers should keep in mind that these will be used for comparison and discovery, so they should be easy to compare and the structure should remain reasonably simple.
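The comparison rules above can be illustrated with a minimal sketch. The helper names sameKeySystem and isSubsystemOf are hypothetical and are not defined by this specification. The sketch also assumes that a subsystem string is its parent string followed by "." and a non-empty suffix, as in the examples above; key system providers may structure subsystems differently.

```javascript
// Hypothetical helpers; neither name is defined by this specification.

// Key System strings are compared case-sensitively, so strict equality
// is sufficient.
function sameKeySystem(a, b) {
  return a === b;
}

// Assumption: a subsystem string is its parent string followed by "."
// and a non-empty suffix, as in "com.example.somesystem.1_5".
function isSubsystemOf(candidate, parent) {
  return candidate.length > parent.length + 1 &&
         candidate.startsWith(parent + ".");
}

console.log(sameKeySystem("com.example.somesystem", "com.example.SomeSystem"));     // false
console.log(isSubsystemOf("com.example.somesystem.1_5", "com.example.somesystem")); // true
```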

1.1.3. Key Session

A Key Session, or simply Session, represents the lifetime of the license(s)/key(s) it contains and associates all messages related to them. Sessions are embodied as MediaKeySession objects. Each Key Session is associated with a single instance of Initialization Data provided in the createSession() call.

Each Key Session is associated with a single MediaKeys object, and only media elements associated with that object may access key(s) associated with the session. Other MediaKeys objects, CDM instances, and media elements may not access the key session or use its key(s). Key sessions and the keys they contain are no longer usable by the CDM for decryption when the session is closed, including when the MediaKeySession object is destroyed.

1.1.4. Session ID

A Session ID is a unique string identifier generated by the user agent or CDM that can be used by the application to identify MediaKeySession objects. (The underlying content protection client or server does not necessarily need to support Session IDs.)

A new Session ID is generated each time the user agent and CDM successfully create a new session.

Each Session ID shall be unique within the browsing context in which it was created. (Note: Some use cases may require that Session IDs be unique within the origin over time, including across browsing sessions.)

1.1.5. Key

Unless otherwise stated, key refers to a decryption key that can be used to decrypt blocks within media data. Each key is uniquely identified by a key ID. A key is associated with the session used to provide it to the CDM. (The same key may be present in multiple sessions.) Such keys may only be provided to the CDM via an update() call. (They may later be loaded by loadSession() as part of the stored session data.)

1.1.6. Key ID

A key is associated with a key ID, which uniquely identifies a key. The container specifies the ID of the key that can decrypt a block or set of blocks within the media data. Initialization Data may contain key ID(s) to identify the keys that are needed to decrypt the media data. However, there is no requirement that Initialization Data contain any or all key IDs used in the media data or media resource. Licenses provided to the CDM associate each key with a key ID so the CDM can select the appropriate key when decrypting an encrypted block of media data.

1.1.7. License

A license is a key system-specific message that includes one or more decryption key(s) - each associated with a key ID - and potentially other information about key usage.
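The relationship between licenses, keys and key IDs can be sketched as follows. This is a toy model only: the license shape (an array of { keyId, key } entries), the hex-string key IDs, and the function names cacheLicense and selectKey are all assumptions made for illustration. Real license formats are key system-specific.

```javascript
// Toy model of a CDM key cache. The license shape ({ keyId, key } entries)
// and the hex-string key IDs are assumptions made for this sketch.
function cacheLicense(keyStore, license) {
  for (const { keyId, key } of license) {
    keyStore.set(keyId, key);
  }
}

// Returns the key for the block's key ID, or null if none is cached.
function selectKey(keyStore, blockKeyId) {
  return keyStore.has(blockKeyId) ? keyStore.get(blockKeyId) : null;
}

const keyStore = new Map();
cacheLicense(keyStore, [
  { keyId: "a1", key: new Uint8Array([1, 2, 3, 4]) },
  { keyId: "b2", key: new Uint8Array([5, 6, 7, 8]) },
]);

console.log(selectKey(keyStore, "b2")[0]); // 5
console.log(selectKey(keyStore, "c3"));    // null
```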

1.1.8. Initialization Data

Key Systems usually require a block of initialization data containing information about the stream to be decrypted before they can construct a license request message. This block could be a simple key or content ID or a more complex structure containing such information. It should always allow unique identification of the key(s) needed to decrypt the content. This initialization information may be obtained in some application-specific way or provided with the media data.

Initialization Data is a generic term for container-specific data that is used by CDMs to generate a license request. Initialization data found with the media data is provided to the application in the initData attribute of the needkey event.

The format of the initialization data depends upon the type of container, and containers may support more than one format of initialization data. The initialization data type is a string that indicates what format the initialization data is provided in. Initialization data type strings are always matched case-sensitively. It is recommended that initialization data type strings are lower-case ASCII strings.

The Encrypted Media Extensions Stream Format and Initialization Data Format Registry provides the mapping from initialization data type string to the specification for each format.

1.1.9. Cross Origin Support

During playback, embedded media data is exposed to script in the embedding origin. In order for the API to fire needkey and message events, media data must be CORS-same-origin with the embedding page. If media data is cross-origin with the embedding document, authors should use the crossorigin attribute on the media element and CORS headers on the media data response to make it CORS-same-origin.

2. Media Element Extensions

We extend media element to allow decryption key acquisition to be handled in JavaScript.

enum MediaWaitingFor { "none", "data", "key" };

partial interface HTMLMediaElement {
  // Encrypted Media
  readonly attribute MediaKeys mediaKeys;
  Promise<any> setMediaKeys(MediaKeys mediaKeys);
  
  attribute EventHandler onneedkey;

  readonly attribute MediaWaitingFor waitingFor;
};

interface MediaKeys {
  readonly attribute DOMString keySystem;

  Promise<MediaKeySession> createSession(DOMString initDataType, Uint8Array initData);
  Promise<MediaKeySession> loadSession(DOMString sessionId);

  static Promise<MediaKeys> create(DOMString keySystem);
  static boolean isTypeSupported(DOMString keySystem, optional DOMString contentType);
};

interface MediaKeySession : EventTarget {
  // error state
  readonly attribute MediaKeyError? error;

  // session properties
  readonly attribute DOMString keySystem;
  readonly attribute DOMString sessionId;
  readonly attribute Promise<any> close;

  // session operations
  Promise<any> update(Uint8Array response);
  Promise<any> release();
};

Issue 1

Extensions to HTMLSourceElement may be at risk as discussed in Bug 23827.

partial interface HTMLSourceElement { attribute DOMString keySystem; };

Note

All errors are reported asynchronously by rejecting the returned Promise. This includes WebIDL type mapping errors.

The steps of an algorithm are always aborted when resolving or rejecting a promise.

The mediaKeys attribute is the MediaKeys being used when decrypting encrypted media data for this media element.

The setMediaKeys(mediaKeys) method provides the MediaKeys to use when decrypting media data during playback. It must run the following steps:

  1. If mediaKeys and the mediaKeys attribute are the same object, return a promise resolved with undefined.

  2. Let promise be a new promise.

  3. Run the following steps asynchronously:

    1. If mediaKeys is already in use and the user agent is unable to use it with this element, reject promise with a new DOMException whose name is "QuotaExceededError" and that has the message "The MediaKeys object cannot be used with additional HTMLMediaElements."

    2. If the mediaKeys attribute is not null and the user agent or CDM does not support removing the association, reject promise with a new DOMException whose name is "NotSupportedError" and that has the message "The existing MediaKeys object cannot be removed."

    3. TODO: Add more steps per Bug 24216.

    4. Set the mediaKeys attribute to mediaKeys.

    5. Resolve promise with undefined.

  4. Return promise.
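The setMediaKeys() steps above can be simulated with plain objects. This is a sketch, not a user agent implementation: FakeMediaElement, the inUseElsewhere flag and the canRemove option are hypothetical names, and a plain Error with its name set stands in for DOMException.

```javascript
// A simulation of the steps above using plain objects. FakeMediaElement,
// inUseElsewhere and canRemove are hypothetical names for illustration.
function makeError(name, message) {
  const error = new Error(message);
  error.name = name;
  return error;
}

class FakeMediaElement {
  constructor({ canRemove = true } = {}) {
    this.mediaKeys = null;
    this.canRemove = canRemove; // whether an existing association can be removed
  }

  setMediaKeys(mediaKeys) {
    // Step 1: same object, nothing to do.
    if (mediaKeys === this.mediaKeys) return Promise.resolve(undefined);
    // Steps 2-4: return a promise and do the remaining work asynchronously.
    return new Promise((resolve, reject) => {
      setTimeout(() => {
        if (mediaKeys && mediaKeys.inUseElsewhere) {
          reject(makeError("QuotaExceededError",
            "The MediaKeys object cannot be used with additional HTMLMediaElements."));
          return;
        }
        if (this.mediaKeys !== null && !this.canRemove) {
          reject(makeError("NotSupportedError",
            "The existing MediaKeys object cannot be removed."));
          return;
        }
        this.mediaKeys = mediaKeys;
        resolve(undefined);
      }, 0);
    });
  }
}

const element = new FakeMediaElement({ canRemove: false });
const first = { keySystem: "org.w3.clearkey", inUseElsewhere: false };
const second = { keySystem: "org.w3.clearkey", inUseElsewhere: false };

element.setMediaKeys(first)
  .then(() => element.setMediaKeys(second))
  .catch((error) => console.log(error.name)); // NotSupportedError
```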

Note: As a best practice, applications should create a MediaKeys object and call setMediaKeys() before providing media data (for example, setting the src attribute of the media element). This avoids potential delays in some implementations.

Note: In some implementations, MediaKeySession objects created by createSession() may not fire any events until the MediaKeys object is associated with a media element with setMediaKeys().

The onneedkey event handler for the needkey event must be supported by all HTMLMediaElements as both a content attribute and an IDL attribute.

The waitingFor attribute indicates what the media element is waiting for, if anything (indicated by the waiting and canplay events). This is described in the Encrypted Block Encountered algorithm.

The create(keySystem) method must run the following steps:

  1. If keySystem is an empty string, return a promise rejected with a new DOMException whose name is "InvalidAccessError" and that has the message "The keySystem parameter is empty."

  2. If keySystem is not one of the Key Systems supported by the user agent, return a promise rejected with a new DOMException whose name is "NotSupportedError" and that has the message "The key system keySystem is not supported." Key system string comparison is case-sensitive.

  3. Let promise be a new promise.

  4. Run the following steps asynchronously:

    1. Let cdm be the content decryption module corresponding to keySystem.

    2. Load and initialize the cdm if necessary.

    3. If cdm fails to load or initialize, reject promise with a new DOMException whose name is the appropriate error name and that has an appropriate message.

    4. Let media keys be a new MediaKeys object, and initialize it as follows:

      1. Set the keySystem attribute to keySystem.

    5. Resolve promise with media keys.

  5. Return promise.
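The create() steps above can be sketched in the same way. SUPPORTED_KEY_SYSTEMS and FakeMediaKeys are hypothetical: a real user agent decides which Key Systems it supports, and loading a CDM is modelled here as an asynchronous step that always succeeds.

```javascript
// Simulation of the create() steps. SUPPORTED_KEY_SYSTEMS and
// FakeMediaKeys are hypothetical; a real user agent decides what it supports.
const SUPPORTED_KEY_SYSTEMS = new Set(["org.w3.clearkey"]);

function domError(name, message) {
  const error = new Error(message);
  error.name = name;
  return error;
}

class FakeMediaKeys {
  constructor(keySystem) {
    this.keySystem = keySystem;
  }

  static create(keySystem) {
    // Step 1: reject an empty string.
    if (keySystem === "") {
      return Promise.reject(
        domError("InvalidAccessError", "The keySystem parameter is empty."));
    }
    // Step 2: case-sensitive lookup of the key system.
    if (!SUPPORTED_KEY_SYSTEMS.has(keySystem)) {
      return Promise.reject(domError("NotSupportedError",
        "The key system " + keySystem + " is not supported."));
    }
    // Steps 3-5: loading the CDM is modelled as an asynchronous no-op.
    return new Promise((resolve) => {
      setTimeout(() => resolve(new FakeMediaKeys(keySystem)), 0);
    });
  }
}

FakeMediaKeys.create("org.w3.clearkey")
  .then((mediaKeys) => console.log(mediaKeys.keySystem)); // org.w3.clearkey
FakeMediaKeys.create("ORG.W3.CLEARKEY")
  .catch((error) => console.log(error.name)); // NotSupportedError
```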

The keySystem attribute identifies the Key System being used.

The createSession(initDataType, initData) method must run the following steps:

Note: The contents of initData are container-specific Initialization Data. initDataType is the initialization data type that indicates how to interpret initData.

Note: User agents and CDMs should not treat sessions created with audio data differently than those created with video data. That is, there is no such thing as an "audio session" or a "video session" - all sessions are used for all media streams processed by cdm.

  1. If initDataType is an empty string, return a promise rejected with a new DOMException whose name is "InvalidAccessError" and that has the message "The initDataType parameter is empty."

  2. If initData is null or an empty array, return a promise rejected with a new DOMException whose name is "InvalidAccessError" and that has the message "The initData parameter is empty."

  3. If initDataType is not an initialization data type supported by the content decryption module corresponding to the keySystem, return a promise rejected with a new DOMException whose name is "NotSupportedError" and that has the message "The initialization data type initDataType is not supported by the key system."

  4. Let promise be a new promise.

  5. Run the following steps asynchronously:

    1. Let request be null.

    2. Let default URL be null.

    3. Let cdm be the cdm loaded in create().

    4. Use the cdm to execute the following steps:

      1. Process the initData, interpreting it per initDataType.

      2. If a message exchange (e.g. a license request) is required:

        1. Let request be a request generated by the CDM using the initData.

          cdm must not use any stream-specific data, including media data, not provided via the initData.

        2. If the initData indicates a default URL relevant to keySystem, let default URL be that URL.

    5. Let the session ID be a unique Session ID string. It may be obtained from cdm.

    6. Let session be a new MediaKeySession object, and initialize it as follows:

      1. Set the error attribute to null.

      2. Set the keySystem attribute to the value of the MediaKeys object's keySystem attribute.

      3. Set the sessionId attribute to session ID.

    7. If any of the preceding steps failed, reject promise with a new DOMException whose name is the appropriate error name and that has an appropriate message.

    8. If request is not null, run the Queue a "message" Event algorithm on the session, providing request and default URL.

    9. Resolve promise with session.

  6. Return promise.
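The argument checks in steps 1 to 3 of createSession() can be sketched as a small function. checkCreateSessionArgs is a hypothetical helper that returns the name of the DOMException the promise would be rejected with, or null if the arguments pass. The list of supported initialization data types stands in for what the CDM reports, and its values are assumed for illustration.

```javascript
// Hypothetical helper mirroring steps 1-3 above: it returns the name of the
// DOMException the promise would be rejected with, or null if the arguments
// pass. supportedInitDataTypes stands in for what the CDM supports.
function checkCreateSessionArgs(initDataType, initData, supportedInitDataTypes) {
  if (initDataType === "") return "InvalidAccessError";                           // step 1
  if (initData === null || initData.length === 0) return "InvalidAccessError";    // step 2
  if (!supportedInitDataTypes.includes(initDataType)) return "NotSupportedError"; // step 3
  return null;
}

const supported = ["cenc", "webm"]; // assumed values, for illustration only

console.log(checkCreateSessionArgs("", new Uint8Array([1]), supported));     // InvalidAccessError
console.log(checkCreateSessionArgs("cenc", new Uint8Array(0), supported));   // InvalidAccessError
console.log(checkCreateSessionArgs("CENC", new Uint8Array([1]), supported)); // NotSupportedError
console.log(checkCreateSessionArgs("cenc", new Uint8Array([1]), supported)); // null
```

Note that "CENC" is rejected because initialization data type strings are matched case-sensitively.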

The loadSession(sessionId) method must run the following steps:

  1. If the content decryption module corresponding to the keySystem attribute does not support loading previous sessions, return a promise rejected with a new DOMException whose name is "NotSupportedError" and that has the message "The operation is not supported by the key system."

  2. If sessionId is an empty string, return a promise rejected with a new DOMException whose name is "InvalidAccessError" and that has the message "The sessionId parameter is empty."

  3. Let promise be a new promise.

  4. Run the following steps asynchronously:

    1. Let request be null.

    2. Let destination URL be null.

    3. Let origin be the origin of the MediaKeys object's Document.

    4. Let cdm be the cdm loaded in create().

    5. Use the cdm to execute the following steps:

      1. If there is no data stored for the sessionId in the origin, resolve promise with undefined.

      2. Let session data be the data stored for the sessionId in the origin. This must not include data from other origin(s) or that is not associated with an origin.

      3. Load the session data.

      4. If a message exchange is required:

        1. Let request be a request generated by the CDM based on the session data.

        2. If the session data indicates a destination URL for the request, let destination URL be that URL.

    6. Let session be a new MediaKeySession object, and initialize it as follows:

      1. Set the error attribute to null.

      2. Set the keySystem attribute to the value of the MediaKeys object's keySystem attribute.

      3. Set the sessionId attribute to sessionId.

    7. If any of the preceding steps failed, reject promise with a new DOMException whose name is the appropriate error name and that has an appropriate message.

    8. If the associated media element(s) are waiting for a key, queue a task to attempt to resume playback.

      In other words, resume playback if the necessary key is provided.

      The user agent may choose to skip this step if it knows resuming will fail (i.e. no usable key was added).

    9. If request is not null, run the Queue a "message" Event algorithm on the session, providing request and destination URL.

    10. Resolve promise with session.

  5. Return promise.

The isTypeSupported(keySystem, contentType) method returns whether keySystem is supported with the container and codec(s) specified by contentType.

The following list shows some examples.

Returns whether the Some System Key System is supported. Specific containers and codecs may or may not be supported with Some System.

  MediaKeys.isTypeSupported("com.example.somesystem")

Returns whether the Some System Key System is present and supports the container and codec(s) specified by mimeType.

  MediaKeys.isTypeSupported("com.example.somesystem", mimeType)

Returns whether the user agent supports Clear Key Simple Decryption of the container and codec(s) specified by mimeType.

  MediaKeys.isTypeSupported("org.w3.clearkey", mimeType)

It must run the following steps:

  1. If keySystem is an empty string, return false and abort these steps.

  2. If keySystem contains an unrecognized or unsupported Key System, return false and abort these steps. Key system string comparison is case-sensitive.

  3. If contentType was not provided or is an empty string, return true and abort these steps.

  4. If contentType contains an invalid or unrecognized MIME type, return false and abort these steps.

  5. Let initDataFormat be the container type specified by contentType.

    Issue 2

    isTypeSupported needs to be updated including using initDataType. This includes the discussion in Bug 24873.

  6. If the user agent does not support the Initialization Data format initDataFormat, return false and abort these steps.

  7. If the CDM specified by keySystem does not support the Initialization Data format initDataFormat, return false and abort these steps.

  8. If neither the CDM specified by keySystem nor the user agent support all codec(s) specified by contentType, return false and abort these steps.

  9. Return true.
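The isTypeSupported() steps above can be sketched as a pure function over a capability table. Everything named here is an assumption for illustration: the CAPABILITIES table, the container and codec strings in it, the simplified parseContentType helper, and the function name isTypeSupportedSketch. The sketch also collapses the separate user agent and CDM checks of steps 6 to 8 into a single table lookup.

```javascript
// Hypothetical capability table; real support is decided by the user agent
// and its CDMs. Container and codec strings are examples only.
const CAPABILITIES = new Map([
  ["org.w3.clearkey", new Map([["video/webm", ["vp8", "vorbis"]]])],
]);

// Splits 'video/webm; codecs="vp8, vorbis"' into a container and codec list.
// Returns null for a value that does not look like a MIME type.
// This is a simplified parser, not a full MIME type parser.
function parseContentType(contentType) {
  const match = /^\s*([a-z]+\/[a-z0-9.+-]+)\s*(?:;\s*codecs="([^"]*)")?\s*$/i
    .exec(contentType);
  if (!match) return null;
  const codecs = match[2] ? match[2].split(",").map((c) => c.trim()) : [];
  return { container: match[1].toLowerCase(), codecs };
}

function isTypeSupportedSketch(keySystem, contentType) {
  if (keySystem === "") return false;                               // step 1
  const system = CAPABILITIES.get(keySystem);                       // case-sensitive
  if (!system) return false;                                        // step 2
  if (contentType === undefined || contentType === "") return true; // step 3
  const parsed = parseContentType(contentType);
  if (!parsed) return false;                                        // step 4
  const codecs = system.get(parsed.container);                      // steps 5-7
  if (!codecs) return false;
  return parsed.codecs.every((c) => codecs.includes(c));            // steps 8-9
}

console.log(isTypeSupportedSketch("org.w3.clearkey"));                                     // true
console.log(isTypeSupportedSketch("org.w3.clearkey", 'video/webm; codecs="vp8, vorbis"')); // true
console.log(isTypeSupportedSketch("org.w3.clearkey", 'video/mp4; codecs="avc1.42E01E"'));  // false
```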

The error attribute is a MediaKeyError representing the current error state of the session. It is null if there is no error.

The keySystem attribute identifies the Key System of the MediaKeys that created the session.

The sessionId attribute is the Session ID for this object and the associated key(s) or license(s).

The close attribute signals when the object becomes closed as a result of the Session Close algorithm being run. This promise can only be fulfilled and is never rejected.

The update(response) method must run the following steps:

Note: The contents of response are keySystem-specific.

  1. If response is null or an empty array, return a promise rejected with a new DOMException whose name is "InvalidAccessError" and that has the message "The response parameter is empty."

  2. Let promise be a new promise.

  3. Run the following steps asynchronously:

    1. Let cdm be the cdm loaded in create().

    2. Let request be null.

    3. Let destination URL be null.

    4. Use the cdm to execute the following steps:

      1. Process response.

        Note: When response contains key(s) and/or related data, cdm will likely cache the key and related data indexed by key ID.

        Note: The replacement algorithm within a session is Key System-dependent.

        Note: Keys from different sessions should be cached independently such that closing one session does not affect keys in other sessions, even if they have overlapping key IDs.

        Note: It is recommended that CDMs support a standard and reasonably high minimum number of keys per MediaKeySession object, including a standard replacement algorithm, and a standard and reasonably high minimum number of MediaKeySession objects. This enables a reasonable number of key rotation algorithms to be implemented across user agents and may reduce the likelihood of playback interruptions in use cases that involve various streams in the same element (e.g. adaptive streams, various audio and video tracks) using different keys.

      2. If another message needs to be sent to the server, execute the following steps:

        1. Let request be that message.

        2. If there is a specific destination URL for the message, let destination URL be that URL.

    5. If any of the preceding steps failed, reject promise with a new DOMException whose name is the appropriate error name and that has an appropriate message.

    6. If the associated media element(s) are waiting for a key, queue a task to attempt to resume playback.

      In other words, resume playback if the necessary key is provided.

      The user agent may choose to skip this step if it knows resuming will fail (i.e. no usable key was added).

    7. If request is not null, run the Queue a "message" Event algorithm on the session, providing request and destination URL.

    8. Resolve promise with undefined.

  4. Return promise.
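The update() steps above, together with the note that keys from different sessions should be cached independently, can be sketched as follows. FakeSession is a hypothetical stand-in for a MediaKeySession, and the response format (first byte is the key ID, remaining bytes are the key) is invented for this sketch; real responses are Key System-specific. Sending a follow-up message and resuming playback are omitted.

```javascript
// FakeSession is a hypothetical stand-in for a MediaKeySession, and the
// response format (first byte = key ID, remaining bytes = key) is invented
// for this sketch; real responses are Key System-specific.
class FakeSession {
  constructor() {
    this.keys = new Map(); // keys cached per session, indexed by key ID
  }

  update(response) {
    // Step 1: a null or empty response is rejected immediately.
    if (response === null || response.length === 0) {
      const error = new Error("The response parameter is empty.");
      error.name = "InvalidAccessError";
      return Promise.reject(error);
    }
    // Steps 2-4: process the response asynchronously, then resolve.
    return new Promise((resolve) => {
      setTimeout(() => {
        this.keys.set(response[0], response.slice(1));
        resolve(undefined);
      }, 0);
    });
  }
}

const sessionA = new FakeSession();
const sessionB = new FakeSession();
sessionA.update(new Uint8Array([7, 0xaa, 0xbb])).then(() => {
  console.log(sessionA.keys.has(7)); // true
  console.log(sessionB.keys.has(7)); // false: sessions cache keys independently
});
```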

The release() method allows an application to indicate to the system that it may release any resources associated with this object. It must run the following steps:

  1. If the Session Close algorithm has been run on this object, return a promise fulfilled with undefined.

  2. Let promise be a new promise.

  3. Run the following steps asynchronously:

    1. Let cdm be the cdm loaded in create().

    2. Use the cdm to execute the following steps:

      1. Process the release request.

        Note: the release() method is intended to act as a hint to the user agent that the application believes the session is no longer needed. However, the CDM determines whether resources can now be released.

      2. If the previous step caused the session to be closed, run the Session Close algorithm on this object.

    3. Resolve promise with undefined.

  4. Return promise.

The keySystem attribute of HTMLSourceElement specifies the Key System to be used with the media resource. The keySystem attribute must be supported by all HTMLSourceElements as both an IDL attribute and a content attribute named keysystem. The resource selection algorithm is modified to check the keySystem attribute after the existing step 5 of the Otherwise branch of step 6:

  1. ⌛ If candidate has a keySystem attribute whose value represents a Key System that the user agent knows it cannot use with type, then end the synchronous section, and jump down to the failed step below.

2.1. Exceptions

The methods report errors by rejecting the returned promise with a DOMException. The following DOMException names from DOM4 are used with messages as shown in the following table. In cases where the exact name is not specified in the algorithm, the message may differ to reflect the actual error.

Name                  Possible Messages (optional)
NotSupportedError     The existing MediaKeys object cannot be removed.
                      The key system name is not supported.
                      The initialization data type type is not supported by the key system.
                      The operation is not supported by the key system.
InvalidAccessError    The name parameter is empty.
QuotaExceededError    The MediaKeys object cannot be used with additional HTMLMediaElements.

2.2. Errors

Issue 3

Bug 21798 - The future of error events and MediaKeyError is uncertain.

2.2.1. Interface

[Constructor(DOMString name, unsigned long systemCode, optional DOMString message = "")]
interface MediaKeyError : DOMError {
  readonly attribute unsigned long systemCode;
};

The MediaKeyError(name, systemCode, message) constructor must return a new MediaKeyError whose systemCode attribute is initialized to systemCode and whose inherited attributes are initialized by passing name and message to the DOMError constructor.

The systemCode attribute of a MediaKeyError object is a Key System-specific value for the error that occurred. This allows a more granular status to be returned than the more general name. It should be 0 if there is no associated status code or such status codes are not supported by the Key System.

2.2.2. Error Names

The tables below list all the allowed error names for the name attribute along with a description. The message may be key system-specific.

Issue 4

Bug 21798 - The additional error names are yet to be defined.

The following DOMException names from DOM4 may be used as shown in the following table:

Name Use

The following new DOMException names are defined by this specification:

Name Use

2.3. Media Element Restrictions

This section is non-normative.

Media data processed by a CDM may not be available through JavaScript APIs in the usual way (for example using the CanvasRenderingContext2D drawImage() method and the AudioContext MediaElementAudioSourceNode). This specification does not define conditions for such non-availability of media data. However, if media data is not available to JavaScript APIs, then these APIs may behave as if no media data was present at all.

Where media rendering is not performed by the UA, for example in the case of a hardware protected media pipeline, then the full set of HTML rendering capabilities, for example CSS Transforms, may not be available. One likely restriction is that video media may be constrained to appear only in rectangular regions with sides parallel to the edges of the window and with normal orientation.

3. Events

3.1. Event Definitions

[Constructor(DOMString type, optional MediaKeyNeededEventInit eventInitDict)]
interface MediaKeyNeededEvent : Event {
  readonly attribute DOMString initDataType;
  readonly attribute Uint8Array? initData;
};

dictionary MediaKeyNeededEventInit : EventInit {
  DOMString initDataType;
  Uint8Array? initData;
};

[Constructor(DOMString type, optional MediaKeyMessageEventInit eventInitDict)]
interface MediaKeyMessageEvent : Event {
  readonly attribute Uint8Array message;
  readonly attribute DOMString? destinationURL;
};

dictionary MediaKeyMessageEventInit : EventInit {
  Uint8Array message;
  DOMString? destinationURL;
};

event . initDataType

Returns a string indicating the initialization data type of the Initialization Data related to the event.

event . initData

Returns the Initialization Data related to the event.

event . message

Returns the message (i.e. license request) to send.

event . destinationURL

Returns the URL to send the message to.

The initDataType attribute contains a string indicating the initialization data type specific to the event. The format of the initData will vary according to the initDataType.

The initData attribute contains Initialization Data specific to the event.

The message attribute contains a message from the CDM. Messages are Key System-specific. In most cases, it should be sent to a key server.

The destinationURL is the URL to send the message to. An application may override this. In some cases, it may have been provided by the media data. It may be null.
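Because destinationURL may be null and may be overridden by the application, an application needs a rule for choosing where to send a message. The following is one possible sketch. chooseDestination is a hypothetical application helper, plain objects stand in for MediaKeyMessageEvent instances, and the license.example URLs are placeholders.

```javascript
// chooseDestination is a hypothetical application helper. An application
// override wins; otherwise the event's destinationURL is used if present.
function chooseDestination(event, overrideURL) {
  if (typeof overrideURL === "string" && overrideURL !== "") {
    return overrideURL;
  }
  if (typeof event.destinationURL === "string") {
    return event.destinationURL;
  }
  return null; // the application must supply a URL some other way
}

// Plain objects stand in for MediaKeyMessageEvent instances.
const withURL = {
  message: new Uint8Array([1]),
  destinationURL: "https://license.example/a", // placeholder URL
};
const withoutURL = { message: new Uint8Array([1]), destinationURL: null };

console.log(chooseDestination(withURL));                              // https://license.example/a
console.log(chooseDestination(withURL, "https://license.example/b")); // https://license.example/b
console.log(chooseDestination(withoutURL));                           // null
```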

3.2. Event Summary

This section is non-normative.

The following event is fired at HTMLMediaElement.

Event name: needkey
Interface: MediaKeyNeededEvent
Dispatched when: The user agent needs a key or license to begin or continue playback. It may have encountered media data that may/does require decryption to load or play or need a new key/license to continue playback.
Preconditions: readyState is equal to HAVE_METADATA or greater. It is possible that the element is playing or has played.

The following events are fired at MediaKeySession.

Event name: error
Interface: Event
Dispatched when: An error occurs in the session.

Event name: message
Interface: MediaKeyMessageEvent
Dispatched when: A message has been generated (and likely needs to be sent to a server). For example, a license request has been generated as the result of a createSession() call or another message must be sent in response to an update() call.

4. Algorithms

4.1. Initialization Data Encountered

The following steps are run when the media element encounters a source that may contain encrypted blocks or streams during the resource fetch algorithm:

  1. Let initDataType be the empty string.

  2. Let initData be null.

  3. If Initialization Data was encountered and if the media data is CORS-same-origin, run the following steps:

    1. Let initDataType be the string representing the initialization data type of that initialization data.

    2. Let initData be that initialization data.

  4. Queue a task to fire a simple event named needkey at the media element.

    The event is of type MediaKeyNeededEvent and has:

    Firing this event allows the application to begin the key acquisition process before the key is needed.

    Note that readyState is not changed and no algorithms are aborted. This event merely provides information.

    Note that if the media is not CORS-same-origin then the initData will be null. This allows applications that can retrieve initData from an alternative source to continue. Applications with no way to retrieve initData may wish to consider aborting playback in this case.

  5. Continue Normal Flow: Continue with the existing media element's resource fetch algorithm.
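
Steps 1 to 3 above can be sketched as a small function that computes the initDataType and initData values before the needkey event is queued. This is only an illustrative model: the function name needKeyEventFields and the shape of the encountered argument (an object with type and data, or null) are assumptions, not part of this specification.

```javascript
// Sketch of steps 1-3 of the Initialization Data Encountered algorithm.
// `encountered` is a hypothetical { type, data } object, or null when no
// Initialization Data was found in the media data.
function needKeyEventFields(encountered, isCorsSameOrigin) {
  var initDataType = ""; // Step 1
  var initData = null;   // Step 2

  // Step 3: Initialization Data is only exposed for CORS-same-origin media.
  if (encountered !== null && isCorsSameOrigin) {
    initDataType = encountered.type;
    initData = encountered.data;
  }
  return { initDataType: initDataType, initData: initData };
}
```

In this model, an application receiving a null initData would need to obtain Initialization Data from another source or consider aborting playback, as the note in step 4 describes.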

4.2. Encrypted Block Encountered

The following steps are run when the media element encounters a block (i.e. frame) of encrypted media data during the resource fetch algorithm:

  1. If the media element's mediaKeys attribute is not null, run the following steps:

    1. Let media keys be the MediaKeys object referenced by that attribute.

    2. Let cdm be the cdm loaded during the initialization of the media keys.

    3. If there is at least one MediaKeySession created by the media keys on which the Session Close algorithm has not been run, run the following steps:

      This check ensures the cdm has finished loading and is a prerequisite for a matching key being available.

      1. Let the block key ID be the key ID of the current block (as specified by the container).

      2. Use the cdm to execute the following steps:

        1. Let available keys be the union of keys in sessions that were created by the media keys.

        2. Follow the steps for the first matching condition from the following list:

          In the following steps, a key is considered usable if it is valid as determined by the CDM. For example, a key is not usable if its license has expired.

          If any of the available keys corresponds to the block key ID and is usable
          Run the following steps:
          1. Let block key be the matching key.

            Note: If multiple sessions contain a usable key for the block key ID, which key to use is Key System-dependent.

          2. Use the cdm to decrypt the block using block key.

          3. Follow the steps for the first matching condition from the following list:

            If decryption fails
            Abort the media element's resource fetch algorithm, run the steps to report a MEDIA_ERR_DECODE error, and abort these steps.
            Otherwise
            Abort these steps and process the decrypted block as normal. (Decode the block.)

            Note: Not all decryption problems (e.g. using the wrong key) will result in a decryption failure. In such cases, no error is fired here but one may be fired during decode.

          If any of the available keys corresponds to the block key ID and is unusable
          Run the following steps:
          1. Let session be the MediaKeySession object associated with that session.

          2. Run the Queue an "error" Event algorithm on the session, providing the appropriate error name and system code value, if provided, and 0 otherwise.

          3. Abort these steps.

          Otherwise (there is no key for the block key ID in any session)
          Continue.
  2. Abort these steps and wait for a signal to resume playback.

    There is no usable key for the block.

    If playback stops because the stream cannot be decrypted when the media element is potentially playing, the media element is said to be waiting for a key.
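
The key selection performed by the algorithm above can be modelled roughly as follows. The data model is entirely hypothetical: sessions are plain objects with a closed flag and a keys array of { keyId, usable } records, key IDs are compared as strings for brevity, and only sessions that have not been closed are assumed to contribute keys. A real CDM decides usability and, when several sessions hold a usable key, which one to use.

```javascript
// Returns one of:
//   { outcome: "decrypt", session, key }  a usable key matches the block
//   { outcome: "error", session, key }    only an unusable key matches
//   { outcome: "wait" }                   no key for the block key ID
function selectBlockKey(sessions, blockKeyId) {
  var open = sessions.filter(function (s) { return !s.closed; });
  if (open.length === 0) {
    return { outcome: "wait" }; // No open session: fall through to step 2.
  }

  var unusableMatch = null;
  for (var i = 0; i < open.length; i++) {
    for (var j = 0; j < open[i].keys.length; j++) {
      var key = open[i].keys[j];
      if (key.keyId !== blockKeyId) {
        continue;
      }
      if (key.usable) {
        // First matching condition: a usable key exists.
        return { outcome: "decrypt", session: open[i], key: key };
      }
      if (unusableMatch === null) {
        unusableMatch = { session: open[i], key: key };
      }
    }
  }

  if (unusableMatch !== null) {
    // Second matching condition: a key matches but is unusable.
    return { outcome: "error", session: unusableMatch.session, key: unusableMatch.key };
  }
  return { outcome: "wait" }; // Otherwise: wait for a key.
}
```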

For frame-based encryption, this may be implemented as follows when the media element attempts to decode a frame as part of the resource fetch algorithm:

  1. Let encrypted be false.

  2. Detect whether the frame is encrypted.

    If the frame is encrypted
    Run the steps above.
    Otherwise
    Continue.
  3. Decode the frame.

  4. Provide the frame for rendering.
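
The frame-based steps above might be arranged as in the following sketch. All of the callbacks (isEncrypted, decryptFrame, decodeFrame, render) are hypothetical stand-ins for user agent internals, and decryptFrame is assumed to return null when no usable key is available.

```javascript
// Hypothetical frame pipeline: detect, optionally decrypt, decode, render.
function processFrame(frame, isEncrypted, decryptFrame, decodeFrame, render) {
  var encrypted = false;            // Step 1
  if (isEncrypted(frame)) {         // Step 2
    encrypted = true;
  }

  var data = frame;
  if (encrypted) {
    // Stand-in for running the Encrypted Block Encountered steps.
    data = decryptFrame(frame);
    if (data === null) {
      return "waiting";             // No usable key: wait for a key.
    }
  }

  render(decodeFrame(data));        // Steps 3 and 4
  return "rendered";
}
```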

The following paragraph is added to Playing the media resource.

4.3. Queue a "message" Event

The Queue a "message" Event algorithm is run when the CDM needs to queue a message event to a MediaKeySession object. Requests to run this algorithm include a target MediaKeySession object, a request, and a destination URL.

The following steps are run:

  1. Let the session be the specified MediaKeySession object.

  2. Queue a task to fire a simple event named message at the session.

    The event is of type MediaKeyMessageEvent and has:

4.4. Queue an "error" Event

The Queue an "error" Event algorithm is run when the CDM needs to queue an error event to a MediaKeySession object. Requests to run this algorithm include a target MediaKeySession object, an error name, and a system code.

The following steps are run:

  1. Let the session be the specified MediaKeySession object.

  2. Create a new MediaKeyError object with the following attributes:

  3. Set the session's error attribute to the error object created in the previous step.

  4. Queue a task to fire a simple event named error at the session.
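
The steps above can be sketched with an explicit task queue. This is a simplified model rather than user agent code: the session is a plain object with a hypothetical listeners table, taskQueue stands in for the user agent's task queue, and the error object carries only the name and systemCode values passed to the algorithm.

```javascript
var taskQueue = []; // Stand-in for the user agent's task queue.

function queueErrorEvent(session, errorName, systemCode) {
  // Step 2: create the error object from the supplied values.
  var error = { name: errorName, systemCode: systemCode };
  // Step 3: expose it through the session's error attribute.
  session.error = error;
  // Step 4: queue a task that fires the event at the session.
  taskQueue.push(function () {
    session.listeners.error.forEach(function (listener) {
      listener({ type: "error", target: session });
    });
  });
}

// Runs every queued task in order, as an event loop eventually would.
function runQueuedTasks() {
  while (taskQueue.length > 0) {
    taskQueue.shift()();
  }
}
```

Note that in this model the error attribute is set immediately, while listeners only run once the queued task executes.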

4.5. Session Close

The Session Close algorithm is run when the CDM closes the session associated with a MediaKeySession object.

The CDM may close a session at any point, such as in response to a release() call, when the session is no longer needed, or when resources are lost. Keys in other sessions should be unaffected, even if they have overlapping key IDs.

The following steps are run:

  1. Let promise be the close attribute of the associated MediaKeySession object.

  2. Resolve promise with undefined.
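
A minimal model of the promise handling above is shown below. The factory createSessionModel, the closed flag and the runSessionClose method are hypothetical names used for illustration; only the close attribute holding a promise comes from this specification.

```javascript
// Hypothetical session model whose `close` attribute is a promise that is
// resolved with undefined when the Session Close algorithm runs.
function createSessionModel() {
  var resolveClose;
  var session = {
    closed: false,
    close: new Promise(function (resolve) { resolveClose = resolve; })
  };
  session.runSessionClose = function () {
    session.closed = true;
    resolveClose(undefined); // Step 2: resolve promise with undefined.
  };
  return session;
}
```

An application observes closure by attaching a handler, for example session.close.then(function () { /* clean up */ }), regardless of whether the CDM closed the session in response to release() or for another reason.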

5. Simple Decryption

All user agents must support the simple decryption capabilities described in this section regardless of whether they support a more advanced CDM. This ensures that there is a common baseline level of protection that is guaranteed to be supported in all user agents, including those that are entirely open source. Thus, content providers that need only basic protection can build simple applications that will work on all platforms without needing to work with any content protection providers.

5.1. Clear Key

The "org.w3.clearkey" Key System indicates a plain-text clear (unencrypted) key will be used to decrypt the source. No additional client-side content protection is required. Use of this Key System is described below.

The keySystem parameter and keySystem attributes are always "org.w3.clearkey". The sessionId string is numerical.

The initData attribute of the needkey event and the initData parameter of createSession() are the same container-specific Initialization Data format and values. If supported, these values should provide some type of identification of the content or key that could be used to look up the key (since there is no defined logic for parsing it). For containers that support a simple key ID, it should be a binary array containing the raw key ID. For other containers, it may be some other opaque blob or null.

The MediaKeyMessageEvent generated by createSession() has:

The response parameter of update() should be a JSON Web Key (JWK) representation of the symmetric key to be used for decryption, as defined in the IETF Internet-draft JSON Web Key (JWK) specification. The JSON string is encoded into the Uint8Array parameter using ASCII-compatible character encoding.

When the JWK 'key type' ("kty") member value is 'octet sequence' ("oct"), the 'key value' ("k") member will be a base64 encoding of the octet sequence containing the symmetric key value.

For example, the following contains a single symmetric key represented as a JWK, designated as being for use with the AES Key Wrap algorithm (line breaks are for readability only).

{
  "keys": 
    [{
      "kty":"oct",
      "alg":"A128KW",
      "kid":"67ef0gd8pvfd0=",
      "k":"GawgguFyGrWKav7AX4VKUg"
    }]
}
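
As an illustration of how an application might build the update() response described above, the following sketch serializes a JWK set and encodes it into a Uint8Array one ASCII character per byte. The function name encodeClearKeyResponse is hypothetical, and the key ID and key value are simply the ones from the example above.

```javascript
// Builds a JWK set for one symmetric key and encodes the JSON text as ASCII.
function encodeClearKeyResponse(keyId, keyValue) {
  var jwkSet = {
    keys: [{ kty: "oct", alg: "A128KW", kid: keyId, k: keyValue }]
  };
  var json = JSON.stringify(jwkSet);

  var bytes = new Uint8Array(json.length);
  for (var i = 0; i < json.length; i++) {
    var code = json.charCodeAt(i);
    if (code > 0x7f) {
      // This sketch only handles ASCII input.
      throw new Error("Non-ASCII character in JWK at index " + i);
    }
    bytes[i] = code;
  }
  return bytes;
}

var response = encodeClearKeyResponse("67ef0gd8pvfd0=", "GawgguFyGrWKav7AX4VKUg");
```

The resulting Uint8Array is what would be passed to update() in place of the elided license bytes in an application using Clear Key.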

6. Security Considerations

Issue 5
Note: This section is not final and review is welcome.

Key system implementations must consider initialization data, key data and media data as potential attack vectors and must take care to safely parse, decrypt etc. initialization data, key data and media data. User Agents may want to validate data before passing it to the CDM, especially if the CDM does not run in the same (sandboxed) context as the DOM (i.e. rendering).

User Agents should treat key data and media data as untrusted content and use appropriate safeguards to mitigate any associated threats.

User Agents are responsible for providing users with a secure way to browse the web. Since User Agents may integrate with third party CDM implementations, CDM implementers must provide sufficient information and controls to user agent implementers to enable them to properly assess the security implications of integrating with the Key System.

Note: unsandboxed CDMs (or CDMs that use platform features) and UAs that use them must be especially careful in all areas of security, including parsing of key and media data, etc. due to the potential for compromises to provide access to OS/platform features, interact with or run as root, access drivers, kernel, firmware, hardware, etc., all of which may not be written to be robust against hostile software or web-based attacks. Additionally, CDMs may not be updated with security fixes as frequently, especially when part of the OS, platform or hardware.

7. Privacy Considerations

Issue 6
Note: This section is not final and review is welcome.

The presence or use of Key Systems on a user's device raises a number of privacy issues, falling into two categories: (a) user-specific information that may be disclosed by the EME interface itself, or within messages from Key Systems and (b) user-specific information that may be persistently stored on the user's device.

User Agents should take responsibility for providing users with adequate control over their own privacy. Since User Agents may integrate with third party CDM implementations, CDM implementers must provide sufficient information and controls to user agent implementers to enable them to implement appropriate techniques to ensure users have control over their privacy, including but not limited to the techniques described below.

7.1. Information Disclosed by EME and Key Systems

Concerns regarding information disclosed by EME and Key Systems fall into two categories: concerns about non-specific information that may nevertheless contribute to the possibility of fingerprinting a user agent or device, and concerns about user-specific information that may be used directly for user tracking.

7.1.1. Fingerprinting

Malicious applications may be able to fingerprint users or user agents by detecting or enumerating the list of key systems that are supported and related information. If proper origin protections are not provided this could include detection of sites that have been visited and information stored for those sites. In particular, Key Systems should not share key or other data between sites that are not CORS-same-origin.

7.1.2. Information Leakage

CDMs, especially those implemented outside the user agent, may not have the same fundamental isolations as the web platform. It is important that steps be taken to avoid information leakage, especially across origins. This includes both in-memory and stored data. Failure to do so could lead to information leakage to/from Incognito/Private Browsing sessions, across profiles, and even across different operating system user accounts.

To avoid such issues, user agent and CDM implementations should ensure that:

7.1.3. Tracking

User-specific information may be obtained over the EME API in two ways: through detection of stored keys and through Key System messages.

Key Systems may access or create persistent or semi-persistent identifiers for a device or user of a device. In some cases these identifiers may be bound to a specific device in a secure manner. If these identifiers are present in Key System messages, then devices and/or users may be tracked. If the mitigations below are not applied this could include both tracking of users / devices over time and associating multiple users of a given device. If not mitigated, such tracking may take three forms depending on the design of the Key System:

If a Key System permits keys to be stored and to be re-used between origins, then it may be possible for two origins to collude and track a unique user by recording their ability to access a common key.

Finally, if any user interface for user control of Key Systems presents data separately from data in HTTP session cookies or persistent storage, then users are likely to modify site authorization or delete data in one and not the others. This would allow sites to use the various features as redundant backup for each other, defeating a user's attempts to protect their privacy.

There are a number of techniques that can be used to mitigate these risks of tracking without user consent:

User deletion of persistent identifiers
User agents could provide users with the ability to clear any persistent identifiers maintained by Key Systems.
Use of (non-reversible) per-origin identifiers
The user / device identifier exposed by a Key System may be different for each origin, either by allocation of different identifiers for different origins or by use of a non-reversible origin-specific mapping from an origin-independent identifier.
Encryption of user identifiers
User identifiers in Key System messages could be encrypted, together with a timestamp or nonce, such that the Key System messages are always different. This would prevent the use of Key System messages for tracking except by applications fully supporting the Key System.
Site-specific white-listing of access to each Key System
User agents could require the user to explicitly authorize access by each site to each Key System. User agents should enable users to revoke this authorization either temporarily or permanently.
Treating Key System persistent identifiers as cookies
User agents should present the presence of persistent identifiers stored by Key Systems to the user in a way that associates them strongly with HTTP session cookies. This might encourage users to view such identifiers with healthy suspicion.
Shared blacklists
User agents may allow users to share their Key System domain blacklists. This would allow communities to act together to protect their privacy.
User alerts / prompts
User Agents could ensure that users are fully informed and / or give explicit consent before identifiers are exposed in messages from Key Systems.
User controls to disable Key Systems or Key System use of identifiers
User Agents could provide users with a global control of whether a Key System is enabled / disabled and / or whether Key System use of user / device identifiers is enabled or disabled (if supported by the Key System).
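
The "non-reversible origin-specific mapping" mentioned in the list above could take many forms. The sketch below uses HMAC-SHA-256 from the Node.js crypto module purely as one possible one-way function; that choice, the function name perOriginIdentifier, and the idea of a single device secret are assumptions for illustration and are not prescribed by this specification.

```javascript
const nodeCrypto = require("crypto"); // Node.js built-in module (assumed environment)

// Derives a per-origin identifier from an origin-independent secret.
// Without the secret, the value exposed to one origin cannot be linked
// to the value exposed to another origin.
function perOriginIdentifier(deviceSecret, origin) {
  return nodeCrypto
    .createHmac("sha256", deviceSecret)
    .update(origin)
    .digest("hex");
}

var idA = perOriginIdentifier("example-device-secret", "https://a.example");
var idB = perOriginIdentifier("example-device-secret", "https://b.example");
```

Here idA and idB differ, while repeated calls for the same origin yield the same value. Clearing or rotating the secret would correspond to the user deleting the persistent identifier.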

While these suggestions prevent trivial use of this feature for user tracking, they do not block it altogether. Within a single domain, a site can continue to track the user during a session, and can then pass all this information to a third party along with any identifying information (names, credit card numbers, addresses) obtained by the site. If a third party cooperates with multiple sites to obtain such information, and if identifiers are not per-origin, then a profile can still be created.

It is important to note that identifiers that are non-clearable, non-origin-specific or hardware-bound exceed the tracking impact of existing techniques such as cookies or session identifiers embedded in URLs.

Thus, in addition to the various mitigations described above, if a browser supports a mode of operation intended to preserve user anonymity, then User Agent implementers should carefully consider whether access to Key Systems should be disabled in this mode.

7.2. Information Stored on User Devices

Key Systems may store information on a user's device, or user agents may store information on behalf of Key Systems. Potentially, this could reveal information about a user to another user of the same device, including potentially the origins that have used a particular Key System (i.e. sites visited) or even the content that has been decrypted using a Key System.

If information stored by one origin affects the operation of the Key System for another origin, then potentially the sites visited or content viewed by a user on one site may be revealed to another, potentially malicious, site.

There are a number of techniques that can be used to mitigate these privacy risks to users:

Origin-specific Key System storage
User agents may require that some or all of the Key System's persistently stored data is stored in an origin-specific way. Session data, licenses, and keys that are persistently stored should be stored per-origin.
User deletion of Key System storage
User agents may present the user with a way to delete Key System storage for a specific origin or all origins.
Treating Key System stored data like cookies / Web Storage
User agents should present the presence of persistent data stored by Key Systems to the user in a way that associates it strongly with HTTP session cookies and/or Web Storage. This might encourage users to view such data with healthy suspicion.
Encryption or obfuscation of Key System stored data
User agents should treat data stored by Key Systems as potentially sensitive; it is quite possible for user privacy to be compromised by the release of this information. To this end, user agents should ensure that such data is securely stored and when deleting data, it is promptly deleted from the underlying storage.
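
Origin-specific storage with user-initiated deletion, as described in the list above, can be modelled with a simple in-memory store. The OriginScopedStore name and its methods are hypothetical; a real user agent would back this with persistent, securely stored data.

```javascript
// Hypothetical per-origin store for Key System data.
function OriginScopedStore() {
  this.byOrigin = new Map(); // origin -> Map(name -> value)
}

OriginScopedStore.prototype.put = function (origin, name, value) {
  if (!this.byOrigin.has(origin)) {
    this.byOrigin.set(origin, new Map());
  }
  this.byOrigin.get(origin).set(name, value);
};

// Data stored by one origin is never visible to another origin.
OriginScopedStore.prototype.get = function (origin, name) {
  var entries = this.byOrigin.get(origin);
  return entries && entries.has(name) ? entries.get(name) : null;
};

// User deletion of Key System storage for a specific origin.
OriginScopedStore.prototype.deleteOrigin = function (origin) {
  this.byOrigin.delete(origin);
};

// User deletion of Key System storage for all origins.
OriginScopedStore.prototype.deleteAll = function () {
  this.byOrigin.clear();
};
```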

8. Examples

This section and its subsections are non-normative.

This section contains example solutions for various use cases using the proposed extensions. These are not the only solutions to these use cases. Video elements are used in the examples, but the same would apply to all media elements. In some cases, such as using synchronous XHR, the examples are simplified to keep the focus on the extensions.

8.1. Source and Key Known at Page Load (Clear Key)

In this simple example, the source file and clear-text license are hard-coded in the page. Only one session will ever be created.

<script>
  function load() {
    var video = document.getElementById("video");

    if (!video.mediaKeys) {
      MediaKeys.create("org.w3.clearkey").then(
        function(createdMediaKeys) {
          return video.setMediaKeys(createdMediaKeys);
        },
        console.error.bind(console, "Unable to create MediaKeys")
      ).then(
        function() {
          var initData = new Uint8Array([ ... ]);
          return video.mediaKeys.createSession("webm", initData);
        },
        console.error.bind(console, "Unable to set MediaKeys")
      ).then(
        function(keySession) {
          keySession.addEventListener("message", handleMessage, false);
        },
        console.error.bind(console, "Unable to create key session")
      );
    }
  }

  function handleMessage(event) {
    var keySession = event.target;

    var license = new Uint8Array([ ... ]);
    keySession.update(license).then(
      null,
      console.error.bind(console, "update() failed")
    );
  }
</script>

<body onload="load()">
  <video src="foo.webm" autoplay id="video"></video>
</body>

8.2. Selecting a Supported Key System and Using Initialization Data from the "needkey" Event

This example selects a supported Key System using the isTypeSupported() method then uses the Initialization Data from the media data to generate the license request and send it to the appropriate license server.

<script>
  var keySystem;
  var licenseUrl;

  function selectKeySystem() {
    if (MediaKeys.isTypeSupported("com.example.somesystem", 'video/webm; codecs="vp8, vorbis"')) {
      licenseUrl = "https://license.example.com/getkey"; // OR "https://example.<My Video Site domain>"
      keySystem = "com.example.somesystem";
    } else if (MediaKeys.isTypeSupported("com.foobar", 'video/webm; codecs="vp8, vorbis"')) {
      licenseUrl = "https://license.foobar.com/request";
      keySystem = "com.foobar";
    } else {
      throw new Error("Key System not supported");
    }
  }

  function handleKeyNeeded(event) {
    var video = event.target;
    if (video.mediaKeysObject === undefined) {
      selectKeySystem();
      video.mediaKeysObject = null; // Prevent entering this path again.
      video.pendingSessionData = []; // Will store all initData until the MediaKeys is ready.
      MediaKeys.create(keySystem).then(
        function(createdMediaKeys) {
          video.mediaKeysObject = createdMediaKeys;
          for (var i = 0; i < video.pendingSessionData.length; i++) {
            var data = video.pendingSessionData[i];
            createSession(video.mediaKeysObject, data.initDataType, data.initData);
          }
          video.pendingSessionData = [];

          return video.setMediaKeys(createdMediaKeys);
        },
        console.error.bind(console, "Unable to create MediaKeys")
      ).then(
        null,
        console.error.bind(console, "Unable to set MediaKeys")
      );
    }
    addSession(video, event.initDataType, event.initData);
  }

  function addSession(video, initDataType, initData) {
    if (video.mediaKeysObject) {
      createSession(video.mediaKeysObject, initDataType, initData);
    } else {
      video.pendingSessionData.push({initDataType: initDataType, initData: initData});
    }
  }

  function createSession(mediaKeys, initDataType, initData) {
    mediaKeys.createSession(initDataType, initData).then(
      function(keySession) {
        keySession.addEventListener("message", licenseRequestReady, false);
      },
      console.error.bind(console, "Unable to create key session")
    );
  }

  function licenseRequestReady(event) {
    var keySession = event.target;
    var request = event.message;

    var xmlhttp = new XMLHttpRequest();
    xmlhttp.open("POST", licenseUrl);
    xmlhttp.responseType = "arraybuffer"; // So that xmlhttp.response is an ArrayBuffer.
    xmlhttp.onreadystatechange = function() {
      if (xmlhttp.readyState == 4) {
        var license = new Uint8Array(xmlhttp.response);
        keySession.update(license).then(
          null,
          console.error.bind(console, "update() failed")
        );
      }
    }
    xmlhttp.send(request);
  }
</script>

<video src="foo.webm" autoplay onneedkey="handleKeyNeeded(event)"></video>

8.3. Using All Events

This is a more complete example showing all events being used.

Note that handleMessage() could be called multiple times, including in response to the update() call if multiple round trips are required and for any other reason the Key System might need to send a message.

<script>
  var keySystem;
  var licenseUrl;

  // See the previous example for implementations of these functions.
  function selectKeySystem() { ... }
  function handleKeyNeeded(event) { ... }
  function addSession(video, initDataType, initData) { ... }

  // This replaces the implementation in the previous example.
  function createSession(mediaKeys, initDataType, initData) {
    mediaKeys.createSession(initDataType, initData).then(
      function(keySession) {
        keySession.addEventListener("message", handleMessage, false);
        keySession.addEventListener("error", handleError, false);
        keySession.close.then(
          console.log.bind(console, "Session closed")
        );
      },
      console.error.bind(console, "Unable to create key session")
    );
  }

  function handleMessageResponse(keySession, response) {
    var license = new Uint8Array(response);
    keySession.update(license).then(
      null,
      console.error.bind(console, "update() failed")
    );
  }
  
  function sendMessage(message, keySession) {
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.open("POST", licenseUrl);
    xmlhttp.responseType = "arraybuffer"; // So that xmlhttp.response is an ArrayBuffer.
    xmlhttp.onreadystatechange = function() {
      if (xmlhttp.readyState == 4)
        handleMessageResponse(keySession, xmlhttp.response);
    }
    xmlhttp.send(message);
  }

  function handleMessage(event) {
    sendMessage(event.message, event.target);
  }

  function handleError(event) {
    // Report event.target.error.name and event.target.error.systemCode,
    // and do some bookkeeping with event.target.sessionId if necessary.
  }
</script>

<video src="foo.webm" autoplay onneedkey="handleKeyNeeded(event)"></video>

9. Revision History

Version Comment
14 April 2014 Use promises.
1 April 2014 Moved Container Guidelines to the Encrypted Media Extensions Stream Format and Initialization Data Format Registry.
3 February 2014 Produced candidate WD.
17 September 2013 Produced candidate WD.
6 May 2013 Produced updated candidate FPWD.
14 January 2013 Produced candidate FPWD.
16 August 2012 Converted to the object-oriented API.
0.1b Last non-object-oriented revision.
0.1a Corrects minor mistakes in 0.1.
0.1 Initial Proposal