A majority of Internet traffic is now streaming video.
However, there are currently no standards or common conventions for providing commercial-quality IP streaming video across different platforms and between unrelated companies.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST, MUST NOT, SHALL, SHOULD and SHOULD NOT in this specification are to be interpreted as described in RFC 2119 [[RFC2119]].
This specification only applies to one class of product: W3C Technical Reports. A number of specifications may be created to address the requirements enumerated in this document. In some cases the union of multiple parts of different specifications may be needed to address a single requirement. Nevertheless, this document speaks only of conforming specifications.
Conforming specifications are ones that address one or more requirements listed in this document. Conforming specifications should attempt to address SHOULD-level requirements unless there is a technically valid reason not to do so.
This section lists the requirements that conforming specification(s) would need to adopt in order to ensure a common interface and interpretation for the playback and control of adaptive bit rate media. These requirements are the result of an interactive process of feedback and discussion within the Media Pipeline Task Force of the Web and TV Interest Group.
One of the primary purposes for standardizing the way the media elements use adaptive bitrate streaming is to enable different existing and future adaptive bitrate streaming methods to work consistently with HTML5 media tags. Therefore, media tags must work with the widely deployed adaptive bitrate methods that are available now.
In the past, the <object> tag has been used to add non-standard functionality to HTML pages. In order to provide more consistent functionality, the <video> and <audio> elements were added to HTML. This allows streaming media to be handled consistently across different browsers and different codecs. In order to maintain this consistency, any ABR solution must define how the video and audio elements can be used for playback of adaptive delivery format media.
Frequently, it is necessary to synchronize several different streaming content sources. For example, audio tracks must be synchronized with streaming video or the experience of watching the video becomes unpleasant. Synchronization is also important for advertising, closed captions and other streaming media features. Since different streaming media sources have different time references, a strategy for synchronizing these different references must be adopted.
In addition to a common time reference, mapping to that common time reference must be seamless enough to enable continuous playback from sources spliced relative to different time bases.
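The mapping described above can be illustrated with a minimal sketch. It assumes each source declares its own clock rate and a single anchor point on the common timeline. The `SourceTimeBase` type and both function names are hypothetical and are not taken from any specification.

```typescript
// Hypothetical description of one media source's own time base.
interface SourceTimeBase {
  // Ticks per second used by the source (for example 90000 for a 90 kHz clock).
  timescale: number;
  // Source timestamp, in ticks, of the sample that is presented at `commonStart`.
  sourceStart: number;
  // Position on the common timeline, in seconds, where this source begins.
  commonStart: number;
}

// Convert a source timestamp (in ticks) to seconds on the common timeline.
function toCommonTime(base: SourceTimeBase, sourceTicks: number): number {
  return base.commonStart + (sourceTicks - base.sourceStart) / base.timescale;
}

// Convert a common-timeline position (in seconds) back to source ticks.
function toSourceTicks(base: SourceTimeBase, commonSeconds: number): number {
  const elapsedTicks = Math.round((commonSeconds - base.commonStart) * base.timescale);
  return base.sourceStart + elapsedTicks;
}
```

Under these assumptions, a video source on a 90 kHz clock, an audio source on a 48 kHz clock, and a spliced-in clip on a millisecond clock can all be compared on the same timeline by passing each timestamp through `toCommonTime` with that source's own time base.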
A common time reference is also important in the context of trick play. Pausing, advancing or rewinding media content must be done accurately and within a common reference. This is necessary in order to advance or rewind to exact locations or to adjust the playback location by precise increments.
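As a rough illustration of adjusting the playback location by precise increments, the sketch below steps a position on the common timeline and clamps it to the seekable range. The `TimeRange` type, the function names, and the idea of snapping to a frame boundary are assumptions made for this example only.

```typescript
// A range of positions, in seconds, on the common timeline.
interface TimeRange {
  start: number;
  end: number;
}

// Step the playback position by `deltaSeconds` (negative values rewind),
// keeping the result inside the seekable range.
function stepPosition(current: number, deltaSeconds: number, seekable: TimeRange): number {
  const target = current + deltaSeconds;
  return Math.min(seekable.end, Math.max(seekable.start, target));
}

// Snap a position to the nearest frame boundary for a given constant frame rate.
function snapToFrame(positionSeconds: number, framesPerSecond: number): number {
  return Math.round(positionSeconds * framesPerSecond) / framesPerSecond;
}
```

In a user agent, the computed value could then be assigned to the media element's `currentTime` attribute. How exactly a seek lands on a frame is left to the implementation.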
With media content available from sources around the world, it is important to quickly determine whether various content sources can be rendered. Therefore this determination must be made with a minimum of overhead.
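One low-overhead way to make this determination is to ask the user agent about a MIME type before fetching any media data. The sketch below assumes an object exposing the same `canPlayType()` method that HTML media elements provide, which returns `""`, `"maybe"` or `"probably"`. The `PlayabilityProbe` and `CandidateSource` types and the `pickPlayableSource` function are hypothetical names introduced for illustration.

```typescript
// Minimal structural type matching the canPlayType() method of HTML media elements.
interface PlayabilityProbe {
  canPlayType(type: string): string; // "", "maybe" or "probably"
}

// Hypothetical description of one candidate content source.
interface CandidateSource {
  url: string;
  mimeType: string;
}

// Return the first source reported as "probably" playable,
// otherwise the first "maybe", otherwise undefined.
function pickPlayableSource(
  probe: PlayabilityProbe,
  candidates: CandidateSource[],
): CandidateSource | undefined {
  let fallback: CandidateSource | undefined;
  for (const candidate of candidates) {
    const answer = probe.canPlayType(candidate.mimeType);
    if (answer === "probably") {
      return candidate;
    }
    if (answer === "maybe" && fallback === undefined) {
      fallback = candidate;
    }
  }
  return fallback;
}
```

Because the check uses only the declared type, no manifest or media segment needs to be downloaded before a source is ruled out. In a browser, a `<video>` element could be passed directly as the probe.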
HTML aspires to be a level playing field. This philosophy enables innovation to flourish and allows superior solutions to be quickly implemented and adopted. ABR media systems are and should continue to be innovative solutions within this spirit of openness. Support for different ABR systems should not require any proprietary modification of the user agent.
The ability for a browser vendor to implement playback of ABR media in accordance with the requirements in this document must be supported.
While specific implementations may include vendor-specific parameters for special features, the parameters required for basic playback should be publicly specified.
While specific implementations may include vendor-specific error codes, the error codes required for basic operation and diagnosis should be publicly specified. However, which particular ABR systems are supported is an implementation decision.
This section is a non-exhaustive list of use cases that would be enabled by one (or more) specifications implementing the requirements listed above. Each use case is written according to the following template:
A user can play adaptive bit rate content identified in media tags regardless of the particular adaptive bit rate method used to format the content. Support for the playable content formats must be provided by the browser or extensible features of the browser.
Possible implementation:
There is no standard interface for adaptive bit rate content. This leads to the implementation of multiple incompatible playback systems and interfaces. What should be standardized is:
In order to play adaptive bit rate content, the application interface must be provided.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for particular ABR media type | |
Specify the ABR parameters | |
A user can use trick-play modes (pause, rewind, fast-forward) with adaptive bit rate content regardless of the particular adaptive bit rate method used to format the content.
Possible implementation:
Playback of media should be consistent regardless of the particular format of the content. Trick-play modes available to non-adaptive media formats should also be available to adaptive bit rate media. What should be standardized is:
None.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for trick-play modes | |
Specify the ABR parameters | |
A user can search adaptive bit rate content identified in media tags to position playback at a specific point in time regardless of the particular adaptive bit rate method used to format the content.
Possible implementation:
Playback of media should be consistent regardless of the particular format of the content. Search modes available to non-adaptive media formats should also be available to adaptive bit rate media. What should be standardized is:
None.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for particular ABR media type | |
Specify the ABR parameters | |
A user can merge, splice and append adaptive bit rate content identified in media tags regardless of the particular adaptive bit rate method used to format the content.
Possible implementation:
There is no standard interface for merging or splicing content. Content can typically be appended by queueing up the next segment, but there is no guarantee that the common time reference will be preserved. What should be standardized is:
None.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for particular ABR media type | |
Specify the ABR parameters | |
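To make the concern about preserving the common time reference when appending content more concrete, the following sketch places each queued segment on a shared timeline and records the offset needed to translate its own timestamps. The `Segment` and `PlacedSegment` types and the `appendSegment` function are hypothetical. Real segments would carry considerably more metadata.

```typescript
// A segment described in its own time base (all values in seconds).
interface Segment {
  id: string;
  mediaStart: number; // first timestamp inside the segment
  duration: number;
}

// A segment after it has been placed on the common timeline.
interface PlacedSegment extends Segment {
  timelineStart: number; // where the segment begins on the common timeline
  offset: number; // value to add to the segment's own timestamps
}

// Append a segment so that it starts exactly where the previous one ends.
// Returns a new array and leaves the input timeline unchanged.
function appendSegment(timeline: PlacedSegment[], segment: Segment): PlacedSegment[] {
  const last = timeline[timeline.length - 1];
  const timelineStart = last ? last.timelineStart + last.duration : 0;
  const placed: PlacedSegment = {
    ...segment,
    timelineStart,
    offset: timelineStart - segment.mediaStart,
  };
  return [...timeline, placed];
}
```

With this bookkeeping, a timestamp `t` inside any appended segment corresponds to `t + offset` on the common timeline, so spliced content from unrelated sources stays contiguous.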
A user can play a continuous stream of adaptive bit rate content identified in media tags regardless of the particular adaptive bit rate method used to format the content. Continuous content could be a stream that is continuously encoded from a live source (e.g. it has no specific finite length) or a play list that is continually getting content appended and has a common time base.
Possible implementation:
There is no standard interface for continuous playback of adaptive bit rate content. What should be standardized is:
None.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for particular ABR media type | |
Specify the ABR parameters | |
A user can add timed text tracks to adaptive bit rate content at specific points in time regardless of the particular adaptive bit rate method used to format the content.
Possible implementation:
There is no standard interface for merging timed text tracks with adaptive bit rate content. What should be standardized is:
None.
Low Level | High Level |
---|---|
Compatibility with existing standards | |
Support for media tags | |
Support for a common time reference | |
Support for particular ABR media type | |
Specify the ABR parameters | |
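As a small illustration of merging timed text with adaptive bit rate content, the sketch below shifts a track's cues onto the common timeline and then selects the cues active at a given playback position. The `Cue` type and both function names are assumptions for this example and do not correspond to any particular track API.

```typescript
// A timed text cue; `start` is inclusive and `end` is exclusive, both in seconds.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Move cues from the track's own time base onto the common timeline.
function shiftCues(cues: Cue[], offsetSeconds: number): Cue[] {
  return cues.map((cue) => ({
    ...cue,
    start: cue.start + offsetSeconds,
    end: cue.end + offsetSeconds,
  }));
}

// Return the cues that should be displayed at the given common-timeline position.
function activeCues(cues: Cue[], positionSeconds: number): Cue[] {
  return cues.filter((cue) => cue.start <= positionSeconds && positionSeconds < cue.end);
}
```

Because the cues are expressed on the common timeline, the same lookup works regardless of which bit rate variant of the media is currently playing.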
In the context of adaptive bit rate media, security is primarily concerned with ensuring that authorized users are able to play the media and unauthorized users are not. This may involve verifying that the content has been legally obtained. It may also mean that personally produced video is only viewable by friends and family. If viewing is intended to be restricted, a content protection system must be in place. Adaptive bit rate video should be treated the same as any video element in this regard. A content protection system for media elements has been proposed to the W3C HTML WG and is being reviewed. (Put reference here)
This proposal refers to work done in WHATWG regarding three implementation methods and their associated APIs.
This proposal was jointly developed by Microsoft, Google and Netflix. It is comprehensive and is intended to meet the requirements described in this document.
Many thanks to the members of the Media Pipeline Task Force of the W3C Web & TV Interest Group who collaborated to create this requirements document and reviewed the proposals to be submitted to the HTML WG for inclusion in the HTML specification.