This document captures the use cases and requirements for standardizing a solution for responsive images.
Hi! If you find any bugs, typos, or other issues, please file a bug on GitHub or email us!
To communicate effectively through images, developers require a means to explicitly control which image, from a set of images, is shown to a user in response to the environmental conditions of the user agent. In an [[HTML5]] user agent, environmental conditions are expressed as CSS media features (e.g., max-width, orientation, monochrome) and CSS media types (e.g., print, screen). Many media features are dynamic in nature (e.g., a browser window is resized, a device is rotated from portrait to landscape, and so on), so a user agent constantly responds to events that cause the properties of media features to change. These changes in media features and media type can reduce an image's ability to communicate effectively (e.g., images become blurry or pixelated).
As the number and variety of high-density screens have increased (both on mobile and desktop devices), web developers have created custom techniques for serving images that best match a user's browsing environment. For a list of examples of the range of techniques in use in 2012, see Chris Coyier's article "Which responsive images solution should you use?". However, these techniques have a number of shortcomings, which are discussed below. These shortcomings are the motivation for reaching out to the W3C and WHATWG for standardization.
The RICG's goal for this document is to make sure that developers' requirements for responsive images have been formally and completely captured. The group intends to deliver the document to the HTMLWG and WHATWG for comment (see bug 17061). The use cases and requirements were gathered by consultation with W3C and WHATWG participants, community group members, and the general public.
The RICG expects that a technical specification can be created to formally address each of the requirements (i.e., the solution). To date, two such specifications are currently under development:
Proposed solutions are not mutually exclusive. They may work together to address the complete set of use cases and requirements.
There are a number of problems with the techniques currently being relied on by web developers:
Large images incur unnecessary download and processing time, slowing the experience for users. To solve this problem, web developers provide multiple sources of the same image at different resolutions and then pick the correctly sized image based on the viewport size. This is commonly done for CSS background images in responsive designs, but web developers lack the markup to do so for images in HTML without hacking together a solution. As a result, developers have resorted to using div and other container elements to achieve the desired functionality.
In other words, developers are being forced to work around, or completely ignore, authoring requirements of [[!HTML5]].
Current solutions to this "responsive images" problem rely on either JavaScript libraries or server-side solutions, both of which add unnecessary complexity to the development process.
The RICG believes that standardization of a browser-based solution can overcome these problems.
The following use cases represent the techniques currently used by web developers to achieve responsive designs with responsive images.
To communicate effectively across the range of screen resolutions and device pixel ratios available on devices today, web developers often need to provide different versions of the same image. Developers do this because if an image is too small on a screen, the meaning of that image cannot be properly communicated to a user; conversely, if more space is available, a developer may want to show a different image that depicts more information. Another reason developers do this is to avoid images becoming blurry when scaled too much by the browser.
For example, in the following figure it is difficult to discern the man's facial expressions in the image on the left when compared to the image in the middle. The image on the far right shows the effects of scaling the image too much, which also affects the image's ability to communicate effectively.
Typically, the desired communicative effect is achieved by changing the crop of an image so it can be targeted at the features of a particular display (or set of displays).
This is illustrated in the figure below.
Another related use case is one where orientation determines the source of the image, the crop, and how text flows around the image based on the size of the viewport. For example, on the Nokia Browser site, where the MeeGo browser is described, the Nokia Lumia is shown horizontally on wide screens. As the screen narrows, the Nokia Lumia is shown vertically and cropped. Bryan and Stephanie Rieger, the designers of the site, have explained that on a wide screen, showing the full phone horizontally showed the browser best, but on small screens, changing the image to vertical made more sense because it allowed the reader to still make out the features of the browser in the image.
In Web development, a breakpoint is one of a series of CSS Media Queries that update the styles of a page based on matching some media features. A single breakpoint represents a logical rule (or set of rules) determining the point at which the contents of a media query are applied to a page’s layout.
Developers currently match specific breakpoints for images to the breakpoints that they have defined in the CSS of their responsive designs. Being able to match the breakpoints ensures that images are operating under the same rules that define the layout of a design. It also helps developers verify their approach by ensuring that the same viewport measurement tests are being used in both HTML and CSS.
If a breakpoint in the design is specified in [[CSS21]] as:
@media screen and (max-width: 41em) {}
Then web developers would like to be able to define a similar breakpoint for images at a max-width of 41em, and not have to translate that measurement into another unit like pixels, even if it is possible to calculate that measurement.
When debugging a document, if the developer cannot specify breakpoints for images in the same manner that they are defined for the design, developers will need to convert the breakpoints back to the values specified in the layout in order to verify that they match. This increases authoring time and the likelihood of errors on the part of developers.
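The conversion that developers would otherwise perform by hand can be sketched as follows. This is a minimal illustration, not part of any proposed solution: the helper names are hypothetical, and the base font size of 16 pixels is an assumption (a common browser default that users can change, which is one reason a hard-coded pixel value can drift away from the em-based layout breakpoint).

```typescript
// Hypothetical helpers for illustration only.
// baseFontSizePx is an assumption: 16 is a common default, but it is user-configurable.
function emBreakpointToPx(em: number, baseFontSizePx: number = 16): number {
  return em * baseFontSizePx;
}

// The reverse step a developer performs when verifying an image breakpoint
// against the breakpoint written in the stylesheet.
function pxBreakpointToEm(px: number, baseFontSizePx: number = 16): number {
  return px / baseFontSizePx;
}

const layoutBreakpointPx = emBreakpointToPx(41);     // 41 * 16 = 656
const verifiedBreakpointEm = pxBreakpointToEm(656);  // 656 / 16 = 41
const largerFontBreakpointPx = emBreakpointToPx(41, 20); // 41 * 20 = 820
```

With a base font size of 20 pixels the same 41em breakpoint corresponds to 820 pixels rather than 656, so an image breakpoint hard-coded in pixels would no longer line up with the layout.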
Printed web documents generally have pixelated images due to printers having a higher DPI than most images currently served on the web. According to Wikipedia's article on "Dots per inch":
"An inkjet printer sprays ink through tiny nozzles, and is typically capable of 300-600 DPI. A laser printer applies toner through a controlled electrostatic charge, and may be in the range of 600 to 1,800 DPI."
Defining higher resolution images for printing would increase the abilities of web developers to define printed versions of their documents. For example, a photo sharing site could provide a bandwidth-optimized image for display on screen, but a high-resolution image for print.
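To give a sense of scale, the following sketch estimates how many image pixels are needed to fill a given printed width. The function name and the 6 inch width are hypothetical; 96 pixels per inch is the CSS reference density (1in = 96px), and 300 DPI is the lower bound for inkjet printers in the quotation above.

```typescript
// Hypothetical helper: number of image pixels needed to cover a physical
// width at a given output density, rounded up to a whole pixel.
function pixelsForWidth(widthInches: number, dotsPerInch: number): number {
  return Math.ceil(widthInches * dotsPerInch);
}

const screenWidthPx = pixelsForWidth(6, 96);   // 6 * 96  = 576
const printWidthPx = pixelsForWidth(6, 300);   // 6 * 300 = 1800
```

Under these assumptions, an image sized for the screen carries roughly a third of the horizontal detail a 300 DPI printer can reproduce, which is why it appears pixelated on paper.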
Displaying a color image on monochrome media is not always ideal (e.g., on paper and e-ink displays). This is because two different colors of similar luminosity are impossible to distinguish on monochrome media. Serving images specifically for monochrome media can help overcome this issue.
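The luminosity problem can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than a description of how any device converts color to grayscale: it uses the sRGB relative luminance formula (Rec. 709 coefficients), and the two sample colors were picked for this example.

```typescript
// Converts one 8-bit sRGB channel (0-255) to linear light in the 0-1 range.
function linearizeChannel(value: number): number {
  const c = value / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of an sRGB color, using the Rec. 709 coefficients.
function relativeLuminance(r: number, g: number, b: number): number {
  return (
    0.2126 * linearizeChannel(r) +
    0.7152 * linearizeChannel(g) +
    0.0722 * linearizeChannel(b)
  );
}

// Two clearly different colors chosen for this example: pure red and a mid green.
const redLuminance = relativeLuminance(255, 0, 0);
const greenLuminance = relativeLuminance(0, 148, 0);
const luminanceGap = Math.abs(redLuminance - greenLuminance);
```

Both colors have a relative luminance of roughly 0.21, so a red label on this green background would be easy to read in color but nearly invisible once only luminance is shown.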
Currently, server-side solutions exist to adapt web content to e-ink displays (for example, http://kinstant.com/), and custom services have been created specifically for accessing popular websites (for example, www.kindletwit.com).
Additionally, Microsoft is proposing a "high-contrast" media feature, which enables developers to know if the user agent is operating in a high-contrast mode. Knowing if the user agent is operating in high-contrast mode allows developers to serve appropriate images, which could potentially assist visually impaired users. To be able to use such a solution with images on the Web, developers would currently need to rely on the problematic techniques previously discussed.
A common approach in sites that cater to a wide range of devices using a single codebase is a “mobile-first” development pattern—starting with a simple, linear layout and increasing layout and functional complexity for larger screen sizes using media queries. In such designs, web developers generally serve appropriately sized images first and increase the size of images as required for the available dimensions.
“Desktop-first” responsive design takes the opposite approach and starts from the desktop design and simplifies it using media queries to support small displays. Developers retrofitting existing sites often take a desktop-first approach out of necessity because changing to a mobile-first approach would be a significant undertaking.
In such cases, providing both a range of images to match the available dimensions, as well as a fallback for legacy user agents, is desirable. Current techniques are problematic in that they rely on scripting and the noscript element to work.
A common practice in creating flexible layouts is to specify the size values in media queries as relative units: em, rem, vw/vh, etc. See, for example, Lyza Gardner's article "The EMs have it: Proportional Media Queries FTW!". This approach is most commonly seen using ems in order to reflow layouts based on users’ zoom preferences, or to resize elements through JavaScript by dynamically altering a font-size value.
In flexible layout designs, when a user zooms into a design, images get proportionally scaled and can become blurry or pixelated, potentially affecting the image's ability to communicate as the developer intended (or simply look ugly, which is also unacceptable). Swapping to a more suitable image is used to overcome this problem.
There are cases where the image data may be dynamically generated (e.g., using the canvas element and related APIs) or may be acquired from the device itself (e.g., from the camera). In such cases, a practical means to interface programmatically with an image or a set of images is often necessary. Without a suitable API, it will be difficult for developers to manipulate the sources of images.
The use cases give rise to the following requirements:
The solution MUST afford developers the ability to match image sources with particular media features and/or media types, and have the user agent update the source of an image as the media features and media types of the browser environment change dynamically over time.
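As a rough illustration of this requirement, the sketch below models a set of image sources, each tied to a media type and/or an orientation, and re-runs the selection when the environment changes. Every name here (the types, the function, the file names) is hypothetical and chosen for this example; it does not describe the syntax or algorithm of any proposed solution.

```typescript
// Hypothetical types for illustration; none of these names come from a specification.
type MediaType = "screen" | "print";
type Orientation = "portrait" | "landscape";

interface Environment {
  mediaType: MediaType;
  orientation: Orientation;
}

interface ConditionalSource {
  src: string;
  mediaType?: MediaType;     // matches any media type when omitted
  orientation?: Orientation; // matches any orientation when omitted
}

// Returns the first source whose conditions all hold, or the fallback source.
function selectSource(
  sources: ConditionalSource[],
  env: Environment,
  fallbackSrc: string
): string {
  for (const source of sources) {
    const typeMatches =
      source.mediaType === undefined || source.mediaType === env.mediaType;
    const orientationMatches =
      source.orientation === undefined || source.orientation === env.orientation;
    if (typeMatches && orientationMatches) {
      return source.src;
    }
  }
  return fallbackSrc;
}

const conditionalSources: ConditionalSource[] = [
  { src: "photo-print.jpg", mediaType: "print" },
  { src: "photo-portrait.jpg", mediaType: "screen", orientation: "portrait" },
];

// The same selection is repeated whenever the environment changes,
// for example when a device is rotated or the page is printed.
const onLandscapeScreen = selectSource(
  conditionalSources,
  { mediaType: "screen", orientation: "landscape" },
  "photo.jpg"
);
const onPortraitScreen = selectSource(
  conditionalSources,
  { mediaType: "screen", orientation: "portrait" },
  "photo.jpg"
);
const whenPrinting = selectSource(
  conditionalSources,
  { mediaType: "print", orientation: "portrait" },
  "photo.jpg"
);
```

In this sketch the first matching source wins, so the order of the list expresses the developer's priority; a real solution could define a different resolution order.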
The solution MUST degrade gracefully on legacy user agents by, for example, falling back on the img element and by relying on [[HTML5]] built-in fallback mechanisms.
The solution MUST allow developers to provide textual descriptions of text content for images using [[HTML5]] markup. This helps overcome some of the limitations inherent in the image element's alt attribute, particularly relating to internationalization.
The solution MUST be consistent with similar solutions already in [[HTML5]], such as those used by the audio and video elements.
The solution MUST NOT require any server-side processing.
The solution MUST provide developers with a means to programmatically interface with the displayed image, as well as access relevant attributes and methods that make the solution easy to work with. In addition, the solution MUST provide a means to hook into relevant events (e.g., loading and errors). In any case, an API SHOULD provide a means to:
Determine the current source of the image.
Determine what environmental condition caused the current source to be selected (reflected as, for example, a CSS Media Query).
Add, remove, and update image sources.
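One possible shape for such an API is sketched below as a small in-memory model. It is only an illustration of the three capabilities listed above plus a change hook: the class, its members, and the file names are all hypothetical, and the matches callback stands in for the user agent's own media query evaluation.

```typescript
// Hypothetical in-memory model; none of these names come from a specification.
interface SourceEntry {
  src: string;
  media: string; // a media query string, e.g. "(min-width: 40em)"
}

class ResponsiveImageModel {
  private fallbackSrc: string;
  private entries: SourceEntry[] = [];
  private current: SourceEntry | null = null;
  private listeners: Array<(src: string) => void> = [];

  constructor(fallbackSrc: string) {
    this.fallbackSrc = fallbackSrc;
  }

  // Determine the current source of the image.
  get currentSrc(): string {
    return this.current ? this.current.src : this.fallbackSrc;
  }

  // Determine which media query caused the current source to be selected.
  get matchedMedia(): string | null {
    return this.current ? this.current.media : null;
  }

  // Add, remove, and update image sources. Callers re-run evaluate() afterwards.
  addSource(entry: SourceEntry): void {
    this.entries.push(entry);
  }

  removeSource(src: string): void {
    this.entries = this.entries.filter((e) => e.src !== src);
  }

  updateSource(src: string, media: string): void {
    this.entries = this.entries.map((e) => (e.src === src ? { src, media } : e));
  }

  // Hook for "source changed" events.
  onChange(listener: (src: string) => void): void {
    this.listeners.push(listener);
  }

  // Re-selects the source; `matches` stands in for the user agent's
  // media query evaluation. Listeners run only when the source changes.
  evaluate(matches: (media: string) => boolean): void {
    const previousSrc = this.currentSrc;
    this.current = this.entries.find((e) => matches(e.media)) ?? null;
    if (this.currentSrc !== previousSrc) {
      this.listeners.forEach((listener) => listener(this.currentSrc));
    }
  }
}

const imageModel = new ResponsiveImageModel("small.jpg");
imageModel.addSource({ src: "large.jpg", media: "(min-width: 40em)" });

const observedSources: string[] = [];
imageModel.onChange((src) => {
  observedSources.push(src);
});

// Pretend the environment currently satisfies the wide-viewport query.
imageModel.evaluate((media) => media === "(min-width: 40em)");
```

After the final call, the model reports large.jpg as the current source and "(min-width: 40em)" as the condition that selected it. When no query matches, it falls back to small.jpg and reports no matched condition.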
The solution MUST afford developers the ability to explicitly define different image versions as opposed to simply different resolutions of the same image.
The solution MUST afford developers the ability to define the fallback image as the smallest image (mobile first) or the largest image (desktop first).
The solution MUST afford developers the ability to define the breakpoints for images as either minimum values (mobile first) or maximum values (desktop first) to match the media queries used in their design.
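The difference between the two approaches can be sketched as two selection functions over the same kind of breakpoint list. This is an illustrative model only: the function names, the Breakpoint type, and the file names are hypothetical, and viewport widths are assumed to already be expressed in em.

```typescript
// Hypothetical breakpoint record for illustration.
interface Breakpoint {
  src: string;
  widthEm: number;
}

// Mobile first: the fallback is the smallest image and each breakpoint is a
// minimum width. The largest minimum that the viewport satisfies wins.
function selectMobileFirst(
  fallbackSrc: string,
  breakpoints: Breakpoint[],
  viewportEm: number
): string {
  let chosen = fallbackSrc;
  const ascending = [...breakpoints].sort((a, b) => a.widthEm - b.widthEm);
  for (const bp of ascending) {
    if (viewportEm >= bp.widthEm) {
      chosen = bp.src;
    }
  }
  return chosen;
}

// Desktop first: the fallback is the largest image and each breakpoint is a
// maximum width. The smallest maximum that the viewport satisfies wins.
function selectDesktopFirst(
  fallbackSrc: string,
  breakpoints: Breakpoint[],
  viewportEm: number
): string {
  let chosen = fallbackSrc;
  const descending = [...breakpoints].sort((a, b) => b.widthEm - a.widthEm);
  for (const bp of descending) {
    if (viewportEm <= bp.widthEm) {
      chosen = bp.src;
    }
  }
  return chosen;
}

const minWidthBreakpoints: Breakpoint[] = [
  { src: "medium.jpg", widthEm: 30 },
  { src: "large.jpg", widthEm: 60 },
];
const maxWidthBreakpoints: Breakpoint[] = [
  { src: "medium.jpg", widthEm: 60 },
  { src: "small.jpg", widthEm: 30 },
];

// A 45em viewport selects the medium image under both approaches.
const mobileFirstChoice = selectMobileFirst("small.jpg", minWidthBreakpoints, 45);
const desktopFirstChoice = selectDesktopFirst("large.jpg", maxWidthBreakpoints, 45);
```

Both functions produce the same image for a given viewport; what differs is which image a legacy user agent receives as the fallback and whether the developer writes min-width or max-width conditions.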
The solution MUST function in such a way that it is responsive to environmental changes in relative units (e.g., when the user increases the base font size of the browser by pressing ctrl+ or ctrl-).
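The effect of a font size change on an em-based breakpoint can be sketched as follows. The function name and the sample numbers are hypothetical; the sketch assumes that an em in a media query is measured against the browser's base font size, so the viewport width in em shrinks as that size grows.

```typescript
// Hypothetical check of a max-width breakpoint written in em.
// baseFontSizePx models the browser's base font size, which the user can change.
function matchesMaxWidthEm(
  viewportWidthPx: number,
  baseFontSizePx: number,
  maxWidthEm: number
): boolean {
  return viewportWidthPx / baseFontSizePx <= maxWidthEm;
}

// A 700px viewport measured against a 41em breakpoint:
const matchesAtDefaultSize = matchesMaxWidthEm(700, 16, 41); // 700 / 16 = 43.75em, no match
const matchesAtLargerSize = matchesMaxWidthEm(700, 20, 41);  // 700 / 20 = 35em, match
```

The viewport has not changed size in pixels, yet the breakpoint starts to match once the base font size increases, so a solution that only watched pixel dimensions would keep showing the wrong image.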
To provide compatibility with legacy user agents, it SHOULD be possible for developers to polyfill the solution.
We are tracking open issues on GitHub. Please help close them!
We would like to thank the following people for reviewing the specification: Mike Taylor, Doug Shults.