This document captures the use cases and requirements for standardizing a solution for responsive images.

Hi! If you find any bugs, typos, or issues, please file a bug on GitHub or email us!

Introduction

To visually communicate effectively, developers require a means to explicitly control which image, from a set of images, is shown to a user in response to the environmental conditions of the user agent. In an [[HTML5]] user agent, environmental conditions are expressed as CSS media features (e.g., max-width, orientation, monochrome) and CSS media types (e.g., print, screen). Many media features are dynamic in nature (e.g., a browser window is resized, or a device is rotated from portrait to landscape), so a user agent constantly responds to events that cause the values of media features to change. These changes in media features and media types can reduce an image's ability to communicate effectively (e.g., images become blurry or pixelated).

As the number and variety of high-density screens have increased (on both mobile and desktop devices), web developers have created custom techniques for serving images that best match a user's browsing environment. For examples of the range of techniques in use in 2012, see Chris Coyier's article "Which responsive images solution should you use?". However, these techniques have a number of shortcomings, which are discussed below; these shortcomings are the motivation for reaching out to the W3C and WHATWG for standardization.

The RICG's goal for this document is to capture a complete set of developer requirements for responsive images, so that developers' requirements are formally recorded. The group intends to deliver the document to the HTMLWG and WHATWG for comment (see bug 17061). The use cases and requirements were gathered in consultation with W3C and WHATWG participants, community group members, and the general public.

The RICG expects that a technical specification (i.e., a solution) can be created to formally address each of the requirements. To date, two such specifications are under development.

Proposed solutions are not mutually exclusive. They may work together to address the complete set of use cases and requirements.

Problems with current techniques

There are a number of problems with the techniques currently being relied on by web developers:

Reliance on semantically neutral elements and CSS backgrounds:

Large images incur unnecessary download and processing time, slowing the experience for users. To solve this problem, web developers provide multiple sources of the same image at different resolutions and then pick the correctly sized image based on the viewport size. This is commonly done for CSS background images in responsive designs, but web developers lack the markup to do so for images in HTML without hacking together a solution. As a result, developers have resorted to using div and other container elements to achieve the desired functionality.

In other words, developers are being forced to work around, or completely ignore, authoring requirements of [[!HTML5]].

Reliance on scripts and server side processing:

Current solutions to this "responsive images" problem rely on either JavaScript libraries or server-side solutions, both of which add unnecessary complexity to the development process.

The RICG believes that standardization of a browser-based solution can overcome these problems.

Use cases

The following use cases represent the techniques currently used by web developers to achieve responsive designs with responsive images.

Art direction

To communicate effectively across the range of screen resolutions and device pixel ratios available on devices today, web developers often need to provide different versions of the same image. Developers do this because, if an image is too small on a screen, the meaning of that image cannot be properly communicated to a user; conversely, if more space is available, a developer may want to show a different image that depicts more information. Another reason developers do this is to avoid images becoming blurry when scaled too much by the browser.

For example, in the following figure it is difficult to discern the man's facial expression in the image on the left when compared to the image in the middle. The image on the far right shows the effects of scaling the image too much, which also affects the image's ability to communicate effectively:

Figure: Three versions of an image of Obama talking seriously on the phone, showing how different zooms and crops convey information about a man talking on the phone.

Typically, the desired communicative effect is achieved by changing the crop of an image so it can be targeted at the features of a particular display (or set of displays).

This is illustrated in the figure below.

Figure: Using different images that have been cropped to fit a particular screen's features can help in communicating a message effectively.

Another related use case is one where orientation determines the source of the image, the crop, and how text flows around the image based on the size of the viewport. For example, on the Nokia Browser site, where the MeeGo browser is described, the Nokia Lumia is shown horizontally on wide screens. As the screen narrows, the Nokia Lumia is shown vertically and cropped. Bryan and Stephanie Rieger, the designers of the site, have explained that on wide screens, showing the full phone horizontally presented the browser best, but on small screens, switching to a vertical image made more sense because it allowed the reader to still make out the features of the browser in the image.
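The crop selection described in this section can be sketched in a few lines. This is only an illustration of the idea, not a proposed solution: the `CropCandidate` shape, the `pickCrop` function, the viewport widths, and the file names are all hypothetical.

```typescript
// Hypothetical description of one art-directed crop (not from any specification).
interface CropCandidate {
  minViewportWidth: number; // viewport width, in CSS pixels, from which this crop is suitable
  url: string;              // placeholder file name
}

// Pick the crop with the largest minViewportWidth that does not exceed the viewport.
// If none fits, fall back to the crop intended for the narrowest viewport.
function pickCrop(candidates: CropCandidate[], viewportWidth: number): CropCandidate {
  // Sort a copy in ascending order so the caller's array is left untouched.
  const sorted = candidates
    .slice()
    .sort((a, b) => a.minViewportWidth - b.minViewportWidth);

  let chosen: CropCandidate | undefined;
  for (const candidate of sorted) {
    if (chosen === undefined || viewportWidth >= candidate.minViewportWidth) {
      chosen = candidate;
    }
  }
  if (chosen === undefined) {
    throw new Error("at least one candidate is required");
  }
  return chosen;
}

// Assumed example data: a tight crop for small screens, wider crops for larger ones.
const crops: CropCandidate[] = [
  { minViewportWidth: 0, url: "face-closeup.jpg" },
  { minViewportWidth: 480, url: "head-and-shoulders.jpg" },
  { minViewportWidth: 1024, url: "full-scene.jpg" },
];
```

With this data, a narrow viewport receives the tight crop that keeps the facial expression legible, while a wide viewport receives the full scene.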

Design breakpoints

In Web development, a breakpoint is one of a series of CSS Media Queries that update the styles of a page based on matching some media features. A single breakpoint represents a logical rule (or set of rules) determining the point at which the contents of a media query are applied to a page’s layout.

Developers currently match specific breakpoints for images to the breakpoints that they have defined in the CSS of their responsive designs. Being able to match the breakpoints ensures that images are operating under the same rules that define the layout of a design. It also helps developers verify their approach by ensuring that the same viewport measurement tests are being used in both HTML and CSS.

If a breakpoint in the design is specified in [[CSS21]] as:

@media screen and (max-width: 41em) {}

Web developers would then like to be able to define a similar breakpoint for images at a max-width of 41em, without having to translate that measurement into another unit such as pixels, even though the equivalent value can be calculated.
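As a rough illustration of the translation developers would prefer to avoid, the sketch below converts an em-based breakpoint to pixels. The `emToPx` helper is hypothetical, and the 16px default is an assumption: it is a common browser default for the initial font size, but users can change it, which is exactly why a hard-coded pixel value can drift away from the CSS breakpoint.

```typescript
// Convert an em length used in a media query to CSS pixels.
// In media queries, em is relative to the initial font size,
// not to the font-size of any particular element.
// ASSUMPTION: 16px is used as the default initial font size.
function emToPx(ems: number, initialFontSizePx: number = 16): number {
  return ems * initialFontSizePx;
}

// With the assumed 16px default, 41em corresponds to 656px.
const defaultBreakpointPx = emToPx(41);

// If the user has raised their default font size to 20px,
// the same 41em breakpoint corresponds to 820px instead.
const largeTextBreakpointPx = emToPx(41, 20);
```

A pixel breakpoint hard-coded as 656px would therefore stop matching the 41em layout breakpoint for the second user.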

When debugging a document, if the developer cannot specify breakpoints for images in the same manner that they are defined for the design, developers will need to convert the breakpoints back to the values specified in the layout in order to verify that they match. This increases authoring time and the likelihood of errors on the part of developers.

Media types

Printed web documents generally have pixelated images due to printers having a higher DPI than most images currently served on the web. According to Wikipedia's article on "Dots per inch":

"An inkjet printer sprays ink through tiny nozzles, and is typically capable of 300-600 DPI. A laser printer applies toner through a controlled electrostatic charge, and may be in the range of 600 to 1,800 DPI."

Defining higher resolution images for printing would increase the abilities of web developers to define printed versions of their documents. For example, a photo sharing site could provide a bandwidth-optimized image for display on screen, but a high-resolution image for print.
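To give a sense of scale, the following sketch estimates how many pixels an image needs across its width to avoid upscaling when printed. The `pixelsForPrint` helper is hypothetical, and treating one printer dot as one image pixel is a simplifying assumption (real printers typically combine several dots to render one pixel), so the results are upper-bound illustrations rather than recommendations.

```typescript
// Estimate the pixel width needed to print an image at a given physical width.
// ASSUMPTION: one printer dot corresponds to one image pixel.
function pixelsForPrint(printedWidthInches: number, dotsPerInch: number): number {
  return Math.ceil(printedWidthInches * dotsPerInch);
}

// CSS defines 1in as 96px, so an image laid out 4 inches wide
// needs only 384 pixels to map one image pixel to one CSS pixel.
const screenWidthPx = pixelsForPrint(4, 96);

// At 300 DPI, the low end of the inkjet range quoted above,
// the same 4-inch image would need 1200 pixels.
const inkjetWidthPx = pixelsForPrint(4, 300);
```

The gap between the two values illustrates why an image sized for screen display tends to look pixelated on paper.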

Monochrome and high-contrast

Displaying a color image on monochrome media is not always ideal (e.g., on paper and e-ink displays). This is because two different colors of similar luminosity are impossible to distinguish on monochrome media. Serving images specifically for monochrome media can help overcome this issue.
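The luminosity problem can be made concrete with a small sketch. It uses the Rec. 601 luma weights, which are one common way of converting color to gray; an actual monochrome device may use a different conversion. The `luma` function and the two sample colors are chosen purely for illustration.

```typescript
// An sRGB color as [red, green, blue], each channel in the range 0 to 255.
type Rgb = [number, number, number];

// Approximate the gray level of a color using Rec. 601 luma weights.
// ASSUMPTION: the target device converts color to gray in roughly this way.
function luma(color: Rgb): number {
  const [r, g, b] = color;
  return 0.299 * r + 0.587 * g + 0.114 * b;
}

// Two clearly different colors on a color display...
const red: Rgb = [255, 0, 0];
const darkGreen: Rgb = [0, 130, 0];

// ...map to gray levels that differ by less than one step out of 255.
const grayDifference = Math.abs(luma(red) - luma(darkGreen));
```

A chart that relies on red versus dark green to distinguish two series would become unreadable under this conversion, which is the kind of case where a purpose-made monochrome image helps.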

Currently, server-side solutions exist to adapt web content to e-ink displays (for example, http://kinstant.com/), and custom services have been created specifically for accessing popular websites, such as www.kindletwit.com.

Additionally, Microsoft is proposing a "high-contrast" media feature, which enables developers to know if the user agent is operating in a high-contrast mode. Knowing if the user agent is operating in high-contrast mode allows developers to serve appropriate images, which could potentially assist visually impaired users. To be able to use such a solution with images on the Web, developers would currently need to rely on the problematic techniques previously discussed.

Mobile-first and desktop-first responsive design

A common approach in sites that cater to a wide range of devices using a single codebase is a “mobile-first” development pattern—starting with a simple, linear layout and increasing layout and functional complexity for larger screen sizes using media queries. In such designs, web developers generally serve appropriately sized images first and increase the size of images as required for the available dimensions.

“Desktop-first” responsive design takes the opposite approach and starts from the desktop design and simplifies it using media queries to support small displays. Developers retrofitting existing sites often take a desktop-first approach out of necessity because changing to a mobile-first approach would be a significant undertaking.

In such cases, providing both a range of images to match the available dimensions, as well as a fallback for legacy user agents, is desirable. Current techniques are problematic in that they rely on scripting and the noscript element to work.
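The difference between the two patterns can be sketched as two rule lists that resolve to the same image. This is a simplified model of how later matching media queries override earlier ones; the `ImageRule` shape, the `resolveImage` function, the breakpoint values, and the file names are all hypothetical, and viewport widths are assumed to be whole CSS pixels.

```typescript
// A simplified media-query rule: one width feature and the image it selects.
interface ImageRule {
  feature: "min-width" | "max-width";
  value: number; // CSS pixels
  url: string;   // placeholder file name
}

// Start from a base image and let each later matching rule override it,
// mirroring how later matching rules win in a stylesheet.
function resolveImage(base: string, rules: ImageRule[], viewportWidth: number): string {
  let url = base;
  for (const rule of rules) {
    const matches =
      rule.feature === "min-width"
        ? viewportWidth >= rule.value
        : viewportWidth <= rule.value;
    if (matches) {
      url = rule.url;
    }
  }
  return url;
}

// Mobile-first: the base is the small image; min-width rules scale up.
const mobileFirst: ImageRule[] = [
  { feature: "min-width", value: 600, url: "medium.jpg" },
  { feature: "min-width", value: 1000, url: "large.jpg" },
];

// Desktop-first: the base is the large image; max-width rules scale down.
const desktopFirst: ImageRule[] = [
  { feature: "max-width", value: 999, url: "medium.jpg" },
  { feature: "max-width", value: 599, url: "small.jpg" },
];
```

Calling `resolveImage("small.jpg", mobileFirst, width)` and `resolveImage("large.jpg", desktopFirst, width)` yields the same image for any whole-pixel width; only the base image and the direction of the overrides differ.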

Relative units

A common practice in creating flexible layouts is to specify the size values in media queries as relative units: em, rem, vw, vh, and so on. See, for example, Lyza Gardner's article "The EMs have it: Proportional Media Queries FTW!". This approach is most commonly seen using ems in order to reflow layouts based on users’ zoom preferences, or to resize elements through JavaScript by dynamically altering a font-size value.

In flexible layout designs, when a user zooms into a design, images are scaled proportionally and can become blurry or pixelated, potentially affecting the image's ability to communicate as the developer intended (or simply looking ugly, which is also unacceptable). Swapping to a more suitable image is used to overcome this problem.
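A simple way to reason about when a swap is needed is to compare the pixels an image has with the pixels the display will ask of it. The sketch below assumes zoom acts as a plain multiplier on the layout width; real user agents expose and apply zoom in different ways, so `isUpscaled` and its parameters are illustrative only.

```typescript
// Report whether an image would be stretched beyond its intrinsic width.
// ASSUMPTION: zoom and pixel ratio both act as simple multipliers.
function isUpscaled(
  intrinsicWidth: number, // pixels in the image file
  layoutWidth: number,    // width of the image in CSS pixels at 100% zoom
  zoom: number,           // 1 = 100%, 2 = 200%, ...
  pixelRatio: number = 1  // device pixels per CSS pixel
): boolean {
  const devicePixelsNeeded = layoutWidth * zoom * pixelRatio;
  return devicePixelsNeeded > intrinsicWidth;
}

// A 400-pixel image laid out at 400 CSS pixels is sharp at 100% zoom...
const sharpAtDefaultZoom = !isUpscaled(400, 400, 1);

// ...but at 200% zoom it must cover 800 device pixels and is stretched.
const stretchedWhenZoomed = isUpscaled(400, 400, 2);
```

When the check reports upscaling, swapping to a larger source (800 pixels wide in this example) would avoid the blur.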

Dynamically acquired image data

There are cases where the image data may be dynamically generated (e.g., using the canvas element and related APIs) or may be acquired from the device itself (e.g., from the camera). In such cases, a practical means to interface programmatically with an image or a set of images is often necessary. Without a suitable API, it will be difficult for developers to manipulate the sources of images.

Requirements

The use cases give rise to the following requirements:

Open issues

We are tracking open issues on GitHub. Please help close them!

Acknowledgements

We would like to thank the following people for reviewing the specification: Mike Taylor, Doug Shults.