This document defines directives for the Content Security Policy mechanism that declare a set of input protections for a web resource's user interface, a non-normative set of heuristics that Web user agents can use to implement these input protections, and a reporting mechanism for when they are triggered.
This is a Working Draft of the User Interface Security Directives for Content Security Policy. [[!CSP]]
Portions of the technology described in this document were originally
developed as part of X-Frame-Options
[[!RFC7034]], the ClearClick
module of the Mozilla Firefox add-on NoScript, [[CLEARCLICK]] and in the InContext
system implemented experimentally in Internet Explorer [[INCONTEXT]].
In addition to the documents in the W3C Web Application Security working group, the work on this document is also informed by the work of the IETF websec working group, particularly that working group's requirements document: draft-hodges-websec-framework-reqs.
This document defines User Interface Security directives for Content Security Policy, a mechanism web applications can use to mitigate some of the risks of User Interface (UI) Redressing [[UIREDRESS]] (AKA "Clickjacking") vulnerabilities that can lead to fraudulent actions not intended by the user.
Content Security Policy (CSP) is a declarative policy that lets the authors (or server administrators) of a web application restrict the behavior of a document, e.g. the origins where it can load its resources from or the ways it can execute scripts. This document defines directives to restrict the presentation or the interactivity of a resource when its interaction with the user may be happening in an ambiguous or deceitful context due to the spatial and/or temporal contiguity with other content displayed by the user agent.
A user agent may implement the core directives of CSP independently from the
directives in this specification, but this specification requires the policy
conveyance and reporting mechanisms described in CSP. The interpretation of
terms imported into this document from CSP may vary depending on the version
implemented by the user agent. For example, a source-expression
in Content Security Policy 1.0 is at the granularity of an origin
[[!ORIGIN]] but may be more granular in future versions of the core Content
Security Policy.
Application authors SHOULD transmit the directives in this specification as part of a single, complete Content Security Policy, as indicated by that specification.
In some UI Redressing attacks (also known as Clickjacking), a malicious web application presents the user interface of another web application to the user in a manipulated context, e.g. by partially obscuring the genuine user interface with opaque layers on top, thereby tricking the user into clicking a button out of context.
Existing anti-clickjacking measures, including frame-busting [[FRAMEBUSTING]]
code and X-Frame-Options,
cannot be used to protect resources where
the set of origins that should be allowed and disallowed is unknown or where
attacks might come from origins intended to be allowed by a use scenario, nor can they
defend against timing-based attacks involving multiple windows instead of multiple
frames. Frame-busting scripts also rely on browser behavior that has not been
engineered to provide a security guarantee. As a consequence, such scripts may
be unreliable if loaded inside a sandbox or otherwise disabled.
The User Interface Security directives encompass the policies defined in
X-Frame-Options
and also provide a new mechanism that allows web
applications to enable heuristic input protections for their user interfaces on
user agents.
To mitigate UI redressing, for example, a web application can request that a user interface element should be fully visible for a minimum period of time before a user input can be delivered.
The User Interface Security directive can often be applied to existing applications with few or no changes, but the heuristic hints supplied by the policy may require considerable experimental fine-tuning to achieve an acceptable error rate.
This specification supersedes X-Frame-Options. Resources may supply an X-Frame-Options header in addition to a Content-Security-Policy header to indicate policy to user agents that do not implement the directives in this specification. A user agent that understands the directives in this document SHOULD ignore the X-Frame-Options header, when present, if User Interface Security directives are also present in a Content-Security-Policy header. This allows a resource to be embedded only where the mechanisms described in this specification are enforced, and to fall back to a more restrictive X-Frame-Options policy otherwise.
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("MUST", "SHOULD", "MAY", etc) used in introducing the algorithm.
A conformant user agent is one that implements all the requirements listed in this specification that are applicable to user agents. Treatment of the input-protection, input-protection-clip and input-protection-selectors directives is at the discretion of the user agent.
A conformant server is one that implements all the requirements listed in this specification that are applicable to servers.
This section defines several terms used throughout the document.
The term security policy, or simply policy, for the purposes of this specification refers to either:
The security policies defined by this document are applied by a user agent on a per-resource representation basis. Specifically, when a user agent receives a policy along with the representation of a given resource, that policy applies to that resource representation only. This document often refers to that resource representation as the protected resource.
A server transmits its security policy for a particular resource as a collection of directives, such as default-src 'self'
, each of which controls a specific set of privileges for a document rendered by the user agent. More details are provided in the directives section.
A directive consists of a directive name, which indicates the privileges controlled by the directive, and a directive value, which specifies the restrictions the policy imposes on those privileges.
An ancestor is any resource between the protected resource and the top of the window frame tree; for example, if A embeds B which embeds C, both A and B are ancestors of C. If A embeds both B and C, B is not an ancestor of C, but A still is.
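The ancestor definition above can be illustrated with a small sketch. It assumes a hypothetical representation in which each resource is mapped to the resource that embeds it; the helper name ancestorsOf and the Map-based frame tree are illustrative only and are not defined by this specification.

```javascript
// Hypothetical model: embedderOf maps each resource to the resource that embeds it.
// The top-level resource has no entry in the map.
function ancestorsOf(resource, embedderOf) {
  const ancestors = [];
  let current = embedderOf.get(resource);
  // Walk up the frame tree until the top of the window is reached.
  while (current !== undefined) {
    ancestors.push(current);
    current = embedderOf.get(current);
  }
  return ancestors;
}

// A embeds B which embeds C: both B and A are ancestors of C.
const chain = new Map([["B", "A"], ["C", "B"]]);
console.log(ancestorsOf("C", chain));

// A embeds both B and C: only A is an ancestor of C.
const siblings = new Map([["B", "A"], ["C", "A"]]);
console.log(ancestorsOf("C", siblings));
```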
The term origin is defined in the Origin specification. [[!ORIGIN]]
The term URI is defined in the URI specification. [[!URI]]
The <iframe>, <object>, <embed>, and <frame> elements are defined in the HTML5 standard. [[!HTML5]]
The <applet> element is defined in the HTML 4.01 standard. [[!HTML401]]
The Augmented Backus-Naur Form (ABNF) notation used in this document is specified in RFC 5234. [[!ABNF]]
The following core rules are included by reference, as defined in [ABNF Appendix
B.1]: ALPHA
(letters), DIGIT
(decimal 0-9),
WSP
(white space) and VCHAR
(printing characters).
The OWS rule is used where zero or more linear whitespace octets might appear. OWS SHOULD either not be produced or be produced as a single SP. Multiple OWS octets that occur within field-content SHOULD either be replaced with a single SP or transformed to all SP octets (each octet other than SP replaced with SP) before interpreting the field value or forwarding the message downstream.
OWS      = *( SP / HTAB / obs-fold )  ; "optional" whitespace
obs-fold = CRLF ( SP / HTAB )         ; obsolete line folding
A selector string is a list of one or more complex selectors (see [[SELECTORS4]], section 3.1) that may be surrounded by whitespace and matches the dom_selectors_group production.
dom_selectors_group : S* [ selectors_group ] S* ;
An embedding source list follows the ABNF and parsing rules defined for source-list (see [[!CSP]] section 3.22) with the following new productions:
embedding-keyword-source    = "'self'" / "'deny'"
embedding-source-expression = host-source / embedding-keyword-source
embedding-source-list       = *WSP [ embedding-source-expression *( 1*WSP embedding-source-expression ) *WSP ]
This section describes the content security policy directives introduced in this specification.
input-protection
The input-protection
directive, if present or implied, instructs the user
agent to apply the heuristic UI redressing protections described in the Input Protection Heuristic section to user input events, such as click
,
keypress
, touch
, and drag
, before they
are delivered to the resource.
The screenshot comparison heuristic, in particular, uses the body-bounding rectangle
of the document triggering the event as its default reference area,
or the rectangle defined by the input-protection-clip
and by the input-protection-selectors
directives if any of those is explicitly set.
If the input-protection-clip
directive is set as part of a Content-Security-Policy
, triggering of
the heuristic should cancel delivery of the UI event to the target and
cause a violation report to be sent. If set as part of a
Content-Security-Policy-Report-Only
, triggering of the heuristic
should result in the event being delivered with the unsafe
attribute on the UIEvent
set to true
and cause a violation report to be sent.
The optional directive value allows resource authors to provide options for heuristic tuning
in the form of space-separated option-name=option-value
pairs.
directive-name  = "input-protection"
directive-value = ["display-time=" num-val] ["tolerance=" num-val]
If the policy does not contain a value for this directive or any of the hint name=value pairs are absent, the user agent SHOULD apply default values for hints as described in the following.
display-time
tolerance
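As an illustration of the option syntax above, the following is a minimal sketch of how a directive value might be split into hint name=value pairs. The helper name parseInputProtectionValue is hypothetical, and two behaviors are assumptions of this sketch rather than requirements of this document: num-val is treated as a non-negative decimal number, and unknown or malformed pairs are silently skipped. Defaults for absent hints are deliberately left to the caller.

```javascript
// Hypothetical helper: parse e.g. "display-time=800 tolerance=15" into an object.
// Only the hints named in the grammar above are recognized.
function parseInputProtectionValue(value) {
  const allowed = new Set(["display-time", "tolerance"]);
  const options = {};
  for (const token of value.trim().split(/\s+/)) {
    if (token === "") continue; // empty directive value
    const eq = token.indexOf("=");
    if (eq < 0) continue; // not a name=value pair
    const name = token.slice(0, eq);
    const raw = token.slice(eq + 1);
    // Assumption: num-val is a non-negative decimal number.
    if (!allowed.has(name) || !/^\d+(\.\d+)?$/.test(raw)) continue;
    options[name] = Number(raw);
  }
  return options;
}

console.log(parseInputProtectionValue("display-time=800 tolerance=15"));
```

Hints missing from the returned object are the ones for which the user agent would fall back to its default values.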
input-protection-clip
The input-protection-clip
directive defines a rectangular screen area
whose intersection with the bounding rectangle of the whole document's body should be used as the reference area in
the screenshot comparison check explained in the Input Protection Heuristic section.
If the input-protection-clip
directive is not explicitly set in a policy
which includes the input-protection
directive
and no input-protection-selectors
directive is set either,
the bounding rectangle of the whole document's body should be used for screenshot comparisons.
If explicitly set as part of a policy where no input-protection
directive is explicitly set, the input-protection-clip
directive
implies the input-protection
directive as if it was set in the same policy with its default value.
directive-name  = "input-protection-clip"
directive-value = ["before=" num-val] ["above=" num-val] ["after=" num-val] ["below=" num-val]
The optional directive value can include up to four non-negative numeric labeled offsets,
expressed in CSS pixels and relative to the screen coordinates of the UI event being processed
(event.screenX
and event.screenY
for mouse, touch or pointer events) or, if not applicable (e.g. for keyboard events),
to the geometrical center of the event target in screen coordinates.
These offsets define a rectangle with
x = eX - left, y = eY - top, width = left + right, height = top + bottom
where eX and eY are the event's explicit (when possible) or inferred (the target's center) screen coordinates.
The left
, top
, right
and bottom
values are mapped to the offsets labeled as
before
, above
, after
and below
respectively, unless the bi-directional text properties of the event target suggest otherwise: for instance,
if the target's direction is RTL, before
translates to right
and after
translates to left
.
The default value for this directive is before=250 above=250 after=50 below=50
. If a partial value is provided (i.e. any offset has been omitted) the default values should be implied for the missing offsets.
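The rectangle computation described above can be sketched as follows. The function name clipRectangle and the direction parameter are hypothetical; the default offsets and the RTL mapping are taken from the text above. This is only an illustration of the arithmetic, not a normative algorithm.

```javascript
// Default offsets stated for input-protection-clip, in CSS pixels.
const DEFAULT_CLIP = { before: 250, above: 250, after: 50, below: 50 };

// eX, eY: screen coordinates of the event (or of the target's center).
// offsets: any subset of before/above/after/below; missing ones use the defaults.
// direction: "ltr" or "rtl" (hypothetical parameter standing in for the
// bi-directional text properties of the event target).
function clipRectangle(eX, eY, offsets = {}, direction = "ltr") {
  const o = { ...DEFAULT_CLIP, ...offsets };
  const rtl = direction === "rtl";
  // In RTL, "before" maps to the right side and "after" to the left side.
  const left = rtl ? o.after : o.before;
  const right = rtl ? o.before : o.after;
  const top = o.above;
  const bottom = o.below;
  return { x: eX - left, y: eY - top, width: left + right, height: top + bottom };
}

console.log(clipRectangle(300, 300));
console.log(clipRectangle(300, 300, {}, "rtl"));
```

With the default offsets the rectangle is always 300 by 300 CSS pixels; only its position relative to the event changes with the text direction.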
The intersection of the computed rectangle with the bounding rectangle of the document's body should be used as the reference area for the screenshot comparison check explained in the Input Protection Heuristic section, unless the UI event's target or one of its DOM ancestors matches an input-protection-selectors directive set in the same policy.
If the input-protection-clip
directive is not set or provides an invalid value, the whole bounding rectangle of the document's body must be used as the reference area for the screenshot comparison, unless an input-protection-selectors
directive is set in the same policy.
input-protection-selectors
The input-protection-selectors directive overrides the implicit or explicit input-protection-clip value when the processed UI event target or one of its DOM ancestors matches the dom_selectors_group selector string provided as the directive's mandatory value. In this case, the reference area used for screenshot comparison is the bounding box of the event target itself, if it matches the selectors, or otherwise the bounding box of its nearest matching DOM ancestor, augmented by the margins given by the optional leading labeled offsets, if any.
UI events whose target and ancestors don't match any of the specified selectors should be ignored (not blocked)
unless an input-protection-clip
directive is explicitly included in the policy:
if this is the case, the UI event must be checked and the screenshot reference area
should be computed using the input-protection-clip
directive.
If set as part of a policy where no input-protection
directive is explicitly set, the input-protection-selectors
directive
implies the input-protection
directive as if it was set in the same policy with its default value.
directive-name  = "input-protection-selectors"
directive-value = ["before=" num-value] ["after=" num-value] ["above=" num-value] ["below=" num-value] dom_selectors_group
Any of the four non-negative numeric labeled offsets, which represent margins expressed in CSS pixels, may be omitted, taking 0 (zero) as their default values.
The reference screenshot area is computed as the rectangle having
x = match.x - left, y = match.y - top, width = left + match.width + right, height = top + match.height + bottom
where match is the bounding rectangle around the UI event target, if it matches dom_selectors_group, or around its nearest matching ancestor. The
left
, top
, right
and bottom
values
are mapped to the offsets labeled as
before
, above
, after
and below
respectively, unless the bi-directional text properties of the event target suggest otherwise: for instance,
if the target's direction is RTL, before
translates to right
and after
translates to left
(similarly to the input-protection-clip
directive).
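The margin arithmetic for input-protection-selectors can be sketched in the same way. The function name selectorReferenceRect and the direction parameter are hypothetical; margins default to 0 as stated above.

```javascript
// match: bounding rectangle { x, y, width, height } of the matching element.
// margins: any subset of before/above/after/below, defaulting to 0.
// direction: "ltr" or "rtl" (hypothetical stand-in for the target's text direction).
function selectorReferenceRect(match, margins = {}, direction = "ltr") {
  const m = { before: 0, above: 0, after: 0, below: 0, ...margins };
  const rtl = direction === "rtl";
  // In RTL, "before" maps to the right side and "after" to the left side.
  const left = rtl ? m.after : m.before;
  const right = rtl ? m.before : m.after;
  return {
    x: match.x - left,
    y: match.y - m.above,
    width: left + match.width + right,
    height: m.above + match.height + m.below,
  };
}

// Hypothetical button rectangle with 200 pixel margins above and before.
const button = { x: 100, y: 100, width: 80, height: 30 };
console.log(selectorReferenceRect(button, { above: 200, before: 200 }));
```

In a page context, one way to find the matching element (an assumption of this note, not something this document prescribes) would be to call closest() on the event target with the selector string, since that method returns the element itself or its nearest matching ancestor.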
report-uri
The report-uri directive specifies a URI to which the user agent sends reports about policy violations.
The syntax for the name and value of this directive and the algorithm to prepare a report are described by Content Security Policy. [[!CSP]]
The core Content Security Policy specification provides directives to restrict from where external content may be loaded. As such, violation reports include a blocked-uri key/value pair that specifies the attempted resource load that was blocked by the policy.
As this is not applicable to the directives in this document, the following additional steps MUST be added to the algorithm defined in Content Security Policy to prepare a violation report:
In step 1, when preparing the JSON object violation-object, add the following keys and values to the csp-report: [[!CSP]]
If the violation is of the input-protection directive, add the following keys and values. If a value is not set or not applicable for the violation (e.g. pointer-height, if the violating event type is not a Pointer Event), the key SHOULD be omitted.
blocked-event-type: the type attribute of the UIEvent that was blocked by policy.
pointer-type: the pointerType value of a Pointer Event [[POINTER-EVENTS]].
pointer-height: the height value of a Pointer Event.
pointer-width: the width value of a Pointer Event.
device-height: the device-height property as defined in [[!CSS3-MEDIAQUERIES]].
device-width: the device-width property as defined in [[!CSS3-MEDIAQUERIES]].
blocked-event-client-x: the clientX attribute of the UIEvent [[!DOM-LEVEL-2-EVENTS]] that was blocked by policy, if set.
blocked-event-client-y: the clientY attribute of the UIEvent [[!DOM-LEVEL-2-EVENTS]] that was blocked by policy, if set.
If the target of a UIEvent which triggers an input-protection violation has an explicitly set id attribute:
blocked-target-id: the id attribute of the DOM Element that a violating UIEvent targeted.
Otherwise, if the target element does not have an explicit id attribute:
blocked-target-xpath: an XPath expression identifying the target Element of the UIEvent that was blocked by policy, e.g. as produced by:
function getXPathFor(e) {
  var xpath = '';
  // Walk up from the target until we leave the element tree
  // (e.g. reach the Document, or run out of parents on a detached node).
  while (e && e.nodeType === e.ELEMENT_NODE) {
    // Count preceding siblings with the same tag name (zero-based index).
    var child = e;
    var siblingIndex = 0;
    while ((child = child.previousSibling) != null) {
      if (child.tagName === e.tagName) {
        siblingIndex++;
      }
    }
    // Prepend this step to the path built so far.
    xpath = e.tagName + '[' + siblingIndex + ']' + (xpath === '' ? '' : '/') + xpath;
    e = e.parentNode;
  }
  return '/' + xpath;
}
Documents may be dynamically constructed and change structure in response to user interaction or other events, so an XPath expression that is unambiguous in the context of the current state of the DOM may still be ambiguous to the content author. To avoid this confusion, resource authors SHOULD include an id attribute for all elements of interest, and user agent implementers MAY include in the XPath any additional information they feel may help disambiguate the blocked target, including class names and id attributes of ancestors.
This specification introduces a new attribute for the UIEvent
interface introduced in DOM Level 2. [[!DOM-LEVEL-2-EVENTS]]
The unsafe
attribute allows web applications to monitor and
immediately respond to suspect violations in the report-only
mode. Applications may also use this interface for capability detection. For
example, a web application may monitor user inputs on a payment button element
like this:
document.getElementById('payment-button').addEventListener("click", function(eventObj) {
  // Capability detection: the attribute is only present in user agents that implement it.
  if ("unsafe" in eventObj) {
    if (eventObj.unsafe == true) {
      return reportUnsafeOrShowDialog();
    }
  }
  makePayment();
});
If associated with a Content Security Policy 1.1 [[CSP11]] or later implementation, the User Interface Security Directives include the following script interfaces which extend the experimental functionality defined therein: https://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html#script-interfaces--experimental
SecurityPolicyViolationEvent
Events
Refer to the blocked-event-type property of violation reports for a description of this property.
Refer to the touch-event property of violation reports for a description of this property.
Refer to the pointer-type property of violation reports for a description of this property.
Refer to the pointer-height property of violation reports for a description of this property.
Refer to the pointer-width property of violation reports for a description of this property.
Refer to the device-height property of violation reports for a description of this property.
Refer to the device-width property of violation reports for a description of this property.
Refer to the blocked-event-client-x property of violation reports for a description of this property.
Refer to the blocked-event-client-y property of violation reports for a description of this property.
Refer to the blocked-target-id property of violation reports for a description of this property.
Refer to the blocked-target-xpath property of violation reports for a description of this property.
Refer to the document-uri property of violation reports for a description of this property.
Refer to the touch-event property of violation reports for a description of this property.
Refer to the pointer-type property of violation reports for a description of this property.
Refer to the pointer-height property of violation reports for a description of this property.
Refer to the pointer-width property of violation reports for a description of this property.
Refer to the device-height property of violation reports for a description of this property.
Refer to the device-width property of violation reports for a description of this property.
Refer to the blocked-event-client-x property of violation reports for a description of this property.
Refer to the blocked-event-client-y property of violation reports for a description of this property.
Refer to the blocked-target-id property of violation reports for a description of this property.
Refer to the blocked-target-xpath property of violation reports for a description of this property.
Let the active CSP policies be the set of CSP policies the user agent is currently enforcing for the associated document.
or of whether the input-protection directive is present or implied in each of the active CSP policies. [[CSP11]]
The algorithm described here can be implemented mostly in terms of HTML5 constructs, but requires the ability to monitor and intercept actions in the rendering of a resource and delivery of events to that resource. User agents may apply equivalent protections using means more optimized for their implementation details, may ignore recommendations where the browsing environment eliminates certain classes of attack (e.g. the cursor sanity check in a touch-only environment), or may implement some features in terms of the underlying operating system or platform rather than directly in the user agent.
input-protection
display-time
can be discarded on update.
input-protection
display-time
value, whose repainted regions
intersect with the protected UI elements and whose repaint-causing
Origin is different than the protected one. If this is true, hinting at
a recent change in the way the protected UI is displayed, with causes external to the UI
itself (e.g. an overlapping element in an ancestor document or a
floating window being suddenly moved away), assume a timing attack is happening
and jump to step 4.
input-protection-clip
and
input-protection-selectors
directives and containing the DOM element which is about to receive the event.
tolerance
property of the input-protection
directive,
return. unsafe
property of the event being handled to true
and let the event processing continue. Otherwise, prevent the event from reaching its target. Create and send a violation report if a valid report-uri has been specified.
In the first implementation of this heuristic, NoScript's ClearClick, the screenshots are taken by using the CanvasRenderingContext2D.drawWindow() method, which is a Mozilla-proprietary extension of the HTML 5 Canvas API available to privileged code only, allowing the content of DOM windows to be drawn on a canvas surface exactly as rendered on the screen. The rest of this phase relies instead on cross-browser canvas features, such as pixel grabbing and data URL serialization.
Some user agents use a strategy for hit testing and delivering UI events involving multiple composited layers managed on a GPU. This alternative heuristic describes one possible implementation strategy for the input-protection directive in this architecture that may be a better fit than the standard heuristic.
GPU-optimized user agents typically separate the browser UI process from the process that handles building and displaying the visual representation of the resource. (In this context the term "process" refers to any encapsulated subunit of user-agent functionality that communicates to other subunits through message passing, without implying any particular implementation details such as locality to a thread, OS-level "processes" or the like.) It is typical for the browser UI process to receive user events such as mouse clicks and then marshal these to the render process, where the event is hit tested through the page's DOM, checking for event handlers along the way. As an optimization, the render process may communicate hit test rectangles back to the UI process in advance so that the UI process can immediately respond to, e.g. a Touch event by scrolling, if the event target falls within coordinates for which there are no other registered handlers in the DOM. A similar strategy can be used to create an implementation of the input protection heuristic that is consistent with this multi-process, compositing architecture.
If a resource being loaded in a frame, iframe, object, embed or applet context specifies an input-protection directive, apply the following steps:
input-protection
applies to
the DOMWindow or Document node, avoid this expensive process of walking the
renderers and simply use the view's bounds, as they're guaranteed to be inclusive.
input-protection display-time
can be discarded
on update.
input-protection display-time
value, whose repainted regions intersect with the protected regions and
whose repaint-causing Origin is different than the protected one.
If this is true, hinting at a recent change in the way the protected UI is
displayed, with causes external to the UI itself (e.g. an overlapping element
in an ancestor document or a floating window being suddenly moved away), assume
a timing attack is happening and jump to Violation management.
tolerance
threshold associated with the input-protection
directive, proceed to deliver the event normally, otherwise proceed to
Violation management.
Differences are computed at a pixel-by-pixel level. If there is any difference in the value of a pixel, it does not match. For example, a protected area in blue overlaid entirely by cross-origin content in red at 1% opacity is considered to be 100% different, not 1% different. If portions of the control image are clipped by the viewport or otherwise occluded, all such pixels must be considered not to match.
As a short-cut, a user agent MAY choose to treat any pixels in a protected layer with an opacity of less than 100% as failing to match by definition. In cases where a fully-composited user view is not available or extremely expensive to calculate, this optimization allows the obstruction check to be performed with only a knowledge of the layers that fall on top of the protected layer.
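The pixel-by-pixel comparison can be sketched as follows. This is a minimal illustration under stated assumptions: both images are flat RGBA byte arrays of equal size (as a canvas ImageData would provide), the tolerance is a percentage of mismatching pixels, and a violation is assumed to occur when the mismatch exceeds that tolerance. The function names mismatchPercentage and isObstructed are hypothetical.

```javascript
// control, user: flat RGBA arrays (4 values per pixel) covering the same area.
// Returns the percentage of pixels whose values differ in any channel.
function mismatchPercentage(control, user) {
  if (control.length !== user.length || control.length % 4 !== 0) {
    throw new RangeError("images must be RGBA arrays of equal size");
  }
  const pixels = control.length / 4;
  if (pixels === 0) return 0;
  let mismatched = 0;
  for (let i = 0; i < control.length; i += 4) {
    // Any difference at all makes the whole pixel a mismatch.
    if (control[i] !== user[i] || control[i + 1] !== user[i + 1] ||
        control[i + 2] !== user[i + 2] || control[i + 3] !== user[i + 3]) {
      mismatched++;
    }
  }
  return (mismatched / pixels) * 100;
}

// Assumption: the check fails when the mismatch exceeds the tolerance percentage.
function isObstructed(control, user, tolerance = 0) {
  return mismatchPercentage(control, user) > tolerance;
}

// A blue pixel faintly tinted by a nearly transparent red overlay still counts
// as a full mismatch for that pixel.
const blue = [0, 0, 255, 255];
const tinted = [3, 0, 252, 255];
console.log(mismatchPercentage(blue, tinted));
```

Counting whole pixels rather than measuring color distance is what makes a faint overlay register as a complete mismatch, in line with the blue and red example above.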
unsafe
property of the event being
handled to true
and let the event processing continue. Otherwise,
prevent the event from reaching its target. Create and send a violation report
if a valid report-uri has been specified.
Optimized and potentially cross-platform implementations of screen and cursor capturing and monitoring regions for invalidation may be available as part of e.g. screen-sharing functionality through getUserMedia() [[MEDIACAPTURE]] or other remote desktop-type functionality available in certain user agents, e.g. the ScreenCapturer interface in Chromium.
This section provides some sample use cases and accompanying security policies.
A resource wishes to block delivery of UI events to the document unless its whole body has been entirely visible (no tolerance) during the past 1 second (default display-time value):
Content-Security-Policy: input-protection
A resource wishes to block delivery of UI events to the element with id "send-button", all the elements with class "tweet" and all the forms in the page unless those elements have been visible for at least the past 800 milliseconds (their intrinsic sizes are used as the reference for screenshot comparison):
Content-Security-Policy: input-protection display-time=800; input-protection-selectors #send-button, .tweet, form
A resource wishes to block delivery of UI events to any obstructed HTML button and suggests a 15% tolerance threshold for determining obstruction of the element, with a 200-pixel-wide margin above and before (on the top and on the left, if orientation is LTR) the triggering element:
Content-Security-Policy: input-protection tolerance=15; input-protection-selectors above=200 before=200 after=0 below=0 button, input[type=submit], input[type=button]
A resource wishes to receive reports when the
UI Security heuristic is triggered for any element in the <body>
,
with the default 300 by 300 pixels clipped reference area and 0 tolerance:
Content-Security-Policy-Report-Only: input-protection; input-protection-clip; report-uri https://example.com/csp-report?unique_id=XKSJ9KAAHJDK9928KKSJEQ
A resource wants to react to potential clickjacking
directly, without sending a report, so it sets a report-only header but does not
specify a report-uri. When a UIEvent
is sent, the unsafe
attribute will still be set when the heuristic is triggered:
Content-Security-Policy-Report-Only: input-protection
A resource wants to allow itself to be embedded by ancestors that are same-origin or from the origin https://checkout.example.com
, but also to have the unsafe
attribute set on events that violate the input protection
heuristic.
Content-Security-Policy: frame-ancestors 'self' https://checkout.example.com Content-Security-Policy-Report-Only: input-protection
This section contains an example violation report the user agent might send to a server when the protected resource violates a sample policy.
In the following example, a document from
http://example.org/page.html
was rendered with the following CSP
policy:
input-protection; report-uri https://example.org/csp-report.cgi?unique_id=12345
A click
violated the policy.
{
  "csp-report": {
    "document-uri": "http://example.org/page.html",
    "referrer": "http://evil.example.com/haxor.html",
    "blocked-event-type": "click",
    "blocked-event-client-x": "325",
    "blocked-event-client-y": "122",
    "touch-event": "false",
    "device-width": "800",
    "device-height": "300",
    "blocked-target-xpath": "/html[0]/body[0]/div[6]/form[2]/input[0]",
    "violated-directive": "input-protection",
    "original-policy": "input-protection; report-uri https://example.org/csp-report.cgi?unique_id=12345"
  }
}
A resource at OriginX embeds a resource at OriginY. The OriginY resource has the following policy:
Content-Security-Policy: input-protection tolerance=50; input-protection-selectors div;
and results in the following layout:
The element with the id "div1" has an onClick
handler defined, and a click event is triggered
at 120,120 in the OriginX document's coordinate system. The red dot indicates the position of the event.
The event is delivered to "div1", which matches the input-protection-selectors
, and no parent of
"div1" matches. As no input-protection-clip
value is defined, the entire area of "div1" becomes the boundaries for
the obstruction check, indicated by the cyan fill. As more than 50% of this
area is occluded behind the iframe viewport, and so does not match by definition, this will trigger a violation.
If the OriginY protected resource set the following policy, instead:
Content-Security-Policy: input-protection tolerance=50; input-protection-selectors div; input-protection-clip before=60 after=60 above=60 below=60;
The region for the obstruction check, still indicated in solid cyan, is now only the intersection of the boundaries of the protected element handling the event, indicated by diagonal cyan lines, and the clipping window around the event, indicated by the green dotted line. If the OriginX resource has not painted anything over the iframe viewport, this check will not trigger a violation because the entire cyan area will be identical in the user image and control image.
If the OriginY protected resource omitted selectors, as in this policy:
Content-Security-Policy: input-protection tolerance=50; input-protection-clip before=60 after=60 above=60 below=60;
The region for the obstruction check, still indicated in solid cyan, is now the intersection of the boundaries of the entire document, indicated by diagonal cyan lines, and the clipping window around the event, indicated by the green dotted line. This demonstrates that portions of the protected resource may be included in the obstruction check region, even if they do not have event listeners. Thus, the hit test rectangles which trigger the heuristic do not necessarily compose the entire region that must be checked.
As in the previous example, if the OriginX resource has not painted anything over the iframe viewport, this check will not trigger a violation because the entire cyan area will be identical in the user image and control image.
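The obstruction check region in these examples is an intersection of two rectangles, which can be sketched as below. The clipping window follows from the event position (120,120) and the 60 pixel offsets in the sample policy; the element bounds are hypothetical values chosen for illustration, since the layout figure is not reproduced here, and a single shared coordinate system is assumed.

```javascript
// Intersection of two rectangles { x, y, width, height }, or null if they do not overlap.
function intersect(a, b) {
  const x = Math.max(a.x, b.x);
  const y = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  if (right <= x || bottom <= y) return null;
  return { x: x, y: y, width: right - x, height: bottom - y };
}

// Clipping window around the event at (120,120) with before/after/above/below = 60.
const clipWindow = { x: 120 - 60, y: 120 - 60, width: 60 + 60, height: 60 + 60 };

// Hypothetical bounds of the protected element handling the event.
const elementBounds = { x: 100, y: 100, width: 200, height: 100 };

// Region that would be compared between the user image and the control image.
const checkRegion = intersect(elementBounds, clipWindow);
console.log(checkRegion);
```

Only the part of the element that falls inside the clipping window is checked, so content painted elsewhere over the element would not affect the result in this sketch.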
UI Redressing and Clickjacking attacks rely on violating the contextual and temporal integrity of embedded content. Because these attacks target the subjective perception of the user and not well-defined security boundaries, the heuristic protections afforded by the input-protection
directive can never be 100% effective for every interface. It provides no protection against certain classes of attacks, such as displaying content around an embedded resource that appears to extend a trusted dialog but provides misleading information.
The policy and intent of the user always take precedence over the policy
of resources. In particular, transformations, customizations or enhancements
of visual content made by the user agent or user-installed plugins SHOULD NOT cause the
input-protection
heuristic to be triggered.
Many UI Redressing and Clickjacking attacks rely on exploiting specific features of user agents, such as repositioning of the browsing window, hiding or creating fake cursors, and script-driven scrolling and content repositioning. Not all attacks apply to all user agents in all contexts. User agents are free to optimize or not implement suggested heuristics when they do not apply, for example:
drag is not a supported event type
ui-width and ui-height values that exceed the capabilities of the browsing environment
Some resource owners may specify a restrictive policy forbidding embedding in
user agents that only understand X-Frame-Options
but be more
permissive with user agents that implement UI Security directives. User agents
that are aware of but choose not to implement any of the heuristics in this
document MAY still ignore X-Frame-Options
when
presented in combination with UI Security directives in a Content Security Policy.
For example, a browsing environment that deliberately chooses not to implement
UI Security features because they interfere with assistive technologies SHOULD NOT deny
users access to resources on this account. User agents taking this stance SHOULD
implement the unsafe
attribute of the UIEvent
interface
as this may be interrogated by client applications doing feature detection.
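The permitted handling of X-Frame-Options alongside UI Security directives might be sketched as follows. This is an illustration of one choice a user agent MAY make, not a required algorithm. The function names are hypothetical, and matching UI Security directives by the `input-protection` name prefix is an assumption made for brevity.

```python
from typing import Dict, Set


def csp_directive_names(csp_value: str) -> Set[str]:
    """Collect the lowercase directive names from a policy string."""
    names = set()
    for directive in csp_value.split(";"):
        tokens = directive.split()
        if tokens:
            names.add(tokens[0].lower())
    return names


def has_ui_security_directives(csp_value: str) -> bool:
    # Assumption: UI Security directives are recognised by name prefix.
    return any(name.startswith("input-protection")
               for name in csp_directive_names(csp_value))


def should_enforce_x_frame_options(headers: Dict[str, str],
                                   aware_of_ui_security: bool) -> bool:
    """Decide whether X-Frame-Options is enforced for this response."""
    normalized = {key.lower(): value for key, value in headers.items()}
    if "x-frame-options" not in normalized:
        return False  # nothing to enforce
    csp = normalized.get("content-security-policy", "")
    if aware_of_ui_security and has_ui_security_directives(csp):
        return False  # the user agent chooses to ignore X-Frame-Options
    return True


response_headers = {
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "input-protection tolerance=50",
}
```

A user agent that only understands X-Frame-Options would pass `aware_of_ui_security=False` and keep enforcing the restrictive header.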
In environments that support multiple, overlapping browser windows, attacks may be mounted by positioning a target window under another, instructing the user to double-click, and closing the obstructing window with the first click. [[CLICKJACKING-Unresolved]] In such environments user agent implementers may wish to use a native operating system screenshot facility to calculate the user's view for the obstruction check phase of the heuristic. In such cases user agents should take special care regarding potential interference from accessibility technologies.
While this document describes a mechanism for resource authors to opt-in to
User Interface Security protections, user agents MAY choose to opt-in resources
to input-protection
by default, or provide users with an option
to manually enable such protections.
If a user agent or user chooses to apply input protection in the absence of
an explicit directive, violations SHOULD NOT cause a violation report to be
generated, even if the resource supplied a Content Security Policy with a
report-uri.
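The reporting rule above can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the policy is treated as a single header value, and the presence of an `input-protection` directive is taken as the sign that the resource explicitly opted in.

```python
from typing import Dict, List


def parse_policy(csp_value: str) -> Dict[str, List[str]]:
    """Map each directive name to its value tokens, keeping the first occurrence."""
    directives: Dict[str, List[str]] = {}
    for part in csp_value.split(";"):
        tokens = part.split()
        if tokens:
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def report_targets(csp_value: str) -> List[str]:
    """Return the report-uri values to notify for an input protection violation.

    If the resource did not supply input-protection itself, the protection
    was applied by the user agent or the user, so no report is generated.
    """
    directives = parse_policy(csp_value)
    if "input-protection" not in directives:
        return []
    return directives.get("report-uri", [])


opted_in = "input-protection tolerance=50; report-uri /csp-report"
not_opted_in = "default-src 'self'; report-uri /csp-report"
```

Here `report_targets(not_opted_in)` is empty even though the policy names a report-uri, matching the rule that default or user-enabled protections do not generate reports.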
Certain classes of accessibility technologies such as
screen readers will provide strong defenses against many classes
of UI Redressing attacks by presenting the content to the user
in a manner not subject to interference. Such technologies SHOULD ignore
X-Frame-Options
headers when presented in
combination with UI Security directives in a
Content Security Policy.
Use of accessibility technologies MUST NOT by itself cause
the input-protection
heuristic to be triggered.
Accessibility technologies that modify the appearance of a resource,
such as screen magnifiers, contrast enhancers, or screen readers
that highlight the element currently being read, have the potential to
interfere with the obstruction check.
If a user agent is able to detect that accessibility technologies are in use
that could cause interference, the check MUST be disabled. In some cases,
interference from accessibility tools may be avoided by acquiring
the user image in terms of the user agent's local
rendering surface, rather than using an operating-system level
screenshot.
User agents MUST provide a means for the user to manually disable enforcement of the Input Protection Heuristic if it interferes with their chosen accessibility technologies. The mechanism for manually disabling enforcement of the Input Protection Heuristic MUST be operable by assistive technologies and by people with cognitive disabilities who are able to understand the security risk.
Accessibility technologies that act as a proxy MAY filter any UISecurity policies if they cause interference with the user's chosen methods of accessing the content.
When possible, resource authors SHOULD make use of violation reports and the unsafe
attribute to apply additional security measures in the application or during back-end processing. Real-time measures in the application might include requiring completion of a CAPTCHA [[CAPTCHA-Wikipedia]] or responding to an out-of-band confirmation when the UI Security heuristic is triggered. Example back-end measures might include increasing a fraud risk score for individual actions that trigger UI Security heuristics, or for actions that target accounts or resources that frequently trigger them. To be able to do this effectively, it is likely necessary to encode into the report-uri
a unique identifier that can be correlated to the authenticated user and the action they are taking.
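One way to encode such an identifier is sketched below. It is an assumption-laden illustration, not a recommended design: the endpoint `https://example.com/ui-report`, the query parameter names `u`, `a` and `sig`, the secret, and all function names are hypothetical. The identifiers are assumed to be opaque server-side values that contain no colon, and the HMAC signature is there so the back end can reject reports whose identifiers were altered.

```python
import hashlib
import hmac
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"example-secret"  # hypothetical key; keep real keys out of source code


def sign(user_id: str, action_id: str) -> str:
    """HMAC-SHA256 over the identifiers (assumed to contain no colon)."""
    message = f"{user_id}:{action_id}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def build_policy(user_id: str, action_id: str,
                 base: str = "https://example.com/ui-report") -> str:
    """Build a policy whose report-uri carries the correlation identifiers."""
    query = urlencode({"u": user_id, "a": action_id,
                       "sig": sign(user_id, action_id)})
    return f"input-protection tolerance=50; report-uri {base}?{query}"


def correlate(report_url: str) -> Optional[Tuple[str, str]]:
    """Recover (user_id, action_id) from a report URL, or None if invalid."""
    params = parse_qs(urlsplit(report_url).query)
    try:
        user_id = params["u"][0]
        action_id = params["a"][0]
        signature = params["sig"][0]
    except KeyError:
        return None
    if not hmac.compare_digest(signature, sign(user_id, action_id)):
        return None
    return user_id, action_id


policy = build_policy("user-42", "transfer-7")
report_url = policy.split("report-uri ", 1)[1]
```

On receipt of a violation report, the back end could call `correlate` on the request URL and, for example, raise the fraud risk score for the returned action. Because the identifiers appear in a URL, opaque per-request values are preferable to real account names.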
Mechanisms for CAPTCHA and user verification should include options for people with different disabilities, including cognitive disabilities, people with impaired visual and auditory discrimination skills, and for different modalities. For example, if CAPTCHA or user verification require biometrics, a choice should be offered of what biometrics to use, as people with different disabilities may be unable to use one or more specific biometric mechanisms. Further, when two-step verification procedures are used, any time limit is a problem, and the procedure should not depend on the user's short-term memory or on the user's ability to copy accurately. See Inaccessibility of CAPTCHA for more information about accessible CAPTCHA.
This document does not define new message headers and uses the existing grammar of the Content-Security-Policy and Content-Security-Policy-Report-Only headers, so no updates to the permanent message header field registry (see [RFC3864]) are required.