[css-syntax] Clarify how to decode from bytes

Fri, 03 Jan 2014 12:41:33 +0000

author: Simon Sapin <simon.sapin@exyr.org>
date: Fri, 03 Jan 2014 12:41:33 +0000
changeset 9701: a067b6c20248
parent 9700: 6c5d1704f506
child 9702: 04834ea861b7

css-syntax/Overview.html
css-syntax/Overview.src.html
     1.1 --- a/css-syntax/Overview.html	Fri Jan 03 11:59:47 2014 +0000
     1.2 +++ b/css-syntax/Overview.html	Fri Jan 03 12:41:33 2014 +0000
     1.3 @@ -429,18 +429,21 @@
     1.4  	which the user agent must use to decode the bytes into <a data-link-type=dfn href=#code-point title="code points">code points</a>.
     1.5  
     1.6  <p>	To decode the stream of bytes into a stream of <a data-link-type=dfn href=#code-point title="code points">code points</a>,
     1.7 -	UAs must follow these steps.
     1.8 -
     1.9 -<p>	The algorithms to <dfn data-dfn-type=dfn data-noexport="" id=get-an-encoding><a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a><a class=self-link href=#get-an-encoding></a></dfn>
    1.10 -	and <dfn data-dfn-type=dfn data-noexport="" id=decode><a href=http://encoding.spec.whatwg.org/#decode>decode</a><a class=self-link href=#decode></a></dfn>
    1.11 -	are defined in <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>.
    1.12 -
    1.13 -<p>	First, <dfn data-dfn-type=dfn data-noexport="" id=determine-the-fallback-encoding>determine the fallback encoding<a class=self-link href=#determine-the-fallback-encoding></a></dfn>:
    1.14 +	UAs must use the <a href=http://encoding.spec.whatwg.org/#decode>decode</a> algorithm
    1.15 +	defined in <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>,
    1.16 +	with the fallback encoding determined as follows.
    1.17 +
    1.18 +<p class=note>	Note: The <a href=http://encoding.spec.whatwg.org/#decode>decode</a> algorithm
    1.19 +	gives precedence to a byte order mark (BOM),
    1.20 +	and only uses the fallback when none is found.
    1.21 +
    1.22 +<p>	To <dfn data-dfn-type=dfn data-noexport="" id=determine-the-fallback-encoding>determine the fallback encoding<a class=self-link href=#determine-the-fallback-encoding></a></dfn>:
    1.23  
    1.24  	<ol>
    1.25  		<li>
    1.26  			If HTTP or equivalent protocol defines an encoding (e.g. via the charset parameter of the Content-Type header),
    1.27 -			<a data-link-type=dfn href=#get-an-encoding title="get an encoding">get an encoding</a> for the specified value.
    1.28 +			<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
    1.29 +			for the specified value.
    1.30  			If that does not return failure,
    1.31  			use the return value as the fallback encoding.
    1.32  
    1.33 @@ -448,16 +451,18 @@
    1.34  			Otherwise, check the byte stream. If the first several bytes match the hex sequence
    1.35  
    1.36  <pre>40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B</pre>
    1.37 -<p>			then <a data-link-type=dfn href=#get-an-encoding title="get an encoding">get an encoding</a> for the sequence of <code>(not 22)*</code> bytes,
    1.38 +<p>			then <a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
    1.39 +			for the sequence of <code>(not 22)*</code> bytes,
    1.40  			decoded per <code>windows-1252</code>.
    1.41  
    1.42 -<p class=note>			Note: Anything ASCII-compatible will do, so using <code>windows-1252</code> is fine.
    1.43 +<p class=note>			Note: Anything ASCII-compatible will do since valid labels are all ASCII,
    1.44 +			so using <code>windows-1252</code> is fine.
    1.45  
    1.46  
    1.47  <p class=note>			Note: The byte sequence above,
    1.48  			when decoded as ASCII,
    1.49  			is the string "<code>@charset "…";</code>",
    1.50 -			where the "…" is the sequence of bytes corresponding to the encoding’s name.
    1.51 +			where the "…" is the sequence of bytes corresponding to the encoding’s label.
    1.52  
    1.53  <p>			If the return value was <code>utf-16</code> or <code>utf-16be</code>,
    1.54  			use <code>utf-8</code> as the fallback encoding;
    1.55 @@ -474,10 +479,6 @@
    1.56  			Otherwise, use <code>utf-8</code> as the fallback encoding.
    1.57  	</ol>
    1.58  
    1.59 -<p>		Then, <a data-link-type=dfn href=#decode title=decode>decode</a> the byte stream using the fallback encoding.
    1.60 -
    1.61 -<p class=note>	Note: the <a data-link-type=dfn href=#decode title=decode>decode</a> algorithm lets the byte order mark (BOM) take precedence,
    1.62 -	hence the usage of the term "fallback" above.
    1.63  
    1.64  <h3 class="heading settled heading" data-level=3.3 id=environment-encoding><span class=secno>3.3 </span><span class=content>
    1.65  Environment encoding</span><a class=self-link href=#environment-encoding></a></h3>
    1.66 @@ -502,7 +503,8 @@
    1.67  
    1.68  <p>	<ul>
    1.69  		<li>
    1.70 -			<a data-link-type=dfn href=#get-an-encoding title="get an encoding">Get an encoding</a> for the value of the <code>charset</code> attribute of the element, if any.
    1.71 +			<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>Get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
    1.72 +			for the value of the <code>charset</code> attribute of the element, if any.
    1.73  			If that does not return failure,
    1.74  			use the return value as the environment encoding.
    1.75  
    1.76 @@ -525,7 +527,8 @@
    1.77  
    1.78  <p>	<ul>
    1.79  		<li>
    1.80 -			<a data-link-type=dfn href=#get-an-encoding title="get an encoding">Get an encoding</a> for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
    1.81 +			<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>Get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
    1.82 +			for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
    1.83  			If that does not return failure,
    1.84  			use the return value as the environment encoding.
    1.85  
    1.86 @@ -5038,7 +5041,6 @@
    1.87  <li>&lt;dashndashdigit-ident&gt;, <a href=#typedef-dashndashdigit-ident title="section 6.2">6.2</a>
    1.88  <li>declaration, <a href=#declaration title="section 5">5</a>
    1.89  <li>&lt;declaration-list&gt;, <a href=#typedef-declaration-list title="section 7.1">7.1</a>
    1.90 -<li>decode, <a href=#decode title="section 3.2">3.2</a>
    1.91  <li>&lt;delim-token&gt;, <a href=#typedef-delim-token title="section 4">4</a>
    1.92  <li>determine the fallback encoding, <a href=#determine-the-fallback-encoding title="section 3.2">3.2</a>
    1.93  <li>digit, <a href=#digit title="section 4.2">4.2</a>
    1.94 @@ -5052,7 +5054,6 @@
    1.95  <li>escaping, <a href=#escaping0 title="section 2.1">2.1</a>
    1.96  <li>function, <a href=#function title="section 5">5</a>
    1.97  <li>&lt;function-token&gt;, <a href=#typedef-function-token title="section 4">4</a>
    1.98 -<li>get an encoding, <a href=#get-an-encoding title="section 3.2">3.2</a>
    1.99  <li>&lt;hash-token&gt;, <a href=#typedef-hash-token title="section 4">4</a>
   1.100  <li>hex digit, <a href=#hex-digit title="section 4.2">4.2</a>
   1.101  <li>identifier, <a href=#identifier title="section 4.2">4.2</a>
     2.1 --- a/css-syntax/Overview.src.html	Fri Jan 03 11:59:47 2014 +0000
     2.2 +++ b/css-syntax/Overview.src.html	Fri Jan 03 12:41:33 2014 +0000
     2.3 @@ -265,18 +265,21 @@
     2.4  	which the user agent must use to decode the bytes into <a>code points</a>.
     2.5  
     2.6  	To decode the stream of bytes into a stream of <a>code points</a>,
     2.7 -	UAs must follow these steps.
     2.8 -
     2.9 -	The algorithms to <dfn><a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a></dfn>
    2.10 -	and <dfn><a href="http://encoding.spec.whatwg.org/#decode">decode</a></dfn>
    2.11 -	are defined in [[!ENCODING]].
    2.12 -
    2.13 -	First, <dfn>determine the fallback encoding</dfn>:
    2.14 +	UAs must use the <a href="http://encoding.spec.whatwg.org/#decode">decode</a> algorithm
    2.15 +	defined in [[!ENCODING]],
    2.16 +	with the fallback encoding determined as follows.
    2.17 +
    2.18 +	Note: The <a href="http://encoding.spec.whatwg.org/#decode">decode</a> algorithm
    2.19 +	gives precedence to a byte order mark (BOM),
    2.20 +	and only uses the fallback when none is found.
    2.21 +
    2.22 +	To <dfn>determine the fallback encoding</dfn>:
    2.23  
    2.24  	<ol>
    2.25  		<li>
    2.26  			If HTTP or equivalent protocol defines an encoding (e.g. via the charset parameter of the Content-Type header),
    2.27 -			<a>get an encoding</a> for the specified value.
    2.28 +			<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a> [[!ENCODING]]
    2.29 +			for the specified value.
    2.30  			If that does not return failure,
    2.31  			use the return value as the fallback encoding.
    2.32  
    2.33 @@ -285,16 +288,18 @@
    2.34  
    2.35  			<pre>40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B</pre>
    2.36  
    2.37 -			then <a>get an encoding</a> for the sequence of <code>(not 22)*</code> bytes,
    2.38 +			then <a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a> [[!ENCODING]]
    2.39 +			for the sequence of <code>(not 22)*</code> bytes,
    2.40  			decoded per <code>windows-1252</code>.
    2.41  
    2.42 -			Note: Anything ASCII-compatible will do, so using <code>windows-1252</code> is fine.
    2.43 +			Note: Anything ASCII-compatible will do since valid labels are all ASCII,
    2.44 +			so using <code>windows-1252</code> is fine.
    2.45  
    2.46  
    2.47  			Note: The byte sequence above,
    2.48  			when decoded as ASCII,
    2.49  			is the string "<code>@charset "…";</code>",
    2.50 -			where the "…" is the sequence of bytes corresponding to the encoding's name.
    2.51 +			where the "…" is the sequence of bytes corresponding to the encoding's label.
    2.52  
    2.53  			If the return value was <code>utf-16</code> or <code>utf-16be</code>,
    2.54  			use <code>utf-8</code> as the fallback encoding;
    2.55 @@ -311,10 +316,6 @@
    2.56  			Otherwise, use <code>utf-8</code> as the fallback encoding.
    2.57  	</ol>
    2.58  
    2.59 -		Then, <a>decode</a> the byte stream using the fallback encoding.
    2.60 -
    2.61 -	Note: the <a>decode</a> algorithm lets the byte order mark (BOM) take precedence,
    2.62 -	hence the usage of the term "fallback" above.
    2.63  
    2.64  <h3 id="environment-encoding">
    2.65  Environment encoding</h3>
    2.66 @@ -339,7 +340,8 @@
    2.67  
    2.68  	<ul>
    2.69  		<li>
    2.70 -			<a>Get an encoding</a> for the value of the <code>charset</code> attribute of the element, if any.
    2.71 +			<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">Get an encoding</a> [[!ENCODING]]
    2.72 +			for the value of the <code>charset</code> attribute of the element, if any.
    2.73  			If that does not return failure,
    2.74  			use the return value as the environment encoding.
    2.75  
    2.76 @@ -365,7 +367,8 @@
    2.77  
    2.78  	<ul>
    2.79  		<li>
    2.80 -			<a>Get an encoding</a> for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
    2.81 +			<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">Get an encoding</a> [[!ENCODING]]
    2.82 +			for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
    2.83  			If that does not return failure,
    2.84  			use the return value as the environment encoding.
    2.85  
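
The fallback-encoding steps visible in the hunks above can be sketched roughly as follows. This is an illustration under stated assumptions, not the spec's algorithm: `get_an_encoding` is a hypothetical stand-in that uses Python's `codecs.lookup`, whose label table and canonical names (for example `utf-16-be`, `cp1252`) differ from those in [ENCODING]. The steps hidden between hunks are not modelled, and returning any other successfully resolved `@charset` label is an assumption about the elided text. The `decode` helper only mirrors the BOM precedence described in the note.

```python
import codecs
import re

# Bytes 40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B, i.e. @charset "…";
CHARSET_RULE = re.compile(rb'@charset "([^"]*)";')


def get_an_encoding(label):
    """Hypothetical stand-in for "get an encoding"; None means failure."""
    try:
        return codecs.lookup(label.strip()).name
    except (LookupError, ValueError):
        return None


def determine_fallback_encoding(data, protocol_charset=None):
    """Pick the fallback encoding for a stylesheet's byte stream."""
    # Step 1: an encoding defined by HTTP or an equivalent protocol.
    if protocol_charset is not None:
        encoding = get_an_encoding(protocol_charset)
        if encoding is not None:
            return encoding

    # Step 2: an @charset rule at the very start of the byte stream.
    match = CHARSET_RULE.match(data)
    if match:
        # Valid labels are all ASCII, so any ASCII-compatible decoding works.
        label = match.group(1).decode("windows-1252", errors="replace")
        encoding = get_an_encoding(label)
        # Python spells utf-16be as "utf-16-be".
        if encoding in ("utf-16", "utf-16-be"):
            return "utf-8"
        if encoding is not None:
            # Assumption: the elided spec text uses the return value here.
            return encoding

    # Last step: nothing else applied.
    return "utf-8"


def decode(data, fallback):
    """A BOM takes precedence; the fallback is used only when none is found."""
    boms = (
        (b"\xef\xbb\xbf", "utf-8"),
        (b"\xfe\xff", "utf-16-be"),
        (b"\xff\xfe", "utf-16-le"),
    )
    for bom, encoding in boms:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding, errors="replace")
    return data.decode(fallback, errors="replace")
```

For example, `decode(data, determine_fallback_encoding(data, protocol_charset))` would decode a stylesheet, with a BOM in `data` overriding whatever fallback was chosen.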
