Closed
Bug 796882
Opened 12 years ago
Closed 11 years ago
Make BOM take precedence over HTTP for CSS
Categories
(Core :: CSS Parsing and Computation, defect)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
mozilla19
People
(Reporter: hsivonen, Assigned: hsivonen)
References
(Blocks 1 open bug)
Details
(Keywords: dev-doc-complete)
Attachments
(3 files, 6 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
Determining the character encoding when CSS loading should be consistent with bug 716579 and bug 687859 for the sake of the overall consistency of the platform. That is, a BOM should take precedence over HTTP-level charset in accordance with http://encoding.spec.whatwg.org/#decode .
![]() |
||
Comment 1•12 years ago
|
||
So BOM should imply UTF-16, then? Right now we actually look for @charset after a BOM and allow @charset to set the encoding if it's present.... In any case, this needs to be done in the working group, since it involves changing http://www.w3.org/TR/CSS21/syndata.html#charset
Assignee | ||
Comment 2•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #1) > So BOM should imply UTF-16, then? Or UTF-8 in the case of the UTF-8 BOM.
Assignee | ||
Comment 3•12 years ago
|
||
Chrome and Opera already do what this bug asks for. We are not currently matching IE, either. (IE strips the BOM from CSS and then proceeds to decode according to the encoding specified on the HTTP level, which is really bizarre.) I did not test Safari.
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #1) > In any case, this needs to be done in the working group, since it involves > changing http://www.w3.org/TR/CSS21/syndata.html#charset http://lists.w3.org/Archives/Public/www-style/2012Oct/0516.html
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → hsivonen
Assignee | ||
Comment 5•12 years ago
|
||
I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value. What am I missing?
Assignee | ||
Comment 6•12 years ago
|
||
![]() |
||
Comment 8•12 years ago
|
||
> I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value.
It might not be. It's sort of part of the public LoadSheet() API, but consumers may always be passing an empty string.
We could just nuke it.
Assignee | ||
Comment 9•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #8) > > I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value. > > It might not be. Turns out it’s used for preloads. - - The tests for this bug are at https://test.csswg.org/source/contributors/opera/submitted/css3-syntax/charset/ That is, the tests are testharness.js tests but they are part of CSS WG test suite. We don’t have any precedent for importing tests with this combination. Where should I put them? Does the tooling at http://mxr.mozilla.org/mozilla-central/source/dom/imptests/ support importing a test suite under layout/ ?
Assignee | ||
Comment 10•12 years ago
|
||
Assignee | ||
Comment 11•12 years ago
|
||
Attachment #674651 -
Attachment is obsolete: true
Comment 12•12 years ago
|
||
This puts them under dom/imptests/css. I guess it would be a bit nicer to put them somewhere under layout; file a bug on me if you'd like me to extend the script to allow that.
Assignee | ||
Comment 13•12 years ago
|
||
This patch is needed on top of the one Ms2ger uploaded. However, this doesn’t mark dom/imptests/css/contributors/opera/submitted/css3-syntax/charset/test_page-utf16-css-bomless-utf16be.html as a known failure, because I couldn’t get parseFailures.py to do anything.
Assignee | ||
Comment 14•12 years ago
|
||
And it’s a known failure, because it ends up testing whether our UTF-16 decoders try to sniff endianness in the BOMless case. (Our decoder sniffs. The Encoding Standard says not to, at the moment.) Outside the scope of this bug.
Assignee | ||
Comment 15•12 years ago
|
||
Attachment #679679 -
Attachment is obsolete: true
Attachment #680628 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 16•12 years ago
|
||
Attachment #679680 -
Attachment is obsolete: true
Attachment #680630 -
Flags: review?(bzbarsky)
Updated•12 years ago
|
Keywords: dev-doc-needed
Comment 17•12 years ago
|
||
Comment on attachment 680627 [details] [diff] [review] Patch on top of Ms2ger’s patch Review of attachment 680627 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/imptests/Makefile.in @@ +22,5 @@ > > include $(srcdir)/editing.mk > include $(srcdir)/html.mk > include $(srcdir)/webapps.mk > +include $(srcdir)/css.mk Gah, sorry :/ I fixed the instructions in the readme to mention this: https://hg.mozilla.org/mozilla-central/rev/96e7580fb756
![]() |
||
Comment 18•12 years ago
|
||
Comment on attachment 680628 [details] [diff] [review] Adjust old tests by flipping red and green This is landing in the same changeset as the code changes, right? r=me
Attachment #680628 -
Flags: review?(bzbarsky) → review+
![]() |
||
Comment 19•12 years ago
|
||
Comment on attachment 680630 [details] [diff] [review] Implement CSS3 Syntax charset handling, with updated comment r=me if you leave the logging about where the charset info is coming from instead of removing it.
Attachment #680630 -
Flags: review?(bzbarsky) → review+
![]() |
||
Comment 20•12 years ago
|
||
Comment on attachment 680630 [details] [diff] [review] Implement CSS3 Syntax charset handling, with updated comment Actually, two more comments: >+++ b/layout/style/Loader.cpp >+ for (; i < sizeof(kCharsetSym) - 1; ++i) { Why not just strncmp? >+ aCharset.Trim(" \t\n\f\r"); Is this specced? Is it needed? Seems pretty weird... especially since having an unescaped linebreak inside a quoted-string is not OK in CSS.
Assignee | ||
Comment 21•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #20) > >+++ b/layout/style/Loader.cpp > >+ for (; i < sizeof(kCharsetSym) - 1; ++i) { > > Why not just strncmp? Because the old code did not have strncmp and I don’t tend to think in terms of the C standard library. > >+ aCharset.Trim(" \t\n\f\r"); > > Is this specced? Yes. See http://dev.w3.org/csswg/css3-syntax/#the-input-byte-stream Step 3 says, in part: “then get an encoding for the sequence of XX bytes, decoded per windows-1252, and let temp be the return value.” The sequence of XX bytes can contain anything except the double quote. “Get an encoding” is defined in http://encoding.spec.whatwg.org/#concept-encoding-get which says “Remove any leading and trailing ASCII whitespace from label.” And ASCII whitespace is defined to be " ", "\t", "\n", "\f" and "\r". I suppose it would be slightly more proper to call EncodingUtils::FindEncodingForLabel() instead of inlining “get an encoding” like this. > Is it needed? It’s definitely needed in other parts of the platform that call “get an encoding”. Dunno if it is needed here. It’s a side effect of reusing the “get an encoding” concept.
Assignee | ||
Comment 22•12 years ago
|
||
Attachment #680628 -
Attachment is obsolete: true
Attachment #680630 -
Attachment is obsolete: true
![]() |
||
Comment 23•12 years ago
|
||
Comment on attachment 681011 [details] [diff] [review] Implement CSS3 Syntax charset handling, with review comments addressed > + for (uint32_t i = sizeof(kCharsetSym); i < aDataLength; ++i) { Should that not be sizeof(kCharsetSym) - 1?
Assignee | ||
Comment 24•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #23) > Comment on attachment 681011 [details] [diff] [review] > Implement CSS3 Syntax charset handling, with review comments addressed > > > + for (uint32_t i = sizeof(kCharsetSym); i < aDataLength; ++i) { > > Should that not be sizeof(kCharsetSym) - 1? Oops. Yes. This patch uses strncmp and does not attempt to inline parts of "get an encoding".
Attachment #681011 -
Attachment is obsolete: true
Attachment #681867 -
Flags: review?(bzbarsky)
![]() |
||
Comment 25•12 years ago
|
||
Comment on attachment 681867 [details] [diff] [review] Implement CSS3 Syntax charset handling, with review comments addressed, without off-by-one error r=me
Attachment #681867 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 26•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #25) > r=me Thanks. Landed the code now in order to get all the BOM precedence stuff in the same train: https://hg.mozilla.org/integration/mozilla-inbound/rev/b301f9b2e956 Leaving open pending a decision and help from Ms2ger on how to deal with the test import. (Need to sort out the failure metadata and the HTTP headers.)
Whiteboard: [leave open]
![]() |
||
Comment 28•11 years ago
|
||
Having landed and shipped code changes that change behavior without resolving the bug and setting a target milestone indicating when it got is _really_ confusing for regression-hunting. Please resolve this bug appropriately and file a followup on whatever work remains.
Assignee | ||
Comment 29•11 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #28) > Having landed and shipped code changes that change behavior without > resolving the bug and setting a target milestone indicating when it got is > _really_ confusing for regression-hunting. Yeah. This remained open longer than I expected it to remain open. > Please resolve this bug appropriately and file a followup on whatever work > remains. Filed bug 880201.
Whiteboard: [leave open]
Target Milestone: --- → mozilla19
Assignee | ||
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 30•11 years ago
|
||
Updated https://developer.mozilla.org/en-US/docs/Web/CSS/@charset to reflect that change. Teoli, can you check if we have other places we document that topic ?
Flags: needinfo?(jypenator)
Keywords: dev-doc-needed → dev-doc-complete
You need to log in
before you can comment on or make changes to this bug.
Description
•