Closed Bug 562377 Opened 14 years ago Closed 6 years ago

Absent or unparseable Content-Type on a style sheet should not be equivalent to text/css

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: zwol, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: sec-low, Whiteboard: [sg:low] data exfiltration from sites with bad HTTP content labeling)

The CSS loader currently treats HTTP resources served with no Content-Type at all, or with an unparseable Content-Type header, exactly as if they had been served with the correct Content-Type (i.e. "text/css").  This opens a loophole in the fixes for bug 524223 -- a malicious site can still perform the attack against a victim site that doesn't label its *non*-CSS content correctly.  In particular, a site that relies on content sniffing for its HTML is vulnerable.

To know what to do about this, I think we need some statistics on the volume and nature of content served without a Content-Type, or with an ill-formed Content-Type.  How can we get those numbers?  I'd ask Chris Evans (who did the original investigation of bug 524223) but he doesn't seem to have a bugzilla handle.

See also bug 560388 (which makes it difficult to distinguish bogus from absent Content-Type) and bug 560392 (our Content-Type parser is more permissive than it should be).
There you are, Chris Evans.
Hmm - unfortunately, I never collected stats on that, because WebKit also treats (missing) as text/css. (It also treats application/x-unknown-content-type as text/css).
Is it possible for you to do that analysis now?  Or arrange access to the corpus you used, so that I can do it myself? :-)
Collin Jackson and co. recently did a MIME type crawl whilst looking for data on their paper about this whole area of CSS. I bet he has useful data if you contact him.
I already did :) No word yet.
Blocks: mimesniff
Is this something that we still intend to fix?
If yes, then perhaps the first step would be to collect some data.
It appears to be quite easy to do at first glance:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe
What are the conditions for the buckets we're interested in?
I think the buckets I would want are: of all resources that reach http://mxr.mozilla.org/mozilla-central/source/layout/style/Loader.cpp#902 (in SheetLoadData::OnStreamComplete), which is the point where we check the sheet's content-type,

 * number served with content-type text/css
 * number served with a content-type that is ill-formed
 * number served with no content-type header at all
 * number served with a well-formed content-type which isn't text/css

These correspond directly to the three ways the boolean on line 907 can get set to 'true' and the one way it can get set to 'false', respectively.  (The change contemplated in this bug is to remove the second and third ways.)

It might also be useful to know the same 4-way breakdown for *all* resources we load.  And, it's not directly relevant to *this* bug, but it's possible that the quirks-mode license implemented a bit lower down (look for eCompatibility_NavQuirks) is no longer necessary -- it is five years old at this point! -- so maybe at the same time we could collect telemetry on how often that one triggers.
I don't think this is worth fixing unless there's a realistic plan for getting it fixed across multiple browsers.

(Also, does the HTML spec say not to do this?  It wouldn't surprise me at this point.)
(In reply to David Baron [:dbaron] (UTC-8) from comment #9)
> I don't think this is worth fixing unless there's a realistic plan for
> getting it fixed across multiple browsers.
> 
> (Also, does the HTML spec say not to do this?  It wouldn't surprise me at
> this point.)

https://html.spec.whatwg.org/multipage/semantics.html#link-type-stylesheet says: "The default type for resources given by the stylesheet keyword is text/css" and "Once a resource has been obtained, if its Content-Type metadata is text/css, the user agent must [use it to create a style sheet for the document]" and "Quirk: If the document has been set to quirks mode, has the same origin as the URL of the external resource, and the Content-Type metadata of the external resource is not a supported style sheet type, the user agent must instead assume it to be text/css."

I interpret that to mean, in standards mode, style sheets delivered with no content-type at all should be processed but style sheets delivered with an ill-formed content-type should not.

I would expect Hixie to be fairly receptive to tightening this up, though.  Dunno about other browsers.  I would definitely want data before doing anything (including reaching out to whatwg and/or other browsers).
Like HTML document itself, any byte sequence is parsable as a mime type. There is no such concept "ill-formed" in the mimesniff spec.
https://mimesniff.spec.whatwg.org/#parsing-a-mime-type
(In reply to Masatoshi Kimura [:emk] from comment #11)
> Like HTML document itself, any byte sequence is parsable as a mime type.
> There is no such concept "ill-formed" in the mimesniff spec.
> https://mimesniff.spec.whatwg.org/#parsing-a-mime-type

I would be inclined to doubt that net_parseContentType <http://mxr.mozilla.org/mozilla-central/source/netwerk/base/nsURLHelper.cpp#935> actually implements that algorithm (except perhaps by accident); however, that's irrelevant, what I mean by "ill-formed content type" in the context of this bug is "any and all circumstances that cause the condition `contentType.EqualsLiteral(UNKNOWN_CONTENT_TYPE)` at http://mxr.mozilla.org/mozilla-central/source/layout/style/Loader.cpp#908 to be true."  (If that turns out to never happen, so much the better; then we can just go ahead and take that case out.)
It seems all implementations agree here unfortunately, so it seems unlikely we could tidy this up: https://github.com/web-platform-tests/wpt/pull/13144. I'll work on fixing the HTML Standard as well. If there's still desire to change this I think that should go through the HTML Standard community at this point as it'd require multiple vendors to commit, new tests, etc.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.