808593 - (mimesniff) [meta] Implement the MIME Sniffing Standard

Reporter

Description

•

12 years ago

The MIME Sniffing Standard describes how to sniff the MIME/Internet media type of files in an interoperable way.

While there are many areas where sniffing is optional and up to the User Agent, there are also a few requirements on User Agents for sniffing in particular contexts (e.g. with unknown types).

This is a tracking bug to track implementation of (or disagreement with) the sniffing standard.

Zack Weinberg (:zwol)

Comment 1

•

12 years ago

I'm assuming I've been cc:ed because of my work a few years back on what happens if you process arbitrary (and potentially malicious) content as CSS.

CSS is not *sniffed* in the strict sense, but in quirks mode we accept *any* MIME type as potentially CSS as long as it's same origin with the document (see bug 524223).  I would support getting rid of that quirk, but I don't think we have a bug for it right now, and it would need extensive web-compat testing and potentially evangelism.

Bug 521039 and bug 562377 are related CSS MIME issues.  Bug 560388 and bug 560392 are related Content-Type-handling-in-general issues.

Depends on: 521039, 562377, 560388, 560392

Gordon P. Hemsley [:GPHemsley]

Reporter

Comment 2

•

12 years ago

(In reply to Zack Weinberg (:zwol) from comment #1)
> I'm assuming I've been cc:ed because of my work a few years back on what
> happens if you process arbitrary (and potentially malicious) content as CSS.

You were CC'd primarily because you were the only one I could find who expressed knowledge (in bug 560392 comment 4 and bug 560388 comment 4) of the IETF document of which this WHATWG spec is a successor:

http://tools.ietf.org/html/draft-abarth-mime-sniff
http://tools.ietf.org/html/draft-ietf-websec-mime-sniff

But I see that, yes, this was related to the issues with CSS that you were involved with. The spec still does not address CSS specifically, so I will look into it and get back to you.

Zack Weinberg (:zwol)

Comment 3

•

12 years ago

Yeah, I'm not presently up to speed on that spec as it relates to anything *but* CSS.  If nobody else has the time, I can try to evaluate it in more detail (as of today I have a bunch more discretionary time than I have had for the past several months).

Regarding CSS, our standards-mode behavior is: anything with "Content-Type: text/css", or with no Content-Type header at all, is *assumed* to be CSS and parsed as such.  Anything else is discarded regardless of its contents.  I would support putting that behavior into the sniffing spec.  http://www.w3.org/TR/CSS21/syndata.html#charset defines character-set sniffing rules for CSS; I do not remember whether or not we implement this, but regardless I don't think they need to be duplicated into the sniffing spec (perhaps a normative reference would be appropriate).
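For illustration, the decision described above, together with the quirks-mode exception from comment 1, could be sketched roughly as follows. This is a minimal Python sketch, not Gecko code: the function name `should_parse_as_css` and its parameters are hypothetical, and treating only a missing header (not an empty one) as "no Content-Type" is an assumption.

```python
from typing import Optional


def should_parse_as_css(content_type: Optional[str],
                        quirks_mode: bool,
                        same_origin: bool) -> bool:
    """Hypothetical helper: decide whether a stylesheet response is parsed as CSS.

    content_type is the raw Content-Type header value, or None if the
    response carried no Content-Type header at all (assumption: an empty
    header value is NOT treated as missing).
    """
    # No Content-Type header at all: assumed to be CSS.
    if content_type is None:
        return True

    # Compare only the type/subtype, ignoring parameters such as charset.
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence == "text/css":
        return True

    # Quirks-mode exception from comment 1: any MIME type is accepted
    # as long as the sheet is same-origin with the document.
    return quirks_mode and same_origin
```

Note that nothing here inspects the response body, which is why this is type checking rather than sniffing in the strict sense.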

cc:ing some more of the usual suspects.

Gordon P. Hemsley [:GPHemsley]

Reporter

Comment 4

•

12 years ago

(In reply to Zack Weinberg (:zwol) from comment #3)
> Regarding CSS, our standards-mode behavior is: anything with "Content-Type:
> text/css", or with no Content-Type header at all, is *assumed* to be CSS and
> parsed as such.  Anything else is discarded regardless of its contents.  I
> would support putting that behavior into the sniffing spec. 
> http://www.w3.org/TR/CSS21/syndata.html#charset defines character-set
> sniffing rules for CSS; I do not remember whether or not we implement this,
> but regardless I don't think they need to be duplicated into the sniffing
> spec (perhaps a normative reference would be appropriate).

As I understand it, the most recent spec is here:

http://dev.w3.org/csswg/css3-syntax/

But as this is intended to be a tracking bug, I imagine it would be better if we moved the CSS-specific discussion to a different bug (or mailing list).

Boris Zbarsky [:bzbarsky]

Comment 5

•

12 years ago

We shouldn't be sniffing CSS.

Past that, Christian and I are probably most familiar with the sniffing stuff.  Except maybe for HTML bits where it's Henri.

Giving the linked document a read is on my todo list.  Of course it might have helped if it had not been totally rewritten from the thing that we had already read and reviewed before.... :(

Boris Zbarsky [:bzbarsky]

Comment 6

•

12 years ago

On the other hand, at first glance it was mostly reformatted, with no real substantive changes so far, right?

Gordon P. Hemsley [:GPHemsley]

Reporter

Comment 7

•

12 years ago

(In reply to Boris Zbarsky (:bz) from comment #5)
> Giving the linked document a read is on my todo list.  Of course it might
> have helped if it had not been totally rewritten from the thing that we had
> already read and reviewed before.... :(

That was intended to be a feature, not a bug. :/

(In reply to Boris Zbarsky (:bz) from comment #6)
> On the other hand, at first glance it was mostly reformatted, with no real
> substantive changes so far, right?

That was the plan. (And I think I've been mostly successful with it.)

Boris Zbarsky [:bzbarsky]

Comment 8

•

12 years ago

One obvious difference between the spec and what we do is that the spec allows sniffing text/plain to various types and then rendering them in the browser.  I'm not convinced we're willing to implement that.  Right now we very carefully force such types to be handled outside the browser, even for "non-scriptable" cases, because otherwise we might render content that filtering proxies would have blocked had they had the right type for it.

Boris Zbarsky [:bzbarsky]

Comment 9

•

12 years ago

Apart from the above, I think the differences between what we do and what this draft proposes are not fatal...  Though I did skim pretty quickly; I may have missed something.

Henri Sivonen (:hsivonen) (away from Bugzilla until 2023-09-11)

Comment 10

•

12 years ago

(In reply to Gordon P. Hemsley [:gphemsley] from comment #0)
> This is a tracking bug to track implementation of (or disagreement with) the
> sniffing standard.

I think the bits ‘type is equal to "font" or’ and ‘type is equal to "archive" or’ are highly questionable. The most popular font types are in the process of getting application/ types and the most popular archives already have application/ types.

I suspect the ‘a reasonable amount of time has elapsed, as determined by the user agent.’ is unnecessary. The HTML spec has the same provision for the <meta> prescan. Firefox didn’t implement it, a couple of people complained, then fixed their code, and the sky didn’t fall.

What are the use cases for ‘Sniffing archives specifically’? It appears that it sniffs ODF-style files (http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#__RefHeading__752809_826425813 ; EPUB, ODF, InDesign, etc.) and Open Packaging Conventions-based files (https://en.wikipedia.org/wiki/Open_Packaging_Conventions ; OOXML, XPS, etc.) as zip archives. Is that intended and a desirable outcome in the light of use cases?
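To illustrate the concern: ODF and OPC packages are ZIP containers, so they start with the ZIP local file header signature, and a sniffer that matches only on leading bytes cannot tell them apart from a generic zip. The following is a minimal Python sketch under stated assumptions: `sniff_archive` and `make_odf_like_package` are hypothetical names, and the signature table is a simplified subset (zip and gzip only), not the spec's algorithm verbatim.

```python
import io
import zipfile
from typing import Optional

# Simplified subset of leading-byte signatures for archive types.
ARCHIVE_SIGNATURES = [
    (b"PK\x03\x04", "application/zip"),       # ZIP local file header
    (b"\x1f\x8b\x08", "application/x-gzip"),  # gzip member header
]


def sniff_archive(header: bytes) -> Optional[str]:
    """Hypothetical prefix-based sniffer: return a MIME type or None."""
    for signature, mime_type in ARCHIVE_SIGNATURES:
        if header.startswith(signature):
            return mime_type
    return None


def make_odf_like_package() -> bytes:
    """Build an in-memory ZIP whose first entry is an ODF-style 'mimetype' file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
    return buf.getvalue()


# The ODF-like package is reported as a plain zip archive, because only
# the container signature is examined, not the 'mimetype' entry inside it.
package = make_odf_like_package()
sniffed = sniff_archive(package[:16])
```

Distinguishing such packages from generic zips would require looking past the container signature, for example at the first entry's name and contents.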

Otherwise, looks good to me, but then I failed to notice the problem bz pointed out in the previous comment.

(In reply to Zack Weinberg (:zwol) from comment #3)
> http://www.w3.org/TR/CSS21/syndata.html#charset defines character-set
> sniffing rules for CSS; I do not remember whether or not we implement this,

Unfortunately, we mostly do and the rules don’t make sense. Bug 796882 has morphed into implementing Level 3 rules instead, but those rules apply only after we’ve decided the file is CSS.

Gordon P. Hemsley [:GPHemsley]

Reporter

Updated

•

12 years ago

Depends on: 471020

Masatoshi Kimura [:emk]

Updated

•

12 years ago

Blocks: whatwg

Gordon P. Hemsley [:GPHemsley]

Reporter

Updated

•

12 years ago

Depends on: 789123

Masatoshi Kimura [:emk]

Updated

•

11 years ago

Depends on: 864851

Gordon P. Hemsley [:GPHemsley]

Reporter

Updated

•

11 years ago

Depends on: 862088

Gordon P. Hemsley [:GPHemsley]

Reporter

Updated

•

11 years ago

Depends on: 877500

Masatoshi Kimura [:emk]

Updated

•

11 years ago

Depends on: 878922

Masatoshi Kimura [:emk]

Updated

•

11 years ago

Depends on: 975809

Masatoshi Kimura [:emk]

Updated

•

10 years ago

Depends on: 986924

Benjamin Smedberg

Updated

•

8 years ago

Product: Core → Firefox

Version: Trunk → unspecified

Anne (:annevk)

Updated

•

7 years ago

Depends on: 1406337

Anne (:annevk)

Updated

•

7 years ago

Depends on: 1420575

Anne (:annevk)

Updated

•

7 years ago

Depends on: 1423877

Anne (:annevk)

Updated

•

6 years ago

Depends on: 500713

Sylvestre Ledru [:Sylvestre]

Updated

•

5 years ago

Type: defect → enhancement

Masatoshi Kimura [:emk]

Updated

•

4 years ago

Depends on: 1602277

Masatoshi Kimura [:emk]

Updated

•

4 years ago

No longer depends on: 1602277

Anne (:annevk)

Updated

•

3 years ago

Depends on: 1718618

Frederik Braun [:freddy]

Updated

•

3 years ago

Depends on: 1725933

Tom S [:evilpie]

Updated

•

3 years ago

Depends on: 1725190

BMO Automation

Updated

•

2 years ago

Severity: normal → S3