Closed
Bug 414064
Opened 17 years ago
Closed 14 years ago
Remove support for UTF-7 (and others) per HTML5 spec
Categories
(Core :: Internationalization, defect)
Core
Internationalization
Tracking
()
RESOLVED
FIXED
mozilla5
People
(Reporter: dveditz, Assigned: smontagu)
References
(Blocks 1 open bug)
Details
(Keywords: dev-doc-complete, html5, Whiteboard: [sg:want P3])
Attachments
(2 files, 8 obsolete files)
(deleted),
patch
|
emk
:
review+
benjamin
:
approval2.0-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
Section 8.2.2.2 of the WHAT-WG HTML5 spec says
User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
encodings. Support for UTF-32 is not recommended. This encoding
is rarely used, and frequently misimplemented.
http://www.whatwg.org/specs/web-apps/current-work/#character0
I don't know about UTF-32, but as far as I can tell UTF-7 is only used to demo XSS holes and isn't used for live web content. Even the authors of UTF-7 agree:
http://mail.apps.ietf.org/ietf/charsets/msg01746.html
Some bugs seem to imply we may need utf-7 or a variant for IMAP folder names. Don't know if that's still a requirement or if there's a way to preserve it for mail protocols while killing its use for the web and mail _content_.
Comment 1•17 years ago
|
||
We still need IMAP Mod utf-7 for IMAP folder names (and keywords), and will for the forseeable future, afaik. We use the intl converters to convert back and forth from unicode to imap mod utf7.
Assignee | ||
Comment 2•17 years ago
|
||
Could we change the registered name of the converter to some arbitrary internal name so that it works when we call it manually for IMAP but doesn't recognize "utf-7" in content?
OTOH, I think I remember Neil saying when we discussed this on IRC a little while ago that he still comes across UTF-7 in mail content.
Comment 3•17 years ago
|
||
IMAP uses MUTF7, "x-imap4-modified-utf7" - see http://mxr.mozilla.org/mailnews/search?string=mutf7
Comment 4•16 years ago
|
||
Within the IMAP community, the fact that MUTF7 exists is generally believed to
be a design error, but it still exists. There is a draft proposal that might
deprecate it, but it's both controversial and not done:
http://tools.ietf.org/html/draft-ietf-eai-imap-utf8
So despite the need for MUTF7 for the time being in IMAP mailbox names, removal
of support for the regular UTF-7 charset seems like a good idea.
Comment 5•15 years ago
|
||
Just wanted to poke this bug to see if anyone has given further thought on how we can remove UTF-7 globally. We continue to see a steady trickle of XSS bugs that leverage UTF-7, so it would be great to shut it off once and for all.
Updated•15 years ago
|
Whiteboard: [sg:want P3]
Updated•15 years ago
|
Comment 6•15 years ago
|
||
Attachment #432349 -
Flags: review?(smontagu)
Comment 7•15 years ago
|
||
Attachment #432350 -
Flags: review?(smontagu)
Assignee | ||
Comment 8•15 years ago
|
||
So are we sure that there are no mail clients still around that send UTF-7 mails?
Comment 9•15 years ago
|
||
(In reply to comment #8)
MS Exchange 2000 uses UTF-7 for Delivery status notification messages
http://support.microsoft.com/kb/819831/en-us/
http://www.microsoft.com/exchange/2007/support/lifecycle/2000.mspx
Comment 10•15 years ago
|
||
(In reply to comment #9)
End of lifecycle of MS Exchange 2000 is January 1, 2011.
Comment 11•15 years ago
|
||
So, if we theoretically removed our UTF-7 support today, what issues could/would occur?
Comment 12•15 years ago
|
||
Webkit has been already removed UTF-7 support.
I filed a security bug around UTF-7 decoder of Webkit.
It is fixed by removal of UTF-7 support.
Comment 13•15 years ago
|
||
(In reply to comment #12)
> I filed a security bug around UTF-7 decoder of Webkit.
Oops, it is NOT correct description.
I thought it may be security-related, and filed it as Security-sensitive.
But, No security attack method using it was found.
Comment 14•15 years ago
|
||
(In reply to comment #13)
> (In reply to comment #12)
> > I filed a security bug around UTF-7 decoder of Webkit.
> Oops, it is NOT correct description.
> I thought it may be security-related, and filed it as Security-sensitive.
> But, No security attack method using it was found.
And main security-related reason was "UTF-7 is commonly used for XSS".
Comment 15•14 years ago
|
||
Comment on attachment 432349 [details] [diff] [review]
Remove UTF7 and move BasicUTF7
Canceling r? for now; comments still welcome though.
Attachment #432349 -
Flags: review?(smontagu)
Updated•14 years ago
|
Attachment #432350 -
Flags: review?(smontagu)
Comment 16•14 years ago
|
||
The way we handled this at Opera was disabling UTF-7 just for web content. UTF-32 support was nuked completely though.
Assignee | ||
Comment 17•14 years ago
|
||
I'm going to investigate moving the UTF-7 converters into mailnews/imap or somewhere.
Status: NEW → ASSIGNED
Assignee | ||
Comment 18•14 years ago
|
||
This patch breaks tests in mailnews/base, which is a Good Thing, because it means that there is already testing in place for the follow-up patch to reimplement UTF-7 in mailnews.
Attachment #462910 -
Flags: review?(VYV03354)
Assignee | ||
Comment 19•14 years ago
|
||
Attachment #462911 -
Flags: review?(hsivonen)
Assignee | ||
Comment 20•14 years ago
|
||
The right patch this time
Attachment #462911 -
Attachment is obsolete: true
Attachment #462911 -
Flags: review?(hsivonen)
Assignee | ||
Comment 21•14 years ago
|
||
Comment on attachment 462910 [details] [diff] [review]
Remove utf-7 (intl part)
Removing review request -- we may still need the alias handling.
Attachment #462910 -
Flags: review?(VYV03354)
Assignee | ||
Comment 22•14 years ago
|
||
Filed bug 587475 with a patch reimplementing UTF-7 decoding and encoding in mailnews using character sinks.
Comment 23•14 years ago
|
||
Be real nice if we could get this into 2.0, then we could stop worrying about these UTF-7 smuggled XSS issues.
blocking2.0: --- → ?
Assignee | ||
Comment 24•14 years ago
|
||
Tested with the patch in bug 587475.
Attachment #462910 -
Attachment is obsolete: true
Assignee | ||
Updated•14 years ago
|
Attachment #466601 -
Flags: review?(VYV03354)
Assignee | ||
Updated•14 years ago
|
Attachment #462912 -
Flags: review?(hsivonen)
Assignee | ||
Comment 25•14 years ago
|
||
Assignee | ||
Comment 26•14 years ago
|
||
Unfortunately View Source of a UTF-7 message is broken with this approach. I haven't tested viewing an .eml file in browser, but I expect it will be broken also.
Assignee | ||
Comment 27•14 years ago
|
||
Sorry, the previous comment belongs in bug 587475
Updated•14 years ago
|
Attachment #466601 -
Flags: review?(VYV03354) → review+
Comment on attachment 462912 [details] [diff] [review]
Remove utf-7 (parser part)
If mailnews restores an UTF-7 decoder, what's the expected behavior for an HTML message that doesn't specify the encoding on the MIME layer but has <meta http-equiv="Content-Type" content="text/html; charset=utf-7"> or <meta http-equiv="Content-Type" content="text/html; charset=x-imap4-modified-utf7"> inside the message?
Would GetUnicodeDecoderRaw when called with "utf-7" or "x-imap4-modified-utf7" from return a decoder when the HTML parser runs in mailnews?
In any case, please add a reftest that loads an HTML document that has <meta http-equiv="Content-Type" content="text/html; charset=utf-7"> and another with <meta http-equiv="Content-Type" content="text/html; charset=x-imap4-modified-utf7"> and content decodes differently in UTF-7 and Windows-1252.
Updated•14 years ago
|
blocking2.0: ? → -
Assignee | ||
Comment 29•14 years ago
|
||
(In reply to comment #29)
> Would GetUnicodeDecoderRaw when called with "utf-7" or "x-imap4-modified-utf7"
> from return a decoder when the HTML parser runs in mailnews?
The new UTF-7 decoder in mailnews is a character sink, not an nsIUnicodeDecoder, so GetUnicodeDecoderRaw is not going to return a decoder in this scenario.
> If mailnews restores an UTF-7 decoder, what's the expected behavior for an HTML
> message that doesn't specify the encoding on the MIME layer but has <meta
> http-equiv="Content-Type" content="text/html; charset=utf-7"> or <meta
> http-equiv="Content-Type" content="text/html; charset=x-imap4-modified-utf7">
> inside the message?
"UTF-7" is special cased in MimeInlineText_convert_and_parse_line() in the patch for bug 587475 so that it will be decoded, and I have tested that this works both in text and HTML messages. "x-imap4-modified-utf7" will not be decoded.
> In any case, please add a reftest that loads an HTML document that has <meta
> http-equiv="Content-Type" content="text/html; charset=utf-7"> and another with
> <meta http-equiv="Content-Type" content="text/html;
> charset=x-imap4-modified-utf7"> and content decodes differently in UTF-7 and
> Windows-1252.
I don't mind doing this, but do you think it adds significant value beyond the xpcshell test in attachment 466602 [details] [diff] [review]?
Comment on attachment 462912 [details] [diff] [review]
Remove utf-7 (parser part)
(In reply to comment #30)
> (In reply to comment #29)
>
> > Would GetUnicodeDecoderRaw when called with "utf-7" or "x-imap4-modified-utf7"
> > from return a decoder when the HTML parser runs in mailnews?
>
> The new UTF-7 decoder in mailnews is a character sink, not an
> nsIUnicodeDecoder, so GetUnicodeDecoderRaw is not going to return a decoder in
> this scenario.
Great.
> > If mailnews restores an UTF-7 decoder, what's the expected behavior for an HTML
> > message that doesn't specify the encoding on the MIME layer but has <meta
> > http-equiv="Content-Type" content="text/html; charset=utf-7"> or <meta
> > http-equiv="Content-Type" content="text/html; charset=x-imap4-modified-utf7">
> > inside the message?
>
> "UTF-7" is special cased in MimeInlineText_convert_and_parse_line() in the
> patch for bug 587475 so that it will be decoded, and I have tested that this
> works both in text and HTML messages. "x-imap4-modified-utf7" will not be
> decoded.
Code inspection of the patch suggests to me that internal encoding declaration (<meta>) saying UTF-7 wouldn't get decoded as UTF-7, but I'm OK with that.
> > In any case, please add a reftest that loads an HTML document that has <meta
> > http-equiv="Content-Type" content="text/html; charset=utf-7"> and another with
> > <meta http-equiv="Content-Type" content="text/html;
> > charset=x-imap4-modified-utf7"> and content decodes differently in UTF-7 and
> > Windows-1252.
>
> I don't mind doing this, but do you think it adds significant value beyond the
> xpcshell test in attachment 466602 [details] [diff] [review]?
Since there's not going to be an nsIUnicodeDecoder for UTF-7 in any configuration, I think it's not of significant value to add more tests.
r=me.
Attachment #462912 -
Flags: review?(hsivonen) → review+
Comment 31•14 years ago
|
||
Nominating for 2.0 (and I think should be backported to active branches as well).
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
blocking2.0: - → ?
Reporter | ||
Comment 32•14 years ago
|
||
We'll want this on branches after we figure out what we're doing on trunk and that it's not breaking things we care about. Not blocking a particular release.
blocking1.9.1: ? → needed
blocking1.9.2: ? → needed
status1.9.1:
--- → wanted
status1.9.2:
--- → wanted
Reporter | ||
Comment 33•14 years ago
|
||
(but I do hope it blocks Firefox 4)
Updated•14 years ago
|
blocking2.0: ? → betaN+
Assignee | ||
Comment 34•14 years ago
|
||
Patch using the mechanism from bug 601429 to mark the charsets as vulnerable, which will make GetUnicodeDecoder return an error. This still depends on bug 587475, but that will be a lot simpler to fix now too.
Attachment #432349 -
Attachment is obsolete: true
Attachment #432350 -
Attachment is obsolete: true
Attachment #462912 -
Attachment is obsolete: true
Attachment #466601 -
Attachment is obsolete: true
Comment 35•14 years ago
|
||
It might make sense to reorder the .notForBrowser block so that x-imap4-modified-utf7 is next to utf-7, I was about to post asking why x-imap4-modified-utf7 doesn't get .notForBrowser.
Also, some comments explaining the difference between .notForBrowser and .isXSSVulnerable would be nice; the former list has a lot of things that I don't immediately see what the problem is (us-ascii, for instance) and am left wondering whether others of those ought to be tagged xss-vulnerable.
Assignee | ||
Comment 36•14 years ago
|
||
.notForBrowser is *supposed* to mean "we need to be able to decode this if it appears in content, but we don't want to expose it in the UI", but you may well be right that some of the existing entries should become .isXSSVulnerable
Updated•14 years ago
|
blocking2.0: betaN+ → -
Assignee | ||
Comment 37•14 years ago
|
||
Attachment #491486 -
Attachment is obsolete: true
Attachment #500009 -
Flags: review?(VYV03354)
Assignee | ||
Comment 38•14 years ago
|
||
Attachment #466602 -
Attachment is obsolete: true
Updated•14 years ago
|
Attachment #500009 -
Flags: review?(VYV03354) → review+
Assignee | ||
Updated•14 years ago
|
Attachment #500009 -
Flags: approval2.0?
Comment 39•14 years ago
|
||
Comment on attachment 500009 [details] [diff] [review]
Flag utf-7 as XSS-vulnerable v.2
I believe this has missed the boat and we shouldn't introduce any new risk at this point. This should wait for the next 3-month cycle.
Attachment #500009 -
Flags: approval2.0? → approval2.0-
Updated•14 years ago
|
Keywords: checkin-needed
Comment 40•14 years ago
|
||
Pushed:
http://hg.mozilla.org/projects/cedar/rev/6a21a25f2400
http://hg.mozilla.org/projects/cedar/rev/4d103ea8178d
Flags: in-testsuite+
Whiteboard: [sg:want P3] → [sg:want P3][fixed-in-cedar]
Target Milestone: --- → mozilla2.2
Updated•14 years ago
|
Keywords: checkin-needed
Version: unspecified → Trunk
Comment 41•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/6a21a25f2400
http://hg.mozilla.org/mozilla-central/rev/4d103ea8178d
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [sg:want P3][fixed-in-cedar] → [sg:want P3]
Comment 42•14 years ago
|
||
I've added dev-doc-needed as at least that page must be upgraded:
https://developer.mozilla.org/en/Character_Sets_Supported_by_Gecko
Keywords: dev-doc-needed
Comment 43•14 years ago
|
||
Documented:
https://developer.mozilla.org/en/Character_Sets_Supported_by_Gecko
Also mentioned on Firefox 5 for developers; and while I was at it, I overhauled that page a bit and added some links to it. It's also tagged so it will show up when people go looking for things to do.
Keywords: dev-doc-needed → dev-doc-complete
Reporter | ||
Updated•13 years ago
|
blocking1.9.1: needed → ---
blocking1.9.2: needed → ---
Reporter | ||
Updated•13 years ago
|
status-firefox5:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•