Closed
Bug 802030
Opened 12 years ago
Closed 11 years ago
Stop treating us-ascii, iso-8859-1, and Windows-1252 as distinct encodings
Categories
(Core :: Internationalization, defect)
RESOLVED DUPLICATE of bug 936466
People
(Reporter: hsivonen, Unassigned)
References
(Blocks 1 open bug)
Details
At present, our character encoding infrastructure treats iso-8859-1 and Windows-1252 as distinct encodings even though they have identical decoders and having a true iso-8859-1 encoder is kind of pointless. In the Encoding Standard, iso-8859-1 is merely an alias for Windows-1252. We should get rid of the separate iso-8859-1 encoding and make its labels aliases for Windows-1252.
Risk: it is possible that there exists a site that reads an encoding label supplied by Gecko and expects it to say iso-8859-1 and 10 deal if it says Windows-1252.
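A minimal sketch of the proposed label handling, assuming a plain lookup table rather than Gecko's actual data structures. The names `LABEL_TO_ENCODING` and `resolve_label` are hypothetical; the three labels are the ones this bug is about, and the whitespace trimming and ASCII case-insensitive matching follow the Encoding Standard's label lookup.

```python
import string
from typing import Optional

# Hypothetical alias table: every label resolves to one canonical encoding.
# Only the labels discussed in this bug are listed; a real table is much larger.
LABEL_TO_ENCODING = {
    "windows-1252": "windows-1252",
    "iso-8859-1": "windows-1252",
    "us-ascii": "windows-1252",
}

# ASCII whitespace for label matching: tab, LF, FF, CR, space.
_ASCII_WHITESPACE = "\t\n\x0c\r "

# Lowercase only A-Z so that non-ASCII characters are never folded.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def resolve_label(label: str) -> Optional[str]:
    """Return the canonical encoding name for a label, or None if unknown."""
    normalized = label.strip(_ASCII_WHITESPACE).translate(_ASCII_LOWER)
    return LABEL_TO_ENCODING.get(normalized)
```

With a table like this, `resolve_label("ISO-8859-1")` and `resolve_label("us-ascii")` both report `windows-1252`, which is exactly the behaviour the risk above is concerned with: code that reads the label back would no longer see `iso-8859-1`.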
Comment 1•12 years ago (Reporter)
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> and 10 deal
and can't deal
Comment 2•12 years ago
This also applies to the following:
* iso-8859-11 is the same as windows-874 in the spec and in IE/WebKit.
* tis-620 is the same as windows-874 in the spec and in IE/WebKit.
* us-ascii is the same as windows-1252 in the spec, but not in any browser.
* iso-8859-9 is the same as windows-1254 in the spec and WebKit, but not in IE.
* gbk is the same as gb2312 in the spec and WebKit, but not in IE.
* big5-hkscs is the same as big5 in the spec and IE, but not in WebKit.
* euc-kr is the same as x-windows-949 in the spec and in IE/WebKit.
* iso-8859-6-e and iso-8859-6-i are the same as iso-8859-6 in the spec and WebKit. IE seems not to recognize them at all.
* iso-8859-8-e is the same as iso-8859-8 in the spec and WebKit. IE seems not to recognize it.
Some or all of these should probably be in different bugs, though. In particular, all of them except iso-8859-9/windows-1254 are already implemented in at least one browser, so should be safer than this.
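The list above can be transcribed into a small table for querying. This is only a sketch: the names `ALIASES`, `spec_target`, `browsers_matching`, and `unmatched_labels` are hypothetical, the data is copied from this comment as written, and "does not recognize" is treated the same as "does not match".

```python
# label -> (encoding it equals per the spec, browsers reported to agree).
# Transcribed from the list in this comment; nothing here was re-tested.
ALIASES = {
    "iso-8859-11": ("windows-874", frozenset({"IE", "WebKit"})),
    "tis-620": ("windows-874", frozenset({"IE", "WebKit"})),
    "us-ascii": ("windows-1252", frozenset()),
    "iso-8859-9": ("windows-1254", frozenset({"WebKit"})),
    "gbk": ("gb2312", frozenset({"WebKit"})),
    "big5-hkscs": ("big5", frozenset({"IE"})),
    "euc-kr": ("x-windows-949", frozenset({"IE", "WebKit"})),
    "iso-8859-6-e": ("iso-8859-6", frozenset({"WebKit"})),
    "iso-8859-6-i": ("iso-8859-6", frozenset({"WebKit"})),
    "iso-8859-8-e": ("iso-8859-8", frozenset({"WebKit"})),
}


def spec_target(label: str) -> str:
    """Encoding the label is listed as identical to in the spec."""
    return ALIASES[label][0]


def browsers_matching(label: str) -> frozenset:
    """Browsers listed as already treating the label like the spec does."""
    return ALIASES[label][1]


def unmatched_labels() -> list:
    """Labels for which no listed browser matches the spec."""
    return sorted(label for label, (_, seen) in ALIASES.items() if not seen)
```

Calling `unmatched_labels()` picks out the entries with no browser precedent, which are the ones where a change carries the most web-compatibility risk.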
Comment 3•12 years ago
I should add that the data from the previous comment comes only from .characterSet, and didn't involve analysis of encoders or decoders. But I hope that if .characterSet is the same in a browser, the encoder/decoder is the same too.
Comment 4•12 years ago
And I also should add that by "WebKit" I mean "Chrome 23 dev". Anne tells me Safari uses a different ICU version.
Comment 5•12 years ago
This should cover us-ascii too. I'll open a new bug for the other ones, since they're more likely to be web-compatible.
Summary: Stop treating iso-8859-1 and Windows-1252 as distinct encodings → Stop treating us-ascii, iso-8859-1, and Windows-1252 as distinct encodings
Comment 6•11 years ago (Reporter)
The browser-side label handling is done. Getting rid of the extra code is blocked on mailnews.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE