Closed
Bug 454
Opened 27 years ago
Closed 24 years ago
Unix: 0x80-0x9F in cp1252 do not display correctly
Categories
(Core :: Internationalization, defect, P2)
Tracking
()
VERIFIED
FIXED
People
(Reporter: tim, Assigned: erik)
References
Details
(Keywords: platform-parity)
Attachments
(1 file)
(deleted),
patch
Created by Tim Eliseo (tim@quiknet.com) on Friday, June 19, 1998 8:48:18 PM PDT
Additional Details :
Many Web pages use quote characters in the range 0x91-0x94
which are Microsoft codepage 1252 extensions. For X these
are currently mapped to the ? character rather than normal
quote characters. A patch follows to fix this. Note that
sequences such as ‘ are currently mapped properly; this
problem only shows up when the actual characters are in the
file.
--- mozilla/lib/libi18n/sbconvtb.c Sat May 9 03:57:48 1998
+++ mozilla/lib/libi18n/sbconvtb.c.new Fri Jun 19 20:28:19 1998
@@ -71,7 +71,7 @@
 /* Tables for Win CP1252 -> ISO 8859-1 */
 PRIVATE unsigned char cp1252_to_iso8859_1[] = {
 /*8x*/ '?', '?', ',', 'f', '?', '?', '?', '?', '^', '?', 'S', '<', '?', '?', '?', '?',
-/*9x*/ '?', '?', '?', '?', '?', '*', '-', '-', '~', '?', 's', '>', '?', '?', '?', 'Y',
+/*9x*/ '?', '`', '\'', '"', '"', '*', '-', '-', '~', '?', 's', '>', '?', '?', '?', 'Y',
 /*Ax*/ 0xA0,0xA1,0xA2,0xA3,0xA4,0xA5,0xA6,0xA7,0xA8,0xA9,0xAA,0xAB,0xAC,0xAD,0xAE,0xAF,
 /*Bx*/ 0xB0,0xB1,0xB2,0xB3,0xB4,0xB5,0xB6,0xB7,0xB8,0xB9,0xBA,0xBB,0xBC,0xBD,0xBE,0xBF,
 /*Cx*/ 0xC0,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,0xC8,0xC9,0xCA,0xCB,0xCC,0xCD,0xCE,0xCF,
For those of you like myself annoyed by this bug in the
commercial Netscape version, here's a quick fix:
adb -w netscape
cp1252_to_iso8859_1+0x11?W 0x22222760
^d
This is correct for little-endian architectures.
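The adb one-liner and the patch describe the same change. Assuming the table's first entry corresponds to 0x80 (as the /*8x*/ row suggests), offset 0x11 is the entry for 0x91, and a 32-bit write of 0x22222760 on a little-endian machine stores the bytes 0x60, 0x27, 0x22, 0x22 at increasing addresses. The following is only a sketch of that arithmetic; the helper names `unpack_le32` and `table_offset` are made up for illustration and do not appear in the Netscape source.

```c
#include <stdint.h>

/* Hypothetical helper: split a 32-bit word into the four bytes a
 * little-endian machine stores at increasing addresses. */
static void unpack_le32(uint32_t word, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((word >> (8 * i)) & 0xFF);
}

/* Hypothetical helper: table index for a cp1252 byte, assuming the
 * table's first entry is the one for 0x80. */
static unsigned table_offset(unsigned char cp1252_byte)
{
    return (unsigned)cp1252_byte - 0x80u;
}
```

Unpacking 0x22222760 this way yields a grave accent, an apostrophe, and two double quotes, which are the four values the patch puts at 0x91 through 0x94. On a big-endian machine the same word would land in the opposite byte order, which is why the note above restricts the fix to little-endian architectures.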
Updated•26 years ago
Status: NEW → ASSIGNED
Comment 1•26 years ago
reassigned this to erik.
The patch does not work, since it will break JavaScript string literals by
forcing them to terminate earlier than they should. We have to move the
fallback to the XFE, but we need to keep those character values.
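One way to picture the objection in this comment: if the conversion table runs before the script is parsed, a 0x93 or 0x94 byte inside a string literal becomes an ASCII double quote and closes the literal early. The sketch below is an illustration only; `convert_in_place` and `literal_body_length` are hypothetical names, the converter models just the two double-quote entries of the proposed table, and the scanner ignores backslash escapes.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patched table: only the entries for
 * 0x93 and 0x94 are modelled; every other byte passes through. */
static void convert_in_place(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0x93 || buf[i] == 0x94)
            buf[i] = '"';
}

/* Number of bytes between the opening quote at s[0] and the next '"'.
 * Escapes are ignored to keep the sketch short. */
static size_t literal_body_length(const unsigned char *s)
{
    size_t n = 0;
    while (s[1 + n] != '\0' && s[1 + n] != '"')
        n++;
    return n;
}
```

For a literal whose body contains the bytes 0x93 and 0x94, the scanner sees the full body before conversion but stops at the first converted quote afterwards. Doing the fallback at display time instead leaves the original byte values intact for the parser.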
Updated•26 years ago
Summary: codepage 1252 quote characters not mapped properly → 0x80-0x9F in cp1252 does not display correctly on Mac and UNIX
Comment 2•26 years ago
We won't take the same approach, but we need to put code in the X rendering
engine to render the Unicode code points that correspond to 0x80-0x9F in
cp1252. Changed the summary to "0x80-0x9F in cp1252 does not display correctly
on Mac and UNIX".
Updated•26 years ago
QA Contact: 3851
Comment 3•26 years ago
Mac and Windows are now working in apprunner and viewer. I don't think UNIX
is working. IQA, could you verify? We need to fix GTK GFX ...
I18n component in Bugzilla being retired. Moving these bugs to
Internationalization component.
Updated•26 years ago
OS: other → Linux
Summary: 0x80-0x9F in cp1252 does not display correctly on Mac and UNIX → 0x80-0x9F in cp1252 does not display correctly on UNIX
Whiteboard: Mac is fixed. Unix is not.
Comment 5•26 years ago
Changed summary from "0x80-0x9F in cp1252 does not display correctly on Mac and
UNIX" to "0x80-0x9F in cp1252 does not display correctly on UNIX".
I believe Mac is now working. IQA, please verify Mac. If Mac is not working,
please open a separate bug. One bug for two platforms is difficult to track.
Thanks.
Updated•26 years ago
Assignee: ftang → erik
Status: ASSIGNED → NEW
Target Milestone: M5
Comment 6•26 years ago
Reassigning the UNIX rendering bug to erik and marking the target fix as M5.
Assignee | ||
Updated•26 years ago
Status: NEW → ASSIGNED
Updated•26 years ago
Summary: 0x80-0x9F in cp1252 does not display correctly on UNIX → UNIX GFX Unicode Text Drawing- 0x80-0x9F in cp1252 does not display correctly on UNIX
Summary: UNIX GFX Unicode Text Drawing- 0x80-0x9F in cp1252 does not display correctly on UNIX → [PP]UNIX GFX Unicode Text Drawing- 0x80-0x9F in cp1252 does not display correctly on UNIX
Assignee | ||
Updated•26 years ago
Summary: [PP]UNIX GFX Unicode Text Drawing- 0x80-0x9F in cp1252 does not display correctly on UNIX → [PP] Unix: 0x80-0x9F in cp1252 do not display correctly
Assignee | ||
Updated•26 years ago
Target Milestone: M5 → M6
Assignee | ||
Updated•26 years ago
Target Milestone: M6 → M7
Assignee | ||
Updated•26 years ago
Target Milestone: M7 → M10
Assignee | ||
Updated•26 years ago
Target Milestone: M10 → M12
Assignee | ||
Updated•26 years ago
Target Milestone: M12 → M15
Comment 10•25 years ago
Has this been tested on a font server configured to serve fonts as windows-1252
or in utf-16 or something where they are accessible through the proper unicode
codepoints?
Probably as many of these characters as possible should be displayed using
things like ' and " and -- if the correct glyphs aren't available rather than
displaying a character-not-displayed character.
Comment 11•25 years ago
Agreed -- displaying nothing at all (the current behavior) is even worse than
the 4.x behavior of showing the entity, since it means you don't see that you're
missing characters. Showing something vaguely close to the right character
would be a lot better than showing nothing.
Comment 12•25 years ago
I found what I was looking for:
<URL:http://charts.unicode.org/Unicode.charts/normal/Unicode.html>
Assignee | ||
Comment 13•25 years ago
Moving all of my M15s to M16. Please add comments if you disagree.
Target Milestone: M15 → M16
Updated•25 years ago
Summary: [PP] Unix: 0x80-0x9F in cp1252 do not display correctly → Unix: 0x80-0x9F in cp1252 do not display correctly
Assignee | ||
Comment 14•25 years ago
Subject: Quotes problem
Date: Fri, 11 Feb 2000 12:52:38 +0000
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Erik,
Could you check whether the following small character conversion
table fix could be made in the Netscape Web browser:
As you can see on the test page
http://www.cl.cam.ac.uk/~mgk25/ucs/CP1252.html
My Netscape Navigator 4.6 for Linux maps ‘ (LEFT SINGLE QUOTATION
MARK ‘) to 0x60 (GRAVE ACCENT). While this does look good in the
current X11 Adobe fonts which follow the old Adobe standard encoding for
ASCII and have on 0x27 "quoteright" and on 0x60 "quoteleft", the new X11
fonts will follow the modern Adobe Unicode mapping
<http://partners.adobe.com/asn/developer/typeforum/unicodegn.html>
and have accordingly instead on 0x27 "quotesingle" and on 0x60 "grave"
(because Unicode fonts have on U+2018 "quoteleft" and on U+2019
"quoteright".)
In other words: The ASCII text 'quote' will look acceptable with both
old and new fonts but `quote' will look slightly ugly with the new fonts
(this has been the case for a long time on MS-Windows already). The
advantage of the new fonts is that you will now find the proper
directional quotation marks on 0x2018 and 0x2019 such that you can show
all forms of the quotation marks accurately.
Therefore my urgent suggestion: Whenever you do a Unicode -> Latin-1
mapping, then please map both U+2018 and U+2019 to 0x27 and do NOT map
U+2018 to 0x60.
For details and background information on this issue, please read
http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
Sorry if you have fixed all this already long ago in Mozilla.
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
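The mapping suggested in this message can be sketched in a few lines of C. This is an illustration of the suggestion only, not code from Mozilla or Netscape; the function name `unicode_to_latin1` is hypothetical, and the '?' returned for every other code point outside Latin-1 is an assumption borrowed from the fallback character used elsewhere in this bug.

```c
#include <stdint.h>

/* Hypothetical Unicode -> Latin-1 fallback following the suggestion above:
 * both directional single quotes become APOSTROPHE (0x27), and U+2018 is
 * never mapped to GRAVE ACCENT (0x60). */
static unsigned char unicode_to_latin1(uint32_t code_point)
{
    if (code_point <= 0xFF)
        return (unsigned char)code_point;  /* Latin-1 is the first 256 code points */

    switch (code_point) {
    case 0x2018:  /* LEFT SINGLE QUOTATION MARK */
    case 0x2019:  /* RIGHT SINGLE QUOTATION MARK */
        return 0x27;
    default:
        return '?';  /* assumed placeholder for anything else */
    }
}
```

With this mapping, text quoted with U+2018 and U+2019 renders as 'quote' under both the old and the new X11 fonts, rather than as `quote' with a grave accent.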
Comment 15•25 years ago
“ and ” are lost completely.
They are generated by SGML-tools for the DocBook tag "quote".
Assignee | ||
Comment 16•25 years ago
We should address bug 31252 first, to get some basic fallback in place, and then
address this bug, so I'm targeting M17.
Target Milestone: M16 → M17
Assignee | ||
Comment 17•25 years ago
*** Bug 16872 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 18•25 years ago
*** Bug 24924 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 19•25 years ago
I am working on this right now.
Severity: trivial → normal
Priority: P3 → P2
Target Milestone: M17 → M15
Assignee | ||
Comment 20•25 years ago
It's done. I would like a code review. Anybody?
Comment 21•25 years ago
erik, if you attach the patch, I *may* take a look at it (no promises at all).
Assignee | ||
Comment 22•25 years ago
Hi Pav, I'm about to attach the diffs to get "smart quotes", trademark,
ellipsis, and all those other windows-1252 characters to display on ordinary
Unix systems via fallbacks. If you're OK with these, I'd like to check in.
Assignee | ||
Comment 23•25 years ago
Assignee | ||
Comment 24•25 years ago
Roger, I have written the code to do fallbacks for windows-1252 characters
(e.g. ellipsis -> ...) and '?' for others on Unix. The fix is attached to this
bug. It is quite similar to the code you wrote recently for Windows (thanks).
Would you be willing to review it for me so that I can check in?
Comment 25•25 years ago
OK, I have read the diff. The following
+ nsFontGTK* font = FindFont('a');
should be based on the actual REPLACEMENT_CHAR, i.e., FindFont('?').
This way, if someone has, e.g., font-family: Symbol, the search will
still return straight away because Symbol has '?'. Other than that,
the patch looks fine.
Assignee | ||
Comment 26•25 years ago
I decided to use 'a' instead of '?' as the argument to FindFont because the
replacements are strings such as "EUR" (for euro), "OE" (for OE ligature),
"..." (for ellipsis), and so on. So we need to pass something that is likely
to return a font that has all of those characters.
On Unix, there are several fonts that do not even contain 'a'. For example, all
the East Asian fonts (Japanese, Chinese, Korean). Also, Symbol does not contain
all of the upper-case and lower-case letters A-Z and a-z.
Ideally, nsFontGTKSubstitute would actually do some font switching of its own
in GetWidth and DrawString, but since all of the current replacement chars
(e.g. "EUR") are from ASCII, and since all fonts that contain 'a' also contain
the rest of the ASCII characters, I think FindFont('a') is the best first step
we can take in this development. Maybe we'll do actual font switching later.
Thanks for the review, Roger! Checked in; marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
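The string fallbacks described in comment 26 amount to a lookup from a Unicode code point to a short ASCII replacement, with '?' for anything not listed. The sketch below is an assumption about the general shape of such a lookup, not the checked-in nsFontGTKSubstitute code; the names `fallbacks` and `fallback_for` are hypothetical, and only the three replacements named in that comment are included.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback table: Unicode code point -> ASCII replacement.
 * Only the replacements named in comment 26 are listed. */
struct fallback {
    uint32_t code_point;
    const char *replacement;
};

static const struct fallback fallbacks[] = {
    { 0x0152, "OE"  },  /* LATIN CAPITAL LIGATURE OE */
    { 0x2026, "..." },  /* HORIZONTAL ELLIPSIS */
    { 0x20AC, "EUR" },  /* EURO SIGN */
};

/* Return the ASCII replacement string, or "?" when none is listed. */
static const char *fallback_for(uint32_t code_point)
{
    for (size_t i = 0; i < sizeof fallbacks / sizeof fallbacks[0]; i++)
        if (fallbacks[i].code_point == code_point)
            return fallbacks[i].replacement;
    return "?";
}
```

Because every replacement is plain ASCII, a font chosen for its ability to draw 'a' should be able to draw any of them, which is the reasoning given for FindFont('a') over FindFont('?').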
Comment 27•25 years ago
I verified this in 2000041307 Mac and 2000041310 Linux build.
Status: RESOLVED → VERIFIED
Comment 28•25 years ago
Problem still exists on OS/2
See http://www.heise.de/newsticker/data/jk-10.08.00-004/
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Comment 29•25 years ago
Setting platform to OS/2 and clearing status whiteboard.
OS: Linux → OS/2
Hardware: Other → PC
Whiteboard: Mac is fixed. Unix is not.
Target Milestone: M15 → ---
Assignee | ||
Comment 30•24 years ago
Daniel, please create a separate bug for OS/2. This bug is specifically for
Unix. Marking FIXED again.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 24 years ago
OS: OS/2 → Linux
Resolution: --- → FIXED