Closed
Bug 167136
Opened 22 years ago
Closed 22 years ago
Allowed blank(space) glyph list have to be updated
Categories
(Core :: Internationalization, defect)
RESOLVED
FIXED
People
(Reporter: jshin1987, Assigned: jshin1987)
Details
(Keywords: intl)
Attachments
(5 files)
patch (deleted)
image/jpeg (deleted)
image/jpeg (deleted)
patch (deleted)
text/plain (deleted)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; ko-KR; rv:1.1b) Gecko/20020721
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; ko-KR; rv:1.1b) Gecko/20020721
In the page at http://jshin.net/i18n/korean/fillers.html, the Hangul vowel filler (U+1160)
is rendered as a question mark. The font specified in the page (CODE2000:
http://home.att.net/~jameskass) has a non-spacing blank glyph for U+1160, but Mozilla
treats the blank glyph as invalid and falls back to a question mark for U+1160.
Reproducible: Always
Steps to Reproduce:
1. Install the CODE2000 font available at http://home.att.net/~jameskass
2. Launch Mozilla
3. Go to http://jshin.net/i18n/korean/fillers.html
Actual Results:
A Hangul vowel filler (U+1160) following a Hangul leading consonant is rendered
as a question mark.
Expected Results:
The Hangul vowel filler (U+1160) should be rendered as a non-spacing/combining/zero-width
blank.
It's easy to fix and I'll attach the patch.
Comment 1 (Assignee) • 22 years ago
Add U+1160 to the list of characters that are allowed to have a 'blank' glyph.
I haven't added U+115F (Hangul leading consonant filler) because it appears
to be rendered fine without being added to the list.
Comment 2 (Assignee) • 22 years ago
Comment 3 (Assignee) • 22 years ago
Comment 4 • 22 years ago
intl.
Assignee: kmcclusk → yokoyama
Status: UNCONFIRMED → NEW
Component: GFX Compositor → Internationalization
Ever confirmed: true
QA Contact: petersen → ruixu
Comment 5 (Assignee) • 22 years ago
Keith Packard (a member of the XFree86 Core team and the maintainer of the
fontconfig package) went through the Unicode character table and came up with
a more extensive list of characters that are supposed to have a 'blank' visual
representation (empty outline). His original list came from the Mozilla source.
Below is the list taken from his email about the issue:

range             added to fc   comments
U+180B - U+180E   no            (but I don't have a Mongolian font to check against)
U+200C - U+200F   yes           (the Unicode description isn't clear)
U+2028 - U+2029   no            (these seem like they're supposed to be drawn)
U+202A - U+202F   yes           (these also appear blank from the description)
U+3164            yes           (HANGUL FILLER, similar to U+1160)
U+FEFF            yes           (byte order detector (ZERO WIDTH NO-BREAK SPACE))
U+FFA0            yes           HALFWIDTH HANGUL FILLER (similar to U+3164)
U+FFF9 - U+FFFB   yes           INTERLINEAR ANNOTATION marks for furigana

I guess some of the characters listed above are already taken care of by Mozilla
(e.g. ZWNBSP/BOM), but I believe the others have to be added.
FYI, the related thread on the XF86 fonts list begins at
http://www.xfree86.org/pipermail/fonts/2002-September/002099.html
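
For illustration only, the rows marked "yes" in the list above, together with U+1160 from the original report, could be written down as inclusive ranges and checked with a plain scan. This is a minimal sketch in Python rather than the Mozilla macro itself; the names BLANK_RANGES and may_be_blank are hypothetical.

```python
# Inclusive (first, last) code point ranges taken from the "yes" rows above,
# plus U+1160 from the original report. Names here are illustrative only.
BLANK_RANGES = [
    (0x1160, 0x1160),  # Hangul vowel filler
    (0x200C, 0x200F),
    (0x202A, 0x202F),
    (0x3164, 0x3164),  # HANGUL FILLER
    (0xFEFF, 0xFEFF),  # ZERO WIDTH NO-BREAK SPACE / byte order mark
    (0xFFA0, 0xFFA0),  # HALFWIDTH HANGUL FILLER
    (0xFFF9, 0xFFFB),  # interlinear annotation marks
]


def may_be_blank(code_point):
    """Return True if a blank glyph is acceptable for this code point."""
    return any(first <= code_point <= last for first, last in BLANK_RANGES)
```

A linear scan like this is fine for a handful of ranges, but it grows with the list, which is the concern raised later in this bug.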
Comment 6 (Assignee) • 22 years ago
Although deprecated, U+206A - U+206D appear to have to be included as well.
As for U+206E and U+206F, I'm not sure.
BTW, I'm wondering how these characters are handled on MacOS 9/X, gtk and X11.
At least in gtk, Mozilla doesn't have this problem rendering the page given
at the URL with the same TrueType font (CODE2000). Are they handled at a higher
layer before reaching the lower level of font access?
Comment 7 (Assignee) • 22 years ago
Changing the summary line because this is not just about the Hangul vowel filler
but also involves many other characters.
Also reassigning it to myself.
Assignee: yokoyama → jshin
Summary: U+1160(Hangul Vowel filler) is rendered as a question mark → Allowed blank(space) glyph list have to be updated
Comment 8 (Assignee) • 22 years ago
A simplistic patch for this bug would just modify the macro that checks whether
a character is allowed to be blank. However, as comment #5 shows, there are a
few too many of them for a simple macro. Would there be a better way
to deal with this list (a data structure?)?
Comment 9 (Assignee) • 22 years ago
Adding shanjian to CC to seek his opinion on the best way to represent
the list of blank characters as he was the last one to change the line
in question :-)
Comment 10 (Assignee) • 22 years ago
I ended up using CCMap. This may or may not be excessive for this
simple task, but it seems to be all right considering that the map is
created only once per session, at startup, and the CCMap accessor macro is fast.
Shanjian, can you review?
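
To make the trade-off concrete, here is a rough sketch of the general idea behind a paged character map: the map is built once, and each lookup is a page fetch plus a bit test. It is written in Python for brevity and does not reproduce the actual CCMap layout in nsCompressedCharMap.cpp; the page size and the names build_map and has_char are assumptions made for this example.

```python
PAGE_BITS = 8                # assumed page size: 256 code points per page
PAGE_MASK = (1 << PAGE_BITS) - 1


def build_map(ranges):
    """Build {page index: bitmask} from inclusive (first, last) ranges.

    Only pages that contain at least one character are stored, which is
    what keeps a sparse set small.
    """
    pages = {}
    for first, last in ranges:
        for cp in range(first, last + 1):
            page = cp >> PAGE_BITS
            pages[page] = pages.get(page, 0) | (1 << (cp & PAGE_MASK))
    return pages


def has_char(pages, cp):
    """Constant-time membership test: one page lookup, one bit test."""
    return bool((pages.get(cp >> PAGE_BITS, 0) >> (cp & PAGE_MASK)) & 1)
```

With this shape the cost of a lookup does not depend on how many ranges are in the list, and the build step runs only once.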
Comment 11 (Assignee) • 22 years ago
A couple of issues to resolve:
- Find out which characters currently in the list are reliably filtered out
  upstream (possibly in a platform-independent way) and remove them from the
  list. It seems that which chars are filtered out is not platform-independent
  (e.g. nsFontMetricsWin does not get U+115F from upstream, while
  nsFontMetricsXft gets it unfiltered. I can't check how this is handled on Mac).
- Think about whether the list needs to be user-configurable (in prefs.js). Some
  fonts have _legitimate_ blank glyphs at code points in the PUA. Obviously, this
  cannot be hard-coded. With CCMap, it's easy to make this user-configurable.
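
As a sketch of the second point, a user-supplied list could be read from a single string of hexadecimal code points and ranges and then merged into the built-in list. The value format ("e000-e0ff,f8ff") and the function name parse_blank_pref are hypothetical; no such pref is defined in this bug.

```python
def parse_blank_pref(value):
    """Parse a hypothetical pref value such as "e000-e0ff,f8ff".

    Returns a list of inclusive (first, last) code point ranges.
    Raises ValueError on malformed or reversed entries.
    """
    ranges = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries and trailing commas
        first, _, last = item.partition("-")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        if lo > hi:
            raise ValueError("reversed range: %s" % item)
        ranges.append((lo, hi))
    return ranges
```

The parsed ranges could then be added to whatever structure holds the hard-coded list before the map is built.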
Comment 12 • 22 years ago
jshin,
Thanks a lot for doing this. Using CCMap is the right approach. This has been on
my mind for quite some time and I haven't found the time to do it.
I have a suggestion. Can you write a perl tool to generate the CCMap in binary
form instead of generating it at run time? That will shrink the memory footprint
and improve startup performance. We will need to apply a similar approach in
several other places. (The punctuation mark check in layout is one example.)
Comment 13 (Assignee) • 22 years ago
Shanjian,
Attached is a simple perl tool to generate a PRUint16 array for CCMap.
Actually, it generates three files for LE/BE(16bit), BE(32bit) and BE(64bit).
I tested the result (with a simple test program modified from printCCMap()
in nsCompressedCharMap.cpp) on ix86 (32bit LE), Alpha (64bit LE), Sparc (32bit BE),
and PA-RISC (32bit BE) and it worked fine. I couldn't find a 64bit BE machine
(the PA-RISC machine I used is 64bit but its long is only 32bit), but I believe
it should work well there, too.
Can you tell me where else we need this (nsFontMetricsGTK.cpp
is one of them)? Perhaps I'll file a new bug (to put a 'precompiled CCMap' in
place of the character list) and make this bug dependent on it.
BTW, currently, it just works on the BMP, but it can be extended easily.
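
The attached tool is not reproduced here. As a rough sketch of the precompilation idea only, a generator can turn a list of ranges into C source for a constant array of 16-bit words, so that nothing is built at startup. This example is in Python rather than perl, uses a flat bitmap instead of the real CCMap layout, and ignores the word-size and endianness variants mentioned above; the names bitmap_words and emit_c_array are hypothetical.

```python
def bitmap_words(ranges, limit=0x10000):
    """Return a flat bitmap of 16-bit words covering code points below limit."""
    words = [0] * (limit // 16)
    for first, last in ranges:
        for cp in range(first, last + 1):
            words[cp >> 4] |= 1 << (cp & 0xF)
    return words


def emit_c_array(name, words, per_line=8):
    """Format the words as a C array definition (returned as a string)."""
    lines = ["static const unsigned short %s[%d] = {" % (name, len(words))]
    for i in range(0, len(words), per_line):
        chunk = ", ".join("0x%04X" % w for w in words[i:i + per_line])
        lines.append("  " + chunk + ",")
    lines.append("};")
    return "\n".join(lines)
```

The generated text would be written to a header at build time and compiled in as read-only data.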
Comment 14 • 22 years ago
Thanks for your great work!!!
The punctuation check in nsTextFrame is a sure thing:
http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsTextFrame.cpp#4645
The CJK and Hangul check in the line breaker is questionable:
http://lxr.mozilla.org/seamonkey/source/intl/lwbrk/src/nsJISx4501LineBreaker.cpp
I am sure that we will need this in some other places now and in future.
Comment 15 (Assignee) • 22 years ago
Shanjian,
Thank you for your kind words.
I filed a new bug, bug 180266, for this and am going to make this bug depend
on it. I didn't have to, but it seems 'conceptually' clearer that way.
I added Shanjian to the CC list of bug 180266, and anyone here
is welcome to add her/himself to the CC list there.
> I am sure that we will need this in some other places now and in future.
So am I :-) In particular, I guess Mozilla may need to look up a Unicode
character class table in several places (line breaking, rendering/layout, bidi,
text boundary identification, editing such as search and replace, etc.).
Most TRs at http://www.unicode.org/reports appear relevant to
adopting this approach in Mozilla in one way or another
(UTR 14, UTR 29, UTR 9, UTR 13 to name just a few).
Depends on: 180266
Comment 16 (Assignee) • 22 years ago
Bug 180266 was just resolved and, accordingly, so is this one.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED