Closed
Bug 192088
Opened 22 years ago
Closed 21 years ago
zwnj show up as [zwnj]
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
DUPLICATE
of bug 205387
mozilla1.4alpha
People
(Reporter: bulbul, Assigned: smontagu)
References
()
Details
(Keywords: regression)
Attachments
(6 files, 3 obsolete files)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030205
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030205
Certain Farsi characters display as a box with "zwnj" instead of the character.
This is seen in the first paragraph on <http://www.farsikde.org>. I first
noticed this in a Jan 16 build of Mozilla, but it's in 2003-02-05-08 (Linux), as
well. Netscape 7.x (the latest, got it today) displays these characters correctly.
I have modelled the summary line on bug 117041.
Reproducible: Always
Steps to Reproduce:
Reporter | ||
Comment 1•22 years ago
|
||
Comment 2•22 years ago
|
||
Seems fine with xft here.
Assignee | ||
Comment 3•22 years ago
|
||
Confirming and taking. It looks like the problem is there in some form even in
xft: in attachment 113691 [details] I can see vertical lines in the same positions as the
ZWNJ symbols in attachment 113660 [details].
Assignee | ||
Comment 4•22 years ago
|
||
This shows a few words from the URL, using NCRs to make it clearer what is
happening. The first paragraph is written with presentation forms in the
source, so it doesn't go through the Arabic shaping routines at
nsTextTransformer::DoArabicShaping(), which is where we strip the zero width
characters. That is obviously an overoptimization.
Assignee | ||
Comment 5•22 years ago
|
||
Changing OS to All. The ZWNJ isn't rendered on Windows, but the layout of the
text is incorrect wherever it appears.
Status: NEW → ASSIGNED
OS: Linux → All
Assignee | ||
Comment 6•22 years ago
|
||
Assignee | ||
Comment 7•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #113773 -
Flags: superreview?(roc+moz)
Attachment #113773 -
Flags: review?(roc+moz)
Assignee | ||
Updated•22 years ago
|
Target Milestone: --- → mozilla1.4alpha
Attachment #113773 -
Flags: superreview?(roc+moz)
Attachment #113773 -
Flags: superreview+
Attachment #113773 -
Flags: review?(roc+moz)
Attachment #113773 -
Flags: review+
Assignee | ||
Comment 8•22 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 9•22 years ago
|
||
> it doesn't go through the Arabic shaping routines at
> nsTextTransformer::DoArabicShaping(), which is where we strip the zero
> width characters. That is obviously an overoptimization
However, ZWJ and ZWNJ have roles to play in Indic scripts
and probably elsewhere. I'm afraid simply removing them
caused the symptom reported in bug 202352. The web
site in question might have to be fixed instead of Mozilla.
I mean, if the Arabic presentation forms (in U+FB00 -FDFF)
are used, it's not clear to me what ZWNJ/ZWJ are doing there
in the middle of Arabic presentation forms. Or, we may have to take
care of them in a *context-sensitive* way if possible.
(see http://www.unicode.org/book/ch13.pdf : p.317)
In case of Xft, I guess it's almost doing the right thing
for *non-complex* scripts including Arabic/Farsi text
represented with Arabic presentation forms (where I guess
ZWJ/ZWNJ can be safely ignored. Even for Latin scripts, ZWJ/ZWNJ
can play a role...). The slight 'misalignment' observed in attachment 113691 [details])
may be a font issue where ZWJ/ZWNJ has non-zero-width
(advancing) blank glyphs instead of non-advancing blank glyphs.
Needless to say, when ZWJ/ZWNJ have to be 'honored', Xft build
does not do the right thing, but that's another issue.
Comment 10•22 years ago
|
||
As a possible result of removing ZWJ/ZWNJ, Mozilla(20030426)
doesn't render the following sequence(at the leftmost end of the second line)
correctly :
ژه‌ی‌.
Because U+06CC(dual-joining) is surrounded by ZWNJ(non-joining),
it has to be rendered with a nominal glyph. In case of
U+0647, it is following U+0698(Not right-join-causing)
and is followed by ZWNJ(non-joining) so that it also has to be
rendered with a nominal glyph. MS IE render them both
with their nominal glyphs while Mozilla joins them
and then ligate them. This should not happen if it's
Mozilla's internal ArabicShaping that does the shaping
because ArabicShaping should take care of this. If it's
the native Arabic shaper on Win2k/XP, this can be explained
by the removal of ZWJ/ZWNJ implemented in the patch.
In conclusion, I think we should NOT remove ZWJ/ZWNJ. Doing
so would deprive the renderer/shaper down the road (either native
as for various Indic scripts on Win2k/XP or Mozilla's own as for Hindi shaper
for Mozilla-gtk-x11core) of the opportunity to deal with them in a
context-sensitive manner (either ignore it or change the rendering behavior of
text around them).
As for the original bug (apparently filed for gtk-X11core+freetype)
build, nsFontMetricsGTK/FT has to prevent ZWJ/ZWNJ from being
rendered with 'last resort' glyphs if they can be safely ignored.
Assignee | ||
Comment 11•22 years ago
|
||
Jungshik, you are right. This was the wrong fix and should be backed out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 12•22 years ago
|
||
*** Bug 204234 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 13•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #122359 -
Flags: superreview?
Attachment #122359 -
Flags: review?(roc+moz)
Updated•22 years ago
|
Attachment #122359 -
Flags: superreview?(roc+moz)
Attachment #122359 -
Flags: superreview?
Attachment #122359 -
Flags: review?(roc+moz)
Attachment #122359 -
Flags: review+
Comment 14•22 years ago
|
||
Simon, thanks for taking care of it.
With zwnj/zwj passed along intact, attachment 113746 [details] is rendered
correctly under Win2k/XP (with isolated forms/nominal glyphs).
It also solved a part of bug 202352(one of two problems
reported there). As I wrote there, the other may be beyond what
Mozilla-Win can do at the moment (unless we want to switch over to Uniscribe
APIs soon, which we have to do for editing/selection).
BTW, bstell, where does Mozilla-Gtk/FT draw 'last resort' glyphs([zwnj]) when
characters are not available in any font on the system? Can you exclude
zwnj/zwj from that process? Falling short of handling zwnj/zwj in a
context-sensitve manner, we may be better
off just ignoring them. Well, this is debatable. What it does now can
be better...
Comment 15•22 years ago
|
||
Comment on attachment 122359 [details] [diff] [review]
back it out
Do we still want to
keep StripZeroWidthJoinControls()
as a function? It's invoked only once with this change.
Comment 16•21 years ago
|
||
Let's get in either Simon's (attachment 122359 [details] [diff] [review]) or this quickly.
We need at least one platform (Win2k/XP) where Indic scripts(not just
Devanagari but other Indic scripts) are rendered on par with MS IE, don't we?
Attachment #122359 -
Flags: superreview?(roc+moz) → superreview+
Assignee | ||
Comment 17•21 years ago
|
||
Comment on attachment 122359 [details] [diff] [review]
back it out
Requesting approval to back out this patch, which caused serious regressions in
Persian and many Indian languages.
Attachment #122359 -
Flags: approval1.4?
Comment 18•21 years ago
|
||
Comment on attachment 122359 [details] [diff] [review]
back it out
a=asa (on behalf of drivers) for checkin to 1.4
Attachment #122359 -
Flags: approval1.4? → approval1.4+
Assignee | ||
Comment 19•21 years ago
|
||
Checked in attachment 122359 [details] [diff] [review]. The original problem will now resurface.
Assignee | ||
Comment 20•21 years ago
|
||
With respect to comment 14, the glyphs being displayed e.g. in attachment 113660 [details]
aren't last-resort glyphs created by mozilla. They are coming from the font
-mutt-clearlyu-medium-r-normal--17-120-100-100-p-*-iso10646-1
Comment 21•21 years ago
|
||
Thanks for figuring out where the glyph comes from. Without any font having
visible glyphs for 'non-visible' characters (see bug 205387), transliteration
implemented in GFX-Gtk/X11core would turn them to nothingness. Situations like
this (some 'aggressive' fonts have visible spacing glyphs for the invisible that
control shaping of characters around them, but no toolkit-specific shaper is yet
available for them) are hard to deal with.
One possibility is to implement 'blackhole' font rbs suggested in bug 205387 for
GFX-Gtk/X11core and include ZWJ/ZWNJ in the list of characters to throw into the
blackhole font *only until* we implement shaping for them. For instance, Prabhat
is aware of problems with ZWJ/ZWNJ in his hindi shaper and plans to fix them.
Another is to do nothing. Arguably, it's not a bug to render characters like
ZWJ/ZWNJ (that are supposed to have visual effects) with visible glyphs when no
other option is available. They're different from characters like INVISIBLE
TIMES (U+2062)that has to be black-holed unless explicitly turned visible by
yet-to-be-specified CSS or other higher level markups. Especially, in this
particular case(http://www.farsikde.org), we may say 'WONT FIX' because
U+200C(ZWNJ) follows Arabic presentation forms like U+FEC2 and U+FEAA, which
doesn't make a lot of sense to me because Arabic presen. forms are non-joining
without being followed by ZWNJ.
Alternatively, we may do some special-casing in nsTextTra... Remove zwj/zwnj
only when preceded by Arabic presentation forms, but I'm wondering how many web
pages mix zwj/zwnj with Arabic presentation forms as is done at
http://www.farsikde.org
Assignee | ||
Comment 22•21 years ago
|
||
I have at last got a handle on this, with a lot of thanks due to Roland Mainz:
we decide whether X fonts have a glyph for a codepoint by looking at only the
|ascent| and |descent| members of the XCharStruct. We should be checking all the
members of the struct. See http://tronche.com/gui/x/xlib/graphics/font-metrics/ :
"A nonexistent character is represented with all members of its XCharStruct set
to zero."
Assignee | ||
Comment 23•21 years ago
|
||
This fixes http://www.farsikde.org, but should be tested more widely.
Attachment #113773 -
Attachment is obsolete: true
Attachment #122359 -
Attachment is obsolete: true
Attachment #123371 -
Attachment is obsolete: true
Comment 24•21 years ago
|
||
jshin / katakai, can you test the patch, please ?
Comment 25•21 years ago
|
||
The patch itself makes GFX:GTK more 'compliant' to Xlib spec(?) and is useful.
However, I'm afraid it doesn't remove the root cause of the problem (comment
#21). What happens if a font with visible glyphs for zwj/zwnj is _ahead of_ a
font with invisible glyphs for them in the loaded font list?
Perhaps what we can do include :
1. black-hole font (as rbs proposed)
2. removes default_ignorable_code_point from ccmap of any _10646_ fonts.
this is similar to what's done for CJK special chars
(http://lxr.mozilla.org/seamonkey/source/gfx/src/gtk/nsFontMetricsGTK.cpp#2242)
The second is attractive because even when shapers associated with
sun.unicode.india-0 or other XLFDs begin to support zwj/zwnj, we don't need to
change anything (note that we're removing zwj/zwnj only from ccmap of iso10646-1
fonts). In the meantime, the transliterator (nsFontGTKSubstitute) takes care of
zwj/zwnj.
Comment 26•21 years ago
|
||
> 2. removes default_ignorable_code_point from ccmap of any _10646_
> fonts. this is similar to what's done for CJK special chars
Or, to take a quick route, we can just block them in |GetMapFor10646|.
Either way, it'll even more slow down Mozilla-X11core when 10646 fonts are used.
Assignee | ||
Comment 27•21 years ago
|
||
Shall we just dupe this to bug 205387?
Comment 28•21 years ago
|
||
Yeah, it is the same root cause. And since you are hot on it right now, it is
perhaps best to settle on implementing a definitive solution.
*** This bug has been marked as a duplicate of 205387 ***
Status: REOPENED → RESOLVED
Closed: 22 years ago → 21 years ago
Resolution: --- → DUPLICATE
Assignee | ||
Comment 29•21 years ago
|
||
Comment on attachment 123952 [details] [diff] [review]
Don't reject glyphs unless every member of the XCharStruct is empty
I was wrong: this patch doesn't make any difference and is probably not worth
pursuing. The reason that it seemed to work was that I had the patch for bug
204272 in the tree where I was testing and the characters were being stripped
anyway.
Attachment #123952 -
Attachment is obsolete: true
Comment 30•21 years ago
|
||
Comment on attachment 123952 [details] [diff] [review]
Don't reject glyphs unless every member of the XCharStruct is empty
Well, we may want it at least to conform little bit better to the Xlib specs...
:)
I'll open a seperate bug for that later today...
Attachment #123952 -
Attachment is obsolete: false
Assignee | ||
Comment 31•21 years ago
|
||
Yes, if you care about that issue, it belongs in a separate bug.
Comment 32•21 years ago
|
||
Simon Montagu wrote:
> Yes, if you care about that issue, it belongs in a separate bug.
Filed bug 207438 ("Glyph-detection code for iso10646-1 fonts should conform more
to the Xlib specs") for that issue...
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in
before you can comment on or make changes to this bug.
Description
•