Closed
Bug 204272
Opened 22 years ago
Closed 19 years ago
Why do we classify shaped Arabic characters as left-to-right?
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: smontagu, Assigned: smontagu)
References
(Blocks 1 open bug, )
Details
Attachments
(5 files, 5 obsolete files)
(deleted),
image/jpeg
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
image/jpeg
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
patch
|
rbs
:
review+
rbs
:
superreview+
chofmann
:
approval1.8b3+
|
Details | Diff | Splinter Review |
I just noticed a problem in ordering in the testcase for bug 192008 (attachment
113746 [details]), which I think is caused by the following code in nsBidiPresUtils.cpp
which sets |charType = eCharType_LeftToRight| for shaped Arabic characters in
the U+FExx range.
As far as I remember we originally did that for compatibility with IE on
Windows, but I don't see the same problem in IE6 on Windows 2000. Is it an
outdated hack which we can now remove?
Comment 1•22 years ago
|
||
The recent testing of GetStringTypeW(CT_CTYPE2, ...) and GetCharacterPlacementW
(..., GCP_REORDER) on W2K and WinNT shows that shaped Arabic characters are
properly classifies as RTL and get reordered by the mentioned platforms.
However, IIRC we assigned |eCharType_LeftToRight| to shaped characters for
compatibility with the bidi engine of [some version of] Windows, which treated
those as LTR.
So is it possible to ask the platform itself about character type?
*****
BTW, I don't see a problem in attachment 113746 [details]. What platform is experiencing
it?
Comment 2•22 years ago
|
||
Does the platform (in which the problem is encountered) have native bidi
support?
Assignee | ||
Comment 3•22 years ago
|
||
I see the bug on Windows 2000 with support installed for Arabic, Hebrew, Indic,
and Japanese (default language is Hebrew and user locale is English United
States).
I don't see it on current trunk, but I do see it on 1.3, and also with
attachment 122359 [details] [diff] [review] from bug 192088 applied.
Assignee | ||
Comment 4•22 years ago
|
||
Display of the testcase on the system described above.
Debugging, I see that both lines are passed to ExtTextOutW in the same (visual)
ordering, but as the screenshot shows, the first line apparently gets reordered
again by Windows (which is theoretically correct) while the second is displayed
as-is (which is the expected bug which we try to work around, as Lina has
said).
Assignee | ||
Comment 5•22 years ago
|
||
Curiouser and curiouser: in this version of the testcase, cases 2-4 are
displayed correctly in Mozilla 1.3, and cases 1-3 are displayed correctly in IE
6. My head is aching.
Attachment #122497 -
Attachment is obsolete: true
Assignee | ||
Comment 6•22 years ago
|
||
I think this is the correct fix, subject to testing on various flavours of
Windows. (ResolveBackwards is only ever called when mRightToLeftText is TRUE,
i.e. only for right to left text and only for Windows with right-to-left
support, so this won't cause any slowdown for non-Bidi pages)
A big advantage of the patch is that it moves the workaround for a bug in
Windows to a Windows-only code path. The current code will cause problems on
non-Windows platforms with their own Bidi support.
Comment 7•22 years ago
|
||
Comment on attachment 122521 [details] [diff] [review]
Patch
Simon:
1. Why don't we provide a similar fix in nsFontMetricsWin.cpp when
mRightToLeftText is FALSE?
2. Can it be acceptable to normalize (convert to 06xx range) Arabic text before
rendering? (Though I think that once we rejected it, probably because of
possible loss of composed characters.)
3. The removed code in CalculateCharType in nsBidiPresUtils.cpp seems to be
affecting only visual text. It shouldn't impact the attached testcases. I'm
saying that because it may help us in understanding the bizarre behaviour on
windows (and it must be removed if buggy, of course :).)
Comment 8•22 years ago
|
||
What I said in comment 7.3, applies to bidi platform. I doesn't seem to affect
text of any type on non-bidi one.
Comment 9•22 years ago
|
||
Oops, I'm sorry. Mozilla performs its own ordering when character type is
inconsistent with the embedding level. So the code in nsBidiPresUtils involves
both types of text.
Assignee | ||
Comment 10•22 years ago
|
||
>Why don't we provide a similar fix in nsFontMetricsWin.cpp when
>mRightToLeftText is FALSE?
mRightToLeftText is set according to the value of isRightToLeftOnBidiPlatform in
nsTextFrame::PaintUnicodeText(), where
isRightToLeftOnBidiPlatform = (isBidiSystem &&
(eCharType_RightToLeft == charType ||
eCharType_RightToLeftArabic == charType));
After the change to nsBidiPresUtils.cpp, charType will be
eCharType_RightToLeftArabic, and isBidiSystem will be TRUE on Bidi versions of
Windows (which is where the bug exists), so mRightToLeftText will be TRUE in all
cases where we need the workaround, no?
Assignee | ||
Comment 11•22 years ago
|
||
>Can it be acceptable to normalize (convert to 06xx range) Arabic text before
>rendering? (Though I think that once we rejected it, probably because of
>possible loss of composed characters.)
Not only the possible loss of composed characters. Windows has at least one bug
with rendering 06xx range characters (Farsi yeh), and if the page author is
deliberately using preshaped characters to avoid that bug, they will get a nasty
shock if we normalize :-)
Comment 12•21 years ago
|
||
Comment on attachment 122521 [details] [diff] [review]
Patch
Being agree with Simon's comment #10 and comment #11, I think that
the suggested patch is the best workaround for now.
However, though rendering character by character will work 100%, ideally, in my
opinion, we should try to investigate the problem more thoroughly and seek fine
tuned solution.
Parentheses seem to be missing in
+ IS_FE_CHAR(*currChar+1)) {
It probably should be
+ IS_FE_CHAR(*(currChar+1))) {
Comment 13•21 years ago
|
||
Illustrates how ExtTextOutW handles shaped characters on Windows 2000:
1. Homogeneous string of shaped characters, without output flags, is not
ordered.
2. Same string, with ETO_IGNORELANGUAGE flag (disabling international support),
looks identically.
3. Same string with attached 06xx range character, without output flags, is
reordered.
4. String as above, but with ETO_IGNORELANGUAGE flag, is not ordered.
Comment 14•21 years ago
|
||
Correct image (with *Arabic* Alef)
Attachment #123030 -
Attachment is obsolete: true
Comment 15•21 years ago
|
||
Summarizing behavior of ExtTextOutW (and maybe other Windows GDI functions), it
looks like ordering of shaped Arabic characters doesn't take place unless there
is a strong RTL character or BIDI control (which supposes implicit or explicit
text ordering).
This explains described in comment 3, comment 4, and comment 5 (in Mozilla, but
maybe also in IE).
On one hand, such Windows behavior apparently contradicts the Unicode Bidi
Algorithm, on the other hand, it's pretty reasonable that pre-shaped data would
also be pre-ordered (visual); AFAIK shaped characters are commonly used to
represent visually formatted data.
I wonder how other bidi platforms handle shaped Arabic characters?
Assignee | ||
Comment 16•21 years ago
|
||
As far as I can tell from looking at the testcases from here and bug 154399 in
Konqueror, qt treats shaped Arabic characters as right-to-left. It would be
interesting to know what OSX does.
Assignee | ||
Comment 17•21 years ago
|
||
At least in some circumstances this patch also prevents the regression of bug
192088 even with the original fix backed out. Time to head for reviews.
Assignee | ||
Comment 18•21 years ago
|
||
Windows seems to have a similar bug with Hebrew presentation forms as well
judging by this testcase, which is reduced from the "Exemplar characters" in
http://oss.software.ibm.com/cgi-bin/icu/lx/en/?_=he
IE exhibits the bug in the testcase, but not in the original page, which is
consistent with what Lina said in comment 15. Whether Mozilla exhibits it
depends on font prefs. If the presentation characters are rendered in a
different font, and so by a separate call to ExtTextOut, they are not ordered
correctly.
Assignee | ||
Comment 19•21 years ago
|
||
Attachment #122521 -
Attachment is obsolete: true
Assignee | ||
Comment 20•20 years ago
|
||
Pango needs this patch too
Comment 21•20 years ago
|
||
Behdad, would you take a look at this please?
Assignee | ||
Comment 22•20 years ago
|
||
Attachment #132169 -
Attachment is obsolete: true
Assignee | ||
Comment 23•20 years ago
|
||
Attachment #178243 -
Attachment is obsolete: true
Assignee | ||
Updated•20 years ago
|
Attachment #178363 -
Flags: superreview?(rbs)
Attachment #178363 -
Flags: review?(rbs)
Comment 24•20 years ago
|
||
Comment on attachment 178363 [details] [diff] [review]
Fixed a bug in the previous patch
r+sr=rbs
Attachment #178363 -
Flags: superreview?(rbs)
Attachment #178363 -
Flags: superreview+
Attachment #178363 -
Flags: review?(rbs)
Attachment #178363 -
Flags: review+
Comment 25•19 years ago
|
||
Any reason this was never checked in?
Assignee | ||
Comment 26•19 years ago
|
||
This is low risk, but I didn't think it was major enough to be worth asking for
drivers' approval during beta freeze. If you can find real-world sites that are
broken without the patch, feel free to request a1.8b3.
Comment 27•19 years ago
|
||
Oh, sorry. I didn't realize that the trunk was already frozen in April.
Attachment #178363 -
Flags: approval1.8b3?
Comment 28•19 years ago
|
||
Comment on attachment 178363 [details] [diff] [review]
Fixed a bug in the previous patch
a=chofmann
Attachment #178363 -
Flags: approval1.8b3? → approval1.8b3+
Assignee | ||
Comment 29•19 years ago
|
||
As you wish...
Checked in.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in
before you can comment on or make changes to this bug.
Description
•