Closed Bug 1495067 Opened 6 years ago Closed 6 years ago

Drop cap effect layout weirdness (possibly due to presence of BOM)

Categories

(Core :: Internationalization, defect, P3)

63 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: miketaylr, Assigned: hsivonen)

References

()

Details

Attachments

(2 files)

Attached image chrome (left), firefox (right).png (deleted) —
Originally filed @ https://github.com/webcompat/web-bugs/issues/19171. See screenshot for expected behavior on left, actual behavior on right. Possibly a regression? But kind of unclear, especially with mozregression pointing at https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ec6bd22c4d0034a2ee2dc12781406f2d74202295&tochange=9a92be6b431a3de68dd8fca2a3a883f3a10750cd. Forgive the crappy Summary.
I think the regression range makes sense, actually, though it may be the case that this reproduces with rtl content and is a layout bug.
Component: Layout → Layout: Text and Fonts
Flags: needinfo?(hsivonen)
Ah, the BOM is part of Arabic Presentation Forms-B, which is listed as an RTL block at https://www.unicode.org/roadmaps/bmp/ . So that's why the new code does what it does. I'm going to investigate why the old code didn't do this. (In reply to Emilio Cobos Álvarez (:emilio) from comment #1) > I think the regression range makes sense, actually, though it may be the > case that this reproduces with rtl content and is a layout bug. This indeed seems to be the case.
So there are two differences between the new an old code: 1) The new code categorizes Hebrew presentation forms as LTR, because https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL. 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the last of these is bidi-neutral. I'm going to change encoding_rs, but it's unclear to me what the right thing for U+FEFD and U+FEFE is. jfkthame, do you know or can you guess why the two unassigned code points U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The comments in bug 240943 don't discuss them.
Flags: needinfo?(hsivonen) → needinfo?(jfkthame)
(In reply to Henri Sivonen (:hsivonen) from comment #3) > So there are two differences between the new an old code: > 1) The new code categorizes Hebrew presentation forms as LTR, because > https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL. The roadmaps/bmp table isn't a sufficient guide here; it only highlights blocks that default to RTL. The Hebrew presentation forms are individually RTL even though they're in a block that doesn't have this default. > 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the > last of these is bidi-neutral. > > I'm going to change encoding_rs, but it's unclear to me what the right thing > for U+FEFD and U+FEFE is. Although these codepoints are unassigned, their bidi category should be AL, according to Unicode: see https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt. > jfkthame, do you know or can you guess why the two unassigned code points > U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The > comments in bug 240943 don't discuss them. I don't know, but I strongly suspect this was simply an oversight: the range in IS_RTL_PRESENTATION_FORM was based on the characters that are actually assigned, and nobody noticed/cared that these two codepoints at the end of the range were supposed to have an explicit (non-LTR) directionality even though they're unassigned. For consistency, we should fix them there as well (although it's unlikely to affect any real-world use case, given that they're unassigned).
Flags: needinfo?(jfkthame)
* Update encoding_rs to 0.8.8. * Change U+FEFD and U+FEFE to RTL in IS_RTL_PRESENTATION_FORM to make the Rust and C++ code agree on what's RTL. MozReview-Commit-ID: CuK6fN4pojG
Moving this to another component. Spinning off bug 1495674 for layout. (In reply to Jonathan Kew (:jfkthame) from comment #4) > The roadmaps/bmp table isn't a sufficient guide here; it only highlights > blocks that default to RTL. The Hebrew presentation forms are individually > RTL even though they're in a block that doesn't have this default. Looks like they are in a non-block range with a default... > > 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the > > last of these is bidi-neutral. > > > > I'm going to change encoding_rs, but it's unclear to me what the right thing > > for U+FEFD and U+FEFE is. > > Although these codepoints are unassigned, their bidi category should be AL, > according to Unicode: see > https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt. > > > jfkthame, do you know or can you guess why the two unassigned code points > > U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The > > comments in bug 240943 don't discuss them. > > I don't know, but I strongly suspect this was simply an oversight: the range > in IS_RTL_PRESENTATION_FORM was based on the characters that are actually > assigned, and nobody noticed/cared that these two codepoints at the end of > the range were supposed to have an explicit (non-LTR) directionality even > though they're unassigned. > > For consistency, we should fix them there as well (although it's unlikely to > affect any real-world use case, given that they're unassigned). Thanks. I'll do that.
Component: Layout: Text and Fonts → Internationalization
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Comment on attachment 9013304 [details] Bug 1495067 - Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again). Jonathan Kew (:jfkthame) has approved the revision.
Attachment #9013304 - Flags: review+
Pushed by hsivonen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e9fda19be060 Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again). r=jfkthame
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13320 for changes under testing/web-platform/tests
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: