Closed
Bug 177147
Opened 22 years ago
Closed 18 years ago
BiDi run breaks should not be line break opportunities
Categories
(Core :: Layout: Text and Fonts, defect)
Core
Layout: Text and Fonts
Tracking
()
RESOLVED
FIXED
People
(Reporter: ian, Assigned: ilya.konstantinov+future)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
(deleted),
text/html
|
Details |
Inexplicably, Right-to-Left Embedding (U+202B) and Right-To-Left Override
(U+202E) characters are considered valid line-breaking opportunities.
http://www.hixie.ch/tests/adhoc/css/text/white-space/normal/line-breaking/bidi/002.html
Comment 2•22 years ago
|
||
It seems as if the only reason that LRE and LRO are not also causing
line breaks in the testcase is that we are optimizing them away. Adding ‏ to
the beginning of each line makes every line go red.
Summary: RLE and RLO characters allow line breaks → LRE, RLE, LRO and RLO characters allow line breaks
Comment 3•20 years ago
|
||
shanjian is no longer working on mozilla for 2 years and these bugs are still
here. Mark them won't fix. If you want to reopen it, find a good owner first.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → WONTFIX
Comment 5•20 years ago
|
||
Mass Re-opening Bugs Frank Tang Closed on Wensday March 02 for no reason, all
the spam is his fault feel free to tar and feather him
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Comment 6•20 years ago
|
||
Simon should know what to do with this
Assignee: nobody → smontagu
Status: REOPENED → NEW
Comment 8•19 years ago
|
||
Actually, we seem to allow line-breaking anywhere between RTL and LTR text
runs, as this simple testcase demonstrates.
Simon - this is not by design, is it? Is there a separate bug for that? (I
couldn't find one).
Comment 9•19 years ago
|
||
It's not explicitly by design, but it's probably an artefact of the way that we
split text frames at points where the text direction changes.
Comment 10•19 years ago
|
||
But when one word is split into frames for other reasons, we don't allow
breaking (try adding <b>aaaaaaaaaaaa</b><i>aaaaaaaaaaaaa</i> to my testcase and
notice it won't be line-broken).
So I think it's not an unavoidable consequence of splitting bidi text into frames.
Comment 11•19 years ago
|
||
Yeah, I made that test too. I meant to imply "and in that case it's probably not
too hard to fix if we know where to look", not "and in that case there's nothing
we can do about it" :D
Comment 12•19 years ago
|
||
I see this quite often in real-world situations, where a number in Hebrew text
is followed by a punctuation mark (period or comma). A line break is allowed
between the number and the following punctuation mark, and the result looks
really bad.
I tried looking at the code for fixing it, but the line-breaking code is *very*
complicated, and contais some bidi-specific code, which I do not understand. If
I'm given some pointers/hints, I might be able to do something.
Assignee | ||
Comment 13•19 years ago
|
||
For the reference, the function taking care of line-breaking is nsTextFrame::MeasureText (in nsTextFrame.cpp). I'm still looking into it.
Assignee | ||
Comment 14•19 years ago
|
||
More info:
As I've already said, nsTextFrame::MeasureText(..) is responsible for line-wrapping. Uri rightfully said that this function must look further than its own frame, since otherwise any formatting change (<b>foo</b>bar) would be a break opportunity.
MeasureText achieves this by using an interator function GetNextWord. Unfortunatelly, in the buggy case, it passes worldLen to GetNextWord which limits it up the the nearest run break.
worldLen = start
and earlier in the code, start = start of the next run
If we have:
foo bar (BAZ)
^
8 = wordLen
GetNextWord, if IBMBIDI, will limit itself up the wordLen, instead of slurping the entire fragment's length.
Actually, it makes a whole lot of sense for MeasureText to limit itself to a single LTR/RTL run -- it makes visual measurements and it needs visually-consecutive text for that.
From here, I need to check:
- how is the line-breaking verdict applied ('when the fact is set in stone?'),
- is it possible to take that verdict out of MeasureText' hands and merely make MeasureText pass context info from one invocation to its next?
Will look into it more tomorrow.
Assignee | ||
Updated•19 years ago
|
Blocks: line-breaking
Assignee | ||
Comment 15•19 years ago
|
||
Answers to myself:
Q: How is the line-breaking verdict applied ('when the fact is set in stone') ?
A: It's applied after the reflow, by nsBlockFrame::SplitLine, following decissions made by nsLineLayout::CanPlaceFrame.
Q: Is it possible to take that verdict out of MeasureText' hands and merely make
MeasureText pass context info from one invocation to its next?
A: MeasureText doesn't make the verdict. It just finds the best line break within a text frame, given the metrics of the containing-box. The result of MeasureText is *NOT* a measurement of the entire text, but a measurement of as much text as can fit in the width. If nsLineLayout::CanPlaceFrame determines MeasureText was too optimistic, the reflow engine will re-run it (the "eReflowReason_Resize" reflow) with smaller containing-box metrics (<-- my guess).
---
The line-breaking engine is very simplistic. For example:
HTML: "foo bar<span>baz</span>"
Frames:
+- nsTextFrame("foo bar")
+- nsInlineFrame("span") <-- or whatever type it is
+- nsTextFrame("baz")
The containing-box width can fit "foo bar" but can't fit "baz".
1. On initial reflow, MeasureText will see "foo bar" and not knowing any better (it doesn't go as far as seeing "baz"), it will measure the whole "foo bar".
2. CanPlaceFrame will determine there's no place for the "baz" frame, and since "foo bar" cannot be broken after, it'll re-run MeasureText with smaller available width (i.e. eReflowReason_Resize).
3. On the re-run, MeasureText will measure only "foo" cause "bar" wouldn't fit.
The same thing should happen with BiDi nsContinuingTextFrames (= the text frame after the BiDi run break).
Why doesn't it happen? One reason was that the condition for the final check in MeasureText, which results in the possible rollback of the last word, was wrong:
(aTextData.mOffset == contentLength)
It should've also had the condition (aTextData.mOffset == startOfTheNextBidiRun).
This helps resolve the "a bC" (HEBREW = UPPERCASE) issue (making the linebreak between "a" and "b").
However, this still doesn't solve my pet peeve: "i like (CHEESECAKE)". It can still line-break between the "(" and the "CHEESECAKE".
I'll be researching this further...
Assignee: smontagu → mozilla-bugzilla
Summary: LRE, RLE, LRO and RLO characters allow line breaks → BiDi run breaks should not be line break opportunities
Assignee | ||
Comment 16•19 years ago
|
||
Robert, I'm CCing you, as you were rumored to rework MeasureText for your Thebes project ( http://wiki.mozilla.org/Gecko2:NewTextAPI ).
Assignee | ||
Comment 17•19 years ago
|
||
Following a conversation with dbaron, I realized new things:
- MeasureText must be able to run over more than one text frame, since one of its other implicit jobs is to reunite nsContinuingTextFrames created by line breaking done in a previous reflow. (Uri's work in bug 299065 has the potential to make distinction between nsContinuingTextFrames that need to be unified and those that that don't easier.)
- contentLength != this->mContentLength ... To hell with whoever named the variables in this function!
Empowered by that new knowledge, I'll try to iron it out tomorrow. Probably one of my patches will have to be one which simply renames variables and puts comments all over.
MeasureText does look ahead to the rest of the word in the following frame, starting here:
http://lxr.mozilla.org/mozilla/source/layout/generic/nsTextFrame.cpp#5706
Check whether nsLineLayout::FindNextText is correctly finding the next text across a Bidi run break:
http://lxr.mozilla.org/mozilla/source/layout/generic/nsTextFrame.cpp#5727
Depends on: 343445
Comment 19•18 years ago
|
||
Fixed by bug 343445
Status: NEW → RESOLVED
Closed: 20 years ago → 18 years ago
Resolution: --- → FIXED
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in
before you can comment on or make changes to this bug.
Description
•