Closed Bug 343445 Opened 18 years ago Closed 18 years ago

Change inline reflow strategy to avoid looking ahead through multi-word frames

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: roc, Assigned: roc)

References

Details

Attachments

(9 files, 11 obsolete files)

(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
Currently when we have a word consisting of multiple frames, the nsTextFrame for the first text in the word tries to figure out whether the whole word will fit in the available width. If it doesn't fit, then that text frame forces a break before the word. The problem is that this requires nasty grovelling through the frame tree (including through frames that haven't been reflowed yet) and requires redundant transformation and measurement of the text in those later frames, even though in most cases the word will fit.

It would simplify textframes if we do all transformation and measurement during reflow and don't require this lookahead step. We can handle the multi-frame word case by recording the last place we can break before a word as we reflow the line, and if we find that a word is overflowing the available width, reflow the line again, forcing a break before the word.

I'm not yet sure what the performance effects are. We should save time when multi-frame words don't overflow the line, and we will probably cost more time when multi-frame words do overflow the line.
Attached patch patch version zero (obsolete) (deleted) — Splinter Review
This basically works, but needs considerably more testing, especially with first-letter and floats. I also want to instrument it to see how multi-frame words behave during Tp.
Attached patch add missing deltas (obsolete) (deleted) — Splinter Review
the previous patch ommitted tweaks to intl/ and content/
Attachment #227929 - Attachment is obsolete: true
Robert, this is what you referred to in bug 338993 comment #8, right?
If so, bug 338993 and bug 177147 should probably be marked as dependencies of this bug.
It's not as radical a change as I proposed in https://bugzilla.mozilla.org/show_bug.cgi?id=338993#c8 ... in particular, CanPlaceFrame still exists.

However, this patch does actually fix bugs 338993 and 177147 :-) because it stops boundaries between text frames from defaulting to break opportunities.
Blocks: 177147, 338993
Attached file first-letter testcase (deleted) —
This first-letter testcase is broken on trunk (break after the first letter) and fixed by my patch.
Attached file floating first-letter test (deleted) —
floated first-letter; works on trunk, not regressed by the patch.
(In reply to comment #4)

> However, this patch does actually fix bugs 338993 and 177147 :-) because it
> stops boundaries between text frames from defaulting to break opportunities.
> 

Does it fix bug 104517?
Attached file white-space:pre tests (deleted) —
These work on trunk and expose some bugs in this patch.
> Does it fix bug 104517?

Yes and no. It fixes that bug, ensuring that we never break the line after an open-parenthesis. But it introduces a new bug, where we don't actually break the line early enough and the line overflows the width. I will fix the latter right now...
Attached file CJK tests (deleted) —
This tests combinations of CJK breaking with white-space:pre. In particular, when breaking between textframes consisting of CJK ideographs, we need to find the nearest common ancestor of the two textframes to check whether it allows the "virtual whitespace" to be a break point.
Attached patch updated patch (obsolete) (deleted) — Splinter Review
This fixes the regressions I've found so far, and also fixes bug 104517.
Attachment #227932 - Attachment is obsolete: true
Things that I need to address before going for review:
-- check how often this triggers a second line reflow on our Tp tests
-- fix CanBreakBetween in nsTextFrame

The latter is tricky because the linebreaker is strange. It works basically like this:
-- look forward and backwards from the break point to see if there's any "CJK" (not sure what this really means) character before the nearest space. If there isn't, just use a simple whitespace breaking rule: break if a whitespace character is before or after the break point.
-- Otherwise, if the characters before and after the break point are Thai, go off and do some special Thai thing. Currently we use a heuristic that looks at a fixed number of characters in either direction but I'll assume that eventually we want to use a dictionary approach that can look arbitrarily far in either direction.
-- Otherwise we run a computation that can look at the two characters before the break point and the two characters after the break point. But it looks to me like even if the four characters around the break point are ASCII, we can get different results here than we would have with the whitespace test above. For example, the whitespace test does not allow breaking between control characters, the CJK class-based test does. It seems wrong that we can do this differently depending on whether CJK characters are present somewhere nearby.
This testcase shows the sort of trouble we get into with the linebreaker.

On the first line, note how clicking the button to add a CJK ideograph triggers CJK rules for the whole word, which makes us decide to break much earlier.

On the second line, note how breaking the word into multiple frames causes trunk to behave differently.
So the question is, is this sort of action-at-a-distance really necessary, or can we restrict the application of these CJK rules to situations where at least one of the characters next to the break point is a CJK character?
bug 255990 will help a lot here
Depends on: 255990
I did some tests on the Tp page set and this patch. In one pass over the pages, we did 42631 line reflows. The patch only induced a redo pass 55 times. Therefore I hope the performance impact will be minimal. (Of those 42631 line reflows, 37741 involved at least one text frame.)
Attached patch updated patch (obsolete) (deleted) — Splinter Review
Attachment #228100 - Attachment is obsolete: true
Attached patch diff -w for review (obsolete) (deleted) — Splinter Review
This is basically what I already had with some small changes to CanBreakInBetween.

I'm basically ignoring the problems that exist with the current linebreaker as seen in attachment 228101 [details]. They'll probably get a bit worse with this patch because the existing code does accumulate the text of the "current word" as it scans a word that spans text frames, and provides that as the left context for the linebreaker. My patch doesn't do that, it just provides the text of the previous frame as the left context. But really, the linebreaker needs to be fixed properly to always get the context it needs. That will be much easier when bug 255990 or similar is fixed, so that the linebreaker needs a bounded amount of context in almost all situations.

Having said that, this does fix the CJK issues in bug 104517.
Attachment #232689 - Flags: superreview?(dbaron)
Attachment #232689 - Flags: review?(dbaron)
Attached patch updated patch for review (obsolete) (deleted) — Splinter Review
The changes to linebreaker BreakInBetween, to return "need more context" when we run out of text looking for CJK characters, were no good as long as nsTextFrame doesn't actually provide more context. Instead, we should just assume non-CJK text for now. So I removed all the intl/ changes from the patch.
Attachment #232689 - Attachment is obsolete: true
Attachment #232861 - Flags: superreview?(dbaron)
Attachment #232861 - Flags: review?(dbaron)
Attachment #232689 - Flags: superreview?(dbaron)
Attachment #232689 - Flags: review?(dbaron)
Note: this breaks MEW computation for words that cross frame boundaries. I think that'd be much easier to fix on the reflow branch, but I don't really want this patch to wait for the reflow branch to land because my new text code in bug 333659 depends on this :-(.
Well, on second thoughts, I think I can fix MEW computation relatively easily without the reflow branch and separate this Gordian Knot of dependencies.
Attached patch fix MEW computation in the new world (diff -w) (obsolete) (deleted) — Splinter Review
I should have figured this out ages ago, sorry. This fixes MEW computation, very easy.

I would really like to land this on the trunk. If you then merge it to the reflow branch you should be able to fix multi-frame word min-width computation. And I will be able to press on with the textframe work, and hopefully land it soon (albeit off by default) since it doesn't change much other than this.
Attachment #239616 - Flags: superreview?(dbaron)
Attachment #239616 - Flags: review?(dbaron)
Attached patch sorry, this is the right patch (obsolete) (deleted) — Splinter Review
Sorry, this is the right patch. Fixes MEW computation. I would like to land this; it might help you with min-width computation for multi-frame words on the reflow branch, and it will help me get my textframe work landed soon (albeit turned off by default).
Attachment #232686 - Attachment is obsolete: true
Attachment #232861 - Attachment is obsolete: true
Attachment #239616 - Attachment is obsolete: true
Attachment #239617 - Flags: superreview?(dbaron)
Attachment #239617 - Flags: review?(dbaron)
Attachment #232861 - Flags: superreview?(dbaron)
Attachment #232861 - Flags: review?(dbaron)
Attachment #239616 - Flags: superreview?(dbaron)
Attachment #239616 - Flags: review?(dbaron)
Attached patch patch to apply (obsolete) (deleted) — Splinter Review
The previous patch was diff -w, this is the corresponding complete patch.
How does this interact with the part of nsLineLayout::VerticalAlignLine where we redo the MEW computation because the first time wasn't good enough?
We're not actually redoing the computation there, right? We're computing the final MEW for the line, and using the pfd->mMaxElementWidth for the inline frames. With this patch, that should be fine. The last frame in a multi-frame word will return an MEW that's the length of the whole word (unless it also contains a longer word, of course), which gets stored in pfd->mMaxElementWidth, and then VerticalAlignLine uses that.
Is the basic idea of this patch still what you said in comment 0?  Would you mind summarizing the changes in a little more detail?
Sure, no problem. I wish there was a nice way to provide patch commentary. Comment #0 is still a good summary of the patch.

Before I forget: MEW computation is broken for multi-frame words that use borders and margins for some of the frames, both with this patch and on trunk, almost but not exactly in the same way. I've never seen a bug on this though. We can fix it later.
Attached file testcase (deleted) —
Testcase for min-width computation for multi-frame words with borders and margins on part of the word.

On trunk and with this patch, we completely fail to account for inline borders in the min-width.

Trunk assigns the word min-width to the first frame containing the word, so when that frame contains the margins, things are OK, otherwise the margin is effectively ignored for the min-width.

This patch assigns the word min-width to the last frame containing the word, so when that frame contains the margins, things are OK, otherwise the margin is effectively ignored for the min-width.
Summary: currently our model is that frames must predict when the available width will be exceeded and break eagerly to prevent it. With this patch, we allow overrunning the available width, but remember where we last could have broken and roll back to that point.

The overall approach of the patch is to do line reflow in two passes. During the first pass we call nsLineLayout::NotifyOptionalBreakPosition to record in nsLineLayout possible places where we could insert a soft line break. These break positions are stored as a (content, offset) pair. nsLineLayout just remembers the last possible position.

If we finish line reflow and we haven't overflowed the available width (e.g., because nsTextFrame saw the boundary approaching and was able to induce a linebreak itself), then nothing else needs to happen. Otherwise, nsLineLayout::CanPlaceFrame() detects that we're out of space to place something, but instead of forcing a line break immediately, it can just set a flag to request second-pass reflow. nsBlockFrame::DoReflowInlineFrames checks that flag and requests another pass from nsBlockFrame::ReflowInlineFrames, telling it to break at the last record soft break opportunity. That break point is stored via nsLineLayout::ForceBreakAtPosition(). During the second pass reflow, we break at that place and nowhere else.

The rest is details. One big effect of this change is that we can move from "by default, allow breaks between frames" to "by default, disallow breaks between frames that CanContinueTextRun()". This is good, it fixes bug 177147 and bug 338993 and simplifies logic in nsLineLayout::CanPlaceFrame ... we don't have to do anything to ensure that first-letter stays with its word. And of course we eliminate the need for textframes to run ahead computing the full word width, and FindNextText which that requires. (Although we may need to bring back something like FindNextText when the time comes to do large-context line breaking.)

Another big detail, which interests you I think, is detecting word breaks between frames. On trunk the first nsTextFrame in a word scans through the frames for the word, accumulating the word text in a buffer, which it can use to detect word breaks. With this patch we're not doing that anymore so I had to create CanBreakBetween. It takes two text frames which we assume are adjacent in the text flow and checks whether a break is allowed between them. It's complex because it needs to handle CSS white-space values on either of the two frames and also their nearest common ancestor, whose 'white-space' value controls whether we can break between the two frames when the junction has no whitespace but a break is allowed. As we reflow the line we remember the last text frame whose text is logically before the current frame (if any) to be used  as input to CanBreakBetween; this is done via nsLineLayout::Get/SetTrailingTextFrame.

nsLineLayout::SetInWord and nsLineLayout::InWord track whether the last trailing text frame ended in non-whitespace and what the running width of the last word was.
Comment on attachment 239617 [details] [diff] [review]
sorry, this is the right patch

> void
>+nsTextFragment::AppendTo(nsAString& aString, PRInt32 aOffset, PRInt32 aLength) const
>+{
>+  if (mState.mIs2b) {
>+    aString.Append(m2b + aOffset, aLength);
>+  } else {
>+    AppendASCIItoUTF16(Substring(m1b + aOffset, m1b + aLength), aString);

This looks wrong.  I think you want just aLength (or m1b + aOffset + aLength) for the second argument to Substring.

>Index: content/base/src/nsTextFragment.h
>+   * Append the contents of this string fragment to aString
>+   */
>+  void AppendTo(nsAString& aString, PRInt32 aOffset, PRInt32 aCount) const;

This needs comments saying what the offset and the count are into.  Also, it might be nice to pick one of aCount or aLength (the declaration uses one and the definition uses the other).

>+  nsIFrame* lastCommonFrame = aKnownCommonAncestorHint;
>+  int last1 = frame1Ancestors.Count() - 1;
>+  int last2 = frame2Ancestors.Count() - 1;

Maybe use PRInt32 since that's what ns*VoidArray returns?

>+  while (last1 >= 0 && last2 >= 0) {
>+    nsIFrame* frame1 = NS_STATIC_CAST(nsIFrame*, frame1Ancestors.ElementAt(last1));
>+    if (frame1 != frame2Ancestors.ElementAt(last2))
>+      break;
>+    lastCommonFrame = frame1;
>+    last1--;
>+    last2--;
>+  }

Would be good to assert that lastCommonFrame is non-null, to catch disconnected trees on the same pres context (which aren't otherwise detected).

>+  return lastCommonFrame;

>Index: layout/base/nsLayoutUtils.h
>+  static nsIFrame*
>+  GetClosestCommonAncestorViaPlaceholders(nsIFrame* aFrame1, nsIFrame* aFrame2,
>+                                          nsIFrame* aKnownCommonAncestorHint);

Could you document the restrictions on aKnownCommonAncestorHint?  In particular, that the implementation checks whether it's really a common ancestor, so mistakes only have performance cost.

>Index: layout/generic/nsBlockFrame.cpp

Up to here.
Comment on attachment 239617 [details] [diff] [review]
sorry, this is the right patch

Please make sure you've tested redoing next to floats -- in other words, that a long word that doesn't fit next to a float gets pushed down past the float.  (Also try this with a long word that's multiple frames (both where the first frame fits and where the first frame doesn't), and with a long word that doesn't even fit below the float.)

>Index: layout/generic/nsBlockFrame.cpp
>       lineLayout.EndLineReflow();
>+      forceBreakInContent =
>+        lineLayout.GetLastOptionalBreakPosition(&forceBreakOffset);

Shouldn't you be checking lineLayout.NeedsBackup() here, since LINE_REFLOW_REDO_NO_PULL could happen in other cases, right?  It seems like at the very least you need a bunch of assertions -- but perhaps you need the |breakContent| variable from DoReflowInlineFrames.

>Index: layout/generic/nsLineLayout.h
>+  void SetInWord(PRBool aInWord, nscoord aWordWidth) {
>+    SetFlag(LL_INWORD, aInWord);
>+    mWordWidth = aWordWidth;
>+  }
>+  PRBool InWord(nscoord* aWordWidth) {
>+    if (!GetFlag(LL_INWORD))
>+      return PR_FALSE;
>+    *aWordWidth = mWordWidth;
>+    return PR_TRUE;
>+  }
>+  
>+  void SetTrailingTextFrame(nsIFrame* aFrame, PRBool aWrappingEnabled)
>+  { 
>+    mTrailingTextFrame = aFrame;
>+    SetFlag(LL_LASTTEXTFRAME_WRAPPINGENABLED, aWrappingEnabled);
>+  }
>+  nsIFrame* GetTrailingTextFrame(PRBool* aWrappingEnabled) {
>+    *aWrappingEnabled = GetFlag(LL_LASTTEXTFRAME_WRAPPINGENABLED);
>+    return mTrailingTextFrame;
>+  }

This needs documentation.  What's the difference between having a non-null trailing textframe and being InWord?  What's the wrapping-enabled state?  What's the word width?


>Index: layout/generic/nsLineLayout.cpp
> nsLineLayout::ReflowFrame(nsIFrame* aFrame,
>                           nsReflowStatus& aReflowStatus,
>                           nsHTMLReflowMetrics* aMetrics,
>                           PRBool& aPushedFrame)
> {
>-  // Initialize OUT parameter
>+  // Initialize OUT parameters
>   aPushedFrame = PR_FALSE;

This change seems bad.  (Did you undo something that added a new one?)
Comment on attachment 239617 [details] [diff] [review]
sorry, this is the right patch

>Index: layout/generic/nsLineLayout.cpp
>+  PRInt32 savedCurrentBreakOffset;
>+  nsIContent* savedCurrentBreakPosition =
>+    GetLastOptionalBreakPosition(&savedCurrentBreakOffset);

Seems like you should call the second variable |savedCurrentBreakContent| for consistency with the member variables.  And you might want to s/Current/Optional/ for both variables.

(I was getting confused about whether these variables had the same semantics as mLastOptionalBreakContent...)

>+    PRBool canContinueTextRun;
>+    aFrame->CanContinueTextRun(canContinueTextRun);
>+    
>+    if (!canContinueTextRun) {
>+      SetFlag(LL_INWORD, PR_FALSE);
>+      // Otherwise the current value is OK
>+    }

I find the naming of this variable |canContinueTextRun| and the |aCanContinueTextRun| argument in which it is passed to CanPlaceFrame to be rather confusing.  It seems to me that these are whether the frame *does* continue a text run.  (The definition of text run here is an unbreakable segment of text, I think.)  So in this case, we're calling CanContinueTextRun to determine if the frame can continue a text run -- because in this case if it can continue a text run then it does.  So I don't think the variable and the argument should have "can" in their names.

(Or am I getting something confused here?)

In addition to changing the name, you might want a comment to that effect (that if it can, it does) where you call CanContinueTextRun.

>+      if (pfd->GetFlag(PFD_ISTEXTFRAME)) {
>+        if (CanPlaceFloatNow()) {
>+          // The text frame must be empty.
>+          // Empty text frames at the logical start of a line should not be
>+          // allowed to specify soft break positions.
>+          RestoreSavedBreakPosition(savedCurrentBreakPosition,
>+                                    savedCurrentBreakOffset);

Should you assert that savedCurrentBreakPosition is null?

However, I'm not confident that this is a sufficient workaround for the problem you're trying to solve.  Why is it that:
 (a) this is needed, and yet
 (b) you don't have similar problems with "<p>  averylongword" or especially "<p><span><span>  </span>averylongword" causing extra lines to be created.  (Would such extra lines have any height?  You might need to check in the frame dump.)

>+      } else if (!canContinueTextRun) {
>+        mTrailingTextFrame = nsnull;

If this is the right condition for nulling out mTrailingTextFrame, I think we need to take out the CanContinueTextRun overrides -- which override to the derived class behavior but have comments saying they should be changed.  It's not clear to me whether CanContinueTextRun was meant to answer whether the frame does not introduce break points at its edges or whether the frame is invisible to breaking.  I think we use it for both, which confuses me.

(And then, at least in quirks mode and probably also in standards mode at least pre-reflow-branch, there are quirks where images are neither.  Or something like that.  I have testcases at http://dbaron.org/css/test/2003/line-breaking/ .)

>+          // If this returns true then we are being told to actually break here.
>+          if (NotifyOptionalBreakPosition(aFrame->GetContent(), PR_INT32_MAX)) {
>+            aReflowStatus = NS_INLINE_LINE_BREAK_AFTER(aReflowStatus);
>+          }

The comment seems to belong better inside the if.

>+  nscoord margin = ltr ? pfd->mMargin.right : pfd->mMargin.left;

I'd call this |endMargin|.

>-    // There are no frames on the line or we are in the first word on
>-    // the line. If the line isn't impacted by a float then the
>-    // current frame fits.
>+    // There are no frames on the line that take up width. If the line isn't
>+    // impacted by a float then the current frame fits.

Make sure to test the case that you removed from the comment -- element changes within the first word on the line -- both with and without floats, and with the element change both before and after the end of the line.  (I probably said some of that before.)
(In reply to comment #32)
> (From update of attachment 239617 [details] [diff] [review] [edit])
> Please make sure you've tested redoing next to floats -- in other words, that
> a long word that doesn't fit next to a float gets pushed down past the float. 
> (Also try this with a long word that's multiple frames (both where the first
> frame fits and where the first frame doesn't), and with a long word that
> doesn't even fit below the float.)

Will do.

> >Index: layout/generic/nsBlockFrame.cpp
> >       lineLayout.EndLineReflow();
> >+      forceBreakInContent =
> >+        lineLayout.GetLastOptionalBreakPosition(&forceBreakOffset);
> 
> Shouldn't you be checking lineLayout.NeedsBackup() here, since
> LINE_REFLOW_REDO_NO_PULL could happen in other cases, right?

Yes, good catch, thanks.

> >Index: layout/generic/nsLineLayout.h
> >+  void SetInWord(PRBool aInWord, nscoord aWordWidth) {
> >+    SetFlag(LL_INWORD, aInWord);
> >+    mWordWidth = aWordWidth;
> >+  }
> >+  PRBool InWord(nscoord* aWordWidth) {
> >+    if (!GetFlag(LL_INWORD))
> >+      return PR_FALSE;
> >+    *aWordWidth = mWordWidth;
> >+    return PR_TRUE;
> >+  }
> >+  
> >+  void SetTrailingTextFrame(nsIFrame* aFrame, PRBool aWrappingEnabled)
> >+  { 
> >+    mTrailingTextFrame = aFrame;
> >+    SetFlag(LL_LASTTEXTFRAME_WRAPPINGENABLED, aWrappingEnabled);
> >+  }
> >+  nsIFrame* GetTrailingTextFrame(PRBool* aWrappingEnabled) {
> >+    *aWrappingEnabled = GetFlag(LL_LASTTEXTFRAME_WRAPPINGENABLED);
> >+    return mTrailingTextFrame;
> >+  }
> 
> This needs documentation.

Sure, I'll do that.

> What's the difference between having a non-null trailing textframe and being
> InWord?

TrailingTextFrame => if the last content placed on the line (not counting inline containers) was text, and can form a contiguous text flow with the next content to be placed, this is the frame for the last content ... otherwise it's null.

> What's the wrapping-enabled state?

Whether that text had word-wrapping enabled (white-space:normal or -moz-pre-wrap).

> What's the word width?

InWord is set when the last text frame to be reflowed ended in non-whitespace (so it has content that might form a word with subsequent text). The word width is the width of contiguous text up to the end of that last word, including possibly words from previous frames.

> >Index: layout/generic/nsLineLayout.cpp
> > nsLineLayout::ReflowFrame(nsIFrame* aFrame,
> >                           nsReflowStatus& aReflowStatus,
> >                           nsHTMLReflowMetrics* aMetrics,
> >                           PRBool& aPushedFrame)
> > {
> >-  // Initialize OUT parameter
> >+  // Initialize OUT parameters
> >   aPushedFrame = PR_FALSE;
> 
> This change seems bad.  (Did you undo something that added a new one?)

I don't remember, but probably so.

(In reply to comment #33)
> (From update of attachment 239617 [details] [diff] [review] [edit])
> >Index: layout/generic/nsLineLayout.cpp
> >+  PRInt32 savedCurrentBreakOffset;
> >+  nsIContent* savedCurrentBreakPosition =
> >+    GetLastOptionalBreakPosition(&savedCurrentBreakOffset);
> 
> Seems like you should call the second variable |savedCurrentBreakContent| for
> consistency with the member variables.  And you might want to
> s/Current/Optional/ for both variables.

Indeed.

> >+    PRBool canContinueTextRun;
> >+    aFrame->CanContinueTextRun(canContinueTextRun);
> >+    
> >+    if (!canContinueTextRun) {
> >+      SetFlag(LL_INWORD, PR_FALSE);
> >+      // Otherwise the current value is OK
> >+    }
> 
> I find the naming of this variable |canContinueTextRun| and the
> |aCanContinueTextRun| argument in which it is passed to CanPlaceFrame to be
> rather confusing.  It seems to me that these are whether the frame *does*
> continue a text run.  (The definition of text run here is an unbreakable
> segment of text, I think.)

CanContinueTextRun returns true for frames whose boundaries *could* be crossed by contiguous, non-breakable text. What we're really looking for here is frames that return false, i.e. frames that always break up text. Perhaps I could call  canContinueTextRun "doesntBreakText"?

> >+      if (pfd->GetFlag(PFD_ISTEXTFRAME)) {
> >+        if (CanPlaceFloatNow()) {
> >+          // The text frame must be empty.
> >+          // Empty text frames at the logical start of a line should not be
> >+          // allowed to specify soft break positions.
> >+          RestoreSavedBreakPosition(savedCurrentBreakPosition,
> >+                                    savedCurrentBreakOffset);
> 
> Should you assert that savedCurrentBreakPosition is null?

Yes, I think so.

> However, I'm not confident that this is a sufficient workaround for the
> problem you're trying to solve.  Why is it that:
>  (a) this is needed, and yet
>  (b) you don't have similar problems with "<p>  averylongword" or especially
> "<p><span><span>  </span>averylongword" causing extra lines to be created. 

I'm not sure I understand what you're getting at. We've never allowed a soft line break until after some non-empty content has been placed on the line; we still don't. This code just makes sure we forget about any *potential* line breaks as well as actual ones.

> >+      } else if (!canContinueTextRun) {
> >+        mTrailingTextFrame = nsnull;
> 
> If this is the right condition for nulling out mTrailingTextFrame, I think we
> need to take out the CanContinueTextRun overrides -- which override to the
> derived class behavior but have comments saying they should be changed.

You mean for images and <canvas>?

> It's not clear to me whether CanContinueTextRun was meant to answer whether
> the frame does not introduce break points at its edges or whether the frame
> is invisible to breaking.  I think we use it for both, which confuses me.

I assumed it means that the frame and its descendants can form unbreakable text runs with text before and after it, i.e., !CanContinueTextRun means that the start and end of the frame are definitely break opportunities. I believe we only use it for this, it's only called in two places currently, both in nsLineLayout::FindNextText which bails as soon as CanContinueTextRun returns false.

The quirk you're referring to is this one, I think:
http://lxr.mozilla.org/mozilla/source/layout/generic/nsLineLayout.cpp#1740
where we accumulate image sizes into the MEW in a hacky way, to approximate the effect of images not offering break opportunities. If we ever do change inline layout so that images don't induce before-and-after break opportunities (something I fear would break sites, since it's pretty fundamental), I suggest we introduce a new method or flag to control it. It would be pretty easy to do after this patch, actually.

So yeah, I agree with taking out those overrides in image and canvas, and I'll do it here if you want.
(In reply to comment #34)
> > However, I'm not confident that this is a sufficient workaround for the
> > problem you're trying to solve.  Why is it that:
> >  (a) this is needed, and yet
> >  (b) you don't have similar problems with "<p>  averylongword" or especially
> > "<p><span><span>  </span>averylongword" causing extra lines to be created. 
> 
> I'm not sure I understand what you're getting at. We've never allowed a soft
> line break until after some non-empty content has been placed on the line; we
> still don't. This code just makes sure we forget about any *potential* line
> breaks as well as actual ones.

What I was getting at is that I think CanPlaceFloat will return false after reflowing "  averylongword" or (because of the span -- although maybe only if it has a border) after reflowing "  ".  Although I didn't actually check that it would -- that's from memory.

> > If this is the right condition for nulling out mTrailingTextFrame, I think we
> > need to take out the CanContinueTextRun overrides -- which override to the
> > derived class behavior but have comments saying they should be changed.
> 
> You mean for images and <canvas>?

Yes.

> > It's not clear to me whether CanContinueTextRun was meant to answer whether
> > the frame does not introduce break points at its edges or whether the frame
> > is invisible to breaking.  I think we use it for both, which confuses me.
> 
> I assumed it means that the frame and its descendants can form unbreakable text
> runs with text before and after it, i.e., !CanContinueTextRun means that the
> start and end of the frame are definitely break opportunities. I believe we
> only use it for this, it's only called in two places currently, both in
> nsLineLayout::FindNextText which bails as soon as CanContinueTextRun returns
> false.

Well, IIRC, in standards mode, before the reflow branch, !CanContinueTextRun doesn't mean that there are break opportunities for the purpose of minimum intrinsic width calculation, but I'm changing that on the reflow branch so min-intrinsic is more consistent with where breaks are actually allowed.  (Unless I find I need to change back, anyway.)

> So yeah, I agree with taking out those overrides in image and canvas, and I'll
> do it here if you want.

Sure.

(It's more the implications of the comments in the overrides that I'm worried about -- since they seem to imply that CanContinueTextRun means something different.)
(In reply to comment #34)
> > >+      if (pfd->GetFlag(PFD_ISTEXTFRAME)) {
> > >+        if (CanPlaceFloatNow()) {
> > >+          // The text frame must be empty.
> > >+          // Empty text frames at the logical start of a line should not be
> > >+          // allowed to specify soft break positions.
> > >+          RestoreSavedBreakPosition(savedCurrentBreakPosition,
> > >+                                    savedCurrentBreakOffset);
> > 
> > Should you assert that savedCurrentBreakPosition is null?
> 
> Yes, I think so.
> 
> > However, I'm not confident that this is a sufficient workaround for the
> > problem you're trying to solve.  Why is it that:
> >  (a) this is needed, and yet
> >  (b) you don't have similar problems with "<p>  averylongword" or especially
> > "<p><span><span>  </span>averylongword" causing extra lines to be created. 
> 
> I'm not sure I understand what you're getting at. We've never allowed a soft
> line break until after some non-empty content has been placed on the line; we
> still don't. This code just makes sure we forget about any *potential* line
> breaks as well as actual ones.

Although, more fundamentally, I'm concerned that this code is just a workaround for a deeper problem, and you shouldn't really need it at all.

For example, are you marking trimmed whitespace as an allowed break point?  For example, are you changing things so that we might break after rather than before the border in:

foo <span style='border-left:500px solid'> x</span>

(We don't break after the border because the whitespace there is collapsed with the whitespace before the border.  I wonder if the spec really says to do that, though, or if other browsers do.)
(In reply to comment #36)
> Although, more fundamentally, I'm concerned that this code is just a
> workaround for a deeper problem, and you shouldn't really need it at all.

Yeah, you were absolutely right. The correct fix is to ignore all-empty textframes for the purposes of GetTrailingTextFrame, then all-skipped-whitespace textframes at the start of the line don't create a break opportunity for the next frame. With that, I can get rid of the save/restore around empty textframes. This does change behaviour against trunk for <p><span style="border:1px black">  </span>averylongword --- trunk breaks after the span, this patch doesn't. I think the new behaviour is better.
(In reply to comment #36)
> For example, are you changing things so that we might break after rather than
> before the border in:
> 
> foo <span style='border-left:500px solid'> x</span>

Yeah, I was.

Turns out fixing this is tricky. Fixing that particular case is easy, e.g. by whacking CanBreakBetween to skip leading whitespace in the after-frame if necessary. But fixing the general case is hard, e.g.
<p>foo <span style="border-left:500px solid black;"><span> </span>x</span>
In fact, this case is currently broken on trunk :-). I think it is important that we not break between the span border and the 'x', because when we do our behaviour is really weird (the border can disappear (almost) entirely) and I don't want to get into fixing that.

Basically, we do not want to allow breaking between an entirely-skipped textframe and the next textframe. We want the break (if any) to occur *before* any entirely-skipped textframes because that's where the whitespace is that we're actually breaking at. This is fairly easy to implement using my new machinery here. We make textframes that end in non-skipped whitespace and reflow completely, and that we want to be able to break after, set an optional break position at their end. We make textframes that are completely skipped clear the trailing-text-frame state so the next textframe can't break-before. Presto, we can't break after entirely-skipped textframes but we can break before them.
Here are the testcases for long words next to floats.
These are the testcase we just discussed involving trimmed whitespace inducing (bogus) break opportunities. The last test breaks on trunk, though you may have to resize the window for a little while to get it to trigger.
Attached patch updated patch to apply (obsolete) (deleted) — Splinter Review
Attachment #239618 - Attachment is obsolete: true
Changes from the last patch:
-- nsBlockFrame::ReflowInlineFrames: don't set a forced break unless this retry needs to back up. Also clear mCurrentLineFloats when retrying (fixes a trunk bug I think)
-- nsBlockFrame::DoReflowInlineFrames: set a break opportunity at the start of any line impacted by floats
-- delete image/canvas CanContinueTextRun stubs
-- nsLineLayout::ReflowFrame: Clear notSafeToBreak if line is impacted by floats (both users of this variable want that anyway). Requested variable name changes. Allow breaking after placing a float. Don't restore break state after reflowing an empty textframe at the start of the line.
-- Added requested comments to nsLineLayout.h
-- nsLineLayout::ClearOptionalBreakPosition() should not assert if we're in a NeedsBackup situation, so implement it directly instead of calling NotifyOptionalBreakPosition.
-- nsTextFrame.cpp CanBreakBetween: don't allow breaking if the after-frame has skipped leading whitespace. Also, handle discarded characters (regression fix). 
-- nsTextFrame::MeasureText: do SetTrailingTextFrame and SetInWord only if the textframe has some non-skipped content (the latter fixes a massive regression in MEW calculation which was setting MEW to NS_UNCONSTRAINEDSIZE in some cases). Do lineLayout.SetTrailingTextFrame(nsnull, PR_FALSE) when the textframe is all skipped whitespace. Record break opportunity at the end of a complete, wrappable textframe ending in non-skipped whitespace (if the opportunity occurs inside the available width).
Attachment #239617 - Attachment is obsolete: true
Attachment #242369 - Flags: superreview?(dbaron)
Attachment #242369 - Flags: review?(dbaron)
Attachment #239617 - Flags: superreview?(dbaron)
Attachment #239617 - Flags: review?(dbaron)
Comment on attachment 242369 [details] [diff] [review]
updated patch for comments and bug fixes (diff -w)

I noticed this doesn't address my comments on the early parts of the patch -- at least nsTextFragment and nsLayoutUtils -- in case you missed those.

>Index: layout/generic/nsLineLayout.cpp
>+    // Check whether this frame breaks up text runs. All frames return do
>+    // (hence return false here) except for text frames and inline containers.

I'm having trouble parsing this.

>+      if (!continuingTextRun) {
>+        mTrailingTextFrame = nsnull;
>+        if (!psd->mNoWrap && (!CanPlaceFloatNow() || placedFloat)) {
>+          // record soft break opportunity after this content that can't be
>+          // part of a text run. This is not a text frame so we know
>+          // that offset PR_INT32_MAX means "after the content".
>+          if (NotifyOptionalBreakPosition(aFrame->GetContent(), PR_INT32_MAX)) {
>+            // If this returns true then we are being told to actually break here.
>+            aReflowStatus = NS_INLINE_LINE_BREAK_AFTER(aReflowStatus);
>+          }
>+        }
>+      }

Is there corresponding code somewhere else for a break before the replaced element?  Should there be?  (I'm wondering since such a break isn't associated with the text if there's no whitespace.)

>+  if (aFrameCanContinueTextRun) {
>+    // Let it fit, but we reserve the right to roll back
>+    // to before the text run! Note that we usually won't get here because
>+    // a text frame will break itself to avoid exceeding the available width.
>+    // We'll only get here for text frames that couldn't break early enough.
>+#ifdef NOISY_CAN_PLACE_FRAME
>+    printf("   ==> placing overflowing textrun, requesting backup\n");
>+#endif
>+    if (!mLastOptionalBreakContent) {
>+      // Nowhere to roll back to, so make this fit
>+      return PR_TRUE;
>+    }

So why do we not hit this case if we're the first long piece of text next to a float?  Is this return PR_TRUE needed?  (Especially if we initialize mLastOptionalBreak* to the beginning of the line, at least when adjacent to floats.)

>Index: layout/generic/nsLineLayout.h
>+  /**
>+   * InWord is true when the last text frame to be reflowed ended in non-whitespace

s/to be//

>+   * (so it has content that might form a word with subsequent text). The word width
>+   * is the width of contiguous text up to the end of that last word, including
>+   * possibly words from previous frames.

s/including possibly/possibly including/

>+  /**
>+   * If the last content placed on the line (not counting inline containers)
>+   * was text, and can form a contiguous text flow with the next content to be
>+   * placed, and is not just a frame of all-skipped whitespace, this is the
>+   * frame for that last content ... otherwise it's null.

Could you answer whether its nullness is equivalent to !InWord?  (If not, when are they different.)

Does LL_LASTTEXTFRAME_WRAPPINGENABLED need to be a flag, or could you just get the style data from the frame when you need it in CanBreakBetween?

You should document what GetLastOptionalBreakPosition/mLastOptionalBreakContent being null means.  (In particular, whether it means there hasn't been one yet on the line, or under what conditions there's an implied break at the start of the line.)

>+  PRBool NeedsBackup() { return GetFlag(LL_NEEDBACKUP); }

You should document that a caller checking NeedsBackup should ignore the result if a forced break position was set.  (Either that or you should avoid setting LL_NEEDBACKUP when a forced break position is set.)

>+  nsIFrame* GetBlockFrame() { return mBlockReflowState->frame; }

Is this always a block frame?  Is it always non-null?

>Index: layout/generic/nsPlaceholderFrame.h

Up to here.
Comment on attachment 242369 [details] [diff] [review]
updated patch for comments and bug fixes (diff -w)

>Index: layout/generic/nsPlaceholderFrame.h
>+  NS_IMETHOD CanContinueTextRun(PRBool& aContinueTextRun) const

Could you put the implementation in the .cpp file, please?

>+    if (!mOutOfFlowFrame) {
>+      aContinueTextRun = PR_FALSE;
>+      return NS_OK;
>+    }

Is a null mOutOfFlowFrame even valid for a frame being Reflowed?  It seems like you shouldn't need this check.

>Index: layout/generic/nsTextFrame.cpp

>+static PRBool IsWrapping(const nsStyleText* aTS)
>+{
>+  return (NS_STYLE_WHITESPACE_NORMAL == aTS->mWhiteSpace) ||
>+    (NS_STYLE_WHITESPACE_MOZ_PRE_WRAP == aTS->mWhiteSpace);
>+}

Could you add this as a method on nsStyleText instead?  (Perhaps called WhiteSpaceCanWrap, given the existing WhiteSpaceIsSignificant.)

>+static PRBool CanBreakBetween(nsTextFrame* aBefore,
>+                              PRBool aBreakWhitespaceBefore,
>+                              nsTextFrame* aAfter,
>+                              PRBool aBreakWhitespaceAfter,
>+                              nsIFrame* aBlockContainer)
>+{
>+  // This assumes text transforms don't change text breaking properties
>+  const nsTextFragment* fragBefore = aBefore->GetContent()->GetText();
>+  const nsTextFragment* fragAfter = aAfter->GetContent()->GetText();
>+  if (!fragBefore || !fragAfter)
>+    return PR_FALSE;

Why should this ever happen?  Should there be an assertion?

>+  PRInt32 beforeOffset = aBefore->GetContentOffset() + aBefore->GetContentLength();
>+  PRInt32 afterOffset = aAfter->GetContentOffset();
>+  PRInt32 afterLength = fragAfter->GetLength() - afterOffset;
>+  
>+  if (beforeOffset <= 0 || afterLength <= 0) {

Looks like this check should be for |beforeLength| (which you'd need to add), not |beforeOffset|.

More generally, the code here and later in this function seems to respect the text frame's fragment range on the after text frame, but go back through to the beginning of the content (even if out of the text frame's fragment range) on the before text frame.  It seems like it should certainly be consistent, although it might need to look at multiple text frames at some point in the future (or maybe even now).

Or was that intentional based on the asymmetry of how the before and after frames are chosen to call this?

>+  PRUnichar lastBefore = fragBefore->CharAt(beforeOffset - 1);
>+  PRUnichar firstAfter = fragAfter->CharAt(afterOffset);
>+  
>+  // If we're skipping leading whitespace in the after-frame, and we actually
>+  // have leading whitespace in the after-frame, then we can't break before
>+  // it. We will rely on a saved break opportunity from the end of the last frame
>+  // (if any). The problem is that we can't accurately figure out here whether
>+  // a break is allowed.
>+  if (aSkipLeadingWhitespaceAfter && XP_IS_SPACE(firstAfter))
>+    return PR_FALSE;

I don't see how this even compiles.  Where does aSkipLeadingWhitespaceAfter come from?

>+  // If the character before or after the boundary is a breakable whitespace
>+  // character that's not skipped, we're good to break.
>+  if ((XP_IS_SPACE(lastBefore) && aBreakWhitespaceBefore) ||
>+      (XP_IS_SPACE(firstAfter) && aBreakWhitespaceAfter))
>+    return PR_TRUE;

If the space is firstAfter, why don't you want the break to be in the after text frame, after the space?

>+  nsIFrame* f =
>+    nsLayoutUtils::GetClosestCommonAncestorViaPlaceholders(aBefore, aAfter, aBlockContainer);
>+  NS_ASSERTION(f, "Frames to check not in the same document???");
>+  // Check if our nearest common ancestor allows wrapping between its children
>+  if (!IsWrapping(f->GetStyleText()))
>+    return PR_FALSE;

Is there any reason this needs to walk through placeholders?
Comment on attachment 242369 [details] [diff] [review]
updated patch for comments and bug fixes (diff -w)

>+  PRInt32 result = nsContentUtils::LineBreaker()->BreakInBetween(
>+    beforeStr.get(), beforeStr.Length(), afterStr.get(), afterStr.Length());
>+  if (result >= 0)
>+    return result > 0;

I don't understand this code, given that BreakInBetween returns a PRBool.  Or is that PRBool not a PRBool?  It's not documented that way in nsILineBreaker.

>+  // Check whether we can break between the last text frame (if any) and this one
>+  PRBool trailingTextFrameCanWrap;
>+  nsIFrame* checkLastTextFrame = lineLayout.GetTrailingTextFrame(&trailingTextFrameCanWrap);

Could you NS_ASSERTION on the frame's type (since it's cast to nsTextFrame*)?

>+  PRBool canBreakBetweenTextFrames = PR_FALSE;
>+  if (checkLastTextFrame) {

check* seems like an odd name for a frame.  How about just lastTextFrame?

>+      PRBool isFirstThing = firstThing;
>       firstThing = PR_FALSE;

Having two different variables called firstThing and isFirstThing seems really dangerous.  Could you rename something to clarify this?

>+          if (0 == aTextData.mX) {
>+            canBreak = canBreakBetweenTextFrames;
>+            // Allow breaking between text frames even if mWrapping is false
>+            // (e.g., we're white-space:pre). If canBreakBetweenTextFrames is
>+            // true, then the previous text frame's mWrapping must have been
>+            // true, and we allow breaking between text frames if at least
>+            // one of them allows breaking.

I'm a little suspicious of the this, as I mentioned above in the code that it depends on.  Can't we distinguish between breaking between the frames and breaking after the whitespace that begins the second one?  (The 0 == aTextData.mX test would then be the wrong test.)

>+      PRInt32 forcedOffset = lineLayout.GetForcedBreakPosition(GetContent());
>+      PRInt32 measureChars = textRun.mTotalNumChars;
>+      if (forcedOffset >= 0) {

This outer check should probably just be != -1 rather than >= 0, since only -1 is special.  (Then we might assert in more buggy cases.)

>+      // save last possible backup position
>+      PRInt32 lastWhitespaceSegment =
>+        endsInWhitespace ? lastSegment : lastSegment - 1;
>+      if (lastWhitespaceSegment >= 0) {
>+        lineLayout.NotifyOptionalBreakPosition(GetContent(),
>+            aTextData.mOffset + textRun.mSegments[lastWhitespaceSegment].ContentLen());
>+      }

This is depending on each segment being entirely whitespace or non-whitespace ... or is it entirely breakable or non-breakable?  Do you need to check something about the 'white-space' property?  This could is a little confusing; it deserves a comment.

>+    // Don't allow subsequent text frame to break-before. We don

We don...?

>+  if (rs == NS_FRAME_COMPLETE && 0 != aTextData.mX && endsInWhitespace &&
>+      aTextData.mWrapping && aTextData.mX <= maxWidth) {
>+    // Remember the break opportunity at the end of this frame
>+    if (lineLayout.NotifyOptionalBreakPosition(GetContent(), aTextData.mOffset))
>+      return NS_INLINE_LINE_BREAK_AFTER(rs);
>   }

Given that you do this here, why do you need BreakInBetween to consider the whitespace at the end?

>         !aMetrics.mComputeMEW &&
>+        lineLayout.GetForcedBreakPosition(GetContent()) < 0 &&
>         (lastTimeWeSkippedLeadingWS == skipWhitespace) &&

Might want to check against -1 to show that -1 is a special value, not that you're interested in negative positions.

At least I got to the end this time through.

(I didn't look closely at the removed code, though.  Perhaps I ought to see if there's anything interesting in it that you're not doing.  Except I doubt I'll be able to understand it.)
> Would be good to assert that lastCommonFrame is non-null, to catch disconnected
> trees on the same pres context (which aren't otherwise detected).

Well, returning nsnull is a reasonable thing to do in that situation. I'll just document that that means the frames are in different frame trees.

> I noticed this doesn't address my comments on the early parts of the patch --
> at least nsTextFragment and nsLayoutUtils -- in case you missed those.

I miss them, sorry. I'll have those in the next version.

> Is there corresponding code somewhere else for a break before the replaced
> element?  Should there be?  (I'm wondering since such a break isn't
> associated with the text if there's no whitespace.)

I think we don't need to record a break opportunity before replaced stuff. CanPlaceFrame will return false if it doesn't fit, we will never place it and hope to backup to before it later.

>+    if (!mLastOptionalBreakContent) {
>+      // Nowhere to roll back to, so make this fit
>+      return PR_TRUE;
>+    }
> So why do we not hit this case if we're the first long piece of text next to
> a float?  Is this return PR_TRUE needed?  (Especially if we initialize
> mLastOptionalBreak* to the beginning of the line, at least when adjacent to
> floats.)

With the latest version of the patch, we don't hit this case because we record a break opportunity at the start of any line impacted by floats. The return PR_TRUE is needed, otherwise we will not place a multi-frame word at the start of an overflowing line.

> Could you answer whether its nullness is equivalent to !InWord?  (If not,
> when are they different.)

When GetTrailingTextFrame is null then InWord should be false. (The converse is not true when the last text frame ends in whitespace.) This wasn't quite true after I added SetTrailingTextFrame(nsnull) for completely-skipped textframes, so I've fixed that. I also moved SetFlag(LL_INWORD, PR_FALSE) in nsLineLayout::ReflowFrame down next to SetTrailingTextFrame(nsnull) to make it clear things are consistent. This means we don't call SetFlag(LL_INWORD, PR_FALSE) when we didn't place the frame, but that's OK because we're not going to reflow any more frames.

> Does LL_LASTTEXTFRAME_WRAPPINGENABLED need to be a flag, or could you just
> get the style data from the frame when you need it in CanBreakBetween?

I could just get the style data. Do you think that's better?

> You should document what GetLastOptionalBreakPosition/mLastOptionalBreakContent
> being null means.  (In particular, whether it means there hasn't been one yet
> on the line, or under what conditions there's an implied break at the start
> of the line.)

There's never an implied break at the start of the line. When a line is impacted by floats, or places a float, we insert an explicit optional break opportunity (as of the last patch revision).

> You should document that a caller checking NeedsBackup should ignore the
> result if a forced break position was set.  (Either that or you should avoid
> setting LL_NEEDBACKUP when a forced break position is set.)

Actually this should never happen. If a forced break position is set, it should have been set at a point where the line won't overflow. So I turned the !HaveForcedBreakPosition() check in nsBlockframe::DoReflowInlineFrames into an assertion.

> +GetBlockFrame
> Is this always a block frame?  Is it always non-null?

Good question. It's not always a block frame but it's never null. So I renamed it to GetLineContainerFrame().

> Could you put the implementation in the .cpp file, please?

Sure.

> Is a null mOutOfFlowFrame even valid for a frame being Reflowed?  It seems
> like you shouldn't need this check.

I would like CanContinueTextRun to not crash even if it's called at an odd time.

> Could you add this as a method on nsStyleText instead?  (Perhaps called
> WhiteSpaceCanWrap, given the existing WhiteSpaceIsSignificant.)

Sure

> Why should this ever happen?  Should there be an assertion?

Yes.

>+  if (beforeOffset <= 0 || afterLength <= 0) {
> Looks like this check should be for |beforeLength| (which you'd need to add),
> not |beforeOffset|.

The idea is merely to prevent an error when we do
>+  PRUnichar lastBefore = fragBefore->CharAt(beforeOffset - 1);
below, so checking beforeOffset is the right thing. beforeOffset points to the end of the before-content. I'll add a comment about that.

> More generally, the code here and later in this function seems to respect the
> text frame's fragment range on the after text frame, but go back through to
> the beginning of the content (even if out of the text frame's fragment range)
> on the before text frame.  It seems like it should certainly be consistent,
> although it might need to look at multiple text frames at some point in the
> future (or maybe even now).

Actually it scans the entire content for both the before and after frames:
+  PRInt32 afterLength = fragAfter->GetLength() - afterOffset;
afterLength does not depend on aAfter's mContentLength. So things are consistent.

We do need to do global linebreak analysis across multiple text frames, in fact, across all text frames in a paragraph. I have a plan for this but I want to do it after the new textrun code is in, partly to reduce churn but also because my plan leverages my textrun construction code.

> I don't see how this even compiles.  Where does aSkipLeadingWhitespaceAfter
> come from?

Sorry, I was adding and removing it to check that it was needed (it is), and I must have accidentally removed it from the parameter list when I made the diff. 

> If the space is firstAfter, why don't you want the break to be in the after
> text frame, after the space?

In theory we break both before and after spaces. In practice trailing spaces are always tacked on even if there isn't enough room for them, see:
      //Even if there is not enough space for this "space", we still put it 
      //here instead of next line
We basically pre-trim it, so it appears we always break after spaces. But I want to make sure CanBreakBetween is as consistent as possible with the word breaking provided by nsTextTransformer and nsILineBreaker.

> Is there any reason this needs to walk through placeholders?

I want it to do the right thing when the before-frame is floated first-letter text, although it's possible that in the end, it doesn't make a difference.

> I don't understand this code, given that BreakInBetween returns a PRBool.  Or
> is that PRBool not a PRBool?  It's not documented that way in nsILineBreaker.

Oooooops. At one time I had changed BreakInBetween's signature. I changed it back and forgot to update CanBreakBetween. I guess my Unicode tests were all CJK that breaks anywhere...

> Could you NS_ASSERTION on the frame's type (since it's cast to nsTextFrame*)?

Sure

> check* seems like an odd name for a frame.  How about just lastTextFrame?

Indeed

> Having two different variables called firstThing and isFirstThing seems
> really dangerous.  Could you rename something to clarify this?

Renamed isFirstThing to currentWordIsFirstThing. We really should set firstThing to false only after we've finished processing the current word, so we don't need to remember its old value, but we need to set firstThing before we fall into the big nest of conditionals because some cases break out of the loop. Don't worry too much, I promise this code won't live long. (I can say that with a straight face this time because I have the new code in another tree!)

> I'm a little suspicious of the this, as I mentioned above in the code that it
> depends on.  Can't we distinguish between breaking between the frames and
> breaking after the whitespace that begins the second one?  (The 0 ==
> aTextData.mX test would then be the wrong test.)

I hope my earlier comment allayed your suspicions.

> This outer check should probably just be != -1 rather than >= 0, since only
> -1 is special.  (Then we might assert in more buggy cases.)

OK

> This is depending on each segment being entirely whitespace or non-whitespace

That is valid. I will add a comment.

> Do you need to check something about the 'white-space' property?

No because nsTextTransformer reports preformatted whitespace as "not whitespace" (except for newlines). This is confusing, the new textframe code will be much better :-).

> We don...?

Oops, fixed.

> Given that you do this here, why do you need BreakInBetween to consider the
> whitespace at the end?

Good question. Strictly speaking, it doesn't need to consider whitespace at the end. However, it could become a performance issue; I prefer to break eagerly rather than relying on backup, wherever that's convenient, and here eager breaking is no problem.

> Might want to check against -1 to show that -1 is a special value, not that
> you're interested in negative positions.

Sure.

> At least I got to the end this time through.

Thanks. It was a heroic effort!
Attached patch updated diff -w (deleted) — Splinter Review
-- Addressed comments in https://bugzilla.mozilla.org/show_bug.cgi?id=343445#c31
-- Rephrased comment
    // Check whether this frame breaks up text runs. All frames break up text
    // runs (hence return false here) except for text frames and inline containers.
-- Fixed comments by request
-- Made InWord/TrailingTextFrame state consistent
-- Fixed comments in nsLineLayout.h
-- Made nsBlockFrame::DoReflowInlineFrames do !aLineLayout.HaveForcedBreakPosition() as an assertion.
(If it fails it will actually back up multiple times, which is OK strictly speaking since we're guaranteed
to terminate as the force-break position moves strictly further back in the line, but it would be slow. Anyway
it shouldn't happen.)
-- Renamed nsLineLayout::GetBlockFrame() to GetContainerFrame().
-- Added WhiteSpaceCanWrap to nsStyleText
-- Changed textFragment null checks in nsTextFrame::CanBreakBetween to assertions
-- Re-add aSkipLeadingWhitespaceAfter to CanBreakBetween
-- Rename checkLastTextFrame to lastTextFrame and assert that it's a textframe
-- Rename isFirstThing to currentWordIsFirstThing
-- Change forcedOffset >= 0 check to forcedOffset == -1
-- requested comments in nsTextFrame
-- Change lineLayout.GetForcedBreakPosition(GetContent()) < 0 check to == -1
Attachment #242365 - Attachment is obsolete: true
Attachment #242369 - Attachment is obsolete: true
Attachment #242705 - Flags: superreview?(dbaron)
Attachment #242705 - Flags: review?(dbaron)
Attachment #242369 - Flags: superreview?(dbaron)
Attachment #242369 - Flags: review?(dbaron)
Comment on attachment 242705 [details] [diff] [review]
updated diff -w

I'll trust that you've addressed the comments here, and mark r+sr=dbaron.

I think I'm still a little suspicious of the white-space handling in CanBreakBetween, but I think you should get this in, and then we can discuss later if I have a chance to understand the whole thing a bit better.
Attachment #242705 - Flags: superreview?(dbaron)
Attachment #242705 - Flags: superreview+
Attachment #242705 - Flags: review?(dbaron)
Attachment #242705 - Flags: review+
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Doesn't seem to have affected Tp, for better or worse.
Depends on: 357351
Comment on attachment 242705 [details] [diff] [review]
updated diff -w

>+      if (forcedOffset == -1) {
>+        forcedOffset -= aTextData.mOffset;
>+        NS_ASSERTION(forcedOffset >= 0,
>+                     "Overshot forced offset, we should have already exited");
>+        if (forcedOffset >= 0 && forcedOffset < textRun.mTotalNumChars) {
>+          // Only measure up to forcedOffset characters. We still need to measure
>+          // to make sure we get the right text dimensions, even though we know
>+          // where we're going to break. This will reduce the number of chars that
>+          // fit, which we then detect as a required break.
>+          measureChars = forcedOffset;
>+        }
>+      }
I seem to be hitting this assertion a lot...
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1161589440.30473.gz
I happend to look at balsa's log for a completely different reason, but:
###!!! ASSERTION: Overshot forced offset, we should have already exited: 'forcedOffset >= 0', file /builds/tinderbox/SeaMonkey-gcc3.4/Linux_2.4.7-10_Depend/mozilla/layout/generic/nsTextFrame.cpp, line 5398
(In reply to comment #51)
> I seem to be hitting this assertion a lot...

Bug 357351.
Depends on: 358433
Depends on: 362630
Blocks: 191699
Depends on: 363448
Depends on: 364867
Depends on: 368860
Depends on: 369227
Depends on: 376164
Depends on: 398733
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: