Closed Bug 73348 Opened 24 years ago Closed 22 years ago

Block Level Elements being removed does not always cause reflow. When selection made over this area causes crash.

Categories

(Core :: Layout, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: pbwiz, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(7 files, 4 obsolete files)

I have created a script that removes the contents of a Range selected by the 
user.  If the range contains block level elements and I then remove the 
contents of the range, that section of the page does not reflow.

What is worse, if the user then selects that part of the page again (using the 
mouse) the browser crashes.

I have written a page as a test case.  Follow the instructions on the page and 
it will lead to a crash!  So far it has never failed to crash for me as long as 
the instructions are followed.

If you do not follow the instructions closely the extraction of the range's 
contents sometimes works property.

My script does not use anything to manipulate the DOM except normal Node 
operations.  

What I think is going on is that the layout engine is not detecting the 
mutation of the DOM so does not "release" its resources.  Thus when the cursor 
is dragged across the "missing" content the layout engine tries to access Nodes 
that are not there.

I am getting the following message at the time of the crash:
The instruction at "0x60a2c7fa" referenced memory at "0x0021200".  The memory 
could not be "read".

When I let Visual C run debug I get the following message when it opens up:
Unhandled exception in mozilla.exe (GKHTML.DLL) 0xC0000005: Access Violation.

Operating System:  Windows 2000.
Build: All builds including Netscape 6.01
Reproducable: always as long as the same steps are taken.


Jeff Yates
Attached file Test Case HTML file (deleted) —
Keywords: crash
When I went back to workin on this I tried a work around.  If I hide the DIV 
element named work before I do the extraction, then display it again after the 
extraction I NEVER run into this error.

I would definitaly say this is a Layout problem.

Jeff
Keywords: testcase
Hidding the content then showing it again I found does not always work like I 
thought, but it happens less often.  

This is a major stumbling block to my work and I would at least like to find 
out what part of the browser is causing this.  Then maybe I could work around 
it.  Can anybody that understands how to do a stack trace find out what is 
going on.

Jeff.
Mozilla build 2001-03-26-08 on linux.  I get the following assertion:

###!!! ASSERTION: offset we got from binary search is messed up: 'i<=
mContentLength', file nsTextFrame.cpp, line 3063

but no crash.

The lack of reflow is still there, though....
Status: UNCONFIRMED → NEW
Ever confirmed: true
Moving to m0.9.1 and reassigning to waterson.
Assignee: karnaze → waterson
Target Milestone: --- → mozilla0.9.1
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Hey Jeff, I can't actually get your test to work in a build from today
(2000-06-07), but I think this sounds like it might be a dup of bug 77394.
shanjian, kin: what do you think?
The code requires the prototype object to be exposed.  Sometime after 0.9 this 
was broken (see bug http://bugzilla.mozilla.org/show_bug.cgi?id=83433).  So 
until this get's fixed my test case will not work.

Marking dependancy to bug 83433.

Jeff.
Depends on: 83433
Chris this isn't a dup of bug 77394. I just attatched a version of Jeff's 
testcase that works around the bug 83433.

It looks like we just aren't reflowing like we should?
By the way the dependency on 83433 should be removed since 83433 isn't 
contributing to this bug, it just breaks a specific test case.
Also, I just noticed that after the delete, I see the following message in the 
console whenever the page is exposed/painted:

WARNING: content length exceeds fragment length, file y:\mozilla\layout\html\bas
e\src\nsTextFrame.cpp, line 3072

Look familiar Chris? ;-)
As best I can tell, the DOM range stuff is pretty horribly broken. For example,
I would've expected the following HTML to delete between ``BEGIN'' and ``END''.
It clobbers the entirety of the first paragraph (but at least appears to get the
``end''-point right!) The test case that's attached to this bug is just way to
convoluted to be useful.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<script language="JavaScript">
function delete_range()
{
  dump("*** delete_range()\n");
  var range = document.createRange();
  range.setStartBefore(document.getElementById("start"));
  range.setEndAfter(document.getElementById("end"));

  dump("range={(" + range.startContainer + "@" + range.startOffset + "), (" +
                    range.endContainer + "@" + range.endOffset + ")}\n");

  range.deleteContents();
}
</script>
</head>
<body>
  <div>
    <p>
      before before before before before before before before before before
      before before before before before before before before before before
      before before before before before before before before before before
      before before before before before before before before before before
      <span id="start">START</span>
      before before before before before before before before before before
      before before before before before before before before before before
      before before before before before before before before before before
      before before before before before before before before before before
    </p>
    <p>
      after after after after after after after after after after
      after after after after after after after after after after
      after after after after after after after after after after
      after after after after after after after after after after
      <span id="end">END</span>
      after after after after after after after after after after
      after after after after after after after after after after
      after after after after after after after after after after
      after after after after after after after after after after
    </p>
  </div>
  <button onclick="delete_range();">Delete</button>
</body>
</html>
Keywords: testcaseqawanted
Removing dependency on 83433.
No longer depends on: 83433
Attached file further reduced (deleted) —
Keywords: qawantedtestcase
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Priority: -- → P2
Attached file test case, minimized further (deleted) —
Okay, I have a patch that fixes the simplest test case, but still doesn't appear
to work when the range contains a block frame (or perhaps spans lines; not
exactly sure yet).

The problem is this. The XUL box frame and the HTML block frame use the
NS_FRAME_IS_DIRTY and NS_FRAME_HAS_DIRTY_CHILDREN flags inconsistently. One way
to look at it is that the XUL box sets NS_FRAME_IS_DIRTY conservatively, and the
block frame sets it promiscuously.

Specifically, if a block has only one dirty child (that is, it's only recieved
on |ReflowDirtyChild()|request), it sets the NS_FRAME_HAS_DIRTY_CHILDREN bit and
schedules a reflow targeted at itself. If more than one child is dirtied (more
precisely, if |ReflowDirtyChild()| is called more than once before a reflow
arrives), the block will cancel the reflow targeted at itself, set the
NS_FRAME_IS_DIRTY bit, and call ReflowDirtyChild() on its parent. In this way,
the NS_FRAME_IS_DIRTY bit propagates up the frame hierarchy, and is guaranteed
to be set on at least one child of the frame for which a reflow is targeted.

Box, on the other hand, _never_ schedules a reflow for itself. Instead, it
manages layout by examining the NS_FRAME_HAS_DIRTY_CHILDREN method, which is the
only flag that gets propagated up the frame hierarchy.
Whew! So, the fix is pretty simple: change HTML block frames to be
XUL-box-frame-aware. That is, an HTML block should also check to see if the
NS_FRAME_HAS_DIRTY_CHILDREN flag is set when it tries to decide how to reflow
one of its children.
Attached patch make HTML blocks box-frame aware (obsolete) (deleted) — Splinter Review
cc'ing layout geeks. I should note that the above patch fixes the simplest case
(attachment 39573 [details], ``06/21/01 16:34 test case, minimized further'') but still
doesn't fix the more complicated case.
Okay, so if I put the button back, so we have

  <body>
  <button onclick="remove_range();">
  ...

instead of

  <body onclick="remove_range();">
  ...

The bug re-appears, even with the patch! Coversely, once the above patch is
applied, the original test case (attachment 37550 [details]) works if modified so that the
|extractToBuffer()| method is called from |<body onclick="">|.

Okay, on to box vs. block, round two.
Okay, this sorta sucks. What's happening is that the mouse-up is scheduling a
style-change reflow targeted at the button, then before it's processed, the
onclick fires that does the DOM manipulation. The DOM manipulation first
schedules a dirty reflow targeted at the <div>, but the next manipulation gets
coalesced. The <div> frame cancels the reflow command and calls
ReflowDirtyChild() on the container GfxScrollFrame (a box). The box passes it up
to the <body>, which schedules a dirty reflow targeted at itself.

Now we reflow. The first reflow is the style-change targeted at the button
frame. For some reason, the line containing the GFX scroll frame is being
dirtied as well. So, we end up sending a style-change reflow into that line. The
block code punts and converts this to a resize reflow, the GFX scroll frame
detects that the resize has had no effect, and doesn't bother actually reflowing
the child <div>. The bad news is, on the way out, it clears all the dirty flags.

Now we process the dirty reflow that _should_ have reflowed the <div>. It's
targeted at the body, but now the GFX scroll frame ``lies'' and tells us that
it's not dirty, nor does it have any dirty children. So the reflow doesn't get
down to the <div>, where it's needed.

I may be able to fix this specific problem by figuring out how to avoid dirtying
the line containing the GFX scroll frame, but in general, I don't think that
solution is appropriate. (A style-change reflow could, in fact, propagate bona
fide reflow damage through the flow in a situation like this.)
Another problem could be that the block frame is converting the style-change to
a resize reflow. But I suspect that the box will try to optimize away that, too.

I think the best solution would be to make box use another frame state flag that
would differentiate whether it has other dirty _box_ children (vs. dirty _block_
children). Box could then set and clear the dirty-box-children flag to its
hearts content, but would only clear the dirty-block-children flag if its layout
ended up winding down and reflowing a dirty block.

Anyway, thought I'd make some notes. Need to poke at this some more.
Attached patch beat nsBlockReflowContext until it works (obsolete) (deleted) — Splinter Review
Okay, so the last patch fixes the bug. It doesn't appear to cause any
regressions with plaintext or HTML editing that I can immediately see. I don't
understand all the intricacies of the reflow reason massaging that happens here,
but this change ``made sense to me''. By default BRC sets the reflow reason to
resize. I changed it so that, we'll propagate that along if we're participating
in an incremental reflow _unless_:

  - We're dirty, or we've got dirty children. In which case, we'll hijack
    the reason to eReflowReason_Dirty. (This is the part that fixes the bug.)

  - We're the target of the reflow command. In which case, we'll propagate
    the real reflow reason. (This was for my own sanity, and seemed correct.)

Previously, we'd only propagate the real reason if we were the target of a
style-change reflow, or if we were dirty and the reflow type was ``dirty''.
Anyway, I'm going on a hunch here and any suggestions would be appreciated.
Keywords: patch
Chris, on your latest patch, what's up with that comment block that you changed?
Should we just remove the comment? Comment on why it is commented out? As for
the rest, I will trust your judgement on it, I am less clued in than you. I rely
completly on testing for this type of change, so if you are heppy with the
testing (and deal with the comment block somehow), [s]r=attinasi
buster commented that code out on the way out the door (r1.75, 02/11/2001)
fixing bug 47549 (floater cause entire page to redraw). I figure now's as good a
time as any to say sayonara to it...CVS will remember...
+  else if (mOuterReflowState.reason == eReflowReason_Incremental) {
+    nsIReflowCommand* rc = mOuterReflowState.reflowCommand;
+    NS_ASSERTION(rc != nsnull, "no reflow command");
+    if (rc) {
+      nsIReflowCommand::ReflowType type;   <-- retrieved and not used later?
+      rc->GetType(type);
[...]

Re: extension of the cases where the real reflow reason is propagated.
Seems also correct to me -- but since this is an enlargement that encompasses
the previous cases (no regression for sure :-), I was wondering if some 
unnecessary cases weren't introduced -- meaning the optimization that would have 
happened is now gone. I can't think of such cases, BTW, and I was just wondering 
if you had any thoughts about that.
Yeah, good thinking. Let me ponder that a bit. (I'll go see if floaters are
repaint happy again, too...)
It's curious that this logic is duplicated in nsLineLayout::ReflowFrame().
                         Season for a Reflow Reason
                                    by
                              Chris Waterson

The valid reflow reasons are:

  eReflowReason_Initial
  eReflowReason_Incremental
  eReflowReason_Resize
  eReflowReason_StyleChange
  eReflowReason_Dirty

The valid (and used) reflow command types are:

    ContentChanged
    StyleChanged
    ReflowDirty
    Timeout
    UserDefined

Any direct reflow requests made of the PresShell (i.e., via
InitialReflow(), ResizeReflow(), and StyleChangeReflow()) end up
creating nsHTMLReflowState objects whose reflow reason is the obvious
thing (i.e., eReflowReason_Initial, eReflowReason_Resize, and
eReflowReason_StyleChange, respectively).

The other top-level entry-point into reflow is in
nsHTMLReflowCommand::Dispatch(). Here, the reflow reason is always set
to eReflowReason_Incremental.

So, we never start off the root nsHTMLReflowState with a
eReflowReason_Dirty; however, there are several places where the
reflow reason is subsequently converted from eReflowReason_Incremental
to eReflowReason_Dirty. (One especially notable place is in
nsBoxToBlockAdaptor.cpp:712, where it looks like a fair amount of
spackle was applied.)

The two ``most important'' places where this occurs are in
nsLineLayout::ReflowFrame() and
nsBlockReflowContext::ReflowBlock(). The logic there goes something
like this:

  - By default, reflow the child frame with eReflowReason_Resize as
    the reflow reason.

  - If the frame-state flags indicate that the frame hasn't been
    flowed yet (i.e., NS_FRAME_FIRST_REFLOW is set), change the reason
    to eReflowReason_Initial.

  - Otherwise, if the parent reflow state's reason is
    eReflowReason_StyleChange, then make our reason by
    eReflowReason_StyleChange; i.e. always propagate style change
    reflows to children.

  - Otherwise, if the parent reflow state's reason is
    eReflowReason_Dirty, then

    - If our NS_FRAME_IS_DIRTY bit is set, make our reason be
      eReflowReason_Dirty. (Otherwise, leave the reason as
      eReflowReason_Resize.)

    In other words, propagate dirty reflows to children if we're
    marked as dirty. Propagate dirty reflows as a resize reflow
    otherwise. (This seems non-optimal, particularly in the case of
    nsBlockFrame which is likely to end up reflowing many, many
    lines.)

  - Otherwise, if the parent's reflow reason is
    eReflowReason_Incremental, then

    - If the reflow command's type is nsIReflowCommand::StyleChanged,
      then make our reflow reason by eReflowReason_StyleChange.

      (This seems excessive: only frames following the target in the
      flow should be affected by this, and that should be taken care
      of by reflow damage propagation, right?)

    - Otherwise, if the reflow command's type is
      nsIReflowCommand::Dirty, and our NS_FRAME_IS_DIRTY bit is set,
      then make our reflow reason be eReflowReason_Dirty.

    - Otherwise, leave the reason as eReflowReason_Resize.

      (Again, this seems excessive.)

So how does nsBlockFrame handle eReflowReason_Dirty versus
eReflowReason_Incremental?

  - If the reflow state's reason is eReflowReason_Dirty, we do
    absolutely no special setup for the reflow.

  - If the reflow state's reason is eReflowReason_Incremental, then:

    - If we're the target of the reflow, we check the reflow command's
      type:

      - If it's a nsIReflowCommand::StyleChange, then we prepare for a
        style change reflow (i.e., in the same was as if the reason
        were eReflowReason_StyleChange).

        Preparing for a style-change reflow involves marking all the
        lines dirty.

      - If it's nsIReflowCommand::Dirty, we do no special setup for
        the reflow.

      - Otherwise, if it's anything else (realistically
        nsIReflowCommand::ContentChanged is probably the only other
        thing we'd ever see), we prepare for a resize reflow (i.e., as
        if the reason were eReflowReason_Resize), modulo the fact that
        we _won't_ call DrainOverflowLines().

        Preparing for a resize reflow involves (essentially) marking
        all the lines dirty that are wrapped, or otherwise depend on
        the block's width.

    - Otherwise, one of our children is the target of the reflow, so
      we pluck the next frame off the reflow command (via
      nsIReflowCommand::GetNext()) into
      nsBlockReflowState::mNextRCFrame, and prepare for a child
      incremental reflow.

      Preparing for a child incremental reflow involves marking the
      linebox containing the target frame as dirty.

nsInlineFrame does nothing special for a dirty reflow. An incremental
reflow includes plucking the next frame off the reflow command via
nsIReflowCommand::GetNext(); if it's not the current frame (i.e.,
we're not the target), then InlineReflowState::mNextRCFrame gets the
frame.
So I think that this reflow reason stuff is squirrely and bad and could use a
good enema. I'd like to propose new rules for reflow-reason propagation.
Specifically, I think that:

  - By default, a parent frame should propagate the reflow reason _as is_
    to its children. (Exceptions allowed, e.g. see changes to handle the
    table first-pass reflow in nsBlockFrame.)

  - If a frame is the target of an incremental reflow, then ``the buck
    stops here''. The parent must use the reflow command to set the correct
    reflow reason appropriately for its child frames.
The above patch is an attempt at making this work. It seems to work most of the
time, but I've noticed that

  - typing in text fields (i.e. <input type="text"> is badly broken. But
    <textarea> seems fine, as does HTML editing.

  - typing tends to trip one of the new assertions I've added in 
    nsLineLayout.cpp (``dirty reflow targeted at clean frame'') like crazy


So, it's not really ready for prime-time yet. Does anyone have comments on this
approach?
Attached patch more work-in-progress, start bashing box frame (obsolete) (deleted) — Splinter Review
Attached patch oops, missed a file. (deleted) — Splinter Review
Target Milestone: mozilla0.9.3 → mozilla0.9.4
Target Milestone: mozilla0.9.4 → mozilla0.9.5
karnaze, you probably know the most about incremental reflow reason propagation;
do you have any comments on the above approach? (/me starts hacking on it again...)

I think that bug 91423 is probably getting burned by similar problems.
Blocks: 91423
waterson, I'm not very familar with how blocks change the reflow reason, but I 
seem to recall some issues with incremental reflow, which you are probably 
addressing. I'm not sure I understand your 2nd point regarding incremental 
reflow. I agree that the target frame of the incremental reflow should get it as 
is and be responsible for processing it, if that is what you are saying. But 
after the target frame returns from the incremental reflow, it is not 
unreasonable for its parent (or some other ancestor) to resize reflow some or 
all of its children. Tables do this for example when a cell changes size during 
an incremental reflow. 

Regarding your comment in the patch about table cells getting reflowed 3 times, 
here is how tables control the reflow process (I can't speak about what blocks 
do). Usually, a cell will be reflowed twice, once (initial) with an 
unconstrained width and once (resize) after the column widths are determined. A 
nested table is smart about limiting this to 2 reflows to its cells. It will not 
do 2 passes when it is reflowed unconstrained, but wait until it is reflowed on 
its 2nd pass to do the 2nd pass on its cells. There is an optimization where if 
the table already has columns, then new cells will be reflowed with existing 
column widths and might end up only reflowing the new cells once. At the other 
extreme, I am about to check in new code where cells with percent heights or 
containing percent height tables may need 3 reflows especially if the table does 
not have a computed height. 
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Target Milestone: mozilla0.9.7 → mozilla0.9.8
evaughan made things much better with his changes for bug 110328 (re-write of
box-to-block adaptor): we now no longer crash!

Unfortunately, the original test case /still/ isn't quite right: it looks as if
we're leaving the frames ``as is'' after doing an unconstrained reflow.

Since we're no longer crashing, I'm going to push this one out to `Future', and
then re-prioritize once I return from sabbatical.
Depends on: 110328
Keywords: crash
Target Milestone: mozilla0.9.8 → Future
Severity: critical → normal
Attachment #39590 - Attachment is obsolete: true
Attachment #39753 - Attachment is obsolete: true
Attachment #40094 - Attachment is obsolete: true
Attachment #41798 - Attachment is obsolete: true
Attachment #41458 - Flags: needs-work+
Attachment #41799 - Flags: needs-work+
No longer blocks: 91423
restoring dependency.  Please do not remove dependencies without good reason 
(they denote a lot more than "this bug needs to be fixed to fix that bug")
Blocks: 91423
Could this be fixed by treating the IFRAME as a reflow root (a la bug 135146)?
Depends on: 135146
No longer depends on: 135146
Does this still crash Mozilla?
So.. is this still a problem?  The testcases all work for me, but we haven't
really made major changes to reflow, I don't think.... Though waterson _did_
land reflow coalescing after the last relevant comments on this bug.
Assignee: waterson → misc
Status: ASSIGNED → NEW
Component: Layout → Layout: Misc Code
Priority: P2 → --
QA Contact: petersen → nobody
Target Milestone: Future → ---
I vote we mark this "WORKSFORME"
Yep, after more testing, that's where we're at.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard
Component: Layout: Misc Code → Layout
Product: Core Graveyard → Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: