Closed Bug 73348 Opened 24 years ago Closed 22 years ago

Block Level Elements being removed does not always cause reflow. When selection made over this area causes crash.

Categories

(Core :: Layout, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: pbwiz, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(7 files, 4 obsolete files)

I have created a script that removes the contents of a Range selected by the user. If the range contains block level elements and I then remove the contents of the range, that section of the page does not reflow. What is worse, if the user then selects that part of the page again (using the mouse) the browser crashes. I have written a page as a test case. Follow the instructions on the page and it will lead to a crash! So far it has never failed to crash for me as long as the instructions are followed. If you do not follow the instructions closely the extraction of the range's contents sometimes works property. My script does not use anything to manipulate the DOM except normal Node operations. What I think is going on is that the layout engine is not detecting the mutation of the DOM so does not "release" its resources. Thus when the cursor is dragged across the "missing" content the layout engine tries to access Nodes that are not there. I am getting the following message at the time of the crash: The instruction at "0x60a2c7fa" referenced memory at "0x0021200". The memory could not be "read". When I let Visual C run debug I get the following message when it opens up: Unhandled exception in mozilla.exe (GKHTML.DLL) 0xC0000005: Access Violation. Operating System: Windows 2000. Build: All builds including Netscape 6.01 Reproducable: always as long as the same steps are taken. Jeff Yates
Attached file Test Case HTML file (deleted) —
Keywords: crash
When I went back to workin on this I tried a work around. If I hide the DIV element named work before I do the extraction, then display it again after the extraction I NEVER run into this error. I would definitaly say this is a Layout problem. Jeff
Keywords: testcase
Hidding the content then showing it again I found does not always work like I thought, but it happens less often. This is a major stumbling block to my work and I would at least like to find out what part of the browser is causing this. Then maybe I could work around it. Can anybody that understands how to do a stack trace find out what is going on. Jeff.
Mozilla build 2001-03-26-08 on linux. I get the following assertion: ###!!! ASSERTION: offset we got from binary search is messed up: 'i<= mContentLength', file nsTextFrame.cpp, line 3063 but no crash. The lack of reflow is still there, though....
Status: UNCONFIRMED → NEW
Ever confirmed: true
Moving to m0.9.1 and reassigning to waterson.
Assignee: karnaze → waterson
Target Milestone: --- → mozilla0.9.1
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Hey Jeff, I can't actually get your test to work in a build from today (2000-06-07), but I think this sounds like it might be a dup of bug 77394. shanjian, kin: what do you think?
The code requires the prototype object to be exposed. Sometime after 0.9 this was broken (see bug http://bugzilla.mozilla.org/show_bug.cgi?id=83433). So until this get's fixed my test case will not work. Marking dependancy to bug 83433. Jeff.
Depends on: 83433
Chris this isn't a dup of bug 77394. I just attatched a version of Jeff's testcase that works around the bug 83433. It looks like we just aren't reflowing like we should?
By the way the dependency on 83433 should be removed since 83433 isn't contributing to this bug, it just breaks a specific test case.
Also, I just noticed that after the delete, I see the following message in the console whenever the page is exposed/painted: WARNING: content length exceeds fragment length, file y:\mozilla\layout\html\bas e\src\nsTextFrame.cpp, line 3072 Look familiar Chris? ;-)
As best I can tell, the DOM range stuff is pretty horribly broken. For example, I would've expected the following HTML to delete between ``BEGIN'' and ``END''. It clobbers the entirety of the first paragraph (but at least appears to get the ``end''-point right!) The test case that's attached to this bug is just way to convoluted to be useful. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> <html> <head> <script language="JavaScript"> function delete_range() { dump("*** delete_range()\n"); var range = document.createRange(); range.setStartBefore(document.getElementById("start")); range.setEndAfter(document.getElementById("end")); dump("range={(" + range.startContainer + "@" + range.startOffset + "), (" + range.endContainer + "@" + range.endOffset + ")}\n"); range.deleteContents(); } </script> </head> <body> <div> <p> before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before <span id="start">START</span> before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before before </p> <p> after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after <span id="end">END</span> after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after after </p> </div> <button onclick="delete_range();">Delete</button> </body> </html>
Keywords: testcaseqawanted
Removing dependency on 83433.
No longer depends on: 83433
Attached file further reduced (deleted) —
Keywords: qawantedtestcase
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Priority: -- → P2
Attached file test case, minimized further (deleted) —
Okay, I have a patch that fixes the simplest test case, but still doesn't appear to work when the range contains a block frame (or perhaps spans lines; not exactly sure yet). The problem is this. The XUL box frame and the HTML block frame use the NS_FRAME_IS_DIRTY and NS_FRAME_HAS_DIRTY_CHILDREN flags inconsistently. One way to look at it is that the XUL box sets NS_FRAME_IS_DIRTY conservatively, and the block frame sets it promiscuously. Specifically, if a block has only one dirty child (that is, it's only recieved on |ReflowDirtyChild()|request), it sets the NS_FRAME_HAS_DIRTY_CHILDREN bit and schedules a reflow targeted at itself. If more than one child is dirtied (more precisely, if |ReflowDirtyChild()| is called more than once before a reflow arrives), the block will cancel the reflow targeted at itself, set the NS_FRAME_IS_DIRTY bit, and call ReflowDirtyChild() on its parent. In this way, the NS_FRAME_IS_DIRTY bit propagates up the frame hierarchy, and is guaranteed to be set on at least one child of the frame for which a reflow is targeted. Box, on the other hand, _never_ schedules a reflow for itself. Instead, it manages layout by examining the NS_FRAME_HAS_DIRTY_CHILDREN method, which is the only flag that gets propagated up the frame hierarchy. Whew! So, the fix is pretty simple: change HTML block frames to be XUL-box-frame-aware. That is, an HTML block should also check to see if the NS_FRAME_HAS_DIRTY_CHILDREN flag is set when it tries to decide how to reflow one of its children.
Attached patch make HTML blocks box-frame aware (obsolete) (deleted) — Splinter Review
cc'ing layout geeks. I should note that the above patch fixes the simplest case (attachment 39573 [details], ``06/21/01 16:34 test case, minimized further'') but still doesn't fix the more complicated case.
Okay, so if I put the button back, so we have <body> <button onclick="remove_range();"> ... instead of <body onclick="remove_range();"> ... The bug re-appears, even with the patch! Coversely, once the above patch is applied, the original test case (attachment 37550 [details]) works if modified so that the |extractToBuffer()| method is called from |<body onclick="">|. Okay, on to box vs. block, round two.
Okay, this sorta sucks. What's happening is that the mouse-up is scheduling a style-change reflow targeted at the button, then before it's processed, the onclick fires that does the DOM manipulation. The DOM manipulation first schedules a dirty reflow targeted at the <div>, but the next manipulation gets coalesced. The <div> frame cancels the reflow command and calls ReflowDirtyChild() on the container GfxScrollFrame (a box). The box passes it up to the <body>, which schedules a dirty reflow targeted at itself. Now we reflow. The first reflow is the style-change targeted at the button frame. For some reason, the line containing the GFX scroll frame is being dirtied as well. So, we end up sending a style-change reflow into that line. The block code punts and converts this to a resize reflow, the GFX scroll frame detects that the resize has had no effect, and doesn't bother actually reflowing the child <div>. The bad news is, on the way out, it clears all the dirty flags. Now we process the dirty reflow that _should_ have reflowed the <div>. It's targeted at the body, but now the GFX scroll frame ``lies'' and tells us that it's not dirty, nor does it have any dirty children. So the reflow doesn't get down to the <div>, where it's needed. I may be able to fix this specific problem by figuring out how to avoid dirtying the line containing the GFX scroll frame, but in general, I don't think that solution is appropriate. (A style-change reflow could, in fact, propagate bona fide reflow damage through the flow in a situation like this.) Another problem could be that the block frame is converting the style-change to a resize reflow. But I suspect that the box will try to optimize away that, too. I think the best solution would be to make box use another frame state flag that would differentiate whether it has other dirty _box_ children (vs. dirty _block_ children). Box could then set and clear the dirty-box-children flag to its hearts content, but would only clear the dirty-block-children flag if its layout ended up winding down and reflowing a dirty block. Anyway, thought I'd make some notes. Need to poke at this some more.
Attached patch beat nsBlockReflowContext until it works (obsolete) (deleted) — Splinter Review
Okay, so the last patch fixes the bug. It doesn't appear to cause any regressions with plaintext or HTML editing that I can immediately see. I don't understand all the intricacies of the reflow reason massaging that happens here, but this change ``made sense to me''. By default BRC sets the reflow reason to resize. I changed it so that, we'll propagate that along if we're participating in an incremental reflow _unless_: - We're dirty, or we've got dirty children. In which case, we'll hijack the reason to eReflowReason_Dirty. (This is the part that fixes the bug.) - We're the target of the reflow command. In which case, we'll propagate the real reflow reason. (This was for my own sanity, and seemed correct.) Previously, we'd only propagate the real reason if we were the target of a style-change reflow, or if we were dirty and the reflow type was ``dirty''. Anyway, I'm going on a hunch here and any suggestions would be appreciated.
Keywords: patch
Chris, on your latest patch, what's up with that comment block that you changed? Should we just remove the comment? Comment on why it is commented out? As for the rest, I will trust your judgement on it, I am less clued in than you. I rely completly on testing for this type of change, so if you are heppy with the testing (and deal with the comment block somehow), [s]r=attinasi
buster commented that code out on the way out the door (r1.75, 02/11/2001) fixing bug 47549 (floater cause entire page to redraw). I figure now's as good a time as any to say sayonara to it...CVS will remember...
+ else if (mOuterReflowState.reason == eReflowReason_Incremental) { + nsIReflowCommand* rc = mOuterReflowState.reflowCommand; + NS_ASSERTION(rc != nsnull, "no reflow command"); + if (rc) { + nsIReflowCommand::ReflowType type; <-- retrieved and not used later? + rc->GetType(type); [...] Re: extension of the cases where the real reflow reason is propagated. Seems also correct to me -- but since this is an enlargement that encompasses the previous cases (no regression for sure :-), I was wondering if some unnecessary cases weren't introduced -- meaning the optimization that would have happened is now gone. I can't think of such cases, BTW, and I was just wondering if you had any thoughts about that.
Yeah, good thinking. Let me ponder that a bit. (I'll go see if floaters are repaint happy again, too...)
It's curious that this logic is duplicated in nsLineLayout::ReflowFrame().
Season for a Reflow Reason by Chris Waterson The valid reflow reasons are: eReflowReason_Initial eReflowReason_Incremental eReflowReason_Resize eReflowReason_StyleChange eReflowReason_Dirty The valid (and used) reflow command types are: ContentChanged StyleChanged ReflowDirty Timeout UserDefined Any direct reflow requests made of the PresShell (i.e., via InitialReflow(), ResizeReflow(), and StyleChangeReflow()) end up creating nsHTMLReflowState objects whose reflow reason is the obvious thing (i.e., eReflowReason_Initial, eReflowReason_Resize, and eReflowReason_StyleChange, respectively). The other top-level entry-point into reflow is in nsHTMLReflowCommand::Dispatch(). Here, the reflow reason is always set to eReflowReason_Incremental. So, we never start off the root nsHTMLReflowState with a eReflowReason_Dirty; however, there are several places where the reflow reason is subsequently converted from eReflowReason_Incremental to eReflowReason_Dirty. (One especially notable place is in nsBoxToBlockAdaptor.cpp:712, where it looks like a fair amount of spackle was applied.) The two ``most important'' places where this occurs are in nsLineLayout::ReflowFrame() and nsBlockReflowContext::ReflowBlock(). The logic there goes something like this: - By default, reflow the child frame with eReflowReason_Resize as the reflow reason. - If the frame-state flags indicate that the frame hasn't been flowed yet (i.e., NS_FRAME_FIRST_REFLOW is set), change the reason to eReflowReason_Initial. - Otherwise, if the parent reflow state's reason is eReflowReason_StyleChange, then make our reason by eReflowReason_StyleChange; i.e. always propagate style change reflows to children. - Otherwise, if the parent reflow state's reason is eReflowReason_Dirty, then - If our NS_FRAME_IS_DIRTY bit is set, make our reason be eReflowReason_Dirty. (Otherwise, leave the reason as eReflowReason_Resize.) In other words, propagate dirty reflows to children if we're marked as dirty. Propagate dirty reflows as a resize reflow otherwise. (This seems non-optimal, particularly in the case of nsBlockFrame which is likely to end up reflowing many, many lines.) - Otherwise, if the parent's reflow reason is eReflowReason_Incremental, then - If the reflow command's type is nsIReflowCommand::StyleChanged, then make our reflow reason by eReflowReason_StyleChange. (This seems excessive: only frames following the target in the flow should be affected by this, and that should be taken care of by reflow damage propagation, right?) - Otherwise, if the reflow command's type is nsIReflowCommand::Dirty, and our NS_FRAME_IS_DIRTY bit is set, then make our reflow reason be eReflowReason_Dirty. - Otherwise, leave the reason as eReflowReason_Resize. (Again, this seems excessive.) So how does nsBlockFrame handle eReflowReason_Dirty versus eReflowReason_Incremental? - If the reflow state's reason is eReflowReason_Dirty, we do absolutely no special setup for the reflow. - If the reflow state's reason is eReflowReason_Incremental, then: - If we're the target of the reflow, we check the reflow command's type: - If it's a nsIReflowCommand::StyleChange, then we prepare for a style change reflow (i.e., in the same was as if the reason were eReflowReason_StyleChange). Preparing for a style-change reflow involves marking all the lines dirty. - If it's nsIReflowCommand::Dirty, we do no special setup for the reflow. - Otherwise, if it's anything else (realistically nsIReflowCommand::ContentChanged is probably the only other thing we'd ever see), we prepare for a resize reflow (i.e., as if the reason were eReflowReason_Resize), modulo the fact that we _won't_ call DrainOverflowLines(). Preparing for a resize reflow involves (essentially) marking all the lines dirty that are wrapped, or otherwise depend on the block's width. - Otherwise, one of our children is the target of the reflow, so we pluck the next frame off the reflow command (via nsIReflowCommand::GetNext()) into nsBlockReflowState::mNextRCFrame, and prepare for a child incremental reflow. Preparing for a child incremental reflow involves marking the linebox containing the target frame as dirty. nsInlineFrame does nothing special for a dirty reflow. An incremental reflow includes plucking the next frame off the reflow command via nsIReflowCommand::GetNext(); if it's not the current frame (i.e., we're not the target), then InlineReflowState::mNextRCFrame gets the frame.
So I think that this reflow reason stuff is squirrely and bad and could use a good enema. I'd like to propose new rules for reflow-reason propagation. Specifically, I think that: - By default, a parent frame should propagate the reflow reason _as is_ to its children. (Exceptions allowed, e.g. see changes to handle the table first-pass reflow in nsBlockFrame.) - If a frame is the target of an incremental reflow, then ``the buck stops here''. The parent must use the reflow command to set the correct reflow reason appropriately for its child frames. The above patch is an attempt at making this work. It seems to work most of the time, but I've noticed that - typing in text fields (i.e. <input type="text"> is badly broken. But <textarea> seems fine, as does HTML editing. - typing tends to trip one of the new assertions I've added in nsLineLayout.cpp (``dirty reflow targeted at clean frame'') like crazy So, it's not really ready for prime-time yet. Does anyone have comments on this approach?
Attached patch more work-in-progress, start bashing box frame (obsolete) (deleted) — Splinter Review
Attached patch oops, missed a file. (deleted) — Splinter Review
Target Milestone: mozilla0.9.3 → mozilla0.9.4
Target Milestone: mozilla0.9.4 → mozilla0.9.5
karnaze, you probably know the most about incremental reflow reason propagation; do you have any comments on the above approach? (/me starts hacking on it again...) I think that bug 91423 is probably getting burned by similar problems.
Blocks: 91423
waterson, I'm not very familar with how blocks change the reflow reason, but I seem to recall some issues with incremental reflow, which you are probably addressing. I'm not sure I understand your 2nd point regarding incremental reflow. I agree that the target frame of the incremental reflow should get it as is and be responsible for processing it, if that is what you are saying. But after the target frame returns from the incremental reflow, it is not unreasonable for its parent (or some other ancestor) to resize reflow some or all of its children. Tables do this for example when a cell changes size during an incremental reflow. Regarding your comment in the patch about table cells getting reflowed 3 times, here is how tables control the reflow process (I can't speak about what blocks do). Usually, a cell will be reflowed twice, once (initial) with an unconstrained width and once (resize) after the column widths are determined. A nested table is smart about limiting this to 2 reflows to its cells. It will not do 2 passes when it is reflowed unconstrained, but wait until it is reflowed on its 2nd pass to do the 2nd pass on its cells. There is an optimization where if the table already has columns, then new cells will be reflowed with existing column widths and might end up only reflowing the new cells once. At the other extreme, I am about to check in new code where cells with percent heights or containing percent height tables may need 3 reflows especially if the table does not have a computed height.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Target Milestone: mozilla0.9.7 → mozilla0.9.8
evaughan made things much better with his changes for bug 110328 (re-write of box-to-block adaptor): we now no longer crash! Unfortunately, the original test case /still/ isn't quite right: it looks as if we're leaving the frames ``as is'' after doing an unconstrained reflow. Since we're no longer crashing, I'm going to push this one out to `Future', and then re-prioritize once I return from sabbatical.
Depends on: 110328
Keywords: crash
Target Milestone: mozilla0.9.8 → Future
Severity: critical → normal
Attachment #39590 - Attachment is obsolete: true
Attachment #39753 - Attachment is obsolete: true
Attachment #40094 - Attachment is obsolete: true
Attachment #41798 - Attachment is obsolete: true
Attachment #41458 - Flags: needs-work+
Attachment #41799 - Flags: needs-work+
No longer blocks: 91423
restoring dependency. Please do not remove dependencies without good reason (they denote a lot more than "this bug needs to be fixed to fix that bug")
Blocks: 91423
Could this be fixed by treating the IFRAME as a reflow root (a la bug 135146)?
Depends on: 135146
No longer depends on: 135146
Does this still crash Mozilla?
So.. is this still a problem? The testcases all work for me, but we haven't really made major changes to reflow, I don't think.... Though waterson _did_ land reflow coalescing after the last relevant comments on this bug.
Assignee: waterson → misc
Status: ASSIGNED → NEW
Component: Layout → Layout: Misc Code
Priority: P2 → --
QA Contact: petersen → nobody
Target Milestone: Future → ---
I vote we mark this "WORKSFORME"
Yep, after more testing, that's where we're at.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard
Component: Layout: Misc Code → Layout
Product: Core Graveyard → Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: