Closed Bug 310638 Opened 19 years ago Closed 19 years ago

Crash [@ DoDeletingFrameSubtree] [@ DeletingFrameSubtree]

Categories

(Core :: Layout, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: jruderman, Assigned: MatsPalmgren_bugz)

References

Details

(4 keywords, Whiteboard: [sg:critical?] uses freed memory[rft-dl])

Crash Data

Attachments

(11 files, 6 obsolete files)

(deleted), image/svg+xml
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), patch
bzbarsky
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
The testcase crashes [@ DoDeletingFrameSubtree].  

While reducing the testcase, I also saw crashes at:
[@ DeletingFrameSubtree] - exploitable, i saw it call many random functions and
random non-code addresses
[@ nsCSSFrameConstructor::WipeContainingBlock]
[@ nsBlockFrame::DrainOverflowLines]
[@ nsFrameManager::CaptureFrameStateFor] (upon leaving the page)

See also bug 310436, another crash involving mixed SVG and HTML and DOM
manipulation.
Summary: Crash [@ DoDeletingFrameSubtree] → Crash [@ DoDeletingFrameSubtree] involving mixed SVG and HTML
Whiteboard: [sg:fix]
Attached image testcase (deleted) —
Build: a debug trunk build from about 2 hours ago, on Mac OS X.
Taking bug, since I have a fix for this (and hopefully a few others)...
Assignee: general → mats.palmgren
OS: MacOS X → All
Hardware: Macintosh → All
Attached file Dump 1, DoRemoveFrame() bug (deleted) —
I have found two rather serious bugs, the first one is that
nsBlockFrame::DoRemoveFrame() fails to find the deleted frame
in some situations when the next-in-flow is in the overflow list.
See attached frame dump.
Attached file Dump 2, DoDeletingFrameSubtree() bug (deleted) —
The second bug is in nsCSSFrameConstructor.cpp:DoDeletingFrameSubtree()
which schedules out-of-flow (OOF) frames to be destroyed twice.
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/base/nsCSSFrameConstructor.cpp&rev=1.1138&root=/cvsroot&mark=9557#9513

DoDeletingFrameSubtree() check that the OOF-frame is not a descendant
of the "start frame" |aRemovedFrame| before adding it to the destroy
list, but it also needs to check if the OOF-frame is a descendant of
any of the frames already added to the destroy list.
The attached frame dump illustrates a case where this happens
(which is from the testcase of bug 307992).

Also, in the case the frame is a descendant of |aRemovedFrame| or
a frame in the destroy list, then we shouldn't traverse it again.
Attached patch Patch rev. 1 (obsolete) (deleted) — Splinter Review
Attached patch Patch rev. 1 (diff -w) (obsolete) (deleted) — Splinter Review
The two files can be reviewed independently, as you see fit.

I'm pretty sure DoDeletingFrameSubtree() isn't correct for popup frames
(assertions in bug 307992 indicates so), but I have kept the original logic
for them for now. Does anyone know why they needs this special handling?
Attachment #198183 - Flags: superreview?(dbaron)
Attachment #198183 - Flags: review?(bzbarsky)
So the DoDeletingFrameSubtree changes make it so removing a non-positioned frame
with N absolutely positioned children (or rather placeholders for them) is
O(N^2), do they not?  For each placeholder we'll walk the entire list of queued
up items...

I was thinking about this... Is it true that the following invariant holds?

  The parent of an out-of-flow is always an ancestor of the out-of-flow's
  placeholder.

This definitely holds as far as I can see except perhaps for page splitting, but
if it fails there we have an issue already, since we could have the placeholder
under one frame and the out-of-flow under some in-flow and we'd fail to deal
already.

If this invariant holds, I think we can enforce the invariant we want (that no
frame ends up on the deletion queue which has an ancestor already in that queue)
by doing the following: when recursing into an out-of-flow that we queued up for
deletion, pass that out-of-flow in as the aRemovedFrame.  Indeed, say we do
this.  Then we definitely have the following invariant:

  Any time we're looking at a placeholder (call it A), in DoDeletingFrameSubtree
  it will be a descendant of aRemovedFrame.

Indeed, if we're looking at a placeholder (A) and it's not a descendant of
aRemovedFrame then it must have some out-of-flow ancestor that was also not a
descendant of aRemovedFrame (and hence would have been queued for destruction
and become aRemovedFrame itself, which is a contradiction).  We also have the
invariant: 

  No strict ancestor of aRemovedFrame is being deleted

(this is the invariant we're maintaining, so we can assume we've preserved it
for frames we've hit before now).  Now say we're looking at an out-of-flow
(called B) which has A as its placeholder.  We have the following two options:

1)  B has no ancestor that's being deleted, so B should end up in our queue.
    Since aRemovedFrame is guaranteed to be something we're deleting and B
    has no ancestor being deleted, no ancestor of B is aRemovedFrame, so
    we'll put B in our queue.
2)  B has an ancestor that's being deleted.  In particular, B's parent (call it
    C) is being deleted.  C is an ancestor of A.  So is aRemovedFrame.  So
    either C is a strict ancestor of aRemovedFrame, or aRemovedFrame is an
    ancestor of B.  But no strict ancestor of aRemovedFrame is being deleted,
    and C is being deleted, so aRemovedFrame must be an ancestor of B.

I haven't looked at the rest of the patch yet, but does this part make sense?
I think your deduction is correct, so it comes down to if the initial
invariant holds. I have found one case where it doesn't:

<fieldset style="position:relative">
  <legend><span style="position:absolute">abs</span></legend>
</fieldset>

Results in something like this:

FieldSet(fieldset)(1)@0x85e99f4 [view=0x85ebad8] ...<
  Legend(legend)(1)@0x85e9d08 next=0x85e9a48 ...<
    Placeholder(span)(0)@0x85eb634 outOfFlowFrame=Area(span)(0)@0x85eb5e0
  >
  Area(fieldset)(1)@0x85e9a48 ...<
    Absolute-list<
      Area(span)(0)@0x85eb5e0 ...<
      >
    >
  >
>

Maybe other complex frames, e.g. comboboxes have the same problem?

I am also a bit worried about splitting as you mentioned. I'm not sure
what you meant by "fail to deal already" - did you mean the current code
or my patch, if the latter, how so?

Also, we have bugs... bug 306534 and bug 223064 comes to mind;
I don't advocate coding around bugs in general, but since this one is
exploitable (comment 0) I think we should consider that here until we are
confident the invariant holds in all cases before optimizing.

My guess is that the O(N^2) ancestor check for the out-of-flows is probably
a minor performance problem compared to the overall reframing process anyway?
(N=number of out-of-flows, which is typically low)
> I have found one case where it doesn't

Hmm...  I'll need to think through this example carefully.  I'll comment
tomorrow when I've had a chance to do that.

> I'm not sure what you meant by "fail to deal already"

If we have two in-flows and the placeholder is a child of the first while the
out-of-flow is a child of the second, we'll queue up the out-of-flow for
destruction, then destroy things in which order?  I suppose if we destroy first
in-flow, then out-of-flow, then second in-flow it'll actually work... is that
guaranteed?

> My guess is that the O(N^2) ancestor check for the out-of-flows is probably
> a minor performance problem compared to the overall reframing process anyway?

Reframing is O(N), so for large N (hundreds, say) I suspect it's not, based on
other profiles I've seen.  And it's easy to find pages with hundreds or even
thousands of positioned elements -- google cache for any PDF, for example.  Any
page with a complicate menu system.  Lots of DHTML stuff.  So an O(N^2)
algorithm here _will_ bite us.
So apart from the fact that I think it's a bug that we're putting the abs pos
frame there, I think my deduction is still ok if we replace the invariant:

  The parent of an out-of-flow is always an ancestor of the out-of-flow's
  placeholder.

By the more relevant invariant:

  If an ancestor of an out-of-flow is being deleted, then the out-of-flow and
  its placeholder share an ancestor that is also being deleted.

And then replace the reasoning in part 2 with:

 2) B has an ancestor that's being deleted.  In particular, B has an ancestor
    being deleted that is also an ancestor of A.  Call this ancestor C.  Now C 
    is an ancestor of A.  So is aRemovedFrame.  So either C is a strict ancestor 
    of aRemovedFrame, or aRemovedFrame is an ancestor of B.  But no strict 
    ancestor of aRemovedFrame is being deleted, and C is being deleted, so
    aRemovedFrame must be an ancestor of B.
Oh the "more relevant invariant" holds for fieldset and company because we'd not
be deleting the content insertion frame without deleting the outer frame too.

For in-flows, I think we're still as ok as we are now with this setup.
I think splitting is OK for floats. I worked *very* hard to make sure that the
invariant in comment #8 is true for floats.

Abs-pos frames in split blocks are a potential problem. There's a bug on it: bug
288357 --- which currently doesn't crash, but I suspect the problem is still there.
Comment on attachment 198183 [details] [diff] [review]
Patch rev. 1 (diff -w)

I'd really prefer to not introduce the O(N^2) behavior here if we can make my
suggestion work.  If you're convinced that it won't, rerequest review, I guess.
Attachment #198183 - Flags: review?(bzbarsky) → review-
Blocks: 310426
Attachment #198183 - Flags: superreview?(dbaron)
Whiteboard: [sg:fix] → [sg:fix] [patch]
I did a bit more digging, and the whole aRemovedFrame thing is an optimization.
 That is, we could queue up all the out-of-flows if we really wanted (since
those are processed via RemoveFrame from their parent before we destroy the
aRemovedFrame), but that would just be slower than what we do now.

So given that, I think my suggestion should work quite nicely....
Blocks: 317854
Mats will you have any time to look into this over the next few weeks?
Is there any relationship to bug 315752?
I will make an updated patch soon (before new year) and also have a look
at bug 315752.
(In reply to comment #17)
> Is there any relationship to bug 315752?

That appears to back-out and re-fix bug 117984 (bug 315752 comment 3) which was a post-1.8 branch trunk fix; the bug 315752 testcase does not crash 1.8

This bug, in contrast, does crash the 1.8 branch.
Flags: blocking1.8.0.1?
Whiteboard: [sg:fix] [patch] → [sg:critical?] uses freed memory
I think this is the fix that Boris suggested in comment 8.
Attached file Dump 3 (deleted) —
This dump is from running RandomStyles (bug 306939) on the URL
http://www.csszengarden.com/ with the patch above (attachment 206982 [details] [diff] [review]).

The initial (DeletingFrameSubtree) frame is 0x88f04ec (lime),
then we recurse on the following out-flow-frames:
0x89a7b34 (pink) 
  0x8a67f78 (red)
    0x8999498 (cyan)
      0x8a9afb8 (blue)
where we find that (blue) has an ancestor already in the destroy
queue (red).

An ancestor of (red) that is also being deleted is 0x8999498 (cyan)
which is not ancestor of the placeholder (0x89fdb8c) -
i.e your updated invariant in comment 11:

  If an ancestor of an out-of-flow is being deleted, then the out-of-flow and
  its placeholder share an ancestor that is also being deleted.

does not hold (or I misunderstood your suggested fix).

On the other hand, adding (blue) to the destroy queue in this case
does not crash because it's added after its OOF ancestors (red, cyan)
and we're processing the destruction in reverse order:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/base/nsCSSFrameConstructor.cpp&rev=1.1126&root=/cvsroot&mark=9473-9475#9438

NOTE: In the trace I printed the destroy queue in reverse order, i.e.
"Destroy Queue: [ ... ]" is in the order the frames will be
destroyed by DeletingFrameSubtree().
Attached file Dump 4 (deleted) —
Here is another dump (same test) where we are not so lucky.
The interesting bit is about 4 pages down, search for
"UnregisterPlaceholderFrame". We are about to add (blue)
a second time, this will crash eventually...
(See also the last "Destroy Queue [ ... ]" printout)
I've analyzed this tree a bit:

DeletingFrameSubtree 0x8881948 (red) - loop normal flow list:

child Area(div)(5)@0x87302e8 - loop normal flow list:

placeholder 0x8809d9c -> 0x87f6034 (magenta)
  placeholder 0x87ef7c8 -> 0x89acf50 (white)
    placeholder 0x87e1464 -> 0x880a81c (blue)   <------------+
      placeholder 0x8849128 -> 0x886dd74 (lime)              |
      placeholder 0x87ce870 -> 0x88bc380                     |
    placeholder 0x8453dec -> 0x884f354 (yellow)              |
      placeholder 0x872f9bc -> 0x881954c                     |
    placeholder 0x8819dec -> 0x8829e20 (cyan)                |
    placeholder 0x886926c -> 0x886dc08                       |
    placeholder 0x87db1dc -> 0x8823968 (pink)                |
      placeholder 0x87df878 -> 0x8904a08                     |
    placeholder 0x87d1848 -> 0x88f09d8                       |
placeholder 0x89ad26c -> 0x8952590                           |
placeholder 0x87eeb50 -> 0x88ecf9c                           |
placeholder 0x8740b6c -> 0x87db8c8                           |
  placeholder 0x87304fc -> 0x8955684                         |
    placeholder 0x885c838 -> 0x89a38d8                       |
    placeholder 0x87dfcc8 -> 0x89a3b88                       |
      placeholder 0x87bc308 -> 0x89a3f10                     |
        placeholder 0x873aed4 -> 0x89526a8                   |
    placeholder 0x849503c -> 0x8952a78                       |
      placeholder 0x87f2430 -> 0x8952e94                     |
    placeholder 0x884e160 -> 0x87d1d44                       |
    placeholder 0x8860410 -> 0x8952734                       |
      placeholder 0x87b27d8 -> 0x89529dc                     |
        placeholder 0x87ef8f0 -> 0x89b1020                   |
                                                             |
child Area(div)(5)@0x87302e8 - loop Absolute-List:           |
  TableOuter(ul)(3)@0x89acf50                                |
    0x87e1464 -> 0x880a81c (blue)  --------------------------+
Comment on attachment 206984 [details]
Dump 4

(The file was to large so I had to delete some stuff, so the interesting bit is now at the top, not 4 pages down as I said)
Attached patch Patch rev. 2 (obsolete) (deleted) — Splinter Review
Now the good news, your suggestion + nulling out the out-of-flow
pointers as we go works and isn't O(N^2). We pay the price for
not checking ancestors in the destroy queue by sometimes walking
the same out-of-flow twice but the second time all placeholders
will be nulled out so we don't enter the OOF on the destroy queue
(as we saw in "Dump 4"), this case is rare(?) I think.

This patch fixes this bug and bug 317854 and also some of the
crashes produced by "Random styles", for example "Crash 1" in
bug 316599 and some stacks for bug 316639. I've tested this patch
extensively with both StirDOM and RandomStyles and my impression
is that it makes them more stable. However, I've seen at least
one more bug that is related to out-of-flow frames - I'll report
my findings in a separate bug (I'm pretty sure it's not in
DeletingFrameSubtree() this time).
Attachment #198182 - Attachment is obsolete: true
Attachment #198183 - Attachment is obsolete: true
Attachment #206985 - Flags: superreview?(bzbarsky)
Attachment #206985 - Flags: review?(bzbarsky)
Attached patch Patch rev. 2 (diff -w) (obsolete) (deleted) — Splinter Review
BTW, this crash is generic and occurs without SVG.
Component: SVG → Layout
Summary: Crash [@ DoDeletingFrameSubtree] involving mixed SVG and HTML → Crash [@ DoDeletingFrameSubtree] [@ DeletingFrameSubtree]
Blocks: 316639
> where we find that (blue) has an ancestor already in the destroy
> queue (red).

Hmm... Looking at the frame dump, the relevant markup looks to me something like (based on placeholder chains, etc):

<div id="lime" note="has no positioned non-table ancestors">
  <div id="pink" style="position: absolute">
    <div id="red" style="position: fixed">
      <p id="container" style="position: absolute">
        <span id="cyan" style="position: fixed; display: table">
          <acronym id="blue" style="position: absolute">

So the problem here is that a positioned table SHOULD be an abs pos containing block but isn't one.  Hence we end up using the id="container" frame as the containing block for the id="blue" frame.  Which does violate my invariant, as you pointed out.  Tables again.  :(

I guess we can't fix this on branch by just disallowing positioning of things that would end up with the table as containing block (that is, by pushing a null absolute containing block when we hit a fixed or absolute table)... :(  On trunk, of course, we should simply fix this issue (and any other similar issues) and change this code back to rely on the invariant I described; we should file a new bug on doing that.
> Here is another dump (same test) where we are not so lucky.

So in this case the basic markup is:

<div id="red" style="position: absolute">
  <ul id="white" style="position: absolute; display: table">
    <li id="blue" style="position: fixed">

etc, and the point is that we look at the id="white" node twice -- once because we have an in-flow placeholder pointing to it, and once because it's in our absolute list, right?  That seems somewhat unfortunate (if only because nesting N abs pos frames inside each other will mean 2^N calls to DoDeletingFrameSubtree as far as I can see).

Perhaps the walk over frame lists in DoDeletingFrameSubtree should skip the absolute, fixed, and float child lists?  After all, if a frame has such a child, the corresponding placeholder will also be reachable from it (even in these cases), and hence we'll tear it down properly...  at least I think so.

Mats, what do you think?

Also, you seem to have some nice debugging code for this method in your tree; it might be worth checking in.
Blocks: 321901
(In reply to comment #27)
> So the problem here is that a positioned table SHOULD be an abs pos
> containing block but isn't one.

Right, that seems to be the problem here.


(In reply to comment #28)
> and the point is that we look at the id="white" node twice -- once because
> we have an in-flow placeholder pointing to it, and once because it's in our
> absolute list, right? 

Yes. I still think it would have been a rare exception though, because
the IsProperAncestorFrame() test would have pruned most of the
calls from placeholders?


> Perhaps the walk over frame lists in DoDeletingFrameSubtree should skip the
> absolute, fixed, and float child lists?  After all, if a frame has such a
> child, the corresponding placeholder will also be reachable from it (even in
> these cases), and hence we'll tear it down properly...  at least I think so.

Good idea. This also made it easier to verify the correctness of the tree,
which got me some more data on that elusive third bug I was talking about -
I have filed bug 321901 on it.

(BTW, maybe this is what Troy meant with his comment on the child lists
 in DoCleanupFrameReferences() (bug 315752 comment 3) ?)


> Also, you seem to have some nice debugging code for this method in your tree;

Yes, the problem is I have to much of it ;-)
I need to clean it up a bit first... I'll file a separate bug on this
if that's ok.

Perhaps it would also be an improvment if we changed all the
DeletingFrameSubtree() callers so that they call a new function, e.g.
RemoveFrameSubtree() that takes care of the whole thing,
instead of them manually dealing with placeholders etc.
That would also allow the verification code do a better job since
it currently does not cover the initial "aRremovedFrame" in case it's a
placeholder/out-of-flow.

Also, I think you're right in your comment on DoCleanupFrameReferences()
that it can be merged with DeletingFrameSubtree().
(we need to look again at why it was necessary to walk the out-of-flow
lists for bug 315752, maybe the list we really needed was "popupList" ?)

Finally, I don't particularly like the special handling of popup frames...
I think the root cause is that the menu frame hides the child list from the
frame constructor (for performance reasons only?):
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/xul/base/src/nsMenuFrame.cpp&rev=1.305&root=/cvsroot&mark=295-306#292
Do you know if that performance argument still holds?
I think it would be an improvement if they could be handled like any other
frame...
Attached patch Patch rev. 3 (obsolete) (deleted) — Splinter Review
Recurse on all out-of-flows from placeholders, but skip those child lists.
I also removed the "aPresShell" parameter since it wasn't used.
Attachment #206985 - Attachment is obsolete: true
Attachment #206986 - Attachment is obsolete: true
Attachment #207178 - Flags: superreview?(bzbarsky)
Attachment #207178 - Flags: review?(bzbarsky)
Attachment #206985 - Flags: superreview?(bzbarsky)
Attachment #206985 - Flags: review?(bzbarsky)
Attached patch Patch rev. 3 (diff -w) (obsolete) (deleted) — Splinter Review
> because the IsProperAncestorFrame() test would have pruned most of the calls
> from placeholders?

Ah, good point.

> (BTW, maybe this is what Troy meant with his comment on the child lists
> in DoCleanupFrameReferences() (bug 315752 comment 3) ?)

I doubt it.  I think he meant that since we manually clean up the still-pending out-of-flows we should be ok.  But that assumes that all nonprincipal lists are out-of-flows, which is not the case.

> I'll file a separate bug on this if that's ok.

Sounds like a plan.

> Do you know if that performance argument still holds?

No idea.  Ask bryner, since it's his code?  I can't follow what changes were made there, esp. since none of the bugs have a diff in them.

I agree that unless there are _very_ strong reasons to do something wacky here we should just make popups work just like other frames.

Brian, I realize it's been close to 6 years, but do you recall what the deal here was?
Mats, does that patch work even given the table containing block bustage?
Blocks: 322348
(In reply to comment #33)
> Mats, does that patch work even given the table containing block bustage?

Yes. The reason the table problem violates the invariant is that we
process the out-of-flow both from the placeholder and the child list.
This shouldn't be a problem anymore since we don't walk the out-of-flow
child lists.

Actually, I think the whole test:

if (outOfFlowFrame->GetStyleDisplay()->mDisplay == NS_STYLE_DISPLAY_POPUP ||
    !nsLayoutUtils::IsProperAncestorFrame(aRemovedFrame, outOfFlowFrame)) {

could probably be removed if we wanted to, since we add the out-of-flow
to the queue before recursing and we destroy frames from the end.
I'm not sure which is faster though; doing IsProperAncestorFrame() for
every out-of-flow or adding every out-of-flow to the queue and removing
it one by one (implies a child list search). I'm guessing that the frame
that owns the out-of-flow lists destroys them more efficiently.
I checked the ratio of IsProperAncestorFrame() out-of-flows and it
seems to be 40-60% for RandomStyles depending on the markup, and
30-90% for StirDOM (fixing the table problem will increase those numbers?)
I suggest we keep the test for now, since we know we have
other bugs concerning the placeholder/out-of-flows.

There is one difference between the suggested patch and the current
code I'd like to point out. If we are processing a frame tree where
a placeholder is missing (a bug of course) the current code is more
robust since it will find the out-of-flow on a child list.
If that out-of-flow itself contains placeholders the suggested patch
would miss those. (This is what causes bug 321901?)
Overall, in StirDOM/RandomStyles tests, the suggested patch is
much more stable than the current code though.
Flags: blocking1.8.0.2?
Flags: blocking1.8.0.1?
Flags: blocking1.8.0.1-
Blocks: 321894
> I'm guessing that the frame that owns the out-of-flow lists destroys them more
> efficiently.

Yeah, it does.

> If we are processing a frame tree where a placeholder is missing (a bug of
> course) the current code is more robust since it will find the out-of-flow on
> a child list.

While true, I think we currently crash if an out of flow has no corresponding placeholder...  so I wouldn't worry too much about that case.  ;)




Comment on attachment 207178 [details] [diff] [review]
Patch rev. 3

>Index: layout/base/nsCSSFrameConstructor.cpp

> DoDeletingFrameSubtree(nsPresContext*  aPresContext,

>+      } else {
...
>+        if (outOfFlowFrame->GetStyleDisplay()->mDisplay == NS_STYLE_DISPLAY_POPUP ||
>             !nsLayoutUtils::IsProperAncestorFrame(aRemovedFrame, outOfFlowFrame)) {
>+          aDestroyQueue.AppendElement(outOfFlowFrame);

Should we assert that outOfFlowFrame is not in aDestroyQueue here?  I think that would be a good idea.

>+        // Always recurse into the out-of-flow since we don't walk those lists,
>+        // see |childListName| increment below.
>+        DoDeletingFrameSubtree(aPresContext, aFrameManager, aDestroyQueue,
>+                               outOfFlowFrame, outOfFlowFrame);

I think we should only pass outOfFlowFrame as the removed frame if we stuck it in the destroy queue above.  Otherwise, if it's being deleted normally we might as well compare its descendants to our original frame...

> DeletingFrameSubtree(nsPresContext*  aPresContext,

>       GetSpecialSibling(aFrameManager, aFrame, &specialSibling);
>       if (specialSibling)
>-        DeletingFrameSubtree(aPresContext, aPresShell, aFrameManager,
>-                             specialSibling);
>+        DoDeletingFrameSubtree(aPresContext, aFrameManager, destroyQueue,
>+                               specialSibling, specialSibling);

I don't think this change is right.  In particular, if we have three special siblings in a row (two inlines and a block in a full {ib} split), this will fail to call DoDeletingFrameSubtree on the third one, no?

>@@ -10032,17 +10016,16 @@ UpdateViewsForTree(nsPresContext* aPresC
>           nsIFrame* outOfFlowFrame =
>             nsPlaceholderFrame::GetRealFrameForPlaceholder(child);
>-          NS_ASSERTION(outOfFlowFrame, "no out-of-flow frame");

Why?  This assertion should stay, no?

>Index: layout/generic/nsBlockFrame.cpp

I'm really not comfortable reviewing this code; could you get roc or dbaron to look at it?

r=bzbarsky on the CSSFrameConstructor changes with my comment addressed and the following done:

1)  A bug filed on removing the setting of the out-of-flow in the placeholder to null once we fix table abs pos containing stuff (and depending on that bug).
2)  A bug filed on making popups more sane (and mail sent to bryner?).

David, given that you know the blockframe code and DeletingFrameSubtree, would you mind also taking a look?
Attachment #207178 - Flags: superreview?(dbaron)
Attachment #207178 - Flags: superreview?(bzbarsky)
Attachment #207178 - Flags: review?(bzbarsky)
Attachment #207178 - Flags: review+
Maybe useful/interesting to know, this is a crasher testcase that also seems to be fixed by the patch.
Blocks: 322678
very similar crash using stirdom:
<http://test.bclary.com/tests/w3.org/2001/DOM-Test-Suite/tests/level3/core/files/barfoo_base.svg?fuzz=139,38,169,144>

comment 4 can be reproduced with stirdom on:
<http://test.mozilla.com/tests/w3.org/2001/DOM-Test-Suite/tests/level1/core/files/hc_staff.svg?fuzz=139,38,169,144>
<http://test.mozilla.com/tests/w3.org/2001/DOM-Test-Suite/tests/level3/core/files/external_barfoo.svg?fuzz=220,118,223,109>
<http://test.mozilla.com/tests/w3.org/2001/DOM-Test-Suite/tests/level3/core/files/typeinfo.svg?fuzz=220,118,223,109>

and results in assertions|stacks like:

###!!! ASSERTION: prev sibling not in line list: 'Not Reached', file c:/work/mozilla/builds/ff/trunk/mozilla/layout/generic/nsBlockFrame.cpp, line 5315

###!!! ASSERTION: frame not in line: 'line->Contains(aDeletedFrame)', file c:/work/mozilla/builds/ff/trunk/mozilla/layout/generic/nsBlockFrame.cpp, line 5666

dddddddd()
nsLineBox::DeleteLineList(nsPresContext * 0x02da5fb8, nsLineList & {...}) line 326
...

where nextChild	is 0xdddddddd

I think a related crashe can be found with stirdom:
<http://test.mozilla.com/tests/w3.org/2001/DOM-Test-Suite/tests/level1/core/files/hc_nodtdstaff.svg?fuzz=198,132,199,245>

###!!! ASSERTION: prev sibling not in line list: 'Not Reached', file c:/work/mozilla/builds/ff/trunk/mozilla/layout/generic/nsBlockFrame.cpp, line 5315

###!!! ASSERTION: frame not in line: 'line->Contains(aDeletedFrame)', file c:/work/mozilla/builds/ff/trunk/mozilla/layout/generic/nsBlockFrame.cpp, line 5666

nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager * 0x02dad8fc, nsIFrame * 0x02dd177c, nsIContent * 0x02dbc068, nsIContent * 0x02dbc740, nsFindFrameHint * 0x00000000) line 11216 + 17 bytes
...

where kidContent is 0xdddddddd
Depends on: 323105
(In reply to comment #36)
> >+          aDestroyQueue.AppendElement(outOfFlowFrame);
> 
> Should we assert that outOfFlowFrame is not in aDestroyQueue here?  I think
> that would be a good idea.

Fixed. (In general we should also assert that outOfFlowFrame isn't an
ancestor of a frame on the destroy queue, I'll fix that later depending
on the outcome of bug 323105).


> I think we should only pass outOfFlowFrame as the removed frame if we stuck
> it in the destroy queue above.  Otherwise, if it's being deleted normally
> we might as well compare its descendants to our original frame...

Yes, that is more efficent since we would add fewer out-of-flows to
the destroy queue. Fixed.


> >+        DoDeletingFrameSubtree(aPresContext, aFrameManager, destroyQueue,
> >+                               specialSibling, specialSibling);
> 
> I don't think this change is right.

Good catch. Fixed by reverting to the original.
I still have a feeling this could go wrong though...
There are cases when the placeholder/out-of-flow are in separate
special-sibling parents, see for example the frame dump in bug 322688 -
there we're ok since it's the placeholder that is in
the special-sibling - if the opposite occurs (placeholder is an in-flow
child, out-of-flow is a special-sibling child) then we would crash.
I'm not sure if that order can ever occur though.
(For the moment the crash won't occur since we're leaking the
special-siblings - bug 322678)
To be considered by bug 322678 / bug 323105 I suppose...


> >             nsPlaceholderFrame::GetRealFrameForPlaceholder(child);
> >-          NS_ASSERTION(outOfFlowFrame, "no out-of-flow frame");
> 
> Why?  This assertion should stay, no?

GetRealFrameForPlaceholder() already asserts this.
Address Boris' comments as described above.
I changed the null-test of |aFrameManager| in DeletingFrameSubtree() to be
an early return, to make the code a bit more readable, like so:

  if (NS_UNLIKELY(!aFrameManager)) {
    return NS_OK;
  }

I'm making a separate patch for the nsBlockFrame change...
Attachment #207178 - Attachment is obsolete: true
Attachment #207179 - Attachment is obsolete: true
Attachment #208613 - Flags: superreview?(dbaron)
Attachment #208613 - Flags: review?(bzbarsky)
Attachment #207178 - Flags: superreview?(dbaron)
Attached patch nsBlockFrame patch, rev. 1 (deleted) — Splinter Review
This patch is independent of the frame constructor change (and vice versa).
Attachment #208616 - Flags: superreview?(dbaron)
Attachment #208616 - Flags: review?(roc)
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

r=bzbarsky
Attachment #208613 - Flags: review?(bzbarsky) → review+
Attachment #208616 - Flags: superreview?(dbaron)
Attachment #208616 - Flags: superreview+
Attachment #208616 - Flags: review?(roc)
Attachment #208616 - Flags: review+
Comment on attachment 208616 [details] [diff] [review]
nsBlockFrame patch, rev. 1

Checked in on trunk at 2006-01-21 02:33 PDT.
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

-    // Now destroy any frames that have been enqueued for destruction.
+  // Now destroy any out-of-flow frames that have enqueued for destruction.

Don't remove the "been".

-          NS_ASSERTION(outOfFlowFrame, "no out-of-flow frame");

Why was this assertion removed?

sr=dbaron.  Sorry for the delay.
Attachment #208613 - Flags: superreview?(dbaron) → superreview+
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

sicking just checked this in to the trunk
This should be fixed. I'll let Mats do the honor of closing this bug.

We should consider this for the branches, though i'll leave the safty-analysis to Mats and dbaron
Flags: blocking1.8.1?
Mats: I put back the assertion dbaron mentioned in comment 45. Let me know if it really shouldn't be there and i'll remove it.
Doesn't the end of comment 39 cover that assertion?
Bob, could you file separate bugs for the problems in comment 38
in case they still occur?  (please also attach testcases as I don't have
access to http://test.mozilla.com/) Thanks.
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

I think it makes sense to take this on branches since it's
likely exploitable (comment 0) and it would be hard to do
other fuzz-testing on the branches without these patches.
After a proper baking time on the trunk of course...

FWIW, I tested this extensively with StirDOM/RandomsStyles,
but only on trunk/Linux.
Attachment #208613 - Flags: approval1.8.1?
Attachment #208613 - Flags: approval1.8.0.2?
Attachment #208616 - Flags: approval1.8.1?
Attachment #208616 - Flags: approval1.8.0.2?
For the record, the trunk checkins occurred:
nsBlockFrame.cpp          2006-01-21 02:33 PDT
nsCSSFrameConstructor.cpp 2006-01-26 14:15 PDT

-> FIXED
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
(In reply to comment #54)

Ok, I ran StirDOM on all the five URLs in comment 38 (at test.bclary.com)
for at least 10 minutes each, there was no crash and no assertions.
Blocks: 325047
Depends on: 325218
Attachment #208613 - Flags: approval1.8.1? → branch-1.8.1?(bzbarsky)
Attachment #208616 - Flags: approval1.8.1? → branch-1.8.1?(roc)
This has an outstanding crash regression (bug 325218) -- I'd like to see us at least get traction on this before we take the frame constructor changes here on branch.
No longer blocks: 322678
Depends on: 322678
(In reply to comment #51)

just to confirm, I get no crash on those urls with stirdom with a trunk winxp build.
Flags: blocking1.8.1?
Flags: blocking1.8.1+
Flags: blocking1.8.0.2?
Flags: blocking1.8.0.2+
Attachment #208616 - Flags: approval-branch-1.8.1?(roc) → approval-branch-1.8.1+
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

OK.  We have a fix for the regression... Let's do this.
Attachment #208613 - Flags: approval-branch-1.8.1?(bzbarsky) → approval-branch-1.8.1+
Note that after this lands on 1.8.1 branch we need to land bug 322678 there too.

Mats, I assume you'll merge this to branch?  If not, let me know and I'll do it, I guess.  Note dbaron's comment nit!
Comment on attachment 208613 [details] [diff] [review]
nsCSSFrameConstructor patch, rev. 4

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #208613 - Flags: approval1.8.0.2? → approval1.8.0.2+
Comment on attachment 208616 [details] [diff] [review]
nsBlockFrame patch, rev. 1

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #208616 - Flags: approval1.8.0.2? → approval1.8.0.2+
Attaching the final 1.8/1.8.0 branch patch for posterity, there were some
minor conflicts I had to resolve.

Checked in to MOZILLA_1_8_BRANCH at 2006-02-16 01:25 PDT
Checked in to MOZILLA_1_8_0_BRANCH at 2006-02-16 01:50 PDT
Whiteboard: [sg:critical?] uses freed memory → [sg:critical?] uses freed memory[rft-dl]
v.fixed on 1.8.0 branch with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060224 Firefox/1.5.0.1, no crashes with testcases.
Flags: in-testsuite+
Flags: blocking1.7.14?
Flags: blocking-aviary1.0.9?
Group: security
Depends on: 341382
Depends on: 344557
https://bugzilla.mozilla.org/attachment.cgi?id=198093
ff2b2 debug/nightly windows/linux no crash

https://bugzilla.mozilla.org/attachment.cgi?id=207800
ff2b2 debug/nightly windows/linux no crash

verified fixed 1.8

Flags: in-testsuite+ → in-testsuite?
I checked in my testcase and Martijn's testcase as crashtests.
Flags: in-testsuite? → in-testsuite+
Crash Signature: [@ DoDeletingFrameSubtree] [@ DeletingFrameSubtree]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: