Closed Bug 289517 Opened 20 years ago Closed 19 years ago

Crash in [@ BasicTableLayoutStrategy::AssignNonPctColumnWidths] with evil testcase, using display:inherit

Categories

(Core :: Layout: Tables, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: martijn.martijn, Assigned: bzbarsky)

References

()

Details

(5 keywords, Whiteboard: [rft-dl])

Crash Data

Attachments

(5 files, 1 obsolete file)

The testcase that I'll attach crashes Mozilla for me, when I click on the button
in the testcase.

It works with 2005-01-25 09:17am build.
It crashes with 2005-01-26 08:19am build.
So it seems like a regression:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2005-01-25+08%3A00%3A00&maxdate=2005-01-26+09%3A00%3A00&cvsroot=%2Fcvsroot
Attached file Testcase (deleted) —
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050407
Firefox/1.0+

WFM.  What platform/OS are you using?
January builds are a bit old to be reporting bugs against, no?
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=TB4942585K

broad hint: if Martijn reports a crash, it is a crash period.
> January builds are a bit old to be reporting bugs against, no?

William, Martijn commented on the January builds because that's when the crash
first appeared.  See exactly what he said about them again.  Note also comment 3.
Attached file Backtrace (deleted) —
Well, I could have been a bit more clear. I'm using the latest nightly trunk
build.
I see that one talkback report already has been filed, but maybe a backtrace
from my up to date debug build is also still useful. My debug build indeed
doesn't seem to crash all of the times on the testcase (although the normal
build does crash every time)
(gdb) frame
#0  0x414c3354 in nsCellMap::GetCellInfoAt(nsTableCellMap&, int, int, int*, int*) (
    this=0x86ddda0, aMap=@0x833a4d8, aRowX=0, aColX=0, aOriginates=0xbfffa300, 
    aColSpan=0xbfffa2fc)
    at /home/bzbarsky/mozilla/xlib/mozilla/layout/tables/nsCellMap.cpp:2397
2397            cellFrame->GetColIndex(initialColIndex);
(gdb) p *cellFrame
$3 = {<nsHTMLContainerFrame> = {<nsContainerFrame> = {<nsSplittableFrame> =
{<nsFrame> = {<nsBox> = {<nsIFrame> = {<nsISupports> = {_vptr.nsISupports =
0x0}, mRect = {
                x = -572662307, y = -572662307, width = -572662307, 
                height = -572662307}, mContent = 0xdddddddd, mStyleContext =
0xdddddddd, 
              mParent = 0xdddddddd, mNextSibling = 0xdddddddd, mState =
3722304989}, 

etc.

So the cellmap is holding a destroyed frame for some reason....  I don't see
anything in the regression range that would obviously trigger this, though.  :(
Component: Layout → Layout: Tables
OS: Windows 2000 → All
QA Contact: layout → layout.tables
Hardware: PC → All
Martijn are you really sure that the 2005-01-25 does not crash
... 2005-01-24 14:51 was a checkin that looks much more promising to be the
reason for this.
Ok, I rechecked. The 2005-01-25 indeed crashes also directly. However builds
before that also crash after I tried it a few (appr. 3) times (and then I click
on the 'Go' button to try it another time). I was used to it that the testcase
crashed directly on me.
The good regression range seems to be: 2004-08-09 build working (tried it 20
times or so), the 2004-08-10 build crashing.
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2004-08-09+06%3A00%3A00&maxdate=2004-08-10+10%3A00%3A00&cvsroot=%2Fcvsroot
With that regression range, I always think about bug 230170.
crash with Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8b) Gecko/20050125
http://archive.mozilla.org/pub/mozilla/nightly/2005-01-25-07-trunk/

When I click, the body grows, the buttons move down.
I crash on reload, or testing the click in DOMinspector.

TB4957135Z has only numerical offsets into GKLAYOUT.DLL
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=TB4957135Z

TB4949341 MozillaTrunk Build ID	2005040506 
Stack Signature nsCellMap::GetCellInfoAt 
 
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=4949341
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=4963039
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=4962901
both Stack Signature: BasicTableLayoutStrategy::AssignNonPctColumnWidths
...
nsXULScrollFrame::DoLayout 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsGfxScrollFrame.cpp,
line 544]
nsIFrame::Layout 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/xul/base/src/nsBox.cpp,
line 802]
nsHTMLScrollFrame::Reflow 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsGfxScrollFrame.cpp,
line 472]
nsContainerFrame::ReflowChild 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsContainerFrame.cpp,
...


differ only in three lines from:
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=4949341
Stack Signature: nsCellMap::GetCellInfoAt
...
nsHTMLScrollFrame::DoLayout 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsGfxScrollFrame.cpp,
line 544]
nsIFrame::Layout 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/xul/base/src/nsBox.cpp,
line 802]
nsXULScrollFrame::Reflow 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsGfxScrollFrame.cpp,
line 472]
nsContainerFrame::ReflowChild 
[c:/builds/tinderbox/MozillaTrunk/WINNT_5.0_Clobber/mozilla/layout/generic/nsContainerFrame.cpp,


crashes with seamonkey 2005-04-19-05  wfm with 2005-04-20-05 probably fixed by
the checkin for bug 286137. Martijn could you please retest?
2005-04-19 trunk build crashes for me. 2005-04-20 trunk build doesn't crash for
me, but hangs. It doesn't happen every time with that build. When it doesn't
happen, press the "Go" button and click on the "Click me" button again. It
should normall happen within 5 times trying.
I did get it once to crashing, Martijn this testcase needs be tweaked probably
to crash reliably. I am sure you can do it :-).
Well, the testcase is crashing pretty reliable for me, either directly or in the
second time I try it.

But you might want to try it with the bookmarklet way.
http://wargers.org/bookmarklets/docwrite.htm
Use the toggle_style8 bookmarklet and use it on the url testcase:
http://www.dynamicdrive.com/forums/showthread.php?t=2235
Choose display:inherit and in the select drop down just under "Backwards" choose 4.
But that is not a minimal testcase, off course.
Flags: blocking1.8b3?
Flags: blocking-aviary1.1?
Flags: blocking1.8b3? → blocking1.8b3-
Flags: blocking-aviary1.1? → blocking-aviary1.1-
this crashes in nsTableRowGroupFrame::CalculateRowHeights, when the rowgroup has
two rows as child frames, where the second does not have
NS_STYLE_DISPLAY_TABLE_ROW as the display type. When this happens the rowInfo
array will be sized to small and the access to rowInfo will go past array
boundaries as GetNextRow is based on frame type rather than display type.

nsLayoutAtoms::tableRowFrame == childFrame->GetType()

So in principle rows should always have NS_STYLE_DISPLAY_TABLE_ROW and both
checks should give the same result.
Under typical conditions a pseudo frame does have the right display type, but
the frame here has NS_STYLE_DISPLAY_TABLE_ROW_GROUP instead, so thats more a
frame construction issue.
Attached file reduced testcase (deleted) —
Attached patch debug help (deleted) — Splinter Review
use this to crash frame construction failure
It looks from my debugging that Martijn's suspicion about bug 230170 "Look into
batching style reresolutions" is correct.  Or in other words this is a style
resolution issue which I do not understand, I have to give up :-( .

Below follows my analysis to help Boris to pick this up where I have to give up:
If one looks at attachment 200447 [details].

the key action is 

nodes[2].removeAttribute('style');
nodes[4].setAttribute('style','display:inherit');

They have to be executed in a single sweep so one cannot seperate them further
by a javascript timeout.

Its essential that first the inherit is removed from the <tr> and then the
inherit is set at the <td>.

I placed a break point inside 
nsCSSFrameConstructor::RecreateFramesForContent(nsIContent* aContent)

Once you press the button and you see first the <tr> tag treated, thats the bad
case. If first comes the <td>, you have to reload the testcase, after a couple
of tries it should have first the <tr>.

Assuming that you have the <tr> hit first, let it remove the content but look
inside the nsCSSFrameConstructor::ContentInserted when it tries to determine
first the display type of the content at
nsCSSFrameConstructor::ConstructFrameInternal
then we hit inside nsRuleNode::GetStyleData
the following code,

  if (mDependentBits & nsCachedStyleData::GetBitForSID(aSID)) {
    // We depend on an ancestor for this struct since the cached struct
    // it has is also appropriate for this rule node.  Just go up the
    // rule tree and return the first cached struct we find.
    data = GetParentData(aSID);  <========
    NS_ASSERTION(data, "dependent bits set but no cached struct present");
    return data;
  }
so we retrieve the parent style data for the <tr> tag which would be fine if we
wouldn't just remove the inherit and fall back to the table-row display type.


 
Depends on: 230170
Oh, this is great fun...  Here's the sequence of events:

1)  We process the frame reconstruct for the <tr>.  Now the frame structure looks like:

  TableRow(tr)
    TableCell(tr)
      Block(tr)
        TableOuter(tr)
          Table(tr)
            TableRowGroup(tr)
              TableRow(td)

(since we restyle all kids of the <tr> too, so the table cell picks up its new inherited display:table-row).

2)  We go to restyle the <td>.
3)  We call GetParentStyleContextFrame() on that TableRow frame.
4)  It returns the TableRowGroup frame(!).
5)  We take its style context and use that to resolve style for the <td>, which
    gives it a table-row-group display.
6)  We capture the change, but since both style contexts have the same rulenode, 
    we don't actually compute a change (see optimization in
    nsStyleContext::CalcStyleDifference).
7)  No change means nothing to do, so we leave the frame tree as above, but the
    <td> now has a table-row-group display type.

And then we're screwed.

So the real issue is that step 4 is wrong -- it should be returning the outer TableRow frame, not what it returns.

Now as for how to fix it...  Can pseudo-frames ever legitimately be style context parents for actual content?  I'm especially thinking :first-line here.  If not, then we can probably just walk up till we find a non-pseudo parent when getting the parent style context frame.  If, on the other hand, we can legitimately have pseudo-frames as style context parents, then we should figure out which ones can be and stop the walk at those.  In either case, the loop in GetCorrectedParent (the one that handles skipping the scrolledContent pseudo!) is the thing we want to modify.
Of course one question that arises is what the style context parent for all those pseudo frames should be....  :(
Attached patch Sorta-fix (obsolete) (deleted) — Splinter Review
This is ok in almost all cases, I think.  Except not handling non-table stuff.  And except for not giving the same results as initial style resolution (which is just broken for kids of table cells -- see bug 85872).
Depends on: 85872
Comment on attachment 200640 [details] [diff] [review]
Sorta-fix

Might be worth it as a stopgap...
Attachment #200640 - Flags: superreview?(dbaron)
Attachment #200640 - Flags: review?(dbaron)
Comment on attachment 200448 [details] [diff] [review]
debug help

Boris, I have to apologize but this patch got checked in as part of bug 291060. Can you review it? At least we hit the assert also in bug 317876.
Attachment #200448 - Flags: review?(bzbarsky)
Blocks: 317876
Attachment #200448 - Flags: superreview+
Attachment #200448 - Flags: review?(bzbarsky)
Attachment #200448 - Flags: review+
Summary: Crash with evil testcase, using display:inherit → Crash in [@ BasicTableLayoutStrategy::AssignNonPctColumnWidths] with evil testcase, using display:inherit
Attachment #200640 - Flags: superreview?(dbaron)
Attachment #200640 - Flags: superreview+
Attachment #200640 - Flags: review?(dbaron)
Attachment #200640 - Flags: review+
Comment on attachment 200640 [details] [diff] [review]
Sorta-fix

I just checked this in to the trunk.
Actually, I'm tempted to back out the ::cellContent part of the patch because it's inconsistent with what we do in the static case, and I'd rather not have reresolution bugs.
...which I did.

And if the static case were fixed, I'm not sure this patch would actually match, for the case where somebody used ::-moz-cell-content { border: inherit; } or some such, not that they should or that it should matter, except that we could see lots of wrong parent style context warnings.
Assignee: nobody → bzbarsky
Comment on attachment 200640 [details] [diff] [review]
Sorta-fix

I think we should perhaps consider this (with my one line fix, which I'd like Boris to review sometime) for the patch releases...
Attachment #200640 - Flags: approval1.8.0.1?
Let's get a roll-up patch (with the backout) and some more bake time on the trunk before we consider for 1.8.0.
Attachment #200640 - Attachment is obsolete: true
Attachment #206344 - Flags: superreview?(bzbarsky)
Attachment #206344 - Flags: review?(bzbarsky)
Attachment #206344 - Flags: approval1.8.0.1?
Attachment #200640 - Flags: approval1.8.0.1?
Blocks: 321106
Fixed by dbaron's checkin.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment on attachment 206344 [details] [diff] [review]
patch that I want for the branch

r+sr=bzbarsky
Attachment #206344 - Flags: superreview?(bzbarsky)
Attachment #206344 - Flags: superreview+
Attachment #206344 - Flags: review?(bzbarsky)
Attachment #206344 - Flags: review+
Any risk here of layout changes or regressions?  
Comment on attachment 206344 [details] [diff] [review]
patch that I want for the branch

a-=schrep for drivers.  Until we get more confort on possible regressions (comment 33) we'll hold.  Renominating for 1.8.0.2
Attachment #206344 - Flags: approval1.8.0.2?
Attachment #206344 - Flags: approval1.8.0.1?
Attachment #206344 - Flags: approval1.8.0.1-
It could change the layout of some Web pages in response to dynamic changes by making it consistent with what it is in the static case.  Such a change is generally pretty safe, though.  I haven't heard of any regressions.
Verified FIXED using SeaMonkey 1.5a;Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060123 Mozilla/1.0 trunk on both testcases.  No crashes.
Status: RESOLVED → VERIFIED
Comment on attachment 206344 [details] [diff] [review]
patch that I want for the branch

approved for 1.8.0 branch, a=dveditz
Attachment #206344 - Flags: approval1.8.0.2? → approval1.8.0.2+
Flags: blocking1.8.0.2+
Fixed for 1.8.0.2
Keywords: fixed1.8.0.2
Attachment #206344 - Flags: approval-branch-1.8.1?(dbaron)
Comment on attachment 206344 [details] [diff] [review]
patch that I want for the branch

I think I actually wrote the branch patch based on bz's trunk patch, but approval-branch-1.8.1+ anyway.
Attachment #206344 - Flags: approval-branch-1.8.1?(dbaron) → approval-branch-1.8.1+
Fixed on 1.8 branch
Keywords: fixed1.8.1
Whiteboard: [rft-dl]
v.fixed on 1.8.0 branch with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060224 Firefox/1.5.0.1, no crash with either testcase.
Blocks: 230170
No longer depends on: 230170
Crash Signature: [@ BasicTableLayoutStrategy::AssignNonPctColumnWidths]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: