Closed
Bug 393936
Opened 17 years ago
Closed 16 years ago
[PATCH] Crash [@ nsCellMapColumnIterator::GetNextFrame] with display: table-footer-group, binding, contenteditable and image
Categories
(Core :: Graphics: ImageLib, defect, P3)
Core
Graphics: ImageLib
Tracking
()
VERIFIED
FIXED
mozilla1.9.2a1
People
(Reporter: martijn.martijn, Assigned: joe)
References
(Blocks 1 open bug)
Details
(5 keywords)
Crash Data
Attachments
(5 files, 5 obsolete files)
(deleted),
text/xml
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
bzbarsky
:
review+
vlad
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
joe
:
superreview+
|
Details | Diff | Splinter Review |
See upcoming testcase, which crash current trunk builds on load or reload.
http://crash-stats.mozilla.com/report/index/a03be961-54fc-11dc-92fe-001a4bd43ed6
0 nsCellMapColumnIterator::GetNextFrame(int*, int*) mozilla/layout/tables/nsCellMap.cpp:2801
1 BasicTableLayoutStrategy::ComputeColumnIntrinsicWidths(nsIRenderingContext*) mozilla/layout/tables/BasicTableLayoutStrategy.cpp:295
2 BasicTableLayoutStrategy::ComputeIntrinsicWidths(nsIRenderingContext*) mozilla/layout/tables/BasicTableLayoutStrategy.cpp:576
3 BasicTableLayoutStrategy::GetMinWidth(nsIRenderingContext*) mozilla/layout/tables/BasicTableLayoutStrategy.cpp:69
4 nsTableFrame::TableShrinkWidthToFit(nsIRenderingContext*, int) mozilla/layout/tables/nsTableFrame.cpp:1658
5 nsTableFrame::ComputeAutoSize(nsIRenderingContext*, nsSize, int, nsSize, nsSize, nsSize, int) mozilla/layout/tables/nsTableFrame.cpp:1690
6 nsFrame::ComputeSize(nsIRenderingContext*, nsSize, int, nsSize, nsSize, nsSize, int) mozilla/layout/generic/nsFrame.cpp:2966
7 nsTableFrame::ComputeSize(nsIRenderingContext*, nsSize, int, nsSize, nsSize, nsSize, int) mozilla/layout/forms/nsFieldSetFrame.cpp:429
8 nsHTMLReflowState::InitConstraints(nsPresContext*, int, int, nsMargin const*, nsMargin const*) mozilla/layout/generic/nsHTMLReflowState.cpp:1820
9 nsHTMLReflowState::Init(nsPresContext*, int, int, nsMargin const*, nsMargin const*) mozilla/layout/generic/nsHTMLReflowState.cpp:315
10 nsTableOuterFrame::InitChildReflowState(nsPresContext&, nsHTMLReflowState&) mozilla/layout/tables/nsTableOuterFrame.cpp:468
11 nsTableOuterFrame::OuterReflowChild(nsPresContext*, nsIFrame*, nsHTMLReflowState const&, void*, nsHTMLReflowMetrics&, int, nsSize&, nsMargin&, unsigned int&) mozilla/layout/tables/nsTableOuterFrame.cpp:1133
12 nsTableOuterFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned int&) mozilla/layout/tables/nsTableOuterFrame.cpp:1255
13 nsBlockReflowContext::ReflowBlock(nsRect const&, int, nsCollapsingMargin&, int, int, nsMargin&, nsLineBox*, nsHTMLReflowState&, unsigned int&) mozilla/layout/generic/nsBlockReflowContext.cpp:338
14 nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&, nsLineList_iterator, int*) mozilla/layout/generic/nsBlockFrame.cpp:2922
etc.
This regressed between 2007-06-26 and 2007-06-28. And because the contenteditable attribute is necessary for the crash, I think this is somehow a regression from bug 237964.
Reporter | ||
Comment 1•17 years ago
|
||
Updated•17 years ago
|
Flags: blocking1.9?
Comment 2•17 years ago
|
||
###!!! ASSERTION: reflowing in the middle of frame construction
Updated•17 years ago
|
OS: Windows XP → All
Comment 3•17 years ago
|
||
Can we make one of the calls on that stack (in the #39 -- #51 range)
asynchronous off an event instead?
Comment 4•17 years ago
|
||
Uh... Cancel() must not notify anything sync. In particular, the removal from the loadgroup MUST be async per the nsIRequest contract (though we should document this more clearly in nsIRequest.idl).
Component: Layout: Tables → ImageLib
QA Contact: layout.tables → imagelib
Comment 5•17 years ago
|
||
Compiled, but untested (don't have a fully compiling build at the moment).
Comment 6•17 years ago
|
||
So I just tested this, and it doesn't quite work. Unfortunately, mListener is a weak pointer and consumers rely on Cancel() nulling it out. There are at least some cases where the Cancel() call is actually in mListener's destructor. In these cases we really shouldn't be sending OnStopRequest to the listener either.
Perhaps there should be a status code that such callsites can use to indicate "cancel and never talk to me again"? Feels hacky, but this whole setup with having to hold the imgIRequest which has a pointer back to us has this little leak problem if all the refs are made strong. :(
In any case, with just that patch we crash in OnStartRequest on the testcase because mListener is pointing to deleted memory.
Flags: blocking1.9? → blocking1.9+
Updated•17 years ago
|
Priority: -- → P2
Reporter | ||
Comment 7•17 years ago
|
||
Still crashing for me, using a 2007-11-01 trunk build.
Comment 8•17 years ago
|
||
Stuart, would you be OK with either introducing a status code to pass to cancel(), as described in comment 6, or an API which we could use to tell the imgRequestProxy to drop its listener reference instead of overloading cancel()?
Updated•17 years ago
|
Assignee: nobody → pavlov
Updated•17 years ago
|
Flags: wanted1.9.0.x+
Flags: tracking1.9+
Flags: blocking1.9-
Priority: P2 → --
Comment 9•17 years ago
|
||
This testcase now triggers
###!!! ASSERTION: Someone forgot to block scripts: 'aIsSafeToFlush == nsContentUtils::IsSafeToRunScript()', file /Users/jruderman/trunk/mozilla/layout/base/nsPresShell.cpp, line 4532
in addition to the three assertions it has triggered for a long time
###!!! ASSERTION: reflowing in the middle of frame construction: 'mPresContext->mLayoutPhaseCount[eLayoutPhase_FrameC] == 0', file ../../dist/include/layout/nsPresContext.h, line 948
###!!! ASSERTION: Why no rowgroups?: 'mOrigCells == 0', file /Users/jruderman/trunk/mozilla/layout/tables/nsCellMap.h, line 632
###!!! ASSERTION: Bogus mOrigCells?: 'mCurMapRow < mCurMapRelevantRowCount', file /Users/jruderman/trunk/mozilla/layout/tables/nsCellMap.cpp, line 2928
Blocks: 430215
Reporter | ||
Comment 10•16 years ago
|
||
Still crashing for me, using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b1pre) Gecko/20080905031348 Minefield/3.1b1pre
Flags: blocking1.9.1?
Assignee | ||
Updated•16 years ago
|
Assignee: pavlov → joe
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P1
Priority: P1 → P3
Assignee | ||
Comment 11•16 years ago
|
||
Since we absolutely must make removal from the loadgroup async, this is an updated version of Boris' patch that includes a new function imgIRequest::CancelAndForgetObserver().
I think I fixed all callers of Cancel() who needed to be CancelAndForgetObserver instead, but there are a couple of calls to SetFrame(nsnull) in nsImageFrame and nsBulletFrame that I can't fix since I don't understand the class hierarchy. Ideally, that would be a simple call to CancelAndForgetObserver in nsImageFrame::Destroy() instead of removing the observer from the image loader and setting the frame to null, but I don't know how things are structured to do that.
Attachment #278728 -
Attachment is obsolete: true
Attachment #341971 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 12•16 years ago
|
||
Also, I should mention that patch fixes the crash in this bug. I'm running through mochitests and reftests to make sure I haven't broken anything else.
Assignee | ||
Comment 13•16 years ago
|
||
Attachment #341971 -
Attachment is obsolete: true
Attachment #341977 -
Flags: review?(bzbarsky)
Attachment #341971 -
Flags: review?(bzbarsky)
Comment 14•16 years ago
|
||
So as I mentioned on irc, we need to keep the SetFrame(nsnull) calls that are in place now. Just replacing Cancel() with CancelAndForgetObserver() sounds reasonable to me.
Assignee | ||
Comment 15•16 years ago
|
||
Attachment #341977 -
Attachment is obsolete: true
Attachment #342282 -
Flags: superreview?
Attachment #342282 -
Flags: review?(bzbarsky)
Attachment #341977 -
Flags: review?(bzbarsky)
Assignee | ||
Updated•16 years ago
|
Attachment #342282 -
Flags: superreview? → superreview?(pavlov)
Assignee | ||
Comment 16•16 years ago
|
||
FYI, this passes all mochitests and reftests.
Updated•16 years ago
|
Attachment #342282 -
Flags: review?(bzbarsky) → review+
Comment 17•16 years ago
|
||
Comment on attachment 342282 [details] [diff] [review]
Re-add SetFrame(nsnull) calls
>+NS_IMETHODIMP imgRequestProxy::CancelAndForgetObserver(nsresult aStatus)
>+ NullOutListener();
>+}
return NS_OK? Or return the return value of Cancel()? In any case, as written this won't compile on Windows: gotta return _something_.
With that fixed, looks good.
Assignee | ||
Comment 18•16 years ago
|
||
Carrying over r+.
Attachment #342288 -
Flags: superreview?(pavlov)
Attachment #342288 -
Flags: review+
Assignee | ||
Updated•16 years ago
|
Attachment #342282 -
Attachment is obsolete: true
Attachment #342282 -
Flags: superreview?(pavlov)
Updated•16 years ago
|
Attachment #342288 -
Flags: superreview?(pavlov) → superreview+
Assignee | ||
Comment 19•16 years ago
|
||
From my tests, this no longer crashes in current Firefox 3.1 nightlies.
Assignee | ||
Comment 20•16 years ago
|
||
Pushed to 3.1 in http://hg.mozilla.org/mozilla-central/rev/6bedb1e92dd0
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 21•16 years ago
|
||
Crap - I forgot to reopen this. It bounced out immediately because this patch caused refcount leaks. I'm currently debugging that.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 22•16 years ago
|
||
The previous patch had a nice subtle leak that occurred because CancelAndForgetObserver() called Cancel() before calling NullOutListener() (to avoid code duplication). Unfortunately, the important part of Cancel(), it calling mOwner->RemoveProxy(), was done asynchronously, so by the time it get called, mListener was null, and so OnStopRequest was never called on the listener.
Attachment #342288 -
Attachment is obsolete: true
Attachment #351252 -
Flags: superreview?(pavlov)
Attachment #351252 -
Flags: review?(bzbarsky)
Comment 23•16 years ago
|
||
Hold on. Which listener needed the OnStopRequest call? See comment 6.
Assignee | ||
Comment 24•16 years ago
|
||
Hm, that's not entirely clear. (It might be nsGIFDecoder2, in at least one case, but it's not clear to me where its mObserver gets nulled out.)
This patch does the "simple" thing, which is simply make it possible to do both an asynchronous and a synchronous cancel, with the mechanics of that cancel being the same. This isn't quite the same as what you mentioned in comment 6 (AIUI), since we do talk to the listener again, we just do it synchronously.
Comment 25•16 years ago
|
||
But the point is that a number of the callers of this method are calling it in a destructor. You _can't_ talk to them safely. I know the old code sort of did, but that seems bogus to me...
Assignee | ||
Comment 26•16 years ago
|
||
I agree. In fact, almost all callers of CancelAndForgetObserver() are directly _inside_ the destructor.
However, I'm not sure that I feel 100% comfortable making that change, because it involves some pretty significant changes to the callers. In the end, I think it'd end up making CancelAndForgetObserver() entirely obsolete.
If you really want that change, I can do it, but it doesn't feel like the safest thing to do. This way maintains the previous behaviour for the places that need it, and fixes Cancel() for the places that didn't know about imgRequestProxy's broken Cancel() behaviour.
Comment 27•16 years ago
|
||
Ok, fair enough. I guess this is at least not making things any worse.
Will try to review tomorrow.
Updated•16 years ago
|
Attachment #351252 -
Flags: review?(bzbarsky) → review+
Summary: Crash [@ nsCellMapColumnIterator::GetNextFrame] with display: table-footer-group, binding, contenteditable and image → [PATCH] Crash [@ nsCellMapColumnIterator::GetNextFrame] with display: table-footer-group, binding, contenteditable and image
Comment on attachment 351252 [details] [diff] [review]
Fix refcount leaks
Looks fine, but add more documentation for the new method and for the old cancel() explaining what they do and why you shouldn't use the new one in new code.
Attachment #351252 -
Flags: superreview?(pavlov) → superreview+
Assignee | ||
Comment 29•16 years ago
|
||
Pushed to mozilla-central: http://hg.mozilla.org/mozilla-central/rev/5f6b82c5c67e
After some baking on m-c, I'll push to 1.9.1.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Comment 30•16 years ago
|
||
had to back this due to leaks on tinderbox
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 31•16 years ago
|
||
Pushed to mozilla-central: http://hg.mozilla.org/mozilla-central/rev/e87d4b42a4b2
After some baking, I'll push to 1.9.1.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 32•16 years ago
|
||
I can't verify the fix, since builds before the fix were already not crashing with the testcase.
Assignee | ||
Comment 33•16 years ago
|
||
Keywords: fixed1.9.1
Updated•16 years ago
|
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.2a1
Comment 34•16 years ago
|
||
Verified with:
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090114 Shiretoko/3.1b3pre Ubiquity/0.1.4 ID:20090114020333
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090113 Minefield/3.2a1pre ID:20090113020320
Can we please get this checked-in as a crashtest?
Comment 35•16 years ago
|
||
Attachment #358377 -
Flags: review?(bzbarsky)
Comment 36•16 years ago
|
||
Does that include the fix for bug 471101?
Also, you presumably want Joe to review this too.
Assignee | ||
Updated•16 years ago
|
Attachment #358377 -
Flags: superreview+
Assignee | ||
Comment 37•16 years ago
|
||
Comment on attachment 358377 [details] [diff] [review]
for 1.9.0 with imgIRequest_MOZILLA_1_9_0_BRANCH
Looks good, but you definitely need the fix to bug 471101 too. sr=joe so long as you do that.
Comment 38•16 years ago
|
||
Comment on attachment 358377 [details] [diff] [review]
for 1.9.0 with imgIRequest_MOZILLA_1_9_0_BRANCH
Looks ok if you don't change the CID and backport the other bug.
Attachment #358377 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 39•15 years ago
|
||
Alexander,
What ended up happening with your 1.9.0 patch? Did Linux distributions accept it/ship it?
Comment 40•15 years ago
|
||
asac: Ditto Joe's question. See also regression fix bug 508057 that would be required if you did ship this patch.
Updated•13 years ago
|
Crash Signature: [@ nsCellMapColumnIterator::GetNextFrame]
You need to log in
before you can comment on or make changes to this bug.
Description
•