Closed
Bug 347899
Opened 18 years ago
Closed 18 years ago
[FIX]crash when restarting for update [@ nsCSSStyleSheet::SetComplete]
Categories
(Core :: CSS Parsing and Computation, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla1.8.1beta2
People
(Reporter: shaver, Assigned: bzbarsky)
References
Details
(Keywords: crash, verified1.8.0.7, verified1.8.1)
Crash Data
Attachments
(4 files)
(deleted),
application/rtf
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
dbaron
:
review+
dbaron
:
superreview+
dveditz
:
approval1.8.0.7+
beltzner
:
approval1.8.1+
|
Details | Diff | Splinter Review |
Stack looks like memory corruption, a wise man told me:
Thread 0 Crashed:
0 org.mozilla.firefox 0x0014781b nsCSSStyleSheet::SetComplete() + 55
1 org.mozilla.firefox 0x00104695 CSSLoaderImpl::SheetComplete(SheetLoadData*, int) + 131
2 org.mozilla.firefox 0x00105943 SheetLoadData::OnStreamComplete(nsIUnicharStreamLoader*, nsISupports*, unsigned, nsIUnicharInputStream*) + 493
3 org.mozilla.firefox 0x0034d4ee nsUnicharStreamLoader::OnStopRequest(nsIRequest*, nsISupports*, unsigned) + 378
4 org.mozilla.firefox 0x0038c251 nsJARChannel::OnStopRequest(nsIRequest*, nsISupports*, unsigned) + 67
5 org.mozilla.firefox 0x00346cec nsInputStreamPump::OnStateStop() + 88
6 org.mozilla.firefox 0x00347271 nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) + 133
7 libxpcom_core.dylib 0x01872b49 nsInputStreamReadyEvent::EventHandler(PLEvent*) + 45
8 libxpcom_core.dylib 0x01847579 PL_HandleEvent + 21
9 libxpcom_core.dylib 0x01847832 PL_ProcessPendingEvents + 103
10 libxpcom_core.dylib 0x01848950 nsEventQueueImpl::ProcessPendingEvents() + 54
11 libxpcom_core.dylib 0x0180e4b9 NS_ShutdownXPCOM_P + 323
12 org.mozilla.firefox 0x00002668 ScopedXPCOMStartup::~ScopedXPCOMStartup [in-charge]() + 46
[...]
Recent-ish Bon Echo nightly, don't know how to find the specific version that I was coming from, now that I have updated.
Assignee | ||
Comment 1•18 years ago
|
||
So either memory is completely hosed or we're somehow ending up with an mDocument pointing to a deleted document, I guess... Not really sure how either one would happen.
Comment 2•18 years ago
|
||
I've seen this consistently through several 1.8branch updates as well. I figured it was just my profile or hoochy extensions or something; I should have known better.
Comment 3•18 years ago
|
||
The latter option, almost certainly. We're in the middle of shutting down XPCOM and processing final events.
Updated•18 years ago
|
Severity: normal → critical
Keywords: crash
Summary: crash when restarting for update → crash when restarting for update [@ nsCSSStyleSheet::SetComplete]
Reporter | ||
Comment 5•18 years ago
|
||
Yeah, beltzner's been seeing this as well, and I assure you that I have a strict no-hoochy-extensions policy.
Wonder how to find out what the document is that we think we're crashing on. Maybe there's some logging I can turn on that will give me document object addresses as they're created and destroyed?
Severity: critical → normal
Assignee | ||
Comment 6•18 years ago
|
||
~nsDocument has the following code:
719 // Let the stylesheets know we're going away
720 indx = mStyleSheets.Count();
721 while (--indx >= 0) {
722 mStyleSheets[indx]->SetOwningDocument(nsnull);
723 }
724 indx = mCatalogSheets.Count();
725 while (--indx >= 0) {
726 mCatalogSheets[indx]->SetOwningDocument(nsnull);
727 }
728 if (mAttrStyleSheet)
729 mAttrStyleSheet->SetOwningDocument(nsnull);
730 if (mStyleAttrStyleSheet)
731 mStyleAttrStyleSheet->SetOwningDocument(nsnull);
So the only way to have a bogus document here is to have a stylesheet that the document doesn't know about...
If this is reproducible, what's the URI of the stylesheet?
Comment 7•18 years ago
|
||
refcount logging can do that, sure XPCOM_MEM_LOG_CLASSES=nsDocument... unfortunately it's dog-slow
Comment 8•18 years ago
|
||
XPCOM_MEM_LEAK_LOG=leak.log XPCOM_MEM_LOG_CLASSES=nsDocument XPCOM_MEM_ALLOC_LOG=alloc.log
shouldn't be as slow as logging refcounts (and then look at alloc.log).
Comment 9•18 years ago
|
||
Update needs to work without crashing. My personal "rough regression window" is "within the last week".
Updated•18 years ago
|
Severity: normal → critical
Comment 10•18 years ago
|
||
An exact regression window might be helpful for figuring this out.
Comment 11•18 years ago
|
||
regression window: I saw this crash when updating from 2006080506 to 2006080606, but not from 2006080408 to 2006080506, on my intel mac. I've seen it each subsequent day, and only on update, and not when I try to reproduce it :-(
So my regression window is 2006080506-2006080606.
Comment 12•18 years ago
|
||
Attaching the Apple Report in case it is useful. I just crashed on my PPC machine updating from the 7-26 build, which is one of the most recent crashes that showed up in the list of talkback reports. I will nail a definite date.
Comment 13•18 years ago
|
||
I have seen it not only on update, but when I restarted Firefox after an extension update. See http://talkback-public.mozilla.org/search/start.jsp?search=2&type=iid&id=21917493 for a crash that was generated from an extension install.
Comment 14•18 years ago
|
||
(In reply to comment #11)
> regression window: I saw this crash when updating from 2006080506 to
> 2006080606, but not from 2006080408 to 2006080506, on my intel mac. I've seen
> it each subsequent day, and only on update, and not when I try to reproduce it
> :-(
>
> So my regression window is 2006080506-2006080606.
Is the crash the old build shutting down, or the new build shutting down the extra run for registration?
Comment 15•18 years ago
|
||
new build
Comment 16•18 years ago
|
||
And that's a regression window on the 1.8 branch, or on the trunk?
Comment 17•18 years ago
|
||
(In reply to comment #16)
> And that's a regression window on the 1.8 branch, or on the trunk?
>
1.8 branch (I didn't specify since we're all talking about bon echo nightly updates. We *are* all talking about 1.8 branch nightly updates, right?)
Assignee | ||
Comment 18•18 years ago
|
||
Checkin range from comment 11:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=MozillaTinderboxAll&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=20060805+02&maxdate=20060806+10&cvsroot=%2Fcvsroot
Nothing there jumps out at me...
Comment 19•18 years ago
|
||
when did addons.mozilla.org get updated?
Comment 20•18 years ago
|
||
(In reply to comment #19)
> when did addons.mozilla.org get updated?
Saturday, from 8-12 PDT, according to http://weblogs.mozillazine.org/it/.
Comment 21•18 years ago
|
||
bug 347831 and bug 347921 look suspicious - ssl errors at addons
Comment 22•18 years ago
|
||
Happened again here, again after a restart for an update (Marcia's seeing it on restart for extension install makes me suspect the "restart" method).
Comment 23•18 years ago
|
||
Here's my sessionstore.js file, in case that's informative.
Comment 24•18 years ago
|
||
Dietrich should really be cc'd on this bug. He may even be interested in taking it!
Status: NEW → ASSIGNED
Comment 25•18 years ago
|
||
I can reproduce this every single time on the first startup after a rebuild, if I start from the command line. The stack is the same.
---------
...
*** Registering nsSoftwareUpdate components (all right -- a generic module!)
/Users/sayrer/firefox/mozilla/fb-debug/dist/MinefieldDebug.app/Contents/MacOS/components/nsUrlClassifierLib.js mtime changed, invalidating FastLoad file
*** Registering Apprunner components (all right -- a generic module!)
WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed: file /Users/sayrer/firefox/mozilla/chrome/src/nsChromeRegistry.cpp, line 1258
++WEBSHELL 0x16219900 == 1
/Users/sayrer/firefox/mozilla/fb-debug/dist/MinefieldDebug.app/Contents/MacOS/chrome/classic.jar mtime changed, invalidating FastLoad file
WARNING: NS_ENSURE_TRUE(cv) failed: file /Users/sayrer/firefox/mozilla/xpfe/appshell/src/nsXULWindow.cpp, line 1544
++DOMWINDOW == 1
++DOMWINDOW == 2
###!!! ASSERTION: not an nsIRDFRemoteDataSource: 'remote != nsnull', file /Users/sayrer/firefox/mozilla/rdf/datasource/src/nsLocalStore.cpp, line 345
WARNING: nsExceptionService ignoring thread destruction after shutdown: file /Users/sayrer/firefox/mozilla/xpcom/base/nsExceptionService.cpp, line 191
fb-debug/dist/MinefieldDebug.app/Contents/MacOS/run-mozilla.sh: line 451: 15682 Bus error "$prog" ${1+"$@"}
Assignee | ||
Comment 26•18 years ago
|
||
Robert, can you answer the question in comment 6?
Comment 27•18 years ago
|
||
*** Bug 348458 has been marked as a duplicate of this bug. ***
Comment 28•18 years ago
|
||
(In reply to comment #26)
> Robert, can you answer the question in comment 6?
>
Hmm, I'm having trouble getting to happen now :/ I did get it in a debugger a couple times, and I believe resource://gre/res/forms.css is where the crash occurs. At least, I would see nativescrollbars.css come through, and get a SIGBUS on the next entry into SetComplete. Here's what I see with some printfs in a working load:
---
*************** mSheetURI: resource://gre/res/html.css
*************** mSheetURI: chrome://global/content/xul.css
*************** mSheetURI: resource://gre/res/quirk.css
*************** mSheetURI: resource://gre/res/ua.css
++DOMWINDOW == 2
*************** mSheetURI: chrome://global/skin/nativescrollbars.css
*************** mSheetURI: resource://gre/res/forms.css
*************** mSheetURI: chrome://mozapps/content/plugins/missingPluginBinding.css
###!!! ASSERTION: not an nsIRDFRemoteDataSource: 'remote != nsnull', file /Users/sayrer/firefox/mozilla/rdf/datasource/src/nsLocalStore.cpp, line 346
WARNING: nsExceptionService ignoring thread destruction after shutdown: file /Users/sayrer/firefox/mozilla/xpcom/base/nsExceptionService.cpp, line 191
*************** mSheetURI: chrome://browser/content/places/places.css
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
[... many lines ...]
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp, line 744
--WEBSHELL 0x16218640 == 0
WARNING: Leaking the RDF Service.: file /Users/sayrer/firefox/mozilla/rdf/build/nsRDFModule.cpp, line 236
nsMenuItemX 0
nsMenuX 0
nsStringStats
=> mAllocCount: 6792
=> mReallocCount: 30
=> mFreeCount: 6407 -- LEAKED 385 !!!
=> mShareCount: 1979
=> mAdoptCount: 894
=> mAdoptFreeCount: 892 -- LEAKED 2 !!!
~/firefox/mozilla>
and then restart.
Assignee | ||
Comment 29•18 years ago
|
||
> ###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file
> /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp,
> line 744
Please do make sure a bug is filed on that... ;)
In any case, if someone can catch this in a debugger, please look up the sheet URI. You can get if from SetComplete, since it's a member of the sheet.
resource://gre/res/forms.css shouldn't have a non-null mDocument. Furthermore, it's not loaded via the unichar stream loader (see stack in comment 0). So I doubt you're crashing on forms.css.
Comment 30•18 years ago
|
||
(In reply to comment #29)
> > ###!!! ASSERTION: tag in XUL doc epilog: 'mState != eInEpilog', file
> > /Users/sayrer/firefox/mozilla/content/xul/document/src/nsXULContentSink.cpp,
> > line 744
>
> Please do make sure a bug is filed on that... ;)
That's bug 348075.
Comment 31•18 years ago
|
||
No crash at this morning's update (->2006081503 mac). I looked at checkins . . . Bug 348360 might have had an effect?
Comment 32•18 years ago
|
||
I'd been seeing this crash after updating each day, too, (for about a week) and like davel, I didn't see it when I updated this morning.
Comment 33•18 years ago
|
||
Adam G or DaveL - can you take yesterday's nightlies and see if you can debug this (sounds like it happens with extension install restart) and answer Bz's question.
Comment 34•18 years ago
|
||
I can't seem to catch this in a debugger. For anyone that's trying to reproduce, I've found that installing an extension and using the Restart Now button will reliably crash on restart. (This is marginally more convenient than doing an update.)
Assignee | ||
Comment 35•18 years ago
|
||
Can't you just attach the debugger after we crash? Or do we only do the sleep thing on Linux?
Reporter | ||
Comment 36•18 years ago
|
||
We don't do the sleep() thing on OS X, but crash reporter should be able to let you attach gdb.
http://developer.apple.com/technotes/tn2004/tn2123.html#SECCRASHREPORTERPREFS
If that isn't working for whatever reason, check the ulimit settings for the shell you start from, and see if you can get a core dump to analyze post-mortem?
Comment 37•18 years ago
|
||
I have not yet been successful in causing the crash to happen again, using the 20060810 build as a base and the partial update to 20060811 as the update. I suspect my test profile/extensions/setup is missing something required to crash, or contains something that masks the crash.
Does anyone have a recipe (or can someone suggest a recipe) to cause the browser to do the registration thing? things like create .qutoreg, remove compreg.dat and xpti.dat, ...
Reporter | ||
Comment 38•18 years ago
|
||
Did Adam's install-extension-and-restart-now approach in comment #34 not work? (I suspect you can do that with a profiles.ini tweak too, but I'm not sure what the tweak in question is.)
Comment 39•18 years ago
|
||
adam's recipe did not work for me.
Comment 40•18 years ago
|
||
If nobody's seen this crash since that checkin, are we confident enough to resolve this as fixed by that patch?
Assignee | ||
Comment 41•18 years ago
|
||
I'm not, since ispiked applied that patch to an Aug 10 build and is still seeing this. We're going to try debugging some more when he gets back from lunch.
Assignee | ||
Comment 42•18 years ago
|
||
I'm blind...
Assignee | ||
Comment 43•18 years ago
|
||
Comment on attachment 234078 [details] [diff] [review]
Patch that should fix this
ispiked says this does fix it.
David, the issue here is that we were crashing on intl.css, imported from global.css, imported from something (profileSelection.css? Not sure). In any case, when we tear down the document without this patch, it notifies its child sheet, which notifies its kids, which don't notify their kids. We need to notify all the way down...
Attachment #234078 -
Flags: superreview?(dbaron)
Attachment #234078 -
Flags: review?(dbaron)
Assignee | ||
Comment 44•18 years ago
|
||
We should take this on all branches and trunk...
Assignee: nobody → bzbarsky
Status: ASSIGNED → NEW
Component: General → Style System (CSS)
Flags: blocking1.8.0.7?
Priority: -- → P1
Summary: crash when restarting for update [@ nsCSSStyleSheet::SetComplete] → [FIX]crash when restarting for update [@ nsCSSStyleSheet::SetComplete]
Comment 45•18 years ago
|
||
here's a really dumb question - are there ever any cycles in the nsCSSStyleSheet graph?
Comment 46•18 years ago
|
||
Comment on attachment 234078 [details] [diff] [review]
Patch that should fix this
Import loop detection happens well before we actually construct these objects, right? (Might be worth double-checking on Hixie's infinite @import tests in the importtest.)
Anyway, r+sr=dbaron
(Maybe it'd be nice to avoid stack overflow on deep style sheet trees at some point.)
Attachment #234078 -
Flags: superreview?(dbaron)
Attachment #234078 -
Flags: superreview+
Attachment #234078 -
Flags: review?(dbaron)
Attachment #234078 -
Flags: review+
Assignee | ||
Comment 47•18 years ago
|
||
Comment on attachment 234078 [details] [diff] [review]
Patch that should fix this
> are there ever any cycles in the nsCSSStyleSheet graph?
Only if something is very majorly screwed up. There shouldn't be any.
> Import loop detection happens well before we actually construct these objects,
Yes. We do loop detection before even creating the child nsCSSStyleSheet object in LoadChildSheet. I did double-check with hixie's @import tests, and we pass as much as we used to.
Requesting branch approvals. This is a _really_ safe patch that prevents a deleted document object reference. In practice, I suspect a malicious website could trigger this crash at will just by controlling stylesheet loading speed...
Attachment #234078 -
Flags: approval1.8.1?
Attachment #234078 -
Flags: approval1.8.0.7?
Comment 48•18 years ago
|
||
Comment on attachment 234078 [details] [diff] [review]
Patch that should fix this
a=beltzner on behalf of drivers, for landing on the MOZILLA_1_8_BRANCH. Please mark fixed1.8.1 when it lands.
Attachment #234078 -
Flags: approval1.8.1? → approval1.8.1+
Comment 49•18 years ago
|
||
Fix checked in to trunk and 1.8 branch on behalf of author.
Comment 50•18 years ago
|
||
Comment on attachment 234078 [details] [diff] [review]
Patch that should fix this
approved for 1.8.0 branch, a=dveditz for drivers
Attachment #234078 -
Flags: approval1.8.0.7? → approval1.8.0.7+
Updated•18 years ago
|
Flags: blocking1.8.0.7? → blocking1.8.0.7+
Assignee | ||
Comment 51•18 years ago
|
||
Fixed on the 1.8.0 branch.
For future reference, please don't check in my patches for me unless it's an absolute "if we don't check this in tonight it'll miss the release" emergency (which it wasn't in this case, since freeze got delayed). I really like having checkin comments that actually explain what the patch is about (and point to the bug and all), so that a year from now I can tell why I wrote the code that way.
Keywords: fixed1.8.0.7
Comment 52•18 years ago
|
||
(In reply to comment #51)
> Fixed on the 1.8.0 branch.
>
> For future reference, please don't check in my patches for me unless it's an
> absolute "if we don't check this in tonight it'll miss the release" emergency
> (which it wasn't in this case, since freeze got delayed).
Understood, I'll refrain from doing so in the future. FWIW, at the time I checked it in, I thought the freeze was still happening that night (I thought it was merely delayed by a few hours).
Comment 53•18 years ago
|
||
v.fixed on both branches based on latest Talkback data and the lack of any recent comments from people (no one seems to be seeing this crash with the latest builds/updates).
Updated•13 years ago
|
Crash Signature: [@ nsCSSStyleSheet::SetComplete]
You need to log in
before you can comment on or make changes to this bug.
Description
•