Closed Bug 714320 Opened 13 years ago Closed 3 years ago

Firefox Crash @ nsStyleContext::AddChild with AMD Radeon HD 6xxx series

Categories

(Core :: Layout, defect)

10 Branch
x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox10 - ---
firefox11 - ---
firefox12 - ---
firefox20 --- affected
firefox21 --- affected
firefox22 --- affected
firefox-esr10 - ---

People

(Reporter: marcia, Unassigned)

References

Details

(4 keywords, Whiteboard: [platform-rel-AMD])

Crash Data

Seen while looking at 10.0b2 stats. https://crash-stats.mozilla.com/report/list?signature=nsStyleContext::AddChild%28nsStyleContext*%29. Not enough volume to get correlations. This was seen in other versions but in lower volume.

https://crash-stats.mozilla.com/report/index/f03b1b5d-db19-4fcd-bead-bc4ce2111229

Frame Module Signature Source
0 xul.dll nsStyleContext::AddChild layout/style/nsStyleContext.cpp:148
1 xul.dll nsStyleContext::nsStyleContext layout/style/nsStyleContext.cpp:85
2 xul.dll NS_NewStyleContext layout/style/nsStyleContext.cpp:718
3 xul.dll nsStyleSet::GetContext layout/style/nsStyleSet.cpp:621
4 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1233
5 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
6 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
7 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
8 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
9 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1576
10 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
11 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
12 xul.dll nsFrameManager::ReResolveStyleContext layout/base/nsFrameManager.cpp:1597
13 xul.dll nsFrameManager::ComputeStyleChangeFor layout/base/nsFrameManager.cpp:1683
14 xul.dll mozilla::css::RestyleTracker::ProcessRestyles layout/base/RestyleTracker.cpp:240
15 xul.dll nsCSSFrameConstructor::ProcessPendingRestyles layout/base/nsCSSFrameConstructor.cpp:11615
16 xul.dll PresShell::FlushPendingNotifications layout/base/nsPresShell.cpp:4051
17 xul.dll nsDocument::FlushPendingNotifications content/base/src/nsDocument.cpp:6268
18 xul.dll xpc_qsUnwrapThis<nsGenericElement> js/xpconnect/src/nsDOMQS.h:121
19 xul.dll nsIDOMNSElement_GetBoundingClientRect obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:6432
20 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:629
21 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
22 mozjs.dll js::types::TypeMonitorCall js/src/jsinferinlines.h:330
23 mozjs.dll UncachedInlineCall js/src/methodjit/InvokeHelpers.cpp:392
24 mozjs.dll js::mjit::stubs::UncachedCallHelper js/src/methodjit/InvokeHelpers.cpp:479
25 mozjs.dll js::mjit::stubs::CompileFunction js/src/methodjit/InvokeHelpers.cpp:305
26 mozjs.dll js::mjit::EnterMethodJIT js/src/methodjit/MethodJIT.cpp:1064
27 mozjs.dll js::mjit::JaegerShot js/src/methodjit/MethodJIT.cpp:1142
28 mozjs.dll js::Interpret js/src/jsinterp.cpp:3987
29 mozjs.dll js::types::TypeMonitorCall js/src/jsinferinlines.h:330
30 mozjs.dll UncachedInlineCall js/src/methodjit/InvokeHelpers.cpp:392
31 mozjs.dll js::mjit::stubs::UncachedCallHelper js/src/methodjit/InvokeHelpers.cpp:479
32 mozjs.dll js::mjit::stubs::CompileFunction js/src/methodjit/InvokeHelpers.cpp:305
33 mozjs.dll js::mjit::EnterMethodJIT js/src/methodjit/MethodJIT.cpp:1064
34 mozjs.dll js::mjit::JaegerShot js/src/methodjit/MethodJIT.cpp:1142
35 mozjs.dll js::RunScript js/src/jsinterp.cpp:581
36 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:647
37 mozjs.dll js::Invoke js/src/jsinterp.h:148
38 mozjs.dll js_fun_apply js/src/jsfun.cpp:1817
39 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:629
40 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
41 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
42 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:647
43 mozjs.dll js::Invoke js/src/jsinterp.cpp:679
44 mozjs.dll JS_CallFunctionValue js/src/jsapi.cpp:5199
45 xul.dll nsJSContext::CallEventHandler dom/base/nsJSEnvironment.cpp:1937
46 xul.dll nsGlobalWindow::RunTimeout dom/base/nsGlobalWindow.cpp:9307
47 xul.dll nsGlobalWindow::TimerCallback dom/base/nsGlobalWindow.cpp:9747
48 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:425
49 xul.dll nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:521
50 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:631
51 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:134
52 xul.dll xul.dll@0xbb999f
53 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:201
54 xul.dll _SEH_epilog4
55 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:175
56 xul.dll nsHTMLBodyElement::AddRef content/html/content/src/nsHTMLBodyElement.cpp:319
57 xul.dll nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:189
58 xul.dll xul.dll@0xbb999f
59 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:228
60 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3551
61 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:107
62 firefox.exe firefox.exe@0x4033
63 firefox.exe __tmainCRTStartup crtexe.c:594
64 firefox.exe _SEH_epilog4
65 kernel32.dll kernel32.dll@0x51113
66 ntdll.dll __RtlUserThreadStart
67 kernel32.dll kernel32.dll@0x62acc
68 kernel32.dll kernel32.dll@0x62acc
69 ntdll.dll LdrpGetShimEngineInterface
70 ntdll.dll _RtlUserThreadStart
71 firefox.exe pre_c_init crtexe.c:304
72 firefox.exe pre_c_init crtexe.c:304
73 @0x7ffd3fff
This is officially an explosive crash with 11041 crashes in Firefox 10 B2 on Windows according to the signature summary. I will dig into manual correlations and look at the 51 comments. In the meantime adding the relevant keywords.
OS: Mac OS X → Windows 7
Version: 9 Branch → 10 Branch
Can someone look at nightly data and get an accurate range for when this started?
Facebook appears a lot on the comments. Do the URLs in crash stats also point to facebook? Other sites?
Keywords: needURLs
Keywords: needURLs
21 crashes in 8.0.1 in the last 4 weeks. 3 crashes in 9.0.1 in the last 4 weeks. 1 crash in 10.0b1. In the last 4 weeks of data I see other versions such as the 4.0 betas and 3.6.x represented in small numbers, but trunk is not among the versions that are showing up in my query. I have tried reproducing it on facebook.com but no luck so far. What I am seeing on this machine while running Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) Gecko/20100101 Firefox/10.0 is that Firefox is periodically freezing with a few tabs open (including Facebook).
Marcia and I just looked through the crash data a bit more. Based upon the fact that: * 10.0b1 has 1 crash report in the past 4 weeks * 9.0.1 has 4 crash reports in the past 4 weeks * 10.0b2 has 16888 crash reports in the past 4 weeks we can safely assume that this bug is caused by a change that we've taken in 10.0b2. The full changeset from 10.0b1 to 10.0b2 is https://bugzilla.mozilla.org/buglist.cgi?quicksearch=710060%2C697215%2C699668%2C711195%2C712169%2C712506%2C701662%2C&list_id=1993989 In lieu of STR, I'm CC'ing the assignees of some of the more suspicious changes (JS/layout related). The investigation of this bug should be a top priority since it may force a portion of our beta audience to use other FF versions or browsers to surf Facebook. Thanks in advance.
What about 11.0a1 and 12.0a1?
(In reply to David Baron [:dbaron] from comment #7)
> What about 11.0a1 and 12.0a1?
A Socorro query shows no crashes in that signature for either 11.0a1 or 12.0a1.
Severity: normal → critical
Could be http://hg.mozilla.org/releases/mozilla-beta/rev/7a0d36baf5be (bug 697215). I think we should try backing it out; crashstats results should be quick, and I think we can live without the patch.
Wasn't this patch checked into Aurora as well as Beta? If it's the culprit, wouldn't we see crashes in 11.0a2 - I don't find any. In any case, if this looks suspicious, I am all for backing it out. As Rob said, it will be easy to verify.
Darn, we missed beta 3's build. Should we respin with that backed out to see? Can we verify this any other way than by putting a build with the backout in front of our beta audience?
A quick query shows in a 24 hr time we have accumulated over 5K in crashes. I would be in favor of respinning b3 with the backout at this point. We haven't yet been able to repro the crash but the remote testing team will be working on it later this evening.
Ok, backed out and commented in bug 697215. We'll respin the beta.
(In reply to Sheila Mooney from comment #10)
> Wasn't this patch checked into Aurora as well as Beta? If it's the culprit,
> wouldn't we see crashes in 11.0a2 - I don't find any.
Maybe some other change on Aurora prevents the crashes on Aurora and trunk? The only other reasonable candidate would be bug 701662. Can you look back through the reports to see the exact day crashes started showing up?
So the actual signature has been around for a long time at very low volume ie: <10 a week for any particular release - from 3.6 to 9.0. I searched back for the last couple of months. I see a single crash on 10.0a1 only in the past 2 weeks. I didn't find any crashes on 11.0a1, 11.0a2 or 12.0a1 in the past 3 weeks. All the crashes seem to be exclusively with 10.0b2 (20111228055358). I don't see any trend of increase/decrease on Aurora since we had so few to begin with. Looking back historically, few of these signatures appeared pre-release or beta anyhow.
From a date perspective, the first crash in this signature showed up in crash-stats on Dec 06, 2011. That is going back in time 4 weeks which is the max that Socorro allows in the UI.
This signature goes way back. You can see the odd crash in 3.6. In Socorro you are restricted to a 4 week search window but you can change the dates and go back in time. I did a few searches back in Aug and Sept. We average about 10-20 or so crashes with this signature a week across all versions. That seemed to be the steady state up until we released 10.0b2 and it really exploded.
What's the next step here? Are we going to try a backout and see if that works?
(In reply to Sheila Mooney from comment #18)
> What's the next step here? Are we going to try a backout and see if that
> works?
As Christian noted in https://bugzilla.mozilla.org/show_bug.cgi?id=714320#c13, we backed out 697215 for beta 3. We'll continue to look for STR, but if the crash volume goes back to normal amounts in beta 3, I think we can safely say it was bug 697215.
Duh...sorry, my bad.
Blocks: 697215
Sheila - Could you send this to somebody to check whether the volume goes down in beta 3? We want to make sure that all tracked bugs are assigned to the person with next action, or closed. Thanks!
Assignee: nobody → smooney
One thing I mentioned to Sheila is I am not seeing the NPSWF version being reported in the module section of these crashes. I do see it in other Windows crashes, however.
I will be monitoring what happens in b3. Nothing in b3 yet but the volume is still too low. I will update over the weekend.
I was able to look at the report that rhelmer generated (http://people.mozilla.org/~rhelmer/temp/Firefox-10.0b2-correlation/) - shows a high correlation to some ATI dlls:

Windows NT nsStyleContext::AddChild(nsStyleContext*)|EXCEPTION_ACCESS_VIOLATION_WRITE (6022 crashes)
96% (5808/6022) vs. 28% (7391/26633) atiuxpag.dll
96% (5808/6022) vs. 28% (7392/26633) atidxx32.dll
94% (5678/6022) vs. 27% (7271/26633) aticfx32.dll

The addon correlation showed:

Windows NT nsStyleContext::AddChild(nsStyleContext*)|EXCEPTION_ACCESS_VIOLATION_WRITE (6022 crashes)
90% (5420/6022) vs. 82% (21863/26633) testpilot@labs.mozilla.com (Mozilla Labs - Test Pilot, https://addons.mozilla.org/addon/13661)
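For readers unfamiliar with these correlation rows, each one compares how often a module appears in reports with this signature against how often it appears in all reports. A minimal sketch of that arithmetic follows; the struct and function names are hypothetical and this is not Socorro's code, only the numbers come from the report above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical record for one row of a correlation report.
struct ModuleCorrelation {
    long signatureWithModule;  // reports with this signature that loaded the module
    long signatureTotal;       // all reports with this signature
    long overallWithModule;    // reports of any signature that loaded the module
    long overallTotal;         // all reports in the sample
};

// Share of `part` in `whole`, as a percentage.
double percent(long part, long whole) {
    assert(whole > 0);
    return 100.0 * static_cast<double>(part) / static_cast<double>(whole);
}

// Percentage rounded to the nearest integer, which is consistent with
// the figures printed in the report.
long roundedPercent(long part, long whole) {
    return std::lround(percent(part, whole));
}

// How many times more common the module is among this signature's reports
// than among all reports. Values near 1.0 suggest no real association.
// Assumes overallWithModule is non-zero.
double lift(const ModuleCorrelation& row) {
    return percent(row.signatureWithModule, row.signatureTotal) /
           percent(row.overallWithModule, row.overallTotal);
}
```

With the atiuxpag.dll row (5808/6022 vs. 7391/26633) the ratio comes out at roughly 3.5, while the Test Pilot row (5420/6022 vs. 21863/26633) gives roughly 1.1, which is why the ATI modules stand out and the add-on does not.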
When I drilled down to the by version module report (http://people.mozilla.org/~rhelmer/temp/Firefox-10.0b2-correlation/20120106_Firefox_10.0-interesting-modules-with-versions.txt), it seems correlated to different versions:

95% (548/575) vs. 28% (7391/26633) atiuxpag.dll
    0% (0/575) vs. 0% (92/26633) 8.14.1.6117
    0% (0/575) vs. 0% (30/26633) 8.14.1.6126
    0% (0/575) vs. 0% (20/26633) 8.14.1.6136
    0% (0/575) vs. 0% (49/26633) 8.14.1.6143
    0% (0/575) vs. 0% (35/26633) 8.14.1.6150
    22% (125/575) vs. 6% (1568/26633) 8.14.1.6160
    6% (37/575) vs. 2% (496/26633) 8.14.1.6170
    12% (70/575) vs. 3% (696/26633) 8.14.1.6178
    5% (26/575) vs. 2% (405/26633) 8.14.1.6187
    3% (19/575) vs. 1% (248/26633) 8.14.1.6195
    0% (0/575) vs. 0% (6/26633) 8.14.1.6203
    2% (13/575) vs. 1% (196/26633) 8.14.1.6210
    34% (196/575) vs. 9% (2378/26633) 8.14.1.6214
    1% (3/575) vs. 0% (56/26633) 8.14.1.6221
    7% (38/575) vs. 2% (463/26633) 8.14.1.6226
    0% (1/575) vs. 0% (78/26633) 8.14.1.6229
    1% (3/575) vs. 0% (53/26633) 8.14.1.6233
    1% (8/575) vs. 1% (137/26633) 8.14.1.6237
    2% (9/575) vs. 1% (348/26633) 8.14.1.6243
    0% (0/575) vs. 0% (37/26633) 8.14.1.6248
comment 25 makes it look like atiuxpag.dll versions 8.14.1.6160 and newer are related to something that causes the crash, but versions 8.14.1.6150 and older are not.
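The cut-off suggested above amounts to a dotted-version comparison. A minimal sketch, assuming four numeric fields per version string; `parseVersion` and `isAtOrAboveCutoff` are hypothetical names, and this is not the blocklist code from bug 722538. Note that the table also lists a couple of newer versions (8.14.1.6203, 8.14.1.6248) with no crashes in the sample, so the cut-off is only a reading of the data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: parse "a.b.c.d" into four integers.
// Missing fields stay 0; std::stoi throws on non-numeric fields.
std::array<int, 4> parseVersion(const std::string& text) {
    assert(!text.empty());
    std::array<int, 4> parts{{0, 0, 0, 0}};
    std::istringstream in(text);
    std::string field;
    for (std::size_t i = 0; i < parts.size() && std::getline(in, field, '.'); ++i) {
        parts[i] = std::stoi(field);
    }
    return parts;
}

// True when `version` is 8.14.1.6160 or newer, the boundary the comment
// above reads from the table. std::array compares lexicographically,
// which matches field-by-field version ordering.
bool isAtOrAboveCutoff(const std::string& version) {
    return parseVersion(version) >= parseVersion("8.14.1.6160");
}
```

Comparing parsed integers rather than raw strings avoids misordering fields of different widths, such as 8.14.1.999 against 8.14.1.6160.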
it would explain why i have not had any of these crashes, all my systems have either nvidia gpu's or intel igp's. i swear, the junk dll's amd drivers have running 24/7 are quickly bringing them to Creative labs level of bloat.
Since 10.0 Beta 3 is live, there have been no crashes in this version so the backout of bug 697215 did it.
I'm more than a little curious about why current Aurora works with bug 697215 applied and Beta crashes out. Are we sure Aurora doesn't suffer from the same problem, and just isn't tested widely enough?
Assigning to roc for further investigation since we haven't seen a crash in beta 3. Also untracking for FF10, but tracking for FF11 in case, as Jeff suggests, Aurora isn't yet being tested enough to uncover this crasher.
Assignee: smooney → roc
It's no longer a top crasher in 10.0 Beta. There's a spike in crashes from 12.0a1/20120114. The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=964b118ac852&tochange=3eaa7d9f1c69
Keywords: topcrash
(In reply to Scoobidiver from comment #31)
> There's a spike in crashes from 12.0a1/20120114.
The spike lasts one (build)day. Now there's a spike in 11.0a2/20120131. All crashes I checked happen with AMD Radeon HD 6xxx series.
Blocks: 605780
Depends on: 722538
Summary: Firefox Crash [@ nsStyleContext::AddChild(nsStyleContext*) ] → Firefox Crash @ nsStyleContext::AddChild with AMD Radeon HD 6xxx series
With Beta 10, this ended up being an explosive crasher. Have we seen anything similar for Firefox 11? If so, we should consider backing out bug 697215 for Firefox 11 Beta 4.
i think you should contact AMD and enquire why their drivers are the most affected first.
Signature summary shows there was only 1 crash in 11.0b1. I will check some of the other signatures that were correlated to Radeon as well and see what their volume is.
(In reply to Alex Keybl [:akeybl] from comment #33)
> With Beta 10, this ended up being an explosive crasher.
It's probably another crash signature form of other bugs that depend on bug 722538, which is applicable to Fx 10.
Most of these still seem to be in 10.0b2 for some reason. Very few in FF11 at all. I say we remove the tracking flag for now, see if it comes back in significant volume. There are none in FF11b3 and only 1 in FF11b2. 10.0.1 has 5 and none in 10.0.2 so far.
We are probably handling this with the blocklist in bug 722538.
Am getting crashes with this signature with the 20120319 moz-central nightly build. Am using win7 32bit with D2D off. AMD E-350 with HD6310 GPU.

https://crash-stats.mozilla.com/report/index/bp-f80285f6-a022-45d1-ae3f-ebc052120319
https://crash-stats.mozilla.com/report/index/bp-5d5cce50-29bd-4cad-8b9e-bbb6f2120319

About:support

Application Basics
Name: Firefox
Version: 14.0a1
User Agent: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20120319 Firefox/14.0a1
Profile Folder: Show Folder
Enabled Plugins: about:plugins
Build Configuration: about:buildconfig
Crash Reports: about:crashes
Memory Use: about:memory

Extensions
Name, Version, Enabled, ID
Adblock Plus, 2.0.4a.3417, true, {d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}
British English Dictionary, 1.19.1, true, en-GB@dictionaries.addons.mozilla.org
Close Tab By Double Click, 1.14, true, close@doubleclick
Element Hiding Helper for Adblock Plus, 1.2.2a.410, true, elemhidehelper@adblockplus.org
Nightly Tester Tools, 3.2.1.1, true, {8620c15f-30dc-4dba-a131-7c5d20cf4a29}
NoScript, 2.3.5, true, {73a6fe31-595d-460b-a920-fcc0f8843232}
Adobe Acrobat - Create PDF, 1.2, false, web2pdfextension@web2pdf.adobedotcom
Readability, 2.1, false, readability@readability.com
Zotero, 3.0.3, false, zotero@chnm.gmu.edu

Important Modified Preferences
Name Value
accessibility.typeaheadfind.flashBar 0
browser.cache.disk.smart_size.enabled false
browser.cache.disk.smart_size.first_run false
browser.cache.disk.smart_size_cached_value 1048576
browser.places.smartBookmarksVersion 3
browser.startup.homepage http://www.google.co.uk
browser.startup.homepage_override.buildID 20120319031122
browser.startup.homepage_override.mstone 14.0a1
extensions.checkCompatibility false
extensions.checkCompatibility.3.6 false
extensions.checkCompatibility.3.6b false
extensions.checkCompatibility.3.6p false
extensions.checkCompatibility.3.6pre false
extensions.checkCompatibility.3.7a false
extensions.checkCompatibility.4.0b false
extensions.checkCompatibility.4.0p false
extensions.checkCompatibility.4.0pre false
extensions.checkCompatibility.4.2 false
extensions.checkCompatibility.4.2a false
extensions.checkCompatibility.4.2a1 false
extensions.checkCompatibility.4.2a1pre false
extensions.checkCompatibility.4.2b false
extensions.checkCompatibility.5.0 false
extensions.checkCompatibility.5.0a false
extensions.checkCompatibility.5.0b false
extensions.checkCompatibility.6.0 false
extensions.checkCompatibility.6.0a false
extensions.checkCompatibility.6.0b false
extensions.checkCompatibility.7.0 false
extensions.checkCompatibility.7.0a false
extensions.checkCompatibility.7.0b false
extensions.checkCompatibility.nightly false
extensions.lastAppVersion 14.0a1
gfx.content.azure.enabled false
gfx.direct2d.disabled true
gfx.direct3d.prefer_10_1 true
gfx.font_rendering.cleartype_params.force_gdi_classic_for_families Arial,Courier New,Tahoma,Trebuchet MS,Verdana,Segoe UI,Lucida Grande
image.mem.discardable false
network.cookie.cookieBehavior 1
network.cookie.prefsMigrated true
network.prefetch-next false
places.database.lastMaintenance 1332145200
places.history.expiration.transient_current_max_pages 69886
places.history.expiration.transient_optimal_database_size 111816048
privacy.donottrackheader.enabled true
privacy.popups.showBrowserMessage false
privacy.sanitize.migrateFx3Prefs true
security.warn_viewing_mixed false

Graphics
Adapter Description: AMD Radeon HD 6310 Graphics
Vendor ID: 0x1002
Device ID: 0x9802
Adapter RAM: 384
Adapter Drivers: aticfx32 aticfx32 aticfx32 atiumdag atidxx32 atiumdva
Driver Version: 8.950.0.0
Driver Date: 2-14-2012
Direct2D Enabled: false
DirectWrite Enabled: false (6.1.7601.17776)
ClearType Parameters: ClearType parameters not found
WebGL Renderer: Google Inc. -- ANGLE (AMD Radeon HD 6310 Graphics) -- OpenGL ES 2.0 (ANGLE 1.0.0.963)
GPU Accelerated Windows: 1/1 Direct3D 9

JavaScript
Incremental GC: 1

Library Versions
Library, Expected minimum version, Version in use
NSPR, 4.9, 4.9
NSS, 3.13.3.0 Basic ECC, 3.13.3.0 Basic ECC
NSS Util, 3.13.3.0, 3.13.3.0
NSS SSL, 3.13.3.0 Basic ECC, 3.13.3.0 Basic ECC
NSS S/MIME, 3.13.3.0 Basic ECC, 3.13.3.0 Basic ECC
(In reply to DB Cooper from comment #39)
> Am getting crashes with this signature with the 20120319 moz-central nightly
> build. Am using win7 32bit with D2D off. AMD E-350 with HD6310 GPU.
We're trying to move forward with Bug 722538 after some speed bumps on staging last week. In the meantime, we should try to reproduce with our most similar hardware and add-ons in QA. Roc - with D2D already disabled, will the blocklist in bug 722538 have an effect on this top crasher? Thanks.
This page reliably produces the crash when you scroll down through it: http://www.flatpanelshd.com/review.php?subaction=showfull&id=1331899332
(In reply to DB Cooper from comment #39)
> Device ID
> 0x9802
> Direct2D Enabled
> false
If it happens with D2D disabled, the blocklist planned in bug 722538 won't be helpful.
(In reply to Scoobidiver from comment #42)
> (In reply to DB Cooper from comment #39)
> > Device ID
> > 0x9802
> > Direct2D Enabled
> > false
> If it happens with D2D disabled, the blocklist planned in bug 722538 won't
> be helpful.
To be clear, this crash has only happened with today's (20120319) moz-central nightly. Build config: Built from http://hg.mozilla.org/mozilla-central/rev/58a2cd0203ee
(In reply to DB Cooper from comment #43)
> To be clear, this crash has only happened with today's (20120319)
> moz-central nightly.
The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e94edfdb1f5b&tochange=58a2cd0203ee
It might be a regression from bug 666041.
Keywords: reproducible
(In reply to Scoobidiver from comment #44)
> (In reply to DB Cooper from comment #43)
> > To be clear, this crash has only happened with today's (20120319)
> > moz-central nightly.
> The regression range for the spike is:
> http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e94edfdb1f5b&tochange=58a2cd0203ee
> It might be a regression from bug 666041.
Tomorrow I can test a try build with that bug's patches backed out if you like.
It's #7 top crasher in 14.0a1 over the last 3 days.

(In reply to DB Cooper from comment #45)
> Tomorrow I can test a try build with that bug's patches backed out if you
> like.
Yes, please.
Keywords: topcrash
(In reply to Scoobidiver from comment #46)
> It's #7 top crasher in 14.0a1 over the last 3 days.
>
> (In reply to DB Cooper from comment #45)
> > Tomorrow I can test a try build with that bug's patches backed out if you
> > like.
> Yes, please.
I'd need someone to compile that build for me though, unfortunately.
[Triage Comment] Roc - can you provide a try build with the necessary backouts? Also, should we be looking at filing a separate bug for this as it was first found in a recent nightly?
Bug 666041 is almost certainly not the problem.
Also, this went away again in today's builds (March 20).
(In reply to David Baron [:dbaron] from comment #50)
> Also, this went away again in today's builds (March 20).
It's the second bug that lasts one build after bug 736507 on March 16. It's odd.
Any luck tracking this down further? Is there anything QA can do to help?
Removing qawanted as per discussion in channel meeting -- will be covered under the AMD blocklisting meta issue.
Keywords: qawanted
We just resolved bug 722538 on Friday. We'll need to see whether the crash numbers fall now. Since this is no longer a top crasher, we'll untrack for FF12.
There is a spike in crashes from 14.0a1/20120419. The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0c7e2911be75&tochange=da53be684794
Looks like we should keep an eye on this for 14, along with bug 700288. What else can we do here to close in on a fix? There is mention of a static variable not initialized somewhere in bug 700288 comment 29 - can someone try to track that down and create some builds for qa to test?
Depends on: 755974
No longer appears to be a top crasher - untracking.
Renominating as this has moved up to the top of the crash data in early Firefox 14b9 crash stats. It doesn't appear as if this was a problem in b8.
(In reply to Marcia Knous [:marcia] from comment #59)
> Renominating as this has moved up to the top of the crash data in early
> Firefox 14b9 crash stats. It doesn't appear as if this was a problem in b8.
It's the normal behavior of this crash that appears or disappears depending on the build (likely an uninitialized static variable). But this time, its usual brother crashes are not there but instead there's a new one, bug 768383. Maybe the Beta regression range can help finding the underlying issue: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=f8d3886db65a&tochange=d050090e578c
Keywords: topcrash
This has jumped up quite a bit in volume, 24563 crashes since we shipped B9.
(In reply to Marcia Knous [:marcia] from comment #61)
> This has jumped up quite a bit in volume, 24563 crashes since we shipped B9.
50% of crashes in 14.0b9 is indeed a big jump!
Let's wait to see if this continues to trend in b10 before devoting much qa/engineering time to this bug.
Is it possible that this is showing up with a different signature when it's not showing up under this one?
Also, the 14.0b9 crashes are EXCEPTION_BOUNDS_EXCEEDED crashes, unlike the 10.0b2 crashes which were EXCEPTION_ACCESS_VIOLATION_WRITE. Do they have similar correlations or is it no longer an ATI-related problem?
(In reply to David Baron [:dbaron] from comment #65)
> Is it possible that this is showing up with a different signature when it's
> not showing up under this one?
Its brother crashes in 14.0b9 are bug 768383 and bug 768560.

(In reply to David Baron [:dbaron] from comment #66)
> Do they have similar correlations or is it no longer an ATI-related problem?
Neither https://crash-analysis.mozilla.com/rkaiser/, nor https://crash-analysis.mozilla.com/crash_analysis/ contain correlations per device and vendor IDs, but I checked manually some crash reports and there were those ATI/AMD devices (0x98..).
There are no crashes in 14.0b10.
Keywords: topcrash
Depends on: 772330
It's back in 16.0a1/20120713030548.
Bad news! It's #1 top crasher in 10.0.6 ESR with 19% of all crashes.
How does this line trigger an EXCEPTION_ACCESS_VIOLATION_WRITE? http://hg.mozilla.org/releases/mozilla-esr10/annotate/6c432561c1fd/layout/style/nsStyleContext.cpp#l137 The only thing we seem to be writing to is the newly defined "nsStyleContext **list". "aChild->mRuleNode->IsRoot() ? &mEmptyChild : &mChild;" looks like they're all reads, although I guess there could be some operators going on in there. If there is memory writing going on here, and if someone can find a reproducible case (at 18% of crashes it should be possible to catch) then this is a potential security vulnerability.
Group: core-security
There is no operator magic there. Those should all be pure reads. The most obvious conclusion is that something corrupted page permissions on the stack? ;)
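To make the read/write distinction above concrete, here is a simplified stand-alone model of a child-list insertion. It is a sketch, not the Gecko source: the types `RuleNode` and `StyleContext` and their layout are assumptions for illustration, and only the member names `mRuleNode`, `mChild`, `mEmptyChild` and the ternary come from the quoted line. In this model the ternary only reads and takes addresses, and the single store through `list` happens at the end.

```cpp
#include <cassert>

// Hypothetical stand-in for a rule node; a node without a parent is the root.
struct RuleNode {
    RuleNode* parent = nullptr;
    bool IsRoot() const { return parent == nullptr; }
};

// Hypothetical stand-in for a style context with two child lists,
// each a circular doubly-linked list threaded through the siblings.
struct StyleContext {
    RuleNode* mRuleNode = nullptr;
    StyleContext* mChild = nullptr;
    StyleContext* mEmptyChild = nullptr;
    StyleContext* mPrevSibling = this;  // a lone node links to itself
    StyleContext* mNextSibling = this;

    void AddChild(StyleContext* aChild) {
        assert(aChild && aChild->mRuleNode);

        // Selection: reads aChild->mRuleNode and its parent pointer, then
        // takes the address of a member of *this. Nothing is written here.
        StyleContext** list =
            aChild->mRuleNode->IsRoot() ? &mEmptyChild : &mChild;

        StyleContext* head = *list;  // read through the selected pointer
        if (head) {
            // Link the new child in front of the existing head.
            aChild->mNextSibling = head;
            aChild->mPrevSibling = head->mPrevSibling;
            head->mPrevSibling->mNextSibling = aChild;
            head->mPrevSibling = aChild;
        }
        *list = aChild;  // the only store through `list`
    }
};
```

In this model a faulting store on the last line would require `list` itself to hold a bad address, which the selection line cannot produce from a valid `this`. That matches the puzzlement in the comments rather than explaining the crash.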
There are nearly a dozen of these bugs (the crash comes and goes under a few different signatures, but isn't around in most releases); bug 772330 is the metabug.
We'll track for the ESR shipping alongside FF16 given this is a top crasher, but it's not clear if there will be progress in that timeframe.
Building a new version is usually enough to make it disappear, so it will likely be fixed in 10.0.7esr (15).
As expected, there are no crashes in 10.0.7esr.
Closing during CritSmash triage. We are not seeing this crash in 17+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
It shows up in 19.0b4 along with bug 837371.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
This affects AddChild, like bug 839280, although AddChild was at a pretty different address back in FF14b9 (439CE). However, I'm not so sure what was going on. We crash at an instruction that's actually valid, so there's no clear evidence we took an unexpected jump. The code looks like this:

100439CE: 8B 51 1C     mov edx,dword ptr [ecx+1Ch]
100439D1: 83 7A 04 00  cmp dword ptr [edx+4],0
100439D5: 74 0C        je 100439E3
100439D7: 83 C0 04     add eax,4
100439DA: 8B 10        mov edx,dword ptr [eax]
100439DC: 85 D2        test edx,edx
100439DE: 75 08        jne 100439E8
100439E0: 89 08        mov dword ptr [eax],ecx
100439E2: C3           ret

We crash at 100439E0. I can't tell whether the value at EAX is valid or where it came from. The stack shows that we've only just entered AddChild, but I don't know how EAX has a bad address here. AddChild never modifies EAX to point to invalid memory AFAICT.
I had a poke around with a disassembler to see if a wild jump near here could put us in the middle of an instruction sequence that would corrupt EAX and then reach 100439E0, but I couldn't find one. I can't rule out something like that, though.
There are crashes in 21.0a1/20130213031137.
Crash Signature: [@ nsStyleContext::AddChild(nsStyleContext*) ] → [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)]
There are also crashes in 20.0a2/20130218.
Status: REOPENED → ASSIGNED
Keywords: topcrash
When 20.0 switched from Aurora to Beta, crashes stopped.
Keywords: topcrash
It spiked in 22.0a1/20130325105600.
(In reply to Daniel Veditz [:dveditz] from comment #71)
> How does this line trigger an EXCEPTION_ACCESS_VIOLATION_WRITE?
> http://hg.mozilla.org/releases/mozilla-esr10/annotate/6c432561c1fd/layout/style/nsStyleContext.cpp#l137
>
> The only thing we seem to be writing to is the newly defined "nsStyleContext
> **list". "aChild->mRuleNode->IsRoot() ? &mEmptyChild : &mChild;" looks like
> they're all reads, although I guess there could be some operators going on
> in there. If there is memory writing going on here, and if someone can find
> a reproducible case (at 18% of crashes it should be possible to catch) then
> this is a potential security vulnerability.
Boris replied in comment 72 that it's not a write. Also, roc's patch in bug 839270 added a bunch of NOPs before that line which did seem to help (at least for that particular build): https://hg.mozilla.org/releases/mozilla-beta/rev/23f455023faf

I don't see a reason to keep this bug hidden / sec-high. We already have a dozen or so crash bugs with "AMD Radeon" in the subject that are public so I think we can open this one too (if we hide comments with URLs). I don't see anything more sensitive here compared to the other bugs.
Flags: needinfo?(dveditz)
OK.
Group: core-security
Flags: needinfo?(dveditz)
Keywords: sec-vector
Keywords: sec-high
22.0a2/20130511 is a bad build.
Let's not pollute the topcrash list with what we know is a single bad Aurora build, please. We don't need to put any special radar on the individual signatures of the Radeon thing as long as it's not in a beta or release we shipped to hundreds of thousands of people at least. I'll invoke the "bugs that spearhead investigation or fixes across a large collection of crashes" clause on the meta tracker bug of those crashes, though.
Keywords: topcrash
Crash Signature: [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)] → [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)] [@ nsStyleContext::AddChild ] [@ xul.dll@…
The Socorro reports show that this crash signature hasn't appeared in the last 28 days: [@ xul.dll@0x640256 | nsStyleSet::GetContext ] Socorro also shows that this signature is still present in the last 7 days: [@ nsStyleContext::AddChild ] Considering this, I don't think we should close this bug until this last crash signature is no longer present.
Assignee: roc → nobody
Status: ASSIGNED → NEW
platform-rel: --- → ?
Whiteboard: [platform-rel-AMD]
platform-rel: ? → ---

According to the crash stats, there are no new crashes in the last 6 months - with the signatures from this report.
Closing this as Resolved Worksforme. Please reopen it if this crash appears again on the latest Firefox versions.

Status: NEW → RESOLVED
Closed: 12 years ago → 3 years ago
Resolution: --- → WORKSFORME