Closed Bug 900453 Opened 11 years ago Closed 9 years ago

Intermittent test_fullscreen-api.html | application crashed [@ JSContext::restoreFrameChain()] (after "Assertion failure: savedFrameChain_ > 0, at ../../../js/src/vm/Stack.h:1209")

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox39 --- wontfix
firefox40 --- wontfix
firefox41 --- fixed
firefox-esr31 --- unaffected
firefox-esr38 --- unaffected

People

(Reporter: cbook, Assigned: xidorn)

References

()

Details

(Keywords: crash, intermittent-failure)

Crash Data

Rev4 MacOSX Lion 10.7 mozilla-inbound debug test mochitest-1 on 2013-08-01 03:12:29 PDT for push db6598c7675f slave: talos-r4-lion-034 Assertion failure: savedFrameChain_ > 0, at ../../../js/src/vm/Stack.h:1208 PROCESS-CRASH | /tests/content/html/content/test/test_fullscreen-api.html | application crashed [@ JSContext::restoreFrameChain()] 03:38:59 INFO - Crash dump filename: /var/folders/qd/srwd5f710sj0fcl9z464lkj00000gn/T/tmpBr6RXR/minidumps/6A743EF3-88DC-4664-B757-9A8948F160D1.dmp 03:38:59 INFO - Operating system: Mac OS X 03:38:59 INFO - 10.7.2 11C74 03:38:59 INFO - CPU: amd64 03:38:59 INFO - family 6 model 23 stepping 10 03:38:59 INFO - 2 CPUs 03:38:59 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS 03:38:59 INFO - Crash address: 0x0 03:38:59 INFO - Thread 0 (crashed) 03:38:59 INFO - 0 XUL!JSContext::restoreFrameChain() [Stack.h:db6598c7675f : 1208 + 0x0] 03:38:59 INFO - rbx = 0x00007fff78967630 r12 = 0x0000000000000003 03:38:59 INFO - r13 = 0x00007fff5fbfd617 r14 = 0x000000011f8b3228 03:38:59 INFO - r15 = 0x0000000106aff200 rip = 0x00000001037b4743 03:38:59 INFO - rsp = 0x00007fff5fbfd490 rbp = 0x00007fff5fbfd4b0 03:38:59 INFO - Found by: given as instruction pointer in context 03:38:59 INFO - 1 XUL!JS_RestoreFrameChain(JSContext*) [jsapi.cpp:db6598c7675f : 5466 + 0x7] 03:38:59 INFO - rbx = 0x000000011f8b3140 r12 = 0x000000010017c1a8 03:38:59 INFO - r13 = 0x00007fff5fbfd617 r14 = 0x00007fff5fbfd4c8 03:38:59 INFO - r15 = 0x000000010017c1b0 rip = 0x00000001037857ae 03:38:59 INFO - rsp = 0x00007fff5fbfd4c0 rbp = 0x00007fff5fbfd4e0 03:38:59 INFO - Found by: call frame info 03:38:59 INFO - 2 XUL!XPCJSContextStack::Pop() [XPCJSContextStack.cpp:db6598c7675f : 52 + 0x8] 03:38:59 INFO - rbx = 0x000000011f8b3140 r12 = 0x000000010017c1a8 03:38:59 INFO - r13 = 0x00007fff5fbfd617 r14 = 0x0000000000000000 03:38:59 INFO - r15 = 0x000000010017c1b0 rip = 0x00000001022b48cf 03:38:59 INFO - rsp = 0x00007fff5fbfd4f0 rbp = 0x00007fff5fbfd510 03:38:59 INFO - Found by: call frame info 03:38:59 INFO - 3 XUL!nsXPConnect::AfterProcessNextEvent(nsIThreadInternal*, unsigned int) [nsXPConnect.cpp:db6598c7675f : 1334 + 0x4] 03:38:59 INFO - rbx = 0x000000010017aea0 r12 = 0x0000000000000000 03:38:59 INFO - r13 = 0x00007fff5fbfd617 r14 = 0x000000010013d410 03:38:59 INFO - r15 = 0x0000000000000000 rip = 0x000000010230c9dd 03:38:59 INFO - rsp = 0x00007fff5fbfd520 rbp = 0x00007fff5fbfd530 03:38:59 INFO - Found by: call frame info 03:38:59 INFO - 4 XUL!_ZThn8_N11nsXPConnect21AfterProcessNextEventEP17nsIThreadInternalj [nsXPConnect.cpp:db6598c7675f : 1197 + 0x8] 03:38:59 INFO - rbx = 0x000000010013d3d0 r12 = 0x0000000000000000 03:38:59 INFO - r13 = 0x00007fff5fbfd617 r14 = 0x000000010013d410 03:38:59 INFO - r15 = 0x0000000000000000 rip = 0x000000010230ce5d 03:38:59 INFO - rsp = 0x00007fff5fbfd540 rbp = 0x00007fff5fbfd540 03:38:59 INFO - Found by: call frame info
I added this assert as part of the interpreter stack rewrite. It means JS_RestoreFrameChain is called but there's no matching JS_SaveFrameChain call for the topmost activation Activation::savedFrameChain_ is incremented when we save the frame chain and decremented when we restore it. Bobby, any idea what's going on here based on the stack trace?
Flags: needinfo?(bobbyholley+bmo)
Yikes, that's bad. Aside from one well-understood exception in nsJSUtils.cpp, the browser only ever calls JS_SaveFrameChain when pushing JSContexts. And for the most part, we only do _that_ with stack-scoped classes. Some of those classes have annoying potentially-non-LIFO Maybe<> semantics, but we try to catch those with assertions. The two places we don't use stack-scoped classes are callback situations - one is a callback from JS-Ctypes ( mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSRuntime.cpp#1221 ), and the other is a callback from XPCOM when we spin the event loop ( http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsXPConnect.cpp#1162 ). We appear to be triggering this assertion in the latter case, though there's no guarantee that the XPCOM callback is the one botching the invariant. Indeed, we have pretty explicit machinery in place there with mEventDepth to make sure we never call Pop() more than we call Push(). Tomcat, can we get the rest of the stack, or a link to the log? How often is this happening? Any correlations?
Flags: needinfo?(bobbyholley+bmo) → needinfo?(cbook)
Log link is in the URL field of the bug.
bholley: as ryan mentioned. Also there was/is a minidump file Crash dump filename: /var/folders/qd/srwd5f710sj0fcl9z464lkj00000gn/T/tmpBr6RXR/minidumps/6A743EF3-88DC-4664-B757-9A8948F160D1.dmp on talos-r4-lion-034 we could try to see if the minidump is already there and for that we would need to ask releng.
Flags: needinfo?(cbook)
My gut feeling is that it has something to do with the complicated interactions between OS code and XPCOM when we use the fullscreen API. A minidump wouldn't be nearly as useful as STR, and those STR would probably be better directed at smichaud, though I can't say for sure. It looks like there are a number of other bugs with the fullscreen tests, and it's disabled on quite a few platforms.
(In reply to Bobby Holley (:bholley) from comment #5) > My gut feeling is that it has something to do with the complicated > interactions between OS code and XPCOM when we use the fullscreen API. A > minidump wouldn't be nearly as useful as STR, and those STR would probably > be better directed at smichaud, though I can't say for sure. It looks like > there are a number of other bugs with the fullscreen tests, and it's > disabled on quite a few platforms. (In reply to TinderboxPushlog Robot from comment #6) > philor > https://tbpl.mozilla.org/php/getParsedLog.php?id=26677373&tree=Mozilla-Aurora > Rev4 MacOSX Lion 10.7 mozilla-aurora opt test mochitest-1 on 2013-08-17 > 01:48:59 > revision: 5ccb6826afa7 > slave: talos-r4-lion-055 > > TEST-UNEXPECTED-FAIL | > /tests/content/html/content/test/test_fullscreen-api.html | Exited with code > 1 during test run > PROCESS-CRASH | /tests/content/html/content/test/test_fullscreen-api.html | > application crashed [@ JSContext::findVersion() const] > Return code: 256 The failure in c#6 was also the first job to fail today for our "hdiutil" failures on this slave class, which ends up NOT allowing automation to reboot the system (causing it to fail many many more times) and prior to teh crash the log has messages about inconsistent OS state. c.f. Bug 898519 and my copy/paste of the crash and some lines of context in its attachment https://bugzilla.mozilla.org/attachment.cgi?id=791741
Different signature, but same issue it appears. This one also led to the slave being stuck in an hdiutil error loop. https://tbpl.mozilla.org/php/getParsedLog.php?id=26723180&tree=Fx-Team
Summary: Intermittent TEST-UNEXPECTED-FAIL | /tests/content/html/content/test/test_fullscreen-api.html | Exited with code 1 during test run | Assertion failure: savedFrameChain_ > 0, at ../../../js/src/vm/Stack.h:1208| crashed [@ JSContext::restoreFrameChain()] → Intermittent test_fullscreen-api.html | application crashed [@ JSContext::restoreFrameChain()] (after "Assertion failure: savedFrameChain_ > 0, at ../../../js/src/vm/Stack.h:1209")
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #8) > Different signature, but same issue it appears. This one also led to the > slave being stuck in an hdiutil error loop. > https://tbpl.mozilla.org/php/getParsedLog.php?id=26723180&tree=Fx-Team Same again: https://tbpl.mozilla.org/php/getParsedLog.php?id=27214461&tree=Mozilla-Inbound
Test disabled on OS X 10.7 and 10.8, for intermittently breaking the OS X slaves (was already disabled on 10.8) meaning they require a reboot: remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/bbd8b29a4261
Whiteboard: [test disabled on OS X 10.7 & 10.8][leave open]
Assignee: general → nobody
We have been using the same system API for Fullscreen API among all OS X versions since bug 1105939. As that API has been using on OS X 10.6 for a long time, and doesn't seem to break 10.6 slaves, I suppose it is unlikely to break 10.7+ slaves now.
Blocks: 1170369
Status: NEW → RESOLVED
Closed: 9 years ago
Depends on: 1105939
Resolution: --- → FIXED
Whiteboard: [test disabled on OS X 10.7 & 10.8][leave open]
Assignee: nobody → quanxunzhen
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.