Closed Bug 1099370 Opened 10 years ago Closed 10 years ago

Intermittent browser_widget.js | application crashed [@ mozalloc_abort(char const*)] after FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks Conditions: [{"name":"DevTools: Wait until toolbox...

Categories

(DevTools :: Inspector, defect)

x86
Linux
defect
Not set
normal

Tracking

(firefox36 unaffected, firefox37 fixed, firefox38 fixed, firefox-esr31 unaffected)

RESOLVED FIXED
Firefox 38
Tracking Status
firefox36 --- unaffected
firefox37 --- fixed
firefox38 --- fixed
firefox-esr31 --- unaffected

People

(Reporter: RyanVM, Assigned: sjakthol)

References

Details

(Keywords: crash, intermittent-failure)

Attachments

(2 files)

11:04:09 INFO - 3279 INFO TEST-START | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js 11:04:10 INFO - 3280 INFO TEST-PASS | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js | Found WebIDE button 11:04:10 INFO - 3281 INFO TEST-PASS | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js | WebIDE button uninstalled 11:04:10 INFO - 3282 INFO Closing WebIDE 11:04:10 INFO - 3283 INFO Console message: [JavaScript Error: "not well-formed" {file: "file:///builds/slave/test/build/tests/mochitest/chrome/browser/devtools/webide/test/addons/simulators.json" line: 1 column: 1 source: "{"}] 11:04:10 INFO - 3284 INFO WebIDE closed 11:04:11 INFO - 3285 INFO TEST-OK | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js | took 1332ms 11:04:21 INFO - WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1660,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1660","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:548","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/maybeRunTests/<:683","null:null:0"]}] Barrier: ShutdownLeaks: Wait for cleanup to be finished before checking for leaks 11:05:12 INFO - FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks Conditions: [{"name":"DevTools: Wait until toolbox is destroyed","state":"(none)","filename":"resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js","lineNumber":1660,"stack":["resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js:Toolbox.prototype.destroy/leakCheckObserver:1660","chrome://mochikit/content/browser-test.js:Tester.prototype.nextTest</<:548","chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/maybeRunTests/<:683","null:null:0"]}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources. 11:05:12 INFO - [Parent 1806] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1660 11:05:12 INFO - [Parent 1806] ###!!! ABORT: file resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/framework/toolbox.js, line 1660 11:05:13 INFO - TEST-INFO | Main app process: killed by SIGSEGV 11:05:13 INFO - 3286 INFO checking window state 11:05:13 INFO - 3287 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js | application terminated with exit code 11 11:05:13 INFO - runtests.py | Application ran for: 0:36:00.869964 11:05:13 INFO - zombiecheck | Reading PID log: /tmp/tmpOxqJLNpidlog 11:05:13 INFO - ==> process 1806 launched child process 1846 11:05:13 INFO - zombiecheck | Checking for orphan process with PID: 1846 11:05:13 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/fx-team-linux/1415987176/firefox-36.0a1.en-US.linux-i686.crashreporter-symbols.zip 11:05:39 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/5a45c836-50fc-e333-09a84154-05d656f2.dmp 11:05:39 INFO - mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/5a45c836-50fc-e333-09a84154-05d656f2.extra 11:05:39 WARNING - PROCESS-CRASH | chrome://mochitests/content/browser/browser/devtools/webide/test/browser_widget.js | application crashed [@ mozalloc_abort(char const*)] 11:05:39 INFO - Crash dump filename: /tmp/tmpSkYDwW.mozrunner/minidumps/5a45c836-50fc-e333-09a84154-05d656f2.dmp 11:05:39 INFO - Operating system: Linux 11:05:39 INFO - 0.0.0 Linux 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686 11:05:39 INFO - CPU: x86 11:05:39 INFO - GenuineIntel family 6 model 44 stepping 2 11:05:39 INFO - 1 CPU 11:05:39 INFO - Crash reason: SIGSEGV 11:05:39 INFO - Crash address: 0x0 11:05:39 INFO - Thread 0 (crashed) 11:05:39 INFO - 0 libmozalloc.so!mozalloc_abort(char const*) [mozalloc_abort.cpp:91b67b9a26bb : 37 + 0x0] 11:05:39 INFO - eip = 0xb730d8a9 esp = 0xbf9e3730 ebp = 0xbf9e3748 ebx = 0xb73105c0 11:05:39 INFO - esi = 0xb75f9d9c edi = 0xbf9e3788 eax = 0x0000000a ecx = 0xffffffff 11:05:39 INFO - edx = 0xb75fa8ac efl = 0x00210282 11:05:39 INFO - Found by: given as instruction pointer in context 11:05:39 INFO - 1 libxul.so!NS_DebugBreak [nsDebugImpl.cpp:91b67b9a26bb : 469 + 0xe] 11:05:39 INFO - eip = 0xb367453a esp = 0xbf9e3750 ebp = 0xbf9e3b98 ebx = 0xb6e58804 11:05:39 INFO - esi = 0xbf9e3770 edi = 0xbf9e3788 11:05:39 INFO - Found by: call frame info 11:05:39 INFO - 2 libxul.so!nsDebugImpl::Abort(char const*, int) [nsDebugImpl.cpp:91b67b9a26bb : 150 + 0xd] 11:05:39 INFO - eip = 0xb36745af esp = 0xbf9e3ba0 ebp = 0xbf9e3bc8 ebx = 0xb6e58804 11:05:39 INFO - esi = 0xbf9e3d10 edi = 0xbf9e3cb8 11:05:39 INFO - Found by: call frame info 11:05:39 INFO - 3 libxul.so!NS_InvokeByIndex + 0x2f 11:05:39 INFO - eip = 0xb36ba628 esp = 0xbf9e3bd0 ebp = 0xbf9e3bf8 ebx = 0xb6e58804 11:05:39 INFO - esi = 0xbf9e3d10 edi = 0xbf9e3cb8 11:05:39 INFO - Found by: call frame info 11:05:39 INFO - 4 libxul.so!XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) [XPCWrappedNative.cpp:91b67b9a26bb : 2395 + 0x15] 11:05:39 INFO - eip = 0xb3a458dc esp = 0xbf9e3c00 ebp = 0xbf9e3da8 11:05:39 INFO - Found by: previous frame's frame pointer 11:05:39 INFO - 5 libxul.so!XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) [XPCWrappedNativeJSOps.cpp:91b67b9a26bb : 1244 + 0x9] 11:05:39 INFO - eip = 0xb3a47ddc esp = 0xbf9e3db0 ebp = 0xbf9e3e78 ebx = 0xb6e58804 11:05:39 INFO - esi = 0xb7260aa0 edi = 0xbf9e3e04 11:05:39 INFO - Found by: call frame info 11:05:39 INFO - 6 libxul.so!js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) [jscntxtinlines.h:91b67b9a26bb : 231 + 0x1b] 11:05:39 INFO - eip = 0xb5313c53 esp = 0xbf9e3e80 ebp = 0xbf9e4118 ebx = 0xb6e58804 11:05:39 INFO - esi = 0xb7260aa0 edi = 0xbf9e3ec8 11:05:39 INFO - Found by: call frame info
This test does not open a toolbox, so maybe a previous test is bleeding into it?
Paul, that doesn't look good. Can you dispatch someone to look at it?
Flags: needinfo?(paul)
Bug 1053463 seems related with view._elementStyle.rules errors. I believe this is really a Style Inspector error. browser_widget just happens to be the last test in the DevTools browser suite at the moment.
Component: Developer Tools: WebIDE → Developer Tools: Style Editor
Flags: needinfo?(paul)
Component: Developer Tools: Style Editor → Developer Tools: Inspector
Patrick, any guess what the issue is here? I believe you've done some recent Inspector test improvements, so thought you might have an idea. Ignore the WebIDE test name, it just happens to be the last test of the suite.
My understanding so far is that it sometimes takes longer for the toolbox to close, and that makes the whole thing crash because of the blocker code we added to the toolbox destroy function in bug 1060840. I don't exactly know what this does, but it looks like it prevents the tab from closing until the toolbox is fully destroy (to avoid leaks). But in doing this, it just waits for some reasonable amount of time, and declares the thing dead if it takes longer to complete. Here is the relevant code: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/asyncshutdown/AsyncShutdown.jsm#760 Tim, you worked on bug 1060840, can you confirm my understanding of things so far? Do you agree with me that this shutdown crash can simply be caused by the toolbox taking more time than usual (maybe the machine is slower)?
Flags: needinfo?(pbrosset) → needinfo?(ttaubert)
Bug 1060840 just makes the shutdown leak detector wait until all tabs/toolboxes have been closed. Otherwise it would report those as leaked not knowing that they're still alive. If we see a shutdown crash then it seems like they're just never destroyed. AsyncShutdown stops waiting after 60s and just kills the process. Looks like for some reason the toolbox is never destroyed.
Flags: needinfo?(ttaubert)
(In reply to Tim Taubert [:ttaubert] (away Dec 19th, back Jan 5th) from comment #119) > Bug 1060840 just makes the shutdown leak detector wait until all > tabs/toolboxes have been closed. Otherwise it would report those as leaked > not knowing that they're still alive. If we see a shutdown crash then it > seems like they're just never destroyed. AsyncShutdown stops waiting after > 60s and just kills the process. Looks like for some reason the toolbox is > never destroyed. Or maybe the toolbox takes a whole lot of time to get destroyed. 60s really seems like an awful lot of time, but what if the machine is running very slow? Anyway, I've looked at the logs, there are plenty of exceptions in there and uncaught promise rejections, but it's hard to relate any of this to the toolbox not being able to close. I've also run a small script on the log that calculates the time between 2 lines, and I didn't find anything that was taking more than 60s. There's a couple of occurrences of stuff that take about 20seconds, but they were not about closing the toolbox anyway. So basically, I can't reproduce, I don't know where it comes from, and I don't know where to look :) Yay, best type of bug! Tim, one more question: is it right to assume that this error message is just displayed here because that's the end of the testsuite, but really, any test in the suite could have caused it, right? It seems pretty obvious that browser_widget.js isn't to blame here because it doesn't even open the toolbox, but I just want to make sure.
Flags: needinfo?(ttaubert)
(In reply to Patrick Brosset [:pbrosset] [:patrick] from comment #124) > Tim, one more question: is it right to assume that this error message is > just displayed here because that's the end of the testsuite, but really, any > test in the suite could have caused it, right? > It seems pretty obvious that browser_widget.js isn't to blame here because > it doesn't even open the toolbox, but I just want to make sure. Yeah, exactly. Some test running before will have opened a toolbox but never destroyed it properly. It probably wouldn't be too hard to add some debug output to find out which of the tests creates a toolbox and never destroys it. From there on you could dig deeper and try to find the actual cause, unfortunately try is your best bet to reproduce as it seems.
Flags: needinfo?(ttaubert)
> Or maybe the toolbox takes a whole lot of time to get destroyed. 60s really seems like an awful lot of time, but what if the machine is running very slow? In that case, crashing will be a mercy for the user. A shutdown that takes more than 60 seconds is clearly unacceptable, even on a slow machine.
I noticed when running canvasdebugger tests that they hang and the AsyncShutdown warning is printed to terminal. After a bit of bisection I was able to narrow the cause down to browser_canvas-frontend-open.js which would hang the shutdown IF another browser_canvas-frontend-* test was also executed. For example, I can reproduce the AsyncShutdown hang with following command: ./mach mochitest-devtools --e10s browser/devtools/canvasdebugger/test/browser_canvas-frontend-call-stack-03.js browser/devtools/canvasdebugger/test/browser_canvas-frontend-open.js I don't know if canvasdebugger is solely responsible for these intermittent failures but it sure seems to contribute to the result.
(In reply to Sami Jaktholm from comment #143) > I noticed when running canvasdebugger tests that they hang and the > AsyncShutdown warning is printed to terminal. After a bit of bisection I was > able to narrow the cause down to browser_canvas-frontend-open.js which would > hang the shutdown IF another browser_canvas-frontend-* test was also > executed. > > For example, I can reproduce the AsyncShutdown hang with following command: > ./mach mochitest-devtools --e10s > browser/devtools/canvasdebugger/test/browser_canvas-frontend-call-stack-03. > js browser/devtools/canvasdebugger/test/browser_canvas-frontend-open.js > > I don't know if canvasdebugger is solely responsible for these intermittent > failures but it sure seems to contribute to the result. Thanks Sami for finding this out. It sure does seem like it contributes to this bug. Victor, here's a slight change the canvasdebugger test teardown function that makes sure the toolbox is destroyed before closing the tab which seems to avoid the hang.
Attachment #8540601 - Flags: review?(vporof)
Comment on attachment 8540601 [details] [diff] [review] bug1099370-fix-canvasdebugger-teardown v1.patch Review of attachment 8540601 [details] [diff] [review]: ----------------------------------------------------------------- LGTM
Attachment #8540601 - Flags: review?(vporof) → review+
Thanks Victor. Let's check this in and keep the bug open in case this wasn't the only cause for the toolbox destroy hang.
Keywords: leave-open
If a debugger test times out, this problem seems to be triggered (see below). Here's a patch should fix that case. It modifies the cleanup function in head.js to a) destroy all toolboxes and wait for them to be destroyed b) close all extra tabs to avoid leaks It also removes calls to removeTab(gBrowser.selectedTab) from some tests that called that in their own cleanup functions. Now that the tabs are destroyed in the global cleanup function those calls would remove last tab and close the browser in the middle of the test suite. Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c183694d706 ---- When a timeout occurs test harness will execute the cleanup function registered in debugger/test/head.js [1] which destroys the debugger server. Next the harness will remove leaked tabs. The open tab will be closed and the toolbox starts to destroy itself. At this point I kind of lost track of what's happening. The client destroyer sequence seems to hang in TabTarget._client.close(cleanupAndResolve)[2] which is waiting for the transport to be closed. When the cleanup function called DebuggerServer.destroy it closed the transport on the server side. The transport is apparently being closed from both ends at the same time (though I have no idea how e10s fits into this mix) which seem to cause the closure callback [3] to be never executed. Thus the destroyer promise does not resolve and the leak check barrier is never removed. I think this could occur even without a timeout. If a test destroys the tab but doesn't wait for the toolbox to be destroyed before finishing it could trigger similar chain of events so this might also help to reduce the general frequency of this bug. [1] https://hg.mozilla.org/mozilla-central/annotate/369a8f14ccf8/browser/devtools/debugger/test/head.js#l46 [2] https://hg.mozilla.org/mozilla-central/annotate/369a8f14ccf8/browser/devtools/framework/target.js#l591 [3] https://hg.mozilla.org/mozilla-central/annotate/369a8f14ccf8/browser/devtools/framework/target.js#l577
Attachment #8550719 - Flags: review?(vporof)
Comment on attachment 8550719 [details] [diff] [review] debugger-automatic-cleanup-after-test.patch Review of attachment 8550719 [details] [diff] [review]: ----------------------------------------------------------------- I think ejpbruel should review this instead
Attachment #8550719 - Flags: review?(vporof)
Attachment #8550719 - Flags: review?(ejpbruel)
This is on my radar for tomorrow. Sorry it took so long.
Comment on attachment 8550719 [details] [diff] [review] debugger-automatic-cleanup-after-test.patch Review of attachment 8550719 [details] [diff] [review]: ----------------------------------------------------------------- Patch looks good to me
Attachment #8550719 - Flags: review?(ejpbruel) → review+
Keywords: checkin-needed
Comment 232 seems like a false positive, since it's not the DevTools test suite. So far, no more occurrences since the last patch.
Ryan, the new hits after the last patch are only on Aurora (or they aren't actually the DevTools suite). Would it help you for us to uplift the fix to Aurora?
Flags: needinfo?(ryanvm)
Yes please! It can land a=test-only whenever.
Flags: needinfo?(ryanvm)
Assignee: nobody → sjakthol
Target Milestone: --- → Firefox 38
Marking FIXED, as there have not been new hits (for the DevTools suite) on fx-team / m-c in the last week.
Status: NEW → RESOLVED
Closed: 10 years ago
Keywords: leave-open
Resolution: --- → FIXED
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: