Closed Bug 990355 Opened 11 years ago Closed 9 years ago

Intermittent test_workersDisabled.xul | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,JSUseHelperThreads)] | after Hit MOZ_CRASH()

Categories

(Core :: DOM: Workers, defect)

31 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: KWierso, Assigned: mccr8)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Crash Data

https://tbpl.mozilla.org/php/getParsedLog.php?id=37029719&tree=Fx-Team
Windows 7 32-bit fx-team debug test mochitest-other on 2014-03-31 16:03:49 PDT for push 4d8323b1389d
slave: t-w732-ix-091

Another Windows worker crash filed today in bug 990346. Probably related.

16:14:09 INFO - 2239 INFO TEST-START | chrome://mochitests/content/chrome/dom/workers/test/test_url.xul
16:14:09 INFO - ++DOMWINDOW == 63 (21321B68) [pid = 2836] [serial = 2195] [outer = 17202D38]
16:14:09 INFO - 2240 INFO TEST-INFO | MEMORY STAT vsize after test: 1824899072
16:14:09 INFO - 2241 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 17776640
16:14:09 INFO - 2242 INFO TEST-INFO | MEMORY STAT residentFast after test: 1246052352
16:14:09 INFO - 2243 INFO TEST-END | chrome://mochitests/content/chrome/dom/workers/test/test_url.xul | finished in 160ms
16:14:09 INFO - ++DOMWINDOW == 64 (21312BE0) [pid = 2836] [serial = 2196] [outer = 17202D38]
16:14:09 INFO - 2244 INFO TEST-START | chrome://mochitests/content/chrome/dom/workers/test/test_workersDisabled.xul
16:14:10 INFO - ++DOMWINDOW == 65 (282478D8) [pid = 2836] [serial = 2197] [outer = 17202D38]
16:14:10 INFO - Hit MOZ_CRASH() at c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\xpcom\base\CycleCollectedJSRuntime.cpp:463
16:14:10 INFO - [2460] ###!!! ABORT: Aborting on channel error.: file c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\ipc\glue\MessageChannel.cpp, line 1522
16:14:10 INFO - [2460] ###!!! ASSERTION: Cannot call AnnotateCrashReport in child processes from non-main thread.: 'Error', file c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\toolkit\crashreporter\nsExceptionHandler.cpp, line 1691
16:14:10 INFO - NS_DebugBreak [xpcom/base/nsDebugImpl.cpp:367]
16:14:10 INFO - mozilla::ipc::MessageChannel::OnChannelErrorFromLink() [ipc/glue/MessageChannel.cpp:1522]
16:14:10 INFO - base::MessagePumpForIO::WaitForIOCompletion(unsigned long,base::MessagePumpForIO::IOHandler *) [ipc/chromium/src/base/message_pump_win.cc:524]
16:14:10 INFO - base::MessagePumpForIO::WaitForWork() [ipc/chromium/src/base/message_pump_win.cc:501]
16:14:10 INFO - base::MessagePumpForIO::DoRunLoop() [ipc/chromium/src/base/message_pump_win.cc:463]
16:14:10 INFO - base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate *,base::MessagePumpWin::Dispatcher *) [ipc/chromium/src/base/message_pump_win.cc:55]
16:14:10 INFO - base::MessagePumpWin::Run(base::MessagePump::Delegate *) [ipc/chromium/src/base/message_pump_win.h:78]
16:14:10 INFO - MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:226]
16:14:10 INFO - MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:220]
16:14:10 INFO - MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:194]
16:14:10 INFO - base::Thread::ThreadMain() [ipc/chromium/src/base/thread.cc:165]
16:14:10 INFO - `anonymous namespace'::ThreadFunc(void *) [ipc/chromium/src/base/platform_thread_win.cc:27]
16:14:10 INFO - kernel32 + 0x53c45
16:14:10 INFO - ntdll + 0x637f5
16:14:10 INFO - ntdll + 0x637c8
16:14:10 INFO - [2460] ###!!! ASSERTION: Cannot call AnnotateCrashReport in child processes from non-main thread.: 'Error', file c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\toolkit\crashreporter\nsExceptionHandler.cpp, line 1641
16:14:10 INFO - TEST-INFO | Main app process: exit status 80000003
16:14:10 WARNING - TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/dom/workers/test/test_workersDisabled.xul | application terminated with exit code 2147483651
16:14:10 INFO - INFO | runtests.py | Application ran for: 0:07:07.231000
16:14:10 INFO - INFO | zombiecheck | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpshf8b0pidlog
16:14:10 INFO - ==> process 2836 launched child process 3220 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.10ea3070.98849443 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" tab)
16:14:10 INFO - ==> process 2836 launched child process 1392 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1974dd50.114896960 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" tab)
16:14:10 INFO - ==> process 2836 launched child process 3980 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fba72f0.1270593268 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 2492 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fba1a40.7711135 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 3404 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fba2bb8.1333941138 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 3356 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fba23a8.1927953419 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 2864 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fe92448.750700427 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 2816 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fe8cfa0.1551328127 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:10 INFO - ==> process 2836 launched child process 2460 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=2836.1fe8cb98.373661720 "c:\users\cltbld\appdata\local\temp\tmpt97lem\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 2836 "\\.\pipe\gecko-crash-server-pipe.2836" plugin)
16:14:18 WARNING - PROCESS-CRASH | chrome://mochitests/content/chrome/dom/workers/test/test_workersDisabled.xul | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,JSUseHelperThreads)]
16:14:18 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpt97lem\minidumps\8e5b71a8-12c3-4daa-8248-9f382564379c.dmp
16:14:18 INFO - Operating system: Windows NT
16:14:18 INFO - 6.1.7601 Service Pack 1
16:14:18 INFO - CPU: x86
16:14:18 INFO - GenuineIntel family 6 model 30 stepping 5
16:14:18 INFO - 8 CPUs
16:14:18 INFO - Crash reason: EXCEPTION_BREAKPOINT
16:14:18 INFO - Crash address: 0x65d0582c
16:14:18 INFO - Thread 65 (crashed)
16:14:18 INFO - 0 xul.dll!mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,JSUseHelperThreads) [CycleCollectedJSRuntime.cpp:4d8323b1389d : 463 + 0x17]
16:14:18 INFO - eip = 0x65d0582c esp = 0x227af734 ebp = 0x227af738 ebx = 0x00000000
16:14:18 INFO - esi = 0x227af758 edi = 0x2fdb4598 eax = 0x00000000 ecx = 0xe0980ea5
16:14:18 INFO - edx = 0x686de4d8 efl = 0x00000212
16:14:18 INFO - Found by: given as instruction pointer in context
16:14:18 INFO - 1 xul.dll!`anonymous namespace'::WorkerThreadPrimaryRunnable::Run() [RuntimeService.cpp:4d8323b1389d : 2538 + 0x1c]
16:14:18 INFO - eip = 0x66b78f5c esp = 0x227af740 ebp = 0x227af88c
16:14:18 INFO - Found by: call frame info
16:14:18 INFO - 2 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:4d8323b1389d : 694 + 0xd]
16:14:18 INFO - eip = 0x65d4c46d esp = 0x227af894 ebp = 0x227af8f0
16:14:18 INFO - Found by: call frame info
16:14:18 INFO - 3 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:4d8323b1389d : 263 + 0xc]
16:14:18 INFO - eip = 0x65cddd52 esp = 0x227af8f8 ebp = 0x227af904
16:14:18 INFO - Found by: call frame info
Well, I guess WONTFIX isn't the right answer for these. But this is probably due to GGC increasing memory spikes during these tests a little. Terrance, looks like we're still randomly having OOM on chrome worker tests. What should we try?
Summary: Intermittent test_workersDisabled.xul | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,JSUseHelperThreads)] | after ABORT: Aborting on channel error.: MessageChannel.cpp, line 1522 → Intermittent test_workersDisabled.xul | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,JSUseHelperThreads)] | after Hit MOZ_CRASH()
We should probably fix NS_DebugBreak to not try to annotate the crash report from off the main thread in child processes since that doesn't work: http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsDebugImpl.cpp#363
Great - have filed bug 991824 :-)
Maybe we could work around this a bit by forcing GCs in between tests. This directory allocates a ton of workers, and each one is using up a lot of memory.

19:29:46 INFO - 2873 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 125890560
...
19:29:46 INFO - 2879 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 25100288
...
19:29:46 INFO - 2885 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 14606336
...
crash

Uhh, so it looks like running dom/workers/test/test_fileSubWorker.xul consumes 100MB of vsizeMaxContiguous. The test creates 4 workers. Maybe just running GC+CC after each call to accessFileProperties() would help.
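For anyone wanting to check the 100MB figure against a log, here is a minimal sketch that pulls the vsizeMaxContiguous samples out of log text and prints the drop between consecutive samples. The helper names (`contiguousSamples`, `contiguousDrops`) are made up for illustration and are not part of the test harness; the sample values are the three quoted in the comment above.

```javascript
// Hypothetical helper: collect every vsizeMaxContiguous sample (in bytes)
// from mochitest log text, in the order the samples appear.
function contiguousSamples(logText) {
  const pattern = /MEMORY STAT vsizeMaxContiguous after test: (\d+)/g;
  const samples = [];
  for (const match of logText.matchAll(pattern)) {
    samples.push(Number(match[1]));
  }
  return samples;
}

// Hypothetical helper: how many bytes of contiguous address space were lost
// between each sample and the next (a negative value would mean a recovery).
function contiguousDrops(samples) {
  const drops = [];
  for (let i = 1; i < samples.length; i++) {
    drops.push(samples[i - 1] - samples[i]);
  }
  return drops;
}

// The three samples quoted in the comment above.
const excerpt = [
  "19:29:46 INFO - 2873 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 125890560",
  "19:29:46 INFO - 2879 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 25100288",
  "19:29:46 INFO - 2885 INFO TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 14606336",
].join("\n");

const drops = contiguousDrops(contiguousSamples(excerpt));
// Prints the drops in decimal megabytes: [ '100.8 MB', '10.5 MB' ]
console.log(drops.map((bytes) => (bytes / 1e6).toFixed(1) + " MB"));
```

The first drop, 100790272 bytes, matches the "100MB" estimate; the second shows the largest contiguous block continuing to shrink before the crash.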
Assignee: nobody → continuation
I think the real problem here is that the test ends up spinning up eight workers (!!!) at the same time, then they start producing results. Sticking GCs in there doesn't matter. I've tried chaining them together, by having one .onmessage call the next. That at least makes it so we don't start them all up at the same time, but I think the event handler captures the old worker, so forcing a GC in there doesn't help. Maybe I can use promises or something. Or just split this into 4 separate top-level tests.
> by having one .onmessage call the next

I mean, "by having one .onmessage start the next test"
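A rough sketch of what the "use promises" idea could look like, so that only one worker is alive at a time. This is an illustration, not the code in test_fileSubWorker.xul: `runWorker`, `runSequentially`, and `makeFakeWorkerFactory` are hypothetical names, and the fake worker only exists so the sketch runs outside a browser. Whether dropping the handler references this way actually lets the old worker be collected in the real test is an untested assumption.

```javascript
// Hypothetical helper: start one worker, resolve with its first message,
// and terminate it so nothing keeps running once the result has arrived.
function runWorker(createWorker, payload) {
  return new Promise((resolve, reject) => {
    const worker = createWorker();
    worker.onmessage = (event) => {
      worker.terminate();
      resolve(event.data);
    };
    worker.onerror = (error) => {
      worker.terminate();
      reject(error);
    };
    worker.postMessage(payload);
  });
}

// Run the workers strictly one after another: the next worker is only
// created after the previous one has reported back and been terminated.
async function runSequentially(createWorker, payloads) {
  const results = [];
  for (const payload of payloads) {
    results.push(await runWorker(createWorker, payload));
  }
  return results;
}

// Stand-in for a real Worker (an assumption for this sketch). It doubles
// its input asynchronously and tracks how many instances are alive at once.
function makeFakeWorkerFactory() {
  const stats = { live: 0, maxLive: 0 };
  const create = () => {
    stats.live += 1;
    stats.maxLive = Math.max(stats.maxLive, stats.live);
    const worker = {
      onmessage: null,
      onerror: null,
      postMessage(data) {
        setTimeout(() => {
          if (worker.onmessage) worker.onmessage({ data: data * 2 });
        }, 0);
      },
      terminate() {
        stats.live -= 1;
      },
    };
    return worker;
  };
  return { create, stats };
}

const demo = makeFakeWorkerFactory();
runSequentially(demo.create, [1, 2, 3, 4]).then((results) => {
  console.log(results, "max workers alive at once:", demo.stats.maxLive);
});
```

In a real test, `createWorker` would be something like `() => new Worker(url)`, and a forced GC could go between iterations of the loop in `runSequentially`.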
Blocks: MochiMem
Depends on: 1034621
"the test" in comment 21 is test_fileSubWorker.xul in case that isn't clear.
This is still Windows-only, right? I think we need to figure out why the nursery hurts so much on Windows.
From comment 20, we're running out of address space. I think Windows just has bigger problems with running out of address space.
Well, and all of these have been on Win7 it looks like, and Win7 in particular is the platform where we've had severe tree closing problems with running out of address space before.
Yes, our test slaves are 32-bit Win7 and 64-bit Win8, so that's not fun :(. I wonder if we even use the /3GB option on our Win7 slaves.
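To put numbers on the address-space concern, here is a small arithmetic sketch using the vsize sample from the log in the first comment (1824899072 bytes). It assumes the usual 32-bit Windows limits: 2 GiB of user address space per process by default, and 3 GiB when the system is booted with /3GB and the binary is large-address-aware. Whether either condition held on these slaves is not known from this bug. The `headroom` helper is a hypothetical name.

```javascript
const GIB = 1024 ** 3;
const MIB = 1024 ** 2;

// Hypothetical helper: bytes of user address space still unreserved,
// given the process vsize and the per-process limit.
function headroom(vsizeBytes, limitBytes) {
  return limitBytes - vsizeBytes;
}

// vsize reported just before the crashing test started.
const vsize = 1824899072;

const defaultHeadroom = headroom(vsize, 2 * GIB); // default 32-bit limit
const threeGbHeadroom = headroom(vsize, 3 * GIB); // assumes /3GB applies

// Prints roughly 307.6 MiB and 1331.6 MiB respectively.
console.log((defaultHeadroom / MIB).toFixed(1) + " MiB left with a 2 GiB limit");
console.log((threeGbHeadroom / MIB).toFixed(1) + " MiB left with a 3 GiB limit");
```

Under the default limit that leaves about 308 MiB in total, and the log shows the largest contiguous free block was already down to 17776640 bytes, so a worker that needs a large contiguous reservation has very little room.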
No longer depends on: 1034621
Depends on: 1037510
This hasn't happened since bug 1037510 landed, but it isn't super frequent so it is hard to say if it has really gone away yet.
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME