Closed Bug 991845 Opened 10 years ago Closed 10 years ago

crash in nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) | `anonymous namespace''::WorkerJSRuntime::WorkerJSRuntime(JSRuntime*, mozilla::dom::workers::WorkerPrivate*)

Categories

(Core :: JavaScript Engine, defect)

31 Branch
x86
Windows NT
defect
Not set
critical

Tracking

VERIFIED FIXED
Tracking Status
firefox31 + wontfix
firefox32 + verified
firefox33 + verified
firefox34 --- verified

People

(Reporter: u279076, Assigned: mccr8)

References

(Depends on 1 open bug)

Details

(Keywords: crash, topcrash-win, Whiteboard: [GGC])

Crash Data

This bug was filed from the Socorro interface and is 
report bp-13735765-c09f-4f77-a852-f3e2a2140403.
=============================================================
0 	xul.dll 	nsTArray_base<nsTArrayInfallibleAllocator,nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) 	obj-firefox/dist/include/nsTArray.h
1 	xul.dll 	`anonymous namespace'::WorkerJSRuntime::WorkerJSRuntime(JSRuntime *,mozilla::dom::workers::WorkerPrivate *) 	dom/workers/RuntimeService.cpp
2 	xul.dll 	`anonymous namespace'::WorkerThreadPrimaryRunnable::Run() 	dom/workers/RuntimeService.cpp
3 	xul.dll 	nsThread::ProcessNextEvent(bool,bool *) 	xpcom/threads/nsThread.cpp
4 	xul.dll 	NS_ProcessNextEvent(nsIThread *,bool) 	xpcom/glue/nsThreadUtils.cpp
5 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) 	ipc/glue/MessagePump.cpp
6 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc
7 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
8 	xul.dll 	nsThread::ThreadFunc(void *) 	xpcom/threads/nsThread.cpp
9 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c
10 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c
11 	msvcr100.dll 	_callthreadstartex 	f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c
12 	msvcr100.dll 	_threadstartex 	f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c
13 	kernel32.dll 	kernel32.dll@0x1495d 	
14 	ntdll.dll 	ntdll.dll@0x498ee 	
15 	ntdll.dll 	ntdll.dll@0x498c4 	

More reports:
https://crash-stats.mozilla.com/report/list?product=Firefox&signature=nsTArray_base%3CnsTArrayInfallibleAllocator%2C+nsTArray_CopyWithMemutils%3E%3A%3AIncrementLength%28unsigned+int%29+%7C+%60anonymous+namespace%27%27%3A%3AWorkerJSRuntime%3A%3AWorkerJSRuntime%28JSRuntime%2A%2C+mozilla%3A%3Adom%3A%3Aworkers%3A%3AWorkerPrivate%2A%29

This crash first showed up on 2014-03-29 following the enabling of GGC in Nightly. There are only three comments so far and both mention crashing PluginContainer.exe. It is currently #18 @ 0.64% on Nightly.

Terrence, could you please look into this?
Flags: needinfo?(terrence)
Unable to reproduce on http://one-piece.ru/ (extracted from crash comments).
31.0a1 (2014-04-04), win 7 x64
Summary: [GGC?] crash in nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) | `anonymous namespace''::WorkerJSRuntime::WorkerJSRuntime(JSRuntime*, mozilla::dom::workers::WorkerPrivate*) → crash in nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) | `anonymous namespace''::WorkerJSRuntime::WorkerJSRuntime(JSRuntime*, mozilla::dom::workers::WorkerPrivate*)
Whiteboard: [GGC]
Since bug 992535 landed, the frequency is way down: only a handful a day.
Flags: needinfo?(terrence)
I can confirm this on WinXP with Nightly when out of memory (OOM): bp-9096a7c4-4983-4b5b-956d-c1f142140419. The "Crashing Thread" in my report is a close copy of the one in comment 0.

It is quite an "InfallibleAllocator" indeed. I did not file a bug because I was OOM and had been expecting a crash for a while (the UI was toast). I came across this report while checking my crash reports for another bug. I still feel I'm too far outside the bell curve to report, but I was able to trigger the crash and thought I'd chime in to confirm it and provide a (possibly unhelpful) crash report.
This is still showing around 20-40 crashes per day on recent builds. There were 194 crashes in the last 7 days. It's still the #4 top crash for Firefox 31. I don't see the crash volume dropping dramatically after bug 992535 landed.
Thanks for the information, Rob.

Terrence, does Rob's report tell you anything actionable?
Flags: needinfo?(terrence)
Yes, it does. Thank you for the information, Rob!

This is almost certainly crashing at the browser's equivalent of CrashAtUnhandlableOOM. We should try to handle this case more gracefully.

Jon, is this likely to just be a shifted signature, or did GGC add to the size of these?
Flags: needinfo?(terrence) → needinfo?(jcoppeard)
Component: General → DOM: Workers
This seems to be just another spot that is likely to crash when we get close to OOM with GGC... Moving to JS for the time being.
Component: DOM: Workers → JavaScript Engine
From the second frame and the fact that the crash reason is EXCEPTION_BREAKPOINT, I think this is the MOZ_CRASH in the CycleCollectedJSRuntime constructor which happens when JS_NewRuntime() fails:

http://dxr.mozilla.org/mozilla-central/source/xpcom/base/CycleCollectedJSRuntime.cpp#469

I don't know why it's reporting this as coming from nsTArray though.

The problem is that we hit OOM more often here now that the new runtime allocates a 16MB nursery.
Flags: needinfo?(jcoppeard)
Over in bug 1007763 comment 9 I came to the same conclusion: it's a failed JS_NewRuntime() due to OOM. The compiler folded a bunch of MOZ_CRASH calls together and that's why it looks like nsTArray.

This can be reproduced with large Mega downloads since they use tons of memory and exercise this worker codepath.
Depends on: 1016388
Crash Signature: [@ nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) | `anonymous namespace''::WorkerJSRuntime::WorkerJSRuntime(JSRuntime*, mozilla::dom::workers::WorkerPrivate*)] → [@ nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::IncrementLength(unsigned int) | `anonymous namespace''::WorkerJSRuntime::WorkerJSRuntime(JSRuntime*, mozilla::dom::workers::WorkerPrivate*)] [@ nsTArray_base<nsTArrayInfallibleAllo…
This is now the #3 topcrash in Firefox 31.
Keywords: crash → topcrash-win
Keywords: crash
This is currently the #10 topcrash in Firefox 31.0b with 2459/59660 crashes.
This remains around the #9 topcrash in Firefox 31.0b with 6447/277133 crashes in the last 7 days.

I noticed that there aren't any crashes yet reported for builds after 6/26.
Well, 6/26 is the newest beta build that was created so far. Today should be the next one.
It seems we are assuming that creating a CycleCollectedJSRuntime is infallible.  With GGC attempting to allocate a 16MB nursery this is definitely no longer the case.

I think the solution here is to change the way the initialization of worker threads is done to allow for the possibility of this failing and report errors.   Unfortunately this will require a bit of refactoring - the constructors for CycleCollectedJSRuntime and WorkerJSRuntime will need to be split into an infallible constructor and a fallible initialisation method, and whatever's further up the stack adapted to handle this correctly.
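
A minimal sketch of the constructor/initialization split described above, not the actual patch. Every name here (SketchRuntime, StartWorker, AllocFn, MallocAlloc, SimulateOom) is hypothetical and stands in for the real CycleCollectedJSRuntime/WorkerJSRuntime code; the injected allocator exists only so that an OOM can be simulated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator hook so an out-of-memory condition can be simulated.
// It must return malloc-compatible memory, or nullptr on failure.
using AllocFn = void* (*)(std::size_t);

inline void* MallocAlloc(std::size_t aBytes) { return std::malloc(aBytes); }
inline void* SimulateOom(std::size_t) { return nullptr; }

// 16MB, the nursery size mentioned in this bug.
constexpr std::size_t kNurseryBytes = 16 * 1024 * 1024;

// Hypothetical stand-in for a runtime that owns a nursery; not Gecko code.
class SketchRuntime {
public:
  // Infallible constructor: records parameters, allocates nothing.
  explicit SketchRuntime(AllocFn aAlloc) : mAlloc(aAlloc), mNursery(nullptr) {}
  ~SketchRuntime() { std::free(mNursery); }

  SketchRuntime(const SketchRuntime&) = delete;
  SketchRuntime& operator=(const SketchRuntime&) = delete;

  // Fallible initialization: returns false on OOM instead of crashing.
  bool Initialize() {
    assert(!mNursery && "Initialize() must not be called after it succeeded");
    mNursery = mAlloc(kNurseryBytes);
    return mNursery != nullptr;
  }

private:
  AllocFn mAlloc;
  void* mNursery;
};

enum class WorkerStart { Started, OutOfMemory };

// Hypothetical caller "further up the stack": checks the result and
// reports an error rather than assuming runtime creation succeeded.
inline WorkerStart StartWorker(AllocFn aAlloc) {
  SketchRuntime runtime(aAlloc);
  if (!runtime.Initialize()) {
    return WorkerStart::OutOfMemory;
  }
  // A real worker would run its event loop here.
  return WorkerStart::Started;
}
```

In this sketch, StartWorker(&SimulateOom) returns WorkerStart::OutOfMemory where a constructor that allocated and crashed on failure would have taken down the process. How the real worker code should surface that error to content is outside what this sketch covers.
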

Andrew, you know more about this area than me, does this sound feasible?
Flags: needinfo?(continuation)
Depends on: 1034611
Yeah, I was looking into that before, and it seems pretty easy.  I filed bug 1034611 for that.
Flags: needinfo?(continuation)
Depends on: 1034621
Liz, Kairo, do you still see a trace of this crash? Thanks
Flags: needinfo?(lhenry)
Flags: needinfo?(kairo)
No longer depends on: 1034621
Depends on: 1037510
My SO's Firefox 31 is crashing like this

Report ID 	Date Submitted
bp-77874d66-d57f-4549-8da7-ae21a2140715	15/07/2014	09:54 a.m.

Commenting so we can track this bug...

Let us know if there's anything needed from a crashing profile.
That is too late for this bug. Does it affect 32 too?
(In reply to Sylvestre Ledru [:sylvestre] from comment #20)
> That is too late for this bug. Does it affect 32 too?

This is currently the #13 topcrash in Firefox 32.0a2.

Source: https://crash-stats.mozilla.com/topcrasher_ranks_bybug/?bug_number=991845
It looks like there are no crashes on trunk since bug 1037510 landed.  It has just landed on Aurora and Beta, so hopefully the signature will go away there, too.
(In reply to Andrew McCreight [:mccr8] from comment #22)
> It looks like there are no crashes on trunk since bug 1037510 landed.  It
> has just landed on Aurora and Beta, so hopefully the signature will go away
> there, too.

Kairo - Can you please help confirm that this crash has been fixed by bug 1037510?
Flags: needinfo?(kairo)
(In reply to Lawrence Mandel [:lmandel] from comment #23)
> (In reply to Andrew McCreight [:mccr8] from comment #22)
> > It looks like there are no crashes on trunk since bug 1037510 landed.  It
> > has just landed on Aurora and Beta, so hopefully the signature will go away
> > there, too.
> 
> Kairo - Can you please help confirm that this crash has been fixed by bug
> 1037510?

From all I can see, it's fixed in 32.0b5 (the signature still exists but at very low volume; I saw 5 crashes overall, which probably have some other cause). I also see it last appearing on 8/4 Aurora builds, and from what I see, Nightly is good as well.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(kairo)
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Assignee: nobody → continuation