Closed Bug 916531 Opened 11 years ago Closed 11 years ago

Firefox freezes randomly

Categories

(Core :: JavaScript Engine, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox25 --- unaffected
firefox26 + verified
firefox27 + verified

People

(Reporter: coolypf, Assigned: bhackett1024)

References

Details

(Keywords: hang)

Attachments

(1 file)

Blocks: 915482
Severity: major → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: hang
Yes I can confirm this on Win7
I can't reproduce this on Linux. And it's *really* surprising (and worrying) that 47e05e8df03b would be the cause, since that patch just removed some unnecessary #include statements, which shouldn't change behaviour at all. Nonetheless, I've backed it out: https://hg.mozilla.org/integration/mozilla-inbound/rev/14a4dbb53c37 Please check again once this hits a new Nightly to confirm if this patch really was at fault. Thanks!
Flags: needinfo?(coolypf)
Not sure is this will help any, but the only hang I was able to catch in a debugger had the following (I have the Status4Ever extension; however, many hang that don't have the extension): [JavaScript Error: "this._getters.statusOverlay is null" {file: "resource://status4evar/Status.jsm" line: 306}] eax=00000000 ebx=778ce120 ecx=00000000 edx=00000000 esi=00000000 edi=008e2ef0 eip=7780deb4 esp=004ff164 ebp=004ff17c iopl=0 nv up ei pl nz na po nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00200202 ntdll!NtTerminateProcess+0xc: 7780deb4 c20800 ret 8
if bug#915482 cause freeze/hang problems, i think patch #1 and/or #2 (not #4)should be back-out. because #3 and 4 are landed after 0913 NIGHTLY. freeze/hang problem start on 0913 NIGHTLY. http://forums.mozillazine.org/viewtopic.php?p=13069511#p13069511 (see my post)
(In reply to Nicholas Nethercote [:njn] from comment #4) > I can't reproduce this on Linux. And it's *really* surprising (and > worrying) that 47e05e8df03b would be the cause, since that patch just > removed some unnecessary #include statements, which shouldn't change > behaviour at all. > > Nonetheless, I've backed it out: > https://hg.mozilla.org/integration/mozilla-inbound/rev/14a4dbb53c37 > > Please check again once this hits a new Nightly to confirm if this patch > really was at fault. Thanks! I checked the latest mozilla-inbound build in http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1379289423/ (cset 14a4dbb53c37) and the problem is still reproducible. Maybe I should check the last good build again.
Flags: needinfo?(coolypf)
Using the nightly from http://hg.mozilla.org/mozilla-central/rev/a4e9c9c9dbf9 I cannot reproduce the hang/freeze
No longer blocks: 915482
Summary: Firefox freezes due to a regression in Bug 915482 → Firefox freezes randomly
I got a hang in 2 times the following tinderbox build(so, no symbol information). bp-27ef076d-1c78-4155-8195-03b9c2130916 bp-f7d9436e-534e-445c-90cb-7ee022130916 bp-560c8f15-3868-4871-a2f7-6d7d72130916 http://hg.mozilla.org/integration/mozilla-inbound/rev/3ca22e239a1d Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 ID:20130911164259
err, s/2 times/3 times/
(In reply to Alice0775 White from comment #10) > I got a hang in 2 times the following tinderbox build(so, no symbol > information). > http://hg.mozilla.org/integration/mozilla-inbound/rev/3ca22e239a1d That's bug 906371, which is a much more likely culprit than bug 916531.
Blocks: 906371
It seems that some race condition causes dead lock.
Changing javascript.options.parallel_parsing to false fixes the hangs for me using the links in the bug
I see AutoPauseWorkersForGC on the stack so this is probably a dupe of bug 916504.
The stack from comment 1 on thread 0 is: 5 nss3.dll PR_WaitCondVar nsprpub/pr/src/threads/combined/prucv.c 6 mozjs.dll js::AutoPauseWorkersForGC::AutoPauseWorkersForGC(JSRuntime *) js/src/jsworkers.cpp 7 mozjs.dll GCCycle js/src/jsgc.cpp 8 mozjs.dll Collect js/src/jsgc.cpp 9 xul.dll nsAppShell::ProcessNextNativeEvent(bool) widget/windows/nsAppShell.cpp 10 mozjs.dll js::GCSlice(JSRuntime *,js::JSGCInvocationKind,JS::gcreason::Reason,__int64) js/src/jsgc.cpp 11 xul.dll nsJSContext::GarbageCollectNow(JS::gcreason::Reason,nsJSContext::IsIncremental,nsJSContext::IsCompartment,nsJSContext::IsShrinking,__int64) dom/base/nsJSEnvironment.cpp 12 xul.dll GCTimerFired(nsITimer *,void *) dom/base/nsJSEnvironment.cpp The line linked to is the state.wait(WorkerThreadState::MAIN) call. On thread 17, we have: 6 nss3.dll PR_WaitCondVar nsprpub/pr/src/threads/combined/prucv.c 7 mozjs.dll js::SourceCompressionTask::complete() js/src/jsworkers.cpp 8 mozjs.dll js::frontend::CompileScript(js::ExclusiveContext *,js::LifoAlloc *,JS::Handle<JSObject *>,JS::Handle<JSScript *>,JS::CompileOptions const &,wchar_t const *,unsigned int,JSString *,unsigned int,js::SourceCompressionTask *) js/src/frontend/BytecodeCompiler.cpp 9 mozjs.dll js::WorkerThread::handleParseWorkload(js::WorkerThreadState &) js/src/jsworkers.cpp 10 mozjs.dll js::WorkerThread::threadLoop() js/src/jsworkers.cpp The line linked to is the state.wait(WorkerThreadState::MAIN) call. Both threads have an AutoLockWorkerThreadState on the stack. And on threads 18,19 we have: 5 nss3.dll PR_WaitCondVar nsprpub/pr/src/threads/combined/prucv.c 6 mozjs.dll js::WorkerThread::threadLoop() js/src/jsworkers.cpp and the linked-to line is pause()... and we're still holding a AutoLockWorkerThreadState there too. Hoping someone understands how AutoLockWorkerThreadState is supposed to work. ;)
Depends on: 916504
Flags: needinfo?(bhackett1024)
Attached patch patch (deleted) — Splinter Review
There's actually another worker thread in comment 1's stack, thread 16, with this stack: 16|3|nss3.dll|_PR_MD_WAIT_CV|hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/md/windows/w95cv.c:53d5e43e23cc|248|0xe 16|4|nss3.dll|_PR_WaitCondVar|hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/threads/combined/prucv.c:53d5e43e23cc|172|0x1c 16|5|nss3.dll|PR_WaitCondVar|hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/threads/combined/prucv.c:53d5e43e23cc|515|0xb 16|6|mozjs.dll|js::WorkerThreadState::wait(js::WorkerThreadState::CondVar,unsigned int)|hg:hg.mozilla.org/mozilla-central:js/src/jsworkers.cpp:53d5e43e23cc|435|0x20 16|7|mozjs.dll|js::WorkerThread::pause()|hg:hg.mozilla.org/mozilla-central:js/src/jsworkers.cpp:53d5e43e23cc|1015|0x8 16|8|mozjs.dll|js::SourceCompressionTask::MOZ_Z_compress()|hg:hg.mozilla.org/mozilla-central:js/src/jsscript.cpp:53d5e43e23cc|1187|0x15 16|9|mozjs.dll|js::WorkerThread::threadLoop()|hg:hg.mozilla.org/mozilla-central:js/src/jsworkers.cpp:53d5e43e23cc|944|0x3d 16|10|nss3.dll|_PR_NativeRunThread|hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/threads/combined/pruthr.c:53d5e43e23cc|397|0x9 16|11|nss3.dll|pr_root|hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/md/windows/w95thred.c:53d5e43e23cc|90|0xc 16|12|msvcr100.dll|_callthreadstartex|f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c|314|0x6 16|13|msvcr100.dll|_threadstartex|f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c|292|0x5 The problem here is I think one which Bill noticed while reviewing bug 906371 and which I forgot to apply when landing the patch. Multiple threads can now be waiting for worker threads to complete --- in this case, the main thread is waiting for worker threads to pause, and one worker thread (#17) handling a parse workload is waiting for another worker thread (#16) to finish its compression workload. What I think happened is that #16 was the last thread to pause, and when trying to wake up the main thread it woke up thread #17 instead, which just resumed waiting. The presence of the AutoLockWorkerThreadState's are red herrings, as the lock is released when one thread is waiting for another. The fix here should be to just apply Bill's review comments from bug 906371 (:(). I'd like to remove the pausing functionality completely but it seems better to get this fixed first and then rip that stuff out.
Attachment #805332 - Flags: review?(wmccloskey)
Flags: needinfo?(bhackett1024)
Comment on attachment 805332 [details] [diff] [review] patch Review of attachment 805332 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsworkers.h @@ +134,5 @@ > # ifdef DEBUG > PRThread *lockOwner; > # endif > > + /* Condvar to notify threads waiting for work that they may be able to make progress. */ Can you move these comments up so they go with the CONSUMER and PRODUCER decls? I'm guessing that's where people will actually look.
Attachment #805332 - Flags: review?(wmccloskey) → review+
Comment on attachment 805332 [details] [diff] [review] patch [Approval Request Comment] Bug caused by (feature/regressing bug #): 906371 User impact if declined: deadlocks Testing completed (on m-c, etc.): on m-i Risk to taking this patch (and alternatives if risky): low Note that the patch in rev 4bcf9b261b94 also contains fixes for bugs 916753 and bug 916504, which are similar deadlocks. This approval request is for the combined patch.
Attachment #805332 - Flags: approval-mozilla-aurora?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Attachment #805332 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee: general → bhackett1024
Depends on: 918862
Keywords: verifyme
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 Verified as fixed on latest Aurora 26.0a2 (buildID: 20131001004005).
verified with Nightly build 20131024030204 on Win 7
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: