Closed
Bug 1370448
Opened 7 years ago
Closed 7 years ago
Fake deadlock detected due to nsSocketTransportService::StartPollWatchdog()
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)
References
Details
Attachments
(1 file)
My patch in bug 1368286 makes the deadlock detector complain on Windows, for example see this try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=464da20f5513409c084c54e88f3ded61510f5fc4
12:34:48 INFO - TEST-PASS | netwerk/test/unit/test_http2.js | testOnStopRequest - [testOnStopRequest : 81] true == true
12:34:48 INFO - TEST-PASS | netwerk/test/unit/test_http2.js | testOnStopRequest - [testOnStopRequest : 82] true == true
12:34:48 INFO - PID 4684 | starting test number 40
12:34:48 INFO - (xpcshell/head.js) | test pending (2)
12:34:48 INFO - (xpcshell/head.js) | test pending (3)
12:34:48 INFO - (xpcshell/head.js) | test finished (3)
12:34:48 INFO - (xpcshell/head.js) | test pending (3)
12:34:48 INFO - (xpcshell/head.js) | test finished (3)
12:34:48 INFO - (xpcshell/head.js) | test finished (2)
12:34:48 INFO - (xpcshell/head.js) | test finished (1)
12:34:48 INFO - exiting test
12:34:48 INFO - PID 4684 | ###!!! ERROR: Potential deadlock detected:
12:34:48 INFO - PID 4684 | === Cyclical dependency starts at
12:34:48 INFO - PID 4684 | --- Mutex : TimerThread.mMonitor calling context
12:34:48 INFO - PID 4684 | [stack trace unavailable]
12:34:48 INFO - PID 4684 | --- Next dependency:
12:34:48 INFO - PID 4684 | --- Mutex : nsSocketTransportService::mLock (currently acquired)
12:34:48 INFO - PID 4684 | calling context
12:34:48 INFO - PID 4684 | [stack trace unavailable]
12:34:48 INFO - PID 4684 | --- Next dependency:
12:34:48 INFO - PID 4684 | --- Mutex : nsTimerImpl::mMutex (currently acquired)
12:34:48 INFO - PID 4684 | calling context
12:34:48 INFO - PID 4684 | [stack trace unavailable]
12:34:48 INFO - PID 4684 | === Cycle completed at
12:34:48 INFO - PID 4684 | --- Mutex : TimerThread.mMonitor calling context
12:34:48 INFO - PID 4684 | [stack trace unavailable]
12:34:48 INFO - PID 4684 | Deadlock may happen for some other execution
12:34:48 INFO - PID 4684 | [4684] ###!!! ASSERTION: Potential deadlock detected:
12:34:48 INFO - PID 4684 | Cyclical dependency starts at
12:34:48 INFO - PID 4684 | Mutex : TimerThread.mMonitor
12:34:48 INFO - PID 4684 | Next dependency:
12:34:48 INFO - PID 4684 | Mutex : nsSocketTransportService::mLock (currently acquired)
12:34:48 INFO - PID 4684 | Next dependency:
12:34:48 INFO - PID 4684 | Mutex : nsTimerImpl::mMutex (currently acquired)
12:34:48 INFO - PID 4684 | Cycle completed at
12:34:48 INFO - PID 4684 | Mutex : TimerThread.mMonitor
12:34:48 INFO - PID 4684 | Deadlock may happen for some other execution
12:34:48 INFO - PID 4684 | : 'Error', file c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/xpcom/threads/BlockingResourceBase.cpp, line 308
12:35:00 INFO - PID 4684 | #01: mozilla::OffTheBooksMutex::Lock() [xpcom/threads/BlockingResourceBase.cpp:383]
12:35:00 INFO - PID 4684 | #02: nsTimerImpl::InitCommon(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const &,unsigned int,nsTimerImpl::Callback &&) [xpcom/threads/nsTimerImpl.cpp:224]
12:35:00 INFO - PID 4684 | #03: nsTimerImpl::InitCommon(unsigned int,unsigned int,nsTimerImpl::Callback &&) [xpcom/threads/nsTimerImpl.cpp:203]
12:35:00 INFO - PID 4684 | #04: nsTimerImpl::Init(nsIObserver *,unsigned int,unsigned int) [xpcom/threads/nsTimerImpl.cpp:328]
12:35:00 INFO - PID 4684 | #05: nsTimer::Init(nsIObserver *,unsigned int,unsigned int) [xpcom/threads/nsTimerImpl.h:231]
12:35:00 INFO - PID 4684 | #06: mozilla::net::nsSocketTransportService::StartPollWatchdog() [netwerk/base/nsSocketTransportService2.cpp:1622]
12:35:00 INFO - PID 4684 | #07: mozilla::net::nsSocketTransportService::OnDispatchedEvent(nsIThreadInternal *) [netwerk/base/nsSocketTransportService2.cpp:810]
12:35:00 INFO - PID 4684 | #08: nsThread::PutEvent(already_AddRefed<nsIRunnable>,nsThread::nsNestedEventTarget *) [xpcom/threads/nsThread.cpp:751]
12:35:00 INFO - PID 4684 | #09: nsThread::DispatchInternal(already_AddRefed<nsIRunnable>,unsigned int,nsThread::nsNestedEventTarget *) [xpcom/threads/nsThread.cpp:809]
12:35:00 INFO - PID 4684 | #10: nsThread::Dispatch(already_AddRefed<nsIRunnable>,unsigned int) [xpcom/threads/nsThread.cpp:868]
12:35:00 INFO - PID 4684 | #11: mozilla::net::nsSocketTransportService::Dispatch(already_AddRefed<nsIRunnable>,unsigned int) [netwerk/base/nsSocketTransportService2.cpp:153]
12:35:00 INFO - PID 4684 | #12: nsIEventTarget::Dispatch(nsIRunnable *,unsigned int) [obj-firefox/dist/include/nsIEventTarget.h:37]
12:35:00 INFO - PID 4684 | #13: mozilla::net::nsHttpConnectionMgr::PostEvent(void ( mozilla::net::nsHttpConnectionMgr::*)(int,mozilla::net::ARefBase *),int,mozilla::net::ARefBase *) [netwerk/protocol/http/nsHttpConnectionMgr.cpp:291]
12:35:00 INFO - PID 4684 | #14: mozilla::net::nsHttpConnectionMgr::Shutdown() [netwerk/protocol/http/nsHttpConnectionMgr.cpp:236]
12:35:00 INFO - PID 4684 | #15: mozilla::net::nsHttpHandler::ShutdownConnectionManager() [netwerk/protocol/http/nsHttpHandler.cpp:2622]
12:35:00 INFO - PID 4684 | #16: nsObserverList::NotifyObservers(nsISupports *,char const *,char16_t const *) [xpcom/ds/nsObserverList.cpp:112]
12:35:00 INFO - PID 4684 | #17: nsObserverService::NotifyObservers(nsISupports *,char const *,char16_t const *) [xpcom/ds/nsObserverService.cpp:285]
12:35:00 INFO - PID 4684 | #18: NS_InvokeByIndex
12:35:00 INFO - PID 4684 | #19: CallMethodHelper::Invoke() [js/xpconnect/src/XPCWrappedNative.cpp:1996]
12:35:00 INFO - PID 4684 | #20: XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::CallMode) [js/xpconnect/src/XPCWrappedNative.cpp:1282]
12:35:00 INFO - PID 4684 | #21: XPC_WN_CallMethod(JSContext *,unsigned int,JS::Value *) [js/xpconnect/src/XPCWrappedNativeJSOps.cpp:982]
12:35:00 INFO - PID 4684 | #22: js::CallJSNative(JSContext *,bool (*)(JSContext *,unsigned int,JS::Value *),JS::CallArgs const &) [js/src/jscntxtinlines.h:293]
12:35:00 INFO - PID 4684 | #23: js::InternalCallOrConstruct(JSContext *,JS::CallArgs const &,js::MaybeConstruct) [js/src/vm/Interpreter.cpp:470]
12:35:00 INFO - PID 4684 | #24: InternalCall [js/src/vm/Interpreter.cpp:515]
12:35:00 INFO - PID 4684 | #25: js::jit::DoCallFallback [js/src/jit/BaselineIC.cpp:2453]
12:35:00 INFO - PID 4684 | #26: ??? (???:???)
12:35:00 INFO - PID 4684 | #27: ??? (???:???)
12:35:00 INFO - PID 4684 | #28: ??? (???:???)
12:35:00 INFO - PID 4684 | #29: EnterBaseline [js/src/jit/BaselineJIT.cpp:164]
12:35:00 INFO - PID 4684 | #30: js::jit::EnterBaselineAtBranch(JSContext *,js::InterpreterFrame *,unsigned char *) [js/src/jit/BaselineJIT.cpp:270]
12:35:00 INFO - PID 4684 | #31: Interpret [js/src/vm/Interpreter.cpp:1979]
12:35:00 INFO - PID 4684 | #32: js::RunScript(JSContext *,js::RunState &) [js/src/vm/Interpreter.cpp:410]
12:35:00 INFO - PID 4684 | #33: js::ExecuteKernel(JSContext *,JS::Handle<JSScript *>,JSObject &,JS::Value const &,js::AbstractFramePtr,JS::Value *) [js/src/vm/Interpreter.cpp:699]
12:35:00 INFO - PID 4684 | #34: js::Execute(JSContext *,JS::Handle<JSScript *>,JSObject &,JS::Value *) [js/src/vm/Interpreter.cpp:731]
12:35:00 INFO - PID 4684 | #35: Evaluate [js/src/jsapi.cpp:4662]
12:35:00 INFO - PID 4684 | #36: JS::Evaluate(JSContext *,JS::ReadOnlyCompileOptions const &,char const *,unsigned int,JS::MutableHandle<JS::Value>) [js/src/jsapi.cpp:4703]
12:35:00 INFO - PID 4684 | #37: ProcessArgs [js/xpconnect/src/XPCShellImpl.cpp:1131]
12:35:00 INFO - PID 4684 | #38: XRE_XPCShellMain(int,char * *,char * *,XREShellData const *) [js/xpconnect/src/XPCShellImpl.cpp:1522]
12:35:00 INFO - PID 4684 | #39: mozilla::BootstrapImpl::XRE_XPCShellMain(int,char * *,char * *,XREShellData const *) [toolkit/xre/Bootstrap.cpp:53]
12:35:00 INFO - PID 4684 | #40: NS_internal_main(int,char * *,char * *) [js/xpconnect/shell/xpcshell.cpp:74]
12:35:00 INFO - PID 4684 | #41: wmain [toolkit/xre/nsWindowsWMain.cpp:118]
12:35:00 INFO - PID 4684 | #42: __scrt_common_main_seh [f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253]
12:35:00 INFO - PID 4684 | #43: kernel32.dll + 0x53c45
12:35:00 INFO - PID 4684 | #44: ntdll.dll + 0x637f5
12:35:00 INFO - PID 4684 | #45: ntdll.dll + 0x637c8
The reason that this is happening is that I'm adding a call to TimerThread::FindNextFireTimeForCurrentThread() indirectly from nsThread::HasPendingEvent() and that function holds the timer thread lock, <https://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/xpcom/threads/TimerThread.cpp#596> creating this weird "potential" deadlock, which of course isn't a deadlock since the said timer thread lock has been freed, so there is no actual reference cycle.
In order to make the deadlock detector stop complaining, I'm going to make this job be dispatched off of a runnable from the main thread. I checked with Dragana on IRC and she said that should be OK.
Assignee | ||
Comment 1•7 years ago
|
||
Attachment #8874711 -
Flags: review?(dd.mozilla)
Assignee | ||
Comment 2•7 years ago
|
||
This is try push with many Windows debug xpcshell-test retriggers that are all green (except for one intermittent):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ca91e688ea3b777c10da658ff179e44e831879ed
Updated•7 years ago
|
Attachment #8874711 -
Flags: review?(dd.mozilla) → review+
Comment 4•7 years ago
|
||
Ehsan, how does that lambda hold the nsSocketTransportService alive during your main thread dispatch?
Flags: needinfo?(ehsan)
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #4)
> Ehsan, how does that lambda hold the nsSocketTransportService alive during
> your main thread dispatch?
No. Note that this is a thread observer so this code will run on the next event loop run after an event has been dispatched to the STS thread, which means the thread object will be around since during XPCOM shutdown we will run all pending events as we're waiting for the threads to join. But I suppose I can make it hold the object alive explicitly, it's easy enough to do and avoids the guesswork around the lifetime.
Flags: needinfo?(ehsan)
Comment 6•7 years ago
|
||
Or at least add a comment, because I think its somewhat non-obvious.
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #6)
> Or at least add a comment, because I think its somewhat non-obvious.
Somewhat is an understatement. :-) I'm testing a patch on try to add an explicit reference to just hold the object alive and not rely on this magic!
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/551c2fc04c72
Start poll watchdog timer off of a runnable from the main thread in order to avoid a false positive detected deadlock; r=dragana
Comment 9•7 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #0)
> The reason that this is happening is that I'm adding a call to
> TimerThread::FindNextFireTimeForCurrentThread() indirectly from
> nsThread::HasPendingEvent() and that function holds the timer thread lock,
> <https://searchfox.org/mozilla-central/rev/
> d441cb24482c2e5448accaf07379445059937080/xpcom/threads/TimerThread.cpp#596>
> creating this weird "potential" deadlock, which of course isn't a deadlock
> since the said timer thread lock has been freed, so there is no actual
> reference cycle.
If the timer thread lock is freed it should have been removed from tracking in the deadlock detector [1], if that's not the case we should file a bug to follow up on that.
[1] http://searchfox.org/mozilla-central/rev/507743376d1ba753d14ab6b9305b7c6358570be8/xpcom/threads/BlockingResourceBase.cpp#236
Updated•7 years ago
|
Flags: needinfo?(ehsan)
Assignee | ||
Comment 10•7 years ago
|
||
Filed bug 1370644 about fixing the deadlock detector.
Flags: needinfo?(ehsan)
Comment 11•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Updated•7 years ago
|
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•