Closed Bug 616072 Opened 14 years ago Closed 13 years ago

Deadlock involving proxy object lock

Categories

(Core :: XPCOM, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: roc, Unassigned)

References

Details

I was just doing regular browsing, my browser hung, and I caught it in a debugger. The main thread is stuck here: nspr4.dll!PR_Lock(PRLock * lock) Line 234 C xul.dll!nsProxyEventObject::Release() Line 105 C++ xul.dll!nsProxyObjectCallInfo::~nsProxyObjectCallInfo() Line 158 C++ xul.dll!nsProxyObjectCallInfo::`scalar deleting destructor'() + 0x8 bytes C++ > xul.dll!nsDOMWorkerEventListenerBase::Release() Line 50 + 0x1f bytes C++ xul.dll!nsCOMPtr_base::~nsCOMPtr_base() Line 82 C++ xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result) Line 636 C++ xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait) Line 250 + 0xd bytes C++ We're releasing 'event', an nsCOMPtr<nsIRunnable>, which happens to be an nsProxyObjectCallInfo. The method is sameTypeRootTreeInfo, and we seem to be calling it on an nsDocShell. (I don't know how nsDOMWorkerEventListenerBase is showing up there, it's probably an artifact of my opt build.) That call originated in nsNSSIOLayer.cpp. The main thread is hung waiting for the mProxyCreationLock. As far as I can tell, the main thread is not holding any locks. Another thread has this stack: nspr4.dll!PR_Lock(PRLock * lock) Line 234 C xul.dll!nsProxyObjectManager::GetProxyForObject(nsIEventTarget * aTarget, const nsID & aIID, nsISupports * aObj, int proxyType, void * * aProxyObject) Line 240 + 0x22 bytes C++ xul.dll!NS_GetProxyForObject(nsIEventTarget * target, const nsID & aIID, nsISupports * aObj, int proxyType, void * * aProxyObject) Line 353 + 0x18 bytes C++ > xul.dll!nsThreadPool::ShutdownThread(nsIThread * thread) Line 147 C++ xul.dll!nsThreadPool::Run() Line 234 C++ xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result) Line 626 + 0x6 bytes C++ xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait) Line 250 + 0xd bytes C++ So it's waiting for the mProxyCreationLock too. But as far as I can tell, it's not holding any locks either. Strangely, in PR_Lock, mProxyCreationLock->owner is NULL. But in the Windows CRITICAL_SECTION data correctly, OwningThread is the NSS thread that originated the proxied call to sameTypeRootTreeInfo. But that might just be the *last* thread that owned the critical section. Here's that thread's call stack: > nspr4.dll!_PR_MD_WAIT_CV(_MDCVar * cv, _MDLock * lock, unsigned int timeout) Line 282 C nspr4.dll!_PR_WaitCondVar(PRThread * thread, PRCondVar * cvar, PRLock * lock, unsigned int timeout) Line 205 C nspr4.dll!PR_Wait(PRMonitor * mon, unsigned int ticks) Line 184 + 0x1a bytes C xul.dll!nsEventQueue::GetEvent(int mayWait, nsIRunnable * * result) Line 85 + 0x9 bytes C++ xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result) Line 622 C++ xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait) Line 250 + 0xd bytes C++ xul.dll!nsProxyEventObject::CallMethod(unsigned short methodIndex, const XPTMethodDescriptor * methodInfo, nsXPTCMiniVariant * params) Line 260 + 0xa bytes C++ xul.dll!PrepareAndDispatch(nsXPTCStubBase * self, unsigned int methodIndex, unsigned int * args, unsigned int * stackBytesToPop) Line 114 + 0x15 bytes C++ xul.dll!SharedStub() Line 142 C++ xul.dll!nsNSSSocketInfo::EnsureDocShellDependentStuffKnown() Line 415 C++ xul.dll!nsNSSSocketInfo::GetPreviousCert(nsIX509Cert * * _result) Line 850 C++ xul.dll!HandshakeCallback(PRFileDesc * fd, void * client_data) Line 948 C++ ssl3.dll!ssl3_HandleFinished(sslSocketStr * ss, unsigned char * b, unsigned int length, const SSL3Hashes * hashes) Line 8503 + 0xa bytes C ssl3.dll!ssl3_HandleHandshakeMessage(sslSocketStr * ss, unsigned char * b, unsigned int length) Line 8660 C ssl3.dll!ssl3_HandleHandshake(sslSocketStr * ss, sslBufferStr * origBuf) Line 8727 + 0x8 bytes C ssl3.dll!ssl3_HandleRecord(sslSocketStr * ss, SSL3Ciphertext * cText, sslBufferStr * databuf) Line 9066 + 0x9 bytes C ssl3.dll!ssl3_GatherCompleteHandshake(sslSocketStr * ss, int flags) Line 209 + 0x11 bytes C ssl3.dll!ssl_GatherRecord1stHandshake(sslSocketStr * ss) Line 1259 C ssl3.dll!ssl_Do1stHandshake(sslSocketStr * ss) Line 151 + 0x3 bytes C ssl3.dll!ssl_SecureSend(sslSocketStr * ss, const unsigned char * buf, int len, int flags) Line 1213 + 0x6 bytes C ssl3.dll!ssl_SecureWrite(sslSocketStr * ss, const unsigned char * buf, int len) Line 1258 + 0x13 bytes C ssl3.dll!ssl_Write(PRFileDesc * fd, const void * buf, int len) Line 1653 C xul.dll!nsSSLThread::Run() Line 1052 C++ The state of the CRITICAL_SECTION is weird. Here it is: - lock 0x09f6c888 {links={...} owner=0x00000000 waitQ={...} ...} PRLock * + links {next=0x09f6c888 prev=0x09f6c888 } PRCListStr + owner 0x00000000 {state=??? priority=??? arg=??? ...} PRThread * + waitQ {next=0x09f6c894 prev=0x09f6c894 } PRCListStr priority PR_PRIORITY_FIRST PRThreadPriority boostPriority PR_PRIORITY_FIRST PRThreadPriority - ilock {mutex={...} notified={...} } _MDLock - mutex {DebugInfo=0x00362430 LockCount=-10 RecursionCount=1 ...} _RTL_CRITICAL_SECTION - DebugInfo 0x00362430 {Type=0 CreatorBackTraceIndex=0 CriticalSection=0x09f6c8a4 ...} _RTL_CRITICAL_SECTION_DEBUG * Type 0 unsigned short CreatorBackTraceIndex 0 unsigned short + CriticalSection 0x09f6c8a4 {DebugInfo=0x00362430 LockCount=-10 RecursionCount=1 ...} _RTL_CRITICAL_SECTION * + ProcessLocksList {Flink=0x003639e8 Blink=0x00362168 } _LIST_ENTRY EntryCount 0 unsigned long ContentionCount 15 unsigned long Flags 0 unsigned long CreatorBackTraceIndexHigh 0 unsigned short SpareWORD 0 unsigned short LockCount -10 long RecursionCount 1 long OwningThread 0x000014c0 void * LockSemaphore 0x000006b4 void * SpinCount 0 unsigned long + notified {length=0 cv=0x09f6c8c0 link=0x00000000 } _MDNotified http://msdn.microsoft.com/en-us/magazine/cc164040.aspx#S2 says that OwningThread is definitely zeroed out when a process releases the critical section. And LockCount should apparently never be less than -1. RecursionCount 1 suggests that the the NSS thread is definitely holding the lock. But all this information might be out of date :-(. I really don't have a clue what could have happened here. Maybe memory corruption?
Same deadlock occurs on 2011-04-10 nightly 0:006> ~0k ChildEBP RetAddr 002bd61c 771d0177 ntdll!ZwWaitForSingleObject+0x15 002bd680 771d0142 ntdll!RtlpWaitOnCriticalSection+0x155 002bd6a8 6505b3a7 ntdll!RtlEnterCriticalSection+0x152 002bd6b8 61c2656d nspr4!PR_Lock+0x17 002bd6cc 61c255f8 xul!nsProxyEventObject::Release+0x15 002bd6d8 61c2562c xul!nsProxyObjectCallInfo::~nsProxyObjectCallInfo+0x38 002bd6e0 61adc6b3 xul!nsProxyObjectCallInfo::`scalar deleting destructor'+0x8 002bd6ec 61a5111e xul!nsRunnable::Release+0x33 002bd6f4 61ac2833 xul!nsRefPtr<nsIRunnable>::~nsRefPtr<nsIRunnable>+0xe 002bd720 61aeb02e xul!nsThread::ProcessNextEvent+0x163 002bd754 61c97352 xul!mozilla::ipc::MessagePump::Run+0x6e 002bd760 61c9733b xul!MessageLoop::RunInternal+0x11 002bd778 64dc808b xul!MessageLoop::RunHandler+0x1d 002bd798 61c972e7 MOZCRT19!_VEC_memzero+0x36 002bd7b8 61c7a932 xul!MessageLoop::Run+0x15 002bd7c4 61ca2ad3 xul!nsBaseAppShell::Run+0x34 002bf718 61ca2a7d xul!nsAppShell::Run+0x42 002bf724 61bc7589 xul!nsAppStartup::Run+0x1e 002bfab4 010c134c xul!XRE_main+0xde9 002bfb04 010c16f2 firefox!wmain+0x34c 0:006> dt -r nspr4!PRLock 05caa4c0 +0x000 links : PRCListStr +0x000 next : 0x05caa4c0 PRCListStr +0x000 next : 0x05caa4c0 PRCListStr +0x004 prev : 0x05caa4c0 PRCListStr +0x004 prev : 0x05caa4c0 PRCListStr +0x000 next : 0x05caa4c0 PRCListStr +0x004 prev : 0x05caa4c0 PRCListStr +0x008 owner : (null) +0x00c waitQ : PRCListStr +0x000 next : 0x05caa4cc PRCListStr +0x000 next : 0x05caa4cc PRCListStr +0x004 prev : 0x05caa4cc PRCListStr +0x004 prev : 0x05caa4cc PRCListStr +0x000 next : 0x05caa4cc PRCListStr +0x004 prev : 0x05caa4cc PRCListStr +0x014 priority : 0 ( PR_PRIORITY_FIRST ) +0x018 boostPriority : 0 ( PR_PRIORITY_FIRST ) +0x01c ilock : _MDLock +0x000 mutex : _RTL_CRITICAL_SECTION +0x000 DebugInfo : 0x00312730 _RTL_CRITICAL_SECTION_DEBUG +0x004 LockCount : -14 +0x008 RecursionCount : 1 +0x00c OwningThread : 0x00000ee0 +0x010 LockSemaphore : 0x0000084c +0x014 SpinCount : 0 +0x018 notified : _MDNotified +0x000 length : 0 +0x004 cv : [6] <unnamed-tag> +0x04c link : (null) 0:006> ~ 0 Id: 156c.1410 Suspend: 1 Teb: fffdd000 Unfrozen 1 Id: 156c.fb8 Suspend: 1 Teb: fffaf000 Unfrozen 2 Id: 156c.12c0 Suspend: 1 Teb: fffa9000 Unfrozen 3 Id: 156c.e84 Suspend: 1 Teb: fffa6000 Unfrozen 4 Id: 156c.ae8 Suspend: 1 Teb: fffa3000 Unfrozen 5 Id: 156c.f74 Suspend: 1 Teb: fffa0000 Unfrozen . 6 Id: 156c.1bcc Suspend: 1 Teb: fff9d000 Unfrozen 7 Id: 156c.ee0 Suspend: 1 Teb: fff9a000 Unfrozen 8 Id: 156c.15d8 Suspend: 1 Teb: fff97000 Unfrozen 9 Id: 156c.1734 Suspend: 1 Teb: fff94000 Unfrozen 10 Id: 156c.1440 Suspend: 1 Teb: fff85000 Unfrozen 11 Id: 156c.1698 Suspend: 1 Teb: fff91000 Unfrozen 12 Id: 156c.180 Suspend: 1 Teb: fff8e000 Unfrozen 13 Id: 156c.1740 Suspend: 1 Teb: fff82000 Unfrozen 14 Id: 156c.e58 Suspend: 1 Teb: fff7c000 Unfrozen 15 Id: 156c.17d0 Suspend: 1 Teb: fff79000 Unfrozen 16 Id: 156c.ea8 Suspend: 1 Teb: fff76000 Unfrozen 17 Id: 156c.153c Suspend: 1 Teb: fff70000 Unfrozen 18 Id: 156c.1098 Suspend: 1 Teb: fff6d000 Unfrozen 19 Id: 156c.b6c Suspend: 1 Teb: fff37000 Unfrozen 20 Id: 156c.1498 Suspend: 1 Teb: fff7f000 Unfrozen 21 Id: 156c.1494 Suspend: 1 Teb: fff52000 Unfrozen 22 Id: 156c.1598 Suspend: 1 Teb: fffac000 Unfrozen 23 Id: 156c.17b8 Suspend: 1 Teb: fff64000 Unfrozen 24 Id: 156c.e4c Suspend: 1 Teb: fff61000 Unfrozen 25 Id: 156c.cf8 Suspend: 1 Teb: fff58000 Unfrozen 26 Id: 156c.1560 Suspend: 1 Teb: fff55000 Unfrozen 27 Id: 156c.d6c Suspend: 1 Teb: fff8b000 Unfrozen 29 Id: 156c.1818 Suspend: 1 Teb: fff6a000 Unfrozen 30 Id: 156c.958 Suspend: 1 Teb: fff88000 Unfrozen 0:006> ~7k 200 ChildEBP RetAddr 04ecf33c 74da1270 ntdll!ZwWaitForSingleObject+0x15 04ecf3ac 74da11d8 kernel32!WaitForSingleObjectEx+0xbe 04ecf3c0 6505f3f9 kernel32!WaitForSingleObject+0x12 04ecf3e4 6505af28 nspr4!_PR_MD_WAIT_CV+0xc9 04ecf404 6505ab31 nspr4!_PR_WaitCondVar+0x58 04ecf420 61ac2a50 nspr4!PR_Wait+0x51 04ecf448 61a5111e xul!nsThread::ProcessNextEvent+0x380 04ecf4d0 61bee46b xul!nsRefPtr<nsIRunnable>::~nsRefPtr<nsIRunnable>+0xe 04ecf510 61c384b8 xul!nsProxyObjectManager::GetProxyForObject+0x191 04ecf558 6233e430 xul!NS_GetProxyForObject+0x48 04ecf59c 6234e2ee xul!nsNSSSocketInfo::EnsureDocShellDependentStuffKnown+0xc2 04ecf5b0 6236d23c xul!nsNSSSocketInfo::GetPreviousCert+0x1c 04ecf644 64dc808b xul!HandshakeCallback+0x313 04ecf708 64db84f9 MOZCRT19!_VEC_memzero+0x36 04ecf724 63ed1bc4 MOZCRT19!arena_dalloc+0x39 04ecf728 64db83c8 softokn3!sftk_DestroySession+0x6b 04ecf740 64db84f9 MOZCRT19!arena_dalloc_small+0x28 04ecf7a0 644d4490 MOZCRT19!arena_dalloc+0x39 04ecf7bc 6505b402 ssl3!ssl3_CacheWrappedMasterSecret+0x142 04ecf7d4 6505b3e9 nspr4!PR_AssertCurrentThreadOwnsLock+0x12 04ecf7dc 644df940 nspr4!PR_Unlock+0x39 04ecf820 644d953a ssl3!CacheSID+0xa0 04ecf870 644d96a4 ssl3!ssl3_HandleHandshakeMessage+0x267 04ecf890 644d9b50 ssl3!ssl3_HandleHandshake+0xa8 04ecf91c 644d9e8c ssl3!ssl3_HandleRecord+0x3b9 04ecf948 644daff9 ssl3!ssl3_GatherCompleteHandshake+0x5d 04ecf95c 644dfaba ssl3!ssl_GatherRecord1stHandshake+0x29 04ecf970 644e0776 ssl3!ssl_Do1stHandshake+0x44 04ecf988 644e37e6 ssl3!ssl_SecureRecv+0xdf 04ecf9a8 61f5a59a ssl3!ssl_Recv+0x51 04ecf9d8 6237e39b xul!nsSSLThread::requestRecvMsgPeek+0xf2 04ecf9f8 61c30714 xul!PSMRecv+0x35 04ecfa28 61c30766 xul!nsSocketTransport::IsAlive+0x6e 04ecfa3c 61be8805 xul!nsHttpConnection::IsAlive+0x15 04ecfa50 61bb8c52 xul!nsHttpConnection::CanReuse+0x3f 04ecfa64 61bb89ce xul!nsHttpConnectionMgr::GetConnection+0x9e 04ecfaa4 61bb8934 xul!nsHttpConnectionMgr::ProcessNewTransaction+0x87 04ecfab4 61a07189 xul!nsHttpConnectionMgr::OnMsgNewTransaction+0x2b 04ecfac0 61ac282a xul!nsHttpConnectionMgr::nsConnEvent::Run+0x10 04ecfb5c 61b29318 xul!nsThread::ProcessNextEvent+0x15a 04ecfb7c 6505bdd9 xul!nsThread::ThreadFunc+0x88 00000000 00000000 nspr4!_PR_NativeRunThread+0x169
SSL thread gets lock by nsProxyObjectManager::GetProxyForObject(), then this thread is waiting event in xul!nsThread::ProcessNextEvent(). But main UI thread try getting this lock. So dead lock occurs.
There is no SSL thread after bug 674147. Also, the use of XPCOM proxies in PSM was removed in bug 675221.
Status: NEW → RESOLVED
Closed: 13 years ago
Depends on: 675221, 674147
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.