Closed Bug 484367 Opened 16 years ago Closed 12 years ago

nsNativeModuleLoader got stuck proxying do_GetService(NS_X509CERTDB_CONTRACTID)

Categories

(Core :: XPCOM, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mayhemer, Assigned: timeless)

Details

(Keywords: hang, Whiteboard: [psm-fatal])

Attachments

(1 file)

I was testing with trunk while reviewing bug 483168. My session store had the following pages saved: https://www.softballoutlet.com/, https://www.interfimo.fr/, https://secure.comodo.com/, https://www.gandi.net/, https://ev.tbs-x509.com/. In the last session I installed all CAs from bug 479029 comment 5 (probably not related to this bug).

Right after start, all tabs remained empty with the progress indicator cycling. The UI was NOT frozen, but page loads hung. I had applied the following patches: 480509 483440 483437 484111 479029 483168 483168, but it is quite unlikely that any of them would cause this hang. Apparently the deadlock appeared right after start.

I closed Firefox, but it hung during shutdown on the main thread joining the PSM background thread.

nsPSMBackgroundThread stack trace:

ntdll.dll!_KiFastSystemCallRet@0()
ntdll.dll!_NtWaitForSingleObject@12() + 0xc bytes
kernel32.dll!_WaitForSingleObjectEx@12() + 0x8b bytes
kernel32.dll!_WaitForSingleObject@8() + 0x12 bytes
nspr4.dll!_PR_MD_WAIT_CV(_MDCVar * cv=0x0646c23c, _MDLock * lock=0x066cec44, unsigned int timeout=4294967295) Line 280 + 0x14 bytes C
nspr4.dll!_PR_WaitCondVar(PRThread * thread=0x0517c0c0, PRCondVar * cvar=0x0646c1c8, PRLock * lock=0x066cec28, unsigned int timeout=4294967295) Line 204 + 0x17 bytes C
nspr4.dll!PR_Wait(PRMonitor * mon=0x06427e08, unsigned int ticks=4294967295) Line 175 + 0x1d bytes C
xpcom_core.dll!nsAutoMonitor::Wait(unsigned int interval=4294967295) Line 340 + 0x11 bytes C++
xpcom_core.dll!nsEventQueue::GetEvent(int mayWait=1, nsIRunnable * * result=0x05b2f100) Line 86 C++
xpcom_core.dll!nsThread::nsChainedEventQueue::GetEvent(int mayWait=1, nsIRunnable * * event=0x05b2f100) Line 113 C++
xpcom_core.dll!nsThread::ProcessNextEvent(int mayWait=1, int * result=0x05b2f124) Line 501 + 0x49 bytes C++
xpcom_core.dll!NS_ProcessNextEvent_P(nsIThread * thread=0x066fec60, int mayWait=1) Line 230 + 0x16 bytes C++
xpcom_core.dll!nsProxyEventObject::CallMethod(unsigned short methodIndex=3, const XPTMethodDescriptor * methodInfo=0x026aa8d0, nsXPTCMiniVariant * params=0x05b2f1d0) Line 260 + 0xb bytes C++
xpcom_core.dll!PrepareAndDispatch(nsXPTCStubBase * self=0x062f39f8, unsigned int methodIndex=3, unsigned int * args=0x05b2f290, unsigned int * stackBytesToPop=0x05b2f280) Line 114 + 0x21 bytes C++
xpcom_core.dll!SharedStub() Line 142 C++
xpcom_core.dll!nsNativeModuleLoader::LoadModule(nsILocalFile * aFile=0x02dd7b30, nsIModule * * aResult=0x05b2f828) Line 127 + 0x2d bytes C++
xpcom_core.dll!nsNativeModuleLoader::LoadModule(nsILocalFile * aFile=0x02dd7b30, nsIModule * * aResult=0x05b2f828) Line 127 + 0x2d bytes C++
xpcom_core.dll!nsFactoryEntry::GetFactory(nsIFactory * * aFactory=0x05b2f844) Line 3601 + 0x2f bytes C++
xpcom_core.dll!nsComponentManagerImpl::CreateInstanceByContractID(const char * aContractID=0x0147d648, nsISupports * aDelegate=0x00000000, const nsID & aIID={...}, void * * aResult=0x05b2f8a8) Line 1680 + 0xc bytes C++
xpcom_core.dll!nsComponentManagerImpl::GetServiceByContractID(const char * aContractID=0x0147d648, const nsID & aIID={...}, void * * result=0x05b2f910) Line 2252 + 0x34 bytes C++
xpcom_core.dll!CallGetService(const char * aContractID=0x0147d648, const nsID & aIID={...}, void * * aResult=0x05b2f910) Line 95 C++
xpcom_core.dll!nsGetServiceByContractID::operator()(const nsID & aIID={...}, void * * aInstancePtr=0x05b2f910) Line 278 + 0x13 bytes C++
pipnss.dll!nsCOMPtr<nsIX509CertDB>::assign_from_gs_contractid(nsGetServiceByContractID gs={...}, const nsID & aIID={...}) Line 1219 + 0x10 bytes C++
pipnss.dll!nsCOMPtr<nsIX509CertDB>::operator=(nsGetServiceByContractID rhs={...}) Line 691 C++
> pipnss.dll!nsNSSCertificate::hasValidEVOidTag(SECOidTag & resultOidTag=SEC_OID_UNKNOWN, int & validEV=0) Line 910 C++
pipnss.dll!nsNSSCertificate::getValidEVOidTag(SECOidTag & resultOidTag=SEC_OID_UNKNOWN, int & validEV=0) Line 1017 + 0x10 bytes C++
pipnss.dll!nsNSSCertificate::GetIsExtendedValidation(int * aIsEV=0x05b2fcb4) Line 1043 + 0x13 bytes C++
pipnss.dll!AuthCertificateCallback(void * client_data=0x00000000, PRFileDesc * fd=0x051b8440, int checksig=1, int isServer=0) Line 982 C++
ssl3.dll!ssl3_HandleCertificate(sslSocketStr * ss=0x051ab818, unsigned char * b=0x051b4cbe, unsigned int length=0) Line 7280 + 0x21 bytes C
ssl3.dll!ssl3_HandleHandshakeMessage(sslSocketStr * ss=0x051ab818, unsigned char * b=0x051b385c, unsigned int length=5218) Line 7958 + 0x11 bytes C
ssl3.dll!ssl3_HandleHandshake(sslSocketStr * ss=0x051ab818, sslBufferStr * origBuf=0x051aba60) Line 8082 + 0x19 bytes C
ssl3.dll!ssl3_HandleRecord(sslSocketStr * ss=0x051ab818, SSL3Ciphertext * cText=0x05b2fe50, sslBufferStr * databuf=0x051aba60) Line 8345 + 0xd bytes C
ssl3.dll!ssl3_GatherCompleteHandshake(sslSocketStr * ss=0x051ab818, int flags=0) Line 206 + 0x17 bytes C
ssl3.dll!ssl_GatherRecord1stHandshake(sslSocketStr * ss=0x051ab818) Line 1258 + 0xb bytes C
ssl3.dll!ssl_Do1stHandshake(sslSocketStr * ss=0x051ab818) Line 151 + 0xf bytes C
ssl3.dll!ssl_SecureSend(sslSocketStr * ss=0x051ab818, const unsigned char * buf=0x05211e60, int len=369, int flags=0) Line 1176 + 0x9 bytes C
ssl3.dll!ssl_SecureWrite(sslSocketStr * ss=0x051ab818, const unsigned char * buf=0x05211e60, int len=369) Line 1221 + 0x13 bytes C
ssl3.dll!ssl_Write(PRFileDesc * fd=0x051b8440, const void * buf=0x05211e60, int len=369) Line 1488 + 0x17 bytes C
pipnss.dll!nsSSLThread::Run() Line 1029 + 0x1c bytes C++
pipnss.dll!nsPSMBackgroundThread::nsThreadRunner(void * arg=0x051735b0) Line 45 C++
nspr4.dll!_PR_NativeRunThread(void * arg=0x0517c0c0) Line 436 + 0xf bytes C
nspr4.dll!pr_root(void * arg=0x0517c0c0) Line 122 + 0xf bytes C
msvcr80d.dll!_callthreadstartex() Line 348 + 0xf bytes C
msvcr80d.dll!_threadstartex(void * ptd=0x0517d258) Line 331 C
kernel32.dll!_BaseThreadStart@8() + 0x37 bytes

It looks like the event queue never gets spun again.
Locals of frame: xpcom_core.dll!nsThread::nsChainedEventQueue::GetEvent(int mayWait=1, nsIRunnable * * event=0x05b2f100) Line 113 C++

- mQueue {mMonitor=0x06427e08 mHead=0x00000000 mTail=0x00000000 ...} nsEventQueue
  mMonitor 0x06427e08 PRMonitor *
+ mHead 0x00000000 {mNext=??? mEvents=0x00000004 } nsEventQueue::Page *
+ mTail 0x00000000 {mNext=??? mEvents=0x00000004 } nsEventQueue::Page *
  mOffsetHead 0 unsigned short
  mOffsetTail 0 unsigned short
  mayWait 1 int
- this 0x0635c6e0 {mNext=0x066fec7c mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue * const
+ mNext 0x066fec7c {mNext=0x00000000 mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue *
- mFilter {mRawPtr=0x0657a560 } nsCOMPtr<nsIThreadEventFilter>
- mRawPtr 0x0657a560 {mRefCnt={...} _mOwningThread={...} } nsIThreadEventFilter *
- [nsProxyThreadFilter] {mRefCnt={...} _mOwningThread={...} } nsProxyThreadFilter
+ nsIThreadEventFilter {...} nsIThreadEventFilter
- mRefCnt {mValue=2 } nsAutoRefCnt
  mValue 2 unsigned long
- _mOwningThread {mThread=0x0517c0c0 } nsAutoOwningThread
  mThread 0x0517c0c0 void *
+ nsISupports {...} nsISupports
- mQueue {mMonitor=0x06427e08 mHead=0x00000000 mTail=0x00000000 ...} nsEventQueue
  mMonitor 0x06427e08 PRMonitor *
+ mHead 0x00000000 {mNext=??? mEvents=0x00000004 } nsEventQueue::Page *
+ mTail 0x00000000 {mNext=??? mEvents=0x00000004 } nsEventQueue::Page *
  mOffsetHead 0 unsigned short
  mOffsetTail 0 unsigned short

Locals of frame: xpcom_core.dll!nsThread::ProcessNextEvent(int mayWait=1, int * result=0x05b2f124) Line 501 + 0x49 bytes C++

- this 0x066fec60 {mRefCnt={...} _mOwningThread={...} mLock=0x02f7f290 ...} nsThread * const
+ nsIThreadInternal {...} nsIThreadInternal
+ nsISupportsPriority {...} nsISupportsPriority
+ mRefCnt {mValue=4 } nsAutoRefCnt
+ _mOwningThread {mThread=0x0517c0c0 } nsAutoOwningThread
- sGlobalObserver 0x00daa094 {mRefCnt={...} _mOwningThread={...} mRuntime=0x00daa138 ...} nsIThreadObserver *
- [nsXPConnect] {mRefCnt={...} _mOwningThread={...} mRuntime=0x00daa138 ...} nsXPConnect
+ nsIXPConnect {...} nsIXPConnect
+ nsIThreadObserver {...} nsIThreadObserver
+ nsSupportsWeakReference {mProxy=0x00000000 } nsSupportsWeakReference
+ nsCycleCollectionJSRuntime {...} nsCycleCollectionJSRuntime
+ nsCycleCollectionParticipant {...} nsCycleCollectionParticipant
+ nsIJSRuntimeService {...} nsIJSRuntimeService
+ nsIThreadJSContextStack {...} nsIThreadJSContextStack
+ mRefCnt {mValue=14 } nsAutoRefCnt
+ _mOwningThread {mThread=0x00cb3b98 } nsAutoOwningThread
+ gSelf 0x00daa090 {mRefCnt={...} _mOwningThread={...} mRuntime=0x00daa138 ...} nsXPConnect *
  gOnceAliveNowDead 0 int
+ mRuntime 0x00daa138 {mStrIDs=0x00daa138 mStrJSVals=0x00daa184 mXPConnect=0x00daa090 ...} XPCJSRuntime *
+ mInterfaceInfoManager {mRawPtr=0x00d25830 } nsCOMPtr<nsIInterfaceInfoSuperManager>
+ mDefaultSecurityManager 0x00d0d108 {mRefCnt={...} _mOwningThread={...} mOriginToPolicyMap=0x02e854d0 ...} nsIXPCSecurityManager *
  mDefaultSecurityManagerFlags 63 unsigned short
  mShuttingDown 0 int
+ mCycleCollectionContext 0x00000000 {mState=??? mXPC=??? mThreadData=??? ...} XPCCallContext *
  mCycleCollecting 0 int
+ mScopes {...} nsBaseHashtable<nsPtrHashKey<void const >,nsISupports *,nsISupports *>
+ mBackstagePass {mRawPtr=0x0264d83c } nsCOMPtr<nsIXPCScriptable>
  gReportAllJSExceptions 0 unsigned int
+ nsISupports {...} nsISupports
  mLock 0x02f7f290 PRLock *
+ mObserver {mRawPtr=0x00000000 } nsCOMPtr<nsIThreadObserver>
- mEvents 0x0635c6e0 {mNext=0x066fec7c mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue *
+ mNext 0x066fec7c {mNext=0x00000000 mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue *
+ mFilter {mRawPtr=0x0657a560 } nsCOMPtr<nsIThreadEventFilter>
+ mQueue {mMonitor=0x06427e08 mHead=0x00000000 mTail=0x00000000 ...} nsEventQueue
- mEventsRoot {mNext=0x00000000 mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue
+ mNext 0x00000000 {mNext=??? mFilter={...} mQueue={...} } nsThread::nsChainedEventQueue *
+ mFilter {mRawPtr=0x00000000 } nsCOMPtr<nsIThreadEventFilter>
+ mQueue {mMonitor=0x06374e40 mHead=0x00000000 mTail=0x00000000 ...} nsEventQueue
  mPriority 0 int
  mThread 0x0517c0c0 PRThread *
  mRunningEvent 0 unsigned int
+ mShutdownContext 0x00000000 {joiningThread=??? shutdownAck=??? } nsThreadShutdownContext *
  mShutdownRequired 0 unsigned char
  mShutdownPending 205 'Í' unsigned char
  mEventsAreDoomed 0 unsigned char
  mayWait 1 int
- result 0x05b2f124 int *
  95613232 int
  notifyGlobalObserver 1 int
+ event {mRawPtr=0x00000000 } nsCOMPtr<nsIRunnable>
  rv 103230692 unsigned int
+ obs {mRawPtr=0x00000000 } nsCOMPtr<nsIThreadObserver>
Believe this is a duplicate of bug 468736.
(In reply to comment #2)
> Believe this is a duplicate of bug 468736.

Looks pretty much the same. I don't have more info at the moment, so I'm just marking it as a duplicate.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
ok, this is special/different. I wish you had kept the other threads around, but let's build an imaginary stack to tell our story.

main thread:
A main
B NS_ShutdownXPCOM(nsIServiceManager* servMgr)
C NotifyObservers(mgr, NS_XPCOM_SHUTDOWN_OBSERVER_ID,
D NotifyObservers(nsnull, NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID,
E nsTimerImpl::Shutdown();
F nsThreadManager::get()->Shutdown();
G nsPSMBackgroundThread->join()
H EnumerateObservers(NS_XPCOM_SHUTDOWN_LOADERS_OBSERVER_ID,
I observerService->Shutdown();
J nsComponentManagerImpl::gComponentManager->FreeServices();
K nsProxyObjectManager::Shutdown();
L obs->Observe(nsnull, NS_XPCOM_SHUTDOWN_LOADERS_OBSERVER_ID,
M (nsComponentManagerImpl::gComponentManager)->Shutdown();

On the main thread, we are at G. Note that we haven't hit H, K, or L yet, and they're all somewhat interesting.

psm thread:
N _BaseThreadStart
O nsPSMBackgroundThread::nsThreadRunner
P nsSSLThread::Run
Q nsNSSCertificate::GetIsExtendedValidation
R nsComponentManagerImpl::GetServiceByContractID
S nsNativeModuleLoader::LoadModule
T nsProxyEventObject::CallMethod
U nsThread::ProcessNextEvent

The other thread is in U. Sadly, the critical thing here is the PEO (nsProxyEventObject), which will die in K. I think the fix is for PEOs to listen for D.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Summary: Strange deadlock on nsPSMBackgroundThread → nsNativeModuleLoader get stuck proxying during shutdown
ok, here's a proposal:

1. nsProxyObjectCallInfo::nsProxyObjectCallInfo adds itself to a thread-local list of objects (not TLS, just private to the POM and its cohorts).
2. nsProxyObjectCallInfo::SetCompleted removes itself from the same list.
3. nsProxyThreadFilter::AcceptEvent is taught about some new magic signal. When it gets the magic signal, it runs through the thread-local list of nsProxyObjectCallInfo objects and sets a squished flag on each, and then it returns true.
4. nsProxyObjectCallInfo supports a squished flag: nsProxyObjectCallInfo::Run() checks for squished and, if it's set, returns NS_OK instead of making the call, so a call marked as squished does nothing.
5. EventQueues or something register for NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID and use it to send the signal to each queue's nsProxyThreadFilter.
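To make the proposal above a bit more concrete, here is a rough standalone sketch of the "squished" bookkeeping in plain C++. It is not XPCOM code: PendingCall and PendingCallRegistry are hypothetical stand-ins for nsProxyObjectCallInfo and the POM-private list, a single mutex-guarded set replaces the per-thread list, and Run() returns a bool instead of an nsresult. It also assumes a squished call is still marked completed so it leaves the list.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <unordered_set>
#include <utility>

class PendingCall;

// Hypothetical list of in-flight proxied calls (stand-in for the POM-private list).
class PendingCallRegistry {
public:
    void Add(PendingCall* call) {
        std::lock_guard<std::mutex> guard(mLock);
        mCalls.insert(call);
    }
    void Remove(PendingCall* call) {
        std::lock_guard<std::mutex> guard(mLock);
        mCalls.erase(call);
    }
    std::size_t Count() {
        std::lock_guard<std::mutex> guard(mLock);
        return mCalls.size();
    }
    // The "magic signal": mark every in-flight call as squished.
    void SquishAll();

private:
    std::mutex mLock;
    std::unordered_set<PendingCall*> mCalls;
};

// Hypothetical stand-in for nsProxyObjectCallInfo.
class PendingCall {
public:
    PendingCall(PendingCallRegistry& registry, std::function<void()> method)
        : mRegistry(registry), mMethod(std::move(method)) {
        mRegistry.Add(this);  // the constructor registers the call
    }
    ~PendingCall() { SetCompleted(); }

    void Squish() { mSquished.store(true); }

    // Completion unregisters the call exactly once.
    void SetCompleted() {
        if (!mCompleted) {
            mCompleted = true;
            mRegistry.Remove(this);
        }
    }

    // Invokes the method unless squished; returns whether it was invoked.
    bool Run() {
        const bool invoke = !mSquished.load();
        if (invoke) {
            mMethod();
        }
        SetCompleted();
        return invoke;
    }

private:
    PendingCallRegistry& mRegistry;
    std::function<void()> mMethod;
    std::atomic<bool> mSquished{false};
    bool mCompleted = false;
};

inline void PendingCallRegistry::SquishAll() {
    std::lock_guard<std::mutex> guard(mLock);
    for (PendingCall* call : mCalls) {
        call->Squish();
    }
}
```

In this sketch a shutdown observer would call SquishAll() once; any call that runs afterwards becomes a no-op, so a thread waiting on it can unwind. The registry has to outlive every PendingCall that refers to it.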
Status: REOPENED → NEW
Just to emphasize, this didn't happen during shutdown but during startup. The browser was stuck from the very start, before any page got loaded, as I describe in comment 0. The main thread at shutdown was here (after I closed FF and all windows disappeared):

...
nsPSMBackgroundThread::requestExit
PR_JoinThread

That doesn't spin the event loop of the calling thread. I saved the local variables from the stuck thread to let you take a look and potentially deduce something.
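For readers less familiar with synchronous proxy calls, the dependency can be modelled in a few lines: the calling thread's request only completes once the target thread processes its own queue and posts the reply back. The sketch below is a single-threaded toy model with hypothetical names (ModelEventQueue, ModelProxyCall); it is not the XPCOM implementation and only illustrates why a target thread that never runs its loop leaves the caller waiting.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical model of one thread's event queue (not nsEventQueue).
class ModelEventQueue {
public:
    void Post(std::function<void()> event) { mEvents.push_back(std::move(event)); }

    // Runs one pending event; returns false if there was nothing to run,
    // which is where a real thread with mayWait=1 would block.
    bool ProcessNext() {
        if (mEvents.empty()) {
            return false;
        }
        std::function<void()> event = std::move(mEvents.front());
        mEvents.pop_front();
        event();
        return true;
    }

private:
    std::deque<std::function<void()>> mEvents;
};

// Hypothetical model of a synchronous proxied call.
struct ModelProxyCall {
    bool completed = false;

    // Caller side: post the request to the target queue. When the target
    // runs it, the completion notice is posted back to the caller's queue.
    void Dispatch(ModelEventQueue& target, ModelEventQueue& replyTo) {
        target.Post([this, &replyTo] {
            replyTo.Post([this] { completed = true; });
        });
    }
};
```

In this model the worker's ProcessNext() finds nothing until the main queue has been processed, which is the state the background thread appears to be in at frame nsThread::ProcessNextEvent.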
Summary: nsNativeModuleLoader get stuck proxying during shutdown → nsNativeModuleLoader get stuck proxying
I think the simple solution to this bug, if it needs one, is to make sure that PSM gets whatever service it needs (step R) from the main thread before running the PSM thread. Making proxy objects do shutdown correctly is a very involved and racy task: I think we can probably simplify Josh's suggestion considerably if everyone can check for shutdown state and notify monitors more simply. I wonder if we should be preventing proxy calls from being issued during XPCOM shutdown at all, or at some well-defined point during the shutdown process before we shut down all threading.
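The "get the service on the main thread first" idea can be sketched as follows, under loud assumptions: CertDB, GetCertDBOnMainThread and BackgroundWorker are hypothetical stand-ins (for nsIX509CertDB, the do_GetService lookup, and the PSM thread), and std::thread replaces NSPR threads. The point is only the ordering: the lookup happens before the worker exists, so the worker never has to proxy a lookup back to the main thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the certificate DB service.
struct CertDB {
    // Toy check: only one made-up host counts as EV in this sketch.
    bool IsEV(const std::string& host) const { return host == "ev.example"; }
};

// Hypothetical stand-in for a service lookup that must run on the main thread.
inline std::shared_ptr<const CertDB> GetCertDBOnMainThread() {
    return std::make_shared<const CertDB>();
}

// Hypothetical worker that receives the already-resolved service up front.
class BackgroundWorker {
public:
    explicit BackgroundWorker(std::shared_ptr<const CertDB> certDB)
        : mCertDB(std::move(certDB)) {}

    // Runs the check on a separate thread using the prefetched service;
    // no lookup (and therefore no proxy call) happens on that thread.
    bool CheckOnWorkerThread(const std::string& host) const {
        bool result = false;
        std::thread worker([this, &host, &result] { result = mCertDB->IsEV(host); });
        worker.join();
        return result;
    }

private:
    std::shared_ptr<const CertDB> mCertDB;
};
```

A caller would fetch the service once on the main thread, hand it to the worker at construction time, and only then start issuing work to the background thread.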
Attached patch: take bs's advice (deleted)
909 certdb = do_GetService(NS_X509CERTDB_CONTRACTID);

is the component that we need to poke.
Assignee: nobody → timeless
Status: NEW → ASSIGNED
Attachment #371407 - Flags: review?(kaie)
Summary: nsNativeModuleLoader get stuck proxying → nsNativeModuleLoader got stuck proxying do_GetService(NS_X509CERTDB_CONTRACTID)
Comment on attachment 371407
take bs's advice

This may not be called on the main thread the first time.
Whiteboard: [psm-fatal]
> I think the simple solution to this bug, if it needs one, is to make sure
> that PSM gets whatever service it needs (step R) from the main thread before
> running the PSM thread.

I guess something like that has been done in the recent past.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago → 12 years ago
Resolution: --- → WORKSFORME
Comment on attachment 371407
take bs's advice

bitrot
Attachment #371407 - Flags: review?(kaie)