Closed Bug 468736 Opened 16 years ago Closed 13 years ago

PSM nsPSMBackgroundThread::requestExit is racy and not threadsafe

Tracking

(firefox5-, firefox6-, blocking2.0 .x+)

Status:

RESOLVED FIXED

Milestone:

mozilla7

Tracking Flags:

Tracking

Status

firefox5

---

firefox6

---

blocking2.0

---

.x+

People

(Reporter: timeless, Assigned: briansmith)

References

Details

(Keywords: hang, Whiteboard: [inbound])

Attachments

(4 files, 7 obsolete files)

profiled stack trace from camino 16 years ago timeless (deleted), text/plain		Details
Patch v1 [i think this might address the race i described] 16 years ago timeless (deleted), patch	mayhemer : review-	Details \| Diff \| Splinter Review
another possible sample of this hang 15 years ago Oscar van Vliet (deleted), text/plain		Details
A sample of a hang from Firefox 3.5 final 15 years ago Randy Oo (deleted), text/plain		Details
whitespace-ignore version of patch v1 15 years ago Kai Engert (:KaiE:) (deleted), patch		Details \| Diff \| Splinter Review
smaller whitespace-ignore vresion of patch v1 15 years ago Kai Engert (:KaiE:) (deleted), patch		Details \| Diff \| Splinter Review
Updated patch 14 years ago (no longer active) (deleted), patch	mayhemer : review+	Details \| Diff \| Splinter Review
Wait for 25ms or until notified by background thread 13 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) (deleted), patch		Details \| Diff \| Splinter Review
Wait for 25ms or until notified by background thread, fixed 13 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) (deleted), patch		Details \| Diff \| Splinter Review
Have background threads send exit notification to requestExit in the main thread 13 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) (deleted), patch	mayhemer : review+	Details \| Diff \| Splinter Review
v2 Have background threads send exit notification to requestExit in the main thread 13 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) (deleted), patch	briansmith : review+	Details \| Diff \| Splinter Review

timeless

Reporter

Description

•

16 years ago

Attached file profiled stack trace from camino (deleted) — Details

ok, so the main thread is here: http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/security/manager/ssl/src/nsPSMBackgroundThread.cpp?rev=23dd66069c44&mark=88-97#82 - i think this code needs a stack object. it looks like a race http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/security/manager/ssl/src/nsNSSIOLayer.cpp?rev=23dd66069c44&mark=341-375#334 - is the other thread - the second thread enters passes 340 and then gets dangerous. the main thread enters and does stuff, but it's too late.

Mike Beltzner [:beltzner, not reading bugmail]

Comment 2

•

16 years ago

This causes unpredictable hangs (see bug 468234) and is thusly nominated for blocking.

Flags: blocking1.9.1?

Benjamin Smedberg

Comment 3

•

16 years ago

I don't think this is prevalent enough to be a blocker. Definitely wanted.

Flags: wanted1.9.1+

Flags: blocking1.9.1?

Flags: blocking1.9.1-

Stewart Campbell

Comment 5

•

16 years ago

I can reproduce this almost everytime I use Camino (Version 2.0b1 (1.9.0.6pre 2008121820)) (See Duplicate bug 471080). It seems to be more prevalent when I use GMail before I try to quit.

Honza Bambas (:mayhemer)

Comment 7

•

16 years ago

(In reply to comment #6) > *** Bug 484367 has been marked as a duplicate of this bug. *** nsPSMBackgroundThread::requestExit is innocent here. See bug 484367, it happened to me right after start, that bug also contains some local variables dump on the dead thread, hope it might move us forward a bit. It seems to me that this is some event queue problem, but it's a guess. I could reproduce just ones.

timeless

Reporter

Comment 8

•

16 years ago

Attached patch Patch v1 [i think this might address the race i described] (obsolete) (deleted) — Details — Splinter Review

could someone who hits but try running w/ this? (you can tryserver it)

Assignee: kaie → timeless

Status: NEW → ASSIGNED

Attachment #371365 - Flags: review?

timeless

Reporter

Updated

•

16 years ago

Attachment #371365 - Flags: superreview?(kaie)

Attachment #371365 - Flags: review?(stewart.campbell)

Attachment #371365 - Flags: review?(honzab.moz)

Attachment #371365 - Flags: review?

Honza Bambas (:mayhemer)

Comment 9

•

16 years ago

Comment on attachment 371365 [details] [diff] [review] Patch v1 [i think this might address the race i described] In your new while cycle in nsPSMBackgroundThread::requestExit you should access mExitState under the lock as you modify it on the other thread. Why did you remove NS_ASSERTION(mThreadHandle, "Could not create nsPSMBackgroundThread\n"); ? Nit: if (ssl_thread_singleton->mExitState): I'd rather see if (ssl_thread_singleton->mExitState != ePSMThreadRunning), just to make more clear what the condition is about Please don't include the only-formatting changes in nsNSSComponent, it's unrelated to this bug. Otherwise it looks good, we just have to take care to test shutdown well, there might be races introduced.

Attachment #371365 - Flags: review?(honzab.moz) → review-

Oscar van Vliet

Comment 10

•

15 years ago

Attached file another possible sample of this hang (deleted) — Details

I have had another hang after it was suggested in bug 466148 that my crashes are due to this bug. I attached the sample. If it is the same bug, I would be happy to test out builds with fixes, as Smokey said it was difficult to reproduce.

Randy Oo

Comment 12

•

15 years ago

Attached file A sample of a hang from Firefox 3.5 final (deleted) — Details

Oscar van Vliet wrote: > Smokey said it was difficult to reproduce. I really don't understand this. My bug, comment #1 (which existed before this one was reported, yet somehow got closed as a dupe of this one?!?) has a testcase that triggers the problem every single time, and a website that I visit frequently brings Firefox 3.5 to its knees EVERY SINGLE TIME, without fail. I can't (and won't) upgrade from 3.0.11 until this bug is resolved. I'd just like to point out that there may be information in the comments under my bug that could prove helpful for someone working on this one (if anyone), and also that I can reproduce the problem consistently, so I'd be happy to help with testing patches or capture samples from the crashing, if it would be helpful.

Ilya Sherman

Comment 13

•

15 years ago

(In reply to comment #12) > Created an attachment (id=386436) [details] > A sample of a hang from Firefox 3.5 final AFAICT, this sample does not have nsPSMBackgroundThread::requestExit in it, and so is a different bug.

Wayne Mery (:wsmwk)

Updated

•

15 years ago

Severity: major → critical

Kai Engert (:KaiE:)

Updated

•

15 years ago

Attachment #371365 - Attachment description: i think this might address the race i described → Patch v1 [i think this might address the race i described]

Kai Engert (:KaiE:)

Comment 14

•

15 years ago

Attached patch whitespace-ignore version of patch v1 (obsolete) (deleted) — Details — Splinter Review

timeless, maybe it's just me, but I find it very difficult to separate "bug fix meat" from "white space changes" when looking through a patch, especially when trying to understand fixes for tricky code. This is timeless' patch, ignoring all whitespace changes (which doesn't work for the places where you split code into multiple lines).

Kai Engert (:KaiE:)

Comment 15

•

15 years ago

Attached patch smaller whitespace-ignore vresion of patch v1 (obsolete) (deleted) — Details — Splinter Review

I've removed additional unnecessary chunks from the patch.

Attachment #439892 - Attachment is obsolete: true

Kai Engert (:KaiE:)

Comment 16

•

15 years ago

Comment on attachment 371365 [details] [diff] [review] Patch v1 [i think this might address the race i described] clearing review requests, because Honza already gave r- waiting for answers to Honza's comments and a new patch

Attachment #371365 - Flags: superreview?(kaie)

Attachment #371365 - Flags: review?(stewart.campbell)

(no longer active)

Comment 17

•

14 years ago

So, what's needed to make progress on this bug? Just addressing the comments in comment 9?

Honza Bambas (:mayhemer)

Comment 18

•

14 years ago

(In reply to comment #17) > So, what's needed to make progress on this bug? Just addressing the comments > in comment 9? Probably yes. Also some kind of automated testing for this would be excelent, but probably unachievable.

(no longer active)

Comment 19

•

14 years ago

OK, then. If this is deemed to be blocking, I can pick it up and work on it. If not, I'm afraid that I can take a look at it post-2.0.

blocking2.0: --- → ?

chris hofmann

Comment 20

•

14 years ago

looks like this one might have missed the grand re-triage. have we decided to block on firefox 4, or not?

Benjamin Smedberg

Comment 21

•

14 years ago

Not a regression from 3.6, not a blocker.

blocking2.0: ? → .x

Myk Melez [:myk] [@mykmelez]

Comment 22

•

14 years ago

It looks like this might be causing bug 651665. Ehsan: any chance you can pick this up now that we're post-2.0?

(no longer active)

Comment 23

•

14 years ago

Attached patch Updated patch (obsolete) (deleted) — Details — Splinter Review

This patch updates the no-whitespace-change version to apply cleanly on trunk, and then addresses review comments in comment 9. I can't reproduce the problem, so I don't really know how to test this. Running the browser and going to SSL sites doesn't seem to break anything obvious. I've posted this to the try server as well.

Attachment #527557 - Flags: review?(honzab.moz)

Myk Melez [:myk] [@mykmelez]

Comment 24

•

14 years ago

One indication that your patch resolves bug 651665 is that the Add-on SDK tests pass on Linux Debug with your push to try server: http://tbpl.mozilla.org/?tree=MozillaTry&noignore=1&pusher=eakhgari@mozilla.com Compare with this random push of yours to mozilla-central: http://tbpl.mozilla.org/?noignore=1&pusher=eakhgari@mozilla.com That isn't determinative, since it sometimes happens anyway (f.e. in that random push of yours, one of the four debug builds that have run the tests have passed them). But most of the time the tests fail on debug builds when tested by the test automation, so it's a good sign. It'd be useful to have Linux64 Debug, OS X Debug, and OS X64 Debug results for that try server push as well.

Myk Melez [:myk] [@mykmelez]

Updated

•

14 years ago

Blocks: 651665

Myk Melez [:myk] [@mykmelez]

Comment 25

•

14 years ago

(In reply to comment #24) > It'd be useful to have Linux64 Debug, OS X Debug, and OS X64 Debug results for > that try server push as well. The results are now in, and all those builds passed the SDK tests, so it looks like your patch does indeed fix the problem.

(no longer active)

Comment 26

•

14 years ago

And in general try server was extremely happy with my push: <http://tbpl.mozilla.org/?tree=MozillaTry&rev=b53e241153e5>. (The two windows reftest failures are caused by another patch in my queue, and the OS X opt Moth orange is intermittent.)

Myk Melez [:myk] [@mykmelez]

Comment 27

•

14 years ago

(In reply to comment #25) > The results are now in, and all those builds passed the SDK tests, so it looks > like your patch does indeed fix the problem. To make extra sure, I re-ran all the SDK tests on the four Mac/Linux debug builds, and they all passed the second time as well: http://tbpl.mozilla.org/?tree=MozillaTry&rev=b53e241153e5&noignore=1 We don't have test results on Windows debug builds because the SDK tests aren't being run on those yet (bug 651806).

Honza Bambas (:mayhemer)

Comment 28

•

14 years ago

Comment on attachment 527557 [details] [diff] [review] Updated patch >+++ b/security/manager/ssl/src/nsCertVerificationThread.cpp >+ while (!mExitState && (0 == verification_thread_singleton->mJobQ.GetSize())) { mExitState == ePSMThreadRunning please. >+ if (mExitState) > break; mExitState != ePSMThreadRunning please. -1917,21 +1915,29 @@ nsNSSComponent::Init() > mSSLThread = new nsSSLThread(); >- if (mSSLThread) >- mSSLThread->startThread(); >+ if (mSSLThread) { >+ if (NS_FAILED(mSSLThread->startThread())) { >+ delete mSSLThread; >+ mSSLThread = nsnull; >+ } >+ } > mCertVerificationThread = new nsCertVerificationThread(); >- if (mCertVerificationThread) >- mCertVerificationThread->startThread(); >+ if (mCertVerificationThread) { >+ if (NS_FAILED(mCertVerificationThread->startThread())) { >+ delete mCertVerificationThread; >+ mCertVerificationThread = nsnull; >+ } >+ } > Do the same in nsNSSComponent::DoProfileChangeNetRestore or rather remove this change (I prefer the letter). I don't think we have non-null checks on all places in the code. > void nsPSMBackgroundThread::requestExit() >- if (mExitRequested) >+ if (mExitState) > return; mExitState != ePSMThreadRunning please. >+ while (GetExitStateThreadSafe() < ePSMThreadStopped) { >+ NS_ProcessNextEvent(nsnull, PR_FALSE); >+ if (GetExitStateThreadSafe() < ePSMThreadStopped) >+ PR_Sleep(PR_SecondsToInterval(5)); >+ } We should add a comment here that we cannot rely on posting/processing events since there is no event loop that late when this code gets called. > PRBool nsSSLThread::exitRequested() >+ return ssl_thread_singleton->mExitState; mExitState != ePSMThreadRunning please. r=honzab with these fixes

Attachment #527557 - Flags: review?(honzab.moz) → review+

(no longer active)

Comment 29

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/05fe748d2031

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla6

(no longer active)

Updated

•

14 years ago

Depends on: 653161

(no longer active)

Comment 30

•

14 years ago

I've backed this out on the suspicion of being the culprit for the 400% Tshutdown regression: http://hg.mozilla.org/mozilla-central/rev/da816b446ffa If this is the cause, we'll need a new owner for this bug, as I won't have enough time to diagnose the regression. If it's not the cause, I'll reland it.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

(no longer active)

Comment 31

•

14 years ago

The backout fixed the regression. Somebody more familiar with the code here should pick it up and investigate the regression.

Assignee: timeless → nobody

Myk Melez [:myk] [@mykmelez]

Comment 32

•

14 years ago

Brian & Honza: would one of you be able to pick this up?

Boris Zbarsky [:bzbarsky]

Comment 33

•

14 years ago

To be clear, this regressed Tp4_shutdown.

Honza Bambas (:mayhemer)

Comment 34

•

14 years ago

I will do that.

Assignee: nobody → honzab.moz

Status: REOPENED → ASSIGNED

Jesse Ruderman

Updated

•

13 years ago

Blocks: cmd-q

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 37

•

13 years ago

What's blocking this from moving forward? I'm hitting this about every other time I shut down a Nightly ...

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 38

•

13 years ago

I do not think the Tshutdown regression is worse than this hang; the hang is actually a worse-than-400% shutdown time regression and lots of people are running into this. The patch can and should be improved. I will help Honza with it.

(no longer active)

Comment 39

•

13 years ago

(In reply to comment #38) > I do not think the Tshutdown regression is worse than this hang I beg to differ. The Tshutdown regression might affect all of our users, but the hang definitely only affects some of them. That being said, I also definitely think that this could be improved, but I didn't know enough about the code in question to do that myself.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 40

•

13 years ago

We know this old bug is happening much more frequently in nightly. Do we know why? Is this occurring more frequently on FF5 beta too? If so, then we should try to address it for FF5. Did patches related to SYN retry and/or pipelining cause this to happen more frequently? I know SYN retry itself doesn't cause this, because I have seen it even when the SYN retry timeout is zero (which disables the actual retry). But, I wouldn't be surprised if the associated changes to support SYN retry had some effect in increasing the frequency of occurrence.

tracking-firefox5: --- → ?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 41

•

13 years ago

Comment on attachment 527557 [details] [diff] [review] Updated patch Review of attachment 527557 [details] [diff] [review]: ----------------------------------------------------------------- ::: security/manager/ssl/src/nsPSMBackgroundThread.cpp @@ +94,5 @@ > } > + while (GetExitStateThreadSafe() < ePSMThreadStopped) { > + NS_ProcessNextEvent(nsnull, PR_FALSE); > + if (GetExitStateThreadSafe() < ePSMThreadStopped) > + PR_Sleep(PR_SecondsToInterval(5)); Why not PR_Sleep(PR_INTERVAL_NO_WAIT) or a lower-than-5-seconds sleep interval here? Five seconds seems ridiculously long.

Honza Bambas (:mayhemer)

Comment 42

•

13 years ago

(In reply to comment #41) > Why not PR_Sleep(PR_INTERVAL_NO_WAIT) or a lower-than-5-seconds sleep > interval here? Five seconds seems ridiculously long. Because otherwise we would have 100% load on that thread w/o any context switches. It would lead to hang of UI thread as well. (In reply to comment #40) > Did patches related to SYN retry and/or pipelining cause this to happen more > frequently? I strongly believe there is no relation. I was experiencing this bug also pretty often year or longer ago. Probably some pattern makes this appear more often, therefor people experience it often when working on a bug and repeating the same behavior. Brian, if you have time to investigate the regression, then please do so. If not, I can get to this maybe at the end of this week.

Honza Bambas (:mayhemer)

Comment 43

•

13 years ago

We potentially could: - if there is an event queue (e.g. during profile change) then just process messages with canWait = true and do not PR_Sleep - if there is no event queue (usual case during shutdown) then we could use a monitor or cond var bound to the existing thread mutex notified when the state changes I would just need to figure out how to check whether there's still the event queue.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 44

•

13 years ago

(In reply to comment #42) > Because otherwise we would have 100% load on that thread w/o > any context switches. It would lead to hang of UI thread as > well. It would still yield. I can see why it might be better to sleep a little, but less than five seconds. I am going to investigate trying a smaller interval first. > I strongly believe there is no relation. I was experiencing this bug also > pretty often year or longer ago. Probably some pattern makes this appear > more often, therefor people experience it often when working on a bug and > repeating the same behavior. There are 4 people (including me) sitting within 10 meters of me that are regularly running into this for the last four weeks, so I think something has changed recently that makes this occur much more frequently. Maybe it's even IPV6 related, because I think IPV6 was turned on our office recently. > Brian, if you have time to investigate the regression, then please do so. > If not, I can get to this maybe at the end of this week. I am working on it now.

Assignee: honzab.moz → bsmith

Honza Bambas (:mayhemer)

Comment 45

•

13 years ago

> PR_SecondsToInterval Oh! Right, not 5 SECONDS, but rather MILLISECONDS... I completely overlooked that...

Patrick McManus [:mcmanus]

Comment 46

•

13 years ago

(In reply to comment #45) > > PR_SecondsToInterval > > Oh! Right, not 5 SECONDS, but rather MILLISECONDS... I completely > overlooked that... :) as a mostly unimportant aside, I would make it 15ms (still a massive drop from 5000) instead of 5ms as a lot of the older NT derivations had a 15ms tick by default.. if you specify less it sometimes just ends up busy waiting anyhow for you.

Asa Dotzler [:asa]

Comment 47

•

13 years ago

too late for us to do anything in 5.

tracking-firefox5: ? → -

Asa Dotzler [:asa]

Updated

•

13 years ago

tracking-firefox6: --- → ?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 48

•

13 years ago

Attached patch Wait for 25ms or until notified by background thread (obsolete) (deleted) — Details — Splinter Review

I made every change of mExitState cause a notification on mCond; requestExit will then wait until mExitState changes, while processing an event every 25ms.

Attachment #527557 - Attachment is obsolete: true

Attachment #538139 - Flags: review?(honzab.moz)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 49

•

13 years ago

I have not figured out yet why this is happening more often recently for many of us.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 50

•

13 years ago

(In reply to comment #49) > I have not figured out yet why this is happening more often recently for > many of us. Oh, you missed the news: http://www.nasa.gov/mission_pages/sunearth/news/News060711-blast.html

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 51

•

13 years ago

Attached patch Wait for 25ms or until notified by background thread, fixed (obsolete) (deleted) — Details — Splinter Review

The previous patch wasn't folded into the patch I based it on, so it was really an interdiff against the patch that got backed out. Here is the folded patch.

Attachment #538139 - Attachment is obsolete: true

Attachment #538307 - Flags: review?(honzab.moz)

Attachment #538139 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 52

•

13 years ago

Comment on attachment 538307 [details] [diff] [review] Wait for 25ms or until notified by background thread, fixed Review of attachment 538307 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the updated patch, Brian. But, I was wrong about one major thing. I was convinced that during shutdown there is no event queue anymore at the point of requestExit being called. But there is. We shut the main thread loop later in ShutdownXPCOM function. Based on that and fact that nsNSSComponent is thanks nsPSMInitPanic only instantiated on the main thread I propose: - remove the code that is waiting for the cond var with 25ms delays until the thread sets the state to 'stopped' - let the thread we want to stop set the state to 'stopped' and then post just an empty event to the main thread after it has finished - main thread waiting for shutdown will do: while (GetExitState(threadLock) != ePSMThreadStopped) { threadLock.unlock(); NS_ProcessPendingEvents(do_GetCurrentThread(), PR_IntervalToMilliseconds(100)); threadLock.lock(); } .. or what ever form of this code - the timeout should be there just for safety, but I will r+ patch even w/o it This gives us no delays and fluent dispatch of all events pending on the main thread. Time will prove if processing events from this place won't cause any trouble with reentering. Dropping the r flag since the patch will be adjusted anyway. General comments: ::: security/manager/ssl/src/nsCertVerificationThread.cpp @@ +132,5 @@ > > { > MutexAutoLock threadLock(verification_thread_singleton->mMutex); > + > + while (GetExitState(threadLock) == ePSMThreadRunning && Repeating this on and on is unreadable and tends to mistakes. Maybe change this to bool IsRunning() method? Also could have IsStopped() method for the check in the request exiting thread. ::: security/manager/ssl/src/nsNSSComponent.cpp @@ +432,5 @@ > + delete mCertVerificationThread; > + mCertVerificationThread = nsnull; > + } > +} > + Let's make these two functions members of nsNSSComponent, no need to separate them. Also have the two threads as arguments suggests these are one-purpose functions. Nit: prefixing arguments with 'm' is generally quit unusual. ::: security/manager/ssl/src/nsPSMBackgroundThread.cpp @@ +77,5 @@ > + return mExitState; > +} > + > +nsPSMBackgroundThread::ExitState > +nsPSMBackgroundThread::GetExitStateWithoutLock() const Maybe rather GetExitStateUnlocked? Kai: why exactly we need access to this w/o a lock? exitRequest seems not to be called under the thread's lock, or it is? http://mxr.mozilla.org/mozilla-central/search?string=exitRequested @@ +109,5 @@ > + if (GetExitState(threadLock) == ePSMThreadStopped) { > + break; > + } > + } > + NS_ProcessNextEvent(nsnull, PR_FALSE); NS_ProcessPendingEvents(do_GetCurrentThread()) is better on this place. It handles all events that might get posted, not just the first one. ::: security/manager/ssl/src/nsPSMBackgroundThread.h @@ +67,5 @@ > + ePSMThreadStopRequested = 1, > + ePSMThreadStopped = 2 > + }; > + > + ExitState GetExitState(::mozilla::MutexAutoLock const & proofOfLock) const; Little bit knotty this 'proofOfLock' but let's take it. @@ +84,5 @@ > nsresult startThread(); > void requestExit(); > + > +private: > + PRUint32 GetExitStateThreadSafe(); You don't need this anymore.

Attachment #538307 - Flags: review?(honzab.moz)

Johnathan Nightingale [:johnath]

Comment 53

•

13 years ago

Minusing this for tracking - would like to see an approval request for this to land on aurora once we have a safe, reviewed patch baking on central.

tracking-firefox6: ? → -

David :Bienvenu

Comment 54

•

13 years ago

I used to be able to easily reproduce this on Windows with Thunderbird on my desktop, but not so much on my windows laptop for some reason. But FTR, if I got IMAP autosync going on SSL connection and shutdown in the middle, I would generally hit this hang.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 55

•

13 years ago

Attached patch Have background threads send exit notification to requestExit in the main thread (obsolete) (deleted) — Details — Splinter Review

Here is the fix that we talked before. The background thread will post a message to the main thread, as the main thread spins the event loop in requestExit waiting for it.

Attachment #371365 - Attachment is obsolete: true

Attachment #439893 - Attachment is obsolete: true

Attachment #538307 - Attachment is obsolete: true

Attachment #541211 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 56

•

13 years ago

Comment on attachment 541211 [details] [diff] [review] Have background threads send exit notification to requestExit in the main thread Review of attachment 541211 [details] [diff] [review]: ----------------------------------------------------------------- Looks nice. r+ after following comments gets addressed, manly the last one: ::: security/manager/ssl/src/nsNSSComponent.cpp @@ +397,2 @@ > { > + if (mSSLThread) { Why not left in the following form? if () { } You are against the file style. @@ +409,5 @@ > + > +void > +nsNSSComponent::createBackgroundThreads() > +{ > + NS_ASSERTION(mSSLThread == nsnull, "SSL thread already created."); New line after this line please. Only one space between SSL and thread. @@ +419,5 @@ > + return; > + } > + > + NS_ASSERTION(mCertVerificationThread == nsnull, > + "Cert verification thread already created."); New line as well here. Maybe put this at the top of the method to get assertion/warning even we fail to create the ssl thread? ::: security/manager/ssl/src/nsPSMBackgroundThread.cpp @@ +73,5 @@ > > +PRBool > +nsPSMBackgroundThread::exitRequested(const MutexAutoLock & /*proofOfLock*/) const > +{ > + return mExitState != ePSMThreadRunning; Maybe let this method return exitRequestedNoLock() ? exitRequestedNoLock then could be explicitly inline. @@ +81,5 @@ > +nsPSMBackgroundThread::postStoppedEventToMainThread( > + MutexAutoLock const & /*proofOfLock*/) > +{ > + NS_ASSERTION(PR_GetCurrentThread() == mThreadHandle, > + "Background thread stopped from another thread"); New line after this line please. ::: security/manager/ssl/src/nsPSMBackgroundThread.h @@ +64,4 @@ > mozilla::CondVar mCond; > > + PRBool exitRequested(::mozilla::MutexAutoLock const & proofOfLock) const; > + PRBool exitRequestedNoLock() const { return mExitState; } return mExitState != ePSMThreadRunning; @@ +81,5 @@ > nsresult startThread(); > void requestExit(); > + > +private: > + PRUint32 GetExitStateThreadSafe(); What is this used for? ::: security/manager/ssl/src/nsSSLThread.cpp @@ +1143,3 @@ > { > + return !ssl_thread_singleton || > + ssl_thread_singleton->exitRequestedNoLock(); Why exactly are you changing this code, I mean the result value here? This code has been introduced in bug 420187. I understand the code looks wrong (we should return PR_TRUE for !ssl_thread_singleton). Not sure if it really is wrong or not. However, I would change the method's result in a different bug, not here, where we solve a quit different issue.

Attachment #541211 - Flags: review?(honzab.moz) → review+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 57

•

13 years ago

Attached patch v2 Have background threads send exit notification to requestExit in the main thread (deleted) — Details — Splinter Review

Attachment #541211 - Attachment is obsolete: true

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 58

•

13 years ago

Comment on attachment 541736 [details] [diff] [review] v2 Have background threads send exit notification to requestExit in the main thread Carrying forward r+. Review comments addressed. Bug 666999 was filed for the bug in nsSSLThread::stoppedOrStopping.

Attachment #541736 - Flags: review+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

13 years ago

Whiteboard: [inbound]

Marco Bonardo [:mak]

Comment 59

•

13 years ago

http://hg.mozilla.org/mozilla-central/rev/1d2879d39b2a

Status: ASSIGNED → RESOLVED

Closed: 14 years ago → 13 years ago

Resolution: --- → FIXED

Target Milestone: mozilla6 → mozilla7

Josh Matthews [:jdm]

Updated

•

13 years ago

Depends on: 669050

Nobody; OK to take it and work on it

Updated

•

8 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.