Open
Bug 928222
Opened 11 years ago
Updated 2 years ago
remove nsThreadPool per-thread event queues
Categories
(Core :: XPCOM, defect)
Tracking
()
REOPENED
mozilla27
People
(Reporter: karlt, Assigned: karlt)
References
(Blocks 1 open bug)
Details
Attachments
(5 files, 1 obsolete file)
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
benjamin
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
nsThreadPool doesn't need to use nsIThread for its threads. The nsIThread
interface is not useful for threads running in an nsThreadPool.
The nsIEventTarget on the nsIThreadPool should be used for dispatching events,
not an interface on the individual threads, and the threads don't need an
nsEventQueue because they use the nsEventQueue on the nsThreadPool.
Shutdown of single event threads is easier than running nested event loops for
nsIThreads, avoiding the multilevel nested event loop situations when several
threads finish and are shutdown.
Assignee | ||
Comment 1•11 years ago
|
||
While the ThreadFunc is running, a nested event loop is still used in Shutdown() in case some consumers might need it and because that is the documented API.
This also simplifies thread creation, avoiding races that could mean there was
temporarily one or more extra threads.
Attachment #818830 -
Flags: review?(benjamin)
Comment 2•11 years ago
|
||
Comment on attachment 818830 [details] [diff] [review]
part 2: remove nsThreadPool per-thread event queues
I'm going to have to delay reviewing this until Monday, because its multi-threaded code and I can't seem to get my head in the game today :-(
Assignee | ||
Comment 3•11 years ago
|
||
With changes in patch order from bug 926522, this needs a little merging, if anyone wants to apply. The merged version, applying on top of parts 1 and 2 in bug 926522, is at https://hg.mozilla.org/try/rev/2efd29e87300
On the B2G emulator build, nullptr is '0' I assume, which means
mThreads.IndexOf(nullptr) gives
nsTArray.h:544: error: ISO C++ forbids comparison between pointer and integer
and mThreads.AppendElement(nullptr) gives
nsTArray.h:530: error: invalid conversion from 'int' to 'PRThread*'
I guess replacing nullptr with (nsThreadPool*)0 might be the
simplest option.
https://tbpl.mozilla.org/?tree=Try&rev=3be0734b8a83
Comment 4•11 years ago
|
||
Comment on attachment 818830 [details] [diff] [review]
part 2: remove nsThreadPool per-thread event queues
Things to worry about here:
* Does existing code that uses threadpools expect to be able to use NS_GetCurrentThread and then spin an nested event loop waiting on the main thread? That is probably pretty broken anyway, so if this is green on try we're probably ok.
I supposed the nested event loop in nsThreadPool::Shutdown is still necessary because the threadpool tasks may block on the main thread, and so we need to avoid joining the worker thread while there are still tasks running? That kinds sucks, I was hoping to avoid that nested event loop altogether.
Attachment #818830 -
Flags: review?(benjamin) → review+
Assignee | ||
Comment 5•11 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4)
> Things to worry about here:
>
> * Does existing code that uses threadpools expect to be able to use
> NS_GetCurrentThread and then spin an nested event loop waiting on the main
> thread? That is probably pretty broken anyway, so if this is green on try
> we're probably ok.
NS_GetCurrentThread should be fine because getting nsIThreadManager::currentThread will wrap the current thread in an nsIThread if it doesn't have one. Try is OK with it.
> I supposed the nested event loop in nsThreadPool::Shutdown is still
> necessary because the threadpool tasks may block on the main thread, and so
> we need to avoid joining the worker thread while there are still tasks
> running? That kinds sucks, I was hoping to avoid that nested event loop
> altogether.
Yes, that was my reasoning, and yes it does, but that is the model we use when emptying the nsEventQueues on nsIThreads.
Let me know if you'd like me to add signalling instead and do a try run, but I'm concerned about changing the interface.
Even if the events on the pool threads don't wait for the main thread, existing shutdown() callers may not want to main thread to block until the events have all run.
Anyway, I now see that my assumption about mThreads.Length() being atomic enough is delicate because the length is in mHdr [1]. It works out because nsTArray::RemoveElement() doesn't reallocate to change mHdr, but this is too dependent on the implementation. I'll change to use the mutex, and get review on those changes.
[1] http://hg.mozilla.org/mozilla-central/annotate/177bf37a49f5/xpcom/glue/nsTArray.h#l370
Assignee | ||
Comment 6•11 years ago
|
||
Assignee | ||
Updated•11 years ago
|
Attachment #818830 -
Attachment description: patch → part 2: remove nsThreadPool per-thread event queues
Assignee | ||
Updated•11 years ago
|
Attachment #820789 -
Flags: review?(benjamin)
Assignee | ||
Comment 7•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=5b070158f17c
Some runs for other platforms without the b2g compile fixes:
https://tbpl.mozilla.org/?tree=Try&rev=71444fbb4509
https://tbpl.mozilla.org/?tree=Try&rev=152e226849c6
Attachment #820790 -
Flags: review?(benjamin)
Updated•11 years ago
|
Attachment #820789 -
Flags: review?(benjamin) → review+
Comment 8•11 years ago
|
||
Comment on attachment 820790 [details] [diff] [review]
part 2 fold-in: access mThreads.Length() with the mutex held and cast nullptr for template selector
per IRC, let's use IndexOf<PRThread*> instead of (PRThread*)0
Attachment #820790 -
Flags: review?(benjamin) → review+
Assignee | ||
Updated•11 years ago
|
Assignee | ||
Comment 9•11 years ago
|
||
Comment 10•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/95ec386c179c
https://hg.mozilla.org/mozilla-central/rev/242bb2279283
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Assignee | ||
Comment 11•11 years ago
|
||
Bug 930495, bug 930479, bug 930428 are reporting crashes on MacOS 10.6 and 10.7 in
libSystem.B.dylib + 0x39d6d
PR_CreateThread
XUL!nsThreadPool::PutEvent(nsIRunnable*) [nsThreadPool.cpp:31e679e3e4d7 : 92 + 0x25]
XUL!nsThreadPool::Dispatch(nsIRunnable*, unsigned int) [nsThreadPool.cpp:31e679e3e4d7 : 259 + 0xa]
XUL!nsAStreamCopier::Start(nsIInputStream*, nsIOutputStream*, nsIEventTarget*, void (*)(void*, tag_nsresult), void*, unsigned int, bool, bool, void (*)(void*, unsigned int)) [nsStreamUtils.cpp:31e679e3e4d7 : 437 + 0xb]
XUL!NS_AsyncCopy(nsIInputStream*, nsIOutputStream*, nsIEventTarget*, nsAsyncCopyMode, unsigned int, void (*)(void*, tag_nsresult), void*, bool, bool, nsISupports**, void (*)(void*, unsigned int)) [nsStreamUtils.cpp:31e679e3e4d7 : 586 + 0x37]
XUL!nsInputStreamTransport::OpenInputStream(unsigned int, unsigned int, unsigned int, nsIInputStream**) [nsStreamTransportService.cpp:31e679e3e4d7 : 109 + 0x40]
XUL!mozilla::net::nsHttpChannel::OpenCacheInputStream(nsICacheEntry*, bool) [nsHttpChannel.cpp:31e679e3e4d7 : 3434 + 0xe]
so backed out part 2:
http://hg.mozilla.org/mozilla-central/rev/5a9ac6fed6ff
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 15•11 years ago
|
||
A new thread could sometimes exit before receiving an event if the idle count rose (and theoretically also if the idle time expired).
This patch essentially reverts the behavior on thread creation failure to what it was before attachment 818830 [details] [diff] [review]: the events are left in the queue.
We could empty the queue if there are no threads, but I'm trying to limit how much is changed in this bug. The return code of nsThreadPool::PutEvent() is currently not returned from Dispatch().
This happens to work around the crashes, but I'll add another patch to avoid them for certain.
Attachment #824402 -
Flags: review?(benjamin)
Assignee | ||
Comment 16•11 years ago
|
||
PR_CreateThread writes to objects deleted in PR_JoinThread after the new thread is created:
http://hg.mozilla.org/mozilla-central/annotate/19fd3388c372/nsprpub/pr/src/pthreads/ptthread.c#l500
This lead to crashes when PR_JoinThread was called on a different thread to PR_CreateThread. Bug 932029 made this a little harder to diagnose.
With PR_UNJOINABLE_THREAD, NSPR ensures the objects are not deleted before PR_CreateThread has finished writing to them.
Updated documentation at https://developer.mozilla.org/en-US/docs/PR_JoinThread$compare?to=484741&from=203306
This patch also includes changes to avoid a potential hang in Shutdown() if called on a non-main thread.
Attachment #824412 -
Flags: review?(benjamin)
Assignee | ||
Comment 17•11 years ago
|
||
Sorry about all the revisions.
Perhaps it is easier to look at the complete new version than the differences.
Attachment #818830 -
Attachment is obsolete: true
Updated•11 years ago
|
Attachment #824402 -
Flags: review?(benjamin) → review+
Comment 18•11 years ago
|
||
Comment on attachment 824412 [details] [diff] [review]
part 2 fold-in 3: use a detached thread to avoid NSPR create/join races and wake the correct thread during shutdown
So this basically works around an NSPR bug which only occurs if PR_CreateThread and PR_JoinThread are called on different threads but near the same time?
What if, instead of making all of these threads joinable, we made all of the threadpool threads unjoinable, and just used flags/monitors to verify that they weren't doing anything? It seems that would solve the same problem without the weirdness here of another thread whose only job is to shut down other threads.
Updated•11 years ago
|
Flags: needinfo?(karlt)
Assignee | ||
Comment 19•11 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #18)
> So this basically works around an NSPR bug which only occurs if
> PR_CreateThread and PR_JoinThread are called on different threads but near
> the same time?
Yes, that's correct.
It also dispatches the ShutdownCompleteNotification to the thread that is waiting for an event in Shutdown(). The previous version could get stuck wait for an event on the Shutdown() thread, when there would be no events on that thread.
> What if, instead of making all of these threads joinable, we made all of the
> threadpool threads unjoinable, and just used flags/monitors to verify that
> they weren't doing anything? It seems that would solve the same problem
> without the weirdness here of another thread whose only job is to shut down
> other threads.
I'm sure there are other options here, and I'm happy to look into them. One awkward thing with unjoinable threads is that we don't know when they have really finished.
However, I don't understand what you are proposing here. This patch *is* making all threadpool threads unjoinable. It is not adding another thread, but merely recording which thread called and is still waiting in Shutdown().
The tricky part is that Shutdown() should continue processing events on its thread and so can't simply wait for a monitor notify from the threadpool threads.
I'm happy to find a different solution that keeps joinable threads, if you prefer.
Flags: needinfo?(karlt)
Comment 20•11 years ago
|
||
Comment on attachment 824412 [details] [diff] [review]
part 2 fold-in 3: use a detached thread to avoid NSPR create/join races and wake the correct thread during shutdown
I apologize for the delay. Every time I've read this I've gotten confused with the multithreaded code here.
I see why you're keeping mShutdownThread now. It's not clear to me why you're using mShutdownThread as the signal to stop spinning the nested event loop. Couldn't we just break out of the nested event loop if (mThreads.IsEmpty()) ? Then ShutdownCompleteNotification::Run could be an empty method, and I wouldn't have to worry a lot about race conditions on the null-ness of mShutdown thread: I suspect that the current loop could exit early and leave an orphaned ShutdownCompleteNotification.
Style issues: I don't think we like nested class declarations: they are hard or impossible to set breakpoints on in many debuggers; please move it outside the method.
There's a couple if( that need to be un-snuggled.
Attachment #824412 -
Flags: review?(benjamin) → review-
Assignee | ||
Comment 21•11 years ago
|
||
Once the work in bug 956899 moves to mfbt, there will be no need to work around nspr bugs.
Depends on: 956899
Assignee | ||
Comment 23•8 years ago
|
||
I think this is still worth fixing.
https://bugzilla.mozilla.org/show_bug.cgi?id=956899#c143 implies that moving js/src/threading APIs to mfbt may be difficult, but perhaps using those APIs may still be an option even though they are in js/src.
I'm not likely to finish this off soon myself, though, so would be quite happy if someone would like to take it.
Flags: needinfo?(karlt)
Updated•2 years ago
|
Severity: normal → S3
Assignee | ||
Updated•2 years ago
|
Flags: needinfo?(karlt)
You need to log in
before you can comment on or make changes to this bug.
Description
•