Closed Bug 1436247
Opened 7 years ago · Closed 7 years ago

Image decoding threadpool size seems quite large

Categories: Core :: Graphics: ImageLib, enhancement, P3
Status: RESOLVED FIXED
Target Milestone: mozilla60
Tracking: firefox60 fixed
People: (Reporter: bzbarsky, Assigned: aosmond)
References: (Depends on 1 open bug, Blocks 1 open bug)
Whiteboard: [MemShrink][gfx-noted]
Attachments (4 files)
(deleted) patch — tnikkel: review+
(deleted) patch — tnikkel: review+
(deleted) patch — tnikkel: review+
(deleted) patch — tnikkel: review+
It looks like our image decoding threadpool size is set to the number of logical CPUs minus one, _and_ it preallocates all the threads. So in my case, on a 4-core machine, I have 7 threads here. On an 8-core machine I have 15 threads.
Do we ever actually end up using 15 threads effectively here? Especially in a multiprocess world, with various other things contending for CPU timeslices across all the processes?
I don't think we have any other threadpools with this sort of behavior.
This came up in bug 1436179 (https://bugzilla.mozilla.org/show_bug.cgi?id=1436179) because the memory overhead of those 15 threads ends up being about 600KB just for the profiler metadata.
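For reference, the sizing rule described above can be sketched as follows. `DecodePoolSize` is a hypothetical helper written for illustration, not the actual ImageLib code; a 4-core machine with hyperthreading reports 8 logical CPUs, which is why it ends up with 7 threads.

```cpp
#include <cstddef>

// Illustrative sketch of the sizing rule described above: one decoder
// thread per logical CPU, minus one, with a floor of one thread.
unsigned DecodePoolSize(unsigned logicalCpus) {
  return logicalCpus <= 1 ? 1u : logicalCpus - 1;
}
```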
Reporter
Updated•7 years ago
Blocks: memshrink-content
Updated•7 years ago
Whiteboard: [MemShrink]
Comment 1•7 years ago
Seems like at a minimum we should let the threadpool deallocate threads down to some reasonable number of idle threads.
Comment 2•7 years ago
I imagine the typical load on the decoder threads is highly bursty; during pageloads it will spike to use (all?) available threads on many pages. IIRC we don't decode all images for a page (especially if large) until they're made visible (or perhaps about to be), so scrolling and the like can cause other bursts. I think a max of all cores isn't *unreasonable*, though you can argue optimality. Using all cores might limit desktop (or laptop) frequency maximums, though I don't see that as a major concern.
Given the burstiness, it might make sense to allow the pool to keep some threads around (argue as to how many), and keep others temporarily (N seconds?). However, anything that drops the threads will cause us to spend cycles (and time) recreating them repeatedly, hurting our perf and pageload latency (a little) and our power use (a very little). These hits might be minimal, especially if we keep (say) 4 threads alive, but this requires checking.
Assignee
Comment 3•7 years ago
I do have some patches that rework our threading a bit. One part always spawns the one decoder thread (and release-asserts its creation) and spawns the remaining threads on demand (if there are pending items in the queue and we have capacity in our pool, attempt to spawn another). It never releases threads from the pool, however. My own local testing suggests it will save us from spawning more than 1 thread in the parent process (since it does limited decoding) and that half to two-thirds of the threads get spawned for content processes on the new tab page. I need to do some testing on an optimized build to see if that changes things.
Assignee
Updated•7 years ago
Assignee: nobody → aosmond
Comment 4•7 years ago
Reducing it in the Master process would be a big win (doubly so until we can get the Profiler to stop allocating 32K+ per thread, just in case).
You might want to throw in a pref for how many threads to keep alive, and there definitely should be *some* hysteresis to dropping threads to avoid churn. I'd also suggest playing with these parameters and timing something like page-down or End on image-heavy pages.
Things like speedometer (and probably talos page-load tests) won't show any issues here, or won't show them much, so other ideas are welcome.
Assignee
Comment 5•7 years ago
This patch makes us spawn a new thread when we have more tasks than idle threads. It will always spawn one at the beginning.
Attachment #8950325 -
Flags: review?(tnikkel)
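A minimal sketch of the spawn-on-demand rule the patch describes (the function and parameter names are mine, not the patch's): after enqueueing a task, spawn another worker only if pending tasks outnumber idle threads and the pool is below its maximum size.

```cpp
#include <cstddef>

// Illustrative sketch of the spawn-on-demand check described above.
// Spawn another worker only when pending tasks outnumber idle threads
// and the pool has not yet reached its configured maximum size.
bool ShouldSpawnWorker(std::size_t pendingTasks, std::size_t idleThreads,
                       std::size_t totalThreads, std::size_t maxThreads) {
  return pendingTasks > idleThreads && totalThreads < maxThreads;
}
```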
Assignee
Comment 6•7 years ago
This patch will shut down threads after a configurable timeout. I set the default to 10 minutes, chosen for no particular reason other than being "long." It will keep at least half of the maximum number of threads, rounded up -- only the excess threads can be shut down for being idle.
Attachment #8950327 -
Flags: review?(tnikkel)
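The retention rule from the patch description (keep at least half the maximum, rounded up; only excess threads may exit after idling past the timeout) can be sketched like this; the helper names are illustrative, not the patch's:

```cpp
#include <cstddef>

// Illustrative sketch of the idle-shutdown rule described above.
// Keep at least half the maximum thread count, rounded up; only
// threads beyond that floor may exit after idling past the timeout.
std::size_t MinimumThreads(std::size_t maxThreads) {
  return (maxThreads + 1) / 2;  // half, rounded up
}

bool MayShutdownIdleThread(std::size_t currentThreads, std::size_t maxThreads,
                           unsigned idleMs, unsigned timeoutMs) {
  return currentThreads > MinimumThreads(maxThreads) && idleMs >= timeoutMs;
}
```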
Assignee
Comment 7•7 years ago
Not mandatory; the FILO task ordering always seemed strange to me compared to FIFO.
Attachment #8950328 -
Flags: review?(tnikkel)
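To illustrate the ordering difference (illustrative code, not the patch itself): under FILO (i.e. LIFO), the most recently submitted task is decoded first; under FIFO, the oldest task runs first. Both disciplines can share one deque; only the pop side differs.

```cpp
#include <deque>

// Illustrative: FILO (LIFO) serves the newest task, FIFO the oldest.
int NextTaskFilo(std::deque<int>& queue) {
  int task = queue.back();  // newest submission
  queue.pop_back();
  return task;
}

int NextTaskFifo(std::deque<int>& queue) {
  int task = queue.front();  // oldest submission
  queue.pop_front();
  return task;
}
```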
Assignee
Comment 8•7 years ago
I noticed when rebuilding that some headers included DecodePool.h when they did not have to.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=53c534ab9f1c7a1597a74669dc5c4a0b541a65b8
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=28e8279d6bd360ef96e17fb954bb85caf956fd90
Attachment #8950329 -
Flags: review?(tnikkel)
Assignee
Comment 9•7 years ago
There is a related crash (see signature) where we cannot create all the desired threads on startup. These patches will most likely fix the crash (unless it is the first thread we fail to spawn), albeit incidentally. If we get hammered with requests, it will try to spawn a new thread and fail later -- and it will keep retrying to spawn that thread as long as the queue depth is greater than the number of idle threads. There will clearly be a runtime cost to this, but maybe I'm worrying over nothing, as a system that cannot spawn a thread will probably crash due to OOM or something? :)
Crash Signature: mozilla::image::DecodePool::DecodePool
Priority: -- → P3
Whiteboard: [MemShrink] → [MemShrink][gfx-noted]
Assignee
Updated•7 years ago
Crash Signature: mozilla::image::DecodePool::DecodePool → [@ mozilla::image::DecodePool::DecodePool ]
Comment 10•7 years ago
Comment on attachment 8950325 [details] [diff] [review]
0001-Bug-1436247-Part-1.-Spawn-image-decoder-threads-on-d.patch, v1
Review of attachment 8950325 [details] [diff] [review]:
-----------------------------------------------------------------
::: image/DecodePool.cpp
@@ +63,4 @@
> , mShuttingDown(false)
> + {
> + MonitorAutoLock lock(mMonitor);
> + bool rv = CreateThread();
Usually rv is for nsresult I thought? So just |success| would be okay here.
@@ +110,5 @@
> // Drop any new work on the floor if we're shutting down.
> return;
> }
>
> + // If there are pending tasks, create more workers if and only if we have
It seems like this block would make more sense after we append to the queue? Or am I missing something?
Attachment #8950325 -
Flags: review?(tnikkel) → review+
Updated•7 years ago
Attachment #8950329 -
Flags: review?(tnikkel) → review+
Updated•7 years ago
Attachment #8950328 -
Flags: review?(tnikkel) → review+
Updated•7 years ago
Attachment #8950327 -
Flags: review?(tnikkel) → review+
Assignee
Comment 11•7 years ago
(In reply to Timothy Nikkel (:tnikkel) from comment #10)
> Comment on attachment 8950325 [details] [diff] [review]
> 0001-Bug-1436247-Part-1.-Spawn-image-decoder-threads-on-d.patch, v1
>
> Review of attachment 8950325 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: image/DecodePool.cpp
> @@ +63,4 @@
> > , mShuttingDown(false)
> > + {
> > + MonitorAutoLock lock(mMonitor);
> > + bool rv = CreateThread();
>
> Usually rv is for nsresult I thought? So just |success| would be okay here.
>
Yes, my bad. Will do.
> @@ +110,5 @@
> > // Drop any new work on the floor if we're shutting down.
> > return;
> > }
> >
> > + // If there are pending tasks, create more workers if and only if we have
>
> It seems like this block would make more sense after we append to the queue?
> Or am I missing something?
You are correct. What you missed was that the check to decide whether to spawn a thread changed :). I will move it.
Comment 12•7 years ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c32ead4e3525
Part 1. Spawn image decoder threads on demand, rather than at startup. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/3650631487c7
Part 2. Shutdown idle image decoder threads after the configured timeout. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ce2bfe462a0
Part 3. Process image decoding tasks in a more predictable order. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/858d629f761d
Part 4. Fix image/DecodePool.h inclusions. r=tnikkel
Comment 13•7 years ago
Backed out 4 changesets (bug 1436247) for c2 failures in dom/events/test/test_DataTransferItemList.html
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=858d629f761ddb77258cbb3e5c557df1ead5b2df&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=runnable&filter-resultStatus=pending&selectedJob=161992788
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=161992788&repo=mozilla-inbound&lineNumber=2733
Backout: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=f1e39e547b29ee6eb323ff5fe18c5084791c8da1&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=runnable&filter-resultStatus=pending
Flags: needinfo?(aosmond)
Assignee
Comment 14•7 years ago
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=43f4fd3a4b65c5ee81b1a41fe518472f25b35055
This suggests part 3 is the problematic piece. It is interesting because I guess if you are a test, and you only have one image, it *is* to your advantage that the most recently requested image is decoded first...
I'm going to try landing parts 1, 2 and 4, and consider landing part 3 in the future once I get other machinery in place / tests fixed.
Flags: needinfo?(aosmond)
Comment 15•7 years ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7c913713301f
Part 1. Spawn image decoder threads on demand, rather than at startup. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/e65b7fee0060
Part 2. Shutdown idle image decoder threads after the configured timeout. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/38557fb9f111
Part 3. Fix image/DecodePool.h inclusions. r=tnikkel
Comment 16•7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7c913713301f
https://hg.mozilla.org/mozilla-central/rev/e65b7fee0060
https://hg.mozilla.org/mozilla-central/rev/38557fb9f111
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox60: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Assignee
Comment 17•7 years ago
I'm a little worried: I'm only seeing one image decoder thread in my content processes on the latest nightly. That is not at all what I saw during my local dev testing. I need to investigate further.
Flags: needinfo?(aosmond)
Assignee
Comment 18•7 years ago
My local builds (debug and optimized) have the expected behaviour. Maybe it is related to PGO.
Assignee
Comment 19•7 years ago
User error. I had changed image.multithreaded_decoding.limit to 1 for testing in my nightly profile at some point, and completely forgot that I did so.
Flags: needinfo?(aosmond)