Closed
Bug 602080
Opened 14 years ago
Closed 8 years ago
Perf regression from bug 130078 in CPU use during video playback
Categories
(Core Graveyard :: Tracking, defect)
Tracking
(blocking2.0 -)
RESOLVED
INCOMPLETE
mozilla2.0b7
Tracking | Status | |
---|---|---|
blocking2.0 | --- | - |
People
(Reporter: kinetik, Assigned: romaxa)
References
Details
While investigating bug 601542, I discovered further possible perf regressions in video playback that need to be investigated. I'm testing on x86_64 Linux and building with optimizations (-g -O2 -fno-omit-frame-pointer).
Testing against a locally downloaded copy of http://media.tinyvid.tv/3d198wqepg78m.ogg and using /usr/bin/time to measure CPU. The browser is run with the video on the command line, and closed with Ctrl-C once it fades to black:
4613e0369937 (immediately before Ogg decoder rewrite/removal of liboggplay):
21.38user 5.00system 3:36.90elapsed 12%CPU (0avgtext+0avgdata
303504maxresident)k
16inputs+824outputs (1major+405814minor)pagefaults 0swaps
1d00691be5f2 (Ogg decoder rewrite landed):
29.67user 6.00system 3:34.40elapsed 16%CPU (0avgtext+0avgdata
296608maxresident)k
0inputs+14040outputs (0major+32544minor)pagefaults 0swaps
* User CPU increased 38%.
e560e0fe90d1 (recent trunk):
35.71user 5.47system 3:34.09elapsed 19%CPU (0avgtext+0avgdata
320256maxresident)k
0inputs+408outputs (0major+22370minor)pagefaults 0swaps
* User CPU increased 20% from 1d00691be5f2, 67% from 4613e0369937.
Bug 601542 is tracking a regression that seems to be caused by writing smaller audio blocks more frequently. We also seem to have lost another ~5% between the old backend and the new backend (see bug 601542 comment 6).
Possible causes for the rest of the regression (completely untested guesses, also not exhaustive):
* Audio Data API (bug 490705)
Adds extra work to decoder even when there are no listeners are attached.
* Replacement Y'CbCr code (bug 551277)
Possibly slower on x86_64 and/or machines with modern SSE cores.
* Decoder refactoring (bug 560708/bug 563825)
* Layers changes (bug ?)
Reporter | ||
Comment 1•14 years ago
|
||
0aefd9d484f1 (before Audio Data API landed):
30.86user 5.97system 3:38.43elapsed 16%CPU (0avgtext+0avgdata 359120maxresident)k
67520inputs+8608outputs (60major+34701minor)pagefaults 0swaps
081a707a76b8 (Audio Data API landed):
30.39user 5.94system 3:34.86elapsed 16%CPU (0avgtext+0avgdata 605008maxresident)k
37480inputs+35176outputs (44major+47022minor)pagefaults 0swaps
This seems to eliminate bug 490705, and also suggests that ~5s have been lost sometime between 081a707a76b8 and now.
Reporter | ||
Comment 2•14 years ago
|
||
Someone suggested disabling controls (by commenting out the binding in html.css), which doesn't gain our ~5s back, but does reveal the controls cost us around ~4-5% user CPU overhead:
081a707a76b8 (Audio Data API landed), no controls:
29.14user 5.54system 3:35.66elapsed 16%CPU (0avgtext+0avgdata 644400maxresident)k
12200inputs+29104outputs (1major+46763minor)pagefaults 0swaps
e560e0fe90d1 (recent trunk), no controls:
34.03user 7.08system 3:40.14elapsed 18%CPU (0avgtext+0avgdata 303664maxresident)k
28960inputs+6824outputs (179major+20910minor)pagefaults 0swaps
Reporter | ||
Comment 3•14 years ago
|
||
ac526bc2af65 (before gfx/ycbcr landed):
33.48user 5.56system 3:34.39elapsed 18%CPU (0avgtext+0avgdata 266896maxresident)k
4064inputs+8outputs (0major+52678minor)pagefaults 0swaps
ae968e2a4792 (after that and bug 551277 landed, contains quite a few other changesets):
33.57user 6.68system 3:34.21elapsed 18%CPU (0avgtext+0avgdata 269184maxresident)k
184inputs+8outputs (2major+43070minor)pagefaults 0swaps
So, no change there. Also, user CPU has dropped from ~33s to ~30s when the Audio Data API landed, then gone back up to ~35s? I'm confused. The measurements I'm making seem to be repeatable within ~0.5s, so I don't think this is measurement error.
Reporter | ||
Comment 4•14 years ago
|
||
Ugh, last set of measurements (comment 3 only) are bogus because I was relying on the new default objdir behaviour. Rerun:
ac526bc2af65 (before gfx/ycbcr landed):
29.28user 5.01system 3:39.30elapsed 15%CPU (0avgtext+0avgdata 334688maxresident)k
38056inputs+39664outputs (0major+35131minor)pagefaults 0swaps
ae968e2a4792 (after that and bug 551277 landed, contains quite a few other
changesets):
30.42user 5.09system 3:57.63elapsed 14%CPU (0avgtext+0avgdata 297888maxresident)k
3560inputs+12656outputs (1major+22330minor)pagefaults 0swaps
~3.8% difference, possibly within the noise.
Reporter | ||
Comment 5•14 years ago
|
||
So there are two major changes and one smaller one:
* ~21s to ~29s - decoder rewrite, mostly audio stuff (bug 601542)
* ~29s to ~30s - YCbCr changes (bug 551277)
* ~30s to ~35s - tracemonkey merge (http://hg.mozilla.org/mozilla-central/rev/f290d1fe9544)
Reporter | ||
Comment 6•14 years ago
|
||
Bisected the ~30s to ~35s change within the TM merge and found it was this changeset: http://hg.mozilla.org/mozilla-central/rev/ffacf623a9df
previous changeset:
30.63user 6.73system 3:33.80elapsed 17%CPU (0avgtext+0avgdata 616192maxresident)k
8320inputs+13768outputs (0major+45025minor)pagefaults 0swaps
ffacf623a9df:
36.71user 5.35system 3:37.48elapsed 19%CPU (0avgtext+0avgdata 606416maxresident)k
0inputs+4704outputs (0major+40762minor)pagefaults 0swaps
20% user CPU hit from that change.
Reporter | ||
Updated•14 years ago
|
Summary: Investigate possible perf regression in video decoder → Multiple perf regressions in CPU use during video playback (67% higher CPU use)
Reporter | ||
Comment 7•14 years ago
|
||
Nominating for blocking so that this is on the table for someone (me?) to continue tracking down the 30 to 35s regression.
blocking2.0: --- → ?
blocking2.0: ? → final+
We should probably split these out into separate bugs. We've already got bug 601542 so we probably just need one more bug for the regression associated with bug 130078.
Reporter | ||
Comment 9•14 years ago
|
||
romaxa's patch in bug 615870 solves (and even improves) the painting regression:
31.15user 4.27system 3:40.86elapsed 16%CPU (0avgtext+0avgdata 413120maxresident)k
8inputs+10432outputs (1major+29800minor)pagefaults 0swaps
23.53user 4.33system 3:41.61elapsed 12%CPU (0avgtext+0avgdata 411488maxresident)k
0inputs+12752outputs (1major+29882minor)pagefaults 0swaps
Roughly 25% less CPU use. I'm going to mark this bug as depending on 615870. The other large regression is covered by 601542, and the YCbCr stuff has changed again with an updated code landing, so it has probably improved.
Updated•14 years ago
|
Assignee: nobody → romaxa
Depends on: 621601
Updated•14 years ago
|
Whiteboard: [softblocker]
Comment 10•14 years ago
|
||
So bug 615870 gets back the performance that we lost and then some, but do we understand why bug 130078 caused a perf regression in the first place? Maybe we are leaving perf on the table, or maybe 130078 is causing perf problems in other places that bug 615870 don't fix.
I assume that it's the cost of extra display list construction, but I can't prove it.
One interesting thing would be to compare video performance on a popup window with no chrome ... see if that regressed when 130078 landed.
Depends on: 626962
Taking off the blocker list since we fixed the regression, but leaving open for investigation.
blocking2.0: final+ → -
Whiteboard: [softblocker]
Comment 13•8 years ago
|
||
Marking all tracking bugs which haven't been updated since 2014 as INCOMPLETE.
If this bug is still relevant, please reopen it and move it into a bugzilla component related to the work
being tracked. The Core: Tracking component will no longer be used.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
Updated•8 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•