Closed
Bug 601542
Opened 14 years ago
Closed 8 years ago
High CPU use playing video with sound on some platforms (old Ubuntu specific?)
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
RESOLVED
INCOMPLETE
mozilla2.0
People
(Reporter: alice0775, Unassigned)
References
()
Details
(Keywords: regression)
Attachments
(1 file, 5 obsolete files)
(deleted),
application/octet-stream
|
Details |
Build Identifier: Mozilla/5.0 (X11; Linux i686; rv:2.0b7pre) Gecko/20101002 Firefox/4.0b7pre ID:20101002031237
After landing Bug 531340,
High CPU usage on OGG Video Playback in Lunux.
Reproducible: Always
Steps to Reproduce:
1. Start Minefield with new profile
2. Open URL ( http://www.dailymotion.com/html5 )
3. Watch System Monitor
Actual Results:
High CPU usage than before landing the Bug 531340.
CPU usage is about 2 or 3 times.
Regression range:
CPU less than 20%:
http://hg.mozilla.org/mozilla-central/rev/6a398023b956
Mozilla/5.0 (X11; Linux i686; rv:2.0b7pre) Gecko/20101002 Firefox/3.7a4pre
ID:20100401034504
CPU more then 70%:
http://hg.mozilla.org/mozilla-central/rev/c8f0660250e0
Mozilla/5.0 (X11; Linux i686; rv:2.0b7pre) Gecko/20101002 Firefox/3.7a4pre
ID:20100402030334
And I revert to cgangeset 4613e0369937 in local build then CPU usage less than
20%.
So, The following changeset causes the hight CPU usage in Linux build.
1d00691be5f2 Chris Pearce — Bug 531340 - New Ogg video decoder. r=doublec
sr=roc
Comment 1•14 years ago
|
||
Can you please include your computer's specs in the bug? Particularly, CPU type and speed, video card (and driver, if known) and what Linux distro you're running.
Comment 2•14 years ago
|
||
Also, can you please test against http://media.tinyvid.tv/3d198wqepg78m.ogg or some other single-video simplified case and report if you still see the problem?
The DailyMotion link is slightly complicated because it plays a random video, and it also will use canvas to paint the video if you click any of the effects buttons. It's possible this problem is related to DailyMotion specifically, so testing the link above will help narrow this down.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 3•14 years ago
|
||
Looks like I can reproduce this...
On a 1.7GHz Core i7 with CPU scaling disabled, testing with that video (and without the controls showing), I see around 11% CPU in Firefox and 7% in Xorg with 3.6.9 beta. With a 4.0 nightly, I see around 20% CPU in Firefox and 7% in Xorg.
Reporter | ||
Comment 4•14 years ago
|
||
(In reply to comment #1)
> Can you please include your computer's specs in the bug? Particularly, CPU
> type and speed, video card (and driver, if known) and what Linux distro you're
> running.
Core2Quad Q8300 @2.5GHz, ATI HD4350, Ubuntu 8.04 LTS japanese.
However I use Ubuntu on VMware Player, and use 2 of 4CPU for the virtual machine and GPU Accelerated Windows 0/1.
(In reply to comment #2)
> Also, can you please test against http://media.tinyvid.tv/3d198wqepg78m.ogg or
> some other single-video simplified case and report if you still see the
> problem?
Using http://media.tinyvid.tv/3d198wqepg78m.ogg,
Recent build:CPU around 55-66%
http://hg.mozilla.org/mozilla-central/rev/e3fa26aab8a6
Mozilla/5.0 (X11; Linux i686; rv:2.0b7pre) Gecko/20101003 Firefox/4.0b7pre ID:20101003192227
Before landing the Bug 531340 :CPU around 16-22%
http://hg.mozilla.org/mozilla-central/rev/6a398023b956
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.3a4pre) Gecko/20100401 Minefield/3.7a4pre ID:20100401034504
Updated•14 years ago
|
blocking2.0: --- → ?
Comment 5•14 years ago
|
||
I profiled using oprofile the revisions which Alice noted the performance decrease. x86 Linux. This machine has no sound hardware.
Comment 6•14 years ago
|
||
Using kinda primitive tools here, but /usr/bin/time (passing video to Firefox binary so it autoplays, then closing the browser once the video fades to black) reports:
old, sound:
21.38user 5.00system 3:36.90elapsed 12%CPU (0avgtext+0avgdata 303504maxresident)k
16inputs+824outputs (1major+405814minor)pagefaults 0swaps
new, sound:
29.67user 6.00system 3:34.40elapsed 16%CPU (0avgtext+0avgdata 296608maxresident)k
0inputs+14040outputs (0major+32544minor)pagefaults 0swaps
User CPU increased 28%.
old, no sound:
24.12user 6.27system 3:34.00elapsed 14%CPU (0avgtext+0avgdata 325472maxresident)k
32inputs+16496outputs (0major+47087minor)pagefaults 0swaps
new, no sound:
25.40user 3.92system 3:33.23elapsed 13%CPU (0avgtext+0avgdata 268064maxresident)k
360inputs+4664outputs (0major+19369minor)pagefaults 0swaps
User CPU increased 5%.
Alice0775, can you please test against http://flim.org/~kinetik/3d198wqepg78m_nosound.ogg and report whether you see the high CPU use with the new decoder implementation with this file?
Comment 7•14 years ago
|
||
Typo, I meant to say User CPU increased 38% in the first case.
Reporter | ||
Comment 8•14 years ago
|
||
(In reply to comment #6)
> Alice0775, can you please test against
> http://flim.org/~kinetik/3d198wqepg78m_nosound.ogg and report whether you see
> the high CPU use with the new decoder implementation with this file?
The CPU use is low(almost same as before landing of Bug 531340).
Comment 9•14 years ago
|
||
Thanks for confirming! I wonder if this is mostly caused by the behavioural change mentioned in bug 531340 comment 91, then. Given you're testing in a VM and with a distro using PulseAudio (I think), it's possible the overhead is magnified by the cross-process calls to PA and further magnified by the VM making contexting switching more expensive than usual. I'll come up with a patch tomorrow and post it for testing.
Comment 10•14 years ago
|
||
The CPU numbers I posted earlier tested 4613e0369937 (revision before bug 531340 landed) and 1d00691be5f2 (bug 531340 landed). I've run the same measurement with a recent trunk build using the same compile flags:
sound:
35.71user 5.47system 3:34.09elapsed 19%CPU (0avgtext+0avgdata 320256maxresident)k
0inputs+408outputs (0major+22370minor)pagefaults 0swaps
no sound:
31.25user 3.63system 3:34.15elapsed 16%CPU (0avgtext+0avgdata 344432maxresident)k
96inputs+10088outputs (0major+29127minor)pagefaults 0swaps
So it looks like there are further regressions that need investigation.
Comment 11•14 years ago
|
||
Filed bug 602080 for comment 10.
Should we block on a regression here?
Comment 13•14 years ago
|
||
Yeah, I'm part way through a patch that will help with this. I don't think it'll fix the entire perf regression, but it'll be in the same ballpark at least.
Updated•14 years ago
|
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
blocking2.0: ? → final+
Comment 14•14 years ago
|
||
Recording some more numbers here for my own convenience:
old:
17.39user 3.30system 8:05.41elapsed 4%CPU (0avgtext+0avgdata 253120maxresident)k
0inputs+4248outputs (0major+21012minor)pagefaults 0swaps
new:
21.82user 7.86system 8:16.39elapsed 5%CPU (0avgtext+0avgdata 255872maxresident)k
0inputs+4128outputs (0major+21119minor)pagefaults 0swaps
25%
trunk:
32.13user 8.94system 9:15.38elapsed 7%CPU (0avgtext+0avgdata 308048maxresident)k
0inputs+8336outputs (0major+21410minor)pagefaults 0swaps
trunk #2:
31.77user 9.69system 8:02.48elapsed 8%CPU (0avgtext+0avgdata 347728maxresident)k
8inputs+5816outputs (0major+24837minor)pagefaults 0swaps
45%
trunk #3 (recent alsa-plugins):
29.96user 7.43system 7:47.09elapsed 8%CPU (0avgtext+0avgdata 316448maxresident)k
3608inputs+12632outputs (0major+22987minor)pagefaults 0swaps
patched trunk:
28.24user 4.71system 8:10.55elapsed 6%CPU (0avgtext+0avgdata 284192maxresident)k
120inputs+6376outputs (0major+22055minor)pagefaults 0swaps
-11%
patched trunk #2:
28.27user 4.59system 7:47.16elapsed 7%CPU (0avgtext+0avgdata 282416maxresident)k
0inputs+4040outputs (0major+19666minor)pagefaults 0swaps
no sound device trunk:
25.73user 4.75system 7:48.33elapsed 6%CPU (0avgtext+0avgdata 281968maxresident)k
6928inputs+536outputs (0major+19098minor)pagefaults 0swaps
patched trunk:
26.57user 5.67system 11:21.78elapsed 4%CPU (0avgtext+0avgdata 327776maxresident)k
80inputs+1480outputs (0major+23188minor)pagefaults 0swaps
-----------------------------------------------------------
trunk, no controls, IK+:
9.72user 5.68system 8:25.51elapsed 3%CPU (0avgtext+0avgdata 293040maxresident)k
64inputs+6608outputs (1major+19689minor)pagefaults 0swaps
patched trunk, no controls, IK+:
7.73user 2.58system 8:07.01elapsed 2%CPU (0avgtext+0avgdata 353056maxresident)k
7712inputs+32920outputs (0major+23460minor)pagefaults 0swaps
with fixed 32kB writeBuffer:
6.88user 2.12system 7:58.31elapsed 1%CPU (0avgtext+0avgdata 282960maxresident)k
0inputs+728outputs (0major+18947minor)pagefaults 0swaps
and floorf removed:
5.91user 2.13system 7:50.50elapsed 1%CPU (0avgtext+0avgdata 282800maxresident)k
136inputs+888outputs (0major+18486minor)pagefaults 0swaps
Comment 15•14 years ago
|
||
Short term fix until we rework this with the new audio library (sydneyaudio replacement).
Coalesce audio writes in the audio loop. Reduces the number of writes to sydneyaudio (and, thus, on PulseAudio systems, interprocess communication to the PA server) significantly.
Needs testing on other platforms.
Updated•14 years ago
|
Summary: High CPU usage on OGG Video Playback → Many small audio writes causes high CPU during media playback
Comment 16•14 years ago
|
||
Fixed test failures caused by bugs in the patch. Should be good-to-go now, but I'll test on other platforms before requesting review.
Attachment #483392 -
Attachment is obsolete: true
Comment 17•14 years ago
|
||
Oh, also: this patch contains a fix for bug 598829.
Comment 18•14 years ago
|
||
Tests pass on Linux and Win32 for me, pushed to tryserver for better coverage. Test builds should turn up in a few hours: http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds/mgregan@mozilla.com-e852bf2718bd/
Comment 19•14 years ago
|
||
Builds are available. Alice0775, if you could test these and let me know whether it helps, that'd be much appreciated.
Linux tests were all green. Win32 haven't come in yet. OSX got this:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1287391620.1287393618.6758.gz
s: talos-r3-snow-042
55345 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_a4_tone.html | The audioAvailable event's time attribute was invalid.
55346 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_a4_tone.html | Check amount of signal data processed - got 103424, expected 133120
55347 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_a4_tone.html | Check signal present - got false, expected true
55348 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_a4_tone.html | Check mute fragments present - got false, expected true
Reporter | ||
Comment 20•14 years ago
|
||
(In reply to comment #19)
> Builds are available. Alice0775, if you could test these and let me know
> whether it helps, that'd be much appreciated.
Tested on VMware as same as comment #4
Laatest Nightly: http://hg.mozilla.org/mozilla-central/rev/397c458b40a4
Mozilla/5.0 (X11; Linux i686; rv:2.0b8pre) Gecko/20101016 Firefox/4.0b8pre ID:20101016025826
TryServer in comment #18: http://hg.mozilla.org/try/rev/e852bf2718bd
Mozilla/5.0 (X11; Linux i686; rv:2.0b8pre) Gecko/20101018 Firefox/4.0b8pre ID:20101018003128
Video: http://www.dailymotion.com/html5
Latest Nightly: CPU around around 55-77%
TrySercer: CPU around around 66-72%
Video : http://media.tinyvid.tv/3d198wqepg78m.ogg
Latest Nightly: CPU around 55-66%
TrySercer: CPU around 55-66%
Comment 21•14 years ago
|
||
Thanks for that, I'm a little disappointed that the patch hasn't helped at all. I'm setting up a more similarly configured system to what you're testing against to see if I can work out what's going on.
The OS X failures above are caused by the new code queuing a large enough amount of data that the event queue size limit in nsAudioAvailableEventManager is exceeded and it starts dropping events. This is due to Available() returning 882000 on OS X. The event queue size limit is 100 "frame buffer" (typically 1024 byte) sized events. It actually makes more sense to try to always queue a fixed amount (e.g. 32kB) rather than using Available()--changing the patch to do this, so long as it doesn't exceed the limits in nsAudioAvailableEventManager, fixes the test failures on OS X.
Comment 22•14 years ago
|
||
Attachment #483910 -
Attachment is obsolete: true
Comment 23•14 years ago
|
||
Rebased after Tremor landing. Use a 4k buffer rather than 32k.
Attachment #484460 -
Attachment is obsolete: true
Comment 24•14 years ago
|
||
Looking at this problem on the N900, where it seems to be rather extreme.
Using this testcase: http://flim.org/~kinetik/out.html, which is a mono 48kHz Vorbis stream.
N900, audio disabled in nsAudioStream (mInError=true before returning OK from Init()): 8% CPU
N900, audio enabled: 100% (75 user, 25 system) CPU, would need even more to play back smoothly
No audio, system profile:
samples % image name app name symbol name
278301 85.0309 no-vmlinux no-vmlinux /no-vmlinux
5484 1.6756 oprofiled oprofiled /usr/bin/oprofiled
4838 1.4782 no-vmlinux plugin-container /no-vmlinux
3636 1.1109 libxul.so plugin-container mdct_backward
2030 0.6202 no-vmlinux pulseaudio /no-vmlinux
No audio, plugin-container profile:
samples % image name symbol name
4838 19.8124 no-vmlinux /no-vmlinux
3636 14.8900 libxul.so mdct_backward
1428 5.8479 libxul.so nsAudioAvailableEventManager::QueueWrittenAudioData(short*, unsigned int, unsigned long long)
1165 4.7709 libxul.so _vorbis_apply_window
1012 4.1443 libxul.so decode_packed_entry_number
Audio, system profile:
samples % image name app name symbol name
72364 19.0834 no-vmlinux no-vmlinux /no-vmlinux
55717 14.6933 libasound.so.2.0.0 plugin-container /usr/lib/libasound.so.2.0.0
44412 11.7121 no-vmlinux plugin-container /no-vmlinux
31489 8.3041 libasound_module_pcm_pulse.so plugin-container /usr/lib/alsa-lib/libasound_module_pcm_pulse.so
19831 5.2297 libpulsecommon-0.9.15.so plugin-container /usr/lib/libpulsecommon-0.9.15.so
19332 5.0981 libpulse.so.0.8.0 plugin-container /usr/lib/libpulse.so.0.8.0
15096 3.9810 no-vmlinux pulseaudio /no-vmlinux
13820 3.6445 libpthread-2.5.so plugin-container pthread_mutex_lock
10356 2.7310 module-nokia-voice.so pulseaudio /usr/lib/pulse-0.9.15/modules/module-nokia-voice.so
9405 2.4802 oprofiled oprofiled /usr/bin/oprofiled
8656 2.2827 libpthread-2.5.so plugin-container __pthread_mutex_unlock_usercnt
7408 1.9536 libc-2.5.so plugin-container poll
4872 1.2848 libpulsecore-0.9.15.so pulseaudio /usr/lib/libpulsecore-0.9.15.so
4310 1.1366 libpthread-2.5.so plugin-container pthread_mutex_unlock
4174 1.1007 libpulsecommon-0.9.15.so pulseaudio /usr/lib/libpulsecommon-0.9.15.so
3110 0.8201 libxul.so plugin-container mdct_backward
Audio, plugin-container profile:
samples % image name symbol name
55717 23.5732 libasound.so.2.0.0 /usr/lib/libasound.so.2.0.0
44412 18.7902 no-vmlinux /no-vmlinux
31489 13.3226 libasound_module_pcm_pulse.so /usr/lib/alsa-lib/libasound_module_pcm_pulse.so
19831 8.3903 libpulsecommon-0.9.15.so /usr/lib/libpulsecommon-0.9.15.so
19332 8.1792 libpulse.so.0.8.0 /usr/lib/libpulse.so.0.8.0
13820 5.8471 libpthread-2.5.so pthread_mutex_lock
8656 3.6623 libpthread-2.5.so __pthread_mutex_unlock_usercnt
7408 3.1342 libc-2.5.so poll
4310 1.8235 libpthread-2.5.so pthread_mutex_unlock
3110 1.3158 libxul.so mdct_backward
2209 0.9346 libc-2.5.so __libc_enable_asynccancel
Comment 25•14 years ago
|
||
The problem on the N900 may be different after all. I'm seeing a pattern like this when writing audio, for almost every write:
ms
337: write 8192
337: writei 8192
342: writei 2432
602 writei recover: -32
611: writei 2432
write is sa_pcm_write, writei is calling snd_pcm_writei. We try to write 8k, which succeeds but reports that only part of the buffer was accepted (5760 bytes). We immediately try to write the remaining 2432 bytes, which blocks for 260ms before returning EPIPE (indicating an underrun occurred), so we call snd_pcm_recover and reattempt the write, which succeeds.
snd_pcm_recover is *very* expensive, and we wouldn't expect to call it frequently in normal use.
We initialize the PCM device with 250ms latency, so I think we're ending up in a state where the buffer sizing is pathological for this configuration. Maybe we end up with a single 250ms buffer, and we end up blocked in writei waiting for more space and underrun while we wait for it to empty.
As a quick hack, changing:
143 if (snd_pcm_set_params(s->output_unit,
144 #ifdef SA_LITTLE_ENDIAN
145 SND_PCM_FORMAT_S16_LE,
146 #else
147 SND_PCM_FORMAT_S16_BE,
148 #endif
149 SND_PCM_ACCESS_RW_INTERLEAVED,
150 s->n_channels,
151 s->rate,
152 1,
153 250000) < 0) {
250000 at line 253 to something larger (I used 1000000, which is 1s) makes this problem disappear completely and we're able to play the testcase while using around 8-10% CPU in plugin-container (and similar in PulseAudio).
Comment 26•14 years ago
|
||
Note that the value used to be 500ms, but we dropped it to 250ms in bug 526411 to work around playback problems with the old oggplay decoder implementation. We can probably revert to 500ms safely on trunk since we're using our own decoder implementation now.
snd_pcm_dump output for 250ms configuration:
buffer_size : 12000
period_size : 3000
period_time : 62500
tstamp_mode : NONE
period_step : 1
avail_min : 3000
period_event : 0
start_threshold : 12000
stop_threshold : 12000
silence_threshold: 0
silence_size : 0
boundary : 1572864000
Logging:
write 1:
89785: writei 4672
write 2:
89799: writei 4096
write 3:
89800: writei 4096 -> total writes would be 12864 (buffer_size is 12000)
89807: writei 864 -> need to rewrite 864 bytes -> blocks in snd_pcm_writei for 250ms
90057: writei recover: -32
Note that 12000 / 48000 is 250ms. If I understand how ALSA works correctly, there should be 4 3000 frame buffers (for 12000 frames total), so I'd only expect the write to block until one of those freed up (~63ms), but it seems to block until the entire buffer runs dry.
Comment 27•14 years ago
|
||
Comment 28•14 years ago
|
||
Spun the N900 stuff off into bug 607200 as it's a separate issue, though the patch in this bug may also help reduce CPU during playback.
Updated•14 years ago
|
Attachment #485228 -
Attachment is obsolete: true
Comment 29•14 years ago
|
||
Alice, are you able to reproduce this regression outside of a VM (i.e. on a native Linux install)?
Reporter | ||
Comment 30•14 years ago
|
||
Tested on Native installed Ubuntu 10.10 (on the same hardware of the above mentioned VM)
Graphic
GPU Accelerated Windows : 0/1
http://hg.mozilla.org/mozilla-central/rev/f7016571b472
Mozilla/5.0 (X11; Linux i686; rv:2.0b8pre) Gecko/20101104 Firefox/4.0b8pre ID:20101104025921
Video: http://www.dailymotion.com/html5
CPU around 11-22%
Video : http://media.tinyvid.tv/3d198wqepg78m.ogg
CPU around 5-16%
It is a quite low CPU utilization!!
Updated•14 years ago
|
Attachment #484538 -
Attachment is obsolete: true
Comment 31•14 years ago
|
||
I tested this on a 2.4GHz T7700 Core 2 Duo running VMWare Fusion 3.x with a Fedora 12 VM and couldn't reproduce the problem (playing an audio-only file with controls disabled used ~3% CPU). I then tried my Ubuntu 9.10 VM with the same testcase and saw the CPU use at around 40-80% during playback. That eliminates VMs as the common factor, so perhaps this bug is specific to older Ubuntu versions.
blocking2.0: final+ → ---
Summary: Many small audio writes causes high CPU during media playback → High CPU use playing video with sound on some platforms (Ubuntu 8.04 specific?)
Updated•14 years ago
|
Summary: High CPU use playing video with sound on some platforms (Ubuntu 8.04 specific?) → High CPU use playing video with sound on some platforms (old Ubuntu specific?)
Reporter | ||
Comment 32•14 years ago
|
||
Yeh,10.10 is better than 8.04 on VM.
On VMware Player3.1.2 build-301548 (host Windows7x64)
Video: http://www.dailymotion.com/html5
ubuntu8.04 : CPU around around 55-77%
ubuntu10.10: CPU around around 11-33%
Video : http://media.tinyvid.tv/3d198wqepg78m.ogg
ubuntu8.04: CPU around 55-66%
ubuntu10.10: CPU around 5-27%
Comment 33•14 years ago
|
||
Oprofile seems to suggest that we're spending most of our time below sa_stream_write and sa_stream_get_position, but applying my patch from this bug makes very little difference despite reducing the number of writes by at least 8x.
Updated•11 years ago
|
Assignee: kinetik → nobody
Updated•9 years ago
|
Component: Audio/Video → Audio/Video: Playback
Reporter | ||
Updated•8 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•