Closed
Bug 498770
Opened 15 years ago
Closed 15 years ago
Enable optimized Theora code in Windows builds
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: kinetik, Assigned: ds)
References
Details
Attachments
(1 file, 2 obsolete files)
(deleted),
patch
|
kinetik
:
review+
|
Details | Diff | Splinter Review |
We were forced to disable the assembly optimized Theora code on Windows due to crashes, but that should only be a temporary fix. I can't find a bug to get the root cause fixed and enable this code again, so I'm filing a new one.
This is really a bug in the upstream code, but we should track this and work towards getting a fix because it's important for playback performance.
There's a bunch of analysis and context in bug 474937.
Reporter | ||
Comment 1•15 years ago
|
||
There's documentation about the preservation of registers inside inline asm blocks here: http://msdn.microsoft.com/en-us/library/k1a8ss06%28VS.80%29.aspx
In general, ebx should be saved and restored by the compiler... but there's a note about use of ebx when FPO is not used:
> If the compiler does not perform FPO, it will use EBX and EBP. To ensure code runs correctly, do not modify EBX in asm code if the function requires dynamic stack alignment as it could modify the frame pointer.
I'm not sure if this case applies here, though.
Comment 2•15 years ago
|
||
For what hearsay is worth, using EBX inappropriately is my best understanding of what's wrong with the 1.0 msvc assembly. However, we (upstream) don't have anyone with domain expertise to develop the fix.
Note that a work-around is to link to a version of the theora library
cross-compiled with gcc. For example, those available from
http://people.xiph.org/~tterribe/theora-win32-builds/ This would provide temporary relief of the performance complaints, which we've been hearing a lot of lately.
We have a volunteer working upstream on the msvc assembly toward the 1.1
release, but that may not be ready for some time. This is a separate issue from fixing the 1.0 msvc assembly; the code has changed significantly on that branch.
Reporter | ||
Comment 3•15 years ago
|
||
There's more information here: http://msdn.microsoft.com/en-us/library/aa290049.aspx
...however, it's slightly conflicting with the earlier link. In any case, it does seem like avoiding use of ebx (or spilling and restoring it manually) may solve this problem.
I'm busy with other bugs for now, and I'm no expert in this area, but I can take a crack at fixing it later if nobody else beats me to it. However, if Theora 1.1 is going to be released before we do another Firefox release, it may not be worth spending time fixing this in Theora 1.0.
Comment 4•15 years ago
|
||
I think, if PGO is disable on theora using NO_PROFILE_GUIDED_OPTIMIZE=1 in Makefile, this may be able to fix it. This hack is used on sqlite3.
Assignee | ||
Comment 5•15 years ago
|
||
This is an untested patch to save ebx on the stack and/or use a different register.
Assignee | ||
Comment 6•15 years ago
|
||
Oops. Forgot to remove some irrelevant cruft in my tree in the first patch.
Attachment #383594 -
Attachment is obsolete: true
Reporter | ||
Comment 7•15 years ago
|
||
Thanks Makoto and David! I'm in the processing of testing now. It'll take a while to get the results, as PGO builds on my little Windows machine will take a good part of day. I'll post results when I have them.
Reporter | ||
Comment 8•15 years ago
|
||
With David's patch applied, I noticed the following warnings:
libtheora\lib\dec\x86_vc\mmxstate.c(144) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(144) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(156) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(156) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(156) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(160) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
libtheora\lib\dec\x86_vc\mmxstate.c(160) : warning C4731: 'oc_state_frag_recon_mmx' : frame pointer register 'ebx' modified by inline assembly code
I'm not sure if they occurred in the earlier PGO build I ran (to make sure I could reproduce the crash), since the scrollback log doesn't go back far enough. The patched build has only just started the LTCG phase, so I'll test it tomorrow morning.
Reporter | ||
Comment 9•15 years ago
|
||
We do always get warnings like the ones above when the optimized assembly is built, so it's not new with David's patch.
The patch fixes the crashes. Videos that I'd crash on (in the MMX code) immediately play fine with the patch applied.
Mochitests past, but a couple of reftests fail. I'll investigate tomorrow.
Reporter | ||
Comment 10•15 years ago
|
||
Oops, the reftests are just fine. They failed because I was running them against a build that also included an unrelated patch to disable EXTEND_PAD.
We should land this on trunk.
Comment 11•15 years ago
|
||
This is David's patch packaged up with README_MOZILLA changes and fixes to makefile to enable it.
Attachment #383595 -
Attachment is obsolete: true
Comment 12•15 years ago
|
||
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/240cad5e94b6
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•15 years ago
|
Attachment #384046 -
Flags: review+
Comment 13•15 years ago
|
||
(In reply to comment #4)
> I think, if PGO is disable on theora using NO_PROFILE_GUIDED_OPTIMIZE=1 in
> Makefile, this may be able to fix it. This hack is used on sqlite3.
For clarification note that "FPO" and "PGO" are not the same thing. "FPO" is "Frame pointer optimization", making use of ebp as a general purpose register. "PGO" is "Profile guided optimization", running an instrumented version of the binary to generate information about what codepaths are taken, and then relinking the binaries using that information for further optimization.
We use both in our Win32 nightly/release builds, FWIW.
Assignee: nobody → ds
Comment 14•15 years ago
|
||
Would be good to fix this for Firefox 3.5 too, currently theora video is almost unusable on older cpus/netbooks under windows, a lot slower/higher cpu usage than flash video, which is not exactly helpful for promoting <video> ;)
Flags: blocking1.9.1.1?
Comment 15•15 years ago
|
||
I'm not sure we should enable this for 1.9.1.1, but I'd love to hear comments from the <video> guys about relative safety of doing so.
Either way, this doesn't block 1.9.1.1 and we'd probably enable it for 1.9.1.2 or later.
Flags: blocking1.9.1.1? → wanted1.9.1.x?
Reporter | ||
Comment 16•15 years ago
|
||
I've been hearing a lot of complaints about performance, so it would be nice to turn this on in 1.9.1 at some point. It's quite low risk, as that code is called for each frame decode, so it has had a exposure through trunk and our testsuite for 3 weeks or so now.
Having said that, I'm not sure if enabling the optimized Theora code is going to be enough to make a noticeable different on the machines that are hurting now, as our performance problems are mostly higher up the abstraction layers.
Perhaps the next step should be to find someone with a slow Windows machine (the Atom netbooks are probably an ideal testbed) to test builds with this enabled and disabled, and find out if it'll make enough of a difference to be worthwhile.
Comment 18•15 years ago
|
||
This patch causes problems with compilation on mingw. I've filled bug 530313.
You need to log in
before you can comment on or make changes to this bug.
Description
•