Open Bug 297959 Opened 20 years ago Updated 2 years ago

Table-based video doesn't work fast enough

Categories

(Core :: DOM: Core & HTML, defect, P5)

People

(Reporter: gerv, Unassigned)

Details

(Keywords: perf, testcase, Whiteboard: need profiles)

Attachments

(3 files, 1 obsolete file)

Recently, I attempted "table-based video" - i.e. changing table cell background colours to get animation. It was far too slow, understandably. I got about 0.5fps with a 100x100 cell matrix, meaning we did about 5000 background colour change operations a second. I'm filing this bug at Boris's request, because he wanted to do some profiling; I don't really expect Firefox's DOM to be fast enough to update 100,000 (100x100 pixels at 10fps) background colours a second. Test file coming up. Gerv
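The figures in this comment follow from cells times frame rate. A minimal sketch of that arithmetic, where the function name and variables are illustrative and not taken from the attached testcase:

```javascript
// Colour-change operations per second for a grid redrawn in full every frame.
// Assumes every cell is updated on every frame, as in the description above.
function colourChangesPerSecond(cols, rows, fps) {
  return cols * rows * fps;
}

// Measured: 100x100 cells at about 0.5fps.
const measuredOps = colourChangesPerSecond(100, 100, 0.5);

// Target mentioned in the comment: 100x100 cells at 10fps.
const targetOps = colourChangesPerSecond(100, 100, 10);

// How much faster the per-cell update would need to be to reach the target.
const speedupNeeded = targetOps / measuredOps;

console.log(measuredOps, targetOps, speedupNeeded);
```

Under these assumptions the gap between the measured rate and the 10fps target is a factor of 20.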
Attached file Video test HTML (deleted) —
Oops, I used the wrong comment char there. But the testcase still works. Gerv
Attached file Canvas test HTML (obsolete) (deleted) —
Canvas manages about double the speed - 1fps - but then, a canvas-based implementation could colour contiguous pixel ranges at lower cost. How well this would work in practice depends on the amount of change - either deliberate or pixel jitter - between frames. So perhaps a cartoon "no-plugin video" implementation might be a lot more successful than a captured video one. Gerv
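The idea of colouring contiguous pixel ranges at lower cost could look roughly like the sketch below. It assumes each frame arrives as an array of rows of CSS colour strings, which is a hypothetical input format and not the one used by the attached test. The names rowRuns and drawFrameByRuns are also made up for illustration; only fillStyle and fillRect(x, y, w, h) are real canvas 2D context members.

```javascript
// Collapse one row of per-pixel colours into runs of identical colour.
// Returns objects of the form {x, width, colour}.
function rowRuns(rowColours) {
  const runs = [];
  let start = 0;
  for (let x = 1; x <= rowColours.length; x++) {
    // Close the current run at the end of the row or when the colour changes.
    if (x === rowColours.length || rowColours[x] !== rowColours[start]) {
      runs.push({ x: start, width: x - start, colour: rowColours[start] });
      start = x;
    }
  }
  return runs;
}

// Draw a frame with one fillRect per run instead of one per pixel.
// ctx is expected to behave like a CanvasRenderingContext2D.
function drawFrameByRuns(ctx, frame) {
  for (let y = 0; y < frame.length; y++) {
    for (const run of rowRuns(frame[y])) {
      ctx.fillStyle = run.colour;
      ctx.fillRect(run.x, y, run.width, 1);
    }
  }
}
```

How much this saves depends on the content: flat cartoon-style frames produce few runs per row, while noisy captured video may degrade to nearly one run per pixel.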
I'd be curious to see what the bottleneck is here, if someone wants to spend some quality oprofile time with this.
Keywords: perf
The <table> and <canvas> variants here aren't a fair comparison. They're different sizes, do different things, and the <table> case even tests transparency. Interesting tests though...
Whiteboard: need profiles
Ian: sure; the two test HTMLs are not "tests" in the sense that you might use the word. I was just messing around. I'm sure someone could use them to do a careful comparison if they wished to. It would be very cool if <canvas> got fast enough that you could stream JS to it and do cartoon-style animation at a decent size and frame rate. Maybe it is already, I don't know. Gerv
Ok, here are some real tests: http://www.hixie.ch/tests/adhoc/perf/video/ Boy, do we suck. Especially compared to, say, Opera. I'm getting an order of magnitude better performance on Opera's worst test than I'm getting on Firefox's best test. It also seems that our getElementById() and childNodes[] implementations suck ass (in comparison Opera's are virtually instantaneous). But I guess those are separate bugs.
So for hixie's test 1 (001.html), I have:

Total hit count: 566305
  Painting: 11368
  Computed style recalculation: 43136
  Security checks: 51048 (this is with some local changes to make them a lot faster!)
  Total time spent in Gecko when called from JS: 72287
    of this, setting background color style: 69601, which is split 50/50 between parsing the CSS and other things
  XPCConvert::NativeInterface2JSObject: 118062
  JS_GC: 31492
  XPConnect overhead:
    Time under XPC_WN_Helper_GetProperty that hasn't been listed above: 27615
    Time under XPC_WN_GetterSetter that hasn't been listed above: 63197
  JS overhead:
    Time under js_Invoke that hasn't been listed above: 158330

That covers 99.8% of all the hits.

Summary:
  20% or so of the time is actually in gklayout.
  9% is security checks, more in stock trunk builds.
  20% of the time is spent creating XPCWrappedNative objects, presumably because there are enough things in flight here that we GC the suckers as fast as we create them, so when we get back to the beginning of the next animation frame we have to recreate them all over again. When I tried adding an extra ref to all XPCWrappedNatives on creation so they wouldn't get GCed, time on this testcase went down about 20%, as expected.
  25% of the time is spent in actual JS execution.
  18% of the time is various XPConnect overhead.

For comparison's sake, we take 50s or so to run this testcase; Opera 8 on the same system takes 10.5s.

I believe we have existing bugs on the security manager end of things. The other obvious "easy" issue is the over-eager GC of the wrapped natives. Is there anything we can do about that?
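For anyone wanting to check the summary percentages against the raw hit counts, a small sketch follows. The bucket names are informal labels chosen here, not profiler output. The counts are copied from this comment as reported, and the rounding to one decimal place is an arbitrary choice.

```javascript
// Raw hit counts as reported in the profile for test 1 (001.html).
const totalHits = 566305;
const hitBuckets = {
  painting: 11368,
  styleRecalc: 43136,
  securityChecks: 51048,
  geckoCalledFromJS: 72287,
  nativeInterface2JSObject: 118062,
  jsGC: 31492,
  xpcHelperGetProperty: 27615,
  xpcGetterSetter: 63197,
  jsInvokeOther: 158330,
};

// Share of the total, as a percentage rounded to one decimal place.
function sharePercent(hits, total) {
  return Math.round((hits / total) * 1000) / 10;
}

for (const [name, hits] of Object.entries(hitBuckets)) {
  console.log(name, sharePercent(hits, totalHits) + "%");
}
```

The shares for security checks and for NativeInterface2JSObject come out close to the 9% and 20% quoted in the summary. The other summary lines combine several buckets, so they do not map one-to-one onto these labels.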
Blocks: 213943
The "JS overhead" number is actually about 30000 too big; I accidentally included the JS_GC time in it. I'm going to try to calculate the overhead numbers via jprof instead of by hand and see what comes out.
Depends on: 299689
Depends on: 299703
So more profiling (with oprofile this time) on test 3 gives us:

samples|       %|
------------------
  57149  29.9863  libgklayout.so
  32800  17.2103  libmozjs.so
  27493  14.4257  no-vmlinux (this is kernel time; called from wherever)
  15433   8.0977  libxpconnect.so
  13853   7.2687  libxpcom_core.so
   9821   5.1531  libpthread-0.10.so
   9372   4.9175  libc-2.3.2.so
   6522   3.4221  libnspr4.so
   3004   1.5762  libcaps.so
   2474   1.2981  libgkgfx.so

Some of the libmozjs.so time here would be calls to JS_GetPrivate and JS_GetClass from xpconnect, of course.

I also tried doing splits with jprof; time spent under (not in) js_Invoke but excluding XPC_WN_GetterSetter, XPC_WN_Helper_NewResolve, XPC_WN_NoHelper_Resolve, XPC_WN_ModsAllowed_Proto_Resolve, XPC_NW_GetProperty, XPC_WN_Helper_GetProperty, XPC_WN_CallMethod, XPC_WN_OnlyIWrite_PropertyStub, XPC_WN_Helper_Mark, XPC_WN_NoHelper_Finalize, DOMGCCallback, JS_GC, nsJSContext::DOMBranchCallback, jsd_FunctionCallHook is about 16% of the profile.

oprofile refuses to do callgraphs with my kernel version, so I can't get any more useful data from there; help from someone with a fully functioning oprofile setup would be much appreciated.

Also of interest is that jsd_ObjectHook and jsd_FunctionCallHook together add up to about 4.3% of the profile, which seems a little on the high side as far as I'm concerned. I filed bug 299788 on that.
Depends on: 299788
Keywords: testcase
Depends on: 307441
Depends on: 311456
Depends on: 311458
Depends on: 311485
Depends on: 311511
Depends on: 311546
Depends on: 311547
Depends on: 311550
Blocks: 203448
Depends on: 311566
Depends on: 311571
Depends on: 311582
Depends on: 311592
Attached file fixed canvas test HTML (deleted) —
The canvas testcase has a bug: ctx.fillRect(i, j, i+1, j+1); should be ctx.fillRect(i, j, 1, 1); as fillRect takes (x,y,w,h), not (x0,y0,x1,y1). Setting the fillStyle on every pixel seems to be extremely expensive; I'm not sure why, though the string ops could certainly be a part of it. Moving the fillStyle setting outside of the loop gives a significant speedup (300%) for me.
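A minimal sketch of the corrected per-pixel loop, assuming the frame is a flat array of CSS colour strings in row-major order (a hypothetical layout, not necessarily the attachment's). The function name drawPixels is made up for illustration. It also skips the fillStyle assignment when the colour has not changed since the previous pixel. That is one possible way to cut the cost described above when the fill setting cannot be hoisted out of the loop entirely, and it is not something the attached testcase is known to do.

```javascript
// Draw width*height pixels one at a time.
// ctx is expected to behave like a CanvasRenderingContext2D.
// Returns how many times fillStyle was actually assigned.
function drawPixels(ctx, pixels, width, height) {
  let lastColour = null;
  let styleSets = 0;
  for (let j = 0; j < height; j++) {
    for (let i = 0; i < width; i++) {
      const colour = pixels[j * width + i];
      // Only touch fillStyle when the colour differs from the previous pixel.
      if (colour !== lastColour) {
        ctx.fillStyle = colour;
        lastColour = colour;
        styleSets++;
      }
      // fillRect takes (x, y, width, height), so a single pixel is 1 by 1.
      ctx.fillRect(i, j, 1, 1);
    }
  }
  return styleSets;
}
```

For frames where neighbouring pixels rarely share a colour this saves little, so it does not remove the need to make the fillStyle setter itself cheaper.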
Attachment #186525 - Attachment is obsolete: true
Vlad, I think the relevant testcase is actually at http://www.hixie.ch/tests/adhoc/perf/video/ and I don't see a way to move the fill setting outside the loop there. The whole point is that we're trying to draw arbitrary images to canvas (or table, or whatever) using pixel addressing.
(In reply to comment #13)
> Vlad, I think the relevant testcase is actually at
> http://www.hixie.ch/tests/adhoc/perf/video/ and I don't see a way to move the
> fill setting outside the loop there. The whole point is that we're trying to
> draw arbitrary images to canvas (or table, or whatever) using pixel addressing.

Yeah, I'm not suggesting that it should be moved out -- more pointing out that it's ridiculously expensive, when it shouldn't be. It doesn't seem to be showing up on your profiles though..
The profiles in this bug were not done on the canvas testcase. The profile in bug 311571 does show the SetFillStyle part....
Depends on: 314255
Can anyone attach a minimal testcase?
Uh... Minimal in what sense? The link in the URL field points to several different testcases that this is a tracking bug for.
Yes, I posted too quickly. I've run the two testcases on some browsers three or four times. This is the comparison between Fx 3.1 beta and Opera 9.5 beta2, in FPS:

1st testcase: Fx3: 1.5, Opera: 4.6
2nd testcase: Fx3: 5.0, Opera: 11.6

IE doesn't work with the testcases; Safari works with only one of them, and it's as slow as Fx3.

Some considerations:
1) the testcases sometimes end immediately, returning 0.1 FPS, if reloaded
2) I think Parity-Opera should be added to the Whiteboard
3) Opera seems to use a lot of memory, though
Assignee: general → nobody
QA Contact: ian → general
The first time I tried the 1st testcase, it ran at 4fps. Then I ran the 2nd testcase, and it didn't run; it just showed 1 color and printed 0.1fps. Then I tried Chrome. Visually, the 1st testcase is very different there. Nightly shows a colored stripe, then no stripe, then a stripe, then no stripe... and so on. Chrome only has mini gray boxes. It printed 3fps. For the 2nd testcase, Chrome ran at 40fps. Then I came back to Nightly and tried the testcases again, and the 2nd ran ok, at 20fps. But the 1st gave the error that I had with the 2nd (0.1fps and no animation). I reloaded a few times, until it worked ok again. Strange. Then I went to hixie's website, and all 4 tests say "Test failed to run." In Chrome they also briefly print "Test failed to run.", but then they work ok.
The word on the street is that WebGL-based video rendering works fine https://brendaneich.com/2013/05/today-i-saw-the-future/
(In reply to David Bruant from comment #20)
> The word on the street is that WebGL-based video rendering works fine
> https://brendaneich.com/2013/05/today-i-saw-the-future/

This bug does not exist because someone wants to do table-based video today; it exists because the test case showed up speed issues with our DOM implementation. Gerv
Priority: -- → P5
Component: DOM → DOM: Core & HTML
Severity: normal → S3