Closed
Bug 523298
Opened 15 years ago
Closed 2 years ago
Much slower than Chrome in demo due to temporary surfaces in background image painting
Categories
(Core :: Graphics, defect)
RESOLVED
WORKSFORME
People
(Reporter: sicking, Unassigned)
Details
(Whiteboard: chromeexperiments)
Attachments
(1 file: image/png, deleted)
On the following demo
http://mrdoob.com/projects/chromeexperiments/depth_of_field/
we are getting badly beaten by Chrome. I'm not actually sure whether this is due to JavaScript issues or canvas issues. Guessing canvas for now, but we really need profiles.
However, I think that drawImage currently makes us always fall off trace because of the way that the optional arguments work. Bug 459452 might fix that.
Comment 1•15 years ago (Reporter)
Actually, this isn't canvas at all since it's not using canvas. Seems like this is simply a load of absolutely positioned elements combined with CSS sprites and transforms (to do scaling).
So it's also not bug 459452, as there are no drawImage calls.
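For readers unfamiliar with the technique, here is a minimal sketch of how a CSS sprite selects a sub-rectangle of a larger sheet. The numbers and the names `Rect`, `sprite_source_rect` and `scaled_size` are hypothetical, and it assumes pixel-valued background-position with no tiling; it is not taken from the demo's source.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def sprite_source_rect(box_w, box_h, bg_pos_x, bg_pos_y):
    """Region of the sprite sheet visible through a box_w x box_h element.

    A background-position of (-64px, -32px) shifts the sheet left and up,
    so the visible region starts at (64, 32) in sheet coordinates.
    """
    return Rect(-bg_pos_x, -bg_pos_y, box_w, box_h)


def scaled_size(rect, scale):
    """On-screen size of the sprite after a uniform scale transform."""
    return (rect.width * scale, rect.height * scale)


# Hypothetical 32x32 sprite taken from a larger sheet, drawn at half size.
src = sprite_source_rect(32, 32, -64, -32)
print(src)
print(scaled_size(src, 0.5))
```

The point is only that each element paints a small piece of a larger image, and the transform means that piece gets resampled rather than copied 1:1.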
Updated•15 years ago (Reporter)
Summary: Much slower that chrome in canvas-heavy demo → Much slower that chrome in demo
Comment 2•15 years ago
Basic breakdown from a shark profile:
52% under background painting (and yes, transforms are involved).
20% (!) vm_fault
If I exclude the supervisor callstacks, painting is at 80%, 10% is the usual AppKit/HiToolBox event mess, 7% is under js_Interpret. We don't seem to jit this, in fact, but given the painting that's not exactly the bottleneck. Might still want a bug on that, of course.
Summary: Much slower that chrome in demo → Much slower that chrome in demo due to painting being slow
Comment 3•15 years ago
Hmm. So I just did a malloc trace in shark. Over 10s, the testcase allocated about 11MB. 2MB of that was painting; the rest was JS.
Comment 4•15 years ago
The painting allocations are from ripl_Create called from ripc_GetColor, called from ripc_Render, called from ripc_DrawRects, called from CGContextFillRects, called from CGContextFillRect, called from CGContextDrawTiledImage, called from _cairo_quartz_surface_paint, called via some cairo stuff from imgFrame::Draw.
The JS allocations are from js_NewStringFromCharBuffer (3.5MB), js_ValueToString on numbers (2.5MB), js_ConcatStrings (1.6MB). Also about 500KB under quickstub conversions to string, 140KB under ExecuteTree (running regexps), 88KB under js_GetMutableScope (JSScope::create), and 70KB under js_SetProperty (JSScope::changeTable). Other allocations are pretty small all around.
Comment 5•15 years ago
The painting code spends 39% of total testcase time in (not under) sseCGSBlendXXXX8888 in the CoreGraphics library; this is under CGContextDrawTiledImage called from moz_cairo_paint_with_alpha called from imgFrame::Draw. The moz_cairo_fill_preserve call that Draw() makes is the other 35+% there; this has no single chokepoint (though resample_band in CoreGraphics is a lot of it).
Comment 6•15 years ago
(In reply to comment #4)
> The painting allocations are from ripl_Create called from ripc_GetColor, called
> from ripc_Render, called from ripc_DrawRects, called from CGContextFillRects,
> called from CGContextFillRect, called from CGContextDrawTiledImage, called from
> _cairo_quartz_surface_paint, called via some cairo stuff from imgFrame::Draw.
Instruments showed 95% of allocations coming from here.
Comment 7•15 years ago
Comment 8•15 years ago
The Safari GFX call stacks look pretty different:
One of the big differences is that we use CGContextDrawTiledImage whereas Safari uses CGContextDrawImage. Also interesting is that sseCGSBlendXXXX8888 doesn't even show up in Safari.
Comment 9•15 years ago
The testcase is using background-position and transforms, so I suspect image drawing is going through the path where we create a temporary image which is the piece of the image we need to render, so we can EXTEND_PAD it and not sample the wrong pixels.
We could test that hypothesis by replacing !subimage.Contains(imageRect) with PR_FALSE here:
http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/src/imgFrame.cpp#535
If that is a big performance issue, then we probably need to cache extracted subimages; exactly how we store them depends on how to maximize performance of the native APIs cairo is using. Or alternatively we could bite the bullet and move forward with adding some kind of subimage API to cairo.
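To illustrate the "sample the wrong pixels" concern, here is a minimal one-dimensional sketch. It is not cairo or Gecko code; `sample_linear` and the eight-pixel `sheet` are made up for the example. It shows how linear filtering near the edge of a sprite picks up colour from the neighbouring sprite unless reads are clamped to the sprite's own bounds, which is the effect that extracting the piece and padding its edges is meant to achieve.

```python
import math


def sample_linear(row, x, lo, hi):
    """Linearly interpolate `row` at continuous coordinate x.

    Pixel i covers [i, i + 1) with its centre at i + 0.5. Reads are clamped
    to the index range [lo, hi), which mimics padding the edge pixels outward.
    """
    f = x - 0.5
    i0 = math.floor(f)
    t = f - i0
    a = row[min(max(i0, lo), hi - 1)]
    b = row[min(max(i0 + 1, lo), hi - 1)]
    return a * (1 - t) + b * t


# Hypothetical sheet: a black 4-pixel sprite next to a white 4-pixel sprite.
sheet = [0, 0, 0, 0, 255, 255, 255, 255]

# Sampling near the right edge of the black sprite.
bleed = sample_linear(sheet, 3.9, 0, len(sheet))  # clamped to the whole sheet
padded = sample_linear(sheet, 3.9, 0, 4)          # clamped to the sprite only

print(bleed, padded)
```

With the whole sheet available, the sample is a grey mix of both sprites; clamped to the sprite it stays black. The cost discussed in this bug is that getting the clamped behaviour currently means building a temporary surface for the piece on every paint.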
Comment 10•15 years ago
(In reply to comment #6)
>
> Instruments showed 95% of allocations coming from here.
Shark malloc trace had something similar to say with "Record Only Active Blocks" unchecked.
Comment 12•15 years ago
> with "Record Only Active Blocks" unchecked.
Oh, without that it only records the blocks that are still alive after the profile ends, it seems? Bah!
Comment 13•15 years ago
(In reply to comment #9)
> The testcase is using background-position and transforms, so I suspect image
> drawing is going through the path where we create a temporary image which is
> the piece of the image we need to render, so we can EXTEND_PAD it and not
> sample the wrong pixels.
Yeah, that's my guess as well.
Comment 14•15 years ago
I tried the suggestion in comment 9 paragraph 2. That dropped the paint time from 80% to 53% with supervisor callstacks hidden, and dropped vm_fault from 20% to 2%. The animation also looks much smoother. There seems to be no more moz_cairo_paint_with_alpha under imgFrame::Draw.
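As a rough sanity check on what those percentages mean for overall speed, here is a small sketch. It assumes the non-paint work takes the same absolute time before and after the change, and that both percentages are shares of total time with supervisor callstacks hidden; neither assumption is confirmed by the profile. `overall_speedup` is a name made up for this example.

```python
def overall_speedup(paint_share_before, paint_share_after):
    """Estimate the total speedup from a drop in the paint share of the profile.

    Assumes non-paint time is unchanged in absolute terms.
    """
    other = 1.0 - paint_share_before               # non-paint time, old total = 1
    new_total = other / (1.0 - paint_share_after)  # new total in the same units
    return 1.0 / new_total


print(round(overall_speedup(0.80, 0.53), 2))
```

Under those assumptions the estimate comes out at roughly 2.35 times faster overall.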
Comment 15•15 years ago
WebKit is drawing the images as background tiles at 1:1 scale, then somehow scaling down the rendered image according to the transform. (They also clip the destination image when they're drawing sprites, rather than use source clipping.)
Comment 16•15 years ago
FWIW,
1. Open http://mrdoob.com/projects/chromeexperiments/depth_of_field/ with Opera...
2. ???
3. Profit!
It's fast!
Comment 17•14 years ago
(In reply to comment #14)
> I tried the suggestion in comment 9 paragraph 2. That dropped the paint time
> from 80% to 53%
Or, put another way, it made the testcase about 2.5 times faster... Seems like the tail end of comment 9 might be worth looking into.
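For what the "cache extracted subimages" idea from comment 9 could look like in outline, here is a minimal sketch of a small LRU cache keyed on image and sub-rectangle. Everything in it is hypothetical (`SubimageCache`, `fake_extract`, the entry limit); how real surfaces would be stored depends on the native APIs cairo is using, as that comment notes.

```python
from collections import OrderedDict


class SubimageCache:
    """Tiny LRU cache mapping (image_id, rect) to an extracted surface."""

    def __init__(self, max_entries=64):
        self._entries = OrderedDict()
        self._max = max_entries
        self.extractions = 0  # how many times the slow path actually ran

    def get(self, image_id, rect, extract):
        key = (image_id, rect)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        surface = extract(image_id, rect)   # slow path: build temporary surface
        self.extractions += 1
        self._entries[key] = surface
        if len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return surface


def fake_extract(image_id, rect):
    # Stand-in for copying the sub-rectangle into its own padded surface.
    return ("surface", image_id, rect)


cache = SubimageCache(max_entries=2)
for _ in range(100):  # the same sprite painted on 100 frames
    cache.get("sprites.png", (64, 32, 32, 32), fake_extract)
print(cache.extractions)
```

The sprite is extracted once and reused on later paints, instead of building a temporary surface every frame. A real cache would also need invalidation when the image changes and a memory-based limit rather than an entry count.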
Updated•14 years ago
Summary: Much slower that chrome in demo due to painting being slow → Much slower than Chrome in demo due to temporary surfaces in background image painting
Updated•14 years ago
Whiteboard: chromeexperiments
Comment 18•13 years ago
Still much faster in Chrome than Firefox trunk. Testing on Win7 w/ D2D enabled.
Updated•12 years ago
Component: Layout → Graphics
Comment 19•12 years ago
Much smoother in IE10 compared to Chrome.
Updated•2 years ago
Severity: normal → S3
Comment 20•2 years ago
This demo performs great now.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME