Closed
Bug 638241
Opened 14 years ago
Closed 14 years ago
DoDrawImageSecurityCheck is 10% of the profile in FishIE Tank
Categories
(Core :: Graphics: Canvas2D, defect)
Tracking
()
RESOLVED
FIXED
mozilla5
Tracking | Status | |
---|---|---|
blocking2.0 | --- | - |
People
(Reporter: roc, Assigned: roc)
References
()
Details
(Whiteboard: [fx4-rc-ridealong])
Attachments
(3 files, 2 obsolete files)
(deleted),
patch
|
joe
:
review+
dveditz
:
approval2.0-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
joe
:
review+
jrmuizel
:
review+
shaver
:
review+
dveditz
:
approval2.0-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
My xperf profile is a bit messy, perhaps due to JM stacks, but it appears to show 10% of our main thread in CanvasUtils::DoDrawImageSecurityCheck. That sounds dumb.
Comment 1•14 years ago
|
||
Possibly, yes. Security checks are slow, sadly.
We could try to have some sort of cache, or could try to rewrite security checks to be faster, I guess... We can't just get rid of the check.
Assignee | ||
Comment 2•14 years ago
|
||
In fact I'm seeing even more time spent in nsLayoutUtils::SurfaceFromElement; some in nsXPConnect::Push (needed due to bug 604262, which can be fixed in another way), plus a bunch of overhead doing QIs and examining images.
So, I'm thinking of adding to nsHTMLCanvasElement a caching map from image elements to their gfxASurfaces. The cache only contains elements that have been drawn to the canvas, so if we find a hit in the cache, we don't need to do another security check --- if the canvas needs to be write-only, it already will be. We can make this a time-based cache. Also, the cache only contains nsHTMLImageElements --- the other elements are likely to change, which makes caching harder and less useful. We do need to make sure that the image-request for the image element hasn't changed since the last time we put it in the cache.
Comment 3•14 years ago
|
||
> In fact I'm seeing even more time spent in nsLayoutUtils::SurfaceFromElement;
Yes; I've seen that come up quite a bit in profiles...
The cache thing will help for cases when the same image is banged into the canvas over and over again, yeah. Apart from silly things like fishtank, is this common? I guess for game stuff it might be.
The other thing that would help with the security check but not SurfaceFromElement is to store in the image node itself a lazily computed boolean that indicates whether it's same-origin with its image data. Then we could check that boolean and do a pointer compare on the two NodePrincipal()s. But hten cross-frame access would still be slow.
Assignee | ||
Comment 4•14 years ago
|
||
(In reply to comment #3)
> > In fact I'm seeing even more time spent in nsLayoutUtils::SurfaceFromElement;
>
> Yes; I've seen that come up quite a bit in profiles...
>
> The cache thing will help for cases when the same image is banged into the
> canvas over and over again, yeah. Apart from silly things like fishtank, is
> this common? I guess for game stuff it might be.
I think any canvas usage that a) draws images and b) is dynamic will end up redrawing the same image many times to the same canvas. Games certainly, and they're important, but anything involving images and animation or interaction.
> The other thing that would help with the security check but not
> SurfaceFromElement is to store in the image node itself a lazily computed
> boolean that indicates whether it's same-origin with its image data. Then we
> could check that boolean and do a pointer compare on the two NodePrincipal()s.
> But hten cross-frame access would still be slow.
That's a good idea, but hopefully my cache will subsume it.
Comment 5•14 years ago
|
||
Yeah. We should add noscript notxpcom getters on images to get the imgIRequest quickly; then we can store a pointer to that in the cache and do a fast compare.
Assignee | ||
Comment 6•14 years ago
|
||
Attachment #516496 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 7•14 years ago
|
||
Attachment #516497 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 8•14 years ago
|
||
With these patches, I go from 40fps to 60fps on FishIE tank with 1000 fish. Yeehaw!
Assignee | ||
Comment 9•14 years ago
|
||
Guys, check this out!
Boy I'd like to take these patches for Firefox 4!
Comment 10•14 years ago
|
||
Comment on attachment 516496 [details] [diff] [review]
Part 1: remove useless QI
Please don't lose the curlies, and r=me
Attachment #516496 -
Flags: review?(bzbarsky) → review+
Comment 11•14 years ago
|
||
Comment on attachment 516497 [details] [diff] [review]
Part 2: cache
>+++ b/content/canvas/public/CanvasImageCache.h
I assume this is in public/ and exported just so you can include it in the layoutmodule stuff? I'd rather just add content/canvas/src to the LOCAL_INCLUDES for layoutmodule if it's not there already and not export this header.
>+++ b/content/canvas/src/CanvasImageCache.cpp
>+ nsRefPtr<nsIDOMElement> mImage;
nsCOMPtr? Not like it really matters much....
The rest of this looks good. Can you file a followup (on me if desired) about having a non-COM version of GetRequest?
Attachment #516497 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 12•14 years ago
|
||
Will do.
Assignee | ||
Comment 13•14 years ago
|
||
Assignee | ||
Comment 14•14 years ago
|
||
Assignee | ||
Comment 15•14 years ago
|
||
Assignee | ||
Comment 16•14 years ago
|
||
The profile is still messy. After taking those patches, identifiable hotspots under drawImage are
-- 4% in nsCanvasRenderingContext2D::Redraw. The first 100 updates trigger Invalidate() calls. We can probably make this go away by tweaking our policy: if there more than 100 redraws between paints in the last cycle, Invalidate the whole canvas at the first next redraw instead of waiting for 100 redraws.
-- 2.5% in CanvasImageCache::Lookup. Of that, 2% is in the QI to nsIImageLoadingContent to check the current request!
-- Lots of random cairo and D2D stuff
-- Lots of overhead in various quickstub stuff, around calls to drawImage/save/restore/translate/scale/transform
-- 8% in actual window painting. We could probably reduce that to about 1% if we make canvas participate in empty transactions. Worth doing.
So I reckon we could knock off another 10% fairly easily, but beyond that is going to require improving quickstubs or cairo/D2D.
Assignee | ||
Comment 17•14 years ago
|
||
For the record, with these patches both FF4 and IE9 RC1 hit 60fps for 1000 fish. Hacking the test to support 2000 fish, I get about 34fps for FF4 and 38fps for IE9. Awfully tempting to try to close the gap there!
(I tried Chrome 11 but enabling acceleration didn't seem to work.)
Assignee | ||
Comment 18•14 years ago
|
||
BTW I think what happened here is that we used to be about as fast on FishIE as IE9 beta, then IE9 RC1 seems to have gotten faster and we regressed due to the fix for bug 604262 and maybe other things.
Assignee | ||
Comment 19•14 years ago
|
||
With various simple fixes I'm up to 37fps (2000 fish). Empty transactions for canvas would put us over the top.
I note the test has a useless ctx.save/restore pair.
I also note that in my current profiles we spend 10% of the profile in xpc_qsUnwrap*. Despite the fact that our save/restore is pretty inefficient (we save and restore the cairo context *and* maintain our own state stack in nsCanvasRenderingContext2D), for save(), restore(), translate(), scale() and transform(), qs overhead is about the same as the cost of the actual functionality.
I also note that we spend 10% of the profile in "?!? <itself>"...
Comment 20•14 years ago
|
||
(In reply to comment #19)
> I also note that we spend 10% of the profile in "?!? <itself>"...
Presumably this is JIT generated code?
Comment 21•14 years ago
|
||
> Of that, 2% is in the QI to nsIImageLoadingContent to check the current request!
Yeah, that part sort of sucks. We should think about a better way to deal with it...
> -- Lots of overhead in various quickstub stuff, around calls to
> drawImage/save/restore/translate/scale/transform
We're hoping to make that better when we redo the DOM bindings. Having a testcase just banging on that stuff would be pretty useful for that work.
> I also note that in my current profiles we spend 10% of the profile in
> xpc_qsUnwrap*
How much of that is arg and how much is this?
Comment 22•14 years ago
|
||
I can confirm the performance boost in FishIETank.
FishIETank:
Fx4 beta 12: 32fps
try server build: 42fps
IE9: 46fps
Speed Reading:
Fx4 beta 12: 48 seconds
try server build: 24 seconds!!! 2x improvement!
Comment 23•14 years ago
|
||
(In reply to comment #22)
> I can confirm the performance boost in FishIETank.
>
> FishIETank:
> Fx4 beta 12: 32fps
> try server build: 42fps
> IE9: 46fps
>
> Speed Reading:
> Fx4 beta 12: 48 seconds
> try server build: 24 seconds!!! 2x improvement!
There should be a patch on the speedreading parity bug somewhere from me that gives Speedreading an additional big boost.
Comment 24•14 years ago
|
||
Put in on trunk after FF4 branched and lets make FF5 rock.
Comment 25•14 years ago
|
||
(In reply to comment #24)
> Put in on trunk after FF4 branched and lets make FF5 rock.
Not so fast. Is .x out? How about picking up the fix on a respun RC? This is a competitive issue. Where's therisk analysis?
/be
Comment 26•14 years ago
|
||
This changes security checks in canvas. I am entirely uncomfortable with landing this in a .x release. It should just wait for 5.
Assignee | ||
Updated•14 years ago
|
Attachment #516510 -
Flags: approval2.0?
Comment 27•14 years ago
|
||
Comment on attachment 516510 [details] [diff] [review]
Part 1 updated
Sorry, we're closed down hard for 2.0.
Attachment #516510 -
Flags: approval2.0? → approval2.0-
Assignee | ||
Comment 28•14 years ago
|
||
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
Asking approval for these patches to land as a ridealong in any respin, or failing that, the next minor update. I do not want to respin just for these patches.
The patches give a large performance improvement in drawImage-heavy canvas benchmarks (see numbers above) --- benchmarks that Microsoft IE9 marketing is heavily focused on, as are tech bloggers.
The risk is relatively low. The new code is isolated and in a well-understood area.
Attachment #516513 -
Flags: approval2.0?
Assignee | ||
Comment 29•14 years ago
|
||
Oops, didn't see the comments here.
Assignee | ||
Comment 30•14 years ago
|
||
Normally I would agree with comments #24 and #26 but right now and in the next few weeks as FF4 and IE9 roll out we are going to be looking at stuff like this:
http://downloadsquad.switched.com/2011/03/01/opera-enables-hardware-acceleration-matches-ie9-beats-firefox/
Plus a lot more focused marketing from Microsoft itself. I think these patches will have a significant impact on how FF4 is perceived.
Updated•14 years ago
|
blocking2.0: --- → ?
Assignee | ||
Comment 31•14 years ago
|
||
Sure, we can catch up and pass IE9 in a dot release or FF5 but if the story is already "IE9 wins" then it's really hard to change that perception.
Comment 32•14 years ago
|
||
Asking for .x in addition/alternative to the approval. I think this is an important marketing piece, at the same time we should absolutely not do anything that has any risk what so ever, except if we have to respin anyway.
Assignee | ||
Comment 33•14 years ago
|
||
(In reply to comment #26)
> This changes security checks in canvas. I am entirely uncomfortable with
> landing this in a .x release. It should just wait for 5.
The security checks only protect against cross-origin image data information leaks. If they fail, they're not exploitable in the usual sense.
Also, if this patch introduces a failure mode for canvas tainting then it's probably a very subtle one and we might well not find it in a normal release cycle anyway. More effective to have more eyes review it --- perhaps someone familiar with imagelib, like you :-).
Comment 34•14 years ago
|
||
Are both parts needed?
Comment 35•14 years ago
|
||
In order for me to even consider thinking about changing my mind, we will have to have significant testing (and by testing I mean "trying to break this") on cross-origin drawImage. If there aren't automated tests for that, they need to be written as part of any patch that gets landed for this.
(And that's the case regardless of what branch this lands on.)
Comment 36•14 years ago
|
||
I am not too worried about the security implications here, I am much more worried about a lot of new code. Part 1 looks like a ride-along if there is a respin, part 2 scares me. I would love to have this though.
Assignee | ||
Comment 37•14 years ago
|
||
(In reply to comment #34)
> Are both parts needed?
The first part is less significant, but it's a true no-brainer. We're QIing to nsINode, which is a static superclass of the type we're QIing from!
Assignee | ||
Comment 38•14 years ago
|
||
(In reply to comment #35)
> In order for me to even consider thinking about changing my mind, we will have
> to have significant testing (and by testing I mean "trying to break this") on
> cross-origin drawImage. If there aren't automated tests for that, they need to
> be written as part of any patch that gets landed for this.
test_bug397524 in test_canvas.html tests that basic cross-origin loads and loads redirected cross-origin clear the origin-clean flag.
Comment 39•14 years ago
|
||
Whats the perf impact of part1 and part2?
Assignee | ||
Comment 40•14 years ago
|
||
Comment on attachment 516510 [details] [diff] [review]
Part 1 updated
This is really, unbelievably safe.
Attachment #516510 -
Flags: review?(joe)
Attachment #516510 -
Flags: approval2.0?
Attachment #516510 -
Flags: approval2.0-
Assignee | ||
Updated•14 years ago
|
Attachment #516513 -
Flags: review?(joe)
Assignee | ||
Updated•14 years ago
|
Attachment #516513 -
Flags: review?(jmuizelaar)
Assignee | ||
Updated•14 years ago
|
Attachment #516513 -
Flags: review?(shaver)
Assignee | ||
Comment 41•14 years ago
|
||
Almost all the perf impact is in part2. part1 is actually bypassed by the cache in part2. We can drop part1 if it's just confusing things (it is, however, extraordinarily safe).
Assignee | ||
Comment 42•14 years ago
|
||
(In reply to comment #35)
> In order for me to even consider thinking about changing my mind, we will have
> to have significant testing (and by testing I mean "trying to break this")
I've already thought about how to break this and haven't come up with anything (obviously, or I would have done things differently :-) ). So it'd be best to have other people trying to break it.
I will add a mochitest to verify that if we draw a same-origin <img>, then change the source to be cross-origin and draw the same <img> again, we clear the origin-clean flag (and draw the new image, not the old image). The patch handles that by storing the element's imgIRequest (with a strong ref!) and checking that it hasn't changed before we use that entry during cache lookup.
I'm also going to investigate if changes to document.domain might affect any of this.
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
a lot of it is whitespace/indent changes for the cache-miss case.
some indentation trivia in the Makefiles
cache logic looks sound, I presume it's been tested to not leak all canvases or whatever
canvas access is only on the main thread, so no worries about cache-access races.
does this affect mobile? do they still use canvases in their scrolling whatzit?
r+, but I'm not a peer, so take it for what it's worth.
Attachment #516513 -
Flags: review?(shaver) → review+
Assignee | ||
Comment 44•14 years ago
|
||
OK, so this testcase shows that my patch changes behavior with respect to document.domain changes:
http://people.mozilla.org/~roc/taint-after-setting-domain.html
After changing the domain to mozilla.org, it can still draw the image from people.mozilla.org without tainting the canvas, which is wrong.
This isn't a security bug, since it doesn't let the page do anything it couldn't already do, but I'll fix it anyway.
Assignee | ||
Comment 45•14 years ago
|
||
(In reply to comment #43)
> some indentation trivia in the Makefiles
Will fix.
> cache logic looks sound, I presume it's been tested to not leak all canvases or
> whatever
Yes.
> does this affect mobile? do they still use canvases in their scrolling whatzit?
No.
Assignee | ||
Comment 46•14 years ago
|
||
(In reply to comment #44)
> OK, so this testcase shows that my patch changes behavior with respect to
> document.domain changes:
> http://people.mozilla.org/~roc/taint-after-setting-domain.html
> After changing the domain to mozilla.org, it can still draw the image from
> people.mozilla.org without tainting the canvas, which is wrong.
Scrub that, the new behavior is fine. http://www.w3.org/TR/html5/the-canvas-element.html says that the origin-clean check is based on the document's origin, but setting document.domain only changes the effective script origin, NOT the document origin. IE9, Chrome and Opera all behave the same as with my patch.
Nothing to see here, move along.
Assignee | ||
Comment 47•14 years ago
|
||
(I guess the trunk behavior of that testcase is a bug that we need to fix by introducing a distinction between 'origin' and 'effective script origin' in our code.)
Comment 48•14 years ago
|
||
Comment on attachment 516510 [details] [diff] [review]
Part 1 updated
Obviously correct.
Attachment #516510 -
Flags: review?(joe) → review+
Assignee | ||
Comment 49•14 years ago
|
||
In fact, trunk currently allows a page that changes document.domain to get the image pixels of an image in the new domain, which the spec does not allow, although I don't know if that's really bad.
So the logic for why the patch is correct for the origin-clean flag goes like this:
1) A given imgIRequest will consistently return the same principal when we call it after have decoded the first frame of the image. (I haven't actually looked at code to verify this, but I can't think of a situation where it wouldn't be true.)
2) If CanvasImageCache returns a miss we do the canvas security check just as on trunk, so we'll set the origin-clean flag if necessary.
3) Otherwise, CanvasImageCache returns a hit.
3.1) If the canvas origin-clean flag is currently clear, then it doesn't matter whether we clear it again or not.
3.2) Otherwise the canvas origin-clean flag is currently set. Since the origin-clean flag can never be explicitly set, it must never have been cleared on the element. Since the CanvasImageCache contains an entry for this image request and canvas element, the first frame of the image for this image request must have been previously painted to the canvas, without clearing the origin-clean flag. Therefore, at that time the principal for the image is "the same" as the canvas element's principal.
3.3) Per #1, the image request principal for the current paint is equal to the image request principal for the previous paint. Therefore the image request principal for the current paint is "the same" as what the canvas element's principal *used to be*.
3.4) Now per HTML5 the canvas element's origin can't change so we would be guaranteed that the image request principal for the current paint is "the same" as the current canvas element principal. But we don't do that. HOWEVER, we *do* know that the image request's current principal *could have been* rendered to a canvas by the page without setting the origin-clean flag, since it happened before. So the page could have already obtained the pixel data before setting document.domain (or changing its principal in some other unknown way). So even if we don't clear the origin flag when we should, we're not giving the page the ability to do anything it couldn't have done another way.
Comment 50•14 years ago
|
||
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
This looks correct to me.
Things I considered:
- Image principals start out as null and change to the correct principal in OnStartRequest. However, we simply ignore images that haven't finished loading, so this doesn't hurt us.
- Image principals stay otherwise unchanged from within imagelib, though script or other code can change the principal on any given Image js element. I don't know enough about how & when principals change to say if that can cause problems.
- nsLayoutUtils is not exported from libxul, so no external addons can depend on it and we can change its return values (eg SurfaceFromElementResult) willy-nilly.
>diff --git a/layout/build/Makefile.in b/layout/build/Makefile.in
>--- a/layout/build/Makefile.in
>+++ b/layout/build/Makefile.in
>@@ -317,16 +317,17 @@ include $(topsrcdir)/config/rules.mk
> LOCAL_INCLUDES += -I$(srcdir)/../base \
> -I$(srcdir)/../generic \
> -I$(srcdir)/../forms \
> -I$(srcdir)/../tables \
> -I$(srcdir)/../style \
> -I$(srcdir)/../xul/content/src \
> -I$(srcdir)/../xul/base/src \
> -I$(topsrcdir)/content/base/src \
>+ -I$(topsrcdir)/content/canvas/src \
> -I$(topsrcdir)/content/html/content/src \
> -I$(topsrcdir)/content/html/document/src \
> -I$(topsrcdir)/content/html/style/src \
> -I$(topsrcdir)/content/xslt/src/base \
> -I$(topsrcdir)/content/xslt/src/xml \
> -I$(topsrcdir)/content/xslt/src/xpath \
> -I$(topsrcdir)/content/xslt/src/xslt \
> -I$(topsrcdir)/content/xul/content/src \
Can you fix the indentation there?
Attachment #516513 -
Flags: review?(joe) → review+
Assignee | ||
Comment 51•14 years ago
|
||
There's also a much simpler argument:
When CanvasImageCache returns a hit and the canvas element is currently origin-clean, it returns a gfxASurface containing data that was previously painted into the canvas without clearing the origin-clean flag. Therefore the page already could have obtained the pixel data in gfxASurface. Therefore painting the same data from the same surface again without clearing the origin-clean flag is not going to give the page access to anything it couldn't already have.
This argument relies on the contents of the gfxASurface that we obtained for the first frame of the image never changing. I'm pretty sure that's true but not 100% sure.
This argument implies that the GetImageRequest check is not needed for security. It's still needed for correctness though since we don't want to paint the old image if we've changed the <img src> between drawImages.
Updated•14 years ago
|
blocking2.0: ? → .x+
Whiteboard: [fx4-rc-ridealong]
Comment 52•14 years ago
|
||
Comment on attachment 516510 [details] [diff] [review]
Part 1 updated
Removing approval requests; we'll consider this if we do a second RC.
Attachment #516510 -
Flags: approval2.0? → approval2.0-
Updated•14 years ago
|
Attachment #516513 -
Flags: approval2.0? → approval2.0-
Comment 53•14 years ago
|
||
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
Do we have tests for using drawImage with animated images?
(Seems like CanvasImageCache::Lookup does handle that case)
Assignee | ||
Comment 54•14 years ago
|
||
I'm going to write such a test tomorrow. Thanks Olli!
Assignee | ||
Comment 55•14 years ago
|
||
Actually test_canvas.html already has a test to verify that the first frame of an animated GIF is drawn.
Assignee | ||
Comment 56•14 years ago
|
||
This adds a test to verify that painting an <img>, changing its src attribute and then painting the <img> again paints the new image.
Comment 57•14 years ago
|
||
(In reply to comment #21)
> We're hoping to make that better when we redo the DOM bindings. Having a
> testcase just banging on that stuff would be pretty useful for that work.
Do you have any more information on the work happening here?
Comment 58•14 years ago
|
||
Bug 622298 is the tracker. Past that, things are still in a design stage; we hope to have some code soonish, though.
Comment 59•14 years ago
|
||
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
In SurfaceFromElement you could remove the QI to nsINode, remove the null-checks and use |canvas| instead in the one place that it's actually used (node->NodePrincipal()).
Comment 60•14 years ago
|
||
Actually, I guess you can't since we use that check to make sure we've got a real element, not a fake JS object.
Assignee | ||
Comment 61•14 years ago
|
||
Comment on attachment 516510 [details] [diff] [review]
Part 1 updated
Re-requesting approval to land these performance patches on mozilla-central. I expect them to be relatively less risky for mobile.
Attachment #516510 -
Flags: approval2.0- → approval2.0?
Assignee | ||
Comment 62•14 years ago
|
||
Comment on attachment 516513 [details] [diff] [review]
Part 2 updated
Re-requesting approval to land these performance patches on mozilla-central. I expect them to be relatively less risky for mobile.
Attachment #516513 -
Flags: approval2.0- → approval2.0?
Updated•14 years ago
|
Attachment #516513 -
Flags: review?(jmuizelaar) → review+
Comment 63•14 years ago
|
||
Is this ready to go? Also, which patches should land here?
Assignee | ||
Updated•14 years ago
|
Attachment #516496 -
Attachment is obsolete: true
Assignee | ||
Updated•14 years ago
|
Attachment #516497 -
Attachment is obsolete: true
Assignee | ||
Comment 64•14 years ago
|
||
Comment 65•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/592bb8168d61
http://hg.mozilla.org/mozilla-central/rev/64b858e79f89
http://hg.mozilla.org/mozilla-central/rev/7dc400c4b8ce
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.2
Comment 66•14 years ago
|
||
Firefox 4 seems to be doing fine without this, and Firefox 5 will be that much better. release-drivers aren't seeing a compelling case for landing this in the mozilla2.0 repo so not blocking. Please make the case and renominate if we've missed something that tips the scales.
blocking2.0: .x+ → -
Updated•14 years ago
|
Attachment #516510 -
Flags: approval2.0? → approval2.0-
Updated•14 years ago
|
Attachment #516513 -
Flags: approval2.0? → approval2.0-
You need to log in
before you can comment on or make changes to this bug.
Description
•