Closed Bug 1155249 Opened 10 years ago Closed 9 years ago

Scale then Copy Repeated Images in layout

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: mchang, Assigned: mchang)

References

(Blocks 1 open bug)

Details

(Whiteboard: gfx-noted)

Attachments

(2 files, 11 obsolete files)

(deleted), application/zip
Details
(deleted), patch
mchang
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1073209 +++ Instead of having repeated images be drawn within platform specific APIs such as OS X CGContextDrawTiledImage vs CGContextDrawImage, we should do this optimization in layout so it works across all platforms. The general idea is that we should: 1) Scale the image 2) memcpy the scaled image repeatedly until we fill the rect Figure out how to deal with non pixel-aligned situations.
Assignee: nobody → mchang
Status: NEW → ASSIGNED
Hardware: x86 → All
No longer blocks: 902639, 970311
The difficulty will be in how we handle scales that don't transform the background images to full pixel amounts.
Attached patch WIP - Scale then Tile repeated images (obsolete) (deleted) — Splinter Review
WIP to scale then tile repeated images to make drawing repeated images faster. It didn't seem to make much different on OS X where the majority of the time spent when browsing http://redrockbiblecamp.com/cyclathon/ from bug 1164601. I also think, looking at the code some more, that doing it in gfxDrawable is too late in the pipeline and might be better to do it somewhere in imgFrame instead.
Can you attach a profile? (Ideally with proper symbols?)
(In reply to Mason Chang [:mchang] from comment #2) > I also think, looking at the code some more, that doing it in > gfxDrawable is too late in the pipeline and might be better to do it > somewhere in imgFrame instead. Can you expand on that?
(In reply to Mason Chang [:mchang] from comment #4) > Profiles: > > Nightly > http://people.mozilla.org/~bgirard/cleopatra/ > #report=defe578eb2664ce83f454985f24dcd05679fe2aa > > With Scaled then Tile image patch > http://people.mozilla.org/~bgirard/cleopatra/ > #report=18c32ff1308da1ed27b28c9f111c341bc7e70210 What are the source and destination sizes for these again?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6) > (In reply to Mason Chang [:mchang] from comment #4) > > Profiles: > > > > Nightly > > http://people.mozilla.org/~bgirard/cleopatra/ > > #report=defe578eb2664ce83f454985f24dcd05679fe2aa > > > > With Scaled then Tile image patch > > http://people.mozilla.org/~bgirard/cleopatra/ > > #report=18c32ff1308da1ed27b28c9f111c341bc7e70210 > > What are the source and destination sizes for these again? The source image is 1680x900 and scaled to 2240x1200 (In reply to Seth Fowler [:seth] from comment #5) > (In reply to Mason Chang [:mchang] from comment #2) > > I also think, looking at the code some more, that doing it in > > gfxDrawable is too late in the pipeline and might be better to do it > > somewhere in imgFrame instead. > > Can you expand on that? I was thinking in conjunction with bug 1166407, imgFrame would check the cache for a scaled image. If it doesn't exist, imgFrame would scale then cache the resulting source surface, create a new drawable, then the drawable would fill the target rect with the scaled image via tiling. The patch as it stands now checks in gfxDrawable if a scaling exists, does the scaling, then tiles the image across the fill rect. I thought it would make more sense to keep the scaling information/cache in imgFrame since scaling is expensive on most platforms. Especially if we want to put images in layers on the GPU and we have to do that in layout, it seems like the earlier we do these things in layout, the better rather than pushing those decisions to the gfx backends. Does that make sense?
(In reply to Mason Chang [:mchang] from comment #7) > (In reply to Jeff Muizelaar [:jrmuizel] from comment #6) > > (In reply to Mason Chang [:mchang] from comment #4) > > > Profiles: > > > > > > Nightly > > > http://people.mozilla.org/~bgirard/cleopatra/ > > > #report=defe578eb2664ce83f454985f24dcd05679fe2aa > > > > > > With Scaled then Tile image patch > > > http://people.mozilla.org/~bgirard/cleopatra/ > > > #report=18c32ff1308da1ed27b28c9f111c341bc7e70210 > > > > What are the source and destination sizes for these again? > > The source image is 1680x900 and scaled to 2240x1200 Where is the destination size coming from? One thing, that would be worth trying, to rule out lupus, would be to keep the temporary surface alive and reuse it for scaling operation. I'd interested to see how that affects the timing. I did some local testing of how long this operation should take and I got 22ms with CoreGraphics and 6ms with Skia. That seems like it should be fast enough to avoid caching.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #8) > (In reply to Mason Chang [:mchang] from comment #7) > > (In reply to Jeff Muizelaar [:jrmuizel] from comment #6) > > > (In reply to Mason Chang [:mchang] from comment #4) > > > > Profiles: > > > > > > > > Nightly > > > > http://people.mozilla.org/~bgirard/cleopatra/ > > > > #report=defe578eb2664ce83f454985f24dcd05679fe2aa > > > > > > > > With Scaled then Tile image patch > > > > http://people.mozilla.org/~bgirard/cleopatra/ > > > > #report=18c32ff1308da1ed27b28c9f111c341bc7e70210 > > > > > > What are the source and destination sizes for these again? > > > > The source image is 1680x900 and scaled to 2240x1200 > > Where is the destination size coming from? Doing a zoom then scrolling. > > One thing, that would be worth trying, to rule out lupus, would be to keep > the temporary surface alive and reuse it for scaling operation. I'd > interested to see how that affects the timing. > > I did some local testing of how long this operation should take and I got > 22ms with CoreGraphics and 6ms with Skia. That seems like it should be fast > enough to avoid caching. Is the "this operation" the scaling or keeping the temporary surface alive? 6 ms should be fast enough or 22 ms? Skia has a scaled image cache, I wonder if it's hitting Skia's cache.
This operation is the scaling. 22 ms is probably fast enough, 6 ms certainly should be. Looking at a profile of my test program shows that skia is spending all of the time in the scaling functions so I don't think it's hitting any kind of cache. I don't think skia has a cache for this kind of thing.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #10) > This operation is the scaling. 22 ms is probably fast enough, 6 ms certainly > should be. Looking at a profile of my test program shows that skia is > spending all of the time in the scaling functions so I don't think it's > hitting any kind of cache. I don't think skia has a cache for this kind of > thing. Hmm, 22 ms is already missing one frame and it's just for the background image. It might be fast enough w/ APZ, but will lead to really janky scrolling w/o APZ. I think Skia has a scaled image cache [1], or that's a really misleading name. [1] https://dxr.mozilla.org/mozilla-central/source/gfx/skia/trunk/src/core/SkScaledImageCache.h
Attached patch Example of expensive scaled images (obsolete) (deleted) — Splinter Review
Using this patch and the test case in bug 1073209, I can reproduce 20 ms overhead to scale a single image. The image itself is scaled from 1680x900 to either +9% or +25% depending on zoom level. I am not measuring the time to tile the image. I just created another DrawTarget, set the transform to scale, and draw the surface. I have a feeling I'm doing something really wrong or am missing something obvious. @Jeff - Can you please take a look at this patch? It creates a new DrawTarget and scales the background image, and it takes up to 20 ms. It doesn't paint or anything yet, but I just wanted to measure the time it took to scale. Thanks!
Attachment #8607684 - Attachment is obsolete: true
Attachment #8633202 - Flags: feedback?(jmuizelaar)
Attached patch Current WIP where things make no sense (obsolete) (deleted) — Splinter Review
Attachment #8633202 - Attachment is obsolete: true
Attachment #8633202 - Flags: feedback?(jmuizelaar)
Attached file test case (deleted) —
Attached patch WIP - Scale then repeat images (obsolete) (deleted) — Splinter Review
Current WIP. Creates an intermediate surface the size of the scaled needed rectangle. Draws the scaled image into the temporary surface then draws the scaled image into the destination. I think we need an intermediate surface, otherwise we'll hit the original problem of scaling at every call to the destination draw target which is tiled, and caused the original regression. However, the timings with this approach are much worse, which is very confusing. Scale factor: 2.000000, 2.000000 Drawing scaled image Rect: A NeededRect (120.000000, 0.000000) w: 1441.000000, h: 795.000000 Alloc: 0.051533, scaled: 39.179494, dest: 42.522245 Draw Scaled Image time: 42.577871 Regular drawable time: 34.364314 Scale factor: 2.222222, 2.222222 Drawing scaled image Rect: A NeededRect (192.000000, 0.000000) w: 1297.000000, h: 716.000000 Alloc: 0.096642, scaled: 36.954609, dest: 55.014265 Draw Scaled Image time: 55.081958 Regular drawable time: 30.778283 Scale factor: 2.400000, 2.400000 Drawing scaled image Rect: A NeededRect (240.000000, 0.000000) w: 1201.000000, h: 663.000000 Alloc: 0.080673, scaled: 32.264034, dest: 35.647018 Draw Scaled Image time: 35.716288 Regular drawable time: 33.341899 Scale factor: 2.608696, 2.608696 Drawing scaled image Rect: A NeededRect (288.000000, 0.000000) w: 1105.000000, h: 610.000000 Alloc: 0.073015, scaled: 65.004633, dest: 80.162574 Draw Scaled Image time: 80.219204 Regular drawable time: 33.927553 So allocating the surface is pretty cheap, but drawing the scaled image is expensive. Dest refers to drawing the scaled image into the destination surface. Regular drawable time here refers to just calling the drawable and scale the image in each tile. What's really disturbing is that this can be 2.4x slower. I think I'm probably doing something wrong or scaling in 1 go is just a lot slower than doing smaller scales on a per tile basis for some reason? @Jeff - Can you please take a look at this? Thanks!
Attachment #8634947 - Attachment is obsolete: true
Flags: needinfo?(jmuizelaar)
In OS X instruments, I'm getting this inverted call stack as taking the most time in the profiles: Running Time Self (ms) Symbol Name 60.0ms 31.0% 60.0 argb32_sample_argb32 60.0ms 31.0% 0.0 argb32_image_mark 60.0ms 31.0% 0.0 argb32_image 60.0ms 31.0% 0.0 ripl_Mark 60.0ms 31.0% 0.0 RIPLayerBltImage 60.0ms 31.0% 0.0 ripc_RenderImage 60.0ms 31.0% 0.0 ripc_DrawImage 60.0ms 31.0% 0.0 CGContextDrawImage 60.0ms 31.0% 0.0 mozilla::gfx::DrawTargetCG::DrawSurface(mozilla::gfx::SourceSurface 56.0ms 29.0% 0.0 DrawScaledImage(gfxDrawable*, gfxContext*, 56.0ms 29.0% 0.0 gfxUtils::DrawPixelSnapped(gfxContext*,
(In reply to Mason Chang [:mchang] from comment #15) > Created attachment 8636344 [details] [diff] [review] > WIP - Scale then repeat images > > Current WIP. Creates an intermediate surface the size of the scaled needed > rectangle. Draws the scaled image into the temporary surface then draws the > scaled image into the destination. I think we need an intermediate surface, > otherwise we'll hit the original problem of scaling at every call to the > destination draw target which is tiled, and caused the original regression. > However, the timings with this approach are much worse, which is very > confusing. > > Scale factor: 2.000000, 2.000000 > Drawing scaled image > Rect: A NeededRect (120.000000, 0.000000) w: 1441.000000, h: 795.000000 > Alloc: 0.051533, scaled: 39.179494, dest: 42.522245 > Draw Scaled Image time: 42.577871 > Regular drawable time: 34.364314 > > Scale factor: 2.222222, 2.222222 > Drawing scaled image > Rect: A NeededRect (192.000000, 0.000000) w: 1297.000000, h: 716.000000 > Alloc: 0.096642, scaled: 36.954609, dest: 55.014265 > Draw Scaled Image time: 55.081958 > Regular drawable time: 30.778283 > > Scale factor: 2.400000, 2.400000 > Drawing scaled image > Rect: A NeededRect (240.000000, 0.000000) w: 1201.000000, h: 663.000000 > Alloc: 0.080673, scaled: 32.264034, dest: 35.647018 > Draw Scaled Image time: 35.716288 > Regular drawable time: 33.341899 > > Scale factor: 2.608696, 2.608696 > Drawing scaled image > Rect: A NeededRect (288.000000, 0.000000) w: 1105.000000, h: 610.000000 > Alloc: 0.073015, scaled: 65.004633, dest: 80.162574 > Draw Scaled Image time: 80.219204 > Regular drawable time: 33.927553 > > So allocating the surface is pretty cheap, but drawing the scaled image is > expensive. Dest refers to drawing the scaled image into the destination > surface. Regular drawable time here refers to just calling the drawable and > scale the image in each tile. What's really disturbing is that this can be > 2.4x slower. I think I'm probably doing something wrong or scaling in 1 go > is just a lot slower than doing smaller scales on a per tile basis for some > reason? I think this might make some sense. I'm guessing the cost of the operation is basically proportional to the number of destination pixels of the scale operation. When using an intermediate makes this number larger we lose. When using an intermediate makes this small enough to eat the cost of doing the extra work (which is especially high with CoreGraphics because it seems to make an additional intermediate) we win. It's important for this patch to only create an intermediate when it makes sense. Note: If we're tiling in device space we should be able to get rid of the extra intermediate that's being created in the CG backend. (Or we could stop using CG :))
Flags: needinfo?(jmuizelaar)
Attached patch Scale Image via DrawSurface (obsolete) (deleted) — Splinter Review
Timings of directly calling draw surface on the destination draw target. Seems a little faster than the current status, although still confusing to me. Scale factor: 2.222222, 2.222222 Rect: Needed Rect (192.000000, 0.000000) w: 1297.000000, h: 716.000000 Draw Scaled Image time: 26.352622 Regular drawable time: 31.248245 Scale factor: 2.400000, 2.400000 Rect: Needed Rect (240.000000, 0.000000) w: 1201.000000, h: 663.000000 Draw Scaled Image time: 26.190218 Regular drawable time: 29.743001 Scale factor: 2.608696, 2.608696 Rect: Needed Rect (288.000000, 0.000000) w: 1105.000000, h: 610.000000 Draw Scaled Image time: 23.653488 Regular drawable time: 29.722015 Scale factor: 2.000000, 2.000000 Rect: Needed Rect (120.000000, 0.000000) w: 1441.000000, h: 795.000000 Draw Scaled Image time: 25.081379 Regular drawable time: 32.126836 Scale factor: 2.222222, 2.222222 Rect: Needed Rect (192.000000, 0.000000) w: 1297.000000, h: 716.000000 Draw Scaled Image time: 28.794398 Regular drawable time: 31.171078 Scale factor: 2.400000, 2.400000 Rect: Needed Rect (240.000000, 0.000000) w: 1201.000000, h: 663.000000 Draw Scaled Image time: 22.734294 Regular drawable time: 28.592733
Attached patch Scale Image via Scaled Surface Pattern (obsolete) (deleted) — Splinter Review
Scales the image via a surface pattern, then call FillRect on the draw target. Scale factor: 2.000000, 2.000000 Rect: Needed Rect (120.000000, 0.000000) w: 1441.000000, h: 795.000000 Draw Scaled Image time: 33.498958 Regular drawable time: 32.478540 Scale factor: 2.222222, 2.222222 Rect: Needed Rect (192.000000, 0.000000) w: 1297.000000, h: 716.000000 Draw Scaled Image time: 30.255121 Regular drawable time: 30.758179 Scale factor: 2.400000, 2.400000 Rect: Needed Rect (240.000000, 0.000000) w: 1201.000000, h: 663.000000 Draw Scaled Image time: 23.884157 Regular drawable time: 28.564923 Scale factor: 2.608696, 2.608696 Rect: Needed Rect (288.000000, 0.000000) w: 1105.000000, h: 610.000000 Draw Scaled Image time: 25.996282 Regular drawable time: 29.143483 Scale factor: 2.000000, 2.000000 Rect: Needed Rect (120.000000, 0.000000) w: 1441.000000, h: 795.000000 Draw Scaled Image time: 27.442695 Regular drawable time: 31.388309
I tested this with another test that has 1024x768 image and repeated 30 times when zoomed out. I created an intermediate surface, downscaled the image once, then repeated the downscaled surface pattern to the required fill rect. Here are the timings: Scale factor: 0.600000, 0.600000 Rect: Needed Rect (-864.000000, 0.000000) w: 4802.000000, h: 2648.000000 Rect: Image Rect (0.000000, 0.000000) w: 1024.000000, h: 768.000000 Draw Scaled Image time: 20.812527 Regular drawable time: 34.471564 Scale factor: 0.600000, 0.600000 Rect: Needed Rect (-864.000000, 0.000000) w: 4802.000000, h: 2648.000000 Rect: Image Rect (0.000000, 0.000000) w: 1024.000000, h: 768.000000 Draw Scaled Image time: 23.749103 Regular drawable time: 33.748639 It's a little faster, not much though.
Next I changed the backend to skia, the timings are a lot better: Scale factor: 0.600000, 0.600000 Rect: Needed Rect (-864.000000, 0.000000) w: 4802.000000, h: 2648.000000 Rect: Image Rect (0.000000, 0.000000) w: 1024.000000, h: 768.000000 Draw Scaled Image time: 2.850728 Regular drawable time: 15.443863 Scale factor: 0.600000, 0.600000 Rect: Needed Rect (-864.000000, 0.000000) w: 4802.000000, h: 2648.000000 Rect: Image Rect (0.000000, 0.000000) w: 1024.000000, h: 768.000000 Draw Scaled Image time: 3.037796 Regular drawable time: 15.287917 I think, instead of focusing efforts on getting the CG backend to be better, this is more data on how skia performs versus CG and that we should spend this time instead on making skia work.
Uses the drawable to create an intermediate scaled image, then tiles that image onto the destination draw target. Sometimes though, it breaks the test case attached to this bug by painting white, but it only happens with APZ.
Comment on attachment 8638242 [details] [diff] [review] Prescale image with an intermediate surface if repeated often Actually disregard comment 22. I updated to master and can no longer reproduce the issue.
Attachment #8638242 - Flags: review?(mstange)
Attachment #8638242 - Attachment is obsolete: true
Attachment #8638242 - Flags: review?(mstange)
Attachment #8638259 - Flags: review?(mstange)
Comment on attachment 8638259 [details] [diff] [review] Prescale image with an intermediate surface if repeated often Cancelling review due to reftest failures: https://treeherder.mozilla.org/#/jobs?repo=try&revision=768237700654 Investigating.
Attachment #8638259 - Flags: review?(mstange)
(In reply to Mason Chang [:mchang] from comment #25) > Comment on attachment 8638259 [details] [diff] [review] > Prescale image with an intermediate surface if repeated often > > Cancelling review due to reftest failures: > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=768237700654 > > Investigating. Looks like it might be some bad rounding of coordinates.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #26) > (In reply to Mason Chang [:mchang] from comment #25) > > Comment on attachment 8638259 [details] [diff] [review] > > Prescale image with an intermediate surface if repeated often > > > > Cancelling review due to reftest failures: > > > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=768237700654 > > > > Investigating. > > Looks like it might be some bad rounding of coordinates. Yeah part of it is. I was rounding the pre-scaled rects then scaling them. Inverting that, to scale the rects, then rounding them fixed some cases but broke others. https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff3353f6f64f More investigating.
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=da4dd30f2d40 Still getting some reftest failures. Seems like no matter what, I'm getting some rounding error due to floating point precision.
Attachment #8638259 - Attachment is obsolete: true
Try - https://treeherder.mozilla.org/#/jobs?repo=try&revision=dba8db5d4ffa This version bails out if the scaled image rect isn't pixel aligned. The Debug oth tests being orange is interesting... from the log it just looks like the manifests are buggy that it's skipping every test. But the reftests are passing.
Attachment #8640523 - Flags: review?(mstange)
Comment on attachment 8640523 [details] [diff] [review] Prescale image with an intermediate surface if repeated often Review of attachment 8640523 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/thebes/gfxUtils.cpp @@ +605,5 @@ > #endif > > +#ifdef MOZ_WIDGET_COCOA > +static bool > +ShouldUseTempSurface(Rect aImageRect, Rect aNeededRect) Please add a comment above this line about the two costs that we need to balance here. @@ +625,5 @@ > + gfxSize scaleFactor = aContext->CurrentMatrix().ScaleFactors(true); > + gfxMatrix scaleMatrix = gfxMatrix::Scaling(scaleFactor.width, scaleFactor.height); > + const float fuzzFactor = 0.01; > + > + // If we aren't scaling, don't go down this path We also shouldn't go down this path if our transform is more than just a scale + translation. In that case, the "unscaled" draw would still be slow. @@ +667,5 @@ > + > + nsRefPtr<gfxContext> tmpCtx = new gfxContext(scaledDT); > + scaledDT->SetTransform(ToMatrix(scaleMatrix)); > + gfxRect gfxImageRect(aImageRect.x, aImageRect.y, aImageRect.width, aImageRect.height); > + aDrawable->Draw(tmpCtx, gfxImageRect, true, GraphicsFilter::FILTER_FAST, 1.0, gfxMatrix()); I think you need to pass aFilter instead of FILTER_FAST here. Otherwise you're cheating and the scaling will look worse than before. CSRD uses FILTER_FAST because it's not scaling.
Updated with feedback from comment 30.
Attachment #8636344 - Attachment is obsolete: true
Attachment #8636655 - Attachment is obsolete: true
Attachment #8636657 - Attachment is obsolete: true
Attachment #8639991 - Attachment is obsolete: true
Attachment #8640523 - Attachment is obsolete: true
Attachment #8640523 - Flags: review?(mstange)
Attachment #8640799 - Flags: review?(mstange)
Comment on attachment 8640799 [details] [diff] [review] Prescale image with an intermediate surface if repeated often Review of attachment 8640799 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/thebes/gfxUtils.cpp @@ +606,5 @@ > > +#ifdef MOZ_WIDGET_COCOA > +// Only prescale a temporary surface if we're going to repeat it often. > +// Scaling is expensive on OS X and without prescaling, we'd scale > +// a repeated every image. However, using a temp surface also potentially uses "a repeated every image" sounds wrong @@ +661,5 @@ > + (int32_t)scaledImageRect.height); > + if (scaledImageSize.width != scaledImageRect.width || > + scaledImageSize.height != scaledImageRect.height) { > + // If the scaled image isn't pixel aligned, we'll get artifacts > + // so we have to take the slow path. Poor zooming. Let's remove the "Poor zooming." part since we might forget to update it once we snap the dest rect in ComputeSnappedImageDrawingParameters (and at that point zooming will no longer hit this early return). @@ +691,5 @@ > + DrawOptions drawOptions(aOpacity, > + CompositionOpForOp(aContext->CurrentOperator()), > + aContext->CurrentAntialiasMode()); > + > + SurfacePattern scaledImagePattern(scaledImage, ExtendMode::REPEAT, Matrix(), ToFilter(aFilter)); This line is definitely too long.
Attachment #8640799 - Flags: review?(mstange) → review+
Carrying r+, updated with feedback from comment 32.
Attachment #8640799 - Attachment is obsolete: true
Attachment #8641095 - Flags: review+
Successful try on OS X since this only affects OS X. https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c9512bb9fb5
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: