Closed Bug 435296 Opened 16 years ago Closed 15 years ago

imagelib should support decode-on-draw

Categories

(Core :: Graphics: ImageLib, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1

People

(Reporter: joe, Assigned: bholley)

References

(Blocks 1 open bug)

Details

(Keywords: memory-footprint, perf)

Attachments

(11 files, 63 obsolete files)

(deleted), image/jpeg
Details
(deleted), image/jpeg
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Dolske
: review+
Details | Diff | Splinter Review
(deleted), patch
mozilla
: review+
Details | Diff | Splinter Review
(deleted), patch
Dolske
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
It should be possible (or possibly required) to pass a cairo surface into imagelib when drawing, which would then decode and draw the image. There are also caching implications for this, because we probably don't want to hold on to a cache of compressed/encoded data if it's a tiny or transparent image whose decoded version is smaller than the encoded version.
Blocks: 435299
Assignee: nobody → joe
Flags: wanted1.9.1?
Priority: -- → P2
Flags: wanted1.9.1?
Flags: wanted1.9.1+
Flags: blocking1.9.1-
stealing the bug from joe.

all mine now. ;-)
Assignee: joe → bobbyholley
Attached patch WIP patch 1 (obsolete) (deleted) — Splinter Review
work on this is going well. Attaching a WIP patch (applied on top of the patch from bug 753).

Some pretty significant re-architecting of libpr0n goes on in this patch - the imgContainer plays a much more central role in things, essentially being the gatekeeper of incoming image data and deciding what to do with it. I just managed to fix the last bug so that firefox runs properly with discarding and decode-on-draw disabled (currently global constants in imgRequest.cpp). I have yet to deal with progressive loading (the main barrier to getting things working with either of those features turned on), but I don't think it should be too tough. Needless to say, I'm pretty fried at the moment so I'll work on that stuff tomorrow.
Depends on: 753
No longer depends on: 435294
Attached patch WIP patch 2 (obsolete) (deleted) — Splinter Review
Updated WIP patch - discarding an decode-on-draw both work. Still need to deal with things like progressive loading and a host of other issues I've made a list of, but it works. yay!
Attachment #386079 - Attachment is obsolete: true
Blocks: 502270
Attached patch WIP patch 3 (obsolete) (deleted) — Splinter Review
Updated the WIP patch to sit on top of tip and joe's updated patch from bug 753.

Joe created bug 502270 for progressive redecoding (or d-o-d decoding), so that issue can be moved there.

The next big issue for this patch is to do a better job getting the size. Nobody's on IRC right now, so I'll just dump my thoughts here (mostly for joe).

Right now, we open a decoder, feed it chunks of data until we hear back from the decoder about the size, and then leave it in limbo until we finish decoding. If it were just a question of having an open decoder sitting around this might be ok. However, in general the same parts of the decoders that read the size of the image also create the (first) frame in the container, which means allocating the whole pixel buffer. So except for multiframe (animated) images, the current implementation brings no memory savings, since we end up holding on to mostly-empty buffers for a full frame of each image.

My general sense is that it's not really feasible to try to fix this problem in a single pass. I don't think we can reliably guess the exact right amount of data for each header 100% of the time, and things like the libpng progressive decode don't look like they have the ability to be rewound. This means that we would need a complicated way of efficiently storing overflow pixels, which sounds really messy.

By far the easiest and cleanest thing to do would be to open the image, wait until we get size info, allow the frame to be allocated and maybe do a few bits of doomed decoding, and then close the decoder and immediately discard the frame, putting the image in a 'discarded' state. The downside here, of course, is that we do a quick allocate-free, and perhaps do a bit of unnecessary decoding. We can't just lie to the decoders and tell them that a frame has been allocated when it hasn't, because the decoders decode directly into the frames' internal buffers.

The uglier, but probably better solution is to modify the decoder interface to allow us to flag a "header-only" decode, whereby the decoder ignores all data processed past the size info (things like libpng will still do some unnecessary processing, but this is probably unavoidable). This also lets us avoid other bits of unnecessary work like ICC profile creation. I'm generally loathe to make significant decoder-side changes because the work is multiplied by 7, but in this case it's probably worth it. Once we have the size, we can mark image as discarded, like above, and continue on. At the moment it's not quite clear to me whether we would want to notify content-side imgIDecoderObservers about this little shenanigan - I need to get a better feel for how things work over there first.

As a minor bonus, getting rid of the partial decoders should simplify the code a tiny bit, since we no longer have any distinction between decode-on-draw images and discarded images after load time.

Joe - curious for your thoughts on this.
Attachment #386405 - Attachment is obsolete: true
I definitely agree that we should have a headers-only decode. And we likely don't want to notify imgIDecoderObservers at all when we're doing a headers-only decode, because we might wreck their state machines - getting an OnStartContainer without any OnStartFrame, for instance.
The other thing to keep in mind is that it seems that your current patch gets rid of ContainerLoader, which is sort of a do-nothing class that just makes it possible to re-decode images without telling anyone about it. AFAICT right now your patch will tell all the DecoderObservers about the re-decoding, which is not the same behaviour as we have now, so it'll have to be tested well. (A potential state machine problem.)
Keywords: footprint, perf
Attached a WIP patch for header-only decoding. Does some pretty ugly setjmp/longjmp in the PNG decoder to bail out at the right time while avoiding expensive initialization and frame creation. This works as far as I can tell, but one nasty side effect of using longjmp is that you can't step over it (ie, as soon as gdb detects a longjmp it breaks at the jump target). So anyone trying to step over a function who's nth grand-child calls WriteFrom on a png decoder will get sucked down the call stack. Curious as to what people think about this.

I've spent the past few days trying to get this to work properly, and have come to the conclusion that partial decoding more or less needs to be shelved until we figure out how we're dealing with decode notifications on the layout side (bug 502270). Under the previous patch, I've discovered that we were more or less always starting with enough data to create a frame, which fired off enough events to get the layout state machines into a usable state. When we start becoming more stingy with notifications, the drawing code gets stuck and refuses to initiate any drawing, which in turn prevents the triggering of the decode-on-draw machinery.

So for the moment I'm switching gears to go figure that stuff out and will return to this once that's settled.
Attached patch WIP patch 3 (updated to tip/753) (obsolete) (deleted) — Splinter Review
Rebased WIP 3 on top of tip and joe's updates in 753. works as before
Attachment #386909 - Attachment is obsolete: true
Attached patch WIP patch 4 (obsolete) (deleted) — Splinter Review
Another WIP patch containing what I've been doing lately. There ended up being quite a bit of coupling between various things, so I made this one big patch - depending on what we decide it can be split up later. At the moment, I'm turning off discarding to focus my debugging time. I'll re-enable it later once everything is working without it.

Aside from numerous changes and bugfixes, the two main results of this patch are the integration of progressive re-decoding and header-only decodes. Applied on its own, this patch produces a basic but working progressive decode implementation where you can load up an html page with a few big images, tab over, watch the first one decode before your eyes, and then scroll down and watch the second one decode once it comes into view. I say "basic" because I think it needs a fair bit of tweaking, and perhaps some finer heuristics, to get the right balance of decoding speed and interactivity.
The one main issue at the moment is that, with this patch, some UI images (such as the URL bar) don't paint at all. I've spent some time on it, and I've more or less figured out the cause. In a nutshell, at the moment we "touch" the image to prod it into decoding when someone tries to access the frame data (drawing it, among other things). Unfortunately, some layout-side code is really timid about interacting with the images, and doesn't try to do anything with them until it sees STATUS_FRAME_COMPLETE when it queries GetImageStatus on the imgRequest. The frame won't be complete until we decode, we don't decode until we draw, and the layout-side code doesn't want to draw until the frame is complete. So we more or less have a chicken-egg scenario. Kludging GetImageStatus to touch the image makes things render wonderfully, but GetImageStatus is called often enough that this approach more or less reverts us to decode-on-load.

I haven't had a chance to look at the layout code in much detail, but one possible solution would be to expose the notion of a "touch" to layout, giving it a function to call when it would like to draw the image. Curious of anyone has any feedback on this idea.

This is still far from complete. It still doesn't render jpegs right all of the time, and I didn't even add support for header-only BMP/ICO/ICON/XBM/GIF. Additionally, I have a huge list of things that need doing, not the least of which is making sure we handle broken images properly. Nevertheless, we're converging on a patch that at least starts to do something pretty.
Attachment #387859 - Attachment is obsolete: true
Attachment #388206 - Attachment is obsolete: true
Attached patch WIP patch 5 (obsolete) (deleted) — Splinter Review
Updated WIP patch.

I've fixed some of the bugs mentioned in the above comment, and things are starting to look pretty good. I've got JPEGs, PNGs, GIFs, and BMPs working, and ICO, Icon, and XBM are on the way (just didn't get around to it today). I fixed the chrome painting bug (I elected to use STATUS_LOAD_COMPLETE in place of STATUS_FRAME_COMPLETE per roc's suggestions), as well as numerous other assertions and bugs. So right now, things are pretty usable. At the moment, my list of things to do next is the following (in order):

1) finish up support for the rest of the decoders
2) add code to switch from header-only to full decode if we get a touch mid-header-decode
3) figure out what's going on with the leaked URLs reported on shutdown
4) remove mNumFrames (dead concept), and make sure layout is ok with that
5) figure out stuff with animated images
6) get things working with discarding enabled
7) much more...
Attachment #388728 - Attachment is obsolete: true
Attached patch WIP 5 rebased to tip/753 (obsolete) (deleted) — Splinter Review
Quite a few conflicts after the latest update to 753. Rebased on top of tip/753. Haven't done significant testing, but it appears to work as before.
Attachment #389025 - Attachment is obsolete: true
Attached patch WIP 6 (obsolete) (deleted) — Splinter Review
Latest WIP patches. Knocked off the first two items in the above list, and fixed a whole bunch of other bugs (I spent a few hours hunting down a race condition with the single-threaded event-workers - bluh). This patch is now very usable, and doesn't crash or have any serious rendering bugs. The xul-popup with larry doesn't render right, and I haven't given animated images more than a cursory test. But nonetheless, things look better.
Attachment #389099 - Attachment is obsolete: true
Attached patch WIP 7 (obsolete) (deleted) — Splinter Review
Updated WIP patch. A good deal of progress, including:
1) Abolishing 'mNumFrames', which doesn't make sense in a d-o-d world
2) timer-based progressive decoding
3) several other progressive-decode optimizations
4) sending onstartcontainer only once
5) got border-image (and larry!) working
6) other fixes

The UI is pretty snappy now.
Attachment #389139 - Attachment is obsolete: true
Attached patch WIP 8 (obsolete) (deleted) — Splinter Review
Finally fixed that damn memory leak, started running tests, fixed other small bugs, and got discard up and running. I ran into 2 issues that I want feedback on, which I'm posting in 2 separate comments below.
Attachment #389951 - Attachment is obsolete: true
====Reftests / Canvas->DrawWindow()========

Currently almost all of the libpr0n reftests fail with this patch applied, as the rendered images all come out blank. This is due to the fact that the reftast framework waits for load to complete, and then calls drawWindow to draw the window to the canvas, finally comparing the results. Thus, Draw() is in fact called on the images, but they never get the chance to draw themselves. On face it would appear that we should see partially-rendered images, but one of the UI-snapiness optimizations I cooked up was to defer _all_ drawing and return immediately, so that tabbing over to a page with undecoded images can happen instantaneously. All that aside, we need to figure out what kind of behavior we want for drawWindow. One way or another, we want to be able to say something like "wait until the page is ready to be rendered as you would see it on the screen, then render it". There are this two apparent vectors to fix this in the decode-on-draw case.

The first is to have an event analagous to 'load', but that takes into account image decoding. We would then of course need some way to flag that we want the images to be decoded, otherwise we'd wait for that event indefinitely (this could be accomplished with the TouchImage() call I added to the API).

The second is to modify the DrawWindow call (or rather, the RenderDocument call and its children) to be able to flag some sort of synchronous render mode. This seems somewhat easier, and probably requires fewer api modifications.

Either way, there's a whole ton of content and layout code, none of which I know much about, between the canvas code and libpr0n. I haven't really looked if there are any flags that get passed all the way down, and I know that flag bits are jealously guarded in some areas of the code.

What do people think? If there are other solutions I haven't thought of, please suggest them. ;-)
====Flickering with Discard enabled======

As before, under the current patch any image that doesn't get drawn (or have frames queried) for 15 seconds gets discarded. This includes images that are currently being displayed. So when you do certain scrolling operations or interact with the page, the images need to be redecoded. Previously, with synchronous redecoding, this would mean a slight "sleepiness-hang" when scrolling image-heavy web pages. This is still pretty crappy behavior, but I guess it was deemed acceptable. However, with progressive (re)decoding, we now have a "flicker" in all the loaded images as they get redecoded. One solution here would be to just disable progressive REdecoding for "small" images, which would fix most of the flickering issues on regular web pages and revert us to the old situation. This would be pretty easy, but seems lame. In my mind, the better solution is to somehow let images know that they're currently "on screen" and that the frames might be needed for "minor" user-interaction (scrolling, mousing-over images with a different hover property, etc - this is opposed to "major" interaction like switching tabs). I feel like this should be possible, but again I don't know enough about layout to know where to begin.

What do people think of this strategy? Is it a good one? If not, what do you suggest? If yes, how should I go about it?
Attached patch WIP 8 (rebased) (obsolete) (deleted) — Splinter Review
Just pushed to try to get us some perf and mem numbers. These numbers may not be totally accurate, and even if they were it's not exactly clear how we should interpret them (for example, if Tp doesn't scroll down a big page, this will seem like a big win, but we're in fact only really delaying processing). Regardless, I'm curious.

While I was doing it I discovered that there was a bit of rebasing to be done thanks to rob's work in bug 502711 and joe's work in bug 505474. This is the rebased patch.
Attachment #390189 - Attachment is obsolete: true
(In reply to comment #16)
> ====Reftests / Canvas->DrawWindow()========
> The second is to modify the DrawWindow call (or rather, the RenderDocument call
> and its children) to be able to flag some sort of synchronous render mode. This
> seems somewhat easier, and probably requires fewer api modifications.

DrawWindow() should - somehow - figure out what images are going to be drawn (so we don't accidentally draw every image currently on the page), then draw them synchronously. The same should probably apply to drawImage, fwiw.

(In reply to comment #17)
> ====Flickering with Discard enabled======
> 
> As before, under the current patch any image that doesn't get drawn (or have
> frames queried) for 15 seconds gets discarded. This includes images that are
> currently being displayed.

I have been wondering about this for a while. Should we be discarding images that are currently being displayed? I guess it does make a big memory difference, and especially with your progressive-decode work, it should be less bad.

> So when you do certain scrolling operations or
> interact with the page, the images need to be redecoded. Previously, with
> synchronous redecoding, this would mean a slight "sleepiness-hang" when
> scrolling image-heavy web pages. This is still pretty crappy behavior, but I
> guess it was deemed acceptable. However, with progressive (re)decoding, we now
> have a "flicker" in all the loaded images as they get redecoded.

Okay, I downloaded your tryserver build (available at https://build.mozilla.org/tryserver-builds/bobbyholley@stanford.edu-try-02ff9b1fb87f/ for the time being) and reproduced that flicker.

I think the right solution here is to make the images in the foremost tab in every window not discard. That should be possible with some small content and browser work; you might want to file followup bug(s) on making that happen. Boris Zbarsky probably knows how we could do that, or even if we already have that information somewhere.
For now, we'll nominate this as a 1.9.2 branch blocker, and if it seems impossible come next week, we'll denominate it.
Flags: blocking1.9.2?
Priority: P2 → P1
Attached patch WIP 9 (obsolete) (deleted) — Splinter Review
Updated WIP patch. This includes a new hand-written png decoder for header-only decodes, which fixes all the PNG ugliness, is super fast, efficient, and safe, and puts that issue squarely in the 'done' category.

So here are the outstanding issues that I feel need to be handled with this patch (I have another list of followup bugs):

1) Proper error handling - This is something that I somewhat put on the back burner when I was writing this patch, ie I didn't take much time to make sure that we did all the right things when we get bad data in different places along the pipeline. I've debugged several assertions and mochitest failures to this issue, so hopefully a lot of problems will go away when this is fixed.

2) Canvas issues - drawWindow and drawImage need to trigger synchronous decoding in the decode-on-draw case. This is the cause of failing basically every reftest and quite a few mochitests

3) Test failures - It's not clear if all, most, or just some of these are subsumed by 1 and 2

4) Flickering - We need some way to disable discarding on images in the currently visible tab, as well as chrome images. It should be pretty easy to make chrome images non-discardable. Not sure about the second part (waiting for bz's input here).

5) Handling multipart/x-mixed-replace. I think this can actually be fixed pretty easily by just forcing such images to be non-discardable and non-decode-on-draw, but I can't really test this well until I get the issue 1 resolved.

I'm pretty burned out from this week's sprint, so I don't think I'll do much this weekend. Given that, I'm not sure that it's all that likely that things will be ready, tested, and reviewed by the branch date (especially because I'm taking the second half of next week off for my 21st birthday). What still _might_ be possible is to land things with decode-on-draw switched off. This would mean that most of the infrastructure would be there, but the behavior shouldn't change too much (we'll still have issues related to progressive re-decoding for discarded images, which might still be too much). This only kicks one item (the canvas stuff) off of the above list, but it might mean that we see fewer issues crop up beyond this (time will tell). The benefit of this would be that the code changes to add decode-on-draw would be pretty minimal, and could probably be added separately to the fennec branch if we decided it was something we wanted.
Attachment #390196 - Attachment is obsolete: true
Yeah, as-is, this isn't going to make 1.9.2, and it isn't going to be an alpha blocker. We can reevaluate later, but I think not landing right now is the right (but unsatisfying) choice.
Flags: blocking1.9.2?
Priority: P1 → P2
Attached patch error handling interdiff (obsolete) (deleted) — Splinter Review
Just did a bunch of work to make the decoders, imgContainer, and imgRequest error handling consistent, robust, and sane. This fixes at least one of the mochitest failures. Attaching the interdiff (80k!) here, will post the full patch soon (need to leave for dinner now).
Definitely drawWindow should force synchronous decoding of anything it draws that's not already decoded. We're going to need some kind of flag, probably in nsDisplayListBuilder, that controls whether synchronous decoding is required.

I need to think about the flicker issue.
Attached patch WIP 10 (with error handling) (obsolete) (deleted) — Splinter Review
Didn't get around to posting the combined patch yesterday. Here's a folded version now.
Attachment #390463 - Attachment is obsolete: true
Attached patch WIP 11 (obsolete) (deleted) — Splinter Review
I woke up with a surprising lack of a hangover this morning (21st birthday last night - woo!), so I decided to do get some work done. This patch is the culmination of the past few days of work. Lots of small bugfixes, but the big feature here is synchronous decode support, which stretched its tentacles all over layout and content, among other things. With this patch, we only have 11 reftest failures (which, incidentally, have me baffled, because they don't render any images). Mochitests seem to be triggering an assertion somewhere which I need to look into as well. Nonetheless, the issue list is definitely shrinking.
Attachment #390880 - Attachment is obsolete: true
Attachment #391037 - Attachment is obsolete: true
Attached patch WIP 12 (obsolete) (deleted) — Splinter Review
Updated patch with today's work.

So I spent today trying to figure out what was going on with that hanging mochitest (tests/caps/tests/mochitest/test_bug292789.html). The first half of the day involved being confused by a failed load of /favicon.ico and thinking that the failure I saw in the image decoding logs (as a result of that) was related to the failure of the test.

The second part of the day was the fun part. As it turned out, the problem had to do with us not firing the onload event once the image was loaded. I looked into it, and all the onload firings were being hooked off of OnStopDecode, which of course didn't fire with decode-on-draw. However, I began to start wondering what the event pattern should really look like. It seemed that onload should indeed fire when the image data is loaded (as opposed to decoded), but that would create some weird asymmetry with onerror, which it would seem should fire on either decode error or load error. However, after some investigation I discovered that firefox (along with safari and chrome) actually doesn't report image decoding errors past decoder instantiation, and that the decoders currently lie in their "OnStopDecode" status and force NS_OK even if something bad happened.

So here's the state of affairs with all the browsers that we of course want to preserve:
* Someone writes img.src="foo" in js
* An image request is created and fired up
* The mimetype is sniffed (we don't trust the server)
* A decoder is instantiated with the given mimetype

If this instantiation fails, we fire "onerror". If it succeeds, we fire "onload". Thus, the only cases where we fire onerror are:
1) The URL is somehow invalid or ruffles the feathers of the content policy
2) A network error occurs before all the data arrives
3) The first few (sniffed) bytes give us an invalid mimetype, an we balk on decoder instantiation

Otherwise, we fire onload once the network request completes. Furthermore, I did some testing and this exact behavior is replicated in safari and chrome (I don't have IE handy).

So this means that our current observer notification strategy is pretty nonsensical. OnStopDecode is only used to determine success/failure status, but the status in question really has nothing to do with the status of the decode per se. Thus, a more appropriate strategy would be to eliminate OnStopDecode and add the status flag to OnStopRequest, but that requires a lot of modifications and frankly I'm scared to touch any of the onstartrequest/onstoprequest stuff until we fix bug 339610 and can stop spending so much effort in imgRequest.cpp to hack multipart/x-mixed-replace into workable shape. Nonetheless, this is a bigger chunk than I'd like to byte off for this patch, so I'm saving that for bug 505385. As such, I ended up implementing a bit of a kludge.

So here's what we do in this patch (temporarily). We keep the intelligent decoder notifications in imgContainer and the decoders, but supress them in imgRequest. Within imgRequest, we make OnStopDecode and OnStopRequest best friends and always send them at the same time. All state associated with OnStopDecode is eliminated and folded into the state for OnStopRequest. I'd really like to clean up imgRequest, but one of the main things that's tangling it up right now is multipart/x-mixed-replace. We could make things quite a bit cleaner, but really can't until we fix bug 339610. So if there's nobody with enough fu to do it, I might end up diving in and fixing it myself.

With this patch, mochitests are starting to look a lot better. I've pushed to try to get a better picture.
Attachment #391669 - Attachment is obsolete: true
I discovered and filed a pretty bad layout-side bug (bug 507902), that turned into a pseudo-random crasher with this patch applied. Flagging it as a dep and working up a patch for it.
Depends on: 507902
Attached patch WIP 13 (obsolete) (deleted) — Splinter Review
updated WIP patch. With decode-on-draw the canvas drawImage tests for animated images (which ensure that the first frame is the one that's drawn) began to pass unexpectedly for obvious reasons. Since we still couldn't _count_ on them passing, I bit the bullet, modified the imgIContainer API, and implemented support for extracting/getting/copying both the current frame _and_ the first frame. I then updated the canvas implementation to use the "first frame" flag, so now under this patch we're incrementally closer to spec compliance for canvas.
Attachment #391897 - Attachment is obsolete: true
Blocks: 478398
Attached patch WIP 14 (obsolete) (deleted) — Splinter Review
Updated WIP patch. Big change is the addition of a counted locking mechanism for imgContainers to prevent them from being discard when layout wants them. I'm starting to get an idea on what needs to happen over in layout for the locking stuff, but it's separate enough that I think I'm going to make it a separate patch. Also selectively disabled decode-on-draw for chrome images, which fixes the flicker issue for UI objects. Did the same for multipart/x-mixed-replace images which should in theory be all we need to support them, but I haven't had the chance to test them yet. Finally, a couple of other small fixes that fix test issues.

Most of my time wasn't spent working on bug 507902 and on debugging test failures. We're currently failing the reftest modules/libpr0n/test/reftest/apng/delaytest.html?bug411852-1.png and I'm not sure what to do about it. This test sets up an img element pointing a single-cycle APNG, waits 100 ms, and verifies that we see the second frame and not the first. However, this doesn't work with decode-on-draw because the animation doesn't start until the synchronous decode forced by drawWindow, and by that point it's too late. CCing dolske to see if he has any ideas on how to continue testing APNG in a decode-on-draw world.

There's also some flakiness with CSS3 background-image. I've seen several failures on the tryserver that I can't reproduce locally that have that in common, and I have at least one locally-reproducible one  that doesn't appear on the tryserver. I need to talk to someone in layout to see if there's something special about the way background-image gets drawn.
Attachment #392492 - Attachment is obsolete: true
CCing dolske re the apng test
From IRC: should be able to avoid the problem by having the test use a <canvas> + drawImage() to force a decode, and then start 100ms wait.
Attached patch WIP 15 (obsolete) (deleted) — Splinter Review
Updated WIP.

I fixed the failing reftest per dolske's suggestion. As it turns out, just increasing the delay kind of worked, because a draw still happens (just later) when the window is in the background. Nonetheless, I instituted a more robust fix.

The major code change here is support for multipart/x-mixed-replace. Quite a bit of code change to make this happen. One nice byproduct of it all is the removal of the imgILoad interface, which was kind of silly to begin with. The multipart handling in imgRequest.cpp is now a lot more sane.
Attachment #392804 - Attachment is obsolete: true
Spent part of monday and all of tuesday hunting down a really strange memory leak that occurred in the mochitest
http://mxr.mozilla.org/mozilla-central/source/content/events/test/test_bug336682_2.xul

As it turns out, it was a bug in necko. Filed and fixed bug 509929.
Depends on: 509929
Attached patch WIP 16 (obsolete) (deleted) — Splinter Review
Updated WIP patch. Most of this week was spent on dependent bugs (bug 510001 also cropped up). My big accomplishment was fixing and pushing bug 509929, but I also managed to fix some other issues to get all tests passing on all platform with decode-on-draw and discarding switched off. Huge thanks to bz for spending probably a sum total of 5 hours in conversation helping me work through various issues.

Changes include:
passing WANT_FIRST_FRAME for canvas patterns
locking icon images
forcing sync for backgrounds
fixing a weird case that only showed up on winnt try (couldn't reproduce locally) where border image would sometimes not trigger a reflow once the override size came in.

Next task - get things working with decode-on-draw switched on.

Also, bz convinced me that we should be blocking firing document onload for any images that start decoding before onload is fired (for one thing, this means that our Tp results have been somewhat artificially inflated). So that's another thing that needs to happen before this lands.
Attachment #393197 - Attachment is obsolete: true
Attached patch WIP 17 (obsolete) (deleted) — Splinter Review
Update WIP patch. Fixes some reftest failures, a fair bit of changes to be play smarter with the cache, updated some docs, and fixed a small error reporting issue with OnDataAvailable.

This patch, as far as I can tell, passes all unit tests on all platforms with both the configuration off,off and on,on. Orange still shows up occasionally, but I believe they're "random" and not related to this patch.

This is a pretty big milestone. The things remaining then before I start seeking review are:
1) Layout side changes to lock images on the current page and prevent them from being discarded. I think the optimization of allowing discarding of images outside the viewport should be a followup patch.
2) Implementing bz's invariant that onload should be blocked until all images that are already decoding are decoded.
3) Figure out what needs to be done to make sure we don't break printing (ie, where do we pass aForceSync?)
4) self-review of the patch

I can't say for sure how long the layout stuff will take. But it's very possible that this will be ready for review by the end of next week.
Attachment #394483 - Attachment is obsolete: true
Bobby, 
Just an heads up: I tried to run your try-server build this evening hoping to check out the perf updates (under Vista).  Build 20080814 crashed on me while running a You-Tube video in tab 1, opened facebook in tab 2 and went to the ILike Music application page and I think I may have tried to play content.  It then crashed, which I thought was odd and doesn't happen on the regular trunk build.
cuz84d,

Could you go to about:crashes in Firefox and send us the crash reporter links generated when your tryserver build crashed? That'll help Bobby diagnose the problem.
Unfortunately its reporting no submitted crash reports.  I imagine that since the whole browser died abruptly and shutdown, then restarted, it was unable to submit anything upon the crash.  I tried to check the crash reporting tool, but nothing.  I had another crash again, but under another type of site, but neither appears to be reproducible that I can tell.
This is all I get for crash
Firefox doesn't appear able to generate or fire off crash data before or after this type of crash occurs.  Don't know if this is already a known problem/bug in bugzilla.
Oh, crap; I wonder if try server builds have the crash reporter enabled.
Do try server builds upload symbols?
Bobby, you might find this helpful:

I had the 20090814 tryserver build crash again twice under the same behavior just now.  2 tabs open, 1 to minefield start page, 1st tab: log into bugzilla, view this bug, then clicked on the attachment link in comment #41 to try to view the about:crashes screenshot here, image loaded, which looked fine (just smaller, which I don't remember Trunk builds behavior shrinking the image here), then it bombed out both times within (m)seconds afterward.

Joe, Ryan:

It then re-opened and said it had a problem restoring this bugzilla bug page, I imagine due to SSL or something. Also still no crash report submitted.

Also, maybe we should spin off the crashing submission behavior to a new bug checking into try server builds debug/crashing status.
Reproduced the behavior in comment #44 a third time.  It appears either its the image type: jpeg in this case, or its how the loading/decoding the images are handled here in this case of clicking open the attachment in the current window after drawing the image it crashes.  

FWIW, The images where created using windows' print screen behavior pasted into MS Paint and saved as is.
Ok, throwing this out there for everyone following along about the crash I found.

Steps to reproduce image related crash:
1. Open Minefield Tryserver build BH_20090814
2. load this bug 435296 in first tab.
3. go to comment #41 (maybe not relevant)
4. click on link to open attachment link id-394667

Actual results: Image loads, then crashes build.

Expected results: Image loads, then able to zoom in/out.

Followup to comment about image size: loaded image size has not changed.
cuz84d - nice! Thanks so much for tracking down a testcase. I followed your steps in comment #46 and reproduced the crash (I get an assertion failure). I'll look into it in a minute.

For reference, the assert is:
Calling ShutdownDecoder() with no active decoder!: 'mDecoder', file /files/mozilla/src/modules/libpr0n/src/imgContainer.cpp, line 2104

Joe, Ryan - I'm assuming the builds on the tryserver shouldn't don't have NS_ABORT_IF_FALSE compiled in? If so, then it must mean that the assert catches the problem right before a less graceful crash.
Attached patch WIP 18 (obsolete) (deleted) — Splinter Review
Updated WIP patch.

My email was being flakey this morning, so I didn't get all the bugmail about the crash until just now. As a result, I spent the morning getting printing working instead. I don't have a printer handy, but I did make sure that print preview properly forces sync on offscreen images.

Looking into the crash next.
Attachment #394530 - Attachment is obsolete: true
Attached patch WIP 19 (obsolete) (deleted) — Splinter Review
Updated WIP patch to fix the crash reported by cuz84d. There was a SourceDataComplete call coming from inside the JPEG decoder, which is totally backward. I looked at the history, and it appears that there used to be a RestoreDataDone call there, and I'd changed it to be SourceDataComplete before I had a full picture of how the code was going to look.

cuz84d - If you'd like to verify that this is fixed for you, I've pushed this to try in rev 7d669c980e2b. It should be built and ready for testing in about an hour. Don't confuse this with c0d9601add17, which I pushed earlier this morning.

Thanks for catching that.
Attachment #394799 - Attachment is obsolete: true
Gah, tryserver decided to randomly crash the build process on windows. Repushed rev as 9428880490ce. Seems to be working now.
I don't see the install directory for the Windows build yet (11:15 CST).  Once its out, I'd only be able to check it on XP during the day since I'm at work, and maybe not on my Vista box till tonight.  Thanks.
Bobby, looks like you fixed that crash issue in comment #46 as tryserver build rev 9428880490ce appears to be working running on XP.  Good job and thanks for the update.
Here is what I can tell you on the printing side using the same tryserver build.
Print in general fires off data to the printer.  I found two issues thus far, I don't have time now to compare to other builds/firefox/etc but here is what I see.

Issue 1: Open attachment used for comment #41, print preview shows image shrink 84%, but printed image appears to be 100% and off the margins.  The printed image didn't match the preview layout.
  
Issue 2: Open http://www.msn.com and hit print preview, you get exactly what you see on the print preview there on the printout.  But, all the page content doesn't preview or print.  For example, the header is not showing or the use of the blue background colors.  The Ad image didn't didn't show on the preview and was not printed either, the preview layout drops the positioning for the ad image, and moves everything else snug up the page.
Doh, I forgot there was a setting in page setup to print background colors and images so Issue 2 above prints out the blue background color in left column and around the border and in the footer of the page. 

Do you have a testcase in mind for the rest?  I just tried to go to a MySpace page to find a background image page to print, I selected a random page which did print the background image and color.
cuz84d - so the only issue that (I hope) would be introduced by decode-on-draw would be that images might not be decoded (or only partially decoded) at print time, and so we want to make sure to force a full decode on them before printing. So the test case you should look at is printing a page with some images out of the viewport, and not scrolling to them before printing. This should test whether printing properly forces the decode.

If you're seeing other weird behavior, try checking it against a nightly (I see the same thing you do on the msn site on a nightly). If something's wrong there too, still please do file a bug (just not against me ;-)).
Followup to crash, I tested the rev 9428880490ce on Vista which worked also for crash in comment #41.  

Printing using Bobby's testcase:

I run into two things trying to find a real testcase Bobby mentioned for printing. Print Preview stops access until a full page load, Print does not before running. 

The Result: 

Testcase 2 revealed I was able to print a page which showed image outlines without the images off the screen, I didn't scroll to.  

Summary:

It appears the thing I'm seeing is that the full page load and or decode being completed by the time print runs does indeed make a difference here.

Again using XP to test rev 9428880490ce to test, here is the details of two tests I ran today:

Testcase 1: Tried http://www.winsupersite.com 
A) Print Preview doesn't want to load till the page is done loading, I get a Dialog box that pops up stopping print preview.  
B) If I try to Print a page without moving the scroll bars I find everything is printing on any page I tried, but its a case too where I am not fast enough to move the mouse to the File Menu-> Print features or hit Ctrl-P on a Lan internet connection.  Everything that I didn't see was printed.

Testcase 2: Tried http://www.youtube.com 
A) Result appears same as previous testcase.
B) This time I used Ctrl-P as soon as the page started loading, and hit ok, I could see the status bar still trying to get retrieve data.  Some images printed below the viewable screen without scrolling and some did not.  I only got outlines for several of the images in the Most Popular section of the webpage.  I then went to the page, scrolled down, saw the same on the browser window, then after I stopped scrolling, the images appeared.
Printing Test Part 2: 

Bobby, Did another test on YouTube to verify results.  This should help you.

Testcase 3: Again went to YouTube (I have never enable history and private browsing enabled this whole time)  
A) Didn't check print preview, but should be the same as Testcase 1 and 2.
B) I let YouTube Fully Load the page.  Hit Ctrl-P, Printout images in the Most Popular section printed exactly the same as before, only the outline of the images, the + Sign and Video Length for showed and printed for all those images.  I went back to the browser, (revising behavior to Testcase 2 here) Images outlines were present, + sign, Video length also showed just like printout as I scrolled down, but then *as I was scrolling down the images appeared.  The YouTube image that printed below in the unscrolled viewport is actually a real JPEG.  The images that didn't printout where JPG video images.
And only appeared again as I was scrolling the browser window down.

So either its decoded, but not displayed as in Testcase 2 & 3 below the viewport or its already decoded and displayed as in Testcase 1.  Or Testcase 2 & 3 shows some images were not decoded at all, until scrolling occured.  Maybe also the image snapshots of video content presents different behavior than real image types between Testcase 1 vs 2 & 3.
cuz84d - thanks for the super-detailed reports! I again reproduced the behavior you were seeing.

So there's a couple of issues at work here. The first is the question of "is the image downloaded from the network". If it isn't, then we expect any form of printing to leave the image blank, because we don't have it (this happens on nightlies without this patch applied). Thus, in testing for this patch you should never have to worry about "quick ctrl-p" - on the contrary, you should be waiting until the status bar says that everything is done.

Just because the image has been loaded though doesn't mean that it has necessarily been decoded to something viewable. This is where decode-on-draw comes in. Without this patch we just decode as the data is loaded, so we don't have any issues. With this patch, we only decode when we display the image, so off-screen images don't get decoded until we scroll to them or force drawing them in some way. They get decoded a little bit at a time, so the issue with printing is that we're suddenly taking a snapshot of the page, so while that does indeed wake up the decoding process, it doesn't give it time to finish. Thus, we need to wait for it explicitly.

Youtube is _more_ complicated. I just investigated it, and it looks like the video thumbnails actually _load_ on demand. If you open up firebug and go to the 'net panel', you can see the images being downloaded as you scroll down. So the issue with youtube is one of loading, not decoding, and not so much relevant here.

In general, I'm pretty sure the printing stuff works as it should. If you find an issue, you should make sure that the same issue doesn't exist in a nightly without this patch applied.

Again, thanks for all the input.
No problem, glad I can help out.  The YouTube case was what caught my attention initially to keep testing the tryserver builds.  I see the end user benefit to wait to decode on scroll as maybe a user would never scroll at all.  

I should of thought about double checking the nightly, doh, easy to forget if its not a priority.

Verified YouTube with nightly now, same result. Just learned about Firebug last night off the Trunk build forum, though didn't play with it much. ;)

If we need a better testcase, maybe something like http://blogs.msdn.com/Access would be a good example where we could potentially have a ton of images loading, I see a lot of images on that site all the way down to the bottom.  Or a search engine retrieving pictures site.

So is/would the patch force a decode before print at all, should we expect images to be have been decoded once the snapshot occurs?
(In reply to comment #59)
> So is/would the patch force a decode before print at all, should we expect
> images to be have been decoded once the snapshot occurs?

Yep - when the printing code renders the document, it passes a special flag down the rendering chain that tells any images "before you draw, decode everything you've got". I'd already implemented this for the canvas drawWindow() API, but I forgot about the printing case until comment #36. It should be all good to go now.
Attached patch WIP 20 (obsolete) (deleted) — Splinter Review
Updated WIP that fixes a few odds and ends. I think this is at the point where joe should be taking a look at it.
Attachment #394814 - Attachment is obsolete: true
Overall, very, very nice. Well written and well commented. I'm very excited about this!

While reviewing this, I'd regularly think I found a problem, only to be proven wrong. That is the sort of wrong I like to be.

Here are some comments:

- mIgnoreTheRest is sort of gross. Is there some way we could drain the input buffer if we're explicitly going to ignore the rest? And then exit out gracefully?
- If not, why doesn't every decoder (eg JPEG, PNG) need mIgnoreTheRest?
- Can you elaborate on why you needed to make the observer a weak pointer?
- observer = nsnull is only necessary if you're going to do a longjmp, because that screws up the stack unwinding.
- You'll need to also change the "contract names" for any interfaces you change (ie remove/change functions) (eg @mozilla.org/image/container;2 -> @mozilla.org/image/container;3)
- It sucks that IDL doesn't have enumerated types for type safety, but for kFrameFirst, etc, could you rename them to something like FRAME_FIRST?
- DFLAG -> DECODER_FLAG, and move it above the init function.
- For header-only decodes, does it make sense to also fire OnStopContainer()?
- "If we're not storing any source data, shut down the decoder" - add a note that that's because all source data was written directly to the decoder, and so we're done.
- Sort of sucks that multipart/x-mixed-replace creates a new decoder for every frame.
- Not too happy about WriteToContainer being in the global namespace. Could you make it a static member of some class? Or change its name to WriteStreamToImgContainer or something along those lines?
- Why did delayTest.html need to add an explicit forceDecode()? Doesn't the reftest snapshot use a canvas internally?
- The drag-and-drop code should force a synchronous decode.
- Synchronous decoding should probably be given its own named value, so instead of passing PR_TRUE and PR_FALSE as random arguments for functions, we can pass DECODE_SYNCHRONOUS or DECODE_DEFAULT or whatever you want to call them.
- Why do background images (in nsCSSRendering) paint synchronously?

- Nitpicky:
 - blank new lines after function opening braces? :(
 - incur, not encur

- Some white-box testing it'd be awesome to have:
 - A multipart/x-mixed-replace stream with differing frame sizes
 - Ensure we don't discard when we've decoded only the first frame of an animated image by pausing in the sending of data or in the decoding (somehow)
 - Once we get callers of LockImage/UnlockImage, I'd like to see if we can arrange for images to be partially decoded when locked/unlocked, and fully decoded (& waiting for discard) when locked/unlocked.

And, of course, try to resolve all the TODOs and XXXbholleys that you've added.

Great work!
Attached patch WIP 21 (obsolete) (deleted) — Splinter Review
Given that joe's now put in the time to look at the patch, I'm going to try to start posting WIPs for logical changes so that it's easier to track the changes.

This patch here gets rid of all the tracking within imagelib for "number of bytes actually read", since we were always just setting it to 0 or count anyway. This cleans things up a bit.

pushed to try as 53563a080d85
Attachment #395340 - Attachment is obsolete: true
Attached patch WIP 22 (obsolete) (deleted) — Splinter Review
> - Can you elaborate on why you needed to make the observer a weak pointer?
It's not needed anymore thanks to the more robust error handling I put in later in the development process. I think. Removed it in this patch.

> - observer = nsnull is only necessary if you're going to do a longjmp, because
> that screws up the stack unwinding.

This too. Removed all null sets to the image and the observer within the decoders.

pushed to try as 7a127fe6699b. We'll see if any memory leaks crop up.
Attachment #395554 - Attachment is obsolete: true
Attached patch WIP 23 (obsolete) (deleted) — Splinter Review
Updated WIP. Aside from the comments addressed below, I found a bug where there were really 2 non-error shutdown intents, one where we wanted to ensure that we got what we wanted, and the other where we were interrupting. Handled this explicitly.

> - mIgnoreTheRest is sort of gross. Is there some way we could drain the input
> buffer if we're explicitly going to ignore the rest? And then exit out
> gracefully?
Should be fixed now so that we never write to the decoder after we get what we want.

> - If not, why doesn't every decoder (eg JPEG, PNG) need mIgnoreTheRest?
well, JPEG really did, it was just called mHaveHeader instead. PNG had its own separate routine for header-only decodes, which was why. Regardless, it's all fixed now.

pushed to try as 3d5b6097e1bf
Attachment #395570 - Attachment is obsolete: true
Attached patch WIP 24 (obsolete) (deleted) — Splinter Review
Updated WIP. This fixes a crash that seems to have shown up recently as a result of moving the reflow code in nsImageLoader from OnStopFrame to OnStopRequest. OnStopFrame only gets fired if things get to a reasonable state, but OnStopRequest gets called even after the most borked request. Check now.
Attachment #395580 - Attachment is obsolete: true
Attached image 100% CPU Usage mem leak? (obsolete) (deleted) —
Bobby, 

looks like the latest, I haven't tested the other Try builds yet, but against the 8020 hourly (nightly) i had a 100% CPU spike on XP.  The nightly isn't doing so.  I was browsing web pages and flipping back and forth on http://msn.foxsports.com and http://autos.msn.com.  I started scrolling up and down with the mousewheel and it just started slowing down as it was loading the page I believe.  The memory on the hourly has dropped off.  But maybe there is a memory leak.  I had to kill the task.  Trying to reproduce.  I didn't seem to have much else running on the machine that would show a CPU spike.
Looks like the latest hourly does the same, but its the site here, i started using the scroll wheel during page load and it keep going down, will have to do more testing to see where this needs to be reported: http://autos.msn.com/home/used_research.aspx.
Comment on attachment 395614 [details]
100% CPU Usage mem leak?

This is a JS bug in 3.5.2 and trunk.
Attachment #395614 - Attachment is obsolete: true
Attached patch WIP 25 (obsolete) (deleted) — Splinter Review
An intermittent orange on try (layout/reftests/bugs/315920-20.svg) made me realize that we should be forcing sync in a few other places too (xul and svg). Did that here.

pushed to try as 25a96509433b
Attachment #395607 - Attachment is obsolete: true
Attached patch WIP 26 (obsolete) (deleted) — Splinter Review
Bumped the versions in the contract names. This could break something somewhere so pushing to try as 9019c8e8a613.
Attachment #395810 - Attachment is obsolete: true
Attached patch WIP 27 (obsolete) (deleted) — Splinter Review
(In reply to comment #62)

> - It sucks that IDL doesn't have enumerated types for type safety, but for
> kFrameFirst, etc, could you rename them to something like FRAME_FIRST?
done

done.

> - DFLAG -> DECODER_FLAG, and move it above the init function.

and done.

> - For header-only decodes, does it make sense to also fire OnStopContainer()?

Don't think so. OnStopContainer doesn't really have any meaning, and is going away in bug 505385. For the moment, it only has one substantive consumer:
http://mxr.mozilla.org/mozilla-central/source/layout/xul/base/src/nsImageBoxFrame.cpp#517
and I don't think that cares about header-only decodes.

> - "If we're not storing any source data, shut down the decoder" - add a note
> that that's because all source data was written directly to the decoder, and so
> we're done.

done.

> - Sort of sucks that multipart/x-mixed-replace creates a new decoder for every
> frame.
Yeah, I was thinking the same when I wrote it, but that's how the old system worked too. In general I think we should get bug 339610 fixed and then clean up a lot of this stuff.

> - Not too happy about WriteToContainer being in the global namespace. Could you
> make it a static member of some class? Or change its name to
> WriteStreamToImgContainer or something along those lines?

done.

> - Why did delayTest.html need to add an explicit forceDecode()? Doesn't the
> reftest snapshot use a canvas internally?

It uses a canvas yes, but the test assumes that the animation is running while the timeout runs. This doesn't happen with decode-on-draw, so we need to manually force a decode
to start the animation. See comment #30 and and comment #32.
Attachment #395819 - Attachment is obsolete: true
Attached patch WIP 28 (obsolete) (deleted) — Splinter Review
Good thing I pushed WIP 26 to try - looks like I missed bumping one of the version numbers, which was causing hangs.

I also discovered a bug where we were re-entering the decoder for header-only decodes by calling flush.

Also, fixed WriteToContainer to squelch errors, otherwise we assert in imgRequest.

pushed to try as 5f2987c47f61
Attachment #395830 - Attachment is obsolete: true
Blocks: 511899
Sounds good..I'll try to test it again.  

Also here is the generic try-server builds link: https://build.mozilla.org/tryserver-builds/?C=M;O=A
Bobby, Looks like the latest try build didn't produce a windows installer.
Yeah, the last patch had a small subtlety that breaks windows builds (assuming NS_METHOD and nsresult are equivalent). I've fixed this in my local version, and it'll be in the next patch I push.
Attached patch WIP 29 (obsolete) (deleted) — Splinter Review
Updated WIP. Big change here is rebasing off bug 113577. This changed a fair bit of how nsCSSRendering draws certain images, so it took some time to grok and I wanted to push it to try.

Small changes: Windows bustage fix, fixed the stuff with ResetImage, and fixed "incur".

Pushed to try as 8015190eedd8.
Attachment #395850 - Attachment is obsolete: true
Attached patch WIP 30 (obsolete) (deleted) — Splinter Review
(In reply to comment #62)
> - Nitpicky:
>  - blank new lines after function opening braces? :(

Fixed all the ones in imgContainer.cpp. If you see any others, let me know. Note that I'm going to do a followup to fix some of the formatting in imgContainer.
Attachment #396195 - Attachment is obsolete: true
Blocks: 512260
bz thinks that the flicker/locking stuff should be done later. Filed bug 512260 separately for it.
Attached patch WIP 31 (obsolete) (deleted) — Splinter Review
Updated WIP.

> - Synchronous decoding should probably be given its own named value, so instead
> of passing PR_TRUE and PR_FALSE as random arguments for functions, we can pass
> DECODE_SYNCHRONOUS or DECODE_DEFAULT or whatever you want to call them.

This replaces the PRBool aForceSync with PRUint32 aFlags, which can (currently) be imgIContainer::FLAG_NONE or imgIContainer::FLAG_SYNCDECODE. Also took another crack at fixing that windows compilation bustage.

pushed to try as d4a814383c13

> - Why do background images (in nsCSSRendering) paint synchronously?

My thinking is that since they have a reasonable fallback (the background color), we shouldn't draw them until they're fully loaded (like border-image). bz saw the logic in this, but said that I should ask dbaron. I'll do that once he comes on irc.

I believe this addresses all of joe's review comments. bz has a great strategy for the onload stuff, which I'm going to get working on now. After that, I think everything should be ready for review.
Attachment #396197 - Attachment is obsolete: true
Blocks: 512269
Blocks: 512435
Attached patch WIP 32 (obsolete) (deleted) — Splinter Review
Updated WIP that fires OnStartDecode at init time in order to support bug 512435
Attachment #396233 - Attachment is obsolete: true
Attached patch consumers v1 (obsolete) (deleted) — Splinter Review
Added a sub-patch of all the consumer-facing changes I've made. Flagging bz for review on this. After I address his comments, I'll also flag people for more specific reviews (such as the different widget owners).
Attachment #396540 - Flags: review?(bzbarsky)
Any update on the Windows build failure?
Attached patch tests v1 (obsolete) (deleted) — Splinter Review
Added some tests. These will have to land later, because they depend on some small test framework I wrote for the tests in bug 478398, which in turn depends on other bugs that depend on this bug.
Attached patch tests v1 (obsolete) (deleted) — Splinter Review
actual tests, not the d-o-d patch itself ;-)
Attachment #397296 - Attachment is obsolete: true
Attached patch WIP 33 (obsolete) (deleted) — Splinter Review
Updated WIP patch. This adds a new invariant to imgIDecoder::Close where we guarantee that OnStop* notifications will be called. This fixes a crashtest hang we started hitting in bug 512435 where we blocked onload but never unblocked because we never got stop notifications.
Attachment #396442 - Attachment is obsolete: true
Attached patch WIP 34 (obsolete) (deleted) — Splinter Review
Small update to the GIF decoder to match the new behavior introduced in the other decoders.
Attachment #397624 - Attachment is obsolete: true
Attachment #396540 - Flags: review?(bzbarsky) → review+
Comment on attachment 396540 [details] [diff] [review]
consumers v1

>+++ b/layout/base/nsDisplayList.h
>+  PRBool IsSyncDecode() { return mSyncDecode; }

How about:

  PRBool ShouldSyncDecodeImages() { return mSyncDecodeImages; }

(so rename the method and the member).

> +  void SetSyncDecode(PRBool aSyncDecode) { mSyncDecode = aSyncDecode; }

SetSyncDecodeImages.

>+++ b/layout/base/nsIPresShell.h
> +    RENDER_SYNCDECODE = 0x08

RENDER_SYNC_DECODE_IMAGES ?  Or RENDER_SYNCDECODE_IMAGES?

};

>+++ b/layout/base/nsImageLoader.cpp
>+    // XXXbholley - The above comment was there for OnStopFrame. What does
>+    // it mean? Does it still make sense now that we've moved this code to
>+    // OnStopRequest?

It means images send synchronous notifications, so attempting to paint might well reenter this code.  And yes it should still be relevant.  Look up the blame for that comment being added for testcases....

>+++ b/layout/base/nsLayoutUtils.h
>+  enum { PAINT_IN_TRANSFORM = 0x01, PAINT_SYNCDECODE = 0x02 };

 enum {
   PAINT_IN_TRANSFORM = 0x01,
   PAINT_SYNCDECODE_IMAGES = 0x02
 };

>+   * this is inside a transform or SVG foreignObject. If PAINT_SYNC_DECODE is
>+   * set, we force synchronous decode on all images.

Comment doesn't match code as far as the flag name goes.

>   static nsresult DrawImage(nsIRenderingContext* aRenderingContext,
>                             imgIContainer*       aImage,
>                             gfxPattern::GraphicsFilter aGraphicsFilter,
>                             const nsRect&        aDest,
>                             const nsRect&        aFill,
>                             const nsPoint&       aAnchor,
>-                            const nsRect&        aDirty);
>+                            const nsRect&        aDirty,
>+                            PRUint32             aImageFlags = imgIContainer::FLAG_NONE);

Are there lots of callers that wouldn't pass flags?  If not, make it non-optional.

>   static nsresult DrawSingleUnscaledImage(nsIRenderingContext* aRenderingContext,

And here.

>   static nsresult DrawSingleImage(nsIRenderingContext* aRenderingContext,

And here.

>+++ b/layout/svg/base/src/nsSVGImageFrame.cpp
>+  // XXXbholley - I don't think huge images in SVGs are common enough to
>+  // warrant worrying about the responsiveness impact of doing synchronous
>+  // decodes. The extra code complexity of determinining when we want to
>+  // force sync probably just isn't worth it, so always pass FLAG_SYNCDECODE

Would this really be any harder than it is for nsImageFrame? If so, please file
a followup bug and put a FIXME comment here, not an XXX one.  I don't think your assumption about huge images is necessarily warranted...

>+++ b/modules/libpr0n/build/nsImageModule.cpp

Are the contract id revs required?  Why?

>+++ b/modules/libpr0n/public/imgIContainer.idl
>+  const unsigned long FRAME_MAX_SPECIFIER = 1;

This needs a better name and documentation.

>+   * Get a surface for the given frame. This may be a platform-native,
>+   * optimized frame, so you cannot inspect its pixel data.
...
>+  [noscript] gfxASurface getFrame(in PRUint32 aWhichFrame,

Should that be "optimized surface"?

>+   * INIT_FLAG_MULTIPART: The container will be used to display a stream of
>+   * images in a multipart channel. If this flag is true, INIT_FLAG_DISCARDABLE
>+   * and INIT_FLAG_DECODE_ON_DRAW must be false.

s/true/set/ and s/be false/not be set/.

>+  readonly attribute unsigned long dataSize;

I assume that'll never be > 4GB or so, eh?  ;)

>+   * Sets the size of the container. This should only be called by the decoder. This function may be called multiple
>+   * times, but will return an error if subsequent calls do not match the first.
>+   */
>+  [noscript] void setSize(in long aWidth, in long aHeight);

In IDL terms it'll throw, not return an error.

>+++ b/modules/libpr0n/public/imgIDecoderObserver.idl
>- * XXXldb The two functions should probably be split.

I don't think that anything has changed to make that comment less applicable.

>+ * on load and not visibly discarded (the only option until bug 435296 landed),

I don't think that parenthetical is needed.

>+ * expect. However, this isn't necessarily the case anymore. With
> decode-on-draw,

Nor is the sentence starting with "However" here.
 
>+ * the set of decode notifications can come completely _after_ the load notifications,
>+ * and can come multiple times if the image is discardable.

Can they interleave in other ways?  Say decode start after load start but before load end; decode end after load end?

>+   * Called as soon as the image begins getting decoded. This does not include
>+   * "header-only" decodes used by decode-on-draw to parse the width/height
>+   * out of the image. Thus, it is a decode notification only.

Will this trigger sync or async?  For example, if I call ExtractFrame, will I get OnStartDecode before the ExtractFrame call returns?  Does it matter what flags I pass to ExtractFrame?

>+   * Called once enough data has been loaded from the network that we were able
>+   * to parse the width/height from the image. By the time this handler has been
>+   * called, the size has been set on the container and STATUS_SIZE_AVAILABLE
>+   * has been set on the associated imgRequest.

It's not so much a handler as a callback...

With the above nits fixed, looks good to me.  r=bzbarsky on these parts.
Blocks: 514033
Attached patch WIP 35 (obsolete) (deleted) — Splinter Review
Looks like I forgot to check for error shutdowns and header-only decodes before firing off notifications in Close. Fixed that here.
Attachment #397627 - Attachment is obsolete: true
Attached patch WIP 36 (obsolete) (deleted) — Splinter Review
small update to fix a problem where we requested a sync decode from within an imgIDecoderObserver method (which we don't allow, and assert against within imgContainer). OnStopFrame is called, so the frame is decoded and we don't need to sync decode.
Attachment #398016 - Attachment is obsolete: true
Attached patch WIP 37 (obsolete) (deleted) — Splinter Review
Updated WIP to address bz's review comments.

(In reply to comment #88)
> >+++ b/layout/svg/base/src/nsSVGImageFrame.cpp
> >+  // XXXbholley - I don't think huge images in SVGs are common enough to
> >+  // warrant worrying about the responsiveness impact of doing synchronous
> >+  // decodes. The extra code complexity of determinining when we want to
> >+  // force sync probably just isn't worth it, so always pass FLAG_SYNCDECODE
> 
> Would this really be any harder than it is for nsImageFrame? If so, please file
> a followup bug and put a FIXME comment here, not an XXX one.  I don't think
> your assumption about huge images is necessarily warranted...

Looking into this again. Will update soon.


> 
> >+++ b/modules/libpr0n/build/nsImageModule.cpp
> 
> Are the contract id revs required?  Why?
> 

shrug - joe said to. Added/removed methods from the API, so I assumed that I should...

> >+  readonly attribute unsigned long dataSize;
> 
> I assume that'll never be > 4GB or so, eh?  ;)

Joe - what do you think about this?


> >+++ b/modules/libpr0n/public/imgIDecoderObserver.idl
> >- * XXXldb The two functions should probably be split.
> 
> I don't think that anything has changed to make that comment less applicable.

The comment below about bug 505385 was meant to subsume it in the sense of "it should be fixed _and_ it's going to be"


Still need to look at the svg thing again, but I'm going to start flagging for more review now so things happen in parallel.
Attachment #398046 - Attachment is obsolete: true
Comment on attachment 398143 [details] [diff] [review]
WIP 37

flagging joe for formal review on the patch, and vlad for sr
Attachment #398143 - Flags: superreview?(vladimir)
Attachment #398143 - Flags: review?(joe)
Attached patch consumers v2 (deleted) — Splinter Review
flagging roc for review on the updated consumers patch
Attachment #396540 - Attachment is obsolete: true
Attachment #398144 - Flags: review?(roc)
Attached patch cocoa v1 (deleted) — Splinter Review
flagging josh for review of the osx stuff
Attachment #398146 - Flags: review?
Attachment #398146 - Flags: review? → review?(joshmoz)
Attached patch os2 v1 (obsolete) (deleted) — Splinter Review
flagging peter for the os2 stuff
Attachment #398147 - Flags: review?(mozilla)
Attached patch gtk v1 (deleted) — Splinter Review
flagging karlt for the gtk widget stuff
Attachment #398179 - Flags: review?(mozbugz)
Attached patch windows v1 (deleted) — Splinter Review
flagging jmathies for the windows stuff
Attachment #398181 - Flags: review?(jmathies)
Attached patch test changes v1 (obsolete) (deleted) — Splinter Review
flagging dolske for the changes to various image-related tests
Attachment #398182 - Flags: review?(dolske)
+    // The canvas spec says that drawImage should draw the first frame
+    // of animated images

Can you add this comment to CreatePattern too?

Why doesn't canvas image drawing do a sync decode?

+   * Set to PR_TRUE if images should be force to decode synchronously
+   * (be careful - the image must be loaded before a synchronous draw!)

What does that last line mean? We're going to crash if an image is not loaded?

@@ -3185,17 +3190,17 @@ ImageRenderer::Draw(nsPresContext*      
+          aDest, aFill, aAnchor, aDirty, imgIContainer::FLAG_SYNC_DECODE);

As we discussed, you don't want sync decode here.

+    RENDER_SYNC_DECODE_IMAGES = 0x08

I think sync decode should be the default for RenderDocument and drawWindow, to match existing behaviour for API users.

-NS_IMETHODIMP nsImageLoader::OnStopFrame(imgIRequest *aRequest,
-                                         PRUint32 aFrame)
+NS_IMETHODIMP nsImageLoader::OnStopRequest(imgIRequest *aRequest,
+                                           PRBool aLastPart)

This should still be OnStopFrame, I guess. But we'll need to trigger decoding of the image in ImageRenderer when the image is not complete or not decoded.

-  if (aFlags & PAINT_IN_TRANSFORM) {
+  if (aFlags & PAINT_IN_TRANSFORM)
     builder.SetInTransform(PR_TRUE);
-  }
+  if (aFlags & PAINT_SYNC_DECODE_IMAGES)
+    builder.SetSyncDecodeImages(PR_TRUE);

Still need {} here

+    if (aFlags & RENDER_SYNC_DECODE_IMAGES)
+      builder.SetSyncDecodeImages(PR_TRUE);

{}

I need to think about how to make this easier for consumers. Right now I think it's a bit error-prone for consumers, there's a risk that some images might never be decoded.
Comment on attachment 398146 [details] [diff] [review]
cocoa v1

roc had some suggestions that will alter the consumer stuff. Canceling widget reviews for now.
Attachment #398146 - Flags: review?(joshmoz)
Attachment #398147 - Flags: review?(mozilla)
Attachment #398179 - Flags: review?(mozbugz)
Attachment #398181 - Flags: review?(jmathies)
Attached patch WIP 38 (obsolete) (deleted) — Splinter Review
patch to fix roc's smaller concerns.

(In reply to comment #99)
 
> Why doesn't canvas image drawing do a sync decode?

It does. SurfaceFromElement forces sync.


I spent quite a few hours on IRC last night with joe and roc hashing out some issues. It boiled down to the following 3 changes:
1) move the exposure of touchImage from bug 512435 to this one, and rename it requestDecode
2) call requestDecode for background images, and remove the syncdecode
3) identify all other places where we just want to get the data. insert a requestDecode(), move the GetFrame/whatever call to hook off of FRAME_COMPLETE instead of LOAD_COMPLETE, and remove the syncdecode

doing that next.
Attachment #398143 - Attachment is obsolete: true
Attachment #398143 - Flags: superreview?(vladimir)
Attachment #398143 - Flags: review?(joe)
Attached patch WIP 39 (obsolete) (deleted) — Splinter Review
Updated patch to do things the way roc suggested.

I believe this patch should be code complete except for looking into the SVG drawing thing mentioned by boris. I'm about to head out for a few hours, so I'm flagging roc for review now. Assuming he's ok with it, I'll flag joe for r, vlad for sr, and peter for os2 (after some thought I think the other widget changes are trivial enough that bz and roc's review should be enough).

pushed to try as 689c0392a72b.
Attachment #398373 - Attachment is obsolete: true
Attachment #398475 - Flags: review?(roc)
+   * (make sure the image is loaded before doing a synchronous draw. If it
+   * isn't, there's no guarantee as to how much decoded data we'll end up
+   * with). Set to PR_FALSE (default) otherwise.

I think you should probably just remove this parenthetical comment, or have it say "if this flag is set, we'll decode and draw whatever data has been loaded. If it is not set, we will only draw whatever has been already decoded."

 NS_IMETHODIMP nsImageLoader::OnStartContainer(imgIRequest *aRequest,
                                               imgIContainer *aImage)
...
+  // Request a decode
+  aImage->RequestDecode();

We actually don't want this here, do we? This will force decoding of all background images. I think we want nsCSSRendering/ImageRenderer to do the RequestDecode when we try to render the image.

Otherwise great.
Attached patch WIP 40 (obsolete) (deleted) — Splinter Review
So I ran into some test failures after applying roc's suggestions.

Issue number one was layout/base/tests/test_bug445810.html, which expected borderimage load to be equal to borderimage decode. I pulled some testing machinery down my patch stack and fixed up the test to use it. So that part should be settled.

The more annoying problem is that I realized that we still do in fact need to do a sync decode for border image in certain cases. Just like regular nsImageFrames, we need to check the display list builder for ShouldSyncDecodeImages(), because otherwise we run straight back into the canvas/reftest issue. The sucky part is that DrawBorderImageComponent os called by nsCSSRendering::PaintBorder, which has about 20 callers. I talked to dbaron, and he didn't see any way of being clever about it, and suggested just passing aBuilder down to from all the callers. This shouldn't actually be hard, but it's certainly a pain. To make things easier to review, I'm going to postpone this for the moment, fix everything else, and make this a patch on top of this one.
Attachment #398475 - Attachment is obsolete: true
Attachment #398475 - Flags: review?(roc)
Attached patch test changes v2 (deleted) — Splinter Review
updated test-changes subpatch. flagging dolske for review
Attachment #398182 - Attachment is obsolete: true
Attachment #398646 - Flags: review?
Attachment #398182 - Flags: review?(dolske)
Attachment #398646 - Flags: review? → review?(dolske)
Attached patch WIP 41 (obsolete) (deleted) — Splinter Review
Added the ability to call requestDecode() on imgIRequest
Attachment #398645 - Attachment is obsolete: true
Attached patch WIP 42 (obsolete) (deleted) — Splinter Review
magic WIP 42!

Fixed that small comment, and set things up to call RequestDecode() on demand.

I believe that this satisfies roc's review comments. I'm next going to look into the svg thing real quick (the last remaining issue from bz), then do the patch to sync decode borderimage and background at the appropriate time.

sooo close...
Attachment #398648 - Attachment is obsolete: true
Blocks: 514672
so yeah - the SVG thing is doable, but it requires passing a flag in a bunch of places (similar to the border image case), which I don't want to add to this patch.

Filed bug 514672 for it.
looks like the stuff from comment #104 applies to backgrounds too.

roc - since it looks like this is going to a lot of changes, I wanted to check to see whether you agreed with dbaron that we should just pass the display builder everywhere. I have some other things to do in the mean time, so I'll hold off on working on that until you weigh in (I would guess in 6ish hours).
Comment on attachment 398656 [details] [diff] [review]
WIP 42

Since the aBuilder passing stuff will be a separate patch, this patch should be complete. Flagging roc and joe for review, vlad for sr.
Attachment #398656 - Flags: superreview?(vladimir)
Attachment #398656 - Flags: review?(roc)
Attachment #398656 - Flags: review?(joe)
Attached patch os2 v2 (deleted) — Splinter Review
I think the widget changes are pretty trivial in general, so I'm not going to seek review on windows mac and linux. Still running the os2 stuff by Peter though, since we won't have any other indication if something is broken.
Attachment #398147 - Attachment is obsolete: true
Attachment #398671 - Flags: review?(mozilla)
You shouldn't pass display list stuff into nsCSSRendering. You should just pass in a sync decode flag.
ugh, for some reason my bugmail didn't refresh and I didn't see roc's comment until just now.

So I just finished writing the patches to pass things down as display lists. This ended up being a _ton_ of code, and it would be a lot more if everything were flags. I'm hoping roc just wants things to be converted into flags right before nsCSSRendering, which would make it not too hard to change. Regardless, I'm just going to post the patches and flag him for review to see what he says.
Attached patch background builder passing v1 (obsolete) (deleted) — Splinter Review
added a patch to pass the display builder down to the background drawing code.
Attachment #398814 - Flags: review?(roc)
Attached patch border builder passing v1 (obsolete) (deleted) — Splinter Review
added a patch to pass the display builder down to the border drawing code.
this is, ideally, code complete (though roc or the tinderbox may say otherwise). Pushed the whole shbang to try as b920c5a478ed.

By shbang I mean the following patch stack:

bug-435296.patch
bug-435296-builder-background.patch
bug-435296-builder-borderimage.patch
bug-512435.patch
bug-512435-tests.patch
bug-435296-tests.patch
Attachment #398815 - Flags: review?(roc)
Why does drawWindow need to be able to force sync decode, anyway, if we block onload until all images for which we've requested decoding are fully decoded?
(In reply to comment #117)
> Why does drawWindow need to be able to force sync decode, anyway, if we block
> onload until all images for which we've requested decoding are fully decoded?

Because we haven't necessarily requested decoding them.

* Canvas source pages aren't necessarily on screen (they could be in background tabs, the window could be minimized, they could be visible, or they could be out-of-viewport)
* Even if they are, they might only have shown up after onload (ie, we might be triggering off of img.onload to call drawWindow)
* Printing certainly prints things that are offscreen. Without the sync decode stuff, printing will print blank images for everything south of the viewport
Attached patch background builder passing v2 (obsolete) (deleted) — Splinter Review
fixed a compilation error on some platforms and a crasher bug for the background builder patch.
Attachment #398814 - Attachment is obsolete: true
Attachment #398854 - Flags: review?(roc)
Attachment #398814 - Flags: review?(roc)
Comment on attachment 398656 [details] [diff] [review]
WIP 42

looks like there's a bug with border-image in the big patch. I've looked into it a little bit but right now I need to go (leaving europe in 14 hours). Canceling review for the moment.
Attachment #398656 - Flags: superreview?(vladimir)
Attachment #398656 - Flags: review?(roc)
Attachment #398656 - Flags: review?(joe)
I think Bobby already knows this but I'll repeat it: we should only pass the nsDisplayListBuilder to methods that are already manipulating display lists, which nsCSSRendering is not. The nsCSSRendering methods need explicit flags.

Although one possible alternative is to have "sync decode" be part of the gfxContext state. That would cut down on flag passing but feels vaguely inappropriate, so I'm not really advocating it.
Comment on attachment 398671 [details] [diff] [review]
os2 v2

Am I blind or are there no differences between "os2 v1" and this? Were there supposed to be?

Anyway, this should be fine, although I don't have a tree to actually test it.
Attachment #398671 - Flags: review?(mozilla) → review+
Attached patch WIP 43 (obsolete) (deleted) — Splinter Review
Updated WIP - fixes some issues with border-image
Attachment #398656 - Attachment is obsolete: true
gah, bugzilla interdiff random fail a la bug 514802...

Anyone following interdiffs should note that there were also quite a few changes to imgRequestProxy.{cpp,h} and imgIRequest.idl in the above WIP.
Attached patch WIP 44 (obsolete) (deleted) — Splinter Review
Updated WIP

After some thought, I decided that we should after all be passing SYNC_DECODE for border-images. We already have to RequestDecode indiscriminately for border-images, so all passing SYNC_DECODE really does is make us draw the border-image the first time (and finish off any small amount of decoding that we might have left) instead of painting the fallback for a split second while the image decodes. The reason this is beneficial has to do with the fact that we still need to be able to do sync decoding on border-images no matter what (so that they can be drawn from drawWindow with canvas). However, conditionally doing a sync decode turns out to be really complicated, because it means that in the common case IsBorderImageLoaded needs to return true for FRAME_COMPLETE, but sometimes it needs to return true for LOAD_COMPLETE instead. Because IsBorderImageLoaded is used for all sorts of things like layout and visibility, this gets messy quick. Thus, we pass SYNC_DECODE for border images.
Attachment #399614 - Attachment is obsolete: true
Attachment #399881 - Flags: review?(roc)
Attached patch sync decode handling v3 (obsolete) (deleted) — Splinter Review
added the sync decode handling patch. pushed the whole thing to try as 1988886d65fb.
Attachment #398815 - Attachment is obsolete: true
Attachment #398854 - Attachment is obsolete: true
Attachment #399884 - Flags: review?(roc)
Attachment #398815 - Flags: review?(roc)
Attachment #398854 - Flags: review?(roc)
Comment on attachment 399881 [details] [diff] [review]
WIP 44

+  if (mActions & ACTION_REFLOW_ON_DECODE)
+    DoReflow();
+  if (mActions & ACTION_REDRAW_ON_DECODE)
+    DoRedraw(nsnull);
+  if (mActions & ACTION_REFLOW_ON_LOAD)
+    DoReflow();
+  if (mActions & ACTION_REDRAW_ON_LOAD)
+    DoRedraw(nsnull);

{}

+  // The reflow might not do all the invalidation we need, so continue
+  // on with the invalidation codepath.

This comment makes no sense here
Attachment #399881 - Flags: review+
Comment on attachment 399884 [details] [diff] [review]
sync decode handling v3

+  PRUint32 paintFlags = aBuilder->ShouldSyncDecodeImages()
+                          ? nsCSSRendering::PAINTBG_SYNC_DECODE_IMAGES
+                          : 0;

probably worth having a helper aBuilder->GetBackgroundPaintFlags() for this

+                                         nsPoint aPt, PRBool aSyncDecode)

I'd pass PRUint32 aBackgroundFlags here instead. Flags beat boolean parameters anytime.
Attachment #399884 - Flags: review?(roc) → review+
Attached patch WIP 45 (obsolete) (deleted) — Splinter Review
addressed roc's review concerns on the decode-on-draw patch. flagging joe for final review, vlad for sr.
Attachment #399881 - Attachment is obsolete: true
Attachment #400065 - Flags: superreview?(vladimir)
Attachment #400065 - Flags: review?(joe)
Attached patch sync decode handling v4 (obsolete) (deleted) — Splinter Review
Updated sync decode patch to address roc's review concerns. I also took the liberty of making sync decoding a member variable of ImageRenderer that gets set during construction, which I think is cleaner (no longer need the flags for Draw and PrepareImage). This patch should be ready to go.
Attachment #399884 - Attachment is obsolete: true
Comment on attachment 397297 [details] [diff] [review]
tests v1

also flagging dolske for review on the tests proper.
Attachment #397297 - Flags: review?(dolske)
Comment on attachment 397297 [details] [diff] [review]
tests v1

>+// Note - Please do not use the test_bug435296_ONLY_* images for any other
>+// mochitests, because having them already in the cache would screw up this test

No one will see this. :) Also noted on IRC that running the test again in the same mochitest session fails (unless you shift-reload), which is unfortunate at best.

It would probably be better to just flush the cache at the beginning of the test, to fix both problems. Should be trivial to do...

See http://mxr.mozilla.org/mozilla-central/source/browser/base/content/sanitize.js#118

>+  // Wait a few seconds, and then the image should be decoded
>+  setTimeout("finishTest();", 3000);

Ugh! This will likely cause intermittent failures on slow test boxes.

I guess you can't use the draw-to-a-canvas trick here, since the point is to ensure the image eventually decodes by itself (without being forced). So just make this poll once a second, and eventually fail if 60 seconds/loops go by and it's still not decoded. Seems like the best we can do without an "ondecode" event.
Attachment #397297 - Flags: review?(dolske) → review-
Or just poll until decoded; if we hit the mochitest timeout, the test failed.
Comment on attachment 398646 [details] [diff] [review]
test changes v2

>--- a/modules/libpr0n/test/reftest/apng/delaytest.html
...
> function startTimer() {
>+  var canvas = document.createElement("canvas");
>   const delay = 100;
>   setTimeout("document.documentElement.className = '';", delay);
> }

I don't think this change is needed, looks like an accidental paste.
Attachment #398646 - Flags: review?(dolske) → review+
Comment on attachment 400065 [details] [diff] [review]
WIP 45

> nsBMPDecoder::~nsBMPDecoder()
> {
>+  if (mColors)
>     delete[] mColors;
>-    if (mRow)
>-        free(mRow);
>+  if (mRow)
>+    free(mRow);
> }

There's some indentation weirdness here - parts of the rest of the file uses 4 spaces, but you've used 2 spaces here & elsewhere, so we end up with a weird mismatch. The rest of imagelib uses 2 spaces, so that's probably what we should use.

Also, you don't need an if before a delete - the check is implicit.

> //******************************************************************************
>-nsresult nsGIFDecoder2::ProcessData(unsigned char *data, PRUint32 count, PRUint32 *_retval)
>+nsresult nsGIFDecoder2::ProcessData(unsigned char *data, PRUint32 count)
> {
>   // Push the data to the GIF decoder
>-  
>   nsresult rv = GifWrite(data, count);
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   // Flushing is only needed for first frame
>   if (!mGIFStruct.images_decoded) {
>-    rv = FlushImageData();
>+    nsresult rv = FlushImageData();
>+    NS_ENSURE_SUCCESS(rv, rv);

You don't need to redeclare rv here.

> //******************************************************************************
> nsresult nsGIFDecoder2::BeginImageFrame(gfx_depth aDepth)
> {
>+

Drop that blank line.

>-  void      EndGIF();
>+  void      EndGIF(PRBool aSuccess);

I'm not in love with a boolean success/failure parameter, mostly because it's not obvious when reading the callers of EndGIF what they are indicating. I'd prefer an enumerated type DECODE_SUCCESS/DECODE_FAILURE, but I'll also accept something along these lines:

EndGIF(/* aSuccess = */ PR_TRUE);

(That version is my poor man's way of getting named parameters in C++ code.)

Same comment applies to all the decoders' Close() calls, especially since the meaning of the boolean parameter to their Close() calls is actually inverted from EndGIF(). Also NotifyDone(), 

>diff --git a/modules/libpr0n/decoders/icon/nsIconDecoder.cpp b/modules/libpr0n/decoders/icon/nsIconDecoder.cpp
>-  // Read the image data direct into the frame data
>-  rv = inStr->Read((char*)imageData, imageLen, &readLen);
>-  NS_ENSURE_SUCCESS(rv, rv);
>-  NS_ENSURE_TRUE(readLen == imageLen, NS_ERROR_UNEXPECTED);
>+        // Book Keeping
>+        aFromSegment++;
>+        aCount--;
>+        decoder->mState = iconStateHaveHeight;
>+        break;

Indentation in this file is a little wacky - jumps from 2 spaces to 8 in some spots.

+enum {
+  iconStateStart      = 0,
+  iconStateHaveHeight = 1,
+  iconStateReadPixels = 2,
+  iconStateFinished   = 3,
+  iconStateError      = 4
+};

Should this be in the header file?

>+// We make this a method to get the benefit of the 'this' parameter
>+NS_METHOD
>+nsPNGDecoder::ProcessData(unsigned char* aBuffer, PRUint32 aCount)
>+{
>+

Drop the blank line here too.

> static NS_METHOD ReadDataOut(nsIInputStream* in,
>                              void* closure,
>                              const char* fromRawSegment,
>                              PRUint32 toOffset,
>                              PRUint32 count,
>                              PRUint32 *writeCount)
> {
>+

And here.

>+   * @param aWhichFrame Frame specifier of the kFrame* variety.

(repeated several times) Frame specifier of the FRAME_* variety


> imgFrame *imgContainer::GetImgFrame(PRUint32 framenum)
> {
>-  nsresult rv = RestoreDiscardedData();
>-  NS_ENSURE_SUCCESS(rv, nsnull);
>+
> NS_IMETHODIMP imgContainer::GetCurrentFrameRect(nsIntRect &aRect)
> {
>+
> NS_IMETHODIMP imgContainer::GetCurrentFrameIndex(PRUint32 *aCurrentFrameIdx)
> {
>+
> NS_IMETHODIMP imgContainer::GetNumFrames(PRUint32 *aNumFrames)
> {
>+
> NS_IMETHODIMP imgContainer::GetAnimated(PRBool *aAnimated)
> {
>+
>+NS_IMETHODIMP imgContainer::GetDataSize(PRUint32 *_retval)
> {
>+
> NS_IMETHODIMP imgContainer::AppendFrame(PRInt32 aX, PRInt32 aY, PRInt32 aWidth,
>                                         PRInt32 aHeight, 
>                                         gfxASurface::gfxImageFormat aFormat,
>                                         PRUint8 **imageData,
>                                         PRUint32 *imageLength)
> {
>+

A bunch of empty lines to start functions.

>+  // Notify that we discarded
>+  nsCOMPtr<imgIDecoderObserver> observer(do_QueryReferent(self->mObserver));
>+  if (observer)
>+    observer->OnDiscard(nsnull);
>+
>+
>+
>+  // Log

An unusual number of newlines there!

>+// This function is called whenever somebody touches us in a meaningful way.

What does 'meaningful' mean?

TouchImage() implies to me that we're just going to change its timestamp (and anything that relies on that, like the discard timer), so it's very surprising to me that it kicks off decoding as well. I think some explanatory comments and potentially a rename are in order here.

>   inline Anim* ensureAnimExists() {
>-    if (!mAnim)
>+

Two requests - could you move the brace to the next line, and of course remove the blank line at the beginning of the function?

>+/* void onDiscard (in imgIRequest request); */
>+NS_IMETHODIMP imgRequest::OnDiscard(imgIRequest *aRequest)
>+{
>+

Blank line.

>+  // Clear the state bits we no longer deserve.
>+  PRUint32 stateBitsToClear = stateDecodeStarted;
>+  mState &= ~stateBitsToClear;
>+
>+  // Clear the sttus bits we no longer deserve.

_status_


Hooray! Looks good otherwise.
Attachment #400065 - Flags: review?(joe) → review+
Blocks: 516063
Attachment #400065 - Flags: superreview?(vladimir) → superreview+
Attached patch tests v2 (deleted) — Splinter Review
updated tests to address dolske's review comments. reflagging him for review.
Attachment #397297 - Attachment is obsolete: true
Attachment #400180 - Flags: review?(dolske)
Attachment #400180 - Flags: review?(dolske) → review+
Attached patch RC1 (obsolete) (deleted) — Splinter Review
Uploaded the (hopefully) final decode-on-draw patch. This addresses all of joe's concerns, and thus the whole patch stack should be ready:
decode-on-draw
sync decode handing
onload blocking
onload blocking tests
decode on draw tests

Pushed the whole thing to try as 6e506aa04f84. If everything looks good in the morning, it's landing time.
Attachment #400065 - Attachment is obsolete: true
Attached patch RC2 (deleted) — Splinter Review
small build bustage fix on everything but mac
Attachment #400243 - Attachment is obsolete: true
Attached patch sync decode handling v5 (deleted) — Splinter Review
or maybe the bustage fix was in this patch.

Also fixed a really annoying regression of bug 494667 as the result of not initializing mBGPaintFlags (and thus passing PANTBG_WILL_PAINT_BORDER in situations where we shouldn't because the flags were garbage).
Attachment #400096 - Attachment is obsolete: true
Pushed to mozilla-central:

Decode-on-draw: 9f856f094fea
Sync decode handling: 03d92c1c09ff
tests: faf7eff2f2ea
Attached patch Fix SeaMonkey bustage [Moved to bug 516195] (obsolete) (deleted) — Splinter Review
Attachment #400349 - Flags: review?(bobbyholley)
Attachment #400349 - Attachment description: Fix SeaMonkey bustage → Fix SeaMonkey bustage [Moved to bug 516195]
Attachment #400349 - Attachment is obsolete: true
Attachment #400349 - Flags: review?(bobbyholley)
There were some apparent perf regressions that appeared on mac osx last night. Joe and I ran some experiments, but things didn't turn out as clear as we'd hoped. As a backup, I'm hoping that we can disable gDecodeOnDraw but leave the new architecture in (much less invasive than a backout).

Pushed a blank patch to try as 5145739b2a07, and a patch with decode-on-draw disabled to try as 51d2fdadf19f. Hopefully the second patch should revert us to the old behavior with the new machinery.
also pushed 16a952b7e73d to try, which disables discarding also. This disables the storing of source data, and thus cuts out a lot of the new machinery. It will be interesting to compare the three.
GAH, just realized I forgot to include the gif fix, so the last two pushes will hang on try.

Repushed decode-on-draw=false,discardable=true as 0a602217f2d8, and decode-on-draw=false,discardable=false as be9bd7d793f4.

If anyone is around who can cancel try builds, 51d2fdadf19f and 16a952b7e73d are useless and should be killed.
Depends on: 516290
No longer blocks: 512260
Depends on: 512260
Depends on: 516307
Resolving this bug as 'fixed' because it's so huge. Everything else should be followup bugs.

Turning off discard and decode-on-draw seems to make the perf regressions go away, so we're doing it for now in bug 516311 to give ourselves time to analyze things without backing out.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Blocks: 516320
Depends on: 516304
No longer depends on: 516307
No longer blocks: 516320
Depends on: 516320
Depends on: 516644
Depends on: 516335
Depends on: 516772
Blocks: 517775
(In reply to comment #146)
> Resolving this bug as 'fixed' because it's so huge. Everything else should be
> followup bugs.
> 
> Turning off discard and decode-on-draw seems to make the perf regressions go
> away, so we're doing it for now in bug 516311 to give ourselves time to analyze
> things without backing out.

If the perf regression is MAC only, and this "fix" seems seems to cause broken drawing of icons in Linux dialog boxes, and a fix does not appear to be coming anytime soon, for the benefit of us poor disenfranchised Linux users, could either the original patch be backed out, or could the disabling of this feature for the perf regression be somehow only done onMAC until this is all sorted?
Depends on: 516665
> If the perf regression is MAC only, and this "fix" seems seems to cause broken
> drawing of icons in Linux dialog boxes, and a fix does not appear to be coming
> anytime soon, for the benefit of us poor disenfranchised Linux users, could
> either the original patch be backed out, or could the disabling of this feature
> for the perf regression be somehow only done onMAC until this is all sorted?

The perf regression should be fixed (see bug 517091). The reason these things haven't been switched back on has to do with bug 517091. I'll try to look at it sometime soon.
Depends on: 523944
This seems to have caused bug 523944.
OS: Mac OS X → All
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.3a1
Depends on: 565773
Blocks: 563439
Depends on: 578684
No longer depends on: 578684
Depends on: 580190
No longer depends on: 565773
No longer blocks: 511899
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: