Closed
Bug 331298
Opened 19 years ago
Closed 18 years ago
Speed up image drawing/decoding
Categories
(Core :: Graphics, defect)
Core
Graphics
Tracking
()
VERIFIED
FIXED
People
(Reporter: pavlov, Assigned: pavlov)
References
Details
(Keywords: perf)
Attachments
(2 files, 2 obsolete files)
(deleted),
patch
|
vlad
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
pavlov
:
review+
|
Details | Diff | Splinter Review |
Now that we can really make use of 32bpp bitmaps, we should decode to 32bit ARGB we should do so. I'm seeing a huge hot spot in the code we currently use to combibne RGB and A1/A8 in to ARGB so fixing this should give us a big perf win. Doing this also cleans up lots of code in the image decoders and will eventually lead to a single path for decoding in each decoder instead of the mix of mac/windows/bgr/rgb/toptobottom/bottomtotop/endianness/etc mess that is there now.
Assignee | ||
Comment 1•19 years ago
|
||
this is just a checkpoint along the way. still need to fix animated gifs, bmps, xbms and platform icons.
this patch also ends up holding on to 2 copies of the image when it optimizes. need to fix that too.
also need to remove the aptr stuff in the rgb_1 case in the png decoder since it is unused
Assignee: nobody → pavlov
Status: NEW → ASSIGNED
Assignee | ||
Comment 2•19 years ago
|
||
if anyone wants to do bmp, ico, xbm, i wouldn't be too upset. I think I got most of the gif stuff, but possibily missed a couple spots.
Attachment #215852 -
Attachment is obsolete: true
Assignee | ||
Comment 3•19 years ago
|
||
I think this solves everything short of transparency in ICOs. I'll fix that as a seperate bug.
Attachment #216084 -
Attachment is obsolete: true
Attachment #216170 -
Flags: review?(vladimir)
Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)
>--- gfx/src/thebes/nsThebesImage.cpp 8 Mar 2006 00:28:19 -0000 1.7
>+++ gfx/src/thebes/nsThebesImage.cpp 25 Mar 2006 00:02:49 -0000
> nsresult
> nsThebesImage::Init(PRInt32 aWidth, PRInt32 aHeight, PRInt32 aDepth, nsMaskRequirements aMaskRequirements)
> {
>- return NS_OK;
>-}
>+ mImageSurface = new gfxImageSurface(format, mWidth, mHeight);
>+ memset(mImageSurface->Data(), 0xF, mHeight * mImageSurface->Stride());
0xFF, not 0xF. Actually, make it 0x00, I dunno what I was thinking. Why do we always create the image
>Index: gfx/thebes/src/gfxWindowsFonts.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/thebes/src/gfxWindowsFonts.cpp,v
>retrieving revision 1.29
>diff -u -6 -p -r1.29 gfxWindowsFonts.cpp
>--- gfx/thebes/src/gfxWindowsFonts.cpp 14 Mar 2006 23:17:55 -0000 1.29
>+++ gfx/thebes/src/gfxWindowsFonts.cpp 25 Mar 2006 00:02:49 -0000
This stuff shouldn't be part of this patch
>--- modules/libpr0n/decoders/gif/imgContainerGIF.cpp 21 Feb 2006 23:19:19 -0000 1.23
>+++ modules/libpr0n/decoders/gif/imgContainerGIF.cpp 25 Mar 2006 00:02:50 -0000
>@@ -45,12 +45,16 @@
> //******************************************************************************
> // Fill aFrame with black. Does not change the mask.
> void imgContainerGIF::BlackenFrame(gfxIImageFrame *aFrame)
> {
> if (!aFrame)
> return;
>-
>+#ifdef MOZ_CAIRO_GFX
>+ nsCOMPtr<nsIImage> img(do_GetInterface(aFrame));
>+ if (!img)
>+ return;
>+ PRInt32 w, h;
>+ aFrame->GetWidth(&w);
>+ aFrame->GetHeight(&h);
>+
>+ nsRefPtr<gfxASurface> surf;
>+ img->GetSurface(getter_AddRefs(surf));
>+ nsRefPtr<gfxContext> ctx = new gfxContext(surf);
>+ ctx->SetColor(gfxRGBA(0.0, 0.0, 0.0, 1.0));
>+ ctx->Rectangle(gfxRect(0, 0, w, h));
>+ ctx->Fill();
>+#else
No need for the w/h business; just ctx->SetColor(gfxRGB(0,0,0)); ctx->SetOperator(SOURCE); ctx->Paint();
r= me with those things fixed.. I /may/ be too tired to review the patch, but I think it looks ok.
Attachment #216170 -
Flags: review?(vladimir) → review+
Assignee | ||
Comment 5•19 years ago
|
||
i always create the image because we 99.99% of the time will write stuff to it almost immediately. fixed the other stuff
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment 6•19 years ago
|
||
Additional speedup is available: Change line 548 of
nsPNGDecoder.cpp (after applying "third pass" patch) from
} else {
to
} else if (a != 255) {
and line 562 from
}
to
} else {
cptr += 4;
}
This will avoid calculations in the common case of fully opaque pixels.
Comment 7•19 years ago
|
||
Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)
+ const PRUint8 a = (format == gfxIFormat::RGB_A1) ? abpr[i>>3] : abpr[i];
why is this right in the A1 case? don't you need to find the right bit and use 0/255?
XBM:
+ *ar++ = (val << 24) | 0;
shouldn't this be:
*ar++ |= (val << 24);
?
Comment 8•19 years ago
|
||
const PRUint8 val = ((pixel & (1 << i)) >> i) ? 255 : 0;
*ar++ = (val << 24) | 0;
We don't even need "val". Replace the two statements with
*ar++ |= ((pixel & (1 << i)) >> i) ? 0xFF000000L : 0;
Assignee | ||
Updated•19 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 9•19 years ago
|
||
this didn't really give me the speed up i was looking for on Tp. not sure why. Lets toss some more performance gains at the problem...
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 10•19 years ago
|
||
this patch builds on top of the previous stuff checked in. it causes the PNG and JPEG decoders to write directly to the buffer provided from the gfxImageSurface. This way we don't have to allocate temp row buffers to write to. I've also changed the JPEG, PNG and GIF decoders to write out as 32bit values using shifts so that a) I can get rid of the endian ifdefs and b) hopefully make the writes a bit faster.
Attachment #216289 -
Flags: review?(vladimir)
Assignee | ||
Updated•19 years ago
|
Attachment #216170 -
Attachment description: third pass → third pass (checked in)
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)
>+
>+ PRUint32 colorIndex = *rowBufIndex * 3;
>+ *rgbRowIndex++ = (0xFF << 24) |
>+ (cmap[colorIndex] << 16) |
>+ (cmap[colorIndex+1] << 8) |
>+ (cmap[colorIndex+2]);
Make all this stuff use a macro for readability at some point.
>+ // Note! row_stride here must match the row_stride in
>+ // nsJPEGDecoder::WriteFrom
>+#if defined(MOZ_CAIRO_GFX) || defined(XP_MAC) || defined(XP_MACOSX)
>+ const int row_stride = mInfo.output_width << 2; // * 4
>+ // we're thebes. we can write stuff directly to the data
We're not always Thebes, since XP_MAC; change that comment to be "we're writing XRGB data out" or something?
>+ PRUint8 *imageData;
>+ PRUint32 imageDataLength;
>+ mFrame->GetImageData(&imageData, &imageDataLength);
>+ nsCOMPtr<nsIImage> img(do_GetInterface(mFrame));
>+ nsIntRect r(0, 0, mInfo.output_width, 1);
>+#else
>+ const int row_stride = mInfo.output_width * 3;
>+#endif
>+ PRUint8 r = *j1++;
>+ PRUint8 g = *j1++;
>+ PRUint8 b = *j1++;
>+ *ptrOutputBuf++ = (0xFF << 24) | (r << 16) | (g << 8) | b;
I'm pretty sure the compiler will do the right thing here and get rid of the temporaries, but you can get rid of them with a macro; ((*j1++) << 16) | ((*j1++) << 8) ... woudl be too ugly, I think.
This looks fine, but I don't think you'll see any speedup from it -- we had way more copies before and things didn't get faster. I think there's some format strangeness...
Attachment #216289 -
Flags: review?(vladimir) → review+
Comment 12•19 years ago
|
||
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)
why do you GetImageData also for non-cairo mac, in jpeg?
( nsJPEGDecoder::OutputScanlines())
couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?
Why do you call ImageUpdated in the jpeg decoder but not the png one?
Attachment #216289 -
Flags: review+ → review?(vladimir)
Comment 13•19 years ago
|
||
> couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?
ah, no, you can't, unfortunately...
Comment 14•19 years ago
|
||
In the GIF decoder one can also directly write to the thebes image buffer (instead of allocating RGBRow...)
Assignee | ||
Updated•19 years ago
|
Attachment #216289 -
Attachment description: write directly → write directly (checked in)
Attachment #216289 -
Flags: review?(vladimir) → review+
Assignee | ||
Comment 15•19 years ago
|
||
Alfred Kayser: yeah, there is still a lot of optimizations that can be had. 32bit reads from png data, etc. I'm planning on leaving this bug open until we clean it up more.
gif is in a bad place right now because of the way it deals with previous frames and compositing. i'll post more in 324707 soon.
Comment 16•19 years ago
|
||
*** Bug 280529 has been marked as a duplicate of this bug. ***
Comment 17•19 years ago
|
||
(In reply to comment #7)
> shouldn't this be:
> *ar++ |= (val << 24);
OK, thinking some more about this, it shouldn't. but the | 0 does look odd.
No longer depends on: 332611
Comment 18•19 years ago
|
||
Could this have caused bug 332713?
Comment 19•19 years ago
|
||
and Bug 332386 ?
Assignee | ||
Updated•18 years ago
|
OS: Windows XP → All
Hardware: PC → All
Comment 20•18 years ago
|
||
Add bug 143046 to the dependency list, as the patch for that one (using 8bit storage for animation frames of GIF's) provides a huge memory saving and a performance speed up of 100% (i.e. one specific large animation now takes < 10% cpu, instead of > 20% cpu load).
Depends on: slowGIF
Comment 21•18 years ago
|
||
Bug 360000 is about addressing the ICO and BMP transparancy decoding in Cairo
Depends on: 360000
Comment 22•18 years ago
|
||
Note bug 324707 mentioned in comment #15 is fixed (by bug 336532)
Opened bug 366465 for further optimization of buffer usage in GIF decoding in Cairo.
Comment 23•18 years ago
|
||
Why not add a 'NS_COLOR_ARGB' for Cairo pixels or something like that?
Also alpha premultiplication is needed in multiple locations.
(See bug 360000 for an example of a SetPixel(r,g,b,a) doing the premultiplication
Assignee | ||
Comment 25•18 years ago
|
||
this is fixed enough.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago → 18 years ago
Resolution: --- → FIXED
Updated•18 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•