Closed
Bug 166862
Opened 22 years ago
Closed 12 years ago
Valid (but huge) images crash mozilla in nsImageGTK::Init
Categories
(Core Graveyard :: Image: Painting, defect)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: tenthumbs, Unassigned)
References
Details
(Keywords: crash)
Attachments
(5 files, 4 obsolete files)
(deleted), text/plain
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
I hope this is the right component.
This is a companion to bug 153621.
All the common image formats (JPEG, GIF, and PNG) allow at least 16-bit
dimensions, but nsImageGTK assumes 1) that all calculations will fit into
int32's and 2) that all allocations will succeed. Neither is true on Linux,
so it's easy to crash mozilla.
In nsImageGTK::Init there is the code
// create the memory for the image
ComputeMetrics();
mImageBits = (PRUint8*) new PRUint8[mSizeImage];
ComputeMetrics is in nsImageGTK.h and it does a nice job but it doesn't
check for overflows so mSizeImage can be negative or the right value mod
2^32. Neither case is good.
Mozilla doesn't use C++ exceptions so there's no way to trap operator new[].
If the underlying allocation fails you get system-dependent behavior. On
Linux, the default exception handlers in libstdc++ just abort the process.
Not good either.
There's also the question of whether mozilla should try to load a huge image
in the first place. The maximum contiguous space theoretically available to
a typical 32-bit Linux process is somewhat less than 2GB so loading the
largest images is impossible. You probably don't want to load anything
anywhere near that big so there has to be some limit imposed. Personally, I
think there should be a pref.
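As a sketch of what such a pref might look like (the pref name and default
here are invented for illustration, not an existing Mozilla pref):

// Hypothetical sketch: cap image allocations with a pref.
// "image.max_decode_bytes" is an invented name.
#include "nsCOMPtr.h"
#include "nsIServiceManager.h"
#include "nsIPrefService.h"
#include "nsIPrefBranch.h"

static PRInt32 GetMaxImageBytes()
{
    PRInt32 maxBytes = 64 * 1024 * 1024;    // default: 64MB
    nsCOMPtr<nsIPrefBranch> prefs = do_GetService(NS_PREFSERVICE_CONTRACTID);
    if (prefs)
        prefs->GetIntPref("image.max_decode_bytes", &maxBytes);
    return maxBytes;
}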
How to fix this? The first thing is to change ComputeMetrics so it doesn't
overflow. It should return a value indicating success or failure. Since most
images aren't big, two execution paths should work:
if (mWidth < 8192 && mHeight < 8192)
{
... do fast integer calculations and test if small enough
}
else
{
... do slow double precision floating point calculations and test
}
Of course new[] has to be replaced by some flavor of malloc. This should be
easy but I don't know for sure.
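A minimal sketch of that two-path check (illustrative names, not the actual
patch; the slow path here uses 64-bit integers rather than doubles, and it
assumes the result must fit in a PRInt32 with at most 4 bytes per pixel):

// Illustrative overflow-safe size computation.
#include "prtypes.h"    // PRInt32, PRInt64, PRBool, PR_INT32_MAX

static PRBool ComputeImageSize(PRInt32 aWidth, PRInt32 aHeight,
                               PRInt32 aBytesPerPixel, PRInt32 *aSize)
{
    if (aWidth <= 0 || aHeight <= 0 || aBytesPerPixel > 4)
        return PR_FALSE;

    if (aWidth < 8192 && aHeight < 8192) {
        // fast path: 8191 * 8191 * 4 < 2^31, so PRInt32 math can't overflow
        *aSize = aWidth * aHeight * aBytesPerPixel;
        return PR_TRUE;
    }

    // slow path: widen to 64 bits, then range-check the result
    PRInt64 size = (PRInt64) aWidth * aHeight * aBytesPerPixel;
    if (size > PR_INT32_MAX)
        return PR_FALSE;
    *aSize = (PRInt32) size;
    return PR_TRUE;
}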
Because I can crash mozilla at will with valid data, setting severity to
critical.
Comment 1•22 years ago
>Of course new[] has to be replaced by some flavor of malloc. This should be
>easy but I don't know for sure.
hm, not sure if that is correct. I think we're somehow telling the compiler to
return NULL, or maybe we have overridden operator new. anyway, I think the
result should be null-checked.
are you sure that mozilla terminates because of an uncaught exception?
> hm, not sure if that is correct. I think we're somehow telling the
> compiler to return NULL, or maybe we have overridden operator new.
> anyway, I think the result should be null-checked.
On Linux new[] is handled in libstdc++ and it _always_ installs default
exception handlers. The compiler has no control over the defaults.
In the situation here, operator new[] calls operator new which uses the
system malloc. If that returns NULL then the default handler aborts the
process. You are supposed to change this behavior with C++ try/catch
exception handling. Mozilla doesn't want to do that so it can't use new.
You also can't do null checking because you never see a NULL; the exception
fires first.
> are you sure that mozilla terminates because of a uncaught exception?
Absolutely, at least for bad_alloc. Here's the last part of the stack trace
when I point mozilla at a 20000x20000 PNG.
#0 0x4055af41 in __kill () from /lib/libc.so.6
#1 0x40280d79 in pthread_kill (thread=1024, signo=6) at signals.c:65
#2 0x402811c7 in raise (sig=6) at signals.c:226
#3 0x4055c223 in abort () at ../sysdeps/generic/abort.c:88
#4 0x40513ec4 in __cxxabiv1::__terminate(void (*)()) (handler=0x4055c160
<abort>) at
/mnt1/GCC/3.1/gcc-3.1.1/libstdc++-v3/libsupc++/eh_terminate.cc:47
#5 0x40513f11 in std::terminate() () at
/mnt1/GCC/3.1/gcc-3.1.1/libstdc++-v3/libsupc++/eh_terminate.cc:57
#6 0x40514081 in __cxa_throw () at
/mnt1/GCC/3.1/gcc-3.1.1/libstdc++-v3/libsupc++/eh_throw.cc:77
#7 0x4051426f in operator new(unsigned) (sz=1079119808) at
/mnt1/GCC/3.1/gcc-3.1.1/libstdc++-v3/libsupc++/new_op.cc:54
#8 0x4051436c in operator new[](unsigned) (sz=1200000000)
at /mnt1/GCC/3.1/gcc-3.1.1/libstdc++-v3/libsupc++/new_opv.cc:36
#9 0x41243374 in nsImageGTK::Init(int, int, int, nsMaskRequirements)
(this=0x8593010, aWidth=20000, aHeight=20000, aDepth=24,
aMaskRequirements=nsMaskRequirements_kNoMask) at nsImageGTK.cpp:196
My system didn't have 1200000000 bytes available, so malloc failed and the
process bombed out immediately.
Note that Windows behavior is different. There, almost all requests succeed
but fulfilling them can be a problem.
Here's a simple exception handler test for Linux. If you run it with no
arguments, you get the default bad_alloc handler and it should abort. If there
are any arguments, it sets its own handler and should succeed. If you compile
it with -fno-exceptions then it crashes in either case.
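The deleted attachment presumably looked something like this (a hypothetical
reconstruction, not the original):

#include <new>
#include <cstdio>
#include <cstdlib>

// With no arguments the default handler runs and the process aborts;
// with any argument this handler is installed and the program exits cleanly.
static void my_handler ()
{
    std::fprintf (stderr, "allocation failed, exiting cleanly\n");
    std::exit (0);
}

int main (int argc, char **argv)
{
    if (argc > 1)
        std::set_new_handler (my_handler);
    char *p = new char[1024 * 1024 * 1024];     // likely to fail
    std::printf ("p = %p\n", (void *) p);
    return 0;
}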
This is a patch for discussion. Mozilla no longer aborts on huge images but now
claims the images are corrupt. It's a start.
Since it's not possible to allocate a contiguous 2GB block on a typical Linux, I
decided to make that the maximum for an image. There probably should be a pref
for a smaller limit. I wouldn't want mozilla to allocate 200MB for an image just
because some idiot web page wanted to.
Comment 6•22 years ago
Comment on attachment 101151 [details] [diff] [review]
nsImageGTK patch
+ { // slow way; could use long longs.
why not use PRUint64?
also, why fail if the image is more than 4GB? 64-bit platforms can handle >4GB
allocations just fine. I'm against introducing arbitrary limits...
Intel x86 chips have few registers so their 64-bit support is marginal. The
FPU stack has a lot more. These days there's little performance difference.
It's not all that important here, anyway.
4GB is the theoretical maximum for a 32-bit address space. It's almost always
less. A typical Linux has only a 3GB process space. Since almost all mozillas
run on 32-bit systems I don't think it's worth allowing such sizes.
Right now I think it's more useful to find a better error for this case.
Perhaps something like: "%s cannot be loaded because of a system allocation
failure."
Comment 9•22 years ago
>4GB is the theoretical maximum for a 32-bit address space
this is why I said "on 64 bit platforms".
I would not do this check at all, and wait until later if the malloc fails.
Reporter | Comment 10•22 years ago
Note that the goodies like mSizeImage are all int32's so they can't hold more
than 2^31-1. That's the constraint I apply in ComputeMetrics, I just do it in
two stages. If you want to rewrite imagelib to use larger size variables, be
my guest. That's far beyond my abilities.
Reporter | Comment 11•22 years ago
Better error return value. Mozilla still says the image has errors but that's
coming from content and I don't have a clue how to fix that.
Attachment #101151 - Attachment is obsolete: true
Reporter | Comment 12•22 years ago
It helps a lot if I use the right patch.
Attachment #101397 - Attachment is obsolete: true
Comment 13•22 years ago
Comment on attachment 101406 [details] [diff] [review]
the right improved patch
a few comments:
+ // Right now 2GB-1 sizes are allowed
I don't think that comment makes sense at the place where it is. because at
that place, no such check is done.
Personally, I prefer if (!mImageBits) to if (mImageBits == nsnull)... but, this
doesn't really matter
finally, please use PRInt64/PRUint64 rather than double for the calculation.
see http://www.mozilla.org/projects/nspr/reference/html/Prtyp.html#14775
using a double for an integer type is just a dirty hack, imho.
Reporter | Comment 14•22 years ago
> I don't think that comment makes sense at the place where it is. because
> at that place, no such check is done.
I clarified it. The whole point is that a check should be done there.
> Personally, I prefer if (!mImageBits) to if (mImageBits == nsnull)...
> but, this doesn't really matter
I always treat typedef'ed and #define'ed items as opaque quantities so I
always compare to them explicitly. If nsnull were always 0 then it
wouldn't be necessary or desirable. It's not essential that nsnull = 0, so
your form isn't guaranteed to be identical to mine.
> finally, please use PRInt64/PRUint64 rather than double for the
> calculation.
Why not. At one point I was worried about overflow but now that it's
apparent that that's not going to happen, int64 is fine. It's slower than
the floating point, though. Besides, the last patch was broken.
I'll attach a new patch but it's still not ready. There's the pref to
limit the size. There's also the question of what to return if
ComputeMetrics fails. Right now it returns NS_ERROR_OUT_OF_MEMORY, which
isn't really right because ComputeMetrics fails for administrative reasons,
not some system fault. I think I know how to create a new error number but
I don't know how to create a string nor do I know how to report it to the
user.
Returning NS_ERROR_OUT_OF_MEMORY for the malloc failures isn't really
right either. It's not that the system is out of memory, it's that there
isn't enough memory for the particular image. Same issues as previously,
creating a new error and reporting it.
Finally, it's still an abomination that mozilla claims that the image is
broken. An lxr search for "InvalidImage" shows that both layout and content
have their sticky fingers all over this. In fact, it's content that rather
arrogantly assumes that any load failure must be the image's fault.
All of this needs to be dealt with.
BTW, the other platforms all have the equivalent of ComputeMetrics and
they all have the same overflow problems in the calculations. Someone
should deal with that.
Reporter | Comment 15•22 years ago
Attachment #101406 - Attachment is obsolete: true
Comment 16•22 years ago
firstly, no, that's not how you use PRInt64. please read the URL I gave you. you
have to use macros for things like multiplication.
next, you can use PR_INT32_MAX instead of 2147483647
+ rb64= tmp64 >> 5 + (tmp64 & 0x1F) ? 1 : 0;
this could really do with a few parentheses
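For reference, a minimal sketch of that multiply-and-compare written with the
prlong.h macros (variable names are illustrative, not from the patch):

// Illustrative use of the NSPR 64-bit macros.
#include "prlong.h"     // LL_I2L, LL_MUL, LL_CMP
#include "prtypes.h"    // PRInt32, PRInt64, PR_INT32_MAX

PRInt64 w64, h64, bpp64, size64, max64;
LL_I2L(w64, mWidth);            // widen PRInt32 to PRInt64
LL_I2L(h64, mHeight);
LL_I2L(bpp64, 3);               // e.g. 24-bit RGB
LL_I2L(max64, PR_INT32_MAX);
LL_MUL(size64, w64, h64);       // size64 = w64 * h64
LL_MUL(size64, size64, bpp64);  // size64 *= bpp64
if (LL_CMP(size64, >, max64))
    return NS_ERROR_FAILURE;    // won't fit in a PRInt32 mSizeImage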
Reporter | Comment 17•22 years ago
> firstly, no, that's not how you use PRInt64. please read the URL I gave you.
Actually, that's the wrong URL; you should have said prlong.html. I did look
at that and it's très icky. Is there any platform using gtk where the compiler
doesn't understand int64?
> + rb64= tmp64 >> 5 + (tmp64 & 0x1F) ? 1 : 0;
> this could really do with a few parentheses
yes, that should be
rb64= (tmp64 >> 5) + ((tmp64 & 0x1F) ? 1 : 0);
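(That's the standard round-up division; an equivalent one-step form, assuming
tmp64 is non-negative, is rb64 = (tmp64 + 31) >> 5.)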
Right now I don't have a lot of time for this and I'm spending that time on
trying to figure out how to undo mozilla's theory that the image is corrupt.
Any help would be appreciated.
Comment 19•21 years ago
*** Bug 219632 has been marked as a duplicate of this bug. ***
Comment 20•21 years ago
this is tenthumbs' patch, using 64-bit macros in all their glory.
this also handles the "fast" case first since it will be (by far) the most
common.
Reporter | Comment 21•21 years ago
Hold up on this. There seems to be similar code in
<http://lxr.mozilla.org/seamonkey/source/gfx/src/shared/gfxImageFrame.cpp#45>
so some of this may not be necessary.
Comment 22•21 years ago
+ PRInt64 tmp64a, tmp64b, tmp64c, tmp64d;
sorry for not thinking about this earlier... but there is nsInt64, a C++ wrapper
around PRInt64 with overloaded operators, which might make this code easier to read.
I won't complain if you leave it as-is, though.
Reporter | Comment 23•21 years ago
The 64-bit macros are incomprehensible, at least to me. :-) I see I wasn't
thinking when I wrote this. Here's an equivalent patch with normal variables. I
also spotted some more new[] calls and replaced them.
I'm not sure all the testing is really necessary any more but it won't hurt
much. It bothers me that we have to replace new[] but I guess it's necessary.
Comment 24•21 years ago
> The 64 bit macros are incomprehensible, at least to me. :-)
there's always nsInt64...
>It bothers me that we have to replace new[] but I guess it's necessary.
the entire codebase assumes that new won't throw an exception in OOM... we
should probably use new (std::nothrow), I guess...
Reporter | Comment 25•21 years ago
Let's try the patch that works this time.
Attachment #141459 - Attachment is obsolete: true
Reporter | Comment 26•21 years ago
> we should probably use new (std::nothrow), I guess...
Will that work without exception handling?
Comment 27•21 years ago
as that flag tells new _not_ to throw an exception, I sure hope so.
Reporter | Comment 28•21 years ago
This test program:

#include <new>
#include <cstdio>

using namespace std;

int main (void)
{
    char *cp1;
    char *cp2;

    // on a 32-bit machine the second 1GB request typically fails
    cp1 = new (std::nothrow) char[1024 * 1024 * 1024];
    printf ("cp1 = %p\n", cp1);
    cp2 = new (std::nothrow) char[1024 * 1024 * 1024];
    printf ("cp2 = %p\n", cp2);
    return 0;
}
works with gcc 3.3.2 and gcc2.95.3 on Linux (cp2 is null). I have no idea if
this works with all the compilers mozilla uses. If it does, we could convert
mozilla (or at least parts of it) to this form.
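If it does hold up everywhere, the allocation from the description would become
something like this (a sketch, not the actual patch):

// Sketch: the nothrow form makes failure show up as a null pointer.
// Requires <new> for std::nothrow.
mImageBits = new (std::nothrow) PRUint8[mSizeImage];
if (!mImageBits)
    return NS_ERROR_OUT_OF_MEMORY;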
Comment 29•21 years ago
gfxImageFrame.cpp checks that a conservative estimate of the memory
requirements (4*width*height) will not overflow a signed 32-bit integer.
The more serious and harder to solve problem on gtk is that the X Protocol
has numerous 16-bit signed and unsigned limits on coordinates and
dimensioning, and neither gdk nor our gfx code checks and works around
these problems for images.
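A defensive guard for those protocol limits might look like this (illustrative
only, not from any attached patch):

// X11 carries coordinates and dimensions in 16-bit protocol fields,
// so anything wider or taller than 32767 can't be expressed on the wire.
if (aWidth > 0x7FFF || aHeight > 0x7FFF)
    return NS_ERROR_FAILURE;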
Reporter | Comment 30•21 years ago
I said this in another bug but there are three main issues:
1) exhausting the system virtual memory,
2) exhausting the process's free space, and
3) exhausting another process's space.
They interact with each other. This bug started out as a way to address case
2. I had to change ComputeMetrics only because it could overflow and there
were no other tests at the time. It could be simpler if it is absolutely
guaranteed that it won't overflow.
Someone somewhere has to try to protect against case 3. We could restrict
the largest dimension to 32767 but I'm sure someone will complain about
their 40000x2 image not working.
Comment 31•21 years ago
*** Bug 238401 has been marked as a duplicate of this bug. ***
Reporter | Comment 32•21 years ago
Basically patch6 with new (std::nothrow) instead of PR_Malloc.
Mozilla won't crash but still reports images as broken when they're not.
Comment 33•21 years ago
are you sure (std::nothrow) is supported by all of our unix compilers? i was
working on a patch to macroize new to do this stuff, but never finished it (the
macroization was not fun)
Reporter | Comment 34•21 years ago
I'm willing to guarantee that this _will_ break some compiler, but we'll never
know unless we try. It does work in gcc 2.95.3 and 3.3.3 which is good.
However, theoretically I should match new (std::nothrow) [] with a suitable
delete []. I must be missing something, but I can't find a syntax gcc likes,
so this might break in some future version.
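For what it's worth, no special delete syntax should be needed: a plain
delete[] is correct, and the nothrow operator delete[] is only selected
automatically if a constructor throws inside the new expression. A minimal
sketch:

#include <new>
#include <cstddef>

void example (std::size_t n)
{
    char *p = new (std::nothrow) char[n];
    // ... null-check and use p ...
    delete[] p;     // there is no "delete (std::nothrow) []" syntax
}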
The real question is whether the standards require new[] to throw exceptions.
If so, then mozilla has a big problem. If not, then we can provide an ad hoc
solution (as we do here). This is one of the few places where new[] has a
significant chance of failure.
Note that I think it is an abomination that mozilla reports these images as
broken. This patch will just create a lot of bugzilla noise. Mozilla really
really needs to report the right errors.
Comment 35•21 years ago
i've used bezilla, i've already tried to cross that bridge.
mozilla has a big problem.
Comment 36•19 years ago
*** Bug 317112 has been marked as a duplicate of this bug. ***
Comment 37•18 years ago
Updated•18 years ago
Assignee: jdunn → pavlov
QA Contact: tpreston
Updated•18 years ago
Assignee: pavlov → nobody
Updated•18 years ago
QA Contact: image.gfx
Comment 40•15 years ago
Found this bug when redirected from a duplicate bug.
Firefox 3.6.3 running on Slackware 13.0 (full patches) can be made to grind to a halt and lock the system when trying to load this image:
http://upload.wikimedia.org/wikipedia/commons/f/f0/Iridescent_Glory_of_Nearby_Helix_Nebula.jpg
No core dumps obtained.
It seems illogical that a Core 2 Duo 1.66GHz laptop with 2GB RAM running Xfce can be brought to a halt just by loading this image (8.6MB).
Will try to run a strace.
Comment 41•15 years ago
Strace impossible: even in safe mode the image loads, and the network slows, indicating it's fully downloaded (watching the Xfce netload plugin), but firefox seems to have a runaway memory allocation issue with the image. The image loads fine in GIMP (albeit with > 700MB RAM used) and Gwenview (uses < 100MB RAM).
Comment 42•12 years ago
We don't have any crash reports in nsImageGTK at all in the last 4 weeks, and this bug is in a graveyard component, so I think this is not valid any more.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID