Closed
Bug 137478
Opened 23 years ago
Closed 21 years ago
win32 crash in jpeg\jdapimin.c: jpeg_consume_input() [jpg]
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jrgmorrison, Assigned: pavlov)
References
Details
(Keywords: crash)
Attachments
(2 files, 2 obsolete files)
(deleted),
patch
|
tor
:
review-
tor
:
superreview-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sicking
:
review+
bryner
:
superreview+
|
Details | Diff | Splinter Review |
I've run into a crash in '-O2' optimized builds on msvc/win32.
The strange thing is that it happens on one win2k machine, but not
another. Here's the state of affairs:
1) build on machine A with '-O2' -- crashes as soon as it sees a JPEG
2) build again on machine A wiht '-O1' -- no crash
3) run build from step (1) on machine B -- no crash
4) build on machine B with '-O2' -- no crash when run on machine B
5) run build from (4) on machine A -- crash on any JPEG, when run on A
I also have another win98 system building with '-O2' and it doesn't crash
either. [Obviously, a debug, unoptimized build doesn't crash either].
All machines are running MSVC 6.0 with Service Pack 5 and the MSVC++ Processor
Pack for that service pack.
One thing I should note is that it is the newest (and fastest) machine which
crashes, which I suppose points to some asm or processor problem. Let me know
what details you need on the exact processor being used on the crashing
system.
The crash is inn jpeg\jdapimin.c: jpeg_consume_input()
switch (cinfo->global_state) {
case DSTATE_START:
/* Start-of-datastream actions: reset appropriate modules */
(*cinfo->inputctl->reset_input_controller) (cinfo);
/* Initialize application's data source module */
(*cinfo->src->init_source) (cinfo); //XXXX<-- crash here
cinfo->global_state = DSTATE_INHEADER;
/*FALLTHROUGH*/
and |*cinfo->src| says that it has value "0x02" in the debugger (but some
inspection of the actually assembly and registers would confirm whether that
value was correct, or whether that was the kind of bogus value that the msvc
debugger gives sometimes with optimized builds).
Stack trace:
jpeg_consume_input(jpeg_decompress_struct * 0x01c0d240) line 314 + 4 bytes
jpeg_read_header(jpeg_decompress_struct * 0x01c0d240, unsigned char 1) line 268
nsJPEGDecoder::WriteFrom(nsJPEGDecoder * const 0x01c0d220, nsIInputStream *
0x029bd3a8, unsigned int 4008, unsigned int * 0x0012fc6c) line 229 + 11 bytes
imgRequest::OnDataAvailable(imgRequest * const 0x10068c62, nsIRequest *
0x029bd370, nsISupports * 0x1002a542, nsIInputStream * 0x029bd3a8, unsigned int
26428578, unsigned int 43775256) line 751
nsCOMPtr_base::assign_with_AddRef(nsCOMPtr_base * const 0x01c0d240, nsISupports
* 0x029c7f20) line 73
nsStreamListenerTee::OnDataAvailable(nsStreamListenerTee * const 0x029bd3a8,
nsIRequest * 0x029c7be0, nsISupports * 0x00000000, nsIInputStream * 0x00000000,
unsigned int 0, unsigned int 4008) line 56 + 34 bytes
nsHttpChannel::OnDataAvailable(nsHttpChannel * const 0x029c7be4, nsIRequest *
0x029bf3ac, nsISupports * 0x00000000, nsIInputStream * 0x029c47cc, unsigned int
0, unsigned int 4008) line 2876
nsOnDataAvailableEvent::HandleEvent(nsOnDataAvailableEvent * const 0x01c0d240)
line 192 + 23 bytes
PL_HandleEvent(PLEvent * 0x029caa3c) line 596 + 4 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x10042fe0) line 526 + 6 bytes
_md_EventReceiverProc(HWND__ * 0x000201a6, unsigned int 49315, unsigned int 0,
long 15811040) line 1077 + 13 bytes
USER32! 77e13eb0()
nsAppShellService::Run(nsAppShellService * const 0x00fe2dd0) line 309
main1(int 4202558, char * * 0x00000004, nsISupports * 0x00253c90) line 1415 +
10 bytes
main(int 0, char * * 0x00252cb8) line 1763 + 23 bytes
WinMain(HINSTANCE__ * 0x00400000, HINSTANCE__ * 0x00400000, char * 0x00133c42,
HINSTANCE__ * 0x00400000) line 1781 + 23 bytes
MOZILLA! WinMainCRTStartup + 308 bytes
Reporter | ||
Comment 1•23 years ago
|
||
blocks bug 136999 "Windows builds should build with -O2"
pavlov: I have a system all set up with a build that has this crash. Can we
take some time to have a look at this and figure out what is going wrong.
Blocks: 136999
Comment 2•23 years ago
|
||
I've seen this a while ago (same stack)
I've tried to #undef MMX support (no more inline assembly that can break opt
code)and it worked for me
Later on I pulled another tree and the problem was simply not there anymore
So now I run with -O2 and no crash, go figure ...
So did you try to #undef MMX support ?
(in jmorecfg.h:
/* Defines for MMX support. */
#if defined(XP_WIN32) && defined(_M_IX86)
#define HAVE_MMX_INTEL_MNEMONICS <-------- comment out that line
#endif
)
Reporter | ||
Comment 3•23 years ago
|
||
Bernard: thanks. Yes, removing that define and compiling with -O2 cures the
crash that I was experiencing.
Comment 4•23 years ago
|
||
I think I got it:
http://lxr.mozilla.org/mozilla1.0/source/modules/libimg/png/pngvcrd.c#33
see the comment: CPUID instruction will trash ebx ecx edx !
those registers might be used by optimized code (especially ecx) !
therefore the solution would be to sync jpeg mmxsupport() function with
png_mmx_support()
Comment 5•23 years ago
|
||
for test and review
Reporter | ||
Comment 6•23 years ago
|
||
Bernard. Good catch, although unfortunately, that doesn't seem to cure this
crash. It does seem like something that should be done though.
One thing I noted, and I'm really over my head at this point, but I was
stepping through this in the debugger, and it seemed like those push/pops
had been optimized away (???). It doesn't appear that ebx,ecx,edx had been
restored to previous state at the end of mmxsupported() in jdapimin.c. (I
don't know; maybe I botched the patch somehow).
cc: a couple of pixel heads :-)
Comment 7•23 years ago
|
||
assembler code:
GLOBAL(void)
jpeg_CreateDecompress (j_decompress_ptr cinfo, int version, size_t structsize)
{
60F11000 push ebp
60F11001 mov ebp,esp
60F11003 push ecx
int i;
#ifdef HAVE_MMX_INTEL_MNEMONICS
static int cpuidDetected = 0;
if(!cpuidDetected)
60F11004 mov eax,dword ptr ds:[60F27334h]
60F11009 test eax,eax
60F1100B push ebx
60F1100C push esi
60F1100D push edi
60F1100E jne jpeg_CreateDecompress+63h (60F11063h)
{
MMXAvailable = mmxsupport();
60F11010 mov dword ptr [ebp-4],0
60F11017 push ebx
60F11018 push ecx
60F11019 push edx
60F1101A pushfd
60F1101B pop eax
60F1101C mov ecx,eax
60F1101E xor eax,200000h
60F11023 push eax
60F11024 popfd
60F11025 pushfd
60F11026 pop eax
60F11027 xor eax,ecx
60F11029 je jpeg_CreateDecompress+4Bh (60F1104Bh)
60F1102B xor eax,eax
60F1102D cpuid
60F1102F cmp eax,1
60F11032 jl jpeg_CreateDecompress+4Bh (60F1104Bh)
60F11034 xor eax,eax
60F11036 inc eax
60F11037 cpuid
60F11039 and edx,800000h
60F1103F cmp edx,0
60F11042 je jpeg_CreateDecompress+4Bh (60F1104Bh)
60F11044 mov dword ptr [ebp-4],1
60F1104B mov eax,dword ptr [ebp-4]
60F1104E pop edx
60F1104F pop ecx
60F11050 pop ebx
60F11051 mov eax,dword ptr [ebp-4]
60F11054 mov dword ptr [_MMXAvailable (60F27348h)],eax
cpuidDetected = 1;
60F11059 mov dword ptr ds:[60F27334h],1
}
#endif
/* For debugging purposes, zero the whole master structure.
* But error manager pointer is already there, so save and restore it.
*/
/* Guard against version mismatches between library and caller. */
cinfo->mem = NULL; /* so jpeg_destroy knows mem mgr not called */
if (version != JPEG_LIB_VERSION)
60F11063 mov eax,dword ptr [version]
60F11066 cmp eax,3Eh
60F11069 mov esi,dword ptr [cinfo]
....
I use VC70. I don't have VC60.
as you can see, mmxsupport() is inlined but push and pops are there
(to step through the assembly code, I did right click | Go to disassembly after
I hit the breakpoint)
I verified that CPUID changes the registers and that they are now preserved
All that I think it's true:
1) From Comment #3 we know that this is somewhat mmx related.
2) From the stack trace the crash occurs before any data is actually decoded so
the only mmx specific code involved would be mmxsupport()
I suggest that you look at the assembly code to check if push and pops are there
then try to replace the line:
MMXAvailable = mmxsupport();
by:
MMXAvailable = 1; (or 0, it must be the result of the true mmxsupport() call)
maybe that way we would know if the problem is really with mmxsupport()/CPUID or not
Assignee | ||
Updated•23 years ago
|
Target Milestone: --- → mozilla1.1beta
Comment 8•21 years ago
|
||
The patch wfm. Can other windows people verify?
Attachment #79515 -
Attachment is obsolete: true
Comment 9•21 years ago
|
||
Tested swalker's updated patch. Seems to do the trick, I can render JPEG images
using an -O2 build without crashing.
Updated•21 years ago
|
Attachment #134032 -
Flags: superreview?(tor)
Attachment #134032 -
Flags: review?(tor)
Comment 10•21 years ago
|
||
Comment on attachment 134032 [details] [diff] [review]
Updated version of Bernard's patch; sync the additions added when libpng 1.2.5 landed
According to the MS documentation, asm blocks generate their own
clobber list and you don't need to manually preserve e[abcd]x or
e[sd]i. Just preserving the flags should be enough.
Attachment #134032 -
Flags: superreview?(tor)
Attachment #134032 -
Flags: superreview-
Attachment #134032 -
Flags: review?(tor)
Attachment #134032 -
Flags: review-
Comment 11•21 years ago
|
||
I assume __fastcall option /Gr isn't being used is it? It was my understanding
that if you make a call to a function and need register values preserved it's
the caller's responsibility to preserve them not the callee's, other than things
like esp and ebp. The only other thing I thought might be happening is the local
being stored in a register, in which case it might have gotten clobbered, but a
simple test I ran indicates that shouldn't happen. For simple assignment it was
referenced on the stack, for more complex operations a register was used and
then it was stored back to a location on the stack and the assembler referenced
that from then on.
I'm not sure if I have the Processor Pack, though. That could very well make a
difference. I'll check and report back.
Comment 12•21 years ago
|
||
"I assume __fastcall option /Gr isn't being used is it?"
yes, __fastcall isn't being used here
And I looked at MS doc again and I couldn't see any statement about __asm block
preserving any register
About the patch I have to say that it's just a matter of synchronizing jpeg
mmxsupport() with libpng png_mmx_support() and that there's no problem with
png_mmx_support() with -O2 ...
And I've looked at assembly code again and it's clear that no push/pop are
generated to preserve registers, you have to do this manually, hence the
push/pop for ebx,ecx,edx
I've noticed that with -O2 when push/pop for ebx/cx/dx are added the call to
mmxsupport() is not inlined. When push/pops are removed mmxsupport() is inlined
with -O2 and not inlined with -O1. Maybe the crash has nothing to do with
preserving the registers but is just about inlining or not inlining mmxsupport() ?
Comment 13•21 years ago
|
||
I wasn't implying that the __asm would cause the registers to be preserved, what
I was saying that in my test the register was stored back at the start of the
__asm block.
The inline issue may be the heart of it. With it inlined it's not actually a
function call, thus the optimizer may have tripped up didn't think about
preserving registers used. Forcing this to be non-inline would be an interesting
test.
Comment 14•21 years ago
|
||
+__declspec(noinline) int mmxsupport()
Does indeed not crash.
Updated•21 years ago
|
Summary: win32 '-O2' crash in jpeg\jdapimin.c: jpeg_consume_input() → win32 '-O2' crash in jpeg\jdapimin.c: jpeg_consume_input() [jpg]
Comment 15•21 years ago
|
||
*** Bug 235087 has been marked as a duplicate of this bug. ***
Comment 16•21 years ago
|
||
Removing -O2 from the summary because this bug exhibits itself using other
optimisation flags too. Also clearing target milestone of mozilla1.1beta and
removing defunct QA contact.
This bug exhibits itself regularly when compiling with MSVC++ .NET 2003 using
the generic "ac_add_options --enable-optimize" mozconfig option.
Summary: win32 '-O2' crash in jpeg\jdapimin.c: jpeg_consume_input() [jpg] → win32 crash in jpeg\jdapimin.c: jpeg_consume_input() [jpg]
Target Milestone: mozilla1.1beta → ---
Comment 17•21 years ago
|
||
Untested possible fix.
Attachment #142875 -
Flags: review?(bugmail)
Comment 18•21 years ago
|
||
Just a comment that both of these patches with SSE2 IDCT and GL optimization crash SeaMonkey. They work fine on FireFox though.
I don't think that the code in jdapimin.c is the cause as I replaced both
routines with "return 1" and the thing still crashed. It could be that the
SSE2 and/or MMX code has a problem with GL optimization.
I'm going to play around with this stuff a little more next week.
http://msdn.microsoft.com/library/en-us/vclang/html/_core_Using_and_Preserving_Registers_in_Inline_Assembly.asp
Indicates that the compiler should be preserving registers used in asm-blocks.
However they also say that we should make sure to not use __fastcall, which I
think that we do by default.
Comment 20•21 years ago
|
||
Attachment #142875 -
Attachment is obsolete: true
Comment on attachment 143013 [details] [diff] [review]
add static
r=me if you've tested that this fixes the problem.
However it'd be great if someone that has MSVC *without* the processorpack
could try to compile with this change and make sure it still compiles.
It's not a big deal if it doesn't compile since we already require the
processorpack. But if it's required then it might be good to put an
announcement in the NGs or some such.
Attachment #143013 -
Flags: review+
Attachment #143013 -
Flags: superreview?(bryner)
Attachment #142875 -
Flags: review?(bugmail)
Updated•21 years ago
|
Attachment #143013 -
Flags: superreview?(bryner) → superreview+
Comment 22•21 years ago
|
||
Checked in.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 23•20 years ago
|
||
This bug (or bug 125762, I'm not sure) may have caused major regression bug
247437 (see attached screenshot)
You need to log in
before you can comment on or make changes to this bug.
Description
•