Closed Bug 380494 Opened 18 years ago Closed 18 years ago

crash [@ moz_cairo_win32_surface_get_image][@ cairo_win32_surface_create_for_dc]

Categories

(Core :: Graphics, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: Peter6, Assigned: vlad)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(2 files, 3 obsolete files)

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a5pre) Gecko/20070511 Minefield/3.0a5pre ID:2007051106 [cairo] since Bug 371135 landed (uncertain) some people get unreproducable crash. While browsing FF simply crashes, no obvious reason. Incident ID: 32028548 Stack Signature _cairo_win32_surface_create_for_dc 658fc8f7 Product ID FirefoxTrunk Build ID 2007051004 Trigger Time 2007-05-11 00:24:50.0 Platform Win32 Operating System Windows NT 5.1 build 2600 Module firefox.exe + (004ed4cb) URL visited User Comments sudden death after startup, no obvious reason Since Last Crash 43231 sec Total Uptime 43231 sec Trigger Reason Access violation Source File, Line No. e:\builds\tinderbox\fx-trunk\winnt_5.2_depend\mozilla\gfx\cairo\cairo\src\cairo-win32-surface.c, line 373 Stack Trace _cairo_win32_surface_create_for_dc [mozilla/gfx/cairo/cairo/src/cairo-win32-surface.c, line 373] gfxWindowsNativeDrawing::PaintToContext [mozilla/gfx/thebes/src/gfxwindowsnativedrawing.cpp, line 272] nsNativeThemeWin::DrawWidgetBackground [mozilla/widget/src/windows/nsnativethemewin.cpp, line 852] 0x002b7058
Keywords: crash
FWIW, your crash involved _cairo_win32_surface_create_for_dc, but someone else had a crash with _moz_cairo_win32_surface_get_image with the same two functions in the stack. Stack Signature _moz_cairo_win32_surface_get_image de8e4739 Product ID FirefoxTrunk Build ID 2007051106 Trigger Time 2007-05-11 19:31:02.0 Platform Win32 Operating System Windows NT 5.1 build 2600 Module firefox.exe + (004ed8fa) URL visited User Comments Since Last Crash 34552 sec Total Uptime 34552 sec Trigger Reason Access violation Source File, Line No. e:\builds\tinderbox\fx-trunk\winnt_5.2_depend\mozilla\gfx\cairo\cairo\src\cairo-win32-surface.c, line 1856 Stack Trace _moz_cairo_win32_surface_get_image [mozilla/gfx/cairo/cairo/src/cairo-win32-surface.c, line 1856] gfxWindowsNativeDrawing::PaintToContext [mozilla/gfx/thebes/src/gfxwindowsnativedrawing.cpp, line 253] nsNativeThemeWin::DrawWidgetBackground [mozilla/widget/src/windows/nsnativethemewin.cpp, line 968]
Severity: major → critical
Summary: crash [@ moz_cairo_win32_surface_get_image] → crash [@ moz_cairo_win32_surface_get_image][@ cairo_win32_surface_create_for_dc]
Seems to happen for me whenever I open tools -> add-ons. Talkback TB32069036Q
CCing vlad since this appears to be the same crash we were talking about with aja a few nights ago in IRC.
OS: Windows XP → All
Hardware: PC → All
(In reply to comment #4) > CCing vlad since this appears to be the same crash we were talking about with > aja a few nights ago in IRC. > FWIW, crashes didn't occur for me in -safe-mode, or when I disabled two extensions (cmSiteNavigator Toolbar 0.8, and Link Widgets 1.6). I also tried launching the buttons.png from Link Widgets into the content window and mousing over the buttons in it repeatedly, and no crash.
I verified bug 371135 as the culprit via my own builds by pulling source by time.
Blocks: 371135
I created a debug build and put some debug prints into CheckSurfaceSize and what i found was that it was the first check for either width or height being <=0 that was triggering this issue. I cot the following debug print an assertion: Size <=0 - w=69, h=0,l=0 ###!!! ASSERTION: gfxASurface::AddRef without mSurface: 'mSurface != nsnull', file c:/mozilla/trunk/gfx/thebes/src/gfxASurface.cpp, line 66 Further investigation found that the call causing this was this one: http://lxr.mozilla.org/seamonkey/source/gfx/thebes/src/gfxWindowsNativeDrawing.cpp#173 I have no idea why we should be here with a height of zero. I tried adding a check here for the surface creation failing by making it return nsnull, but that does not prevent the browser from crashing. There would appear to be 2 issues here. 1. Why should we be at this point in the code at all with a zero height. and 2. If this surface creation does return an error what needs to be done to prevent a subsequent browser crash. During my investigation I also found that the code here: http://lxr.mozilla.org/seamonkey/source/gfx/thebes/src/gfxASurface.cpp#170 is incorrect in that is should be only doing the ADDREF if result is not null.
Attached patch workaround (obsolete) (deleted) — Splinter Review
I don't think this is a proper fix. This merely avoids the crash by permitting zero as a valid value of height and/or width of a surface.
Attached patch potential fix (obsolete) (deleted) — Splinter Review
Can someone who can reproduce this try this fix? I can't manage to trigger the crash.
Assignee: nobody → vladimir
Status: NEW → ASSIGNED
(In reply to comment #9) > Created an attachment (id=264771) [details] > potential fix > > Can someone who can reproduce this try this fix? I can't manage to trigger the > crash. > This resolves the crash issue for me.
Comment on attachment 264771 [details] [diff] [review] potential fix roc, does this seem like a good approach to fixing this (we'll need similar wallpaper on linux)? The problem is that now we're explicitly failing on attempts to get surfaces that are 0 width or 0 height; before, there was code in cairo that made those 1 pixel in size that we put in to wallpaper. I think we should fix the callers; otherwise, we can just take the earlier patch in this bug and make 0 width/height valid for surfaces... (though that causes problems with mallocs and whatever, so.. we'll see)
Attachment #264771 - Flags: review?(roc)
Comment on attachment 264771 [details] [diff] [review] potential fix I think this is fine, modulo the answers to two questions: 1) Why 2 and not 1 for the size? 2) What is the rationale to not supporting zero-sized surfaces in cairo? That seems like an unnecessary complication to the API.
Attachment #264771 - Flags: superreview+
Attachment #264771 - Flags: review?(roc)
Attachment #264771 - Flags: review+
Attached patch allow for zero-sized surfaces (deleted) — Splinter Review
Ok, based on: 16:10 < cworth> vlad_: I don't know that we've ever tested it, but I'd consider cairo broken to not handle 0-sized surfaces, (just like I curse broken xlib for not handling these). I believe win32, xlib, and quartz all have explicit support for this. So, we allow zero-sized surfaces.
Attachment #264698 - Attachment is obsolete: true
Attachment #264771 - Attachment is obsolete: true
Attachment #264809 - Flags: review?(roc)
(In reply to comment #13) > Created an attachment (id=264809) [details] > allow for zero-sized surfaces That works for me as well. I like your cleverer way to avoid dividing by zero.
Comment on attachment 264809 [details] [diff] [review] allow for zero-sized surfaces If any of these tests are true then it's very likely due to a programming error, so I would like to have assertions there.
I've been able to reasonably reproduce this at litmus.mozilla.org. Not simple, but fairly reliable. I hope it helps verify a fix. -open a test run for trunk smoketests -unhide all the test cases -click a radio button for a test case -hide that test case by clicking its title -repeat with next test case until crash. usually between 2 and 7 testcases. other talkback suggest crashing with this stack at Tools > Addons. But I haven't been able to repro that.
Keywords: topcrash
(In reply to comment #15) > (From update of attachment 264809 [details] [diff] [review]) > If any of these tests are true then it's very likely due to a > programming error, so I would like to have assertions there. Yeah, good point. I just added in NS_ERROR's to those cases.
With this patch installed I ran into an issue where the browser would sometimes hang when switching back and forth between extensions and themes in the addons window. I added the following which seems to have helped. Index: gfx/thebes/src/gfxImageSurface.cpp =================================================================== RCS file: /cvsroot/mozilla/gfx/thebes/src/gfxImageSurface.cpp,v retrieving revision 1.12 diff -u -8 -r1.12 gfxImageSurface.cpp --- gfx/thebes/src/gfxImageSurface.cpp 10 May 2007 19:58:09 -0000 1.12 +++ gfx/thebes/src/gfxImageSurface.cpp 15 May 2007 20:08:16 -0000 @@ -44,17 +44,20 @@ gfxImageSurface::gfxImageSurface(const gfxIntSize& size, gfxImageFormat format) : mSize(size), mFormat(format) { mStride = ComputeStride(); if (!CheckSurfaceSize(size)) return; - mData = (unsigned char *) malloc(mSize.height * mStride); + if (mSize.height) + mData = (unsigned char *) malloc(mSize.height * mStride); + else + mData = (unsigned char *) malloc(1); if (!mData) return; mOwnsData = PR_TRUE; cairo_surface_t *surface = cairo_image_surface_create_for_data((unsigned char*)mData, (cairo_format_t)format,
Ugh, checked in before I saw that comment. That makes sense, I'll get that in as a followup. I still don't like that something is allocating a zero sized surface, but I guess it's ok.
Attached patch followup (obsolete) (deleted) — Splinter Review
Followup patch to avoid malloc(0)
Attachment #264913 - Flags: review?(roc)
malloc(0) returns NULL, so why is this needed? cairo_image_surface_create_for_data shouldn't be looking at the data if it's a size-zero surface. Is there an underlying bug here?
It doesn't, as far as I can tell -- is malloc(0) returning NULL portable? However, the problem is that there's a check there if the allocated memory is NULL... so maybe that patch should be more like if (height * stride > 0) { data = malloc(...); if (!data) return; } else { data = nsnull; } ?
(In reply to comment #22) > It doesn't, as far as I can tell -- is malloc(0) returning NULL portable? Yes. > However, the problem is that there's a check there if the allocated memory is > NULL... so maybe that patch should be more like > > if (height * stride > 0) { > data = malloc(...); > if (!data) > return; > } else { > data = nsnull; > } Sure.
What's been checked in already appears to have resolved the reproducible crash I'd been having with the 2 link navigation extensions mentioned in comment 5 above.
(In reply to comment #23) > (In reply to comment #22) > > It doesn't, as far as I can tell -- is malloc(0) returning NULL portable? > > Yes. > Googling for malloc 0 and msvc malloc 0 got me to discussions that seem to indicate that MSVC and some versions of Linux do not return NULL for a zero byte malloc. I am going to do do a debug build with with some extra debug printf's near the malloc and try to figure out what is really causing the hang (assuming I can reproduce it).
(In reply to comment #25) > Googling for malloc 0 and msvc malloc 0 got me to discussions that seem to > indicate that MSVC and some versions of Linux do not return NULL for a zero > byte malloc. Links? Anyway it doesn't matter in this case.
(In reply to comment #26) > (In reply to comment #25) > > Googling for malloc 0 and msvc malloc 0 got me to discussions that seem to > > indicate that MSVC and some versions of Linux do not return NULL for a zero > > byte malloc. > > Links? Here are a mess of links that all come down to the bottom line of, if you are trying to write portable code you should never call malloc with a zero argument. But, I see that calling malloc with a zero and getting null in return takes a different codepath in this case so I will try the new patch and report back.
You're right. But anyway, it still shouldn't matter.
(In reply to comment #23) > > However, the problem is that there's a check there if the allocated memory is > > NULL... so maybe that patch should be more like > > > > if (height * stride > 0) { > > data = malloc(...); > > if (!data) > > return; > > } else { > > data = nsnull; > > } Well, for some odd reason I could not reproduce the issue with a debug build and then rebuilt without debug and could not reproduce it there either. I might have to go back to trying this on the work computer where I had the issue originally, but I won't be back there until Thursday. In any event I did create a build with code like above and it at least does not seem to introduce any new issues and des seem to make sense.
Attached patch followup (deleted) — Splinter Review
Fixed followup. Whether this was causing the crashing or not, we should take this anyway -- otherwise we can't create 0-sized gfxImageSurfaces.
Attachment #264913 - Attachment is obsolete: true
Attachment #265053 - Flags: review?(roc)
Attachment #264913 - Flags: review?(roc)
Blocks: 380464
Followup checked in.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
verified with trunk build from 5/24 and this hasn't appeared in talkback reports since the follow check in
Status: RESOLVED → VERIFIED
Crash Signature: [@ moz_cairo_win32_surface_get_image] [@ cairo_win32_surface_create_for_dc]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: