Closed
Bug 721663
Opened 13 years ago
Closed 13 years ago
Crash in unpackImageRow @ CGAccessSessionGetBytes on Mac OS X 10.5 while printing or previewing
Categories
(Core :: General, defect)
Tracking
()
VERIFIED
FIXED
mozilla13
People
(Reporter: scoobidiver, Assigned: smichaud)
References
Details
(4 keywords, Whiteboard: [qa!])
Crash Data
Attachments
(1 file, 2 obsolete files)
(deleted), patch | smichaud: review+ | lsblakk: approval-mozilla-aurora+ | lsblakk: approval-mozilla-beta+
It's a new crash signature that first appeared in 12.0a1/20111223 and 11.0a2/20111230.
It's the #4 top crasher in 11.0a2 on Mac OS X.
Every comment mentions printing or previewing.
Signature CGAccessSessionGetBytes
UUID 63a79621-4dc0-48e1-9fb7-ed7762120125
Date Processed 2012-01-25 09:45:18
Uptime 13
Last Crash 38 seconds before submission
Install Age 13 seconds since version was first installed.
Install Time 2012-01-25 09:44:54
Product Firefox
Version 11.0a2
Build ID 20120124042008
Release Channel aurora
OS Mac OS X
OS Version 10.5.8 9L31a
Build Architecture x86
Build Architecture Info family 6 model 23 stepping 6
Crash Reason EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash Address 0x1c9a4000
App Notes
AdapterVendorID: 0x10de, AdapterDeviceID: 0x 863
EMCheckCompatibility True
Frame Module Signature Source
0 @0xffff08a0
1 CoreGraphics CGAccessSessionGetBytes
2 libPDFRIP.A.dylib unpackImageRow
3 libPDFRIP.A.dylib PDFImageEmitData
4 libPDFRIP.A.dylib imageRefEmitDefinition
5 libPDFRIP.A.dylib PDFImageEmitDefinition
6 libPDFRIP.A.dylib emitImageDefinition
7 CoreFoundation CFSetApplyFunction
More reports at:
https://crash-stats.mozilla.com/report/list?signature=CGAccessSessionGetBytes
Reporter | ||
Updated•13 years ago
|
Version: 12 Branch → 11 Branch
Reporter | ||
Comment 1•13 years ago
|
||
It's the #1 top crasher on Mac OS X in 11.0b1.
There are a few crashes on Mac OS X 10.6.
tracking-firefox11:
--- → ?
Keywords: topcrash
Comment 2•13 years ago
|
||
Adding the qawanted and regressionwindow-wanted keywords to do some exploratory testing around Printing/Previewing printing on OS X. See the comments at https://crash-stats.mozilla.com/report/list?query_search=signature&query_type=contains&reason_type=contains&range_value=3&range_unit=weeks&hang_type=any&process_type=any&signature=CGAccessSessionGetBytes for more leads.
Also tracking for FF11.
Keywords: qawanted,
regressionwindow-wanted
Comment 3•13 years ago
|
||
Here is a set of STR on 10.5 using the most recent 11 beta: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:11.0) Gecko/20100101 Firefox/11.0
STR:
1. Load http://www.psdbox.com/tutorials/new-manga-effect-2011-photoshop-tutorial/
2. Go to file menu - File->Print->Save as PDF.
3. Save the file.
4. Immediately after that click on the new tab button next to the tab open in Step 1.
5. Crash.
https://crash-stats.mozilla.com/report/index/bp-0ba65bfe-6c8e-4426-b6e1-40cdb2120211
Keywords: reproducible
Reporter | ||
Comment 4•13 years ago
|
||
Here are correlations:
CGAccessSessionGetBytes|EXC_BAD_ACCESS / KERN_INVALID_ADDRESS (57 crashes)
100% (57/57) vs. 17% (86/498) libPDFRIP.A.dylib
100% (57/57) vs. 19% (95/498) PrintingCocoaPDEs
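For anyone reading the correlation lines for the first time: the first ratio is the share of crashes with this signature that had the module loaded, and the second is the share in the comparison set (presumably all crashes on the same platform; the report does not spell that out, so treat that reading as an assumption). A minimal Python sketch that reproduces the percentages from the raw counts above:

```python
def percent(hits: int, total: int) -> int:
    """Return hits/total as a whole-number percentage, rounded to nearest."""
    return round(100 * hits / total)


def format_row(module, sig_hits, sig_total, all_hits, all_total):
    """Format one correlation line in the same layout as the report."""
    return (f"{percent(sig_hits, sig_total)}% ({sig_hits}/{sig_total}) vs. "
            f"{percent(all_hits, all_total)}% ({all_hits}/{all_total}) {module}")


# (module, hits with this signature, crashes with this signature,
#  hits in the comparison set, crashes in the comparison set)
rows = [
    ("libPDFRIP.A.dylib", 57, 57, 86, 498),
    ("PrintingCocoaPDEs", 57, 57, 95, 498),
]

for row in rows:
    print(format_row(*row))
```

Both modules show up in every crash with this signature but in fewer than a fifth of the comparison set, which is what ties the signature to the printing path.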
Assignee | ||
Comment 5•13 years ago
|
||
Testing with today's mozilla-central nightly and Marcia's STR from comment #3 (on OS X 10.5.8), I get a completely different error:
The nightly crashes, and then so does crashreporter. Then I get the following Apple crash report:
Process: crashreporter [1877]
Path: /Users/smichaud/Desktop/FirefoxNightly 2012-02-13.app/Contents/MacOS/crashreporter.app/Contents/MacOS/crashreporter
Identifier: crashreporter
Version: ??? (???)
Code Type: X86 (Native)
Parent Process: firefox [1863]
Interval Since Last Report: 84 sec
Crashes Since Last Report: 1
Per-App Interval Since Last Report: 0 sec
Per-App Crashes Since Last Report: 1
Date/Time: 2012-02-13 10:26:40.583 -0600
OS Version: Mac OS X 10.5.8 (9L30)
Report Version: 6
Anonymous UUID: C9512736-EC85-4CD1-B209-AB5833DEE8E2
Exception Type: EXC_BREAKPOINT (SIGTRAP)
Exception Codes: 0x0000000000000002, 0x0000000000000000
Crashed Thread: 0
Dyld Error Message:
Library not loaded: /usr/lib/libcrypto.0.9.8.dylib
Referenced from: /Users/smichaud/Desktop/FirefoxNightly 2012-02-13.app/Contents/MacOS/crashreporter.app/Contents/MacOS/crashreporter
Reason: image not found
Ted, any idea what's going on here?
This is very likely a different bug, unrelated to the one Marcia reported.
Assignee | ||
Comment 6•13 years ago
|
||
(Following up comment #5)
The OS X 10.5.8 version of libcrypto.dylib is 0.9.7. So it looks like Breakpad is broken on OS X 10.5.8 in current nightlies. I'll open a new bug.
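The mismatch can be read straight off the Apple report, since dyld names the library it failed to load. As a rough illustration only (the helper name `missing_library` is hypothetical, and the pattern assumes the `Library not loaded:` line format shown in comment #5), the path can be extracted and checked like this:

```python
import os
import re


def missing_library(report_text: str):
    """Return the path dyld failed to load, or None if no such line exists."""
    match = re.search(r"^[ \t]*Library not loaded:[ \t]*(.+?)[ \t]*$",
                      report_text, re.MULTILINE)
    return match.group(1) if match else None


# Excerpt of the report quoted in comment #5.
report = """Dyld Error Message:
Library not loaded: /usr/lib/libcrypto.0.9.8.dylib
Reason: image not found
"""

path = missing_library(report)
print(path)
# True only if that exact dylib version is installed on this machine.
print(os.path.exists(path))
```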
Comment 7•13 years ago
|
||
Marcia filed bug 721160 on that already.
Assignee | ||
Comment 8•13 years ago
|
||
I can reproduce this in the 2012-01-19 trunk nightly:
bp-253266fa-f4ba-4b16-a228-d10f92120213
I'll look for a regression range.
It's pointless testing with trunk nightlies dated from 2012-01-20 through 2012-02-13, because they all have bug 721160. But bug 721160 should be fixed in tomorrow's mozilla-central nightly.
Updated•13 years ago
|
Assignee: nobody → smichaud
Assignee | ||
Comment 9•13 years ago
|
||
Here's the regression range I found:
firefox-2011-11-21-09-23-46-mozilla-central
firefox-2011-11-22-03-09-49-mozilla-central
I can't tell which patch in this range might have triggered these crashes.
Interestingly, though, jemalloc was backed out for OS X 10.5 in this range (bug 702250). If this does turn out to be what triggered these crashes, jemalloc most likely masked the real bug.
Assignee | ||
Updated•13 years ago
|
Keywords: regressionwindow-wanted
Assignee | ||
Comment 10•13 years ago
|
||
Marcia, please see if you can reproduce my regression range.
Comment 11•13 years ago
|
||
Steven: Happy to oblige, but I am having trouble figuring out what builds the ones in Comment 9 correlate to - I cannot find builds with that exact ID in the directory.
(In reply to Steven Michaud from comment #10)
> Marcia, please see if you can reproduce my regression range.
Assignee | ||
Comment 12•13 years ago
|
||
> Interestingly, though, jemalloc was backed out for OS X 10.5 in this
> range (bug 702250). If this does turn out to be what triggered
> these crashes, jemalloc most likely masked the real bug.
Yes, turning jemalloc back on (on OS X 10.5) does seem to "fix" these
crashes.
Which isn't good news, because doing that isn't feasible, and because
jemalloc's masking of this bug's real cause makes it much harder to
find.
By the way, all the gdb stack traces I've been able to get of these
crashes are corrupt, and basically useless. (This is of course with a
non-symbol-stripped build.)
This is likely to turn out to be a memory corruption bug.
Assignee | ||
Comment 13•13 years ago
|
||
(In reply to comment #11)
Here are the two builds I mentioned in comment #9. The first one doesn't crash (for me). The second one does.
ftp://ftp.mozilla.org/pub/firefox/nightly/2011/11/2011-11-21-09-23-46-mozilla-central/firefox-11.0a1.en-US.mac.dmg
ftp://ftp.mozilla.org/pub/firefox/nightly/2011/11/2011-11-22-03-09-49-mozilla-central/firefox-11.0a1.en-US.mac.dmg
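Judging only from these two URLs, the nightly directory is named after the build timestamp and sits under a year/month folder. A small sketch of that mapping (the function name is hypothetical, and the layout is inferred from the two links above rather than from any documented scheme):

```python
from datetime import datetime

BASE = "ftp://ftp.mozilla.org/pub/firefox/nightly"


def nightly_url(timestamp: str, version: str = "11.0a1",
                branch: str = "mozilla-central") -> str:
    """Build a nightly .dmg URL from a 'YYYY-MM-DD-HH-MM-SS' build timestamp."""
    # Parsing validates the timestamp and gives us the year/month folders.
    built = datetime.strptime(timestamp, "%Y-%m-%d-%H-%M-%S")
    return (f"{BASE}/{built:%Y}/{built:%m}/{timestamp}-{branch}/"
            f"firefox-{version}.en-US.mac.dmg")


for ts in ("2011-11-21-09-23-46", "2011-11-22-03-09-49"):
    print(nightly_url(ts))
```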
Assignee | ||
Comment 14•13 years ago
|
||
> Which isn't good news, because doing that isn't feasible, and because
> jemalloc's masking of this bug's real cause makes it much harder to
> find.
There recently was another bug that jemalloc masked -- bug 700835. And (something I'd forgotten) you can set the NO_MAC_JEMALLOC to turn off jemalloc on the Mac. So it won't be so hard, after all, to find this bug's true regression range.
I'll be working on that tomorrow.
Assignee | ||
Comment 15•13 years ago
|
||
The NO_MAC_JEMALLOC environment variable.
Assignee | ||
Comment 16•13 years ago
|
||
For my own future reference:
In order to set the NO_MAC_JEMALLOC environment variable when you double-click an app, add the following to its Info.plist:
<key>LSEnvironment</key>
<dict>
<key>NO_MAC_JEMALLOC</key>
<string>1</string>
</dict>
Assignee | ||
Comment 17•13 years ago
|
||
(Following up comment #16)
Though Apple documents this capability, it's widely (and correctly) reported not to work at all. Fortunately there's another way:
Create a ~/.MacOSX/ directory, and (if you don't have one already) add an environment.plist file to it. Or edit it if you do have one. Then log out and back in again.
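A sketch of what that file could contain, generated with Python's `plistlib` so the XML comes out well-formed. The only entry shown is `NO_MAC_JEMALLOC` with the value `1`, as in comment #16. The output goes to a scratch directory here (an assumption made for safety) rather than to `~/.MacOSX/environment.plist` itself:

```python
import plistlib
import tempfile
from pathlib import Path


def write_environment_plist(path: Path, variables: dict) -> None:
    """Write variables as a flat key/string dictionary in XML plist format."""
    with open(path, "wb") as handle:
        plistlib.dump(variables, handle, fmt=plistlib.FMT_XML)


# Scratch location; copy the result into ~/.MacOSX/ by hand once reviewed.
scratch = Path(tempfile.mkdtemp()) / "environment.plist"
write_environment_plist(scratch, {"NO_MAC_JEMALLOC": "1"})
print(scratch.read_text())
```

If an environment.plist already exists, its entries would need to be merged in rather than overwritten. The variable only takes effect after logging out and back in.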
Assignee | ||
Comment 18•13 years ago
|
||
(Following up comment #17)
Unfortunately, though, mozilla-central nightlies crash on startup on OS X 10.5.8 with NO_MAC_JEMALLOC set from when jemalloc was enabled on 10.5 (http://hg.mozilla.org/mozilla-central/rev/c4e4af6b7ae4, bug 694335) until it was disabled again (bug 702250).
So (apparently) NO_MAC_JEMALLOC will be of no help finding this bug's true regression range.
Comment 19•13 years ago
|
||
I see the same thing on my lab machine - the second build crashes with my STR but the first one does not. Thanks for hunting down the regression range.
(In reply to Steven Michaud from comment #13)
> (In reply to comment #11)
>
> Here are the two builds I mentioned in comment #9. The first one doesn't
> crash (for me). The second one does.
>
> ftp://ftp.mozilla.org/pub/firefox/nightly/2011/11/2011-11-21-09-23-46-
> mozilla-central/firefox-11.0a1.en-US.mac.dmg
> ftp://ftp.mozilla.org/pub/firefox/nightly/2011/11/2011-11-22-03-09-49-
> mozilla-central/firefox-11.0a1.en-US.mac.dmg
Assignee | ||
Comment 20•13 years ago
|
||
I'm still trying to get the "true" regression range (unmasked by jemalloc). Since I can't use nightlies, it's quite slow. But I've now managed to narrow the range to somewhere between these two landings (inclusive):
http://hg.mozilla.org/mozilla-central/rev/9ae1d4f44b8b
dougt@mozilla.com
Mon Nov 14 20:38:46 2011 -0800
Doug Turner — Bug 690201 - dead code - mLastDrawEvent never used. r=mbrubeck
http://hg.mozilla.org/mozilla-central/rev/24c8d04f6174
bmo@edmorley.co.uk
Mon Nov 14 04:35:37 2011 -0800
Ed Morley — Merge mozilla-central and mozilla-inbound
I'll keep working at it until I've identified the single patch that triggered this bug's crashes.
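The narrowing described here is an ordinary bisection over an ordered list of changesets. A minimal sketch, assuming a hypothetical `crashes(rev)` check (in practice: build that revision with jemalloc disabled and run the STR from comment #3) and assuming the first revision in the list is good and the last is bad:

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], crashes: Callable[[str], bool]) -> str:
    """Return the earliest revision for which crashes(rev) is True.

    Assumes revisions are in landing order, revisions[0] is good,
    revisions[-1] is bad, and every revision after the first bad one is bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if crashes(revisions[mid]):
            bad = mid   # the regression landed at or before mid
        else:
            good = mid  # the regression landed after mid
    return revisions[bad]


# Toy stand-in for real builds: placeholder names, not actual changesets.
history = [f"rev{i}" for i in range(10)]
culprit_index = 6
print(first_bad(history, lambda rev: history.index(rev) >= culprit_index))
```

Each call to `crashes` costs one full build plus a manual test, which is why the search is slow without nightlies; bisection keeps the number of builds to roughly log2 of the range size.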
Assignee | ||
Comment 21•13 years ago
|
||
Here's the patch that triggered these crashes:
Bug 685767 - Factor blurring out into its own class, and use it from gfxAlphaBoxBlur. r=mattwoodrow
author Joe Drew <joe@drew.ca>
Mon Nov 14 17:29:28 2011 +1300 (at Mon Nov 14 17:29:28 2011 +1300)
http://hg.mozilla.org/mozilla-central/rev/6ae6d3beeaf4
Tomorrow I'll try to figure out why.
Assignee | ||
Comment 22•13 years ago
|
||
I had to disable jemalloc while testing. Here's how I did it:
diff --git a/configure.in b/configure.in
--- a/configure.in
+++ b/configure.in
@@ -2343,21 +2343,21 @@ case "$target" in
MKSHLIB_UNFORCE_ALL=''
;;
esac
;;
*-darwin*)
MKSHLIB='$(CXX) $(CXXFLAGS) $(DSO_PIC_CFLAGS) $(DSO_LDOPTS) -o $@'
MKCSHLIB='$(CC) $(CFLAGS) $(DSO_PIC_CFLAGS) $(DSO_LDOPTS) -o $@'
MOZ_OPTIMIZE_FLAGS="-O3"
_PEDANTIC=
- MOZ_MEMORY=1
+ #MOZ_MEMORY=1
CFLAGS="$CFLAGS -fno-common"
CXXFLAGS="$CXXFLAGS -fno-common"
DLL_SUFFIX=".dylib"
DSO_LDOPTS=''
STRIP="$STRIP -x -S"
# Check whether we're targeting OS X or iOS
AC_CACHE_CHECK(for iOS target,
ac_cv_ios_target,
[AC_TRY_COMPILE([#include <TargetConditionals.h>
#if !(TARGET_OS_IPHONE || TARGET_IPHONE_SIMULATOR)
Comment 23•13 years ago
|
||
oh, wow!
Steven, please don't hesitate to call on me for help here. Ping me on irc, whatever. :)
Comment 24•13 years ago
|
||
If this is memory corruption, presumably you want to run this through valgrind, right?
Assignee | ||
Comment 25•13 years ago
|
||
Marcia, could you test your STR from comment #3 on OS X 10.6 and 10.7 in 32-bit mode? I'm not able to reproduce the crashes myself under those conditions. But I don't have printer drivers installed for either of those OS versions, and that may be what makes the difference.
jemalloc is disabled in 32-bit mode, on all versions of OS X.
Assignee | ||
Comment 26•13 years ago
|
||
These crashes happen due to memory corruption that happens at either of the following two lines:
http://hg.mozilla.org/mozilla-central/annotate/6989376471f7/gfx/thebes/gfxBlur.cpp#l119
http://hg.mozilla.org/mozilla-central/annotate/6989376471f7/gfx/thebes/gfxBlur.cpp#l122
The crashes don't happen immediately -- only after a second or two. But they don't happen if I comment out either of those lines.
More specifically, the memory corruption happens here:
http://hg.mozilla.org/mozilla-central/annotate/6989376471f7/gfx/cairo/cairo/src/cairo-quartz-surface.c#l3039
But it doesn't happen if, instead, we use the call to _cairo_quartz_surface_mask_with_generic() on the following line:
http://hg.mozilla.org/mozilla-central/annotate/6989376471f7/gfx/cairo/cairo/src/cairo-quartz-surface.c#l3041
Calling _cairo_quartz_surface_mask_with_generic() is (as best I can tell) a little less efficient, but functionally equivalent.
As far as I know (from the tests I mentioned in comment #25) these crashes only happen on OS X 10.5. But I need help from others to confirm this.
I've ruled out that the crashes have anything to do with accessing deleted objects. I did this by adding printf statements to the constructors and destructors of all the objects in play (including cairo objects).
I still don't know exactly what sort of memory corruption happens, or why. But (as I said above) I'm reasonably confident it's confined to OS X 10.5. So it may be an OS bug. Or it may conceivably be a bug in how quartz cairo operates on OS X 10.5. I don't believe we used the "mask" capability before Joe's patch for bug 685767 landed. This would explain why we didn't have problems earlier.
Assignee | ||
Comment 27•13 years ago
|
||
As I explained above, this patch doesn't really get to the heart of the problem.
But if I'm right that the memory corruption only happens on OS X 10.5, which we aren't going to support for much longer, I'm pretty sure the patch is "good enough".
I've started tryserver builds, which should be available in a few hours.
Attachment #597052 -
Attachment is obsolete: true
Attachment #597900 -
Flags: review?(joe)
Assignee | ||
Comment 28•13 years ago
|
||
Here's a tryserver build made with my patch from comment #27:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-13ccb76757b7/try-macosx64/firefox-13.0a1.en-US.mac.dmg
There were no non-spurious test failures.
Assignee | ||
Comment 29•13 years ago
|
||
By the way, I've been trying to use libgmalloc and valgrind on my OS X 10.5.8 machine, so far with no luck. I suspect the machine has too little RAM -- only 4GB.
Comment 30•13 years ago
|
||
Comment on attachment 597900 [details] [diff] [review]
Provisional fix
Review of attachment 597900 [details] [diff] [review]:
-----------------------------------------------------------------
Please also add this as a Cairo patch to gfx/cairo.
Attachment #597900 -
Flags: review?(joe) → review+
Assignee | ||
Comment 31•13 years ago
|
||
Comment on attachment 597900 [details] [diff] [review]
Provisional fix
Landed on mozilla-inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/9466529cdbc0
> Please also add this as a Cairo patch to gfx/cairo.
Done.
Assignee | ||
Comment 32•13 years ago
|
||
What I landed (carrying forward Joe's r+).
I'll wait a few days for others to test, and then (presuming there aren't any problems) seek aurora and beta branch approval.
Marcia, you're probably in the best position to test :-)
Attachment #597900 -
Attachment is obsolete: true
Attachment #598258 -
Flags: review+
Assignee | ||
Updated•13 years ago
|
status-firefox12:
--- → affected
status-firefox13:
--- → affected
Comment 33•13 years ago
|
||
Testing using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0 (b3), I am not able to reproduce the bug. I will try 10.6 next. Confirming that I was testing in 32 bit mode.
(In reply to Steven Michaud from comment #25)
> Marcia, could you test your STR from comment #3 on OS X 10.6 and 10.7 in
> 32-bit mode? I'm not able to reproduce the crashes myself under those
> conditions. But I don't have printer drivers installed for either of those
> OS versions, and that may be what makes the difference.
>
> jemalloc is disabled in 32-bit mode, on all versions of OS X.
Comment 34•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla13
Reporter | ||
Updated•13 years ago
|
status-firefox13:
affected → ---
Comment 35•13 years ago
|
||
Is this fix considered low-risk enough to nominate for Aurora/Beta approval? Or does it make more sense to back out bug 685767? Thanks!
Assignee | ||
Comment 36•13 years ago
|
||
> Is this fix considered low-risk enough to nominate for Aurora/Beta approval?
I think it is. It only changes behavior on OS X 10.5. And even there it only causes a slight increase in RAM usage and (probably) no change in performance.
It'd be good to hear from others who know more about Cairo than I do, though.
Assignee | ||
Comment 37•13 years ago
|
||
Marcia, could you check with today's mozilla-central nightly to confirm that your STR no longer works with it?
Updated•13 years ago
|
tracking-firefox12:
--- → -
Comment 38•13 years ago
|
||
(In reply to Steven Michaud from comment #36)
> It'd be good to hear from others who know more about Cairo than I do, though.
I've sent email to Joe and Jeff to get their feedback.
Comment 39•13 years ago
|
||
This is relatively safe. I'd be OK with shipping it in a beta and aurora. We already have to use the fallback path in many cases, so using it for more cases should continue to be safe.
Assignee | ||
Updated•13 years ago
|
Attachment #598258 -
Flags: approval-mozilla-beta?
Attachment #598258 -
Flags: approval-mozilla-aurora?
Comment 40•13 years ago
|
||
Comment on attachment 598258 [details] [diff] [review]
Patch with copy in gfx/cairo
[Triage Comment]
Comfortable taking this low-risk fix, please land today 2/27/12 in preparation for go-to-build on 2/28/12
Attachment #598258 -
Flags: approval-mozilla-beta?
Attachment #598258 -
Flags: approval-mozilla-beta+
Attachment #598258 -
Flags: approval-mozilla-aurora?
Attachment #598258 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 41•13 years ago
|
||
Comment on attachment 598258 [details] [diff] [review]
Patch with copy in gfx/cairo
Landed on mozilla-aurora:
http://hg.mozilla.org/releases/mozilla-aurora/rev/154b852a9952
Assignee | ||
Comment 42•13 years ago
|
||
Comment on attachment 598258 [details] [diff] [review]
Patch with copy in gfx/cairo
Landed on mozilla-beta:
http://hg.mozilla.org/releases/mozilla-beta/rev/1dc6cd15f683
Comment 44•13 years ago
|
||
Verified fixed using the STR in Comment 3. I tested with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:11.0) Gecko/20100101 Firefox/11.0 which is the Beta 5 build. No crash observed with saving the PDF and opening a new tab.
Comment 45•13 years ago
|
||
I have verified this on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:12.0) Gecko/20100101 Firefox/12.0 beta 2
Firefox didn't crash using the steps from comment 3.
Setting resolution to Verified Fixed.