Closed Bug 444864 Opened 16 years ago Closed 16 years ago

Crash [@ nsNativeThemeCocoa::DrawCellWithScaling] with <html:input>, large letter-spacing

Categories

(Core :: Widget: Cocoa, defect, P1)

x86
macOS
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: jruderman, Assigned: smichaud)

References

Details

(4 keywords, Whiteboard: [sg:critical?])

Crash Data

Attachments

(7 files, 4 obsolete files)

(deleted), text/html
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), patch
jaas
: review+
Details | Diff | Splinter Review
(deleted), patch
jaas
: review+
Details | Diff | Splinter Review
Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x0000000031ff4988 Crashed Thread: 0 Thread 0 Crashed: 0 argb32_sample_ARGB32 + 383 1 argb32_image_mark + 601 2 argb32_image + 390 3 ripd_Mark + 326 4 ripl_BltImage + 1299 5 ripc_RenderImage + 273 6 ripc_DrawImage + 4973 7 CGContextDrawImage + 397 8 nsNativeThemeCocoa::DrawCellWithScaling(NSCell*, CGContext*, CGRect const&, _NSControlSize, float, float, float, float, float const (*) [3][4], int) + 2082 (nsNativeThemeCocoa.mm:304) 9 nsNativeThemeCocoa::DrawPushButton(CGContext*, CGRect const&, int, int, int) + 1482 (nsNativeThemeCocoa.mm:463) 10 nsNativeThemeCocoa::DrawWidgetBackground(nsIRenderingContext*, nsIFrame*, unsigned char, nsRect const&, nsRect const&) + 2978 (nsNativeThemeCocoa.mm:1116) Related to bug 444260?
Attached file backtrace from trunk crash (deleted) —
My stack looked a bit different from Jesse's when I crashed with his testcase. nsNativeThemeCocoa::DrawCellWithScaling wasn't called for me.
Whiteboard: [sg:critical?]
Attached file Gdb trace of crash with debug symbols (deleted) —
Here's a stack trace I made in gdb using a 1.9.0-branch opt build with debug symbols. Notice that the deleted object appears to be a Gecko object (its address appears as a parameter in several Gecko methods in the trace). This is different from bug 444260, where the deleted object appears to be an internal Cocoa object (one that's not directly visible anywhere in browser code). Oddly, though I crash on load (on both the trunk and the 1.9.0 branches) when I open the testcase from bmo, it's much more difficult to crash when I download the testcase and open it using a file:// URL. However, even when I don't crash, the browser's performance slows to a miserable crawl (the slowness lasts until I leave the current page or close the current window). So I think there are really two bugs here: 1) A crash bug caused by referencing a deleted object, possibly in cross-platform code. 2) An OS-X-specific performance bug caused by calling CGContextDrawImage() (a system call) with an inappropriately-constructed CGImageRef. The most urgent bug is #1. But once it's fixed, I think we also need to address #2. When I get to that point, I'll open a new bug to address #2 (which I won't label security sensitive). > Related to bug 444260? I don't think so.
Assignee: joshmoz → smichaud
Status: NEW → ASSIGNED
Attachment #329472 - Attachment mime type: application/octet-stream → text/plain
(In reply to comment #2) > Here's a stack trace I made in gdb using a 1.9.0-branch opt build with > debug symbols. Oh, right. Debug symbols :-) Thanks.
Attached file Console log of Linux crash (deleted) —
I also crash on Linux when I load this bug's testcase into FF3 (loading it from bmo or from a file:// URL). But Breakpad doesn't come up, so all I have is what appears in the console.
I don't crash (in FF3) on Windows.
(Following up comment #2) > Here's a stack trace I made in gdb using a 1.9.0-branch opt build > with debug symbols. Notice that the deleted object appears to be a > Gecko object (its address appears as a parameter in several Gecko > methods in the trace). I'm wrong about the deleted object being a Gecko object -- when I break on argb32_sample_ARGB32 (just before the crash) and get a stack trace (doing bt), none of the functions' parameters match the address of the object that eventually causes the crash (as they do when I get a stack after the crash). I have to conclude that the stack I get after the crash is partially corrupt (at least with respect to the function parameters that it reports). After a bunch of digging around, I think it's most likely this bug's deleted object is also an internal object (possibly a local variable in the method where the crash takes place), like that of bug 444260. Though it's not a Cocoa or CoreFoundation object (since neither NSZombieEnabled=YES nor CFZombieLevel=4 makes the crashes stop). So it looks like I'm going to have to tackle this bug's crash indirectly, by resolving problem #2 (from comment #2).
Finally, I'm no longer at all sure this bug's invalid pointer (0x33c1b988 from comment #2's trace) corresponds to a deleted object. This address is (I think) more likely to lie just below the address space occupied by a valid object.
Attached patch Provisional fix (obsolete) (deleted) — Splinter Review
Here's a patch modeled on my patch for bug 444260 (attachment 328698 [details] [diff] [review]), but which I think obsoletes it. This patch fixes both this bug (bug 444864) and bug 444260.
Attachment #329570 - Flags: review?(joshmoz)
Flags: wanted1.9.1?
Flags: wanted1.9.0.x?
Priority: -- → P1
Josh, can you review this?
Flags: wanted1.9.0.x?
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.3?
My patch (attachment 329570 [details] [diff] [review]) also fixes bug 449111. So that's now three (potentially) security-sensitive crash bugs that it fixes -- bug 444260, bug 444864 and bug 449111.
Attachment #329570 - Flags: review?(joshmoz) → review?(mstange)
Attachment #329570 - Flags: review?(joshmoz)
After talking to Josh and Markus last week about my patch (attachment 329570 [review]) on IRC, I went over the code (and my patch) again to see if I could either come up with a better patch, or come up with better justifications for it. I've not been able to come up with a better patch (save for a minor change to how I get the focused view's bounds). But until and unless we (or someone else) _can_ come up with something better, I think my patch (as altered) is a reasonable way to fix these three bugs (bug 444260, bug 444864 and bug 449111). Here's why. The coordinate system for destRect (with which nsNativeThemeCocoa::DrawCellWithScaling() is called) is one for an nsIFrame object (which means that it's ultimately the coordinate system for the NSView into which DrawCellWithScaling() will draw). Because of this, I've replaced (in my new patch) focusViewFrame with focusViewBounds, which I initialize with a call to [focusView bounds] (which uses focusView's coordinate system). The problems of all three bugs arise when we call the scaling code (in DrawCellWithScaling) on an object whose destRect is unresonably large. These include crashes. But they also include the performance bug (#2) that I describe in comment #2 above. So getting rid of these crashes isn't really enough (by itself) to "fix" these problems. Also (aside from the performance problem) the buttons' display (from all three bugs) is much more "broken" when I just tackle the crash directly (e.g. using my first patch for bug 444260, attachment 328691 [details] [diff] [review]) than it is with my current patch. My current patch also makes the buttons' display conform much more closely to how they're drawn by FF3 on Windows and Linux (and by FF2 on all platforms). So I think it makes sense to avoid the scaling code "when we need to". My current patch decides this using a rough rule of thumb -- we should avoid the scaling code when the object we're drawing will overflow the NSView object into which it's being drawn. I considered (and tried out) simply truncating the size of the object to be drawn (making sure drawRect is never larger than the NSView into which we're drawing). But it sometimes makes sense to draw a larger NSCell object into a smaller NSView object -- for example (as in this bug's case) it sometimes makes sense to make an NSCell object larger than the current browser page (so that one has to scroll the page to see the whole object). Next I considered limiting the size of the NSCell object to some multiple of the size of the NSView object into which it's drawn. This could be a reasonable approach ... but how do you decide what the limits are? My rough rule of thumb, though rough, has the advantage of being very simple. Finally, Markus noticed (last week) that my patch doesn't get rid of all drawing problems -- for example, parts of the buttons tend to disappear when you scroll the browser window horizontally (in all these bugs' examples). But 1) I'm not sure there's anything we can do about this (I suspect it's due to an Apple bug in [NSCell drawWithFrame:inView:]) and 2) It's not unreasonable that that buttons with such unreasonable dimensions don't draw perfectly. It's only unreasonable that we crash while drawing them :-)
Attachment #329570 - Attachment is obsolete: true
Attachment #335379 - Flags: review?(mstange)
Attachment #335379 - Flags: review?(joshmoz)
Attachment #329570 - Flags: review?(mstange)
Attachment #329570 - Flags: review?(joshmoz)
Comment on attachment 335379 [details] [diff] [review] Provisional fix rev1 (use correct coordinate system) I'm fine with this approach.
Attachment #335379 - Flags: review?(mstange) → review+
Flags: blocking1.9.1?
Attached patch Fix rev2 (avoid scaling if drawRect too large) (obsolete) (deleted) — Splinter Review
Here's a patch that takes a new approach -- it disables scaling if the area of drawRect (in pixels^2) is too large. This "works", but I think it's less robust (and therefore not as good) as my previous patch. The main reason is that you need to make CELL_SCALING_MAX_AREA quite small in order to avoid both crashes and (very) poor drawing performance. (I'll talk more about this in a later comment.) In my next comment I'll post another patch that (more or less) follows the approach of my previous patch (with some improvements). Both patches now check for [NSView focusView] returning NULL -- which I discovered can happen, and can itself lead to crashes.
Attachment #335379 - Attachment is obsolete: true
Attachment #335379 - Flags: review?(joshmoz)
Here's a patch that improves on my previous patch, while still taking (more or less) the same approach. Instead of checking if drawRect is larger than the currently focused NSView object, it checks if either its width or its height is more than twice as large as the window containing the currently focused NSView object. This deals with the edge case of focusedView being substantially smaller than its window.
Josh asked me to find an absolute "pixel size" above which we can avoid scaling. As I say in comment #15, I did find a way to do this. But (for various reasons), that approach isn't as good as that taken by my rev3 patch from comment #16. The main difficulty is that (as I say in comment #2) this bug is really two bugs: At very large "pixel sizes", crashes will happen (using the testcases for all three bugs mentioned here -- bug 444260, bug 444864 and bug 449111). But if you alter the testcases for bug 444864 and bug 444260 to make their "pixel sizes" just small enough to avoid crashes, it will take horribly long to scroll or resize those (altered) examples' windows. For some reason the performance problem is particularly bad if CGContextDrawImage() rescales its image (in either dimension) by a factor close to but less than 1 (e.g. by 0.9): You need to make CELL_SCALING_MAX_AREA quite small to avoid this worst-case scenario. In tests on nsNativeThemeCocoa::DrawCellWithScaling(), with either xscale or yscale set to 0.9, I found that I needed to set CELL_SCALING_MAX_AREA to approximately 500000 (I tested on a very fast machine (an 8-core MacPro) running OS X 10.5.4 and a moderately fast machine (an early model MacBook Pro) running OS X 10.4.11). In tests using Markus's test app from comment #14, I found I could get reasonable performance (with 0.9 scaling) using an "area" as large as 750000. But my tests with nsNativeThemeCocoa::DrawCellWithScaling() are obviously more directly relevant.
Side note: Interestingly, I found that CELL_SCALING_MAX_AREA doesn't seem to depend on how much RAM you have installed. My just-small-enough-to-avoid-crashes value for CELL_SCALING_MAX_AREA was the same on my MacPro (with 4GB of RAM) as on my MacBook Pro (with 2GB of RAM).
Attachment #337535 - Flags: review?(joshmoz)
Attachment #337532 - Flags: review+
Attachment #337535 - Flags: review?(joshmoz) → review-
Attachment #337532 - Flags: superreview?(roc)
Comment on attachment 337535 [details] [diff] [review] Fix rev3 (avoid scaling if drawRect too much larger than its window) I'm still not convinced my rev2 patch is better than my rev3 patch. But my rev2 patch will do the job. And it has the advantage that it will also catch (i.e. prevent) attempts to rescale large cells (large in area) that might otherwise cause performance problems.
Attachment #337535 - Attachment is obsolete: true
Flags: blocking1.9.0.3? → blocking1.9.0.3+
Attachment #337532 - Flags: superreview?(roc) → superreview+
rev2 patch (attachment 337532 [details] [diff] [review]) landed on mozilla-central.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment on attachment 337532 [details] [diff] [review] Fix rev2 (avoid scaling if drawRect too large) Do people feel this needs test(s)? If so, what kind of test(s)?
Attachment #337532 - Flags: approval1.9.0.3?
I think crashtests would be good - just put the testcases into /widget/src/cocoa/crashtests and update crashtests.list.
Flags: blocking1.9.1? → blocking1.9.1+
Flags: wanted1.9.1?
> rev2 patch (attachment 337532 [details] [diff] [review]) landed on mozilla-central. Backed out because it likely caused some reftest failures (see bug 455346). I'm looking into this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Turns out the reftest failures were a real issue (and not just another example of test flakiness, or test-machine flakiness). They were caused by the code I added (to both rev2 and rev3) to take an early out when [NSView focusView] returns nil. Turns out this is unnecessary for my rev2 patch ([NSCell drawWithFrame:inView:] can deal with it if [NSView focusView] returns nil). And the fact that nsNativeThemeCocoa::DrawCellWithScaling() could return without drawing anything was (apparently) what caused the reftest failures. The reftest failures mentioned in bug 455346 no longer happen when I run reftests locally. With luck they won't happen on the unit test machines, either. When I reland this patch, I intend to do it in the morning. The unit tests can take many hours to run, and I don't want to have to watch this into the wee hours. (The crashes I saw, which the nil focusView early out was needed to fix, only happened with my rev3 patch.)
Attachment #337532 - Attachment is obsolete: true
Attachment #338736 - Flags: superreview?(roc)
Attachment #338736 - Flags: review?(joshmoz)
Attachment #337532 - Flags: approval1.9.0.3?
Attachment #338736 - Flags: review?(joshmoz) → review+
Attachment #338736 - Flags: superreview?(roc) → superreview+
rev4 patch landed on mozilla-central. Over the next few hours, I'll be watching for any reftest failures.
Status: REOPENED → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
I followed Markus' suggestion from comment #22 and added the testcases for this bug, bug 444260 and bug 449111 to widget/src/cocoa/crashtests.
Attachment #339118 - Flags: review?(joshmoz)
Comment on attachment 338736 [details] [diff] [review] Fix rev4 (fix rev2's reftest failures) Crashtests are available (attachment 339118 [details] [diff] [review]).
Attachment #338736 - Flags: approval1.9.0.3?
Comment on attachment 339118 [details] [diff] [review] Crashtests for bugs 444864, 444260 and 449111 thanks
Attachment #339118 - Flags: review?(joshmoz) → review+
Flags: wanted1.8.1.x-
Blocks: 449111
Attachment #339118 - Flags: superreview?(roc)
Attachment #339118 - Flags: superreview?(roc) → superreview+
Crashtests (attachment 338736 [details] [diff] [review]) landed on mozilla-central.
> Crashtests (attachment 338736 [details] [diff] [review]) landed on mozilla-central. Actually they're in attachment 339118 [details] [diff] [review].
Attachment #338736 - Flags: approval1.9.0.4? → approval1.9.0.4+
Comment on attachment 338736 [details] [diff] [review] Fix rev4 (fix rev2's reftest failures) Approved for 1.9.0.4, a=dveditz for release-drivers
Attachment #339118 - Flags: approval1.9.0.4?
Attachment #339118 - Flags: approval1.9.0.4?
Comment on attachment 339118 [details] [diff] [review] Crashtests for bugs 444864, 444260 and 449111 Tests don't need approval to land, even on the branch. However, these shouldn't land until this bug is opened up publicly, after 3.0.4 is released.
Flags: in-testsuite?
Landed my rev4 patch on the 1.9.0 branch: Checking in widget/src/cocoa/nsNativeThemeCocoa.mm; /cvsroot/mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm,v <-- nsNativeThemeCocoa.mm new revision: 1.94; previous revision: 1.93 done (Sorry for the delay.) I'll land the tests after 3.0.4 is released.
Keywords: fixed1.9.0.4
Verified with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.4pre) Gecko/2008102104 GranParadiso/3.0.4pre for 1.9.0.4.
Group: core-security
Steven Michaud already checked in crashtests (comment 26): http://hg.mozilla.org/mozilla-central/rev/427a44cf7ccd
Flags: in-testsuite? → in-testsuite+
verified fixed on the 1.9.1 branch using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20081229 Shiretoko/3.1b3pre and verified on trunk using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20081229 Minefield/3.2a1pre
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsNativeThemeCocoa::DrawCellWithScaling]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: