Closed
Bug 672787
Opened 13 years ago
Closed 13 years ago
Aurora build crashes [@ _moz_pixman_image_composite32] at start-up (07/20)
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla8
People
(Reporter: xti, Assigned: jchen)
References
Details
Crash Data
Attachments
(2 files)
(deleted), text/plain
(deleted), patch (mfinkle: review+, jst: approval-mozilla-aurora+, christian: approval-mozilla-beta-)
Build id : Mozilla/5.0 (Android;Linux armv7l;rv:7.0a2)Gecko/20110720
Firefox/7.0a2 Fennec/7.0a2
Device: Motorola Droid 2
OS: Android 2.2
Steps to reproduce:
Case 1:
If an outdated Aurora build is installed, update it from about:firefox. After the new build is installed, tap on the Open button.
Case 2:
Go to http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-aurora-android/ and tap on fennec-7.0a2.multi.eabi-arm.apk. After the app is installed, tap on the Open button.
Expected result:
Aurora build opens normally.
Actual result:
The Aurora build crashes every time it is opened, and a Mozilla Crash dialog is displayed.
Note:
I cannot get the crash report from about:crashes because the Aurora build doesn't open at all.
Reporter | ||
Comment 1•13 years ago
I was able to get a crash report after I installed the build from 20110719 over it: https://crash-stats.mozilla.com/report/index/bp-502110f9-b010-4911-86f8-edc292110720
Reporter | ||
Comment 2•13 years ago
This issue doesn't occur on:
Build id : Mozilla/5.0 (Android;Linux armv7l;rv:7.0a2)Gecko/20110719
Firefox/7.0a2 Fennec/7.0a2
Build config: http://hg.mozilla.org/releases/mozilla-aurora/rev/4d2a4e9e9730
But occurs on:
Build id : Mozilla/5.0 (Android;Linux armv7l;rv:7.0a2)Gecko/20110720
Firefox/7.0a2 Fennec/7.0a2
A possible range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2011-07-19&enddate=2011-07-20+03%3A00
Comment 3•13 years ago
https://crash-stats.mozilla.com/report/index/bp-502110f9-b010-4911-86f8-edc292110720
0 libxul.so libxul.so@0x9df494
1 libxul.so _moz_pixman_image_composite32 gfx/cairo/libpixman/src/pixman.c:371
2 libxul.so _clip_and_composite_boxes gfx/cairo/cairo/src/cairo-image-surface.c:3002
3 libxul.so _cairo_image_surface_paint gfx/cairo/cairo/src/cairo-image-surface.c:3304
4 libxul.so _cairo_surface_paint gfx/cairo/cairo/src/cairo-surface.c:2100
5 libxul.so _cairo_gstate_paint gfx/cairo/cairo/src/cairo-gstate.c:1049
6 libxul.so _moz_cairo_paint gfx/cairo/cairo/src/cairo.c:2238
7 libxul.so _moz_cairo_paint_with_alpha gfx/cairo/cairo/src/cairo.c:2267
8 libxul.so gfxContext::Paint gfx/thebes/gfxContext.cpp:772
9 libxul.so gfxPlatform::OptimizeImage gfx/thebes/gfxPlatform.cpp:414
10 libxul.so imgFrame::Optimize nsAutoPtr.h:954
11 libxul.so mozilla::imagelib::RasterImage::DecodingComplete modules/libpr0n/src/RasterImage.cpp:1111
12 libxul.so mozilla::imagelib::Decoder::PostDecodeDone nsCOMPtr.h:800
13 libxul.so mozilla::imagelib::nsPNGDecoder::end_callback modules/libpr0n/decoders/nsPNGDecoder.cpp:863
14 libxul.so MOZ_PNG_push_have_end modules/libimg/png/pngpread.c:1908
15 libxul.so MOZ_PNG_push_read_chunk modules/libimg/png/pngpread.c:364
16 libxul.so MOZ_PNG_proc_some_data modules/libimg/png/pngpread.c:65
17 libxul.so MOZ_PNG_process_data modules/libimg/png/pngpread.c:39
18 libxul.so mozilla::imagelib::nsPNGDecoder::WriteInternal modules/libpr0n/decoders/nsPNGDecoder.cpp:354
19 libxul.so mozilla::imagelib::Decoder::Write modules/libpr0n/src/Decoder.cpp:104
20 libxul.so mozilla::imagelib::RasterImage::WriteToDecoder modules/libpr0n/src/RasterImage.cpp:2277
Comment 4•13 years ago
Afaik, you need to look at the pushlog for Aurora, which is here:
http://hg.mozilla.org/releases/mozilla-aurora/
The problem is that I don't see anything that could trigger this crash, afaik.
Crash Signature: [@ _moz_pixman_image_composite32]
Summary: Aurora build crashes at start-up (07/20) → Aurora build crashes [@ _moz_pixman_image_composite32] at start-up (07/20)
Comment 5•13 years ago
Today's Aurora nightly starts up fine (no crash) on my Xoom.
Comment 6•13 years ago
I guess this is basically related to/the same as bug 623161.
Comment 7•13 years ago
Btw, I can reproduce this crash on start-up, using the LG Optimus Black.
Comment 8•13 years ago
I don't crash on an N1.
Comment 9•13 years ago
I'm in the Mountain View office with the crashing Aurora browser on the phone. If someone wants to investigate, they can grab my phone (I'm in the QA area).
Comment 10•13 years ago
I mentioned this to Naoki in case it is useful - There is a corresponding signature on the Firefox side with fairly low volume crash rate: https://crash-stats.mozilla.com/report/list?signature=_moz_pixman_image_composite32
Comment 11•13 years ago
Ok, this looks more like a Cairo bug to me, then. Moving it to Core->Graphics.
Component: General → Graphics
Product: Fennec → Core
QA Contact: general → thebes
Version: Firefox 7 → Trunk
Comment 12•13 years ago
This crashes (info pulled from application.ini):
Version=7.0a2
BuildID=20110720042444
SourceRepository=http://hg.mozilla.org/releases/mozilla-aurora
SourceStamp=579cbf7a9add
This runs:
Version=7.0a2
BuildID=20110719042859
SourceRepository=http://hg.mozilla.org/releases/mozilla-aurora
SourceStamp=4d2a4e9e9730
So these are what landed in that span:
changeset: 72687:579cbf7a9add
user: Simon Montagu <smontagu@smontagu.org>
date: Mon Jul 11 06:40:51 2011 +0300
summary: Don't resolve bidi paragraph in preformatted text until we really get to the end of the line. Bug 670226, r=roc, a=asa
changeset: 72686:433cd269be19
user: Simon Montagu <smontagu@smontagu.org>
date: Mon Jul 11 06:40:51 2011 +0300
summary: Tests for bug 670226
changeset: 72685:ef4909389600
user: Simon Montagu <smontagu@smontagu.org>
date: Fri Jul 08 10:51:26 2011 +0300
summary: Make sure that bidi continuation chains don't go beyond the end of the paragraph. Bug 668941, r=roc, a=asa
changeset: 72684:9a3234ac5c1c
user: Myk Melez <myk@mozilla.org>
date: Tue Jul 19 20:55:10 2011 -0700
summary: update revision of Add-on SDK tests to latest tip; a=test-only
changeset: 72683:82f49f622e9d
user: Luke Wagner <luke@mozilla.com>
date: Mon Jul 18 17:37:19 2011 -0700
summary: Bug 672026 - Ensure that there is an object principals finder during early startup (r=mrbkap,a=asa)
Talos regression in bug 672026
Depends on: 672026
Comment 15•13 years ago
I helped dougt trigger the following jobs (from http://build.mozilla.org/builds/running.html):
mozilla-aurora 9a3234ac5c1c Android mozilla-aurora build
mozilla-aurora 82f49f622e9d Android mozilla-aurora build
mozilla-aurora ef4909389600 Android mozilla-aurora build
mozilla-aurora 433cd269be19 Android mozilla-aurora build
579cbf7a9add Android mozilla-aurora build
433cd269be19 Android mozilla-aurora build
He would not have had the means to trigger those 3 csets that were on the same push.
The builds should show up in http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android/
Comment 16•13 years ago
How exactly is bug 654049 involved in this?
Comment 17•13 years ago
It looks like the crash started here:
http://hg.mozilla.org/releases/mozilla-aurora/rev/82f49f622e9d
Depends on: 654049
Assignee | ||
Comment 18•13 years ago
I did some debugging, and it seems this is not really a JS bug but rather some strange linker magic.
Functions from pixman_arm_neon_asm.o are supposed to be at least 4-byte aligned, which is the case before bug 672026:
arm-linux-androideabi-objdump -t dist/lib/libxul.so | grep 'pixman.\+neon'
> 009e57b8 l F .text 00000000 .hidden pixman_composite_src_0888_0565_rev_asm_neon
> 009e79f8 l F .text 00000000 .hidden pixman_scaled_nearest_scanline_8888_8888_OVER_asm_neon
> 009dc3d8 l F .text 00000000 .hidden pixman_composite_scanline_add_asm_neon
> 009eb17c l F .text 00000000 .hidden pixman_scaled_bilinear_scanline_8888_8888_OVER_asm_neon
> 009e6158 l F .text 00000000 .hidden pixman_composite_over_0565_8_0565_asm_neon
> 009e8948 l F .text 00000000 .hidden pixman_scaled_nearest_scanline_0565_8888_SRC_asm_neon
> 009e16c8 l F .text 00000000 .hidden pixman_composite_add_n_8_8_asm_neon
> 009df054 l F .text 00000000 .hidden pixman_composite_src_n_0565_asm_neon
But after bug 672026, everything from pixman_arm_neon_asm.o is now offset by 2 bytes (address in the first column):
arm-linux-androideabi-objdump -t dist/lib/libxul.so | grep 'pixman.\+neon'
> 009e5882 l F .text 00000000 .hidden pixman_composite_src_0888_0565_rev_asm_neon
> 009e7ac2 l F .text 00000000 .hidden pixman_scaled_nearest_scanline_8888_8888_OVER_asm_neon
> 009dc4a2 l F .text 00000000 .hidden pixman_composite_scanline_add_asm_neon
> 009eb246 l F .text 00000000 .hidden pixman_scaled_bilinear_scanline_8888_8888_OVER_asm_neon
> 009e6222 l F .text 00000000 .hidden pixman_composite_over_0565_8_0565_asm_neon
> 009e8a12 l F .text 00000000 .hidden pixman_scaled_nearest_scanline_0565_8888_SRC_asm_neon
> 009e1792 l F .text 00000000 .hidden pixman_composite_add_n_8_8_asm_neon
> 009df11e l F .text 00000000 .hidden pixman_composite_src_n_0565_asm_neon
The strange thing is that this only happens to pixman_arm_neon_asm.o.
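The comparison of the two symbol listings above can be automated. The following is a minimal sketch, assuming the `objdump -t` row layout shown in this comment (address in the first column, symbol name in the last); the function name `misaligned_symbols` and the constants are made up for illustration and are not part of any Mozilla tooling.

```python
import re

# One symbol-table row as printed by `objdump -t`, for example:
# "009e5882 l F .text 00000000 .hidden pixman_composite_src_0888_0565_rev_asm_neon"
# A leading "> " (the quote marker used in this bug comment) is tolerated.
SYMBOL_ROW = re.compile(r"^(?:>\s*)?([0-9a-fA-F]{8})\s+.*\s(\S+)$")


def misaligned_symbols(objdump_output, alignment=4):
    """Return (name, address) pairs whose address is not a multiple of `alignment`."""
    result = []
    for line in objdump_output.splitlines():
        match = SYMBOL_ROW.match(line.strip())
        if not match:
            continue  # skip headers, blank lines, and anything unparseable
        address = int(match.group(1), 16)
        if address % alignment != 0:
            result.append((match.group(2), address))
    return result


# Rows copied from the two listings above (same symbol, before and after).
BEFORE = "> 009e57b8 l F .text 00000000 .hidden pixman_composite_src_0888_0565_rev_asm_neon"
AFTER = "> 009e5882 l F .text 00000000 .hidden pixman_composite_src_0888_0565_rev_asm_neon"

print(misaligned_symbols(BEFORE))
print(misaligned_symbols(AFTER))
```

In practice the input would be the captured output of the `objdump -t ... | grep` command quoted above rather than hard-coded rows.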
Now, when we call these functions, the blx instruction implies 4-byte alignment of the target:
> 009da4dc <neon_composite_src_8888_8888+0x3c>:
> 9da4dc: 9000 str r0, [sp, #0]
> 9da4de: 980b ldr r0, [sp, #44]
> 9da4e0: f004 efa8 blx 9df434 <pixman_composite_src_8888_8888_asm_neon+0x2>
> 9da4e4: b003 add sp, #12
> 9da4e6: bd00 pop {pc}
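The rounding in the blx target can be checked by hand. This is a rough sketch, assuming the classic two-halfword Thumb BL/BLX encoding (an 11-bit high offset in the first halfword, an 11-bit low offset in the second, with the low two bits of the result cleared for blx); `thumb_blx_target` is a hypothetical helper, and the halfwords `f004 efa8` are taken from the listing above.

```python
def thumb_blx_target(insn_addr, hw1, hw2):
    """Decode the target of a Thumb `blx <label>` given as two halfwords.

    Assumes the classic encoding: hw1 = 11110 + high 11 offset bits,
    hw2 = 11101 + low 11 offset bits.
    """
    if hw1 >> 11 != 0b11110 or hw2 >> 11 != 0b11101:
        raise ValueError("not a classic Thumb BLX halfword pair")
    high = hw1 & 0x7FF
    if high & 0x400:          # sign-extend the 11-bit high part
        high -= 0x800
    low = hw2 & 0x7FF
    pc = insn_addr + 4        # in Thumb state, PC reads as instruction address + 4
    # blx switches to ARM state, so the target is forced to a 4-byte boundary.
    return (pc + (high << 12) + (low << 1)) & 0xFFFFFFFC


TARGET = thumb_blx_target(0x9DA4E0, 0xF004, 0xEFA8)
SYMBOL = 0x9DF432  # where pixman_composite_src_8888_8888_asm_neon really starts

print(hex(TARGET), hex(SYMBOL), TARGET - SYMBOL)
```

Whatever offset the linker writes, the decoded target is always a multiple of 4, so a function placed at an address ending in 2 cannot be reached exactly by this instruction.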
So our nice ARM instructions:
> 009df432 <pixman_composite_src_8888_8888_asm_neon>:
> 9df432: e92d5ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
> 9df436: e59d4028 ldr r4, [sp, #40]
> 9df43a: e3a0a000 mov sl, #0 ; 0x0
> 9df43e: e59d502c ldr r5, [sp, #44]
> 9df442: e1a06002 mov r6, r2
> 9df446: e1a0b004 mov fp, r4
> 9df44a: e1a0c006 mov ip, r6
> 9df44e: e1a0e007 mov lr, r7
Turn into gibberish due to the 2-byte offset:
> 009df434 <pixman_composite_src_8888_8888_asm_neon+0x2>:
> 9df434: 4028e92d eormi lr, r8, sp, lsr #18
> 9df438: a000e59d mulge r0, sp, r5
> 9df43c: 502ce3a0 eorpl lr, ip, r0, lsr #7
> 9df440: 6002e59d mulvs r2, sp, r5
> 9df444: b004e1a0 andlt lr, r4, r0, lsr #3
> 9df448: c006e1a0 andgt lr, r6, r0, lsr #3
> 9df44c: e007e1a0 and lr, r7, r0, lsr #3
> 9df450: 9201e1a0 andls lr, r1, #40 ; 0x28
And sooner or later we crash.
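The "gibberish" is just the same bytes read through a window shifted by 2. A small sketch, assuming little-endian memory (as on these Android ARM devices) and using the first eight instruction words from the correct listing above; `words_at_byte_offset` is an illustrative name.

```python
import struct

# First eight ARM instruction words of pixman_composite_src_8888_8888_asm_neon,
# copied from the correctly decoded listing above.
ARM_WORDS = [0xE92D5FF0, 0xE59D4028, 0xE3A0A000, 0xE59D502C,
             0xE1A06002, 0xE1A0B004, 0xE1A0C006, 0xE1A0E007]


def words_at_byte_offset(words, offset):
    """Lay the words out as little-endian bytes and re-read 32-bit words from `offset`."""
    raw = struct.pack("<%dI" % len(words), *words)
    usable = (len(raw) - offset) // 4   # whole words still available after the shift
    return list(struct.unpack_from("<%dI" % usable, raw, offset))


SHIFTED = words_at_byte_offset(ARM_WORDS, 2)
for word in SHIFTED:
    print("%08x" % word)
```

Each shifted word is the top half of one real instruction glued to the bottom half of the next, which is why the first printed value is `4028e92d`, the same word the disassembler shows at `9df434`.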
This only happens to that bit of NEON assembly, and our Tegra boards don't have NEON, so this was not caught by tests.
Also this doesn't happen with NDK5, so one more reason to switch :) I will try to find out if NDK5 doesn't have this linker bug because it was fixed or because the conditions for the bug aren't met under NDK5.
Comment 19•13 years ago
Yes, this looks like the same issue as bug 666931 and bug 623161
Comment 20•13 years ago
In the future please don't trigger nightlies when regression hunting. If you need clean builds, use https://build.mozilla.org/clobberer/ to clobber the builder, and use normal opt builds. Triggering multiple nightlies in parallel has unknown behaviour, and seems to cause us to temporarily strand users (bug 673501).
Thanks!
Comment 21•13 years ago
I asked Timothy B. Terriberry on IRC, and he explained the problem further and provided a link to the corresponding bug in the binutils bug tracker:
http://sourceware.org/bugzilla/show_bug.cgi?id=12931
For now, the workaround (also applied to WebM earlier) is to explicitly set the alignment for code sections, and the following patch should do it for pixman:
http://lists.freedesktop.org/archives/pixman/2011-July/001347.html
Please confirm whether it really helps to resolve this bug. And if it does, then it makes sense to do a complete review of all the arm assembly code in Mozilla to see if such workarounds should be also applied somewhere else.
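As a starting point for such a review, a crude scan could flag assembly sources that open a code section but never state an alignment. This is only a heuristic sketch: it assumes GNU assembler syntax (`.text`/`.section` and `.align`/`.balign`/`.p2align`), the file names are hypothetical, and a file that passes is not thereby proven safe.

```python
import re

# GNU as directives that start a section, and directives that set alignment.
SECTION_DIRECTIVE = re.compile(r"^\s*\.(text|section)\b", re.MULTILINE)
ALIGN_DIRECTIVE = re.compile(r"^\s*\.(align|balign|p2align)\b", re.MULTILINE)


def sources_without_alignment(sources):
    """Given {file name: file text}, return names that open a section but never align."""
    return sorted(
        name for name, text in sources.items()
        if SECTION_DIRECTIVE.search(text) and not ALIGN_DIRECTIVE.search(text)
    )


# Hypothetical file contents, only to show the behaviour.
EXAMPLE_SOURCES = {
    "unaligned_example.S": ".text\n.global f\nf:\n    bx lr\n",
    "aligned_example.S": ".text\n.p2align 2\n.global g\ng:\n    bx lr\n",
    "no_code_example.h": "#define FOO 1\n",
}

print(sources_without_alignment(EXAMPLE_SOURCES))
```

A real review would read the files from the source tree and would still need a human to check each hit, since alignment can also come from macros or included headers that this scan does not follow.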
Assignee | ||
Comment 22•13 years ago
> Please confirm whether it really helps to resolve this bug. And if it does,
> then it makes sense to do a complete review of all the arm assembly code in
> Mozilla to see if such workarounds should be also applied somewhere else.
Yes, this does fix the bug. Thank you for identifying the issue.
I agree a complete review will be very helpful, before another innocent person gets bitten by this bug again :)
Comment 23•13 years ago
Has a bug been filed to get the pixman alignment fix into the mozilla codebase?
Updated•13 years ago
Crash Signature: [@ _moz_pixman_image_composite32] → [@ libxul.so@0x9df494]
[@ _moz_pixman_image_composite32]
tracking-firefox7:
--- → +
Comment 25•13 years ago
So, what do we need to do here for Firefox 7? Nothing? This is an existing problem? Do we have the workaround mentioned in comment 21 in mozilla-central or mozilla-beta? We must get some action on this today, preferably a resolution if it needs it.
Comment 26•13 years ago
(In reply to Christian Legnitto [:LegNeato] from comment #25)
> So, what do we need to do here for Firefox 7?
I would suggest cherry-picking http://cgit.freedesktop.org/pixman/commit/?id=b8d6babc91459a9f854695b56f0265298a3c6427 and applying it to Mozilla's copy of pixman.
And while you are at it, there is also bug 667284 with a simple fix available, which would also be nice to have applied.
Assignee | ||
Comment 27•13 years ago
Here's the patch for Mozilla. It's not in mozilla-central or anywhere else, but it would be very good to have.
Attachment #560443 -
Flags: review?(siarhei.siamashka)
Assignee | ||
Comment 28•13 years ago
Comment on attachment 560443 [details] [diff] [review]
Fix
Also nominating for Aurora and Beta, since the crash was originally from Fennec 7.
The patch has virtually no risk; it specifies one alignment attribute for three assembly files.
Without it, any build can potentially contain this crash, and the tegras cannot catch it all the time.
Attachment #560443 -
Flags: approval-mozilla-beta?
Attachment #560443 -
Flags: approval-mozilla-aurora?
Comment 29•13 years ago
There were 2 crashes on Android for the past week and ~45 on Windows. Not high enough volume (though it is a startup crash, and those are underreported). Denying approval for beta.
status-firefox7:
--- → wontfix
Attachment #560443 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Comment 30•13 years ago
Comment on attachment 560443 [details] [diff] [review]
Fix
Review of attachment 560443 [details] [diff] [review]:
-----------------------------------------------------------------
r+ from me
Comment 31•13 years ago
Comment on attachment 560443 [details] [diff] [review]
Fix
a=jst per today's driver meeting (and this was reviewed, the flag just didn't get set, and given the nature of this change we're ok approving this before it's been landed in mozilla-central).
Attachment #560443 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 32•13 years ago
Comment on attachment 560443 [details] [diff] [review]
Fix
making the r+ official
Attachment #560443 -
Flags: review?(siarhei.siamashka) → review+
Comment 33•13 years ago
pushed to aurora and inbound
https://hg.mozilla.org/releases/mozilla-aurora/rev/0b54cb43cec5
https://hg.mozilla.org/integration/mozilla-inbound/rev/da9d9d9d9809
status-firefox8:
--- → fixed
Target Milestone: --- → mozilla8
Comment 34•13 years ago
Assignee: nobody → jimnchen+bmo
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED