Closed Bug 824777 Opened 12 years ago Closed 12 years ago

firefox crash on trulia.com

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:tef+, blocking-basecamp:-, b2g18 affected, b2g18-v1.0.0 affected, b2g18-v1.0.1 affected)

VERIFIED WORKSFORME
B2G C4 (2jan on)

People

(Reporter: mluna, Assigned: diego)

Details

(Keywords: b2g-testdriver, crash, unagi, Whiteboard: [cr 440442][mvines following up internally to look for fix][b2g-crash][b2g-gfx])

Attachments

(2 files)

Attached file: log of the crash (deleted)
On the 12.12.2012 build on unagi, I can repeatedly crash the OS on Trulia:
1. Go to trulia.com.
2. Perform a search.
3. Tap the Next button at the bottom of the page.
4. Firefox OS crashes (often there is no notification that it crashed).

Expected: the next page of search results is shown. Log attached.
Component: Gaia → General
I see this as well on the 12-26 nightly - right now my crash reports seem to be stuck in pending. In my case I didn't even have to finish the search to generate a crash.
Keywords: reproducible
I couldn't reproduce this crash in a build from today. Do we have any data on the frequency of this crash? Getting a crash-stats link would be great too.
blocking-basecamp: ? → -
I hit this too. Just clicked on "search" then the phone rebooted as the new page was loading. Otoro 20121226203932.
blocking-basecamp: - → +
Crash Signature: [@ libgenlock.so@0x751]
Gralloc
Snorp, do you have a clue where we could start here?
Assignee: nobody → snorp
If I had to guess I'd say it's a bug in the msm gralloc driver, but we could also be trying to allocate some crazy huge amount or something like that. I can take a closer look when I'm back from vacation.
When are you back from vacation? Can anyone else look?
Flags: needinfo?(jpr)
Assignee: snorp → gwright
Target Milestone: --- → B2G C4 (2jan on)
Can you confirm which version of trulia.com this happens on? I'm using the desktop version of the site and I can't seem to reproduce on an unagi.
Flags: needinfo?(mluna)
With a 2013-01-03 build I can reproduce this with the following STR on my unagi:
1. Go to m.trulia.com (redirected from trulia.com) in the browser.
2. Click "Homes for sale".
3. Click the magnifying glass in the top right corner.
4. Don't change any fields and press "Search" in the bottom right.
Flags: needinfo?(mluna)
Flags: needinfo?(jpr)
Severity: normal → critical
Keywords: crash
Whiteboard: [b2g-crash]
I've got some time to look at this today. Unfortunately most of my Friday was spent fighting with the b2g buildsystem :(
Notes: I just tried the steps in Comment 11 using a local head build; all that happens is that the site complains that nothing was entered (due to item #4). Because the web site keeps changing, it may help to have static content to test against (git log: e39daf356831b473c61793fb8dece17fa4961d8b in gecko).
Whiteboard: [b2g-crash] → [b2g-crash][b2g-gfx]
I'll update to latest b2g sources and see if I can still reproduce a crash, then.
So with the current b2g sources I can't reproduce based on the steps in comment 11. I can browse and I've hit "next" a few times. Interestingly, I also don't get the site complaint re. empty fields that Garner has reported in comment 13.
OK, an older build of b2g also doesn't crash, so it seems the site has changed and we no longer have a reproducible testcase :(
blocking-basecamp: + → ?
Keywords: reproducible → qawanted
I just crashed using the latest unagi nightly, 20130107 - https://crash-stats.mozilla.com/report/index/d4714331-cf83-4990-b8e5-b9b152130107 is my report. I will see if I can repro more reliably.
OK, we'll keep it as a blocker if you can crash.
blocking-basecamp: ? → +
Marcia, are you following Comment 11 steps or Comment 1 steps? If 11, what is in the search field by default when you search? It's possible that the default location is based on the location of the phone, so different people running the search get different results and some of them cause the crash and some don't...
I managed to crash it a few times now:
* just going into the search page
* clearing the city and getting out of that field
* a few seconds after the last user operation, not even trying to crash it

It seems to get easier once you crash it once, if you don't power down the phone.
Milan: another crash: https://crash-stats.mozilla.com/report/index/7988cd15-692c-4c84-bd16-ef2312130107

I don't follow the steps in either comment exactly. The first crash came when I had the default location set (Mountain View). I did a search with some parameters adjusted (BR, square footage, etc.) and I crashed. The second time I switched the location to a zip code, altered some of the search criteria, pressed the home button and then crashed. It seems that the site is pretty unstable in general. In both cases I never invoked a next button to generate a crash.
Crash Signature: [@ libgenlock.so@0x751] → [@ libgenlock.so@0x751] [@ gralloc::PmemAshmemController::allocate] [@ main]
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → ARM
QA Contact: tchung → mozillamarcia.knous
We have been able to get crashes in different ways on this site; George is working on getting the apitrace for a crash now. It does appear to be a race condition or something like that, as the same workflow sometimes works.
I'm struggling to get a crash with apitrace enabled, possibly due to differences in timing. :( I'll continue trying and also try to use our internal GL debug modes baked into GLContext.
Not sure if this is the same problem. It's produced by the same steps.

(gdb) bt
#0 0x42685b08 in ?? () from /home/tingyuan/B2G/unagi/out/target/product/unagi/system/lib/egl/eglsubAndroid.so
#1 0x426c0424 in eglImageLock () from /home/tingyuan/B2G/unagi/out/target/product/unagi/system/lib/egl/libEGL_adreno200.so
#2 0x42979000 in ?? ()
Cannot access memory at address 0x3f7ffff8

(gdb) info registers
r0   0x43f8bf30 1140375344
r1   0x4a775610 1249334800
r2   0x0 0
r3   0x3000 12288
r4   0x4a775610 1249334800
r5   0x70b 1803
r6   0x70b 1803
r7   0x49d0aa60 1238411872
r8   0x4a9d0000 1251803136
r9   0xd 13
r10  0x0 0
r11  0x4acce934 1254943028
r12  0x0 0
sp   0x43a5d0d8 0x43a5d0d8
lr   0x426c0424 1114375204
pc   0x42685b08 0x42685b08
cpsr 0x20000030 536870960

The faulting instruction is:
0000153c <eglSubDriverMain>:
...
1b08: 6848 ldr r0, [r1, #4]
...

And the faulting address (0x4a775614) is near pmem:
49e00000-4a3ab000 rwxs 0bd00000 00:0b 874 /dev/pmem
4a906000-4aa00000 rwxp 00000000 00:00 0

I'm digging the stack by hand since some of the libraries are stripped. Or can we ask the manufacturer to provide libraries with debug info?
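The faulting-address arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch: it covers the two map ranges quoted in the comment (the full /proc/pid/maps was not posted), and the names `QUOTED_MAPS` and `find_mapping` are made up for illustration.

```python
# Map ranges copied from the comment above: (start, end, name), end exclusive.
# Assumption: only the two ranges quoted in the comment are known.
QUOTED_MAPS = [
    (0x49E00000, 0x4A3AB000, "/dev/pmem"),
    (0x4A906000, 0x4AA00000, "anonymous"),
]


def find_mapping(addr, maps):
    """Return the name of the mapping that contains addr, or None."""
    for start, end, name in maps:
        if start <= addr < end:
            return name
    return None


r1 = 0x4A775610        # value of r1 in the register dump
fault_addr = r1 + 4    # `ldr r0, [r1, #4]` loads from r1 + 4

print(hex(fault_addr), find_mapping(fault_addr, QUOTED_MAPS))
```

The computed address matches the 0x4a775614 quoted in the comment and lies between the two quoted ranges, inside neither.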
Here is the hand-decoded stack trace:
#0 0x42685b08 sp = 0x43a5d0d8, eglSubDriverMain from eglsubAndroid.so
#1 0x426c0424 sp = 0x43a5d0e0, eglImageLock from libEGL_adreno200.so
#2 0x42c7f5be sp = 0x43a5d0f0, lock_egl_image_for_sw from libGLESv2_adreno200.so
#3 0x42c75df8 sp = 0x43a5d0f8, qgl2DrvAPI_glEGLImageTargetTexture2DOES from libGLESv2_adreno200.so
#4 0x42c6c5f5 sp = 0x43a5d140, glEGLImageTargetTexture2DOES from libGLESv2_adreno200.so
#5 0x411f6162 sp = 0x43a5d148, ... from libxul.so

By setting $pc & $sp to #5 (#5 in the above == #0 below):
#0 mozilla::gl::GLContext::fEGLImageTargetTexture2D (this=<value optimized out>, aTextureUnit=1803) at /home/tingyuan/work/mozilla-b2g18/gfx/gl/GLContext.h:3223
#1 mozilla::gl::TextureImageEGL::BindTexture (this=<value optimized out>, aTextureUnit=1803) at /home/tingyuan/work/mozilla-b2g18/gfx/gl/GLContextProviderEGL.cpp:1561
#2 0x411f0030 in mozilla::gl::TiledTextureImage::BindTexture (this=0x4938df70, aTextureUnit=33984) at /home/tingyuan/work/mozilla-b2g18/gfx/gl/GLContext.cpp:1205
#3 0x411d839c in ScopedBindTexture (this=0x43a5d264, aTexture=<value optimized out>, aTextureUnit=<value optimized out>) at ../../dist/include/GLContext.h:235
#4 ScopedBindTextureAndApplyFilter (this=0x43a5d264, aTexture=<value optimized out>, aTextureUnit=<value optimized out>) at ../../dist/include/GLContext.h:255
#5 0x411dc3a2 in mozilla::layers::ShadowImageLayerOGL::RenderLayer (this=0x483a4800, aPreviousFrameBuffer=<value optimized out>, aOffset=...) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ImageLayerOGL.cpp:996
#6 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x4acce800, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#7 mozilla::layers::ShadowContainerLayerOGL::RenderLayer (this=0x4acce800, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#8 0x411da6e6 in ContainerRender<mozilla::layers::ShadowRefLayerOGL> (this=0x477b0000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#9 mozilla::layers::ShadowRefLayerOGL::RenderLayer (this=0x477b0000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:490
#10 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x4a8d3000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#11 mozilla::layers::ShadowContainerLayerOGL::RenderLayer (this=0x4a8d3000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#12 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x4a8d2800, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#13 mozilla::layers::ShadowContainerLayerOGL::RenderLayer (this=0x4a8d2800, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#14 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x495b5000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#15 mozilla::layers::ShadowContainerLayerOGL::RenderLayer (this=0x495b5000, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#16 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x47da1400, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:263
#17 mozilla::layers::ShadowContainerLayerOGL::RenderLayer (this=0x47da1400, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#18 0x411da372 in ContainerRender<mozilla::layers::ShadowContainerLayerOGL> (this=0x47da0c00, aPreviousFrameBuffer=0, aOffset=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/ContainerLayerOGL.cpp:449
#20 0x411dea5c in mozilla::layers::LayerManagerOGL::Render (this=0x47432350) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/LayerManagerOGL.cpp:1028
#21 0x411deea6 in mozilla::layers::LayerManagerOGL::EndTransaction (this=0x47432350, aCallback=0, aCallbackData=0x0, aFlags=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/LayerManagerOGL.cpp:700
#22 0x411dca1a in mozilla::layers::LayerManagerOGL::EndEmptyTransaction (this=0x43f8bf30, aFlags=12288) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/opengl/LayerManagerOGL.cpp:641
#23 0x411ea0d2 in mozilla::layers::CompositorParent::Composite (this=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/gfx/layers/ipc/CompositorParent.cpp:582
#24 0x410688e6 in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/tuple.h:383
#25 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/task.h:307
#26 0x4118e66c in MessageLoop::RunTask (this=0x43a5ddf0, task=0x4a775610) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:333
#27 0x4118f49e in MessageLoop::DeferOrRunPendingTask (this=0x43f8bf30, pending_task=<value optimized out>) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:341
#28 0x4118f50e in MessageLoop::DoDelayedWork (this=0x43a5ddf0, next_delayed_work_time=0x43882e50) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:468
#29 0x4119030e in base::MessagePumpDefault::Run (this=0x43882e40, delegate=0x43a5ddf0) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_pump_default.cc:27
#30 0x4118e61c in MessageLoop::RunInternal (this=0x4d2c4) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:215
#31 0x4118e6d2 in MessageLoop::RunHandler (this=0x43a5ddf0) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:208
#32 MessageLoop::Run (this=0x43a5ddf0) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:182
#33 0x411969dc in base::Thread::ThreadMain (this=0x42960430) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/thread.cc:156
#34 0x411a09b0 in ThreadFunc (closure=0x1) at /home/tingyuan/work/mozilla-b2g18/ipc/chromium/src/base/platform_thread_posix.cc:39
#35 0x4004ce18 in __thread_entry (func=0x411a09a9 <ThreadFunc>, arg=0x42960430, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#36 0x4004c96c in pthread_create (thread_out=<value optimized out>, attr=0xbebcd280, start_routine=0x411a09a9 <ThreadFunc>, arg=0x42960430) at bionic/libc/bionic/pthread.c:357
Comment 25 shows a crash in the Adreno driver, under glEGLImageTargetTexture2DOES. We need help from Qualcomm looking into this. CC'ing Michael Vines.
Can somebody demo this crash to me in person? STR from comment 11 don't result in a crash on my device.
(In reply to Michael Vines [:m1] from comment #27)
> Can somebody demo this crash to me in person? STR from comment 11 don't
> result in a crash on my device.

With the STR in comment 11, I can easily reproduce this on unagi and otoro, with code pulled hours ago. How can I demo it to you?
I've also been having trouble reproducing, but eventually it should crash. I think if you keep browsing around on m.trulia.com and fiddling with the search parameters, hitting next, changing/clearing the location field etc, that should do the trick.
ISTR from my work with gralloc on Fennec that some drivers didn't want you messing with the buffer in GL while also locking it for writing. Android uses double (or triple) buffering to avoid this issue, and that's what I ended up doing in AndroidDirectTexture as well.
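As a rough illustration of the double-buffering idea mentioned above, here is a minimal Python sketch. It is not the AndroidDirectTexture code or anything from gralloc; the `DoubleBuffer` class and its methods are hypothetical names. The point is only that the writer fills a back buffer while readers keep using the front one, and the two are exchanged in a single step.

```python
class DoubleBuffer:
    """Two equally sized buffers: readers see `front`, writers fill `back`.

    Hypothetical illustration only; real gralloc buffers are locked and
    handed to the GPU through driver calls that are not modelled here.
    """

    def __init__(self, size):
        self._buffers = [bytearray(size), bytearray(size)]
        self._front = 0  # index of the buffer currently visible to readers

    @property
    def front(self):
        return self._buffers[self._front]

    @property
    def back(self):
        return self._buffers[1 - self._front]

    def swap(self):
        """Publish the back buffer; the old front becomes the new back."""
        self._front = 1 - self._front


db = DoubleBuffer(4)
db.back[:] = b"\x01\x02\x03\x04"  # the writer never touches the front buffer
before = bytes(db.front)          # readers still see the old contents
db.swap()
after = bytes(db.front)           # readers now see the freshly written data
```

Triple buffering follows the same pattern with one extra buffer, so the writer does not have to wait for the reader to release the old front.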
Whiteboard: [b2g-crash][b2g-gfx] → [b2g-crash][b2g-gfx][shadow:snorp]
bjacob is helping George with this as well; let's keep each other up to speed as to what's being tried?
Just got a gralloc crash, which is different to the one in comment 25.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 491.508]
memset () at bionic/libc/arch-arm/bionic/memset.S:77
77 vst1.32 {q0, q1}, [r0]!
(gdb) bt
#0 memset () at bionic/libc/arch-arm/bionic/memset.S:77
#1 0x41daf402 in gralloc::PmemUserspaceAlloc::alloc_buffer (this=0x40301b60, data=...) at hardware/qcom/display/libgralloc/pmemalloc.cpp:213
#2 0x41dafe42 in gralloc::PmemAshmemController::allocate (this=0x40301b40, data=..., usage=307, compositionType=3) at hardware/qcom/display/libgralloc/alloc_controller.cpp:301
#3 0x41f735f2 in gralloc::gpu_context_t::gralloc_alloc_buffer (this=0x41f735f3, size=4096, usage=307, pHandle=0x49a2a4c0, bufferType=0, format=1, width=32, height=32) at hardware/qcom/display/libgralloc/gpu.cpp:165
#4 0x41f7389c in gralloc::gpu_context_t::alloc_impl (this=0x489e80b0, w=1, h=1, format=1, usage=307, pHandle=0x49a2a4c0, pStride=0x49a2a4ac, bufferSize=0) at hardware/qcom/display/libgralloc/gpu.cpp:245
#5 0x41f73926 in gralloc::gpu_context_t::gralloc_alloc (dev=0x4a587000, w=<value optimized out>, h=<value optimized out>, format=<value optimized out>, usage=307, pHandle=0x49a2a4c0, pStride=0x49a2a4ac) at hardware/qcom/display/libgralloc/gpu.cpp:296
#6 0x402ec2e6 in android::GraphicBufferAllocator::alloc (this=<value optimized out>, w=1, h=1, format=1, usage=307, handle=0x49a2a4c0, stride=0x49a2a4ac) at frameworks/base/libs/ui/GraphicBufferAllocator.cpp:102
#7 0x402ebc62 in android::GraphicBuffer::initSize (this=0x49a2a480, w=1, h=1, format=1, reqUsage=307) at frameworks/base/libs/ui/GraphicBuffer.cpp:149
#8 0x402ebfd6 in GraphicBuffer (this=0x49a2a480, w=1, h=1, reqFormat=1, reqUsage=307) at frameworks/base/libs/ui/GraphicBuffer.cpp:62
#9 0x411fa21c in mozilla::gl::TextureImageEGL::CreateBackingSurface (this=0x4800e5c0, aSize=...) at /home/george/dev/B2G/gecko/gfx/gl/GLContextProviderEGL.cpp:1857
#10 0x411fab46 in mozilla::gl::TextureImageEGL::Resize (this=0x4800e5c0, aSize=...) at /home/george/dev/B2G/gecko/gfx/gl/GLContextProviderEGL.cpp:1596
#11 0x411fac84 in TextureImageEGL (this=0x49c23800, aSize=..., aContentType=gfxASurface::CONTENT_COLOR_ALPHA, aFlags=mozilla::gl::TextureImage::NoFlags) at /home/george/dev/B2G/gecko/gfx/gl/GLContextProviderEGL.cpp:1322
#12 mozilla::gl::GLContextEGL::TileGenFunc (this=0x49c23800, aSize=..., aContentType=gfxASurface::CONTENT_COLOR_ALPHA, aFlags=mozilla::gl::TextureImage::NoFlags) at /home/george/dev/B2G/gecko/gfx/gl/GLContextProviderEGL.cpp:1993
#13 0x411f888e in mozilla::gl::TiledTextureImage::Resize (this=0x49c8c550, aSize=...) at /home/george/dev/B2G/gecko/gfx/gl/GLContext.cpp:1282
#14 0x411f89dc in TiledTextureImage (this=0x49c8c550, aGL=0x49c23800, aSize=..., aContentType=gfxASurface::CONTENT_COLOR_ALPHA, aFlags=mozilla::gl::TextureImage::NoFlags) at /home/george/dev/B2G/gecko/gfx/gl/GLContext.cpp:947
#15 0x411fa7d0 in mozilla::gl::GLContextEGL::CreateTextureImage (this=0x49c23800, aSize=..., aContentType=gfxASurface::CONTENT_COLOR_ALPHA, aWrapMode=<value optimized out>, aFlags=mozilla::gl::TextureImage::NoFlags) at /home/george/dev/B2G/gecko/gfx/gl/GLContextProviderEGL.cpp:1954
#16 0x411df478 in mozilla::layers::ShadowImageLayerOGL::Init (this=0x47ce0000, aFront=...) at /home/george/dev/B2G/gecko/gfx/layers/opengl/ImageLayerOGL.cpp:719
#17 0x411df6bc in mozilla::layers::ShadowImageLayerOGL::Swap (this=0x47ce0000, aNewFront=..., aNewBack=0x43bffa00) at /home/george/dev/B2G/gecko/gfx/layers/opengl/ImageLayerOGL.cpp:784
#18 0x411f2ed6 in mozilla::layers::ShadowLayersParent::RecvUpdate (this=<value optimized out>, cset=<value optimized out>, targetConfig=<value optimized out>, isFirstPaint=<value optimized out>, reply=0x43bffbdc) at /home/george/dev/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:447
#19 0x410bc7ba in mozilla::layers::PLayersParent::OnMessageReceived (this=0x4856b100, __msg=<value optimized out>, __reply=@0x43bffcfc) at /home/george/dev/B2G/objdir-gecko/ipc/ipdl/PLayersParent.cpp:509
#20 0x410b681e in mozilla::layers::PCompositorParent::OnMessageReceived (this=0x49ff9d80, __msg=..., __reply=@0x43bffcfc) at /home/george/dev/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:393
#21 0x4108b2f8 in mozilla::ipc::SyncChannel::OnDispatchMessage (this=0x49ff9d88, msg=...) at /home/george/dev/B2G/gecko/ipc/glue/SyncChannel.cpp:144
#22 0x41089df8 in mozilla::ipc::RPCChannel::OnMaybeDequeueOne (this=0x49ff9d88) at /home/george/dev/B2G/gecko/ipc/glue/RPCChannel.cpp:400
#23 0x4106c8ae in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/tuple.h:383
#24 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/task.h:307
#25 0x410887b8 in mozilla::ipc::RPCChannel::RefCountedTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:425
#26 mozilla::ipc::RPCChannel::DequeueTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:448
#27 0x4119263c in MessageLoop::RunTask (this=0x43bffdf0, task=0x0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:333
#28 0x4119346e in MessageLoop::DeferOrRunPendingTask (this=0x49ff9d88, pending_task=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:341
#29 0x4119404c in MessageLoop::DoWork (this=0x43bffdf0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:441
#30 0x411942cc in base::MessagePumpDefault::Run (this=0x43c82e40, delegate=0x43bffdf0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_pump_default.cc:23
#31 0x411925ec in MessageLoop::RunInternal (this=0x4d2cc) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:215
#32 0x411926a2 in MessageLoop::RunHandler (this=0x43bffdf0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208
#33 MessageLoop::Run (this=0x43bffdf0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:182
#34 0x4119a9ac in base::Thread::ThreadMain (this=0x42a60430) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/thread.cc:156
#35 0x411a4980 in ThreadFunc (closure=0x1) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/platform_thread_posix.cc:39
#36 0x4007be18 in __thread_entry (func=0x411a4979 <ThreadFunc>, arg=0x42a60430, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#37 0x4007b96c in pthread_create (thread_out=<value optimized out>, attr=0xbea31260, start_routine=0x411a4979 <ThreadFunc>, arg=0x42a60430) at bionic/libc/bionic/pthread.c:357
#38 0x48a199e0 in ?? ()
Cannot access memory at address 0x0
#39 0x48a199e0 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I got this to crash in gdb using the original STR. http://www.pastebin.mozilla.org/2045919 Nothing really jumps out at me except width/height are 1.
Interesting logcat output around the crash:

I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
E/Adreno200-ES20( 109): <gl2_surface_swap:41>: GL_OUT_OF_MEMORY
E/Adreno200-EGL( 109): <qeglDrvAPI_eglSwapBuffers:3345>: EGL_BAD_ALLOC
I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
E/Adreno200-EGL( 109): <qeglDrvAPI_eglSwapBuffers:3345>: EGL_BAD_ALLOC
I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
E/Adreno200-EGL( 109): <qeglDrvAPI_eglSwapBuffers:3345>: EGL_BAD_ALLOC
I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
E/Adreno200-EGL( 109): <qeglDrvAPI_eglSwapBuffers:3345>: EGL_BAD_ALLOC
I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
E/Adreno200-EGL( 109): <qeglDrvAPI_eglSwapBuffers:3345>: EGL_BAD_ALLOC
I/Adreno ( 109): ioctl code 0xc0140910 (IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS) failed: errno 22 Invalid argument
I/Adreno ( 109): ioctl code 0x400c0912 (IOCTL_KGSL_CMDSTREAM_FREEMEMONTIMESTAMP) failed: errno 22 Invalid argument
D/memalloc( 490): /dev/pmem: Unmapping buffer base:0x461a5000 size:3678208 offset:3227648
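For readers unfamiliar with the raw ioctl codes in the log, the following Python sketch splits them into their fields using the generic Linux `_IOC` bit layout (direction in the top 2 bits, then a 14-bit argument size, an 8-bit driver type and an 8-bit command number). The helper name `decode_ioctl` is made up for this example; the two codes and errno 22 are taken from the log above.

```python
import errno


def decode_ioctl(code):
    """Split a Linux ioctl request number into its _IOC fields."""
    return {
        "dir": (code >> 30) & 0x3,      # 1 = write, 2 = read, 3 = read|write
        "size": (code >> 16) & 0x3FFF,  # size of the argument struct in bytes
        "type": (code >> 8) & 0xFF,     # driver "magic" number
        "nr": code & 0xFF,              # command number within the driver
    }


for code in (0xC0140910, 0x400C0912):
    print(hex(code), decode_ioctl(code))

print(errno.errorcode[22])  # symbolic name for errno 22
```

Both codes share the same type byte (0x09), which fits their both being KGSL requests, and errno 22 is EINVAL, matching the "Invalid argument" text in the log.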
Got yet another different trace:

#0 memset () at bionic/libc/arch-arm/bionic/memset.S:77
#1 0x403f8402 in gralloc::PmemUserspaceAlloc::alloc_buffer (this=0x40401b60, data=...) at hardware/qcom/display/libgralloc/pmemalloc.cpp:213
#2 0x403f8e42 in gralloc::PmemAshmemController::allocate (this=0x40401b40, data=..., usage=307, compositionType=3) at hardware/qcom/display/libgralloc/alloc_controller.cpp:301
#3 0x41f395f2 in gralloc::gpu_context_t::gralloc_alloc_buffer (this=0x41f395f3, size=573440, usage=307, pHandle=0x471e8640, bufferType=0, format=2, width=320, height=448) at hardware/qcom/display/libgralloc/gpu.cpp:165
#4 0x41f3989c in gralloc::gpu_context_t::alloc_impl (this=0x4a2d7f60, w=320, h=436, format=2, usage=307, pHandle=0x471e8640, pStride=0x471e862c, bufferSize=0) at hardware/qcom/display/libgralloc/gpu.cpp:245
#5 0x41f39926 in gralloc::gpu_context_t::gralloc_alloc (dev=0x4ac40000, w=<value optimized out>, h=<value optimized out>, format=<value optimized out>, usage=307, pHandle=0x471e8640, pStride=0x471e862c) at hardware/qcom/display/libgralloc/gpu.cpp:296
#6 0x402db2e6 in android::GraphicBufferAllocator::alloc (this=<value optimized out>, w=320, h=436, format=2, usage=307, handle=0x471e8640, stride=0x471e862c) at frameworks/base/libs/ui/GraphicBufferAllocator.cpp:102
#7 0x402dac62 in android::GraphicBuffer::initSize (this=0x471e8600, w=320, h=436, format=2, reqUsage=307) at frameworks/base/libs/ui/GraphicBuffer.cpp:149
#8 0x402dafd6 in GraphicBuffer (this=0x471e8600, w=320, h=436, reqFormat=2, reqUsage=307) at frameworks/base/libs/ui/GraphicBuffer.cpp:62
#9 0x411e6156 in mozilla::layers::GrallocBufferActor::Create (aSize=..., aContent=<value optimized out>, aOutHandle=0x43affbcc) at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:168
#10 0x411e4de8 in mozilla::layers::ShadowLayersParent::AllocPGrallocBuffer (this=<value optimized out>, aSize=<value optimized out>, aContent=<value optimized out>, aOutHandle=0x2) at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:502
#11 0x410af2c2 in mozilla::layers::PLayersParent::OnMessageReceived (this=0x42a78800, __msg=<value optimized out>, __reply=@0x43affcfc) at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PLayersParent.cpp:451
#12 0x410a920e in mozilla::layers::PCompositorParent::OnMessageReceived (this=0x4a4fd790, __msg=..., __reply=@0x43affcfc) at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:393
#13 0x4107dd00 in mozilla::ipc::SyncChannel::OnDispatchMessage (this=0x4a4fd798, msg=...) at /Volumes/Slow/source/B2G/gecko/ipc/glue/SyncChannel.cpp:144
#14 0x4107c7f4 in mozilla::ipc::RPCChannel::OnMaybeDequeueOne (this=0x4a4fd798) at /Volumes/Slow/source/B2G/gecko/ipc/glue/RPCChannel.cpp:400
#15 0x4105f976 in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/tuple.h:383
#16 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/task.h:307
#17 0x4107b1b4 in mozilla::ipc::RPCChannel::RefCountedTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:425
#18 mozilla::ipc::RPCChannel::DequeueTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:448
#19 0x41185068 in MessageLoop::RunTask (this=0x43affdf0, task=0x0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:333
#20 0x41185ebe in MessageLoop::DeferOrRunPendingTask (this=0x4a4fd798, pending_task=<value optimized out>) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:341
#21 0x41186a9c in MessageLoop::DoWork (this=0x43affdf0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:441
#22 0x41186d1c in base::MessagePumpDefault::Run (this=0x43b82e40, delegate=0x43affdf0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_pump_default.cc:23
#23 0x41185018 in MessageLoop::RunInternal (this=0x4d2cd) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:215
#24 0x411850ce in MessageLoop::RunHandler (this=0x43affdf0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208
#25 MessageLoop::Run (this=0x43affdf0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:182
#26 0x4118d434 in base::Thread::ThreadMain (this=0x42a603d0) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/thread.cc:156
#27 0x41197410 in ThreadFunc (closure=0x1) at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/platform_thread_posix.cc:39
#28 0x4010ae18 in __thread_entry (func=0x41197409 <ThreadFunc>, arg=0x42a603d0, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#29 0x4010a96c in pthread_create (thread_out=<value optimized out>, attr=0xbee33280, start_routine=0x41197409 <ThreadFunc>, arg=0x42a603d0) at bionic/libc/bionic/pthread.c:357
#30 0x47baf630 in ?? ()
Cannot access memory at address 0x0
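The two gralloc allocation traces above show the requested dimensions changing between alloc_impl and gralloc_alloc_buffer (1x1 becomes 32x32 with size=4096; 320x436 becomes 320x448 with size=573440). Those numbers are consistent with rounding each dimension up to a multiple of 32 and using 4 bytes per pixel. The sketch below reproduces the arithmetic; the alignment of 32 is inferred from the traces, not taken from the gralloc source, and the helper names are hypothetical.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def buffer_size(w, h, bytes_per_pixel=4, alignment=32):
    """Return (aligned_w, aligned_h, size_in_bytes).

    Assumptions: 32-pixel alignment (inferred from the traces) and
    4 bytes per pixel (format=1 and format=2 are 32-bit RGBA/RGBX
    in Android's pixel format enum).
    """
    aligned_w = align_up(w, alignment)
    aligned_h = align_up(h, alignment)
    return aligned_w, aligned_h, aligned_w * aligned_h * bytes_per_pixel


print(buffer_size(1, 1))      # the 1x1 request from the first trace
print(buffer_size(320, 436))  # the 320x436 request from the second trace
```

Under these assumptions the 1x1 request is not an obviously bogus allocation: it is padded to a single 4096-byte page before reaching the pmem allocator.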
The more I look at this, the more I think it is just a bug in this pmem driver. Everything goes fine until it tries to actually write into the mapped memory, which the driver says should work. But it clearly doesn't...
So, so far Benoit and I have come across some crazy stuff in the gralloc driver. Basically, the pmem allocator mmaps a chunk of contiguous memory at the start to make gralloc allocations with, then unmaps a single page in the middle of it via PmemKernelAlloc::free_buffer(), without keeping track of which page gets unmapped. Then pmem thinks it still has the full contiguous memory area and the system crashes when it tries to deref inside the unmapped region.

This is the state of our /proc/pid/maps at the time of the crash:
4a300000-4a513000 rw-s 0bd00000 00:0b 873 /dev/pmem
4a514000-4ab00000 rw-s 0bf14000 00:0b 873 /dev/pmem

Replacing the memset call in the pmem allocator with a for loop resulted in a crash at the following values:
(gdb) p base+offset+i
$6 = (void *) 0x4a513000

This is the initial mmap call:
E/memalloc( 507): <<<<<<<<<<<<< MMAP base 0x4a300000 size 0x800000 fd 49
(gdb) p/x 0x4a300000+0x800000
$7 = 0x4ab00000
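The gap described above can be found mechanically by comparing the /proc/pid/maps entries with the range of the initial mmap. The following Python sketch does that for the two map lines and the base/size quoted in this comment; `parse_ranges` and `find_holes` are illustrative helper names, not part of any existing tool.

```python
def parse_ranges(maps_text):
    """Parse /proc/<pid>/maps lines into sorted (start, end) tuples."""
    ranges = []
    for line in maps_text.strip().splitlines():
        span = line.split()[0]  # e.g. "4a300000-4a513000"
        start, end = (int(part, 16) for part in span.split("-"))
        ranges.append((start, end))
    return sorted(ranges)


def find_holes(ranges, region_start, region_end):
    """Return unmapped (start, end) gaps inside [region_start, region_end)."""
    holes = []
    cursor = region_start
    for start, end in ranges:
        if end <= cursor or start >= region_end:
            continue  # mapping lies entirely outside the region of interest
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < region_end:
        holes.append((cursor, region_end))
    return holes


PMEM_MAPS = """
4a300000-4a513000 rw-s 0bd00000 00:0b 873 /dev/pmem
4a514000-4ab00000 rw-s 0bf14000 00:0b 873 /dev/pmem
"""

base, size = 0x4A300000, 0x800000  # from the MMAP log line
holes = find_holes(parse_ranges(PMEM_MAPS), base, base + size)
for start, end in holes:
    print(hex(start), hex(end), end - start)
```

The only gap is the single 4096-byte page starting at 0x4a513000, which is exactly the address at which the for loop faulted.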
So, the page unmap is coming from this:

(gdb) bt
#0 gralloc::PmemUserspaceAlloc::free_buffer (this=0x40301b60, base=0x4a550000, size=4096, offset=3104768, fd=162) at hardware/qcom/display/libgralloc/pmemalloc.cpp:237
#1 0x41f714fa in gralloc::gpu_context_t::free_impl (this=<value optimized out>, hnd=0x44014560) at hardware/qcom/display/libgralloc/gpu.cpp:273
#2 0x41f7158e in gralloc::gpu_context_t::gralloc_free (dev=0x40301b60, handle=0x2f6000) at hardware/qcom/display/libgralloc/gpu.cpp:317
#3 0x402ea248 in android::GraphicBufferAllocator::free (this=<value optimized out>, handle=0x4a550000) at frameworks/base/libs/ui/GraphicBufferAllocator.cpp:133
#4 0x402e9d0c in android::GraphicBuffer::free_handle (this=0x42a65700) at frameworks/base/libs/ui/GraphicBuffer.cpp:108
#5 0x402e9de2 in ~GraphicBuffer (this=0x42a65700, __in_chrg=<value optimized out>) at frameworks/base/libs/ui/GraphicBuffer.cpp:96
#6 0x402e9e04 in ~GraphicBuffer (this=0x40301b60, __in_chrg=<value optimized out>) at frameworks/base/libs/ui/GraphicBuffer.cpp:98
#7 0x40dfa920 in android::LightRefBase<android::GraphicBuffer>::decStrong (this=0x42a3f630, __in_chrg=<value optimized out>) at /home/george/dev/B2G/frameworks/base/include/utils/RefBase.h:172
#8 android::EGLNativeBase<ANativeWindowBuffer, android::GraphicBuffer, android::LightRefBase<android::GraphicBuffer> >::decStrong (this=0x42a3f630, __in_chrg=<value optimized out>) at /home/george/dev/B2G/frameworks/base/include/ui/egl/android_natives.h:67
#9 ~sp (this=0x42a3f630, __in_chrg=<value optimized out>) at /home/george/dev/B2G/frameworks/base/include/utils/StrongPointer.h:149
#10 0x411f3558 in ~GrallocBufferActor (this=0x42a3f600, __in_chrg=<value optimized out>) at ../../dist/include/mozilla/layers/ShadowLayerUtilsGralloc.h:68
#11 0x411f3580 in ~GrallocBufferActor (this=0x40301b60, __in_chrg=<value optimized out>) at ../../dist/include/mozilla/layers/ShadowLayerUtilsGralloc.h:68
#12 0x40a3e30c in mozilla::net::NeckoParent::DeallocPCookieService (this=<value optimized out>, cs=0x4a550000) at /home/george/dev/B2G/gecko/netwerk/ipc/NeckoParent.cpp:408
#13 0x410ba358 in mozilla::layers::PLayersParent::RemoveManagee (this=0x48888180, aProtocolId=<value optimized out>, aListener=<value optimized out>) at /home/george/dev/B2G/objdir-gecko/ipc/ipdl/PLayersParent.cpp:227
#14 0x410b72f4 in mozilla::layers::PGrallocBufferParent::OnMessageReceived (this=0x42a3f618, __msg=<value optimized out>) at /home/george/dev/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp:211
#15 0x410b619a in mozilla::layers::PCompositorParent::OnMessageReceived (this=0x49ba5340, __msg=...) at /home/george/dev/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:343
#16 0x41084fc0 in mozilla::ipc::AsyncChannel::OnDispatchMessage (this=0x49ba5348, msg=...) at /home/george/dev/B2G/gecko/ipc/glue/AsyncChannel.cpp:473
#17 0x41089e02 in mozilla::ipc::RPCChannel::OnMaybeDequeueOne (this=0x49ba5348) at /home/george/dev/B2G/gecko/ipc/glue/RPCChannel.cpp:402
#18 0x4106c8ae in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/tuple.h:383
#19 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/task.h:307
#20 0x410887b8 in mozilla::ipc::RPCChannel::RefCountedTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:425
#21 mozilla::ipc::RPCChannel::DequeueTask::Run (this=<value optimized out>) at ../../dist/include/mozilla/ipc/RPCChannel.h:448
#22 0x4119263c in MessageLoop::RunTask (this=0x43c91df0, task=0x43c91d14) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:333
#23 0x4119346e in MessageLoop::DeferOrRunPendingTask (this=0x49ba5348, pending_task=<value optimized out>) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:341
#24 0x4119404c in MessageLoop::DoWork (this=0x43c91df0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:441
#25 0x411942cc in base::MessagePumpDefault::Run (this=0x43a82e40, delegate=0x43c91df0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_pump_default.cc:23
#26 0x411925ec in MessageLoop::RunInternal (this=0x4d2ce) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:215
#27 0x411926a2 in MessageLoop::RunHandler (this=0x43c91df0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208
#28 MessageLoop::Run (this=0x43c91df0) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/message_loop.cc:182
#29 0x4119a9ac in base::Thread::ThreadMain (this=0x42b60430) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/thread.cc:156
#30 0x411a4980 in ThreadFunc (closure=0x1) at /home/george/dev/B2G/gecko/ipc/chromium/src/base/platform_thread_posix.cc:39
#31 0x40070e18 in __thread_entry (func=0x411a4979 <ThreadFunc>, arg=0x42b60430, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#32 0x4007096c in pthread_create (thread_out=<value optimized out>, attr=0xbed03280, start_routine=0x411a4979 <ThreadFunc>, arg=0x42b60430) at bionic/libc/bionic/pthread.c:357

However, GraphicBufferAllocator::free() seems to be trying to keep track of the unallocated chunk by removing the item from the linked list:

    status_t GraphicBufferAllocator::free(buffer_handle_t handle)
    {
        status_t err;

        err = mAllocDev->free(mAllocDev, handle);

        LOGW_IF(err, "free(...) failed %d (%s)", err, strerror(-err));
        if (err == NO_ERROR) {
            Mutex::Autolock _l(sLock);
            KeyedVector<buffer_handle_t, alloc_rec_t>& list(sAllocList);
            list.removeItem(handle);
        }

        return err;
    }

We're pretty much out of ideas here, but we are fairly certain there's a bug in the accounting of which chunks are available. Michael - are you able to look into this further given this information?

See comment 37
Flags: needinfo?(mvines)
Adding Vlad as he's also looking at gfx related weirdness.
Whiteboard: [b2g-crash][b2g-gfx][shadow:snorp] → [mvines will provide info 10 January][b2g-crash][b2g-gfx][shadow:snorp]
Michael is pretty busy so it's unlikely we can block basecamp on getting the info we need. When there's movement here, please re-nom.
blocking-b2g: --- → tef+
blocking-basecamp: + → -
Removing QAwanted: comment 11 has steps that consistently reproduce.
Keywords: qawanted
Trying to reproduce again today I ran into a different issue, this time with scrolling: bug 829220.
Finally got a stack trace on the browser process side, to where it allocates this 1x1 gralloc buffer. It's an ImageLayer of size 1x1. Suggests that we might try to make a testcase with 1x1 images.

Breakpoint 1, mozilla::layers::ShadowLayerForwarder::PlatformAllocBuffer (this=0x4326c708, aSize=..., aContent=gfxASurface::CONTENT_COLOR_ALPHA, aCaps=0, aBuffer=0x43a93d9c) at /hack/b2g/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:260
260 printf_stderr("foo\n");
(gdb) p aSize
$1 = (const gfxIntSize &) @0x43a93d8c: {<mozilla::gfx::BaseSize<int, nsIntSize>> = {width = 1, height = 1}, <No data fields>}
(gdb) bt
#0 mozilla::layers::ShadowLayerForwarder::PlatformAllocBuffer (this=0x4326c708, aSize=..., aContent=gfxASurface::CONTENT_COLOR_ALPHA, aCaps=0, aBuffer=0x43a93d9c) at /hack/b2g/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:260
#1 0x40cdd4bc in mozilla::layers::ShadowLayerForwarder::AllocBufferWithCaps (this=0x4326c708, aSize=..., aContent=gfxASurface::CONTENT_COLOR_ALPHA, aCaps=0, aBuffer=0x43a93d9c) at /hack/b2g/B2G/gecko/gfx/layers/ipc/ShadowLayers.cpp:441
#2 0x40cdd486 in mozilla::layers::ShadowLayerForwarder::AllocBuffer (this=0x4326c708, aSize=..., aContent=gfxASurface::CONTENT_COLOR_ALPHA, aBuffer=0x43a93d9c) at /hack/b2g/B2G/gecko/gfx/layers/ipc/ShadowLayers.cpp:428
#3 0x40ca5012 in mozilla::layers::BasicShadowableImageLayer::Paint (this=0x43a93c00, aContext=0x43ceb7b0, aMaskLayer=0x0) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicImageLayer.cpp:373
#4 0x40c9d55c in mozilla::layers::BasicLayerManager::PaintSelfOrChildren (this=0x4326c690, aPaintContext=..., aGroupTarget=0x43ceb7b0) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:828
#5 0x40c9d8f0 in mozilla::layers::BasicLayerManager::PaintLayer (this=0x4326c690, aTarget=0x43ceb7b0, aLayer=0x43a93c00, aCallback=0x4055e3e9 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbe91c818, aReadback=0xbe91c174) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:939
#6 0x40c9d5c0 in mozilla::layers::BasicLayerManager::PaintSelfOrChildren (this=0x4326c690, aPaintContext=..., aGroupTarget=0x43ceb7b0) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:841
#7 0x40c9d8f0 in mozilla::layers::BasicLayerManager::PaintLayer (this=0x4326c690, aTarget=0x43ceb7b0, aLayer=0x438a9800, aCallback=0x4055e3e9 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbe91c818, aReadback=0x0) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:939
#8 0x40c9cc1e in mozilla::layers::BasicLayerManager::EndTransactionInternal (this=0x4326c690, aCallback=0x4055e3e9 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbe91c818, aFlags=mozilla::layers::LayerManager::END_NO_COMPOSITE) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:586
#9 0x40c9c7e2 in mozilla::layers::BasicLayerManager::EndTransaction (this=0x4326c690, aCallback=0x4055e3e9 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbe91c818, aFlags=mozilla::layers::LayerManager::END_NO_COMPOSITE) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:509
#10 0x40c9e136 in mozilla::layers::BasicShadowLayerManager::EndTransaction (this=0x4326c690, aCallback=0x4055e3e9 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbe91c818, aFlags=mozilla::layers::LayerManager::END_NO_COMPOSITE) at /hack/b2g/B2G/gecko/gfx/layers/basic/BasicLayerManager.cpp:1149
#11 0x4057d95a in nsDisplayList::PaintForFrame (this=<value optimized out>, aBuilder=0xbe91c818, aCtx=<value optimized out>, aForFrame=<value optimized out>, aFlags=13) at /hack/b2g/B2G/gecko/layout/base/nsDisplayList.cpp:1147
#12 0x4057dac0 in nsDisplayList::PaintRoot (this=0xbe91cba0, aBuilder=0xbe91c818, aCtx=0x0, aFlags=13) at /hack/b2g/B2G/gecko/layout/base/nsDisplayList.cpp:1012
#13 0x4058c5a6 in nsLayoutUtils::PaintFrame (aRenderingContext=<value optimized out>, aFrame=0x427f1800, aDirtyRegion=<value optimized out>, aBackstop=<value optimized out>, aFlags=772) at /hack/b2g/B2G/gecko/layout/base/nsLayoutUtils.cpp:1955
#14 0x40598746 in PresShell::Paint (this=0x432dc100, aViewToPaint=<value optimized out>, aDirtyRegion=..., aType=nsIPresShell::PaintType_NoComposite, aWillSendDidPaint=<value optimized out>) at /hack/b2g/B2G/gecko/layout/base/nsPresShell.cpp:5349
#15 0x407ab1fe in nsViewManager::ProcessPendingUpdatesForView (this=0x427a8940, aView=0x432d2dc0, aFlushDirtyRegion=<value optimized out>) at /hack/b2g/B2G/gecko/view/src/nsViewManager.cpp:431
#16 0x407ab29a in nsViewManager::ProcessPendingUpdates (this=0x427a8940) at /hack/b2g/B2G/gecko/view/src/nsViewManager.cpp:1221
#17 0x4059c06e in nsRefreshDriver::Notify (this=0x427fab00, aTimer=<value optimized out>) at /hack/b2g/B2G/gecko/layout/base/nsRefreshDriver.cpp:436
Whiteboard: [mvines will provide info 10 January][b2g-crash][b2g-gfx][shadow:snorp] → [mvines following up internally to look for fix][b2g-crash][b2g-gfx][shadow:snorp]
(In reply to Benoit Jacob [:bjacob] from comment #44)
> Finally got a stack trace on the browser process side, to where it allocates
> this 1x1 gralloc buffer. It's an ImageLayer of size 1x1. Suggests that we
> might try to make a testcase with 1x1 images.

I initially had a stack with a 1x1 buffer too, but I've seen it with other sizes as well, so I'm not sure the 1x1 is significant at all. If anything, it seems like a larger buffer would be a better way of triggering the issue (better chance of crossing an unmapped region).
Have you seen a crash with a size greater than page size ( == 4096 bytes == size of a 32x32 image at 32bpp) ? Sizes smaller than 32 get rounded up to 32 in libgralloc (as the stacks show) so indeed there is nothing special about size 1; but in all the crashes that we've seen, the bad memory access was into a 1-page-wide non-mapped range, per the scenario described in comment 37: a single page gets unmapped, then a bad memory allocation is made into that unmapped range.
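The size arithmetic in the comment above can be checked with a short sketch. It assumes 4096-byte pages, 4 bytes per pixel (32bpp), and that any dimension below 32 is simply raised to 32, as the comment describes. The constants and helper names are hypothetical and are not taken from libgralloc.

```cpp
#include <algorithm>
#include <cstddef>

// Assumptions (not read from the libgralloc source): 4096-byte pages,
// 32bpp buffers, and a minimum dimension of 32 pixels.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kBytesPerPixel = 4;
constexpr std::size_t kMinDimension = 32;

// Hypothetical helper: bytes backing a width x height buffer once each
// dimension has been raised to the assumed minimum.
constexpr std::size_t BufferBytes(std::size_t width, std::size_t height) {
    return std::max(width, kMinDimension) *
           std::max(height, kMinDimension) * kBytesPerPixel;
}

// Hypothetical helper: number of whole pages needed to hold `bytes`.
constexpr std::size_t PagesSpanned(std::size_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize;
}

// A 1x1 buffer becomes 32x32x4 = 4096 bytes, i.e. exactly one page.
static_assert(BufferBytes(1, 1) == 4096, "1x1 rounds up to one page");
static_assert(PagesSpanned(BufferBytes(1, 1)) == 1, "single page");
// A 320x480 buffer is 614400 bytes, i.e. 150 pages.
static_assert(PagesSpanned(BufferBytes(320, 480)) == 150, "150 pages");
```

Under these assumptions, freeing any buffer up to 32x32 releases exactly one page, which matches the 1-page-wide unmapped range seen in the crashes.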
Specifically: what matters is not so much the size of the allocation that crashes, rather it's the size of the gralloc buffer that gets freed, causing page(s) to be unmapped. What I'd like to know is if this crash can be reproduced without having a single-page gralloc buffer deallocated before.
(In reply to Benoit Jacob [:bjacob] from comment #47)
> Specifically: what matters is not so much the size of the allocation that
> crashes, rather it's the size of the gralloc buffer that gets freed, causing
> page(s) to be unmapped. What I'd like to know is if this crash can be
> reproduced without having a single-page gralloc buffer deallocated before.

I saw one that was allocating a 320x480 (I think?) buffer, but it's entirely possible a 1x1 one was freed earlier causing the issue you're describing.
Whiteboard: [mvines following up internally to look for fix][b2g-crash][b2g-gfx][shadow:snorp] → [cr 440442][mvines following up internally to look for fix][b2g-crash][b2g-gfx][shadow:snorp]
Do you have any more information Michael?
Flags: needinfo?(mvines)
Michael asked me to assign this to him while he's getting the info.
Assignee: gwright → mvines
I concur with George's observations in comment 37. I've reproduced the problem several times with stack traces similar to comment 32 and comment 35, but haven't seen any crashes like the one in comment 25, so I'm leaning toward this being a gralloc issue, not an EGL issue. I'm enlisting Diego's help here to investigate gralloc.
Flags: needinfo?(mvines)
Whiteboard: [cr 440442][mvines following up internally to look for fix][b2g-crash][b2g-gfx][shadow:snorp] → [cr 440442][mvines following up internally to look for fix][b2g-crash][b2g-gfx]
Update: CAF will have info within a few days, check back on 1/24.
Assignee: mvines → dwilson
I'm suddenly no longer able to reproduce this crash on Trulia.com. Anyone have an alternate method?
Attached file Test app (deleted) —
I think I have the same problem with my app: just install the app, launch it, and then enter a route in the UI (from ... to). Sometimes it works, sometimes it crashes and makes Gaia reboot on unagi. (The app works just fine on Firefox desktop.) My guess is that my issue is tightly related to this one, but if not I will open a new bug report. Thanks!
(In reply to mael.lavault from comment #54)
> I think I have the same problem with my app: just install the app, launch it,
> and then enter a route in the UI (from ... to). Sometimes it works,
> sometimes it crashes and makes Gaia reboot on unagi. (The app works just fine
> on Firefox desktop.)
>
> My guess is that my issue is tightly related to this one, but if not I will
> open a new bug report.

Do you get similar crash signatures? Actually, can you provide a crash report URL?

QA: can someone verify this app crashes in the same way as trulia.com?
Keywords: qawanted
jsmith advises that we cannot test the app unless it is routed through Marketplace - can we begin that process?
(In reply to Marcia Knous [:marcia] from comment #56)
> jsmith advises that we cannot test the app unless it is routed through
> Marketplace - can we begin that process?

Right. The packaged app attached here needs to be submitted to marketplace prod. It's a privileged app using one privileged permission (systemXHR).
You can test it using the b2gremote Firefox add-on. I will try to get a stack trace tonight (you do that using adb logcat, right?). But the last time I checked I got different stack traces: sometimes it was about pmem or shmem (something like that) and sometimes it was graphics related.
(In reply to mael.lavault from comment #58)
> You can test it using the b2gremote Firefox add-on. I will try to get a stack
> trace tonight (you do that using adb logcat, right?). But the last time I
> checked I got different stack traces: sometimes it was about pmem or shmem
> (something like that) and sometimes it was graphics related.

It's going to be better here to test this directly on device - otherwise, I'm not sure if using the add-on is an accurate way to determine if this should block or not.
Actually, this add-on is only used to install the app on the device in an easy way. Just clone the git repo, zip the contents, rename the archive to .xpi, and install it in Firefox. Connect your unagi device and launch the add-on; a message will appear on the phone, accept it. Then select the directory with the zip of your app and click the full_unagi button. The app is installed on the phone ;)
So sometimes I get this:

D/memalloc( 107): Out of PMEM. Dumping PMEM stats for debugging
D/memalloc( 107): ------------- PRINT PMEM STATS --------------
D/memalloc( 107): Node 0 -> Start Address : 0 Size 19200 Free info 0
D/memalloc( 107): Node 1 -> Start Address : 19200 Size 19200 Free info 0
D/memalloc( 107): Node 2 -> Start Address : 38400 Size 19200 Free info 0
D/memalloc( 107): Node 3 -> Start Address : 57600 Size 19200 Free info 0
D/memalloc( 107): Node 4 -> Start Address : 76800 Size 16640 Free info 0
D/memalloc( 107): Node 5 -> Start Address : 93440 Size 2560 Free info 0
D/memalloc( 107): Node 6 -> Start Address : 96000 Size 16640 Free info 0
D/memalloc( 107): Node 7 -> Start Address : 112640 Size 2560 Free info 0
D/memalloc( 107): Node 8 -> Start Address : 115200 Size 2560 Free info 0
D/memalloc( 107): Node 9 -> Start Address : 117760 Size 2560 Free info 0
D/memalloc( 107): Node 10 -> Start Address : 120320 Size 19200 Free info 1
D/memalloc( 107): Node 11 -> Start Address : 139520 Size 16640 Free info 0
D/memalloc( 107): Node 12 -> Start Address : 156160 Size 13312 Free info 1
D/memalloc( 107): Node 13 -> Start Address : 169472 Size 2560 Free info 0
D/memalloc( 107): Node 14 -> Start Address : 172032 Size 2560 Free info 0
D/memalloc( 107): Node 15 -> Start Address : 174592 Size 2560 Free info 0
D/memalloc( 107): Node 16 -> Start Address : 177152 Size 2560 Free info 0
D/memalloc( 107): Node 17 -> Start Address : 179712 Size 475648 Free info 1
D/memalloc( 107): Total Allocated: Total Free:
D/memalloc( 107): ----------------------------------------------
E/memalloc( 107): /dev/pmem: No more pmem available
W/memalloc( 107): Falling back to ashmem

Sometimes this:

W/GraphicBufferAllocator( 656): alloc(2621440, 3358720, 1, 00000133, ...) failed -22 (Invalid argument)
I/Gecko ( 799): [Child 799] ###!!! ABORT: creating ThebesLayer 'back buffer' failed! width=2621440, height=3358720, type=3000: file ../../../gecko/gfx/layers/basic/BasicThebesLayer.cpp, line 460
E/Gecko ( 799): mozalloc_abort: [Child 799] ###!!! ABORT: creating ThebesLayer 'back buffer' failed! width=2621440, height=3358720, type=3000: file ../../../gecko/gfx/layers/basic/BasicThebesLayer.cpp, line 460
I/Gecko ( 656):
I/Gecko ( 656): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
I/Gecko ( 656):

I also got other errors sometimes but cannot reproduce them now.
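The PMEM dump above leaves its "Total Allocated" and "Total Free" fields blank, so here is a minimal sketch for recomputing them from the node lines. It assumes that "Free info 1" marks a free node and "Free info 0" an allocated one, which the log itself does not state. The `PmemTotals` struct and `SumPmemNodes` function are hypothetical names, not part of the memalloc code, and sizes are in whatever unit the dump reports.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical result type: totals in the same unit the dump uses.
struct PmemTotals {
    unsigned long allocated_total = 0;
    unsigned long free_total = 0;
};

// Sums the "Size" field of every "Node N -> ..." line, bucketed by the
// "Free info" flag. Assumption: a flag of 1 means the node is free.
inline PmemTotals SumPmemNodes(const std::vector<std::string>& lines) {
    PmemTotals totals;
    for (const std::string& line : lines) {
        // Skip the logcat prefix and any line that is not a node entry.
        const std::string::size_type pos = line.find("Node ");
        if (pos == std::string::npos) {
            continue;
        }
        int index = 0;
        int is_free = 0;
        unsigned long start = 0;
        unsigned long size = 0;
        const int matched = std::sscanf(
            line.c_str() + pos,
            "Node %d -> Start Address : %lu Size %lu Free info %d",
            &index, &start, &size, &is_free);
        if (matched != 4) {
            continue;  // Malformed node line: ignore it.
        }
        if (is_free) {
            totals.free_total += size;
        } else {
            totals.allocated_total += size;
        }
    }
    return totals;
}
```

Applied to the 18 nodes in the dump, and under the same assumption about the flag, this gives 147200 allocated and 508160 free, of which the largest free node is 475648.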
Diego, do you think the issue reported in comments >= 54 is the same as the original trulia.com issue?
Flags: needinfo?(mluna)
(In reply to mael.lavault from comment #61)
> ###!!! ABORT: creating ThebesLayer 'back buffer' failed! width=2621440, height=3358720, type=3000

That is a crazy width/height, and will, among other things, overflow a 32-bit integer if multiplied together -- it'll overflow to 0, which we might end up treating as 1 down the line maybe? I'd be interested to know why we're trying to allocate something this large.
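The overflow claim can be checked directly: 2621440 = 5 * 2^19 and 3358720 = 205 * 2^14, so the product is 1025 * 2^33, a multiple of 2^32. The sketch below models the wraparound with unsigned arithmetic, since signed overflow is undefined behavior in C++. Whether Gecko or gralloc actually multiplies the dimensions this way is an assumption, and the function names are hypothetical.

```cpp
#include <cstdint>

// Full-width product of the two dimensions.
constexpr std::uint64_t Area64(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint64_t>(width) * height;
}

// The same product truncated to 32 bits, i.e. reduced modulo 2^32.
constexpr std::uint32_t Area32(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint32_t>(Area64(width, height));
}

// Dimensions taken from the ABORT message in the log.
static_assert(Area64(2621440u, 3358720u) == 8804682956800ull,
              "true product needs more than 32 bits");
static_assert(Area32(2621440u, 3358720u) == 0u,
              "the product wraps to exactly zero in 32 bits");
```

A size computed this way would look like a zero-area request, so a check that only tests the product could let the oversized dimensions through.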
(In reply to Andrew Overholt [:overholt] from comment #62)
> Diego, do you think the issue reported in comments >= 54 is the same as the
> original trulia.com issue?

FYI I cannot reproduce this on trulia.com anymore.

@mael.lavault@mailz.org I'm not quite convinced the issue in your app is the same as the one in this bug. Can you please create a new bug for this?
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #63)
> (In reply to mael.lavault from comment #61)
> > ###!!! ABORT: creating ThebesLayer 'back buffer' failed! width=2621440, height=3358720, type=3000
>
> That is a crazy width/height, and will, among other things, overflow a
> 32-bit integer if multiplied together -- it'll overflow to 0, which we might
> end up treating as 1 down the line maybe? I'd be interested to know why
> we're trying to allocate something this large.

That's a map application using the Leaflet JS library. It preloads some tiles to be quicker. The app itself is very fluid apart from the crashes.
I can more easily reproduce the call stacks from comment 32 and comment 35 using wired.com as reported in bug 834435
I haven't been able to reproduce any crash on Trulia on the last few builds.
Yeah, me neither. Neither can James Hicks. Please reopen if trulia.com is still giving you some trouble
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
Batch edit: Bugs still affected on b2g18 after 2/13 merge to v1.0.1 branch are affected on v1.0.1 branch.
@lsblack Is it easily reproducible? Do you have more detailed steps to reproduce?
Build ID should be 20130214070203
Gecko: http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/d1288313218e
Gaia: 6544fdb8dddc56f1aefe94482402488c89eeec49
Kernel: Dec5
Tested on "Unagi" device.

This bug does not reproduce for me. Mozilla OS doesn't crash.
Resolution: WORKSFORME → FIXED
Lukas or Alex, it looks like there is nothing to uplift to branches, should we mark this bug as status-b2g18* fixed?
IMO, this bug should have been left as WORKSFORME since the reason for this bug disappearing is unknown.
Resolution: FIXED → WORKSFORME
Build ID: 20130219070200
Gecko: http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/98354c0298ab
Gaia: edaca00b1eb7534120b6255db5d5200fb1d86d65
Kernel: Dec 5

Verified this issue no longer reproduces on "Unagi" device for me
Status: RESOLVED → VERIFIED
Flags: needinfo?(mluna)
Removing QA wanted. Michelle had shown me this crash, and I don't see the crash any more.
Keywords: qawanted