Closed
Bug 1031527
Opened 10 years ago
Closed 10 years ago
b2g crashes during boot during receiving call
Categories
(Core :: Graphics: Layers, defect, P1)
Tracking
()
People
(Reporter: tkundu, Assigned: sotaro)
References
Details
(Keywords: crash, topcrash-b2g, Whiteboard: [caf priority: p1][CR 686674][b2g-crash])
Crash Data
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
sotaro
:
review+
|
Details | Diff | Splinter Review |
Test steps: 1. Reboot the device for every 60 seconds and Use another phone to call test device continuously every 20 seconds 2. After night run, mini dumps are generated on the phone. Logs(stack trace, logcat etc) : https://drive.google.com/file/d/0B1cSMS8_GuAEZ29LZE1Zd2hpRjQ/edit?usp=sharing Exact string in logcat when this crash happens is : E Gecko : mozalloc_abort: [Parent 4310] ###!!! ABORT: IPDL error [PSharedBufferManagerChild]: "Error deserializing 'MaybeMagicGrallocBufferHandle'". abort()ing as a result.: file ../../../../../../../../gecko/ipc/glue/ProtocolUtils.cpp, line 198 This crash is happening multiple times on FFOS 2.0, msm8610 256MB device. gaia: https://www.codeaurora.org/cgit/quic/lf/b2g/mozilla/gaia/tag/?h=0a25eff60fcc699687e45ba2ac8b9a3ab3782672&id=AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.016 gecko: https://www.codeaurora.org/cgit/quic/lf/b2g/mozilla/gecko/tag/?h=0b83d5e6a54df9bfccdbb5ba662527bf657f5381&id=AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.016 I knew that STR will be very difficult to reproduce it again. But if you give us some direction in debugging then we can reproduce it with more logs and share with you for further debugging.
Reporter | ||
Updated•10 years ago
|
blocking-b2g: --- → 2.0?
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(msreckovic)
Reporter | ||
Updated•10 years ago
|
Whiteboard: [CR 686674]
Updated•10 years ago
|
Component: General → Graphics: Layers
Keywords: crash
Product: Firefox OS → Core
Whiteboard: [CR 686674] → [CR 686674][b2g-crash]
Version: unspecified → 32 Branch
Updated•10 years ago
|
Whiteboard: [CR 686674][b2g-crash] → [caf priority: p1][CR 686674][b2g-crash]
Comment 1•10 years ago
|
||
Crash observed on: Device: msm8610 Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.017 Moz BuildID: 20140624000201 B2G Version: 2.0 Gecko Version: 32.0a2 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=9d2f7bd16a8dc0c74c97c5a40d2f0731f3dfff4b Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=32c226e5a7adbb95e9c4ee003dc9e64699da03e1
Comment 2•10 years ago
|
||
(We've seen this crash 91 times on v2.0 in the last week)
blocking-b2g: 2.0? → 2.0+
Keywords: topcrash-b2g
Comment 3•10 years ago
|
||
Observed on: Device: msm8226 Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.010 Moz BuildID: 20140614000202 B2G Version: 2.0 Gecko Version: 32.0a2 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=5b1fdc6000d35962769e789b924b24e166a27759 Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f6c59193a0102d1b2ffb136b5d1d42fd30f93446
Updated•10 years ago
|
Flags: needinfo?(msreckovic) → needinfo?(milan)
Updated•10 years ago
|
status-b2g-v2.0:
--- → affected
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → sotaro.ikeda.g
Assignee | ||
Comment 4•10 years ago
|
||
I found a strange code in ParamTraits<MagicGrallocBufferHandle>::Read(). The following code do file descriptor's dup only when IPC is between inter threads and index(key of buffer) is negative value. > int dupFd = sameProcess && index < 0 ? dup(fd.fd) : fd.fd; http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp#139 index(key) is defined as int and incremented for each gralloc buffer allocation. Therefore, index(key) could become negative when the value is wrapped. > ref.mKey = ++sBufferKey; http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/SharedBufferManagerParent.cpp#192
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #4) > I found a strange code in ParamTraits<MagicGrallocBufferHandle>::Read(). The > following code do file descriptor's dup only when IPC is between inter > threads and index(key of buffer) is negative value. > > > int dupFd = sameProcess && index < 0 ? dup(fd.fd) : fd.fd; > http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ > ShadowLayerUtilsGralloc.cpp#139 > We do not need to dup fd in this case since SharedBufferManager is implemented.
Assignee | ||
Comment 6•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8448164 -
Flags: review?(jmuizelaar)
Assignee | ||
Comment 7•10 years ago
|
||
As a gralloc buffer's key, "int" is used. 32 bit is too small for it. I am going to create a bug for it.
Assignee | ||
Comment 8•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #7) > As a gralloc buffer's key, "int" is used. 32 bit is too small for it. I am > going to create a bug for it. Created bug 1032364.
Comment 9•10 years ago
|
||
Comment on attachment 8448164 [details] [diff] [review] patch - remove dup fd from ParamTraits<MagicGrallocBufferHandle>::Read() Review of attachment 8448164 [details] [diff] [review]: ----------------------------------------------------------------- This seems reasonable. ::: gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp @@ +139,5 @@ > // the right thing and dup's the fd. If it's shared cross-thread, > + // SCM_RIGHTS doesn't dup the fd. > + // But in shared cross-thread, dup fd is not necessary. > + // A pointer of the GraphicBuffer is directly got from SharedBufferManagerParent. > + fds[n] = fd.fd; I'd change this sentence to be something like: // But in shared cross-thread, dup fd is not necessary because we get a pointer to the GraphicBuffer directly from SharedBufferManagerParent and don't create a new GraphicBuffer around the fd. As an aside, didn't we have problems with gralloc/genlock when using the same file descriptor and accessed from the same process but on different threads?
Attachment #8448164 -
Flags: review?(jmuizelaar) → review+
Assignee | ||
Comment 10•10 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #9) > > As an aside, didn't we have problems with gralloc/genlock when using the > same file descriptor and accessed from the same process but on different > threads? Android basically expects one instance per process. When it is not like that, some implementations except qcom do not work correctly. There is a such explanation in the following link. http://androidxref.com/4.4.3_r1.1/xref/hardware/libhardware/modules/gralloc/mapper.cpp#94 How to control gralloc buffer is almost same as android at Compositable level. Therefore, there seems no problem. Actually, the problem is not observed until now.
Assignee | ||
Comment 11•10 years ago
|
||
Apply the comment. Carry "r=jmuizelaar"
Attachment #8448164 -
Attachment is obsolete: true
Attachment #8448371 -
Flags: review+
Assignee | ||
Comment 12•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/05f283d784e0
Comment 13•10 years ago
|
||
Can you explain why it is too small?
Comment 14•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/05f283d784e0
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Andreas Gal :gal from comment #13) > Can you explain why it is too small? gralloc buffer key is incremented for each gralloc buffer allocation for each process. b2g process could exist very long time, therefore there is a risk the id could exceed the int max. I did not say that the problem happened by it.
Assignee | ||
Comment 16•10 years ago
|
||
Similar crashes exist in Bug 1021451, Bug 1032642. All crashes happens when PSharedBufferManagerChild exist in b2g process and failed to deserialize MaybeMagicGrallocBufferHandle. The deserialization happens at ParamTraits<MagicGrallocBufferHandle>::Read(). http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp#110 attachment 8448371 [details] [diff] [review] fixed one prolem in ParamTraits<MagicGrallocBufferHandle>::Read() and add logs.
Assignee | ||
Comment 17•10 years ago
|
||
It seems better to keep-open until we confirmed the problem fix.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [caf priority: p1][CR 686674][b2g-crash] → [caf priority: p1][CR 686674][b2g-crash][keep-open]
Assignee | ||
Comment 18•10 years ago
|
||
Tapas, can you test again with attachment 8448371 [details] [diff] [review] and with logcat log? It might not fix the problem, but add some logs to ParamTraits<MagicGrallocBufferHandle>::Read(). It might help to debug.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Flags: needinfo?(tkundu)
Resolution: --- → FIXED
Whiteboard: [caf priority: p1][CR 686674][b2g-crash][keep-open] → [caf priority: p1][CR 686674][b2g-crash]
Marking as fixed in 2.1 via comment 14. Has this landed for 2.0?
status-b2g-v2.1:
--- → fixed
Comment 20•10 years ago
|
||
4 billion allocations is ... unlikely. Not sure I buy this story.
Reporter | ||
Comment 21•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #18) > Tapas, can you test again with attachment 8448371 [details] [diff] [review] > and with logcat log? It might not fix the problem, but add some logs to > ParamTraits<MagicGrallocBufferHandle>::Read(). It might help to debug. Thanks for your work. We are asking out internal team to confirm this. I will re-open if it does not fix this issue.
Comment 22•10 years ago
|
||
Observed on: Device: msm8610 Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.020 Moz BuildID: 20140628000201 B2G Version: 2.0 Gecko Version: 32.0a2 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=91958625774ebe6c425adf322b09f1edf906578d Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=eab4f8ff82758f08f049e14d305e29fcc50cb292
Reporter | ||
Comment 23•10 years ago
|
||
Could you please upload a patch for 2.0 and land it in 2.0 ?
Flags: needinfo?(sotaro.ikeda.g)
Comment 24•10 years ago
|
||
Can we confirm the problem is fixed in 2.1 before uplifting?
Flags: needinfo?(milan)
Comment 25•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/c963ceaae677
Comment 26•10 years ago
|
||
Sorry, I already pushed by the time you commented :(
Comment 27•10 years ago
|
||
Observed on: Device: msm8610 Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.022 Moz BuildID: 20140630000201 B2G Version: 2.0 Gecko Version: 32.0a2 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=c0c8ad187c0466285f2580531e09f8322996f561 Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=ef0781044f85ee7f364236c074ab0b7644c38ff8
Comment 28•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #26) > Sorry, I already pushed by the time you commented :( No worries.
Assignee | ||
Comment 29•10 years ago
|
||
I also found another problems - all SharedBufferManagerParent's static members access are not protected by Monitor. + SharedBufferManagerParent creates one thread for each content process. - Two SharedBufferManagerParent instances are created for each content process. + One instance is created by SharedBufferManagerParent::CloneToplevel(), another is created by ContentParent::AllocPSharedBufferManagerParent(). + In ImageBridge case, only one instance seem to be created by ImageBridgeParent::CloneToplevel().
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 30•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #29) > I also found another problems > - all SharedBufferManagerParent's static members access are not protected by > Monitor. > + SharedBufferManagerParent creates one thread for each content process. > - Two SharedBufferManagerParent instances are created for each content > process. > + One instance is created by SharedBufferManagerParent::CloneToplevel(), > another is created by > ContentParent::AllocPSharedBufferManagerParent(). > + In ImageBridge case, only one instance seem to be created by > ImageBridgeParent::CloneToplevel(). I am going to create a new bug for the problems.
Assignee | ||
Comment 31•10 years ago
|
||
Bug 1034294 could be an actual cause of this bug. SharedBufferManagerParent::RecvAllocateGrallocBuffer() does not handle a race condition correctly. It could cause that a gralloc buffer is stored to incorrect key. See Bug 1034294 Comment 7.
Comment 32•10 years ago
|
||
Observed on: Device: msm8226 Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.026 Moz BuildID: 20140707000200 B2G Version: 2.0 Gecko Version: 32.0a2 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=ef67af27dff3130d41a9139d6ae7ed640c34d922 Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=d3eae03cdf4e6944e646d05938a22dc1380a0d95
Reporter | ||
Comment 33•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #31) > Bug 1034294 could be an actual cause of this bug. > SharedBufferManagerParent::RecvAllocateGrallocBuffer() does not handle a > race condition correctly. It could cause that a gralloc buffer is stored to > incorrect key. See Bug 1034294 Comment 7. This seems to be fixed this issue for us. Tbanks a lot for your quick help.
Flags: needinfo?(tkundu)
Assignee | ||
Comment 34•10 years ago
|
||
(In reply to Tapas Kumar Kundu from comment #33) > (In reply to Sotaro Ikeda [:sotaro] from comment #31) > > Bug 1034294 could be an actual cause of this bug. > > SharedBufferManagerParent::RecvAllocateGrallocBuffer() does not handle a > > race condition correctly. It could cause that a gralloc buffer is stored to > > incorrect key. See Bug 1034294 Comment 7. > > This seems to be fixed this issue for us. Tbanks a lot for your quick help. Good news! Thanks for the confirmation :-)
Comment 35•10 years ago
|
||
No STR is present to create test case to address bug.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Updated•10 years ago
|
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Flags: in-moztrap-
Depends on: 1058366
No longer depends on: 1058366
Comment 36•10 years ago
|
||
Unable to verify due to automation necessary in STR
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?][QAnalyst-verify-]
Flags: needinfo?(ktucker)
Updated•10 years ago
|
QA Whiteboard: [QAnalyst-Triage?][QAnalyst-verify-] → [QAnalyst-Triage+][QAnalyst-verify-]
Flags: needinfo?(ktucker)
You need to log in
before you can comment on or make changes to this bug.
Description
•