Closed
Bug 871574
Opened 11 years ago
Closed 11 years ago
crash in mozilla::dom::indexedDB::PIndexedDBRequestChild::OnMessageReceived
Categories
(Core :: Storage: IndexedDB, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: scoobidiver, Unassigned)
Details
(Keywords: crash, Whiteboard: [b2g-crash])
Crash Data
There are six crashes including recent ones in builds from May 6 and 7.
Here is a crash report: bp-88a50569-d82f-4e1d-8b4e-d366b2130507.
Frame Module Signature Source
0 @0x0
1 libxul.so mozilla::dom::indexedDB::PIndexedDBRequestChild::OnMessageReceived PIndexedDBRequestChild.cpp:194
2 libxul.so mozilla::dom::PContentChild::OnMessageReceived PContentChild.cpp:2302
3 libxul.so mozilla::ipc::AsyncChannel::OnDispatchMessage AsyncChannel.cpp:471
4 libxul.so mozilla::ipc::RPCChannel::OnMaybeDequeueOne RPCChannel.cpp:402
5 libxul.so RunnableMethod<IPC::ChannelProxy::Context, void , Tuple0>::Run tuple.h:383
6 libxul.so mozilla::ipc::RPCChannel::DequeueTask::Run RPCChannel.h:425
7 libxul.so MessageLoop::RunTask message_loop.cc:337
8 libxul.so MessageLoop::DeferOrRunPendingTask message_loop.cc:345
9 libxul.so MessageLoop::DoWork message_loop.cc:445
10 libxul.so mozilla::ipc::DoWorkRunnable::Run MessagePump.cpp:42
11 libxul.so nsThread::ProcessNextEvent nsThread.cpp:620
12 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:237
13 libxul.so mozilla::ipc::MessagePump::Run MessagePump.cpp:117
14 libxul.so mozilla::ipc::MessagePumpForChildProcess::Run MessagePump.cpp:231
15 libxul.so MessageLoop::RunInternal message_loop.cc:219
16 libxul.so MessageLoop::Run message_loop.cc:212
17 libxul.so nsBaseAppShell::Run nsBaseAppShell.cpp:163
18 libxul.so XRE_RunAppShell nsEmbedFunctions.cpp:646
19 libxul.so mozilla::ipc::MessagePumpForChildProcess::Run MessagePump.cpp:198
20 libxul.so MessageLoop::RunInternal message_loop.cc:219
21 libxul.so MessageLoop::Run message_loop.cc:212
22 libxul.so XRE_InitChildProcess nsEmbedFunctions.cpp:485
23 plugin-container main ipc/app/MozillaRuntimeMain.cpp:60
24 libc.so __libc_init libc_init_dynamic.c:114
25 @0xb0001dc5
More reports at:
https://crash-stats.mozilla.com/report/list?signature=%400x0+|+mozilla%3A%3Adom%3A%3AindexedDB%3A%3APIndexedDBRequestChild%3A%3AOnMessageReceived
Comment 1•11 years ago
According to Bug 863500 comment 25, this one should be duplicated to Bug 863500.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Reporter | ||
Comment 2•11 years ago
There are recent crashes.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Reporter | ||
Comment 3•11 years ago
It's #7 crasher in B2G 18.0.
These crashes indicate memory corruption. We'll need some STR and valgrind probably :(
Basically we're crashing on a null deref of 'actor' here:
PIndexedDBRequestChild::OnMessageReceived(const Message& __msg)
{
    ...
    PIndexedDBRequestChild* actor;
    if ((!(Read((&(actor)), (&(__msg)), (&(__iter)), false)))) {
        FatalError("Error deserializing 'PIndexedDBRequestChild'");
        return MsgValueError;
    }
    ...
    (actor)->DestroySubtree(Deletion);
    ...
}
But:
PIndexedDBRequestChild::Read(
        PIndexedDBRequestChild** __v,
        const Message* __msg,
        void** __iter,
        bool __nullable)
{
    int32_t id;
    if ((!(Read((&(id)), __msg, __iter)))) {
        FatalError("Error deserializing 'id' for 'PIndexedDBRequestChild'");
        return false;
    }
    if (((1) == (id)) || (((0) == (id)) && ((!(__nullable))))) {
        mozilla::ipc::ProtocolErrorBreakpoint("bad ID for PIndexedDBRequest");
        return false;
    }
    if ((0) == (id)) {
        (*(__v)) = 0;
        return true;
    }
    ...
}
That can only return a null actor if '__nullable' is true, which it can't be in this case. So somewhere between 'Read(&actor)' and 'actor->DestroySubtree()' our actor pointer is being overwritten.
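The argument above can be checked with a small stand-alone model. The sketch below is a hypothetical C++ rendering of only the two id checks quoted from Read(); `ReadOutcome`, `ClassifyActorId`, and `CheckNonNullableNeverNull` are illustrative names that do not exist in the generated code, and the elided lookup path is represented only by an enum value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the id checks quoted above; not the generated IPDL code.
enum class ReadOutcome { Error, NullActor, LookupActor };

// Mirrors the two 'if' statements in Read(): id 1 is always rejected,
// and id 0 is rejected unless the caller passed __nullable == true.
inline ReadOutcome ClassifyActorId(int32_t id, bool nullable) {
    if (id == 1 || (id == 0 && !nullable)) {
        return ReadOutcome::Error;      // Read() returns false
    }
    if (id == 0) {
        return ReadOutcome::NullActor;  // Read() sets *__v = 0 and returns true
    }
    return ReadOutcome::LookupActor;    // elided remainder of Read()
}

// With nullable == false (as in OnMessageReceived), the null-actor
// outcome is unreachable for every id value covered by the quoted checks.
inline void CheckNonNullableNeverNull() {
    assert(ClassifyActorId(0, false) == ReadOutcome::Error);
    assert(ClassifyActorId(1, false) == ReadOutcome::Error);
    assert(ClassifyActorId(2, false) != ReadOutcome::NullActor);
}
```

If this model matches the generated code, a successful Read() with '__nullable' set to false never hands back a null pointer through the quoted branches, which is consistent with the conclusion that the pointer is overwritten afterwards.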
Reporter | ||
Comment 6•11 years ago
It's now #4 top crasher in B2G 18.0.
blocking-b2g: --- → leo?
Keywords: topcrash
Comment 7•11 years ago
Reporter: Can you describe the user impact when this crash occurs?
ahuang: Can you analyze what we have already first and provide some insights so we can see the severity?
Flags: needinfo?(ahuang)
Comment 8•11 years ago
(In reply to Wayne Chang [:wchang] from comment #7)
> Reporter: Can you describe the user impact when this crash occurs?
>
> ahuang: Can you analyze what we have already first and provide some insights
> so we can see the severity?
We haven't seen this since at least the 5/15 build, right? I believe the severity is low.
Based on Ben's comment 4 and comment 5, I think a core dump may be of little help here. A minidump from a partner is certainly not enough to solve this bug, but I think it is hardly feasible to have partners run Valgrind in stress tests either. Enabling core dumps (bug 847268) may be a much more reasonable way for partners and us to dig into this bug.
Flags: needinfo?(ahuang)
Comment 9•11 years ago
blocking-b2g: leo? → leo+
Note: these are all Keon or Peak crashes.
Comment 12•11 years ago
(In reply to ben turner [:bent] from comment #5)
> That can only return a null actor if '__nullable' is true, which it can't be
> in this case. So somewhere between 'Read(&actor)' and
> 'actor->DestroySubtree()' our actor pointer is being overwritten.
Let's see whether we can reproduce this on emulator-x86. We can enable hardware watchpoints with gdb 7.4 or later (bug 865582) on emulator-x86. Valgrind seems to be a good choice, too.
Comment 13•11 years ago
(In reply to Scoobidiver from comment #3)
> It's #7 crasher in B2G 18.0.
Hi,
I want to investigate this bug using a HW watchpoint on emulator-x86. Can you provide 100% reproducible steps? Thanks.
Reporter | ||
Comment 14•11 years ago
(In reply to Wayne Chang [:wchang] from comment #7)
> Reporter: Can you describe the user impact when this crash occurs?
(In reply to Alan Huang [:ahuang] from comment #13)
> Can you provide 100% reproducible steps?
I don't have any. This bug was filed from crash stats. In addition, users can't add a comment when crashing, so there are no clues except maybe from URLs, if available.
Updated•11 years ago
Assignee: nobody → ahuang
Comment 15•11 years ago
Are we still seeing this on more recent builds?
Flags: needinfo?(scoobidiver)
Reporter | ||
Comment 16•11 years ago
(In reply to Wayne Chang [:wchang] from comment #15)
> Are we still seeing this on more recent builds?
It happens on Peak and Keon up to B2G 18.0/20130613 which seems to be the latest FxOS-1.0.1 build.
Flags: needinfo?(scoobidiver)
Comment 17•11 years ago
(In reply to Scoobidiver from comment #16)
> (In reply to Wayne Chang [:wchang] from comment #15)
> > Are we still seeing this on more recent builds?
> It happens on Peak and Keon up to B2G 18.0/20130613 which seems to be the
> latest FxOS-1.0.1 build.
Have we seen it on 1.1 or trunk/1.2 builds recently as well?
Reporter | ||
Comment 18•11 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #17)
> Have we seen it on 1.1 or trunk/1.2 builds recently as well?
ZTE phones don't have symbols so I can't say for 1.1. In trunk/1.2, there are only 3 crashes over the last week, none for this bug, so it's not statistically representative.
Comment 19•11 years ago
(In reply to Scoobidiver from comment #18)
> ZTE phones don't have symbols so I can't say for 1.1.
The shipped ZTE phones are running 1.0.1. Both 1.1 and 1.2 are only in use in internal testing builds/devices (unagi etc.) or, for 1.2, on Geeksphone devices with very daring users.
Reporter | ||
Comment 20•11 years ago
It's #21 crasher in B2G for all versions.
blocking-b2g: leo+ → leo?
Keywords: topcrash
Comment 21•11 years ago
Hello Al,
As we discussed before, we may need QA to help us find STR for this. Can Taiwan QA provide some help here? Thanks!
Keywords: qawanted
QA Contact: atsai
Comment 22•11 years ago
Triage- Leo-ing until we can find an STR or the occurrence rate rises.
blocking-b2g: leo? → ---
Comment 23•11 years ago
Hi, Alan,
Sorry to jump in.
I can't draw any conclusions from the provided logs.
All we can do is run the scenarios mentioned in Bug 863500 comment 24.
Do you think this makes sense?
If you know of any specific way to trigger this crash, please feel free to contact us.
I will also come to your cubicle to discuss this problem with you after I have run the test.
Thanks!
Comment 24•11 years ago
Hi, Alan and all,
I recently automated the test steps mentioned in Bug 863500 comment 24 and ran them on the following V1-TRAIN builds on an unagi device:
* 2013-07-03-07-02-10
* 2013-07-18-23-02-25
I still cannot reproduce it.
This bug was reported 2 months ago. I am not sure whether a patch that landed since then has affected the bug and left it as a latent issue.
By the way, I also wonder whether the crash reports were caused by QA, since we ran the Leo test during that period. But I have no findings.
I will continue to monitor this issue from the automation server, but will not spend too much time on it.
If you have further suggestions, comments, or findings, please feel free to contact me.
Thanks!
Reporter | ||
Updated•11 years ago
Crash Signature: [@ @0x0 | mozilla::dom::indexedDB::PIndexedDBRequestChild::OnMessageReceived] → [@ @0x0 | mozilla::dom::indexedDB::PIndexedDBRequestChild::OnMessageReceived]
[@ @0x0 | mozilla::dom::indexedDB::PIndexedDBRequestChild::OnMessageReceived(IPC::Message const&)]
Comment 25•11 years ago
Based on the comment above I think QA has done what we can to reproduce this. If we get more information later during daily testing, we'll try to action it from there. For now, there's not much we can do here.
Keywords: qawanted
(In reply to William Hsu [:whsu] from comment #24)
> I automated the test steps that Bug 863500 comment 24 mentioned recently and
> run it on the following V1-TRAIN build with unagi device.
It might help to run this series of steps under valgrind and see if it reports anything unusual. Please ping qDot for help on setting it up.
Comment 27•11 years ago
Valgrind unfortunately only runs on >= v1.2 on the nexus 4.
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #27)
> Valgrind unfortunately only runs on >= v1.2 on the nexus 4.
Eh? I was able to run it on v1.0.1 unagi before.
Comment 29•11 years ago
bent's original instructions for getting valgrind up and running on v1.0/1.1 are at
https://bug854517.bugzilla.mozilla.org/attachment.cgi?id=729283
See if you can work through these. I'm hoping my valgrind patches for v1.2 will land soon, and will try to backport them to 1.0/1.1 when that happens.
Updated•11 years ago
Keywords: topcrash-b2g
Updated•11 years ago
Component: General → DOM: IndexedDB
Product: Firefox OS → Core
Comment 30•11 years ago
Alan, this has been a top-crasher for a while with no action on it. Can you please help here?
Flags: needinfo?(ahuang)
Comment 31•11 years ago
I have had no new ideas on this for a while, and I am currently occupied with Tarako. Unassigning myself for now.
Assignee: ahuang → nobody
Flags: needinfo?(ahuang)
I don't see any recent crash in anything higher than 18. I am not sure if this bug will appear in new Gecko levels. Should we keep this open?
Flags: needinfo?(bbajaj)
Comment 33•11 years ago
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #32)
> I don't see any recent crash in anything higher than 18. I am not sure if
> this bug will appear in new Gecko levels. Should we keep this open?
Let's close it for now; we can reopen it if need be.
Flags: needinfo?(bbajaj)
Updated•11 years ago
Keywords: topcrash-b2g
WFM for now
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → WORKSFORME