Closed
Bug 1166346
Opened 10 years ago
Closed 10 years ago
Crash in HttpChannelParentListener while stability testing
Categories
(Core :: Networking: HTTP, defect)
Tracking
Release | Tracking | Status
---|---|---
b2g-v2.2 | --- | fixed
b2g-master | --- | unaffected
People
(Reporter: ggrisco, Assigned: jduell.mcbugs)
Details
(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 638][caf priority: p1][CR 840028])
Attachments
(9 files)
File | Type | Flags | Actions
---|---|---|---
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | text/plain | | Details
(deleted) | patch | fabrice: approval-mozilla-b2g37+ | Details, Diff, Splinter Review
Saw this crash 3 times on AU 157 while running stability tests over many hours:
[@ mozilla::net::HttpChannelParentListener::OnDataAvailable | mozilla::net::nsHttpChannel::OnDataAvailable | nsInputStreamPump::OnStateTransfer | nsInputStreamPump::OnInputStreamReady ]
cafbot will upload minidump and logs soon.
Comment 3•10 years ago
Patrick, would you be able to take a look at this or redirect to someone appropriate?
Flags: needinfo?(mcmanus)
Updated•10 years ago
blocking-b2g: 2.2? → 2.2+
Updated•10 years ago
Assignee: nobody → jduell.mcbugs
Flags: needinfo?(mcmanus)
Updated•10 years ago
Whiteboard: [CR 840028] → [caf priority: p1][CR 840028]
Updated•10 years ago
Whiteboard: [caf priority: p1][CR 840028] → [b2g-crash][caf-crash 638][caf priority: p1][CR 840028]
Comment 4•10 years ago
Observed on:
Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.157
Moz BuildID: 20150511002500
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.157.xml?h=release
B2G Version: v2.2
Gecko Version: 37.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=528ef60e7cda09ad43478065f5d33bda398fbeb7
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f7e168e362809afa274c2ecb9d346aeda90aa519
Patches: bug 1057091, bug 1165048, bug 1162663, bug 1157029, bug 1155797, bug 1133147, bug 1160127, bug 1156063, bug 1164513, bug 1156503, bug 1152439, bug 1159434
Assignee
Comment 7•10 years ago
I'm not sure if we can trust the stack trace here. HttpChannelParentListener::OnDataAvailable() only dereferences a single pointer (mNextListener), and it checks it for null first (and I don't see any places it could be modified on another thread), so AFAICT it's not logically possible that we're actually crashing in that function. Perhaps it's the next ODA callee in the chain, though that doesn't look promising either.
An HTTP log might help here: Greg, is there any chance you could get one, attach it here, and needinfo me?
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
(Sadly, that doc doesn't say how to set up logging on Android/B2G; :blassey or :mayhemer might know how. Another issue might be the size of the log: it will get big if you're running for hours.)
Flags: needinfo?(ggrisco)
Reporter
Comment 8•10 years ago
Hi Jason, thanks for looking into this. It would be better if you could provide a logging patch that we can apply and that takes the long running time into consideration. It takes some time to land these patches internally, then build and send them to the test team, so the sooner we can have this the better, since we're up against a deadline.
Although the stack trace may not be exactly correct, we have seen this same crash signature multiple times and we aren't seeing any other spurious crashes. All of the logs I've seen for this show browser activity near the time of the crash.
The only other crash we're seeing currently is bug 1162663 which doesn't look related.
Flags: needinfo?(ggrisco) → needinfo?(jduell.mcbugs)
Comment 9•10 years ago
Observed on:
Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.159
Moz BuildID: 20150511002500
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.159.xml?h=release
B2G Version: v2.2
Gecko Version: 37.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=528ef60e7cda09ad43478065f5d33bda398fbeb7
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f7e168e362809afa274c2ecb9d346aeda90aa519
Patches: bug 1165048, bug 1162663, bug 1155797, bug 1133147, bug 1152439, bug 1159434
Comment 12•10 years ago
Jason, I'm pretty confident this is the MOZ_RELEASE_ASSERT hitting, since MOZ_REALLY_CRASH (which is the underlying assertion mechanism) triggers a write to NULL: https://dxr.mozilla.org/mozilla-central/source/mfbt/Assertions.h#198
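To illustrate why a function can be the top frame of a null-write crash without dereferencing any bad pointer of its own, here is a minimal sketch, assuming a POSIX system. `SKETCH_RELEASE_ASSERT`, `SketchOnDataAvailable`, the `suspendedForDiversion` flag, and `DiesBySignal` are hypothetical names made up for this example; this is not the MFBT macro or the necko code, only the same store-to-null idea.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a release assertion; NOT the real MFBT macro.
// On failure it stores to address 0 through a volatile pointer, then aborts
// in case the store did not terminate the process.
#define SKETCH_RELEASE_ASSERT(cond)           \
  do {                                        \
    if (!(cond)) {                            \
      *((volatile int*)nullptr) = __LINE__;   \
      std::abort();                           \
    }                                         \
  } while (0)

// Hypothetical listener method: it touches no pointer of its own, yet it can
// still be where the process dies, because the assertion lives inside it.
inline int SketchOnDataAvailable(bool suspendedForDiversion) {
  SKETCH_RELEASE_ASSERT(!suspendedForDiversion);
  return 0;
}

// Runs fn in a forked child (POSIX only) and reports whether the child was
// killed by a signal rather than exiting normally.
inline bool DiesBySignal(void (*fn)()) {
  pid_t pid = fork();
  if (pid < 0) return false;  // fork failed; nothing was observed
  if (pid == 0) {
    fn();
    _exit(0);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return false;
  return WIFSIGNALED(status);
}
```

Calling `DiesBySignal([] { SketchOnDataAvailable(true); })` runs the failing assertion in a child process, so the parent can observe the crash without dying itself.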
Comment 13•10 years ago
Observed on:
Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.162
Moz BuildID: 20150519002500
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.162.xml?h=release
B2G Version: v2.2
Gecko Version: 37.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=732acec6f37d13ccea6b0ddc48904a53a2970894
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=197cb3859f55ca587256c76648adec54a1f502ef
Patches: bug 1155797, bug 1133147, bug 1152439, bug 1165048, bug 1162663
Comment 16•10 years ago
Jason, do you have an update you can provide on this issue?
Assignee
Comment 17•10 years ago
So I think jdm is right about the assertion. And that means that we're somehow still delivering OnDataAvailable after we've diverted the channel. Which is a bug in necko for sure, but it's not clear yet if it's due to 1) changes in the necko code, or 2) some new use case of DivertTo() that's exposing a long-standing bug, or 3) a bug that's always been there but is only exposed by the stability tests.
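The failure mode described above can be modelled roughly as follows. This is only an illustrative sketch: `SketchListener`, `SketchPump`, and both `Divert...` helpers are hypothetical names, the assertion is replaced by an exception so the example stays testable, and the idea that suspending delivery before diverting avoids the problem is an assumption, not a description of any actual necko patch.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of a parent-side listener; names are illustrative only.
struct SketchListener {
  bool diverted = false;  // true once data should no longer arrive here
  std::vector<std::string> received;

  void OnDataAvailable(const std::string& chunk) {
    // Stand-in for the release assertion discussed in this bug.
    if (diverted) {
      throw std::logic_error("OnDataAvailable after diversion");
    }
    received.push_back(chunk);
  }
};

// Hypothetical model of a pump that delivers queued chunks to a listener.
struct SketchPump {
  SketchListener* listener = nullptr;
  std::vector<std::string> pending;
  int suspendCount = 0;

  void Deliver() {
    if (suspendCount > 0 || !listener) return;  // suspended: deliver nothing
    for (const std::string& chunk : pending) {
      listener->OnDataAvailable(chunk);
    }
    pending.clear();
  }
};

// Diverting without suspending: queued data can still reach the listener.
inline void DivertWithoutSuspend(SketchPump& pump) {
  pump.listener->diverted = true;
}

// Suspending before diverting: queued data stays in the pump.
inline void DivertWithSuspend(SketchPump& pump) {
  ++pump.suspendCount;
  pump.listener->diverted = true;
}
```

In this model, a chunk that is already queued when `DivertWithoutSuspend` runs trips the check on the next `Deliver()`, while `DivertWithSuspend` leaves the chunk in `pending` untouched.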
> Saw this crash 3 times on AU 157 while running stability tests over many hours:
Greg: how much of a pain would it be to try to get a regression range for this? It sounds like it might take a while if you only see this after many hours. Also, do we know if we have any reports of this in the wild? (I forget how good our crash reporting is for Firefox OS, especially for parent crashes.)
Flags: needinfo?(jduell.mcbugs) → needinfo?(ggrisco)
Assignee
Comment 18•10 years ago
Dragana: sworkman tells me you may have run into crashes like these when you were working on some Divert-related patches. Does that ring a bell? Do you have cycles to look into this?
Flags: needinfo?(dd.mozilla)
Comment 19•10 years ago
Bug 1097878 (looks like a similar stack trace) and bug 1106396 (needed an extra patch) are what I was referring to. Not sure how FxOS 2.2 is related to Fx36-38. Maybe the patches could be uplifted?
Comment 20•10 years ago
Steve, FxOS 2.2 is using Gecko 37.
Comment 21•10 years ago
There is a patch in bug 1106396.
It is not in b2g 2.2; it should be uplifted.
Flags: needinfo?(dd.mozilla)
Comment 22•10 years ago
This is a rather small patch.
Reporter
Comment 23•10 years ago
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #17)
> Greg: how much of pain would it be to try to get a regression range for
> this? It sounds like it might take a while if you only see this after many
> hours. Also, do we know if we have any reports of this in the wild? (I
> forget how good our crash reporting is for Firefox OS, especially for parent
> crashes).
We started seeing this in AU 157; there were no prior reports. It is possible, though, that the issue existed before that but wasn't seen due to other frequent crashes, so it's hard to say. Regardless, here's the breakdown:
AU 157: Seen 12 times
AU 159: Seen 18 times
AU 162: Seen 4 times so far (still testing)
cafbot has commented on each of these builds, so you should have gaia/gecko versions for each.
Flags: needinfo?(ggrisco)
Comment 24•10 years ago
Comment on attachment 8609647 [details] [diff] [review]
bug_1106396_fix_v2_suspend.patch
NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): Long existing bug
User impact if declined: crash
Testing completed: It has been in since 38.
Risk to taking this patch (and alternatives if risky): Low risk; it has been running for a couple of months already.
String or UUID changes made by this patch: none
Attachment #8609647 - Flags: approval-mozilla-b2g37?
Updated•10 years ago
Attachment #8609647 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Comment 25•10 years ago
The other crash with the same signature, bug 1153256, is caused by an add-on, so this patch should fix it.
Comment 26•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
status-b2g-v2.2: --- → fixed
status-b2g-master: --- → unaffected
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S13 (29may)