Closed Bug 1171307 Opened 9 years ago Closed 8 years ago

[e10s] crash in mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink()

Categories

(Core :: DOM: Content Processes, defect, P2)

40 Branch
All
Windows
defect

RESOLVED FIXED
Tracking Status
e10s + ---
firefox39 --- unaffected
firefox40 - affected
firefox41 - wontfix
firefox42 - wontfix
firefox43 - wontfix
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- fix-optional
firefox50 --- fix-optional
firefox51 --- fix-optional

People

(Reporter: epinal99-bugzilla2, Unassigned)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-8e8c2c6d-5045-4e6b-95a6-f6da92150604.
=============================================================

Be sure e10s is enabled.
STR:
1) Open the large image http://www.imagebam.com/image/b05156413493542
2) Click quickly and repeatedly (about one click every 0.5 to 1 second) on the image to enlarge/reduce it.
IMPORTANT: At each click, click on a different place on the image.
NB: Some ad pop-ups can appear on the click event; just go back to the tab and continue clicking.


Result:
After a few clicks (~5), the image "freezes" and the tab crashes.
Same result if HWA is disabled.

Regression range:
good=2015-04-17
bad=2015-04-18
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=51e3cb11a258&tochange=8af644c9b616

Maybe bug 1146214.
Be sure e10s is enabled.
STR:
1) Open http://www.imagebam.com/image/b05156413493542
2) Drag the image and drop onto the image
3) Wait a few seconds

Actual Results:
Crash
bp-a4391c90-8963-4d41-be20-ff2142150604

Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ffd1b16f058b&tochange=4ddadc870ef6

Regressed by: Bug 1071562
Blocks: 1071562
Flags: needinfo?(enndeakin)
tracking-e10s: --- → ?
Enabled tracking for 40 and 41, because this is a crash. I don't have privileges to change the flag for e10s, so I'm requesting info from Liz.
Flags: needinfo?(lhenry)
I don't know the specifics of the code that is crashing, but the image here is very large (a 1.6MB file, 7000x9000 pixels in size). That, I think, works out to 252MB of data sent over a message.
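For reference, a rough sketch of where the 252MB figure comes from. It assumes the decoded image uses 4 bytes per pixel (e.g. RGBA/BGRA) with no row padding; that pixel format is an assumption, not something confirmed in this bug, and the function name is made up for illustration.

```python
def decoded_image_bytes(width_px: int, height_px: int, bytes_per_pixel: int = 4) -> int:
    """Approximate size of an uncompressed bitmap.

    Assumes a fixed number of bytes per pixel (4 by default, an assumption)
    and ignores any row padding or header overhead.
    """
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    size = decoded_image_bytes(7000, 9000)
    # Report the size in both decimal megabytes and binary mebibytes.
    print(f"{size} bytes = {size / 10**6:.0f} MB = {size / 2**20:.1f} MiB")
```

So the 1.6MB compressed file expands to roughly 150 times its size once decoded, which is what would end up in the message.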
Flags: needinfo?(enndeakin)
The e10s team does their own triage to set priorities on their bugs, so I'm clearing needinfo. We do want to track this since it's a regression. Thanks Jen!
https://crash-stats.mozilla.com/report/index/54230516-3603-411d-9b11-454d22150609
Component: IPC → mozglue
OS: Windows NT → Windows
Hardware: x86 → All
Component: mozglue → IPC
Flags: needinfo?(mrbkap)
> Component: mozglue → IPC

Was thinking the first frame in a crash report defines the component the crash belongs to ... ???

Crashing Thread
Frame 	Module 	Signature 	Source
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp
2 	xul.dll 	mozilla::ipc::MessageChannel::OnChannelErrorFromLink() 	ipc/glue/MessageChannel.cpp
3 	xul.dll 	mozilla::ipc::ProcessLink::OnChannelError() 	ipc/glue/MessageLink.cpp
4 	xul.dll 	IPC::Channel::ChannelImpl::OnIOCompleted(base::MessagePumpForIO::IOContext*, unsigned long, unsigned long) 	ipc/chromium/src/chrome/common/ipc_channel_win.cc
5 	xul.dll 	base::MessagePumpForIO::WaitForIOCompletion(unsigned long, base::MessagePumpForIO::IOHandler*) 	ipc/chromium/src/base/message_pump_win.cc
6 	xul.dll 	base::MessagePumpForIO::DoRunLoop() 	ipc/chromium/src/base/message_pump_win.cc
7 	xul.dll 	base::MessagePumpWin::Run(base::MessagePump::Delegate*) 	ipc/chromium/src/base/message_pump_win.h
8 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc
9 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
10 	xul.dll 	base::Thread::ThreadMain() 	ipc/chromium/src/base/thread.cc
11 	xul.dll 	`anonymous namespace'::ThreadFunc(void*) 	ipc/chromium/src/base/platform_thread_win.cc
12 	kernel32.dll 	BaseThreadInitThunk 	
13 	ntdll.dll 	RtlUserThreadStart 	
14 	kernel32.dll 	BasepReportFault 	
15 	kernel32.dll 	BasepReportFault

https://crash-stats.mozilla.com/report/index/54230516-3603-411d-9b11-454d22150609
mozalloc_abort and NS_DebugBreak are functions called to crash on purpose, for some value of on purpose. mozalloc_abort is the one actually crashing. NS_DebugBreak is something that uses mozalloc_abort to crash, and it is used by macros like NS_RUNTIMEABORT, NS_WARN_IF_FALSE, etc., some of which only call NS_DebugBreak on debug builds.
Anyways, what this all means is that it's whatever is above them in the stack that's at fault.
Assigning to Mike Hommey because every tracked bug should be assigned. Feel free to reassign to someone else.
Assignee: nobody → mh+mozilla
That would be for some peer of the IPC module. Picking one at random.
Assignee: mh+mozilla → bent.mozilla
Untracking for 40 as e10s won't be available in beta.
Assignee: bent.mozilla → nobody
Component: IPC → DOM: Content Processes
Jim, this is another e10s crash (though not a top crash) which needs an owner. I don't see much dev activity on it in the last month or so. Thanks!
Flags: needinfo?(jmathies)
(In reply to Ritu Kothari (:ritu) from comment #11)
> Jim, this is another e10 crash (though not top crash) which needs an owner.
> I don't see much dev activity on it in the last month or so. Thanks!

This was flagged by the e10s team on 6-9; Blake will take a look at it at some point.
Flags: needinfo?(jmathies)
FWIW Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:42.0) Gecko/20100101 Firefox/42.0 ID:20150712030212 CSet: eab21ec484bb crashes like this:

Report ID: bp-ca9d4e0d-b248-4868-80f2-aa8532150713
Date Submitted: 13/07/2015 06:58 a.m.

I was showing https://bugzilla.mozilla.org/show_bug.cgi?id=1144063 to some folks in the LATAM QA meetup.
QA Whiteboard: mozLATAM
After clicking on the image repeatedly in Developer Edition 42.0a2, ALL my tabs crashed. Not the browser, but each individual tab. That probably shouldn't happen?
At the moment, the default configuration for e10s only uses 1 content process. So if one tab causes a crash, all tabs will crash. You can increase the number of content processes by adjusting dom.ipc.processCount, but note that this is not a supported configuration yet (and tabs aren't spread out across content processes in a particularly principled way).
Untracked and wontfix'd for FF41 as e10s is not enabled by default. FF42+
Not tracking it anymore as we are going to disable e10s soon in aurora but tracking for 43.
This crash is quite low in volume except on 40.0.x. Removing needinfo to get back into triage.
Flags: needinfo?(mrbkap)
Not tracking since it's low volume. Renominate if it increases in frequency.
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink()] → [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink()] [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink]
Very low volume across all channels. Untracking this for now.
I managed to reproduce this issue on the latest Aurora (46.0a2) on Windows 7 32-bit. If e10s is enabled, repeatedly clicking the image at different positions causes the crash. If e10s is disabled, Firefox does not crash.

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:46.0) Gecko/20100101 Firefox/46.0
Build ID: 20160209004008
Flags: needinfo?(wmccloskey)
Attached file bwtest.html (deleted) —
This test always crashes the child process on e10s.
I run it on localhost, and it does a blob XHR for a file named test.zip, which is 1.1 GB in size.

https://crash-stats.mozilla.com/report/index/febbbc9c-766f-4946-9569-8e9572160215

I've also had crashes on non-e10s when doing multiple XHRs in a limited amount of time: the process is killed with SIGKILL, but that's probably a different bug.
https://crash-stats.mozilla.com/report/index/b93ea76a-3144-4996-a1f4-ac6182160506

Occurred when viewing IndexedDB content in the Storage developer tool.
I think this used to be how we would crash when messages were too large. Now we crash more directly. The original problem should be addressed by bug 1272018.
Flags: needinfo?(wmccloskey)
This crash still happens on Nightly, though not very often.
I'm going to mark this one dependent on bug 1272018, based on the STR in comments 0 and 1.
Depends on: 1272018
Priority: -- → P2
Crash volume for signature 'mozalloc_abort | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink':
 - nightly (version 50): 67 crashes from 2016-06-06.
 - aurora  (version 49): 66 crashes from 2016-06-07.
 - beta    (version 48): 6 crashes from 2016-06-06.
 - release (version 47): 6 crashes from 2016-05-31.
 - esr     (version 45): 0 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          9          3          2          4         12         18         16
 - aurora           6          4          8         11         17         15          0
 - beta             0          2          0          0          2          1          0
 - release          0          1          0          1          2          0          1
 - esr              0          0          0          0          0          0          0

Affected platforms: Windows, Linux
We already track this crash on bug 1051567.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
(In reply to Henrik Skupin (:whimboo) from comment #28)
> We already track this crash on bug 1051567.
> 
> *** This bug has been marked as a duplicate of bug 1051567 ***

I disagree with your duping: this bug has STR (see comment #1) and was regressed by bug 1071562, which landed in FF40. The crash signature, however, looks pretty generic and can be found in versions prior to FF40.
In addition, the crash is partially fixed after bug 1272018 but there is still an issue (I'll file a new bug).
Resolution: DUPLICATE → FIXED
Blocks: 1295272
I've filed bug 1295272 for the remaining issue.
Crash volume for signature 'mozalloc_abort | NS_DebugBreak | mozilla::ipc::MessageChannel::OnChannelErrorFromLink':
 - nightly (version 51): 27 crashes from 2016-08-01.
 - aurora  (version 50): 71 crashes from 2016-08-01.
 - beta    (version 49): 29 crashes from 2016-08-02.
 - release (version 48): 4 crashes from 2016-07-25.
 - esr     (version 45): 0 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
 - nightly       3      16       4
 - aurora       35      21       9
 - beta          4      16       9
 - release       0       2       0
 - esr           0       0       0

Affected platforms: Windows, Linux

Crash rank on the last 7 days:
           Browser   Content     Plugin
 - nightly           #117
 - aurora  #201      #117
 - beta    #12608    #2742
 - release           #329
 - esr
This is tracked in bug 1295272. Marking "fix optional" in this bug only to get it off our triage lists and focus on 1295272.