Closed
Bug 1219469
Opened 9 years ago
Closed 9 years ago
Crash in HttpChannelParent::SynthesizeResponse
Categories
(Core :: DOM: Service Workers, defect)
Core
DOM: Service Workers
Tracking
()
RESOLVED
FIXED
mozilla46
People
(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)
References
Details
(Keywords: crash, reproducible, topcrash)
Crash Data
Attachments
(2 files, 8 obsolete files)
(deleted),
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jdm
:
review+
jduell.mcbugs
:
review+
|
Details | Diff | Splinter Review |
See <http://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-linux-debug/1446062559/mozilla-inbound_ubuntu32_vm-debug_test-web-platform-tests-e10s-8-bm04-tests1-linux32-build38.txt.gz>
15:09:30 INFO - PROCESS | 1778 | [Parent 1778] ###!!! ASSERTION: You can't dereference a NULL nsAutoPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsAutoPtr.h, line 196
15:09:30 INFO - PROCESS | 1778 | [Parent 1778] ###!!! ASSERTION: You can't dereference a NULL nsAutoPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsAutoPtr.h, line 196
15:09:31 INFO - PROCESS | 1778 | --DOCSHELL 0x99821100 == 9 [pid = 1830] [id = 1019]
15:09:31 INFO - PROCESS | 1778 | --DOCSHELL 0x982a9600 == 8 [pid = 1830] [id = 1016]
15:09:31 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/HQkNCLtmTz-8Kf4sEX_5NQ/artifacts/public/build/firefox-44.0a1.en-US.linux-i686.crashreporter-symbols.zip
15:09:31 INFO - PROCESS | 1778 | [Child 1830] ###!!! ABORT: Aborting on channel error.: file /builds/slave/m-in-lx-d-00000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1768
15:09:31 INFO - PROCESS | 1778 | #01: ???[/builds/slave/test/build/application/firefox/libxul.so +0x71d5bc]
15:09:31 INFO - PROCESS | 1778 | #02: ???[/builds/slave/test/build/application/firefox/libxul.so +0x71d688]
15:09:31 INFO - PROCESS | 1778 | #03: ???[/builds/slave/test/build/application/firefox/libxul.so +0x709767]
15:09:31 INFO - PROCESS | 1778 | #04: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6f69ec]
15:09:31 INFO - PROCESS | 1778 | #05: ???[/builds/slave/test/build/application/firefox/libxul.so +0x70f254]
15:09:31 INFO - PROCESS | 1778 | #06: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6f96e7]
15:09:31 INFO - PROCESS | 1778 | #07: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6fa79a]
15:09:31 INFO - PROCESS | 1778 | #08: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6fa7c0]
15:09:31 INFO - PROCESS | 1778 | #09: ???[/builds/slave/test/build/application/firefox/libxul.so +0x7033c6]
15:09:31 INFO - PROCESS | 1778 | #10: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6ffc8e]
15:09:31 INFO - PROCESS | 1778 | #11: ???[/lib/i386-linux-gnu/libpthread.so.0 +0x6d4c]
15:09:31 INFO - PROCESS | 1778 | [Child 1830] ###!!! ABORT: Aborting on channel error.: file /builds/slave/m-in-lx-d-00000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1768
15:09:31 INFO - PROCESS | 1778 | Hit MOZ_CRASH() at /builds/slave/m-in-lx-d-00000000000000000000/build/src/memory/mozalloc/mozalloc_abort.cpp:33
15:09:56 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/7dc715ef-9a16-47de-08f0b0d7-4d714946.dmp
15:09:56 INFO - mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/7dc715ef-9a16-47de-08f0b0d7-4d714946.extra
15:09:57 INFO - PROCESS-CRASH | /_mozilla/service-workers/service-worker/fetch-request-redirect.https.html | application crashed [@ mozilla::net::HttpChannelParent::SynthesizeResponse(nsIInterceptedChannel*)]
15:09:57 INFO - Crash dump filename: /tmp/tmpQK0exE.mozrunner/minidumps/7dc715ef-9a16-47de-08f0b0d7-4d714946.dmp
15:09:57 INFO - Operating system: Linux
15:09:57 INFO - 0.0.0 Linux 3.2.0-76-generic-pae #111-Ubuntu SMP Tue Jan 13 22:34:29 UTC 2015 i686
15:09:57 INFO - CPU: x86
15:09:57 INFO - GenuineIntel family 6 model 62 stepping 4
15:09:57 INFO - 1 CPU
15:09:57 INFO -
15:09:57 INFO - Crash reason: SIGSEGV
15:09:57 INFO - Crash address: 0x6
15:09:57 INFO - Process uptime: not available
15:09:57 INFO -
15:09:57 INFO - Thread 0 (crashed)
15:09:57 INFO - 0 libxul.so!mozilla::net::HttpChannelParent::SynthesizeResponse(nsIInterceptedChannel*) [HttpChannelParent.cpp:a01667dab7c5 : 264 + 0xd]
15:09:57 INFO - eip = 0xb0ddea40 esp = 0xbf940f38 ebp = 0xbf940f78 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x90cc6fd4 edi = 0x00000008 eax = 0x00000000 ecx = 0xb0dcf09a
15:09:57 INFO - edx = 0x9fd86260 efl = 0x00210286
15:09:57 INFO - Found by: given as instruction pointer in context
15:09:57 INFO - 1 libxul.so!mozilla::net::ResponseSynthesizer::Dispatch() [HttpChannelParent.cpp:a01667dab7c5 : 240 + 0x9]
15:09:57 INFO - eip = 0xb0ddeb31 esp = 0xbf940f80 ebp = 0xbf940f98 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf940fcc
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 2 libxul.so!mozilla::net::InterceptedChannelBase::DoNotifyController() [InterceptedChannel.cpp:a01667dab7c5 : 73 + 0x9]
15:09:57 INFO - eip = 0xb0ddf5dd esp = 0xbf940fa0 ebp = 0xbf940fe8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf940fcc
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 3 libxul.so!mozilla::net::InterceptedChannelChrome::NotifyController() [InterceptedChannel.cpp:a01667dab7c5 : 134 + 0x9]
15:09:57 INFO - eip = 0xb0ddf7de esp = 0xbf940ff0 ebp = 0xbf941038 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf94101c
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 4 libxul.so!mozilla::net::nsHttpChannel::OpenCacheEntry(bool) [nsHttpChannel.cpp:a01667dab7c5 : 3025 + 0x8]
15:09:57 INFO - eip = 0xb0e2ad4f esp = 0xbf941040 ebp = 0xbf9411a8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9fd86260 edi = 0x9abba800
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 5 libxul.so!mozilla::net::nsHttpChannel::Connect() [nsHttpChannel.cpp:a01667dab7c5 : 423 + 0x10]
15:09:57 INFO - eip = 0xb0e2b7da esp = 0xbf9411b0 ebp = 0xbf9413d8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba800 edi = 0x9abba8e8
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 6 libxul.so!mozilla::net::nsHttpChannel::ContinueBeginConnectWithResult() [nsHttpChannel.cpp:a01667dab7c5 : 5414 + 0x9]
15:09:57 INFO - eip = 0xb0e2bb04 esp = 0xbf9413e0 ebp = 0xbf941408 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba800 edi = 0x9abbabe0
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 7 libxul.so!mozilla::net::nsHttpChannel::BeginConnect() [nsHttpChannel.cpp:a01667dab7c5 : 5330 + 0x9]
15:09:57 INFO - eip = 0xb0e2c9bd esp = 0xbf941410 ebp = 0xbf941618 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba800 edi = 0x9abbabe0
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 8 libxul.so!mozilla::net::nsHttpChannel::OnProxyAvailable(nsICancelable*, nsIChannel*, nsIProxyInfo*, nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 5484 + 0x9]
15:09:57 INFO - eip = 0xb0e2cce6 esp = 0xbf941620 ebp = 0xbf941658 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba800 edi = 0x00000000
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 9 libxul.so!nsAsyncResolveRequest::DoCallback() [nsProtocolProxyService.cpp:a01667dab7c5 : 287 + 0x16]
15:09:57 INFO - eip = 0xb0cc0eaa esp = 0xbf941660 ebp = 0xbf941718 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x8fffd260 edi = 0xbf941784
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 10 libxul.so!nsAsyncResolveRequest::Run() [nsProtocolProxyService.cpp:a01667dab7c5 : 169 + 0x9]
15:09:57 INFO - eip = 0xb0cc1072 esp = 0xbf941720 ebp = 0xbf941738 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x00000000 edi = 0xbf941784
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 11 libxul.so!nsProtocolProxyService::AsyncResolveInternal(nsIChannel*, unsigned int, nsIProtocolProxyCallback*, nsICancelable**, bool) [nsProtocolProxyService.cpp:a01667dab7c5 : 1304 + 0x6]
15:09:57 INFO - eip = 0xb0cc1287 esp = 0xbf941740 ebp = 0xbf9417f8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x00000000 edi = 0xbf941784
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 12 libxul.so!nsProtocolProxyService::AsyncResolve2(nsIChannel*, unsigned int, nsIProtocolProxyCallback*, nsICancelable**) [nsProtocolProxyService.cpp:a01667dab7c5 : 1328 + 0x16]
15:09:57 INFO - eip = 0xb0cc15fa esp = 0xbf941800 ebp = 0xbf941828 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba838 edi = 0xbf941878
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 13 libxul.so!mozilla::net::nsHttpChannel::ResolveProxy() [nsHttpChannel.cpp:a01667dab7c5 : 2201 + 0x20]
15:09:57 INFO - eip = 0xb0e0c6c7 esp = 0xbf941830 ebp = 0xbf941898 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba838 edi = 0xbf941878
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 14 libxul.so!mozilla::net::nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*) [nsHttpChannel.cpp:a01667dab7c5 : 5043 + 0x9]
15:09:57 INFO - eip = 0xb0e2d065 esp = 0xbf9418a0 ebp = 0xbf941938 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9abba800 edi = 0x9abba888
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 15 libxul.so!mozilla::net::nsHttpChannel::ContinueProcessRedirection(nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 4763 + 0x1a]
15:09:57 INFO - eip = 0xb0e26927 esp = 0xbf941940 ebp = 0xbf941988 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9b104800 edi = 0x00000005
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 16 libxul.so!mozilla::net::nsHttpChannel::OnRedirectVerifyCallback(nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 6681 + 0x9]
15:09:57 INFO - eip = 0xb0e26677 esp = 0xbf941990 ebp = 0xbf9419c8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x9b104800 edi = 0x00000000
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 17 libxul.so!nsAsyncVerifyRedirectCallbackEvent::Run() [nsAsyncRedirectVerifyHelper.cpp:a01667dab7c5 : 43 + 0xb]
15:09:57 INFO - eip = 0xb0c8d1a7 esp = 0xbf9419d0 ebp = 0xbf9419e8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x95d7e900 edi = 0xb5f04340
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 18 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:a01667dab7c5 : 964 + 0x14]
15:09:57 INFO - eip = 0xb0c0feb8 esp = 0xbf9419f0 ebp = 0xbf941a68 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb7046b20 edi = 0xb5f04340
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 19 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:a01667dab7c5 : 297 + 0x10]
15:09:57 INFO - eip = 0xb0c3bc7b esp = 0xbf941a70 ebp = 0xbf941aa8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb70d7750 edi = 0xb702cb20
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 20 libxul.so!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [MessagePump.cpp:a01667dab7c5 : 95 + 0xc]
15:09:57 INFO - eip = 0xb0ef3b0c esp = 0xbf941ab0 ebp = 0xbf941af8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb70d7750 edi = 0xb702cb20
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 21 libxul.so!MessageLoop::RunInternal() [message_loop.cc:a01667dab7c5 : 234 + 0x14]
15:09:57 INFO - eip = 0xb0ecc79a esp = 0xbf941b00 ebp = 0xbf941b28 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb702cb20 edi = 0xb7046b20
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 22 libxul.so!MessageLoop::Run() [message_loop.cc:a01667dab7c5 : 227 + 0x8]
15:09:57 INFO - eip = 0xb0ecc7c0 esp = 0xbf941b30 ebp = 0xbf941b58 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb702cb20 edi = 0xb7046b20
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 23 libxul.so!nsBaseAppShell::Run() [nsBaseAppShell.cpp:a01667dab7c5 : 156 + 0xe]
15:09:57 INFO - eip = 0xb261fce9 esp = 0xbf941b60 ebp = 0xbf941b88 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xb7188890 edi = 0xb7046b20
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 24 libxul.so!nsAppStartup::Run() [nsAppStartup.cpp:a01667dab7c5 : 281 + 0x9]
15:09:57 INFO - eip = 0xb2e02152 esp = 0xbf941b90 ebp = 0xbf941ba8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xa9ab2240 edi = 0xbf941e19
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 25 libxul.so!XREMain::XRE_mainRun() [nsAppRunner.cpp:a01667dab7c5 : 4298 + 0x17]
15:09:57 INFO - eip = 0xb2e51dc9 esp = 0xbf941bb0 ebp = 0xbf941c98 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0x00000000 edi = 0xbf941e19
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 26 libxul.so!XREMain::XRE_main(int, char**, nsXREAppData const*) [nsAppRunner.cpp:a01667dab7c5 : 4391 + 0x9]
15:09:57 INFO - eip = 0xb2e54b4c esp = 0xbf941ca0 ebp = 0xbf941ce8 ebx = 0xb5eff61c
15:09:57 INFO - esi = 0xbf941d20 edi = 0xbf941d38
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 27 libxul.so!XRE_main [nsAppRunner.cpp:a01667dab7c5 : 4493 + 0xf]
15:09:57 INFO - eip = 0xb2e54da1 esp = 0xbf941cf0 ebp = 0xbf941e28 ebx = 0x0808cdb4
15:09:57 INFO - esi = 0xbf941d20 edi = 0xb700b600
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 28 firefox!do_main [nsBrowserApp.cpp:a01667dab7c5 : 212 + 0x6]
15:09:57 INFO - eip = 0x0804d5c0 esp = 0xbf941e30 ebp = 0xbf942e88 ebx = 0x0808cdb4
15:09:57 INFO - esi = 0xbf943024 edi = 0xb700b600
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 29 firefox!main [nsBrowserApp.cpp:a01667dab7c5 : 352 + 0x16]
15:09:57 INFO - eip = 0x0804c8cb esp = 0xbf942e90 ebp = 0xbf942f78 ebx = 0x0808cdb4
15:09:57 INFO - esi = 0xbf943024 edi = 0x00000000
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 30 libc-2.15.so!__libc_start_main [libc-start.c : 226 + 0x1f]
15:09:57 INFO - eip = 0xb74bc4d3 esp = 0xbf942f80 ebp = 0x00000000
15:09:57 INFO - Found by: previous frame's frame pointer
15:09:57 INFO - 31 firefox!__libc_csu_fini + 0x10
15:09:57 INFO - eip = 0x0807be40 esp = 0xbf942f84 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 32 libc-2.15.so!__libc_start_main [libc-start.c : 226 + 0x1f]
15:09:57 INFO - eip = 0xb74bc4d3 esp = 0xbf942f90 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 33 firefox + 0x4b94
15:09:57 INFO - eip = 0x0804cb94 esp = 0xbf942fe0 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 34 libc-2.15.so!__libc_start_main [libc-start.c : 96 + 0x9]
15:09:57 INFO - eip = 0xb74bc3e9 esp = 0xbf942fec ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 35 firefox + 0x4b94
15:09:57 INFO - eip = 0x0804cb94 esp = 0xbf942ff8 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 36 firefox!_start + 0x21
15:09:57 INFO - eip = 0x0804cbb5 esp = 0xbf943000 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 37 firefox!_init + 0x77c
15:09:57 INFO - eip = 0x0804c7e0 esp = 0xbf943004 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 38 firefox!__libc_csu_fini + 0x10
15:09:57 INFO - eip = 0x0807be40 esp = 0xbf943010 ebp = 0x00000000
15:09:57 INFO - Found by: call frame info
15:09:57 INFO - 39 firefox!mozilla::ReadAheadFile(char const*, unsigned int, unsigned int, int*) [FileUtils.cpp:a01667dab7c5 : 538 + 0x8]
15:09:57 INFO - eip = 0x0807be30 esp = 0xbf943014 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
15:09:57 INFO - 40 ld-2.15.so + 0xf280
15:09:57 INFO - eip = 0xb77cd280 esp = 0xbf943018 ebp = 0x00000000
15:09:57 INFO - Found by: stack scanning
I can't debug this because of bug 1219467. :(
This happened in the test added in bug 1219085.
Comment 1•9 years ago
|
||
mSynthesizedHead is null. The only way I see this being possible is if we call SynthesizeResponse multiple times, which suggests that ChannelIntercepted is being called multiple times.
Comment 2•9 years ago
|
||
Is this from the navigation redirect tests? If so, then the channel may be intercepted more than once.
Comment 3•9 years ago
|
||
Oh, its from bug 1219085, which seems to include redirecting an iframe. So I think it might be hitting a legitimate bug. We need to support intercepting again when a navigation is redirected.
Assignee | ||
Comment 4•9 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #3)
> Oh, its from bug 1219085, which seems to include redirecting an iframe.
Yes.
> So
> I think it might be hitting a legitimate bug. We need to support
> intercepting again when a navigation is redirected.
I agree, this is a real bug, which is why I made it block v1.
Comment 5•9 years ago
|
||
Removing dependency as Ehsan said it's working for him now.
No longer depends on: 1219467
Assignee | ||
Comment 6•9 years ago
|
||
This bypasses the crash but we die at the IPC layer I think because we delete the actor in HttpChannelChild::CompleteRedirectSetup(). We need to add support for intercepting redirected channels properly after this patch...
Updated•9 years ago
|
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Comment 7•9 years ago
|
||
On a fresh tree I'm not seeing the crash when running the test in e10s mode. I may just do another try to see if its been fixed by other changes.
Comment 8•9 years ago
|
||
Hmm, I can reproduce on a linux opt DEBUG build.
Comment 9•9 years ago
|
||
(In reply to [Vacation: Nov 3-19] from comment #6)
> This bypasses the crash but we die at the IPC layer I think because we
> delete the actor in HttpChannelChild::CompleteRedirectSetup().
This is kind of right, but I think for slightly different reasons.
The Send__delete__() call deletes the actor from the IPC stack, but the HttpChannelChild object sticks around because its a ref counted object. The AsyncOpen() then creates a new actor for the HttpChannelChild by calling SendPHttpChannelChildConstructor() again.
The IPC crash was happening, though, because Send__delete__() was racing with parent-to-child messages. This is a common IPC problem when destroying an actor. Its caused by the timing of the Send__delete__(), but the overall plan of reattaching an actor to HttpChannelChild is sound and does not directly lead to this crash.
The solution is to bounce a message through the parent so that we know all messages are complete and no more will be sent. Then we can call AsyncOpen() knowing the last actor is safely destroyed.
This patch does this. I verified locally it works on linux opt/DEBUG and windows debug/DEBUG.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b8abf2064db
Attachment #8688642 -
Flags: review?(jduell.mcbugs)
Comment 10•9 years ago
|
||
Here is Ehsan's original patch. It ensures that the state of synthesized redirects is correct on the parent side.
Attachment #8681955 -
Attachment is obsolete: true
Attachment #8688646 -
Flags: review?(jduell.mcbugs)
Comment 11•9 years ago
|
||
Comment on attachment 8688642 [details] [diff] [review]
P2 Safely reopen HttpChannelChild when interception triggers a redirect. r=jduell
>- // This is a redirected channel, and the corresponding parent channel has started
>- // AsyncOpen but was intercepted and suspended. We must tear it down and start
>- // fresh - we will intercept the child channel this time, before creating a new
>- // parent channel unnecessarily.
This comment was intended to explain why the prior code was correct; clearly it was inadequate, but I think we should be able to achieve a similar one with less complexity than the patch here. The key in my original code was that the parent channel is suspended as soon as possible, which was supposed to prevent any messages from being sent to the child. I'm curious which messages were getting by that, and whether we could set a flag in the parent to just drop them instead of sending them to the child?
Comment 12•9 years ago
|
||
I guess I forgot that suspension doesn't suspend any channel notifications that are already in the event queue?
Comment 13•9 years ago
|
||
I could put the comment back. I did not mean to delete it completely since the general plan is still in effect. Its just not clear to me we can know the parent is completely suspended at this point.
(In reply to Josh Matthews [:jdm] from comment #12)
> I guess I forgot that suspension doesn't suspend any channel notifications
> that are already in the event queue?
Yes, I think that is an issue. At one point I tried setting mIPClosed at the suspension point in the parent and I still got a message through that caused a crash. So I think it was already in the queue.
I'll admit, this is complex enough I could easily be missing an simpler way to fix it. This works, though. I guess I'd like to move forward with this approach for now and then use our planning in Orlando to simplify our overall interception strategy.
Comment 14•9 years ago
|
||
Comment on attachment 8688642 [details] [diff] [review]
P2 Safely reopen HttpChannelChild when interception triggers a redirect. r=jduell
Dropping the review flag while I check something for Josh.
Attachment #8688642 -
Flags: review?(jduell.mcbugs)
Updated•9 years ago
|
Attachment #8688646 -
Flags: review?(jduell.mcbugs)
Comment 15•9 years ago
|
||
Comment 16•9 years ago
|
||
Josh, the failing message type is 8257557. As far as I can tell this is an OnStatus() message.
It appears OnStatus() is for progress triggered by nsHttpChannel::OnDataAvailable(). Do you have any ideas how to prevent these from firing?
Note, I got this with an mIPCClosed = true added to the suspend case in HttpChannelParent::SynthesizeResponse(). sSo the OnStatus is being sent just before we try to suspend.
Flags: needinfo?(josh)
Comment 17•9 years ago
|
||
At this point I feel like trying to special case every message that might get sent before we synthesize so the queue is empty if we do suspend is going to be rather fragile. If the channel changes then all these assumptions could break again and introduce more race conditions.
I feel like an explicit async shutdown message like P2 contains is the most robust thing to do here.
Long term we can hopefully avoid having to cross process boundaries at all and keep interception only in the parent.
Comment 18•9 years ago
|
||
The try build suggests there are still race conditions here:
https://treeherder.mozilla.org/logviewer.html#?job_id=13856482&repo=try
I think this means the child process was killed:
WARNING: pipe error (53): Connection reset by peer
We may have to land the fetch-request-redirect.https.html tests only enabled for non-e10s now. We can fix this and other e10s issues in 45+. Lets see if the try shows any errors in non-e10s before doing that, though.
Comment 19•9 years ago
|
||
I landed the test in bug 1219805 disabled for e10s. When this bug is fixed we should re-enable fetch-request-redirect.https.html on e10s as well.
No longer blocks: 1219085
Comment 20•9 years ago
|
||
So try shows my patch is inadequate anyway as its hitting an nsCOMPtr nullptr somewhere on try. Of course, I can't reproduce this locally.
Since this bug is e10s only I'm going to stop working on it. We can try to fix it in the coming weeks, but perhaps its best just to wait until the SWM e10s refactor.
Assignee: bkelly → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(josh)
Comment 22•9 years ago
|
||
Note, when this is fixed we need to enable both of these tests on e10s:
fetch-request-redirect.https.html
navigation-redirect.https.html
Assignee | ||
Updated•9 years ago
|
tracking-e10s:
--- → ?
Updated•9 years ago
|
Assignee | ||
Comment 23•9 years ago
|
||
Jim, why did you minus this? Clearly we don't want to ship e10s with this unfixed? (Note that we're planning to talk about potentially changing how service worker interception works in e10s next week, but still, we should track this, I think.)
Flags: needinfo?(jmathies)
Comment 24•9 years ago
|
||
(In reply to :Ehsan Akhgari from comment #23)
> Jim, why did you minus this? Clearly we don't want to ship e10s with this
> unfixed? (Note that we're planning to talk about potentially changing how
> service worker interception works in e10s next week, but still, we should
> track this, I think.)
It's blocking "service workers post v1", which I assumed meant it does not block the rollout of service workers in 44? e10s will be on for a limited set of users via experimentation in beta/44 and will roll out to a limited set of users in 45. Does this bug represent a serious conflict in either of those? If so lets get it tracked to the appropriate release and get it owned.
Flags: needinfo?(jmathies) → needinfo?(ehsan)
Assignee | ||
Comment 25•9 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #24)
> (In reply to :Ehsan Akhgari from comment #23)
> > Jim, why did you minus this? Clearly we don't want to ship e10s with this
> > unfixed? (Note that we're planning to talk about potentially changing how
> > service worker interception works in e10s next week, but still, we should
> > track this, I think.)
>
> It's blocking "service workers post v1", which I assumed meant it does not
> block the rollout of service workers in 44? e10s will be on for a limited
> set of users via experimentation in beta/44 and will roll out to a limited
> set of users in 45. Does this bug represent a serious conflict in either of
> those? If so lets get it tracked to the appropriate release and get it owned.
Currently in e10s, a website using service workers can cause this crash in some edge cases. We have decided that this doesn't block the initial release of service workers since e10s will be disabled in 44 release (although our beta users _may_ run into this crash), but if we're going to enable e10s for 45 (which is news to me!!! this should be announced more broadly) we should definitely fix this (among other issues with how we interact with e10s, such as bug 1214305.
Flags: needinfo?(ehsan) → needinfo?(jmathies)
Updated•9 years ago
|
Flags: needinfo?(jmathies)
Comment 26•9 years ago
|
||
[Tracking Requested - why for this release]:
e10s related bug in service workers v1 implementation
Sounds like this is edge case so we probably don't need to worry about this for the e10s experiment running in 44. However this may need to block 45.
Ehsan, how serious is this bug? Does it show up on crash stats at all in nightly or aurora?
status-firefox45:
--- → affected
tracking-firefox45:
--- → ?
Updated•9 years ago
|
status-firefox44:
--- → affected
Comment 27•9 years ago
|
||
This is really a service worker issue, so moving to our component for easier tracking.
Component: Networking → DOM: Service Workers
Updated•9 years ago
|
Blocks: e10s-swbeta
Comment 28•9 years ago
|
||
FWIW, this is showing up in nightly and aurora topcrash lists somewhat intermittently (maybe a small number of users crashing a lot every so often):
Nightly:
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=nightly&date=%3E2015-08-01&signature=mozilla%3A%3Anet%3A%3AHttpChannelParent%3A%3ASynthesizeResponse&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1#aggregations
Aurora:
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=aurora&date=%3E2015-08-01&signature=mozilla%3A%3Anet%3A%3AHttpChannelParent%3A%3ASynthesizeResponse&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1#aggregations
Comment 29•9 years ago
|
||
This is on our list to fix in the next week or two.
Ehsan, Josh, are you guys planning to look at this?
Flags: needinfo?(josh)
Flags: needinfo?(ehsan)
Comment 31•9 years ago
|
||
Note, the steps in bug 1231978 don't require any navigation redirects, yet lead to this same crash!
Comment 32•9 years ago
|
||
(In reply to :Ehsan Akhgari from comment #25)
> (In reply to Jim Mathies [:jimm] from comment #24)
> > (In reply to :Ehsan Akhgari from comment #23)
> > > Jim, why did you minus this? Clearly we don't want to ship e10s with this
> > > unfixed? (Note that we're planning to talk about potentially changing how
> > > service worker interception works in e10s next week, but still, we should
> > > track this, I think.)
> >
> > It's blocking "service workers post v1", which I assumed meant it does not
> > block the rollout of service workers in 44? e10s will be on for a limited
> > set of users via experimentation in beta/44 and will roll out to a limited
> > set of users in 45. Does this bug represent a serious conflict in either of
> > those? If so lets get it tracked to the appropriate release and get it owned.
>
> Currently in e10s, a website using service workers can cause this crash in
> some edge cases.
A 302 redirect is not really an edge case. I can crash the browser on demand with a sw+redirect. I'll attach a simple node app to demonstrate. Admittedly, this is stuff I am working on and not something I ran into in the wild. But I wouldn't be surprised if there are sites out there that will consistently crash nightly and aurora users before the page even loads.
Comment 33•9 years ago
|
||
Comment 34•9 years ago
|
||
> Admittedly, this is stuff I am working on and not something I ran into in the wild. But I wouldn't be surprised if there are sites out there that will consistently crash nightly and aurora users before the page even loads.
One example is our Service Worker Cookbook (https://serviceworke.rs/), that crashes even without 302.
Assignee | ||
Comment 35•9 years ago
|
||
I agree that this crash can happen on demand. I have been going through the crash reports and so far as far as I can tell all of the crashes are coming from developers using your tests sites (eeejay's site, serviceworke.rs, localhost, etc.)
Comment 36•9 years ago
|
||
I hit this crash with 100% repro in a new profile with e10s is enabled but not when e10s is disabled.
STR:
1. Click https://dev.opera.com/articles/showmodaldialog/demo.html
2. There is no step 2.
RESULT:
The link tries to redirect to https://dev.opera.com/blog/showmodaldialog/ and then crashes.
bp-74d94880-46df-400f-8f58-c25b62151216
bp-06278d39-01e6-463e-ac94-a92ae2151216
bp-3d1af0d4-cd9e-4157-a27b-c14da2151216
Keywords: reproducible
Assignee | ||
Comment 37•9 years ago
|
||
We might have an idea what is going on here...
Assignee: nobody → ehsan
Flags: needinfo?(josh)
Flags: needinfo?(ehsan)
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 38•9 years ago
|
||
Attachment #8688642 -
Attachment is obsolete: true
Attachment #8688646 -
Attachment is obsolete: true
Attachment #8698116 -
Attachment is obsolete: true
Assignee | ||
Comment 39•9 years ago
|
||
Before this patch, the redirected channel still uses the
HttpChannelParentListener object belonging to the actor for the
pre-redirect channel. This causes problems for interception since it
will cause the redirected channel to use the interception information
belonging to the pre-redirect channel.
Instead, we need to set up a freshly created listener for the parent
actor, and connect the redirected channel to that listener object so
that the correct interception information gets used by the redirected
channel.
Attachment #8699230 -
Flags: review?(josh)
Assignee | ||
Updated•9 years ago
|
Attachment #8699199 -
Attachment is obsolete: true
Comment 40•9 years ago
|
||
Reviews in the redirect code should probably go through me.
Comment 41•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #40)
> Reviews in the redirect code should probably go through me.
FWIW, jduell delegated reviews for necko changes related to service workers to jdm.
Not saying we shouldn't ask you for review, but just FYI since I think that happened while you were out on PTO.
Assignee | ||
Comment 42•9 years ago
|
||
Comment on attachment 8699230 [details] [diff] [review]
Link the redirected channel and the newly created HttpChannelParent object together as soon as the parent actor is created
This patch is simple, elegant, and wrong. :(
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bd1d5d1ad53
Attachment #8699230 -
Flags: review?(josh)
Comment 43•9 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #41)
> (In reply to Honza Bambas (:mayhemer) from comment #40)
> > Reviews in the redirect code should probably go through me.
>
> FWIW, jduell delegated reviews for necko changes related to service workers
> to jdm.
>
> Not saying we shouldn't ask you for review, but just FYI since I think that
> happened while you were out on PTO.
Sure. But this is touching the general redirect code. At least it would be nice I knew there were some changes made.
Updated•9 years ago
|
Updated•9 years ago
|
Assignee | ||
Comment 44•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #43)
> (In reply to Ben Kelly [:bkelly] from comment #41)
> > (In reply to Honza Bambas (:mayhemer) from comment #40)
> > > Reviews in the redirect code should probably go through me.
> >
> > FWIW, jduell delegated reviews for necko changes related to service workers
> > to jdm.
> >
> > Not saying we shouldn't ask you for review, but just FYI since I think that
> > happened while you were out on PTO.
>
> Sure. But this is touching the general redirect code. At least it would be
> nice I knew there were some changes made.
I'll ask you both after I figure this out. :-) More eyes can't hurt here.
Comment 45•9 years ago
|
||
This makes the 'Failed to load <url>. A Service Worker intercepted the request...' error appear in the console when loading cpeterson's example. I went with an existing string to make it easier to land this on branches.
Attachment #8700113 -
Flags: review?(ehsan)
Updated•9 years ago
|
Assignee: ehsan → josh
Status: NEW → ASSIGNED
Updated•9 years ago
|
Assignee: josh → ehsan
Assignee | ||
Comment 46•9 years ago
|
||
Comment on attachment 8700113 [details] [diff] [review]
Report an interception error and cancel the HTTP channel when encountering a known topcrash situation
This looks good but I have two requests:
1. Can you please move this into a different bug? Landing this here will make a mess of what the release managers have to deal with...
2. I thought about this a bit and I don't think we want to reuse an existing string. I much prefer a non-localized temporary string that explains:
a) That this is only an issue in multi-process Firefox.
b) That this is caused by a redirect.
Attachment #8700113 -
Flags: review?(ehsan) → review-
Updated•9 years ago
|
Attachment #8700113 -
Attachment is obsolete: true
Assignee | ||
Comment 47•9 years ago
|
||
Here is what I have found to be wrong so far with my patch. I have been debugging test_redirect_history_wrap.js.
With this patch, we create a new HttpChannelParentListener object at the time the redirect channel is connected, but nsHttpChannel::mListener for the post-redirect channel is not changed, so it will still point to the other (old) HttpChannelParentListener object, that has the pre-redirect nsHttpChannel as its mNextListener object.
As a result, by the time HttpChannelParent::OnStartRequest is called for example, aRequest points to the post-redirect channel but mChannel points to the pre-redirect channel, since the HttpChannelParent object is the one that belongs to the pre-redirect channel.
The incorrect nsHttpChannel::mListener comes from the mRedirectChannel->AsyncOpen() call from nsHttpChannel::ContinueProcessRedirection(). Now, setting mListener inside HttpChannelParent::ConnectChannel() doesn't fix anything since the mListener will be overwritten when AsyncOpen is called on the redirected channel as before. If I avoid overwriting mListener in nsHttpChannel::AsyncOpen if it's non-null, then that test passes, but that sounds like a hack. It does however fix the test failures I saw on try as far as I can tell.
Honza, what do you think about this approach?
Flags: needinfo?(honzab.moz)
Comment 48•9 years ago
|
||
Comment on attachment 8699230 [details] [diff] [review]
Link the redirected channel and the newly created HttpChannelParent object together as soon as the parent actor is created
Review of attachment 8699230 [details] [diff] [review]:
-----------------------------------------------------------------
I haven't touched e10s redirects since we started adding SW code, so there's probably a lot that I'm missing. And I need to leave to catch a plane very soon. But here's some thoughts...
> With this patch, we create a new HttpChannelParentListener object at the time
> the redirect channel is connected, but nsHttpChannel::mListener for the
> post-redirect channel is not changed, so it will still point to the other
> (old) HttpChannelParentListener object
Our e10s architecture has always relied on there being a single HttpChannelParentListener per original necko channel (i.e even when we hit redirects, we keep the same HttpChannelParentListener). This is because nsHttpChannel::mListener can wind up being a chain of listeners (the gzip decoder can interject itself as an intermediate listener, for instance, and theoretically an addon could use nsITraceableChannel to inject itself, etc). So we have no failproof way to "change the listener".
I suspect we'll want a different architecture somehow (ehsan tells me on IRC he's working on a different approach)
Attachment #8699230 -
Flags: feedback-
Assignee | ||
Comment 49•9 years ago
|
||
Assignee | ||
Updated•9 years ago
|
Attachment #8699230 -
Attachment is obsolete: true
Assignee | ||
Comment 50•9 years ago
|
||
It looks like the WIP 2 patch is mostly correct. I'll push to try and clean it up and prepare it for review next week!
Assignee | ||
Comment 51•9 years ago
|
||
If all stars align, this will be green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a49f73edb6df
Assignee | ||
Comment 52•9 years ago
|
||
Attachment #8703053 -
Flags: review?(josh)
Assignee | ||
Comment 53•9 years ago
|
||
The controller needs to be able to handle redirects.
HttpChannelParentListener is a listener object that can attach to
multiple HttpChannelParent objects as part of a single logical Necko
channel (meaning that the listener object will be reused for redirected
channels) so it seems like the better choice for the object that should
implement nsINetworkInterceptController. This patch moves the
implementation of the nsINetworkInterceptController interface to
HttpChannelParentListener.
Attachment #8703054 -
Flags: review?(josh)
Attachment #8703054 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8701174 -
Attachment is obsolete: true
Updated•9 years ago
|
Attachment #8703053 -
Flags: review?(josh) → review+
Comment 54•9 years ago
|
||
Comment on attachment 8703054 [details] [diff] [review]
Part 2: Make HttpChannelParentListener be the controller
Review of attachment 8703054 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +104,5 @@
> mIPCClosed = true;
>
> // If this is an intercepted channel, we need to make sure that any resources are
> // cleaned up to avoid leaks.
> + if (mParentListener) {
Does mParentListener still exist after a redirection occurs? If so, this could cause us to cancel the interception for a redirected channel inappropriately.
Assignee | ||
Comment 55•9 years ago
|
||
(In reply to Josh Matthews [:jdm] from comment #54)
> Comment on attachment 8703054 [details] [diff] [review]
> Part 2: Make HttpChannelParentListener be the controller
>
> Review of attachment 8703054 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: netwerk/protocol/http/HttpChannelParent.cpp
> @@ +104,5 @@
> > mIPCClosed = true;
> >
> > // If this is an intercepted channel, we need to make sure that any resources are
> > // cleaned up to avoid leaks.
> > + if (mParentListener) {
>
> Does mParentListener still exist after a redirection occurs?
Yes, through calling SetParentListener() from HttpChannelParentListener::OnRedirectResult().
> If so, this
> could cause us to cancel the interception for a redirected channel
> inappropriately.
Hmm, why? mInterceptedChannel will *only* be set when mShouldSuspendIntercept is set to true. This code was added in bug 1157283 to clean up the forcefully destroyed parent actor, and I think it should stay as it is. The scenario that you're describing should never occur. WDYT?
Flags: needinfo?(josh)
Comment 56•9 years ago
|
||
Comment on attachment 8703054 [details] [diff] [review]
Part 2: Make HttpChannelParentListener be the controller
Review of attachment 8703054 [details] [diff] [review]:
-----------------------------------------------------------------
True!
Attachment #8703054 -
Flags: review?(josh) → review+
Updated•9 years ago
|
Flags: needinfo?(josh)
Comment 57•9 years ago
|
||
Comment on attachment 8703054 [details] [diff] [review]
Part 2: Make HttpChannelParentListener be the controller
Review of attachment 8703054 [details] [diff] [review]:
-----------------------------------------------------------------
Not quite sure why you need the xpcom voodoo, but I'll trust you.
Attachment #8703054 -
Flags: review?(jduell.mcbugs) → review+
Comment 58•9 years ago
|
||
With 6 crashes during the last 2 weeks, I don't care enough about it. Not tracking.
Assignee | ||
Comment 59•9 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #58)
> With 6 crashes during the last 2 weeks, I don't care enough about it. Not
> tracking.
FWIW the low number of crashes is due to the fact that we wallpapered over it in bug 1233845. That being said, with e10s not shipping in 45, I agree that there is no reason to track this for 45.
Assignee | ||
Comment 60•9 years ago
|
||
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #57)
> Comment on attachment 8703054 [details] [diff] [review]
> Part 2: Make HttpChannelParentListener be the controller
>
> Review of attachment 8703054 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Not quite sure why you need the xpcom voodoo, but I'll trust you.
I did that in order to not have to guess the correct object type, which at least in the case of the listener object is not possible to figure out using the other interfaces the listener supports. In the case of HCP, we should rip out the static_casts to HttpChannelParent* that we currently use and replace them with nice do_QueryObject() calls. :-)
Flags: needinfo?(honzab.moz)
Comment 61•9 years ago
|
||
Comment 62•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox46:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Comment 63•9 years ago
|
||
ehsan, can you get this uplifted to 45 for the e10s beta experiment
Flags: needinfo?(ehsan)
Whiteboard: [e10s-45-uplift]
Assignee | ||
Comment 64•9 years ago
|
||
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #63)
> ehsan, can you get this uplifted to 45 for the e10s beta experiment
We wallpapered the crash in bug 1233845 which is on 44 and above. This bug removes the wallpaper and provides a correct fix, but I don't think that's needed for the e10s beta experiment.
Flags: needinfo?(ehsan)
You need to log in
before you can comment on or make changes to this bug.
Description
•