Closed Bug 1219469 Opened 9 years ago Closed 9 years ago

Crash in HttpChannelParent::SynthesizeResponse

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
e10s m8+ ---
firefox44 --- affected
firefox45 - affected
firefox46 --- fixed

People

(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)

References

Details

(Keywords: crash, reproducible, topcrash)

Crash Data

Attachments

(2 files, 8 obsolete files)

(deleted), patch
jdm
: review+
Details | Diff | Splinter Review
(deleted), patch
jdm
: review+
jduell.mcbugs
: review+
Details | Diff | Splinter Review
See <http://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-linux-debug/1446062559/mozilla-inbound_ubuntu32_vm-debug_test-web-platform-tests-e10s-8-bm04-tests1-linux32-build38.txt.gz> 15:09:30 INFO - PROCESS | 1778 | [Parent 1778] ###!!! ASSERTION: You can't dereference a NULL nsAutoPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsAutoPtr.h, line 196 15:09:30 INFO - PROCESS | 1778 | [Parent 1778] ###!!! ASSERTION: You can't dereference a NULL nsAutoPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsAutoPtr.h, line 196 15:09:31 INFO - PROCESS | 1778 | --DOCSHELL 0x99821100 == 9 [pid = 1830] [id = 1019] 15:09:31 INFO - PROCESS | 1778 | --DOCSHELL 0x982a9600 == 8 [pid = 1830] [id = 1016] 15:09:31 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/HQkNCLtmTz-8Kf4sEX_5NQ/artifacts/public/build/firefox-44.0a1.en-US.linux-i686.crashreporter-symbols.zip 15:09:31 INFO - PROCESS | 1778 | [Child 1830] ###!!! ABORT: Aborting on channel error.: file /builds/slave/m-in-lx-d-00000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1768 15:09:31 INFO - PROCESS | 1778 | #01: ???[/builds/slave/test/build/application/firefox/libxul.so +0x71d5bc] 15:09:31 INFO - PROCESS | 1778 | #02: ???[/builds/slave/test/build/application/firefox/libxul.so +0x71d688] 15:09:31 INFO - PROCESS | 1778 | #03: ???[/builds/slave/test/build/application/firefox/libxul.so +0x709767] 15:09:31 INFO - PROCESS | 1778 | #04: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6f69ec] 15:09:31 INFO - PROCESS | 1778 | #05: ???[/builds/slave/test/build/application/firefox/libxul.so +0x70f254] 15:09:31 INFO - PROCESS | 1778 | #06: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6f96e7] 15:09:31 INFO - PROCESS | 1778 | #07: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6fa79a] 15:09:31 INFO - PROCESS | 1778 | #08: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6fa7c0] 15:09:31 INFO - PROCESS | 1778 | #09: ???[/builds/slave/test/build/application/firefox/libxul.so +0x7033c6] 15:09:31 INFO - PROCESS | 1778 | #10: ???[/builds/slave/test/build/application/firefox/libxul.so +0x6ffc8e] 15:09:31 INFO - PROCESS | 1778 | #11: ???[/lib/i386-linux-gnu/libpthread.so.0 +0x6d4c] 15:09:31 INFO - PROCESS | 1778 | [Child 1830] ###!!! ABORT: Aborting on channel error.: file /builds/slave/m-in-lx-d-00000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1768 15:09:31 INFO - PROCESS | 1778 | Hit MOZ_CRASH() at /builds/slave/m-in-lx-d-00000000000000000000/build/src/memory/mozalloc/mozalloc_abort.cpp:33 15:09:56 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/7dc715ef-9a16-47de-08f0b0d7-4d714946.dmp 15:09:56 INFO - mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/7dc715ef-9a16-47de-08f0b0d7-4d714946.extra 15:09:57 INFO - PROCESS-CRASH | /_mozilla/service-workers/service-worker/fetch-request-redirect.https.html | application crashed [@ mozilla::net::HttpChannelParent::SynthesizeResponse(nsIInterceptedChannel*)] 15:09:57 INFO - Crash dump filename: /tmp/tmpQK0exE.mozrunner/minidumps/7dc715ef-9a16-47de-08f0b0d7-4d714946.dmp 15:09:57 INFO - Operating system: Linux 15:09:57 INFO - 0.0.0 Linux 3.2.0-76-generic-pae #111-Ubuntu SMP Tue Jan 13 22:34:29 UTC 2015 i686 15:09:57 INFO - CPU: x86 15:09:57 INFO - GenuineIntel family 6 model 62 stepping 4 15:09:57 INFO - 1 CPU 15:09:57 INFO - 15:09:57 INFO - Crash reason: SIGSEGV 15:09:57 INFO - Crash address: 0x6 15:09:57 INFO - Process uptime: not available 15:09:57 INFO - 15:09:57 INFO - Thread 0 (crashed) 15:09:57 INFO - 0 libxul.so!mozilla::net::HttpChannelParent::SynthesizeResponse(nsIInterceptedChannel*) [HttpChannelParent.cpp:a01667dab7c5 : 264 + 0xd] 15:09:57 INFO - eip = 0xb0ddea40 esp = 0xbf940f38 ebp = 0xbf940f78 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x90cc6fd4 edi = 0x00000008 eax = 0x00000000 ecx = 0xb0dcf09a 15:09:57 INFO - edx = 0x9fd86260 efl = 0x00210286 15:09:57 INFO - Found by: given as instruction pointer in context 15:09:57 INFO - 1 libxul.so!mozilla::net::ResponseSynthesizer::Dispatch() [HttpChannelParent.cpp:a01667dab7c5 : 240 + 0x9] 15:09:57 INFO - eip = 0xb0ddeb31 esp = 0xbf940f80 ebp = 0xbf940f98 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf940fcc 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 2 libxul.so!mozilla::net::InterceptedChannelBase::DoNotifyController() [InterceptedChannel.cpp:a01667dab7c5 : 73 + 0x9] 15:09:57 INFO - eip = 0xb0ddf5dd esp = 0xbf940fa0 ebp = 0xbf940fe8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf940fcc 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 3 libxul.so!mozilla::net::InterceptedChannelChrome::NotifyController() [InterceptedChannel.cpp:a01667dab7c5 : 134 + 0x9] 15:09:57 INFO - eip = 0xb0ddf7de esp = 0xbf940ff0 ebp = 0xbf941038 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9fd86260 edi = 0xbf94101c 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 4 libxul.so!mozilla::net::nsHttpChannel::OpenCacheEntry(bool) [nsHttpChannel.cpp:a01667dab7c5 : 3025 + 0x8] 15:09:57 INFO - eip = 0xb0e2ad4f esp = 0xbf941040 ebp = 0xbf9411a8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9fd86260 edi = 0x9abba800 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 5 libxul.so!mozilla::net::nsHttpChannel::Connect() [nsHttpChannel.cpp:a01667dab7c5 : 423 + 0x10] 15:09:57 INFO - eip = 0xb0e2b7da esp = 0xbf9411b0 ebp = 0xbf9413d8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba800 edi = 0x9abba8e8 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 6 libxul.so!mozilla::net::nsHttpChannel::ContinueBeginConnectWithResult() [nsHttpChannel.cpp:a01667dab7c5 : 5414 + 0x9] 15:09:57 INFO - eip = 0xb0e2bb04 esp = 0xbf9413e0 ebp = 0xbf941408 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba800 edi = 0x9abbabe0 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 7 libxul.so!mozilla::net::nsHttpChannel::BeginConnect() [nsHttpChannel.cpp:a01667dab7c5 : 5330 + 0x9] 15:09:57 INFO - eip = 0xb0e2c9bd esp = 0xbf941410 ebp = 0xbf941618 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba800 edi = 0x9abbabe0 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 8 libxul.so!mozilla::net::nsHttpChannel::OnProxyAvailable(nsICancelable*, nsIChannel*, nsIProxyInfo*, nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 5484 + 0x9] 15:09:57 INFO - eip = 0xb0e2cce6 esp = 0xbf941620 ebp = 0xbf941658 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba800 edi = 0x00000000 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 9 libxul.so!nsAsyncResolveRequest::DoCallback() [nsProtocolProxyService.cpp:a01667dab7c5 : 287 + 0x16] 15:09:57 INFO - eip = 0xb0cc0eaa esp = 0xbf941660 ebp = 0xbf941718 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x8fffd260 edi = 0xbf941784 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 10 libxul.so!nsAsyncResolveRequest::Run() [nsProtocolProxyService.cpp:a01667dab7c5 : 169 + 0x9] 15:09:57 INFO - eip = 0xb0cc1072 esp = 0xbf941720 ebp = 0xbf941738 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x00000000 edi = 0xbf941784 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 11 libxul.so!nsProtocolProxyService::AsyncResolveInternal(nsIChannel*, unsigned int, nsIProtocolProxyCallback*, nsICancelable**, bool) [nsProtocolProxyService.cpp:a01667dab7c5 : 1304 + 0x6] 15:09:57 INFO - eip = 0xb0cc1287 esp = 0xbf941740 ebp = 0xbf9417f8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x00000000 edi = 0xbf941784 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 12 libxul.so!nsProtocolProxyService::AsyncResolve2(nsIChannel*, unsigned int, nsIProtocolProxyCallback*, nsICancelable**) [nsProtocolProxyService.cpp:a01667dab7c5 : 1328 + 0x16] 15:09:57 INFO - eip = 0xb0cc15fa esp = 0xbf941800 ebp = 0xbf941828 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba838 edi = 0xbf941878 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 13 libxul.so!mozilla::net::nsHttpChannel::ResolveProxy() [nsHttpChannel.cpp:a01667dab7c5 : 2201 + 0x20] 15:09:57 INFO - eip = 0xb0e0c6c7 esp = 0xbf941830 ebp = 0xbf941898 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba838 edi = 0xbf941878 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 14 libxul.so!mozilla::net::nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*) [nsHttpChannel.cpp:a01667dab7c5 : 5043 + 0x9] 15:09:57 INFO - eip = 0xb0e2d065 esp = 0xbf9418a0 ebp = 0xbf941938 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9abba800 edi = 0x9abba888 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 15 libxul.so!mozilla::net::nsHttpChannel::ContinueProcessRedirection(nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 4763 + 0x1a] 15:09:57 INFO - eip = 0xb0e26927 esp = 0xbf941940 ebp = 0xbf941988 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9b104800 edi = 0x00000005 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 16 libxul.so!mozilla::net::nsHttpChannel::OnRedirectVerifyCallback(nsresult) [nsHttpChannel.cpp:a01667dab7c5 : 6681 + 0x9] 15:09:57 INFO - eip = 0xb0e26677 esp = 0xbf941990 ebp = 0xbf9419c8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x9b104800 edi = 0x00000000 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 17 libxul.so!nsAsyncVerifyRedirectCallbackEvent::Run() [nsAsyncRedirectVerifyHelper.cpp:a01667dab7c5 : 43 + 0xb] 15:09:57 INFO - eip = 0xb0c8d1a7 esp = 0xbf9419d0 ebp = 0xbf9419e8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x95d7e900 edi = 0xb5f04340 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 18 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:a01667dab7c5 : 964 + 0x14] 15:09:57 INFO - eip = 0xb0c0feb8 esp = 0xbf9419f0 ebp = 0xbf941a68 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb7046b20 edi = 0xb5f04340 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 19 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:a01667dab7c5 : 297 + 0x10] 15:09:57 INFO - eip = 0xb0c3bc7b esp = 0xbf941a70 ebp = 0xbf941aa8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb70d7750 edi = 0xb702cb20 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 20 libxul.so!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [MessagePump.cpp:a01667dab7c5 : 95 + 0xc] 15:09:57 INFO - eip = 0xb0ef3b0c esp = 0xbf941ab0 ebp = 0xbf941af8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb70d7750 edi = 0xb702cb20 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 21 libxul.so!MessageLoop::RunInternal() [message_loop.cc:a01667dab7c5 : 234 + 0x14] 15:09:57 INFO - eip = 0xb0ecc79a esp = 0xbf941b00 ebp = 0xbf941b28 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb702cb20 edi = 0xb7046b20 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 22 libxul.so!MessageLoop::Run() [message_loop.cc:a01667dab7c5 : 227 + 0x8] 15:09:57 INFO - eip = 0xb0ecc7c0 esp = 0xbf941b30 ebp = 0xbf941b58 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb702cb20 edi = 0xb7046b20 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 23 libxul.so!nsBaseAppShell::Run() [nsBaseAppShell.cpp:a01667dab7c5 : 156 + 0xe] 15:09:57 INFO - eip = 0xb261fce9 esp = 0xbf941b60 ebp = 0xbf941b88 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xb7188890 edi = 0xb7046b20 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 24 libxul.so!nsAppStartup::Run() [nsAppStartup.cpp:a01667dab7c5 : 281 + 0x9] 15:09:57 INFO - eip = 0xb2e02152 esp = 0xbf941b90 ebp = 0xbf941ba8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xa9ab2240 edi = 0xbf941e19 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 25 libxul.so!XREMain::XRE_mainRun() [nsAppRunner.cpp:a01667dab7c5 : 4298 + 0x17] 15:09:57 INFO - eip = 0xb2e51dc9 esp = 0xbf941bb0 ebp = 0xbf941c98 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0x00000000 edi = 0xbf941e19 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 26 libxul.so!XREMain::XRE_main(int, char**, nsXREAppData const*) [nsAppRunner.cpp:a01667dab7c5 : 4391 + 0x9] 15:09:57 INFO - eip = 0xb2e54b4c esp = 0xbf941ca0 ebp = 0xbf941ce8 ebx = 0xb5eff61c 15:09:57 INFO - esi = 0xbf941d20 edi = 0xbf941d38 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 27 libxul.so!XRE_main [nsAppRunner.cpp:a01667dab7c5 : 4493 + 0xf] 15:09:57 INFO - eip = 0xb2e54da1 esp = 0xbf941cf0 ebp = 0xbf941e28 ebx = 0x0808cdb4 15:09:57 INFO - esi = 0xbf941d20 edi = 0xb700b600 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 28 firefox!do_main [nsBrowserApp.cpp:a01667dab7c5 : 212 + 0x6] 15:09:57 INFO - eip = 0x0804d5c0 esp = 0xbf941e30 ebp = 0xbf942e88 ebx = 0x0808cdb4 15:09:57 INFO - esi = 0xbf943024 edi = 0xb700b600 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 29 firefox!main [nsBrowserApp.cpp:a01667dab7c5 : 352 + 0x16] 15:09:57 INFO - eip = 0x0804c8cb esp = 0xbf942e90 ebp = 0xbf942f78 ebx = 0x0808cdb4 15:09:57 INFO - esi = 0xbf943024 edi = 0x00000000 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 30 libc-2.15.so!__libc_start_main [libc-start.c : 226 + 0x1f] 15:09:57 INFO - eip = 0xb74bc4d3 esp = 0xbf942f80 ebp = 0x00000000 15:09:57 INFO - Found by: previous frame's frame pointer 15:09:57 INFO - 31 firefox!__libc_csu_fini + 0x10 15:09:57 INFO - eip = 0x0807be40 esp = 0xbf942f84 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 32 libc-2.15.so!__libc_start_main [libc-start.c : 226 + 0x1f] 15:09:57 INFO - eip = 0xb74bc4d3 esp = 0xbf942f90 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 33 firefox + 0x4b94 15:09:57 INFO - eip = 0x0804cb94 esp = 0xbf942fe0 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 34 libc-2.15.so!__libc_start_main [libc-start.c : 96 + 0x9] 15:09:57 INFO - eip = 0xb74bc3e9 esp = 0xbf942fec ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 35 firefox + 0x4b94 15:09:57 INFO - eip = 0x0804cb94 esp = 0xbf942ff8 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 36 firefox!_start + 0x21 15:09:57 INFO - eip = 0x0804cbb5 esp = 0xbf943000 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 37 firefox!_init + 0x77c 15:09:57 INFO - eip = 0x0804c7e0 esp = 0xbf943004 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 38 firefox!__libc_csu_fini + 0x10 15:09:57 INFO - eip = 0x0807be40 esp = 0xbf943010 ebp = 0x00000000 15:09:57 INFO - Found by: call frame info 15:09:57 INFO - 39 firefox!mozilla::ReadAheadFile(char const*, unsigned int, unsigned int, int*) [FileUtils.cpp:a01667dab7c5 : 538 + 0x8] 15:09:57 INFO - eip = 0x0807be30 esp = 0xbf943014 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning 15:09:57 INFO - 40 ld-2.15.so + 0xf280 15:09:57 INFO - eip = 0xb77cd280 esp = 0xbf943018 ebp = 0x00000000 15:09:57 INFO - Found by: stack scanning I can't debug this because of bug 1219467. :( This happened in the test added in bug 1219085.
mSynthesizedHead is null. The only way I see this being possible is if we call SynthesizeResponse multiple times, which suggests that ChannelIntercepted is being called multiple times.
Is this from the navigation redirect tests? If so, then the channel may be intercepted more than once.
Oh, its from bug 1219085, which seems to include redirecting an iframe. So I think it might be hitting a legitimate bug. We need to support intercepting again when a navigation is redirected.
(In reply to Ben Kelly [:bkelly] from comment #3) > Oh, its from bug 1219085, which seems to include redirecting an iframe. Yes. > So > I think it might be hitting a legitimate bug. We need to support > intercepting again when a navigation is redirected. I agree, this is a real bug, which is why I made it block v1.
Removing dependency as Ehsan said it's working for him now.
No longer depends on: 1219467
This bypasses the crash but we die at the IPC layer I think because we delete the actor in HttpChannelChild::CompleteRedirectSetup(). We need to add support for intercepting redirected channels properly after this patch...
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
On a fresh tree I'm not seeing the crash when running the test in e10s mode. I may just do another try to see if its been fixed by other changes.
Hmm, I can reproduce on a linux opt DEBUG build.
(In reply to [Vacation: Nov 3-19] from comment #6) > This bypasses the crash but we die at the IPC layer I think because we > delete the actor in HttpChannelChild::CompleteRedirectSetup(). This is kind of right, but I think for slightly different reasons. The Send__delete__() call deletes the actor from the IPC stack, but the HttpChannelChild object sticks around because its a ref counted object. The AsyncOpen() then creates a new actor for the HttpChannelChild by calling SendPHttpChannelChildConstructor() again. The IPC crash was happening, though, because Send__delete__() was racing with parent-to-child messages. This is a common IPC problem when destroying an actor. Its caused by the timing of the Send__delete__(), but the overall plan of reattaching an actor to HttpChannelChild is sound and does not directly lead to this crash. The solution is to bounce a message through the parent so that we know all messages are complete and no more will be sent. Then we can call AsyncOpen() knowing the last actor is safely destroyed. This patch does this. I verified locally it works on linux opt/DEBUG and windows debug/DEBUG. https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b8abf2064db
Attachment #8688642 - Flags: review?(jduell.mcbugs)
Here is Ehsan's original patch. It ensures that the state of synthesized redirects is correct on the parent side.
Attachment #8681955 - Attachment is obsolete: true
Attachment #8688646 - Flags: review?(jduell.mcbugs)
Comment on attachment 8688642 [details] [diff] [review] P2 Safely reopen HttpChannelChild when interception triggers a redirect. r=jduell >- // This is a redirected channel, and the corresponding parent channel has started >- // AsyncOpen but was intercepted and suspended. We must tear it down and start >- // fresh - we will intercept the child channel this time, before creating a new >- // parent channel unnecessarily. This comment was intended to explain why the prior code was correct; clearly it was inadequate, but I think we should be able to achieve a similar one with less complexity than the patch here. The key in my original code was that the parent channel is suspended as soon as possible, which was supposed to prevent any messages from being sent to the child. I'm curious which messages were getting by that, and whether we could set a flag in the parent to just drop them instead of sending them to the child?
I guess I forgot that suspension doesn't suspend any channel notifications that are already in the event queue?
I could put the comment back. I did not mean to delete it completely since the general plan is still in effect. Its just not clear to me we can know the parent is completely suspended at this point. (In reply to Josh Matthews [:jdm] from comment #12) > I guess I forgot that suspension doesn't suspend any channel notifications > that are already in the event queue? Yes, I think that is an issue. At one point I tried setting mIPClosed at the suspension point in the parent and I still got a message through that caused a crash. So I think it was already in the queue. I'll admit, this is complex enough I could easily be missing an simpler way to fix it. This works, though. I guess I'd like to move forward with this approach for now and then use our planning in Orlando to simplify our overall interception strategy.
Comment on attachment 8688642 [details] [diff] [review] P2 Safely reopen HttpChannelChild when interception triggers a redirect. r=jduell Dropping the review flag while I check something for Josh.
Attachment #8688642 - Flags: review?(jduell.mcbugs)
Attachment #8688646 - Flags: review?(jduell.mcbugs)
Josh, the failing message type is 8257557. As far as I can tell this is an OnStatus() message. It appears OnStatus() is for progress triggered by nsHttpChannel::OnDataAvailable(). Do you have any ideas how to prevent these from firing? Note, I got this with an mIPCClosed = true added to the suspend case in HttpChannelParent::SynthesizeResponse(). sSo the OnStatus is being sent just before we try to suspend.
Flags: needinfo?(josh)
At this point I feel like trying to special case every message that might get sent before we synthesize so the queue is empty if we do suspend is going to be rather fragile. If the channel changes then all these assumptions could break again and introduce more race conditions. I feel like an explicit async shutdown message like P2 contains is the most robust thing to do here. Long term we can hopefully avoid having to cross process boundaries at all and keep interception only in the parent.
The try build suggests there are still race conditions here: https://treeherder.mozilla.org/logviewer.html#?job_id=13856482&repo=try I think this means the child process was killed: WARNING: pipe error (53): Connection reset by peer We may have to land the fetch-request-redirect.https.html tests only enabled for non-e10s now. We can fix this and other e10s issues in 45+. Lets see if the try shows any errors in non-e10s before doing that, though.
I landed the test in bug 1219805 disabled for e10s. When this bug is fixed we should re-enable fetch-request-redirect.https.html on e10s as well.
No longer blocks: 1219085
So try shows my patch is inadequate anyway as its hitting an nsCOMPtr nullptr somewhere on try. Of course, I can't reproduce this locally. Since this bug is e10s only I'm going to stop working on it. We can try to fix it in the coming weeks, but perhaps its best just to wait until the SWM e10s refactor.
Assignee: bkelly → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(josh)
e10s, so not blocking v1.
Blocks: ServiceWorkers-postv1
No longer blocks: ServiceWorkers-v1
Note, when this is fixed we need to enable both of these tests on e10s: fetch-request-redirect.https.html navigation-redirect.https.html
tracking-e10s: --- → ?
Jim, why did you minus this? Clearly we don't want to ship e10s with this unfixed? (Note that we're planning to talk about potentially changing how service worker interception works in e10s next week, but still, we should track this, I think.)
Flags: needinfo?(jmathies)
(In reply to :Ehsan Akhgari from comment #23) > Jim, why did you minus this? Clearly we don't want to ship e10s with this > unfixed? (Note that we're planning to talk about potentially changing how > service worker interception works in e10s next week, but still, we should > track this, I think.) It's blocking "service workers post v1", which I assumed meant it does not block the rollout of service workers in 44? e10s will be on for a limited set of users via experimentation in beta/44 and will roll out to a limited set of users in 45. Does this bug represent a serious conflict in either of those? If so lets get it tracked to the appropriate release and get it owned.
Flags: needinfo?(jmathies) → needinfo?(ehsan)
(In reply to Jim Mathies [:jimm] from comment #24) > (In reply to :Ehsan Akhgari from comment #23) > > Jim, why did you minus this? Clearly we don't want to ship e10s with this > > unfixed? (Note that we're planning to talk about potentially changing how > > service worker interception works in e10s next week, but still, we should > > track this, I think.) > > It's blocking "service workers post v1", which I assumed meant it does not > block the rollout of service workers in 44? e10s will be on for a limited > set of users via experimentation in beta/44 and will roll out to a limited > set of users in 45. Does this bug represent a serious conflict in either of > those? If so lets get it tracked to the appropriate release and get it owned. Currently in e10s, a website using service workers can cause this crash in some edge cases. We have decided that this doesn't block the initial release of service workers since e10s will be disabled in 44 release (although our beta users _may_ run into this crash), but if we're going to enable e10s for 45 (which is news to me!!! this should be announced more broadly) we should definitely fix this (among other issues with how we interact with e10s, such as bug 1214305.
Flags: needinfo?(ehsan) → needinfo?(jmathies)
Flags: needinfo?(jmathies)
[Tracking Requested - why for this release]: e10s related bug in service workers v1 implementation Sounds like this is edge case so we probably don't need to worry about this for the e10s experiment running in 44. However this may need to block 45. Ehsan, how serious is this bug? Does it show up on crash stats at all in nightly or aurora?
This is really a service worker issue, so moving to our component for easier tracking.
Component: Networking → DOM: Service Workers
Blocks: e10s-swbeta
This is on our list to fix in the next week or two. Ehsan, Josh, are you guys planning to look at this?
Flags: needinfo?(josh)
Flags: needinfo?(ehsan)
Note, the steps in bug 1231978 don't require any navigation redirects, yet lead to this same crash!
(In reply to :Ehsan Akhgari from comment #25) > (In reply to Jim Mathies [:jimm] from comment #24) > > (In reply to :Ehsan Akhgari from comment #23) > > > Jim, why did you minus this? Clearly we don't want to ship e10s with this > > > unfixed? (Note that we're planning to talk about potentially changing how > > > service worker interception works in e10s next week, but still, we should > > > track this, I think.) > > > > It's blocking "service workers post v1", which I assumed meant it does not > > block the rollout of service workers in 44? e10s will be on for a limited > > set of users via experimentation in beta/44 and will roll out to a limited > > set of users in 45. Does this bug represent a serious conflict in either of > > those? If so lets get it tracked to the appropriate release and get it owned. > > Currently in e10s, a website using service workers can cause this crash in > some edge cases. A 302 redirect is not really an edge case. I can crash the browser on demand with a sw+redirect. I'll attach a simple node app to demonstrate. Admittedly, this is stuff I am working on and not something I ran into in the wild. But I wouldn't be surprised if there are sites out there that will consistently crash nightly and aurora users before the page even loads.
Attached file crash-sw.tar.gz (obsolete) (deleted) —
> Admittedly, this is stuff I am working on and not something I ran into in the wild. But I wouldn't be surprised if there are sites out there that will consistently crash nightly and aurora users before the page even loads. One example is our Service Worker Cookbook (https://serviceworke.rs/), that crashes even without 302.
I agree that this crash can happen on demand. I have been going through the crash reports and so far as far as I can tell all of the crashes are coming from developers using your tests sites (eeejay's site, serviceworke.rs, localhost, etc.)
I hit this crash with 100% repro in a new profile with e10s is enabled but not when e10s is disabled. STR: 1. Click https://dev.opera.com/articles/showmodaldialog/demo.html 2. There is no step 2. RESULT: The link tries to redirect to https://dev.opera.com/blog/showmodaldialog/ and then crashes. bp-74d94880-46df-400f-8f58-c25b62151216 bp-06278d39-01e6-463e-ac94-a92ae2151216 bp-3d1af0d4-cd9e-4157-a27b-c14da2151216
Keywords: reproducible
We might have an idea what is going on here...
Assignee: nobody → ehsan
Flags: needinfo?(josh)
Flags: needinfo?(ehsan)
Attached patch WIP (obsolete) (deleted) — Splinter Review
Attachment #8688642 - Attachment is obsolete: true
Attachment #8688646 - Attachment is obsolete: true
Attachment #8698116 - Attachment is obsolete: true
Depends on: 1233245
Before this patch, the redirected channel still uses the HttpChannelParentListener object belonging to the actor for the pre-redirect channel. This causes problems for interception since it will cause the redirected channel to use the interception information belonging to the pre-redirect channel. Instead, we need to set up a freshly created listener for the parent actor, and connect the redirected channel to that listener object so that the correct interception information gets used by the redirected channel.
Attachment #8699230 - Flags: review?(josh)
Attachment #8699199 - Attachment is obsolete: true
Reviews in the redirect code should probably go through me.
(In reply to Honza Bambas (:mayhemer) from comment #40) > Reviews in the redirect code should probably go through me. FWIW, jduell delegated reviews for necko changes related to service workers to jdm. Not saying we shouldn't ask you for review, but just FYI since I think that happened while you were out on PTO.
Comment on attachment 8699230 [details] [diff] [review] Link the redirected channel and the newly created HttpChannelParent object together as soon as the parent actor is created This patch is simple, elegant, and wrong. :( https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bd1d5d1ad53
Attachment #8699230 - Flags: review?(josh)
(In reply to Ben Kelly [:bkelly] from comment #41) > (In reply to Honza Bambas (:mayhemer) from comment #40) > > Reviews in the redirect code should probably go through me. > > FWIW, jduell delegated reviews for necko changes related to service workers > to jdm. > > Not saying we shouldn't ask you for review, but just FYI since I think that > happened while you were out on PTO. Sure. But this is touching the general redirect code. At least it would be nice I knew there were some changes made.
(In reply to Honza Bambas (:mayhemer) from comment #43) > (In reply to Ben Kelly [:bkelly] from comment #41) > > (In reply to Honza Bambas (:mayhemer) from comment #40) > > > Reviews in the redirect code should probably go through me. > > > > FWIW, jduell delegated reviews for necko changes related to service workers > > to jdm. > > > > Not saying we shouldn't ask you for review, but just FYI since I think that > > happened while you were out on PTO. > > Sure. But this is touching the general redirect code. At least it would be > nice I knew there were some changes made. I'll ask you both after I figure this out. :-) More eyes can't hurt here.
This makes the 'Failed to load <url>. A Service Worker intercepted the request...' error appear in the console when loading cpeterson's example. I went with an existing string to make it easier to land this on branches.
Attachment #8700113 - Flags: review?(ehsan)
Assignee: ehsan → josh
Status: NEW → ASSIGNED
Assignee: josh → ehsan
Comment on attachment 8700113 [details] [diff] [review] Report an interception error and cancel the HTTP channel when encountering a known topcrash situation This looks good but I have two requests: 1. Can you please move this into a different bug? Landing this here will make a mess of what the release managers have to deal with... 2. I thought about this a bit and I don't think we want to reuse an existing string. I much prefer a non-localized temporary string that explains: a) That this is only an issue in multi-process Firefox. b) That this is caused by a redirect.
Attachment #8700113 - Flags: review?(ehsan) → review-
Blocks: 1233845
Attachment #8700113 - Attachment is obsolete: true
Here is what I have found to be wrong so far with my patch. I have been debugging test_redirect_history_wrap.js. With this patch, we create a new HttpChannelParentListener object at the time the redirect channel is connected, but nsHttpChannel::mListener for the post-redirect channel is not changed, so it will still point to the other (old) HttpChannelParentListener object, that has the pre-redirect nsHttpChannel as its mNextListener object. As a result, by the time HttpChannelParent::OnStartRequest is called for example, aRequest points to the post-redirect channel but mChannel points to the pre-redirect channel, since the HttpChannelParent object is the one that belongs to the pre-redirect channel. The incorrect nsHttpChannel::mListener comes from the mRedirectChannel->AsyncOpen() call from nsHttpChannel::ContinueProcessRedirection(). Now, setting mListener inside HttpChannelParent::ConnectChannel() doesn't fix anything since the mListener will be overwritten when AsyncOpen is called on the redirected channel as before. If I avoid overwriting mListener in nsHttpChannel::AsyncOpen if it's non-null, then that test passes, but that sounds like a hack. It does however fix the test failures I saw on try as far as I can tell. Honza, what do you think about this approach?
Flags: needinfo?(honzab.moz)
Comment on attachment 8699230 [details] [diff] [review] Link the redirected channel and the newly created HttpChannelParent object together as soon as the parent actor is created Review of attachment 8699230 [details] [diff] [review]: ----------------------------------------------------------------- I haven't touched e10s redirects since we started adding SW code, so there's probably a lot that I'm missing. And I need to leave to catch a plane very soon. But here's some thoughts... > With this patch, we create a new HttpChannelParentListener object at the time > the redirect channel is connected, but nsHttpChannel::mListener for the > post-redirect channel is not changed, so it will still point to the other > (old) HttpChannelParentListener object Our e10s architecture has always relied on there being a single HttpChannelParentListener per original necko channel (i.e even when we hit redirects, we keep the same HttpChannelParentListener). This is because nsHttpChannel::mListener can wind up being a chain of listeners (the gzip decoder can interject itself as an intermediate listener, for instance, and theoretically an addon could use nsITraceableChannel to inject itself, etc). So we have no failproof way to "change the listener". I suspect we'll want a different architecture somehow (ehsan tells me on IRC he's working on a different approach)
Attachment #8699230 - Flags: feedback-
Attached patch WIP 2 (obsolete) (deleted) — Splinter Review
Attachment #8699230 - Attachment is obsolete: true
It looks like the WIP 2 patch is mostly correct. I'll push to try and clean it up and prepare it for review next week!
Attachment #8703053 - Flags: review?(josh)
The controller needs to be able to handle redirects. HttpChannelParentListener is a listener object that can attach to multiple HttpChannelParent objects as part of a single logical Necko channel (meaning that the listener object will be reused for redirected channels) so it seems like the better choice for the object that should implement nsINetworkInterceptController. This patch moves the implementation of the nsINetworkInterceptController interface to HttpChannelParentListener.
Attachment #8703054 - Flags: review?(josh)
Attachment #8703054 - Flags: review?(jduell.mcbugs)
Attachment #8701174 - Attachment is obsolete: true
Attachment #8703053 - Flags: review?(josh) → review+
Comment on attachment 8703054 [details] [diff] [review] Part 2: Make HttpChannelParentListener be the controller Review of attachment 8703054 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelParent.cpp @@ +104,5 @@ > mIPCClosed = true; > > // If this is an intercepted channel, we need to make sure that any resources are > // cleaned up to avoid leaks. > + if (mParentListener) { Does mParentListener still exist after a redirection occurs? If so, this could cause us to cancel the interception for a redirected channel inappropriately.
(In reply to Josh Matthews [:jdm] from comment #54) > Comment on attachment 8703054 [details] [diff] [review] > Part 2: Make HttpChannelParentListener be the controller > > Review of attachment 8703054 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: netwerk/protocol/http/HttpChannelParent.cpp > @@ +104,5 @@ > > mIPCClosed = true; > > > > // If this is an intercepted channel, we need to make sure that any resources are > > // cleaned up to avoid leaks. > > + if (mParentListener) { > > Does mParentListener still exist after a redirection occurs? Yes, through calling SetParentListener() from HttpChannelParentListener::OnRedirectResult(). > If so, this > could cause us to cancel the interception for a redirected channel > inappropriately. Hmm, why? mInterceptedChannel will *only* be set when mShouldSuspendIntercept is set to true. This code was added in bug 1157283 to clean up the forcefully destroyed parent actor, and I think it should stay as it is. The scenario that you're describing should never occur. WDYT?
Flags: needinfo?(josh)
Comment on attachment 8703054 [details] [diff] [review] Part 2: Make HttpChannelParentListener be the controller Review of attachment 8703054 [details] [diff] [review]: ----------------------------------------------------------------- True!
Attachment #8703054 - Flags: review?(josh) → review+
Flags: needinfo?(josh)
Comment on attachment 8703054 [details] [diff] [review] Part 2: Make HttpChannelParentListener be the controller Review of attachment 8703054 [details] [diff] [review]: ----------------------------------------------------------------- Not quite sure why you need the xpcom voodoo, but I'll trust you.
Attachment #8703054 - Flags: review?(jduell.mcbugs) → review+
With 6 crashes during the last 2 weeks, I don't care enough about it. Not tracking.
(In reply to Sylvestre Ledru [:sylvestre] from comment #58) > With 6 crashes during the last 2 weeks, I don't care enough about it. Not > tracking. FWIW the low number of crashes is due to the fact that we wallpapered over it in bug 1233845. That being said, with e10s not shipping in 45, I agree that there is no reason to track this for 45.
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #57) > Comment on attachment 8703054 [details] [diff] [review] > Part 2: Make HttpChannelParentListener be the controller > > Review of attachment 8703054 [details] [diff] [review]: > ----------------------------------------------------------------- > > Not quite sure why you need the xpcom voodoo, but I'll trust you. I did that in order to not have to guess the correct object type, which at least in the case of the listener object is not possible to figure out using the other interfaces the listener supports. In the case of HCP, we should rip out the static_casts to HttpChannelParent* that we currently use and replace them with nice do_QueryObject() calls. :-)
Flags: needinfo?(honzab.moz)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Blocks: 1239311
ehsan, can you get this uplifted to 45 for the e10s beta experiment
Flags: needinfo?(ehsan)
Whiteboard: [e10s-45-uplift]
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #63) > ehsan, can you get this uplifted to 45 for the e10s beta experiment We wallpapered the crash in bug 1233845 which is on 44 and above. This bug removes the wallpaper and provides a correct fix, but I don't think that's needed for the e10s beta experiment.
Flags: needinfo?(ehsan)
Thanks for the explanation
Whiteboard: [e10s-45-uplift]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: