Closed Bug 1560636 Opened 5 years ago Closed 5 years ago

IPC messages from socket process happen on main

Categories

(Core :: WebRTC: Networking, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox70 --- fixed

People

(Reporter: bwc, Assigned: bwc)

References

Details

Attachments

(1 file)

For instance, when a packet is received, MediaTransportChild::RecvOnPacketReceived is called on main. Incoming media packets do not need to be flowing through main like this.

Priority: -- → P3
Assignee: nobody → docfaraday

So I'm trying to use BackgroundChild::GetOrCreateSocketActorForCurrentThread to make this work, but am having no luck. The call to SendPMediaTransportConstructor seems to work, but calls to the PMediaTransportChild never get where they are supposed to (the calls succeed, but the PMediaTransportParent never receives the messages). When I switch to using BackgroundChild::GetOrCreateForCurrentThread (ie; to using PBackground to the parent process instead of the socket process), the messaging makes it to the parent process, so it seems the problem lies in PBackground for the socket process, but maybe there's something additional I need to do to make that work?

Flags: needinfo?(kershaw)

(In reply to Byron Campen [:bwc] from comment #2)

So I'm trying to use BackgroundChild::GetOrCreateSocketActorForCurrentThread to make this work, but am having no luck. The call to SendPMediaTransportConstructor seems to work, but calls to the PMediaTransportChild never get where they are supposed to (the calls succeed, but the PMediaTransportParent never receives the messages). When I switch to using BackgroundChild::GetOrCreateForCurrentThread (ie; to using PBackground to the parent process instead of the socket process), the messaging makes it to the parent process, so it seems the problem lies in PBackground for the socket process, but maybe there's something additional I need to do to make that work?

Is there an easy way to test your patch? I have no idea how to test the call path.
FWIW, we've used BackgroundChild::GetOrCreateSocketActorForCurrentThread before in https://phabricator.services.mozilla.com/D17387 and it worked.

Flags: needinfo?(kershaw)

(In reply to Kershaw Chang [:kershaw] from comment #3)

(In reply to Byron Campen [:bwc] from comment #2)

So I'm trying to use BackgroundChild::GetOrCreateSocketActorForCurrentThread to make this work, but am having no luck. The call to SendPMediaTransportConstructor seems to work, but calls to the PMediaTransportChild never get where they are supposed to (the calls succeed, but the PMediaTransportParent never receives the messages). When I switch to using BackgroundChild::GetOrCreateForCurrentThread (ie; to using PBackground to the parent process instead of the socket process), the messaging makes it to the parent process, so it seems the problem lies in PBackground for the socket process, but maybe there's something additional I need to do to make that work?

Is there an easy way to test your patch? I have no idea how to test the call path.
FWIW, we've used BackgroundChild::GetOrCreateSocketActorForCurrentThread before in https://phabricator.services.mozilla.com/D17387 and it worked.

Yeah, the thing that isn't working for me is parent API calls (they go nowhere), but PBackgroundDataBridge has no parent API. I am able to construct endpoints and send the constructor just fine (the parent end is created, on the right process).

To test, you need --setpref="media.peerconnection.mtransport_process=true" --setpref="network.process.enabled=true", and use any webrtc website like https://mozilla.github.io/webrtc-landing/pc_test.html (hitting the start button should do the trick).

Thanks for the information and I can successfully test your patch now.
When I hit the start button, I saw that MediaTransportHandlerIPC was created three times and also three MediaTransportChild were created.
But only one MediaTransportParent was created in socket process. I also saw only two (not three) SendActivateTransport were called and no RecvActivateTransport was called in socket process.
I am not sure if this is the excepted behavior, but I think there should be only one MediaTransportChild in each content process?
And why SendActivateTransport was not called three times? I guess maybe we don't use the right MediaTransportChild to send IPC messages?

We use a separate MediaTransportHandler for each PeerConnection (ie; two of them), and then there's a third that we use to access the global ICE log for about:webrtc. So, I would expect to see two SendCreateIceCtx, two SendActivateTransport, etc. I am also seeing that none of the Recv API is being called, which is where I'm stuck. I do not know why there is only one MediaTransportParent being created either.

Here is my finding.

  1. I saw the first time GetOrCreateSocketActorForCurrentThread was called on non main thread.
  2. SendInitBackgroundRunnable was created.
  3. SendInitBackgroundRunnable was dispatched to aMainEventTarget, which is mCallbackThread.
  4. bridgeChild->SendInitBackground was never called. So, BackgroundParent was not created on socket process.
  5. The third time that GetOrCreateSocketActorForCurrentThread was called on main thread, so SendInitBackground is directly called. I think that's the reason why I only see one MediaTransportParent was created on socket process.

Could you check why the runnable is never executed?

I guess that would be because the SendInitBackgroundRunnable doesn't run the passed function on anything other than main:

https://searchfox.org/mozilla-central/rev/15be167a5b436b57fef944b84eef061d24c1af8c/ipc/glue/BackgroundImpl.cpp#1767-1774

^

Flags: needinfo?(jvarga)

aMainEventTarget argument was added just for new LocalStorage implementation that sometimes needs to use a nested event target/event loop on the main thread. It wasn't designed for anything else. Anyway, if GetOrCreateForCurrentThread() is called on non main thread and the actor doesn't exist yet, it must dispatch a runnable to the main thread to finish PBackground construction by sending a message over PContent.

Flags: needinfo?(jvarga)

(In reply to Byron Campen [:bwc] from comment #8)

I guess that would be because the SendInitBackgroundRunnable doesn't run the passed function on anything other than main:

https://searchfox.org/mozilla-central/rev/15be167a5b436b57fef944b84eef061d24c1af8c/ipc/glue/BackgroundImpl.cpp#1767-1774

Yes, mSendInitfunc needs to be run on main thread.
I've tried to call GetOrCreateSocketActorForCurrentThread with nullptr and I can see two MediaTransportParent were created on socket process.
Then I immediately saw the following assertion failure.

Assertion failure: NS_IsMainThread() (InitializeCipherSuite() can only be accessed on the main thread), at /Users/changkershaw/work/gecko/security/manager/ssl/nsNSSComponent.cpp:2247
#01: mozilla::MediaTransportParent::RecvCreateIceCtx(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, nsTArray<mozilla::dom::RTCIceServer>&&, mozilla::dom::RTCIceTransportPolicy const&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xf39371]
#02: mozilla::dom::PMediaTransportParent::OnMessageReceived(IPC::Message const&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xacc4dd]
#03: mozilla::ipc::PBackgroundParent::OnMessageReceived(IPC::Message const&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xc9767f]
#04: mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x8bbee0]
#05: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x8b9e31]
#06: mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x8baa8d]
#07: mozilla::ipc::MessageChannel::MessageTask::Run()[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x8bb0aa]
#08: nsThread::ProcessNextEvent(bool, bool*)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1595cc]
#09: NS_ProcessNextEvent(nsIThread*, bool)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x15d4f8]
#10: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)[/Users/changkershaw/work/gecko/obj-x86_64-apple-darwin18.6.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x8c0b40]

Yeah, there's more work to be done after that. I just had a successful test, now trying to debug failures that mochitest uncovers.

Try looks mostly ok, and it looks like we've accomplished what we set out to do. We have a new intermittent failure that needs debugging, though. It looks like the messaging to the MediaTransportHandlerSTS is getting reordered somehow... maybe MozPromise is queue-jumping?

Ok, I see what is happening. Switching over to MozPromise::Then for the dispatch to STS on SendPacket means we dispatch even if SendPacket is called on STS (which it is). This means SendPacket is getting slightly delayed, and arriving after the transport has been destroyed. I think I can just remove this assert. We don't re-use transport ids, so there should not be any risk of a packet being sent on the wrong transport.

Blocks: 1482521
Attachment #9075770 - Attachment description: Bug 1560636: (WIP) Avoid using main thread with PMediaTransport. → Bug 1560636: Avoid using main thread with PMediaTransport. r?mjf,r?kershaw

Try looks good. It looks like this patch might also fix bug 1480942, but bug 1541424 has muddied the water pretty badly. I'll continue investigating this over in bug 1480942.

Pushed by bcampen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3c6a1d3f9168 Avoid using main thread with PMediaTransport. r=mjf,kershaw
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
Regressions: 1566445
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: