Closed
Bug 1016352
Opened 11 years ago
Closed 10 years ago
WebSocketChannelChild must be thread-safe
Categories
(Core :: Networking: WebSockets, defect)
Tracking
()
RESOLVED
FIXED
mozilla33
People
(Reporter: baku, Assigned: baku)
References
Details
Attachments
(1 file, 2 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review |
Currently it's not thread-safe and the porting of WebSocket to workers fails in several e10s mochitests.
Why are we refcounting it off the main thread? I don't think channels are usually threadsafe.
Assignee | ||
Comment 2•11 years ago
|
||
WebSocketChannel is thread-safe and all the callbacks are received in the target thread, specified by RetargetDeliveryTo().
Comment 3•11 years ago
|
||
Hey baku - I assume it's this test you're talking about: https://tbpl.mozilla.org/php/getParsedLog.php?id=39963172&tree=Try&full=1 ? Ignore the rest of this comment if you're referring to something else.
I tried NS_DECL_THREADSAFE_ISUPPORTS in WebSocketChannelChild, but I get what looks like the same error as in that log. Looks like SendPWebSocketConstructor is passing a null PBrowserChild/TabChild to PNeckoChild::Write?? I'm not sure - I'll need to look at it more.
AsyncOpen should be the same in both single process and e10s, though. Both should be running on the main thread.
Assignee | ||
Comment 4•10 years ago
|
||
> I tried NS_DECL_THREADSAFE_ISUPPORTS in WebSocketChannelChild, but I get
> what looks like the same error as in that log. Looks like
> SendPWebSocketConstructor is passing a null PBrowserChild/TabChild to
> PNeckoChild::Write?? I'm not sure - I'll need to look at it more.
Yes. This is the issue. And last time I debugged it, the issue was that.
Would be great if you can take a look. Let me know if I can help.
Comment 5•10 years ago
|
||
Took a deeper look today:
So, the crash I see is in PNeckoChild::Write(PBrowserChild ...) - the PBrowserChild arg is null, but Write does not want it to be.
There are a couple of things here.
1. I compared SendPWebSocketConstructor with it's HTTP and FTP equivalents; HTTP is set to allow null PBrowserChild args, FTP (like WebSockets) is not. This can be fixed by adding a 'nullable' qualifier to PWebSocket.
Since we're already checking MissingRequiredTabChild in WebSocketChannelChild::AsyncOpen, it leads me to believe that this should be ok. I'm thinking that it is ok to have it null for mochitests or whatever, basically in cases where network.disable.ipc.security is set to true.
Jason, can you confirm this?
2. Even with 'nullable' added, the test still fails :) But much more of the test is able to run, so that's progress :)
Out of interest, Baku, have you looked at the behavior outside of automated tests - is there still a crash in PNeckoChild::Write?
Note: I also had NS_DECL_THREADSAFE_ISUPPORTS in WebSocketChannelChild for this debugging.
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(amarchesini)
Comment 6•10 years ago
|
||
Why should it be ok to have it as null for mochitests? We use the PBrowserChild to check permissions for requests coming from content processes, and websockets are inherently associated with particular pages, unlike HTTP channels.
Comment 7•10 years ago
|
||
Ah, I see. Then we can't do that :)
Does that mean that MissingRequiredTabChild is wrong? Or am I conflating UsingNeckoIPCSecurity and this nullable qualifier? Anyway, that is a secondary issue.
So somehow we need to get a TabChild/PBrowserChild from the NotificationCallbacks of the WebSocketChannel, or from the loadgroup. Both of those seem to be set in WebSocketImpl. I change WebSocketImpl::GetInterface to also try QueryInterface on it's mParent object (class WebSocket), but I still don't see a TabChild returned here.
Comment 8•10 years ago
|
||
MissingRequiredTabChild provides us with an error from a more useful context than the actual IPC message encoding (which is where the presence of nullable in the protocol definition takes effect).
Comment 9•10 years ago
|
||
We only allow a null PBrowser for xpcshell tests, where there's no browser to pass. Websockets don't work without the DOM, so they're always mochitests, so as jdm mentions, it's ok that we disallow null.
> Since we're already checking MissingRequiredTabChild in WebSocketChannelChild::AsyncOpen
I don't understand how we're getting past this check (and calling the constructor with a null TabChild):
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketChannelChild.cpp#381
I.e. shouldn't we have already returned from AsyncOpen with ILLEGAL_VALUE if the tabchild is null?
This bug seems mistitled to me--AFACIT this has nothing to do with WebSocketChannelChild being thread-safe or not. It sounds like the client code isn't setting the .notificationCallbacks for the channel to something that provides a nsITabChild.
Flags: needinfo?(jduell.mcbugs)
Assignee | ||
Comment 10•10 years ago
|
||
> This bug seems mistitled to me--AFACIT this has nothing to do with
> WebSocketChannelChild being thread-safe or not. It sounds like the client
> code isn't setting the .notificationCallbacks for the channel to something
> that provides a nsITabChild.
What I see in my tests is this:
nsresult
WebSocketImpl::UpdateURI()
{
// Check for Redirections
nsRefPtr<BaseWebSocketChannel> channel;
channel = static_cast<BaseWebSocketChannel*>(mChannel.get());
0:18.33 Hit MOZ_CRASH(WebSocketChannelChild not thread-safe) at /home/baku/Sources/m/websocket/src/netwerk/protocol/websocket/WebSocketChannelChild.cpp:24
0:18.55 mozilla::net::WebSocketChannelChild::AddRef() (/home/baku/Sources/m/websocket/src/netwerk/protocol/websocket/WebSocketChannelChild.cpp:24 (discriminator 1))
0:18.72 nsRefPtr<mozilla::net::BaseWebSocketChannel>::assign_with_AddRef(mozilla::net::BaseWebSocketChannel*) (/home/baku/Sources/m/websocket/build/content/base/src/../../../dist/include/nsAutoPtr.h:841)
0:18.72 nsRefPtr<mozilla::net::BaseWebSocketChannel>::operator=(mozilla::net::BaseWebSocketChannel*) (/home/baku/Sources/m/websocket/build/content/base/src/../../../dist/include/nsAutoPtr.h:945)
the same code works on the main-process.
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 11•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=709a3a2d93f8
Ok, this patch makes try (almost) green except for test 33. But this is a separated issue and it's related to WebSocket to workers.
I mean, if jduell is happy with this patch we can land it before bug 504553.
Attachment #8444879 -
Flags: review?(jduell.mcbugs)
Comment 12•10 years ago
|
||
Comment on attachment 8444879 [details] [diff] [review]
child.patch
Review of attachment 8444879 [details] [diff] [review]:
-----------------------------------------------------------------
necko changes look fine. You need a DOM peer (feel free to pick a different one if you like) for review the WebSocket.cpp bits--the JS incantations are a mystery to me :)
Attachment #8444879 -
Flags: review?(jduell.mcbugs)
Attachment #8444879 -
Flags: review?(bugs)
Attachment #8444879 -
Flags: review+
Assignee | ||
Comment 13•10 years ago
|
||
Attachment #8444879 -
Attachment is obsolete: true
Attachment #8444879 -
Flags: review?(bugs)
Assignee | ||
Comment 14•10 years ago
|
||
Actually, the WebSocket changes are not needed for this patch and I moved them to the other bug. If the patch is green on try, I'll land it.
https://tbpl.mozilla.org/?tree=Try&rev=e4a04f2cc786
Attachment #8444959 -
Attachment is obsolete: true
Assignee | ||
Comment 15•10 years ago
|
||
Checkin-needed, green on try, and just necko changes.
Keywords: checkin-needed
Comment 16•10 years ago
|
||
Assignee: nobody → amarchesini
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in
before you can comment on or make changes to this bug.
Description
•