Closed
Bug 1024555
Opened 10 years ago
Closed 10 years ago
blocked gethash requests cause the browser to hang
Categories
(Core :: DOM: Security, defect)
Tracking
()
RESOLVED
FIXED
mozilla35
People
(Reporter: mmc, Assigned: mmc)
References
Details
Attachments
(2 files, 4 obsolete files)
(deleted),
text/x-log
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review |
This issue affects users in China, where the Great Firewall is blocking responses to gethash requests. This makes safebrowsing useless in China, but the collateral damage is that nsIUrlClassifierHashCompleter can't specify a timeout and doesn't get onStopRequest (http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js#400).
This means that if a popular website like baidu includes blocked content as it is currently, Firefox hangs waiting for the gethash response.
Comment 1•10 years ago
|
||
The problem seems to me that there is no good fail-safe option.
If the gethash request for an URL that is on the prefix-blocklist times out, what will you do?
Comment 2•10 years ago
|
||
I'm still catching up on email, but is the problem that a SuspendChannel on some subresource causes many other unrelated requests to block?
I would expect a random image/JS file on a big site hanging not to hold up the rest of the site load.
Assignee | ||
Comment 3•10 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #2)
> I'm still catching up on email, but is the problem that a SuspendChannel on
> some subresource causes many other unrelated requests to block?
I couldn't reproduce, but please see bug 1023767.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #1)
> The problem seems to me that there is no good fail-safe option.
>
> If the gethash request for an URL that is on the prefix-blocklist times out,
> what will you do?
Fail open, same as if all safebrowsing services are down.
Comment 4•10 years ago
|
||
Is it possible to set a timeout, say 10s, for each gethash request?
Comment 5•10 years ago
|
||
Yes, if you implement it (last time I looked at it, we couldn't configure HTTP timeouts, certainly not depending on the connection origin). We might need to make our own timers + timeout handling inside SafeBrowsing.
But we should figure out why the site loads block entirely, because taking at least 10s to download random things is going to make Firefox performance look very bad.
Comment 6•10 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #5)
> Yes, if you implement it (last time I looked at it, we couldn't configure
> HTTP timeouts, certainly not depending on the connection origin). We might
> need to make our own timers + timeout handling inside SafeBrowsing.
>
> But we should figure out why the site loads block entirely, because taking
> at least 10s to download random things is going to make Firefox performance
> look very bad.
Yes, but it's better than endless page loading.
About why the site loads block entirely, take taobao.com for example, there are some taobao's pages that contain script[1], the script hit safebrowsing hash db, so Firefox hold the script[1] loading (which means the whole page loading is consequently hold too, unless the script[1] is |deferred|), then Firefox send a gethash request to Google to check if the script[1] is a malware url, but because of GFW, the gethash request never got response (HashCompleterRequest.onStopRequest never called), so the page is endless loading.
[1] http://a.tbcdn.cn/??s/kissy/1.2.0/kissy-min.js,apps/sportalapps/global/1.0/seller-global-min.js?t=20131028.js
Assignee | ||
Comment 7•10 years ago
|
||
Hey Patrick,
We need some necko advice. Is it possible for nsIChannel.asyncOpen to specify a timeout? Otherwise a safebrowsing outage (as happens regularly for China users because of the Great Firewall) can cause Firefox to hang waiting for safebrowsing responses. The relevant code is here: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js#251
Thanks,
Monica
Flags: needinfo?(mcmanus)
Comment 8•10 years ago
|
||
I think comment 5 sums up the expected usage - call asyncopen and set a timer in the calling code and cancel() the channel if your timer expires. iirc that's what the xhr code does.
Perhaps we should do something better.
There are a couple of necko internal timers, but they aren't going to be fine grained enough for you. Basically they are there to reap and reclaim network sockets for stuff that is truly and thoroughly hung - they won't give you responsiveness. (they are at 5 minutes and 90 seconds depending on the state of things and are global configs).
So do you want to implement the timer in the calling code or do you want it to be the property of nsIHttp*mumble* ? I don't mind implementing a necko interface, it should be quite easy, but it would ride the trains at the normal pace.
I would suggest that one problem with either approach is that it is subject to false positives with queueing delays.. i.e. your timer fires because the channel has just been waiting in line for a long time to get started rather than is actually taking a long time itself - which is a function of the length of the line. This can lead to "livelock" on slow networks. (real work never gets done even though the code appears to be making progress)
setting the timer only on admission is harder than it sounds because channels aren't bound tightly to particular tcp connections until those tcp connections are established. (this is related to the reason I mentioned two separate timers already exist).
fun - eh?
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 10•10 years ago
|
||
Pin, is this still an issue? I tried to reproduce by pointing my hosts file at localhost but couldn't get Firefox to hang.
Flags: needinfo?(pzhang)
Assignee | ||
Comment 11•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → mmc
Status: NEW → ASSIGNED
Comment 12•10 years ago
|
||
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #10)
> Pin, is this still an issue? I tried to reproduce by pointing my hosts file
> at localhost but couldn't get Firefox to hang.
Yes, it's still an issue. I don't think it could be reproduced by just changing hosts file.
Flags: needinfo?(pzhang)
Assignee | ||
Comment 13•10 years ago
|
||
Comment on attachment 8486105 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8486105 [details] [diff] [review]:
-----------------------------------------------------------------
I don't know how to test this :( It does not interfere with normal operation, or when the hosts file hijacks safebrowsing.
Attachment #8486105 -
Flags: review?(gpascutto)
Comment 14•10 years ago
|
||
Comment on attachment 8486105 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8486105 [details] [diff] [review]:
-----------------------------------------------------------------
When you say you don't know how to test, do you mean automated or, "at all"? If it's "at all" then I'm tempted to r- until you do. It indeed won't activate during normal operation, but what if my Wifi hiccups for 5s and Firefox get stuck for 2 minutes, or forever, because of a missing callback or the backoff logic breaking?
"Emulating" by just firing the timer callback instantly would work. As would replacing the completion URLs by the IP or your own machine, and running a simple TCP proxy on it (that delays). Not sure if you can find something off-the-shelf for it, but a TCP proxy in Unix that multiplexes select/read/send is a few hours of reasonably entertaining coding.
::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js
@@ +207,5 @@
> + notify: function HCR_notify() {
> + // If we haven't gotten onStopRequest, just cancel.
> + if (this._channel && this._channel.isPending()) {
> + dump("hashcompleter: cancelling request to " + this.gethashUrl + "\n");
> + this._channel.cancel(Cr.NS_BINDING_ABORTED);
It's totally non-obvious to me whether this will correctly end up calling notifySuccess / notifyFailure? Failure to open the channel would, but canceling it on the timeout?
What happens to the RequestBackoff logic?
Attachment #8486105 -
Flags: review?(gpascutto) → review+
Assignee | ||
Comment 15•10 years ago
|
||
I reproduced the hang and manually verified this patch fixes it.
STR:
1) Write node server that accepts requests but never responds
2) Point gethash URL at this
3) Go to http://algremh.com/Check/ppl/ (still live as of right now) and Firefox hangs
With this patch, the hash completer gets an onStopRequest which falls through to notifyFailure and backs off as expected.
Assignee | ||
Comment 16•10 years ago
|
||
Assignee | ||
Comment 17•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8486105 -
Attachment is obsolete: true
Assignee | ||
Comment 18•10 years ago
|
||
Comment on attachment 8486865 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8486865 [details] [diff] [review]:
-----------------------------------------------------------------
Improved comments, carry over gcp's review.
Attachment #8486865 -
Flags: review+
Comment 19•10 years ago
|
||
Comment on attachment 8486865 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8486865 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js
@@ +231,5 @@
> + // Set a timer that cancels the channel after 5 seconds in case we don't
> + // get a gethash response.
> + this.timer_ = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
> + // Ask the timer to use nsITimerCallback (.notify()) when ready
> + this.timer_.initWithCallback(this, 5000, this.timer_.TYPE_ONE_SHOT);
Is it possible to get the timeout from a pref, say browser.safebrowsing.gethash.timeout? Then we could set it with some proper value based on the local network condition.
Assignee | ||
Comment 20•10 years ago
|
||
Sure, I can do that. It would also be great to have confirmation if this fix works, if you can reproduce that from the TBPL link.
Comment 21•10 years ago
|
||
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #20)
> Sure, I can do that. It would also be great to have confirmation if this fix
> works, if you can reproduce that from the TBPL link.
I tested the patch, it works perfectly, thanks!
Assignee | ||
Comment 22•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8486865 -
Attachment is obsolete: true
Assignee | ||
Comment 23•10 years ago
|
||
Comment on attachment 8487322 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8487322 [details] [diff] [review]:
-----------------------------------------------------------------
Update to use pref for setting timeout. gcp, are you ok with this naming?
Attachment #8487322 -
Flags: review?(gpascutto)
Comment 24•10 years ago
|
||
Comment on attachment 8487322 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8487322 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/libpref/init/all.js
@@ +4290,5 @@
> // UDPSocket API
> pref("dom.udpsocket.enabled", false);
> +
> +// Gethash timeout for Safebrowsing.
> +pref("browser.safebrowsing.gethash.timeout_ms", 5000);
I would throw this in with the other browser.safebrowsing.* prefs, which are in the project specific configs. That even makes some sense as mobile may want a higher timeout.
Also, most of the prefs related to completions are like urlclassifier.* not browser.safebrowsing.*, so for consistency (haha) it should probably be urlclassifier.gethash.timeout_ms.
Attachment #8487322 -
Flags: review?(gpascutto) → review+
Assignee | ||
Comment 25•10 years ago
|
||
Assignee | ||
Comment 26•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8487322 -
Attachment is obsolete: true
Assignee | ||
Comment 27•10 years ago
|
||
Comment on attachment 8487348 [details] [diff] [review]
Cancel gethash requests after 5 seconds if they are still pending
Review of attachment 8487348 [details] [diff] [review]:
-----------------------------------------------------------------
Renamed as gcp suggested. I actually filed bug 1033450 to consolidate prefs in all.js a while back.
Attachment #8487348 -
Flags: review+
Assignee | ||
Comment 28•10 years ago
|
||
Comment 29•10 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/c52e9871c9e1 - xpcshell/tests/toolkit/components/url-classifier/tests/unit/test_hashcompleter.js didn't care for that one bit.
Assignee | ||
Comment 30•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8487348 -
Attachment is obsolete: true
Assignee | ||
Comment 31•10 years ago
|
||
Assignee | ||
Comment 32•10 years ago
|
||
Comment 33•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Updated•10 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•