Flip the network.http.spdy.websockets pref to false for Fx65 release users to work around bug 1523427
Categories
(Core :: Networking: WebSockets, enhancement, P2)
People
(Reporter: RyanVM, Assigned: mythmon)
Details
(Whiteboard: [necko-triaged])
Per bug 1523427, the feature causing problems can be disabled by setting the network.http.spdy.websockets pref to false. I've confirmed with Dragana that this pref doesn't require a restart to take effect.
This is high urgency as it blocks wider rollout of Fx65.
We should make the recipe affect the beta and release channels for versions 65.0 and 65.0b*. For now, let's leave the recipe as not affecting a future 65.0.1 version, since there's a low-risk-looking fix that we can possibly take as a ride-along.
Mythmon, are you the right person to take this or should the NI be redirected to someone else?
Comment 1 (Reporter) • 6 years ago
Confirmed in bug 1523427 #c22 that flipping the pref is an effective workaround.
Comment 2 (Assignee) • 6 years ago
I've prepared a Normandy recipe for this which is awaiting approval. Ryan, can you review it?
Can we roll out this recipe slowly rather than at 100%? I am worried about what scenarios have gone untested with spdy.websockets set to true for the entire beta 65 cycle. How about 25% for a day, then 100% a day later?
Hi Dragana, is there a way to narrow down the users that might be impacted by bug 1523427 rather than turning it off for the entire release population? This is to mitigate the risk mentioned in comment 3.
We did something similar in Bug 1471672.
Comment 6 (Reporter) • 6 years ago
As noted in comment 3, there is a risk here of some undetected breakage in the websockets code having been introduced between early November and now. We are doing our best to mitigate that risk by a combination of manual testing performed by the QA team in Las Vegas and a Try push off mozilla-release with the pref flipped. That Try push isn't showing any obvious breakage:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=83ddb75d6a294c59d15c46671dee49756bfadb42
QA also didn't notice any observable change in behavior with the pref flipped but was able to confirm that it resolved the originally-reported issue.
We had some offline discussions about this and came to the following conclusions:
- This recipe needs to target both 64.* and 65.* to ensure that the rule is already in place for users updating from 64 to 65 after we unthrottle updates (otherwise Fx64 clients will see that the recipe doesn't apply and disregard it). This isn't risky because the pref was added by bug 1434137 and will therefore be a no-op for users on 64.
- Rolling this out gradually or with a restricted audience doesn't seem feasible to me since we don't have any real ways of detecting which users are in an environment with an affected proxy server and this blocks wider rollout of Fx65. Given the risk mitigations above, this gives me confidence in flipping the pref as a short-term fix.
Comment 7 (Reporter) • 6 years ago
(In reply to Michael Cooper [:mythmon] from comment #2)
> I've prepared a Normandy recipe for this which is awaiting approval. Ryan, can you review it?
This is now approved and published with the changes noted in comment 6. I was also able to confirm that the recipe properly sets network.http.spdy.websockets to false in the following cases:
- Firefox 64 new profile.
- Firefox 65 migrated from 64 with previously-enrolled profile.
- Firefox 65 new profile.
Comment 8 • 6 years ago
(In reply to Ritu Kothari (:ritu) from comment #4)
> Hi Dragana, is there a way to narrow down the users that might be impacted by bug 1523427 rather than turning it off for all of release population? This is to mitigate the risk mentioned in comment 3.
There is no way to narrow down the users that might be impacted by bug 1523427.
The pref turns off the websocket over http/2 feature. Firefox will still support websockets over http1. With the pref turned on, if a server does not support the websocket over http/2 feature, Firefox will use websockets over http1. Therefore both code paths were tested in nightly and beta.
The websocket over http/2 feature is new and not many servers support it (this might be the reason we discovered the problem so late). Therefore, I believe that nightly 65 and beta 65 users have mostly used the websockets over http1 code path. The risk of turning off the pref is very low.
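The fallback behaviour described in this comment can be summarised as a small decision function. This is only an illustrative sketch, not Firefox's networking code: the function name `choose_websocket_transport` and both of its parameters are hypothetical and exist only to model the two code paths.

```python
def choose_websocket_transport(pref_enabled: bool, server_supports_h2_ws: bool) -> str:
    """Return which transport a WebSocket connection would use.

    Hypothetical model of the behaviour described in this comment; none of
    these names exist in Firefox.
    """
    # WebSockets over HTTP/2 is used only when the pref is on AND the
    # server supports the feature; every other case falls back to HTTP/1.
    if pref_enabled and server_supports_h2_ws:
        return "http/2"
    return "http/1"
```

With the pref flipped to false, the function always returns `"http/1"`, which is the path most nightly and beta users are believed to have exercised already.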
(In reply to Dragana Damjanovic [:dragana] from comment #8)
> (In reply to Ritu Kothari (:ritu) from comment #4)
> > Hi Dragana, is there a way to narrow down the users that might be impacted by bug 1523427 rather than turning it off for all of release population? This is to mitigate the risk mentioned in comment 3.
> There is no way to narrow down the users that might be impacted by bug 1523427.
> The pref turns off the websocket over http/2 feature. Firefox will still support websockets over http1. With the pref turn on, if a server do not support the websocket over http/2 feature, Firefox will use websockets over http1. Therefore both code paths were tested in nightly and beta.
> The websocket over http/2 feature is a new feature and there is not a lot of servers that support it (this might be the reason we discovered it so late). Therefore, I believe that nightly 65 and beta 65 users have mostly used the websockets over http1 code path. Risk of turning off the pref is very very low.
The details are very helpful, and I'm glad to hear the risk is minuscule. Thanks, Dragana!
Comment 10 (Reporter) • 6 years ago
Hi mythmon, I've just uplifted the real fix for bug 1523427 to mozilla-release for Monday's 65.0.1 build. Can we please go ahead and update the recipe to versions <65.0.1? Thanks!
Comment 11 (Assignee) • 6 years ago
I've updated the recipe to exclude 65.0.1 and above.
Comment 12 (Reporter) • 6 years ago
Approved, thanks for the update.
Comment 13 • 6 years ago
Hey :mythmon -- can we close this bug out at this point?
Comment 14 (Assignee) • 6 years ago
I'd like to keep this bug open as long as the Normandy recipe is still active.
This dashboard shows that about 6% of users still have the hotfix active. Disabling the hotfix today would cause these users to revert to the default (broken) state. I think we can disable this in a couple more weeks.
Is this going to affect 66 at all? Or are we only keeping this alive for 65 users?
Comment 16 (Assignee) • 6 years ago
This recipe affects only versions >= 64 and < 65.0.1. Users of 65.0.1 and above are not affected by this recipe.
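The version range stated above can be checked with a short sketch. This is not the actual Normandy recipe filter, which is not shown in this bug. The helper names `parse_version` and `recipe_applies` are hypothetical, and the parsing is deliberately simplified: a beta suffix such as `b12` is ignored, so `65.0b12` is treated as `65.0.0`.

```python
import re

# Bounds taken from the comment above: >= 64 and < 65.0.1.
LOWER_BOUND = (64, 0, 0)
UPPER_BOUND = (65, 0, 1)

# Accepts "64", "64.0", "65.0.1" and beta strings such as "65.0b12".
_VERSION_RE = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?(?:b\d+)?$")


def parse_version(version: str) -> tuple:
    """Parse a version string into (major, minor, patch), dropping any beta suffix."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Missing components (e.g. the patch level of "65.0") default to 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def recipe_applies(version: str) -> bool:
    """True if the version falls inside the range targeted by the recipe."""
    return LOWER_BOUND <= parse_version(version) < UPPER_BOUND
```

Under these assumptions, `recipe_applies("65.0")` and `recipe_applies("65.0b12")` are true, while `recipe_applies("65.0.1")` and `recipe_applies("66.0")` are false.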
Yes, I meant: do we want it to affect 66? I'll assume not, then.
Comment 18 (Reporter) • 6 years ago
The real fix for this issue shipped in 65.0.1+.
Comment 19 (Reporter) • 5 years ago
If we haven't killed this recipe yet, we can now.
Comment 20 (Assignee) • 5 years ago
In the past week, we've seen 600K users that are still benefiting from this recipe being active. Specifically, that's users in the clients_daily table that were running 65.0 (the build without the fix) and had the hotfix active. For scale, that's about 0.4% of the users we've seen in the last week.
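As a rough sanity check on those two figures, the implied size of the weekly population can be backed out with simple arithmetic. Both inputs are the rounded numbers quoted above, so the result is only an order-of-magnitude estimate, and the variable names are illustrative.

```python
hotfix_users = 600_000    # 65.0 users with the hotfix active (rounded, from the comment)
share_of_weekly = 0.004   # "about 0.4%" expressed as a fraction

# If 600K users are about 0.4% of the weekly total, the total is 600K / 0.004.
implied_weekly_users = round(hotfix_users / share_of_weekly)
print(f"{implied_weekly_users:,}")
```

That works out to roughly 150 million users seen in the week.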
Ryan, do you still want to disable this recipe?
Comment 21 (Reporter) • 5 years ago
Ugh. I guess keep it around, but it would be nice if we had a clearer-cut policy around this as I don't think we should be supporting this indefinitely. Fx65 is almost 3 releases behind current now.
Comment 22 (Assignee) • 5 years ago
Re-opening since we are keeping the experiment active.
Comment 24 (Assignee) • 5 years ago
I don't think this is needed any more. I've disabled the recipe.