Closed Bug 1462164 Opened 6 years ago Closed 6 years ago

Gradual roll-out of TLS fallback-limit to 1.3 on beta channel (61)

Categories

(Core :: Security: PSM, defect, P1)

Tracking

RESOLVED FIXED
Tracking Status
firefox61 + fixed

People

(Reporter: rhelmer, Assigned: rhelmer)

TLS 1.3 is already enabled on Beta (currently version 61), and we'd now like to do a gradual roll-out of the fallback limit. This is controlled by the "security.tls.version.fallback-limit" pref, which is currently set to 3 (TLS 1.2) on Beta. The value we wish to roll out is 4 (TLS 1.3). The plan is to use Feature Flagging with Normandy Pref Rollout, which is a new built-in capability as of Firefox 61. Since this is the first time we're doing a large-scale test of Normandy, we will initially roll this out at 10% and then ramp up to 95% once we've seen telemetry validating that it's working as expected.
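To make the roll-out plan concrete, here is a minimal sketch of a percentage-based pref rollout. It is not Normandy's actual sampling code: the hashing scheme, the `in_rollout` and `fallback_limit_for` helpers, and the salt string are all assumptions made for illustration. Only the pref values 3 (TLS 1.2) and 4 (TLS 1.3) and the 10% and 95% rates come from this bug.

```python
import hashlib

# Values of security.tls.version.fallback-limit mentioned in this bug.
FALLBACK_LIMIT_TLS_VERSION = {3: "TLS 1.2", 4: "TLS 1.3"}


def in_rollout(client_id: str, rate: float, salt: str = "hypothetical-rollout-slug") -> bool:
    """Deterministically decide whether a client falls inside the rollout.

    The client ID is hashed to a stable position in [0, 1); the client is
    enrolled when that position is below `rate`. This is an assumed scheme,
    not Normandy's implementation.
    """
    digest = hashlib.sha256(f"{salt}:{client_id}".encode("utf-8")).hexdigest()
    position = int(digest[:12], 16) / float(16 ** 12)
    return position < rate


def fallback_limit_for(client_id: str, rate: float) -> int:
    """Return the pref value a client would end up with at a given rate."""
    return 4 if in_rollout(client_id, rate) else 3
```

Because each client's position is fixed, anyone enrolled at 10% stays enrolled when the rate is later raised to 95%, which is the property a gradual ramp-up depends on.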
> Since this is the first time we're doing a large-scale test of Normandy
Just a small note, this is the first time we're doing a large-scale test of a particular Normandy feature: preference rollout. Normandy itself has been well vetted in the past, since it is the underlying mechanism for the Shield program. :)
> we will initially roll this out at 10% and then ramp up to 95% once we've seen telemetry validating that it's working as expected.
+1. I think this is a good approach even once Normandy preference rollout has been tested out. This will minimize damage in case something goes awry for any reason.
Based on the Green Pre-release signoff report for the Normandy preferences rollout that was sent out by Adrian today, are we ready to give this a go?
Flags: needinfo?(rhelmer)
Flags: needinfo?(mcooper)
As far as I know, we weren't actually waiting for the QA sign-off to launch this rollout. Rob told me there was another issue they were working on before rolling out, and I was waiting to hear more about it. From my side, this is ready to go whenever Rob is ready. Rob: Is there anything else we are still waiting on?
Flags: needinfo?(mcooper)
(In reply to Michael Cooper [:mythmon] from comment #3)
> As far as I know, we weren't actually waiting for the QA sign-off to launch this rollout. Rob told me there was another issue they were working on before rolling out, and I was waiting to hear more about it. From my side, this is ready to go whenever Rob is ready.
>
> Rob: Is there anything else we are still waiting on?
We were holding until bug 1462303 was investigated. ekr, are we good to go on TLS fallback-limit to 1.3 on beta channel?
Flags: needinfo?(rhelmer) → needinfo?(ekr)
Yes, it's fixed. Go for 10%.
Flags: needinfo?(ekr)
This is live for 10% of Beta 61.
Looking at telemetry from 2018-06-05 (yesterday), 143k out of 1.44M (9.96%) pings have the pref rollout active. https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/16317/command/16335 Additionally, we've seen 223k total enrollments across the lifetime of the rollout. https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/16031/command/16032 This is the first Normandy pref rollout, so I'm paying close attention to its health. From what I can see, everything is looking good from my end, and behaving as expected.
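As a sanity check on the figures above, the enrolled share can be recomputed from the ping counts. This is a minimal sketch: the helper names and the one-percentage-point tolerance are assumptions, and the inputs are the rounded counts quoted in the comment. The rounded counts give roughly 9.9%, so the reported 9.96% presumably comes from the exact counts in the notebook.

```python
def active_share(active_pings: int, total_pings: int) -> float:
    """Fraction of pings that report the pref rollout as active."""
    if total_pings <= 0:
        raise ValueError("total_pings must be positive")
    return active_pings / total_pings


def within_tolerance(share: float, target: float, tolerance: float = 0.01) -> bool:
    """True when the observed share is within `tolerance` of the target rate.

    The default tolerance of one percentage point is an arbitrary choice
    for illustration, not a value taken from this bug.
    """
    return abs(share - target) <= tolerance


# Rounded counts from the comment above: 143k active out of 1.44M pings.
share = active_share(143_000, 1_440_000)
on_target = within_tolerance(share, target=0.10)
```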
When do we intend to up the rate to 95%?
Flags: needinfo?(rhelmer)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #8)
> When do we intend to up the rate to 95%?
ekr, according to Normandy's telemetry we're at 10% now on Beta, can you let us know here when we should bump this up? Thanks!
Flags: needinfo?(rhelmer) → needinfo?(ekr)
Any updates on this? Beta 62 is drawing near and there is interest from the QA side as well for the first live Rollout and its data.
Enrollment data still holds at about 10%. As far as Normandy's metrics are concerned, the rollout was a success. There was one problem, caused by a known and already fixed client-side bug, which led to extraneous update events being sent. Currently we are on hold until Rob can analyze the TLS telemetry that was sent by enrolled clients. This is proving to be more difficult than expected, since the previous version of the analysis didn't work with the rollout system, and the data isn't in any of the derived data sets.
61 is on release and bug 1473987 tracks the rollout there.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(ekr)
Resolution: --- → FIXED
Hi Michael, as we discussed, this recipe is still "alive/enabled" but should it be? What criteria would we use to disable it? Thanks!
Flags: needinfo?(mcooper)
Once we disable this recipe, users on the target version of Firefox would no longer get the rollout. This recipe is targeting only Beta 61, so I would guess it is safe to turn off. I checked the metrics for this recipe on Datadog, and it has gotten about 700 enrollments in the last 24 hours. Ultimately I think it is up to Rob whether it is safe to turn this off. Rob, what do you think?
Flags: needinfo?(mcooper) → needinfo?(rhelmer)
(In reply to Michael Cooper [:mythmon] from comment #14)
> Once we disable this recipe, users on the target version of Firefox would no longer get the rollout. This recipe is targeting only Beta 61, so I would guess it is safe to turn off. I checked the metrics for this recipe on Datadog, and it has gotten about 700 enrollments in the last 24 hours. Ultimately I think it is up to Rob whether it is safe to turn this off.
>
> Rob, what do you think?
I think we're fine, but let me check Telemetry first - leaving needinfo for now.
Rhelmer, any updates about this?
(In reply to Michael Cooper [:mythmon] from comment #16)
> Rhelmer, any updates about this?
Sorry, I have been meaning to get back to you - what's the harm in leaving it as-is, out of curiosity? I think it's safe at this point, since we explicitly don't support old releases... I don't want to purposely break anyone but I can't see how this would.
Flags: needinfo?(rhelmer) → needinfo?(mcooper)
There isn't any direct harm in leaving this as is, but every recipe that is enabled in Normandy is downloaded by every Firefox client every six hours. It would be wise to disable recipes that aren't having an effect anymore. Due to the design of rollouts, ending the rollout won't affect users that already have the feature. It will only stop applying it to new profiles started on Beta 61, or existing profiles on Beta 61 that have never been online since we launched this feature. Ideally this would be a data-driven process. How many users are still actively applying this update, and is it below a threshold that is acceptable? I checked Datadog, and we are seeing about 2000 enrollments per week, which is ~25% of what we saw at the start of the rollout.
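The data-driven check described above could look roughly like the sketch below. No threshold was agreed on in this bug, so the `threshold` argument is purely hypothetical, and the 8000 enrollments per week at the start is only what "2000 per week is ~25%" implies, not a measured figure.

```python
FETCH_INTERVAL_HOURS = 6  # every client downloads enabled recipes every six hours


def fetches_per_client_per_day(interval_hours: int = FETCH_INTERVAL_HOURS) -> float:
    """How many times per day a single client downloads the enabled recipes."""
    return 24 / interval_hours


def enrollment_ratio(recent_weekly: int, initial_weekly: int) -> float:
    """Recent weekly enrollments as a fraction of the initial weekly rate."""
    if initial_weekly <= 0:
        raise ValueError("initial_weekly must be positive")
    return recent_weekly / initial_weekly


def should_disable(recent_weekly: int, initial_weekly: int, threshold: float) -> bool:
    """True once enrollments have dropped to or below the chosen threshold.

    `threshold` is a hypothetical policy value, not one taken from this bug.
    """
    return enrollment_ratio(recent_weekly, initial_weekly) <= threshold


# About 2000 enrollments per week now, against an implied ~8000 at the start.
ratio = enrollment_ratio(2000, 8000)
```

With a hypothetical threshold of 0.30 this recipe would already qualify for disabling, while a stricter 0.10 would keep it enabled for now.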
Flags: needinfo?(mcooper)
(In reply to Michael Cooper [:mythmon] from comment #18)
> There isn't any direct harm in leaving this as is, but every recipe that is enabled in Normandy is downloaded by every Firefox client every six hours. It would be wise to disable recipes that aren't having an effect anymore.
>
> Due to the design of rollouts, ending the rollout won't affect users that already have the feature. It will only stop applying it to new profiles started on Beta 61, or existing profiles on Beta 61 that have never been online since we launched this feature.
>
> Ideally this would be a data-driven process. How many users are still actively applying this update, and is it below a threshold that is acceptable? I checked Datadog, and we are seeing about 2000 enrollments per week, which is ~25% of what we saw at the start of the rollout.
Given this is beta and also an unsupported release, I think we're fine to pull this now. There shouldn't be any new installs etc. I do agree that having some kind of policy around this would be good, so let's think about it a bit. We might want to go case by case for starters and then make a policy once we have more data. It also really depends on how critical the roll-out is to functionality, I'd think.
I have disabled the recipe.