Open Bug 1459931 Opened 6 years ago Updated 2 years ago

Rollout updates preference value with the preference-experiment value

Categories

(Firefox :: Normandy Client, defect, P3)

defect

Tracking

()

Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 --- fix-optional
firefox62 --- affected

People

(Reporter: aflorinescu, Unassigned)

References

Details

[Description:] When a pref-experiment and a rollout target the same preference, there are two main scenarios: I. pref-experiment runs first, then rollout II. rollout runs first, then pref-experiment The results differ depending on the case: I. pref-experiment expires, value gets set to the rollout value II.pref-experiment active, rollout active, value gets set to the pref-experiment value. [Environment:] Windows 8.1 x64 62.0a1 20180507222648 [Prerequisites:] 1. Set the app.normandy.dev_mode preference to true to run recipes immediately on startup. 2. Set the app.normandy.logging.level preference to 0 to enable more logging. 3. Set the security.content.signature.root_hash preference to DB:74:CE:58:E4:F9:D0:9E:E0:42:36:BE:6C:C5:C4:F6:6A:E7:74:7D:C0:21:42:7A:03:BC:2F:57:0C:8B:9B:90. 4. Set the preference value for app.normandy.api_url set to https://normandy.stage.mozaws.net/api/v1 [Steps:] (reduced) 1. Create and Publish a Rollout recipe for preference_x. 2. Set prerequisites and run Firefox that supports rollouts(>=61) 3. Create and Publish a Pref-experiment for preference_x. 4. Restart Firefox. [Actual Result:] 2. Rollout runs and sets the value for preference_x to rollout_value. 4. Rollout runs, pref-experiment runs and roll-out sets the value for preference_x to pref-experiments_value. e.g. - pref-roll-exp has 99 rollout_value and 111 pref-experiment_value app.normandy.recipe-runner INFO Executing recipe "gy22" (action=preference-experiment) app.normandy.preference-experiments DEBUG PreferenceExperiments.has(gyexp) app.normandy.preference-experiments DEBUG PreferenceExperiments.get(gyexp) app.normandy.preference-experiments DEBUG PreferenceExperiments.markLastSeen(gyexp) app.normandy.recipe-runner INFO Executing recipe "gyroll" (action=preference-rollout) app.normandy.action.PreferenceRolloutAction DEBUG updating rollout-gyula-pref: pref-roll-exp previous value changed from null to 111 [Expected Result:] Preference Experiment is expired and the rollout_value is set.
This isn't ideal, but I think it is the way we should leave it for now. Ideally we would have some sort of conflict resolution, either server or client side, such that only one recipe would apply, and it would be predictable as to which. Exactly what this mechanism should be is something that we still haven't figured out. This sort of interaction between Normandy actions (ie preference-experiment and preference-study) is something that the old model of action execution, where server supplies the actions, doesn't support well. The new model, where the actions are part of Firefox, allows for better paths for interaction. Because of this, I think that a better resolution of this bug relies on bug 1440779. After that, we can figure out how this should work.
Priority: -- → P3
As a side-effect from this bug, when this scenario happens, the telemetry that gets on the server gets a bit confused: see https://sql.telemetry.mozilla.org/queries/50541/source The rollout/rollback ends up looking like this: (note that now the failed rollback has "branches_experiment":"2" which is a value that 'leaks' from the preference-experiment that is performed in this case on the same preference.) " {"branches_experiment":"2","adiz":"active","graduated-rollback-pref":"active"} 3,235 normandy unenrollFailed preference_rollback " Mike, let me know your thoughts related to this and also let me know if you want to track this into a separate bug.
Flags: needinfo?(mcooper)
I think the issue here is that both the preference rollout and rollback are both active at the same time, and ran in the "wrong" order. The rollback seems to have ran first, and failed because the matching rollout hadn't ran yet. Then the rollout ran correctly. Later the rollback ran correctly. This doesn't seem to be related at all to the preference study, and is an interaction between rollout and rollback. One of the issues making this harder is that it isn't easy to tell what order events came in (at least, I don't know how). For two events have a "event_timestamp" of 3235, and seem to be in the same subsession. The only ordering of these two events that makes sense is "unenrollFailed" followed by "enroll" (the reverse wouldn't make any sense). The "experiments" fields seem to be the "eventually consistent" value here, not being generated for each event. I think the solution to this situation is to simply not allow rollouts and rollbacks to be active at the same time, something we can handle on the server. I don't think there is any new data in comment 3 that indicates poor interactions between rollouts and studies.
Flags: needinfo?(mcooper)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.