Closed
Bug 1472308
Opened 6 years ago
Closed 6 years ago
Disable parallel OMTP for OSX 10.9 users via Normandy pref experiment
Categories
(Core :: Graphics, enhancement)
Tracking
()
RESOLVED
FIXED
People
(Reporter: RyanVM, Assigned: mythmon)
References
Details
Over in bug 1471892, we believe that parallel OMTP is causing crashes in OSX CoreGraphics code for users on 10.9. We would like to confirm that to be the case via Normandy.
Can we please create a recipe which changes the "layers.omtp.paint-workers" pref to a value of 1 for all OSX 10.9 users running Fx61 on the release channel? Thanks!
Flags: needinfo?(mcooper)
Assignee
Comment 1•6 years ago
I've created a recipe on stage and prod implementing the behavior described in comment 0. I've enabled the one on stage for testing.
Stage recipe: https://normandy-admin.stage.mozaws.net/recipe/509/
Prod recipe: https://normandy-admin.prod.mozaws.net/recipe/495/
Public API: https://normandy.cdn.mozilla.net/api/v1/recipe/495/
For ease, here is the current filter expression:
(
  normandy.telemetry.main.environment.system.os.name == "Darwin"
  && (
    normandy.telemetry.main.environment.system.os.version == "13.0.0"
    || normandy.telemetry.main.environment.system.os.version == "13.4.0"
  )
  && normandy.channel == "release"
  && normandy.version >= "61.0"
  && normandy.version < "62.0"
)
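Read literally, the filter combines an OS name check, two exact Darwin kernel versions, and a channel and version window. The Python sketch below mirrors that logic for illustration only. `ClientEnv` and `matches_filter` are hypothetical names, and the sketch assumes the version bounds are compared as plain strings, which is how the expression reads as written.

```python
from dataclasses import dataclass


@dataclass
class ClientEnv:
    """Hypothetical stand-in for the fields the filter expression reads."""
    os_name: str     # normandy.telemetry.main.environment.system.os.name
    os_version: str  # normandy.telemetry.main.environment.system.os.version
    channel: str     # normandy.channel
    version: str     # normandy.version


def matches_filter(env: ClientEnv) -> bool:
    """Mirror the filter from this comment; versions are compared as strings."""
    return (
        env.os_name == "Darwin"
        and env.os_version in ("13.0.0", "13.4.0")
        and env.channel == "release"
        and "61.0" <= env.version < "62.0"
    )
```

Under that assumption, a release client on 61.0.1 reporting Darwin 13.4.0 matches, while the same client on beta does not. A 10.9 client reporting any kernel version other than the two listed would also be skipped.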
Flags: needinfo?(mcooper)
Reporter
Comment 2•6 years ago
Recipe looks good to me, thanks! Andrei, does your team have an OSX 10.9 system handy to test this out?
Flags: needinfo?(andrei.vaida)
Comment 3•6 years ago
We've been able to test the stage recipe for disabling parallel OMTP on Firefox 61.0 build3 (20180621125625). The results look good, but one detail requires clarification: the disabling takes place only after the browser is restarted. Is this expected or not? See more details about the performed testing in this etherpad https://public.etherpad-mozilla.org/p/request_bug1472308.
Please let me know if additional testing is required or if there are questions about this report.
Flags: needinfo?(andrei.vaida) → needinfo?(mcooper)
Assignee
Comment 4•6 years ago
Based on the testing details, I believe the restart behavior is as expected. Specifically:
> the recipe execution is made only after restart
This is as intended. Normally Normandy runs on a timer, every 6 hours. With app.normandy.dev_mode = true, Normandy runs at every startup. In this test procedure, the restart is the trigger that causes Normandy to run, instead of waiting for the timer to fire.
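As a rough illustration of the two triggers described here, the sketch below models the decision as a single function. It is a toy model, not Normandy's actual scheduler; `should_run_recipes` and its parameters are hypothetical names.

```python
SIX_HOURS_SECONDS = 6 * 60 * 60  # timer interval stated in this comment


def should_run_recipes(dev_mode: bool, at_startup: bool,
                       seconds_since_last_run: float) -> bool:
    """Toy model: run at startup in dev mode, otherwise when the timer is due."""
    if dev_mode and at_startup:
        return True
    return seconds_since_last_run >= SIX_HOURS_SECONDS
```

With dev mode on, a restart triggers a run immediately; with it off, a restart shortly after the last run does nothing until the six-hour interval has elapsed.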
Flags: needinfo?(mcooper)
Reporter
Comment 5•6 years ago
Sounds like this is ready to go live then. Thanks for the quick testing!
Reporter
Comment 6•6 years ago
This is now published.
Reporter
Comment 7•6 years ago
Hi Michael, per the investigation in bug 1471892, we'd like to revise this recipe such that we set the "layers.omtp.enabled" pref to false rather than changing "layers.omtp.paint-workers". What's the best way to go about that now?
Status: RESOLVED → REOPENED
Flags: needinfo?(mcooper)
Resolution: FIXED → ---
Assignee
Comment 8•6 years ago
Because of the way that we set up this recipe, we'll need to make a new recipe to make this change, and deactivate the old one. That means that users will either have a window where both prefs are changed, or one where neither of them is. If all goes well, it should be only a few seconds, but in some corner cases it could be up to six hours of Firefox running. Is either of these cases a problem? We can bias the edge cases one way or the other if needed.
I'll set up a new recipe for this, and we can figure out the transition.
Reporter
Comment 9•6 years ago
We could probably just turn off the existing recipe ASAP since it's not helping. That said, the old recipe shouldn't matter once the new one is live (since disabling OMTP will outweigh any settings for controlling it when enabled), so it probably isn't a big deal.
Assignee
Comment 10•6 years ago
The new recipe is 502.
Delivery Console (needs VPN currently): https://delivery-console.prod.mozaws.net/recipe/502/
Public API: https://normandy.cdn.mozilla.net/api/v1/recipe/502/
Feel free to disable the old recipe at any time, if it isn't helping. If you're happy with the new one, it should also be ready to go.
Flags: needinfo?(mcooper)
Reporter
Comment 11•6 years ago
OK, I've disabled the old recipe and approved and published the new one. Thanks, Mike!
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Reporter
Comment 12•6 years ago
I think we need to update this recipe to make it a user pref so it's present on startup.
Flags: needinfo?(mcooper)
Assignee
Comment 13•6 years ago
Unfortunately, we can't change the branch of the preference experiment either, since users don't update after the initial enrollment. I'll create a new recipe with the change, and add one to my counter of how bad an idea immutable enrollment was.
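To illustrate why an in-place edit does not reach enrolled users, here is a minimal sketch of immutable enrollment. It is a toy model rather than Normandy's client code; `Enrollment`, `ExperimentStore`, and the slugs used in the note below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Enrollment:
    pref_name: str
    pref_value: object
    pref_branch: str  # "default" or "user"


class ExperimentStore:
    """Toy client-side store: the first enrollment for a slug wins."""

    def __init__(self) -> None:
        self._enrolled: Dict[str, Enrollment] = {}

    def apply_recipe(self, slug: str, enrollment: Enrollment) -> Enrollment:
        # An already-enrolled client keeps its original settings, even if the
        # recipe published under the same slug has since been edited.
        return self._enrolled.setdefault(slug, enrollment)
```

In this model, re-applying an edited recipe under an existing slug (say `"pt2"`) returns the originally stored enrollment unchanged, whereas a recipe under a new slug (say `"pt3"`) enrolls the client with the new branch. That is the reason a fresh recipe is needed for each change.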
Assignee
Comment 14•6 years ago
Recipe number 3 is here: https://delivery-console.prod.mozaws.net/recipe/509/
Public API: https://normandy.cdn.mozilla.net/api/v1/recipe/509/
Flags: needinfo?(mcooper)
Reporter
Comment 15•6 years ago
Part 3 is now live.
Comment 16•6 years ago
The crash rate for bug 1471892 isn't really going away. Iulia, could you check if the pref flip from recipe number 3 filters through to your installation?
Flags: needinfo?(iulia.cristescu)
Comment 17•6 years ago
https://crash-stats.mozilla.com/report/index/bca7bf69-4100-43c5-9afb-db7990180716#tab-telemetryenvironment e.g. contains:
"hotfix-omtp-61-osx-10-9-pt3-bug-1472308":{"branch":"hotfix","type":"normandy-exp"}
but the app notes still say "OMTP+1".
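A check like this one can be scripted. The sketch below assumes the experiments section of the telemetry environment is available as a JSON object mapping experiment slugs to annotations, in the shape of the fragment quoted above; `is_enrolled` is a hypothetical helper.

```python
import json

SLUG = "hotfix-omtp-61-osx-10-9-pt3-bug-1472308"  # slug quoted above


def is_enrolled(experiments_json: str, slug: str = SLUG) -> bool:
    """Return True if the experiments object lists `slug` as a Normandy experiment."""
    experiments = json.loads(experiments_json)
    entry = experiments.get(slug)
    return entry is not None and entry.get("type") == "normandy-exp"
```

This only confirms that the client reported being enrolled. As the app notes show, enrollment by itself does not prove that the pref change took effect for the session that crashed.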
Comment 18•6 years ago
(In reply to [:philipp] from comment #16)
> the crash rate for bug 1471892 isn't really going away. iulia, could you
> check if the pref flip from recipe number 3 filters through to your
> installation?
I don't see the recipe being fetched or executed on OSX 10.9 (nor on stage). What is the difference between this one and the previous one (from comment 1)? Also, the "layers.omtp.enabled" pref is still true.
Flags: needinfo?(mcooper)
Flags: needinfo?(madperson)
Flags: needinfo?(iulia.cristescu)
Assignee
Comment 19•6 years ago
The second and third recipes were only published on prod, not stage. Only the third recipe is currently active. I verified that it is in the payload being sent to clients. Looking at Telemetry events, I see about 4 million enrollment events, about 15K enrollments, and only a handful of enrollment failures.
The difference between the first recipe and the second is the pref being changed. The first used "layers.omtp.paint-workers", whereas the second used "layers.omtp.enabled". It was necessary to make a second recipe since preference experiments can't be updated in place for already enrolled users.
There are two differences between the second and third recipes. The third targets the user branch instead of the default branch (to make it compatible with a feature that requires a restart). The other change is to use a range expression for the OSX version instead of fixed versions. I'm not sure this is relevant, but the new targeting behavior is a superset of the old behavior.
Neither of these changes would affect whether or not the recipe was being fetched, since fetching happens without regard to the targeting or recipe arguments.
Iulia, can you double check that the recipe is not being fetched? Can you share your STR if you still aren't seeing the recipe?
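The default-branch versus user-branch distinction mentioned above can be sketched with a toy preference store. This is a simplified model, not Firefox's preference service: `ToyPrefs` is hypothetical, and it assumes nothing re-applies a default-branch value before the feature reads the pref at startup.

```python
class ToyPrefs:
    """Simplified model of default-branch and user-branch preference values."""

    def __init__(self, built_in_defaults: dict) -> None:
        self._built_in = dict(built_in_defaults)
        self.default = dict(built_in_defaults)  # reset on every startup
        self.user = {}                          # persisted with the profile

    def get(self, name: str):
        # A user-branch value, when present, takes precedence over the default.
        return self.user.get(name, self.default[name])

    def restart(self) -> None:
        # Runtime changes to the default branch are lost; user values survive.
        self.default = dict(self._built_in)
```

In this model, flipping `layers.omtp.enabled` on the default branch at runtime is gone again after `restart()`, while the same flip on the user branch is still in place when the next session starts, which is what a feature that only reads the pref at startup needs.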
Flags: needinfo?(mcooper)
Assignee
Comment 20•6 years ago
The notebook I used to calculate enrollment is here: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/21345/command/21365
Assignee
Comment 21•6 years ago
Iulia: I have a theory of why you didn't see the recipe. We recently changed the caching behavior of Normandy in a way that shouldn't affect this, but might. I have two requests: First, next time you try this can you run Firefox with MOZ_LOG=sync,timestamp,nsHttp:5,cache2:5 to record some extra network details? And second, if you get a run where you don't see the recipe again, please *save the profile*. It may be related to bug 1354151, which we have had a lot of trouble tracking down.
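One way to script such a run is sketched below. The MOZ_LOG value is the one requested above; the binary path, profile directory, and log file are supplied by the caller and are assumptions here, as are the helper names. Writing the log to a file via MOZ_LOG_FILE is an optional addition that the request above does not mention.

```python
import os
import subprocess
from typing import Dict, List, Optional

MOZ_LOG_MODULES = "sync,timestamp,nsHttp:5,cache2:5"  # value requested above


def build_logging_env(log_file: Optional[str] = None) -> Dict[str, str]:
    """Copy the current environment and add the MOZ_LOG settings."""
    env = dict(os.environ)
    env["MOZ_LOG"] = MOZ_LOG_MODULES
    if log_file is not None:
        env["MOZ_LOG_FILE"] = log_file  # optional: write the log to a file
    return env


def launch_firefox(binary: str, profile_dir: str,
                   log_file: Optional[str] = None) -> subprocess.Popen:
    """Start Firefox with the given profile; both paths are caller-supplied."""
    args: List[str] = [binary, "-profile", profile_dir]
    return subprocess.Popen(args, env=build_logging_env(log_file))
```

Launching against a dedicated profile directory also makes it easier to keep the profile afterwards if the recipe fails to show up.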
Flags: needinfo?(iulia.cristescu)
Comment 22•6 years ago
Michael, sorry for the late answer and thank you for the clarification!
I double checked the third recipe and today I encountered no issues.
I used exactly the same station (Mavericks 10.9.5) and the same build (61.0.1 build1 20180704003137), created new profiles, set "app.normandy.dev_mode" to "true" to run recipes immediately on startup and "app.normandy.logging.level" to "0" to enable more logging, then restarted the browser for the changes to take effect. The "Hotfix: Disable parallel OMTP for OSX 10.9 - part 3 [Bug 1472308]" recipe was shown as fetched and executed in the Browser Console, and "layers.omtp.enabled" was properly switched to "false". At this point, the crash from bug 1471892 was still reproducible. After a restart (which makes sense, since one is required for OMTP to actually be disabled), I was no longer able to reproduce the crash using the steps mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1471892#c18.
I suppose that the unsuccessful attempts from yesterday were caused by bug 1354151, but unfortunately I had already deleted the "bad" profiles, and I haven't managed to reproduce that situation since.
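For repeat runs with fresh profiles, the two test prefs can be seeded before first launch. The comment does not say how the prefs were set; writing a user.js file into the profile directory is one option, sketched here. The helper names are hypothetical, and the logging level is written as an integer, which is an assumption about the pref's type.

```python
import json
from pathlib import Path
from typing import Optional

# Prefs named in this comment.
TEST_PREFS = {
    "app.normandy.dev_mode": True,
    "app.normandy.logging.level": 0,
}


def render_user_js(prefs: dict) -> str:
    """Render prefs as user_pref(...) lines in user.js syntax."""
    lines = []
    for name, value in prefs.items():
        if isinstance(value, bool):
            literal = "true" if value else "false"
        elif isinstance(value, int):
            literal = str(value)
        else:
            literal = json.dumps(str(value))  # quoted string literal
        lines.append(f'user_pref("{name}", {literal});')
    return "\n".join(lines) + "\n"


def write_user_js(profile_dir: str, prefs: Optional[dict] = None) -> Path:
    """Write user.js into an existing profile directory and return its path."""
    path = Path(profile_dir) / "user.js"
    path.write_text(render_user_js(TEST_PREFS if prefs is None else prefs),
                    encoding="utf-8")
    return path
```

Because the prefs are then in place from the first startup, the recipe should run on that first launch, leaving only the one restart needed for the OMTP change itself.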
Flags: needinfo?(madperson)
Flags: needinfo?(iulia.cristescu)
Comment 23•6 years ago
Please let me know if additional testing is required or if there is something unclear.
Assignee
Comment 24•6 years ago
Thanks for the details Iulia. It sounds like your steps are correct, and make sense. I'm still puzzled why you ran into the problem last time you checked, but it seems like the problem has gone away now.
Reporter
Comment 25•6 years ago
So, to summarize, the recipe *does* appear to be properly taking effect after a restart. And Iulia's STR for reproducing the crash no longer appear to work afterwards. But we're still not seeing a drop in crash rate in bug 1471892?!
Assignee
Comment 26•6 years ago
This has been running for a while now. Is it still useful and necessary? Should it be ended? It is targeting Release 61 only, so its usefulness is rapidly diminishing.
Reporter
Comment 27•6 years ago
Go ahead and turn it off.
Assignee
Comment 28•6 years ago
The Normandy recipe (https://delivery-console.prod.mozaws.net/recipe/509/) has been disabled.