1182638 - [e10s][telemetry] slow-script dialog appears twice as often in e10s vs non-e10s

(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #3)
> Roberto, we still need this information. Why did you cancel the NI?

Roberto commented on the other bug (bug 1182637):
> We don't have enough users not using e10s without add-ons to make any strong claims.
> Soon enough we are probably not going to have enough users to compare e10s and non e10s builds as well.
> We should make sure that at least one quarter of Nightly users still runs non-e10s builds.

Essentially, we can't compare baseline e10s vs non-e10s (users without extensions) on Nightly simply because the non-e10s pool on Nightly is super-depleted and probably very much unlike the e10s population at this point.

We could do a comparison on Aurora since it's a steady 45% e10s, but Roberto is tied up in validating the data from the FHR/Telemetry unification (higher org priority?) so sadly he won't have time to help with this in Q3.

Vladan Djeric (:vladan)

Reporter

Comment 5

•

9 years ago

Brad: why is there a sudden drop-off of E10S users on Aurora according to E10S_AUTOSTART, starting with the July 29th Aurora build?

Flags: needinfo?(blassey.bugs)

Vladan Djeric (:vladan)

Reporter

Comment 6

•

9 years ago

I asked Anthony Zhang to test your hypothesis on bug 1182637. I need to sync up with :jimm on interpreting slow-script-notice count, but either way, we're going to need another server-side analysis person for this e10s comparison in Q3

Flags: needinfo?(azhang)

Roberto Agostino Vitillo (:rvitillo)

Comment 7

•

9 years ago

Sorry about that Brad, I should have linked back to Bug 1182637.

Ultimately we should aim to run an A/B test where we randomly pick a set of users on Aurora (Beta?) to have e10s enabled (or disabled). I expect some users to revert their settings (and we can track that) but that's the only way to make sure the two populations are not biased. Current nightly data is not in a usable state and on aurora there was a sudden drop-off of E10S users which will probably make future analyses unreliable/biased.

Flags: needinfo?(rvitillo)

Jim Mathies [:jimm]

Comment 8

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #5)
> Brad: why is there a sudden drop-off of E10S users on Aurora according to
> E10S_AUTOSTART, starting with the July 29th Aurora build?

We landed a bad a11y patch on aurora that triggered the a11y restart prompt for a bunch of users despite not having an a11y client. Those users should come back with 42.

Flags: needinfo?(blassey.bugs)

Jim Mathies [:jimm]

Comment 9

•

9 years ago

(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #7)
> Sorry about that Brad, I should have linked back to Bug 1182637. Ultimately
> we should aim to run an A/B test where we randomly pick a set of users on
> Aurora (Beta?) to have e10s enabled (or disabled). I expect some users to
> revert back their settings (and we can track that) but that's the only way
> to make sure the two populations are not biased. Current nightly data is not
> in a usable state and on aurora there was a sudden drop-off of E10S users
> which will probably make future analyses likely to be unreliable/biased.

If you want to a/b on aurora with random samples, now's your chance with 42 merging! Just target 42 once it's on that channel.

Vladan Djeric (:vladan)

Reporter

Comment 10

•

9 years ago

(In reply to Jim Mathies [:jimm] from comment #9)
> If you want to a/b on aurora with random samples, now's your chance with
> 42 merging! Just target 42 once it's on that channel.

Indeed. So I want to do something like "if clientID % 2 == 0, force e10s pref on, otherwise force it off". Which e10s prefs do I need to flip?

Jim Mathies [:jimm]

Comment 11

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #10)
> (In reply to Jim Mathies [:jimm] from comment #9)
> > If you want to a/b on aurora with random samples, now's your chance with
> > 42 merging! Just target 42 once it's on that channel.
>
> Indeed. So I want to do something like "if clientID % 2 == 0, force e10s
> pref on, otherwise force it off". Which e10s prefs do I need to flip?

Let's file a fresh bug and cc/ni felipe, he's been handling all of our enabling patch work.

:Felipe Gomes (needinfo for replies!)

Comment 12

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #10)
> (In reply to Jim Mathies [:jimm] from comment #9)
> > If you want to a/b on aurora with random samples, now's your chance with
> > 42 merging! Just target 42 once it's on that channel.
>
> Indeed. So I want to do something like "if clientID % 2 == 0, force e10s
> pref on, otherwise force it off". Which e10s prefs do I need to flip?

Yeah, a fresh bug for this would be nice.

To do this you'll need to generate this clientID somehow (or is there already something that you can use?) and do this check in nsAppRunner.cpp mozilla::BrowserTabsRemoteAutostart(). There you can muck with the value of the trialPref boolean to do this A/B testing.

I'd keep the prefs as they are now (i.e., keep browser.tabs.remote.autostart.2 = true), and when it's true, perform the A/B filtering.

It'd be nice to also add a new status entry that says disabledByAB in order to properly see that on telemetry.
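For illustration only, the bucketing described above could look roughly like the following. This is a Python sketch of the decision logic, not the actual C++ in nsAppRunner.cpp. It assumes the clientID is an opaque string and hashes it before taking the parity, since "clientID % 2" only works directly on an integer. The function names and the "enabled" and "disabledByPref" labels are hypothetical; only "disabledByAB" comes from the suggestion above.

```python
import hashlib


def ab_bucket(client_id: str) -> int:
    """Map a client ID string to a stable bucket, 0 or 1.

    Hashing first is an assumption made for this sketch: it gives an
    integer to take the parity of, whatever the ID format is.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return digest[0] % 2


def e10s_status(autostart_pref: bool, client_id: str) -> str:
    """Return a status label for one profile.

    The autostart pref stays the master switch; the A/B filter is only
    applied when the pref is true.
    """
    if not autostart_pref:
        return "disabledByPref"  # hypothetical label
    if ab_bucket(client_id) == 0:
        return "enabled"  # hypothetical label
    return "disabledByAB"  # status entry name suggested in the comment


if __name__ == "__main__":
    # Made-up IDs, only to show that the assignment is stable per profile.
    for cid in ("profile-a", "profile-b", "profile-c"):
        print(cid, e10s_status(True, cid))
```

Because the bucket depends only on the clientID, a profile stays in the same arm across restarts, and the separate status label makes the A/B-disabled group distinguishable from users who turned the pref off themselves.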

Flags: needinfo?(vdjeric)

Anthony Zhang [:azhang] (last day at Mozilla: 2016-04-29)

Comment 13

•

9 years ago

Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=1182637#c6

For Aurora 41 users with buildIDs on July 28th, 2015 who have no extensions installed:
> Median difference in histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.03, (0.56, 0.53).
> The probability of this effect being purely by chance is 0.86.

Flags: needinfo?(vdjeric)

Flags: needinfo?(azhang)

Anthony Zhang [:azhang] (last day at Mozilla: 2016-04-29)

Comment 14

•

9 years ago

I seem to have cleared the needinfo on :vladan somehow...

Flags: needinfo?(vdjeric)

Vladan Djeric (:vladan)

Reporter

Updated

•

9 years ago

Depends on: 1193089

Vladan Djeric (:vladan)

Reporter

Comment 15

•

9 years ago

(In reply to :Felipe Gomes from comment #12)
> Yeah a fresh bug for this would be nice.

Filed meta bug 1193089

> To do this you'll need to generate this clientID somehow
> (or is there already something that you can use?)

Yes, Telemetry already has a clientID tied to a profile

> and do this check in nsAppRunner.cpp mozilla::BrowserTabsRemoteAutostart().
> There you can muck with the value of the trialPref boolean to do this A/B
> testing.
>
> I'd keep the prefs as they are now (i.e., keep
> browser.tabs.remote.autostart.2 = true), and when it's true, perform the A/B
> filtering.
>
> It'd be nice to also add a new status entry that says disabledByAB in order
> to properly see that on telemetry.

Good suggestions, thanks

No longer depends on: 1193089

Flags: needinfo?(vdjeric)

Vladan Djeric (:vladan)

Reporter

Updated

•

9 years ago

Depends on: 1193089

Jim Mathies [:jimm]

Updated

•

9 years ago

tracking-e10s: ? → +

Jim Mathies [:jimm]

Comment 16

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #0)
> According to Telemetry, the frequency of the slow-script dialog appearing
> roughly doubled with e10s:
> SLOW_SCRIPT_NOTICE_COUNT in
> http://nbviewer.ipython.org/urls/gist.githubusercontent.com/vitillo/cb6f1304316c1c1a2cbc/raw/e10s%20analysis.ipynb

Hey Vlad,

This data looks to be static, I see build dates specified here. Is there some way to run this report to get the latest data?

Flags: needinfo?(vladan.bugzilla)

Vladan Djeric (:vladan)

Reporter

Comment 17

•

9 years ago

Roberto & Birunthan re-ran the e10s comparison report on the Aurora A/B experiment population. These are the slow-script findings:

1) For profiles with 0 extensions:
Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.09, (e10s=0.66, single-process=0.57).
The probability of this effect being purely by chance is 0.47

2) For profiles with at least 1 extension:
Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.17, (e10s=0.48, single-process=0.31).
The probability of this effect being purely by chance is 0.08

3) For profiles with *only* AdBlock Plus installed:
Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.22, (e10s=0.46, single-process=0.23).
The probability of this effect being purely by chance is 0.23.

I'll post the analyses and add more conclusions after we're done reviewing the findings.
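The thread does not say how "the probability of this effect being purely by chance" was computed. As a hedged illustration of what such a number can mean, here is a minimal permutation test on the difference of medians, written in plain Python. It is a sketch of one common technique and not necessarily what the notebooks do; the function names and the sample values are made up for the example.

```python
import random
from statistics import median


def median_diff(a, b):
    """Difference of medians, first group minus second group."""
    return median(a) - median(b)


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Repeatedly shuffles the pooled values, re-splits them into groups of
    the original sizes, and counts how often the shuffled difference is
    at least as large (in absolute value) as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(median_diff(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = abs(median_diff(pooled[:len(a)], pooled[len(a):]))
        if shuffled >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic per-hour counts, NOT real Telemetry data.
    e10s = [0.7, 0.4, 0.9, 0.6, 0.5, 0.8, 0.3, 0.6]
    single_process = [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.3, 0.8]
    print("median difference:", median_diff(e10s, single_process))
    print("p-value estimate:", permutation_p_value(e10s, single_process))
```

A large value from a test like this (for example the 0.47 reported for profiles with 0 extensions) means that random relabeling of profiles often produces a difference at least as big as the observed one, so the observed difference is weak evidence of a real effect.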

Flags: needinfo?(vladan.bugzilla)

Vladan Djeric (:vladan)

Reporter

Comment 18

•

9 years ago

And yes, you can re-run the analysis on Aurora any time you like, via the telemetry-dash.mozilla.org Spark interface (let me know if you want links to docs). We'll also be re-running these analyses for every future experiment we run. The experiment populations are less biased than the current e10s/non-e10s split on Aurora.

Vladan Djeric (:vladan)

Reporter

Updated

•

9 years ago

Blocks: e10s-measurement

Vladan Djeric (:vladan)

Reporter

Updated

•

9 years ago

Blocks: 1222894

Jim Mathies [:jimm]

Updated

•

9 years ago

Summary: Telemetry: slow-script dialog appears twice as often in e10s vs non-e10s → [e10s][telemetry] slow-script dialog appears twice as often in e10s vs non-e10s

Jim Mathies [:jimm]

Comment 19

•

9 years ago

https://github.com/vitillo/e10s_analyses/blob/master/aurora/e10s_experiment.ipynb

35% regression: median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is -0.15, (0.42, 0.57).

Jim Mathies [:jimm]

Comment 20

•

9 years ago

Sorry, looks like the first number is e10s so I think this is actually an improvement? Vladan, can you clarify?
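To make the sign question concrete, here is a small arithmetic sketch. It assumes, as guessed above and as labeled in comment 17, that the first number in the pair is e10s and the second is single-process, and that the reported difference is e10s minus single-process. The function name is hypothetical.

```python
def compare(e10s: float, single_process: float):
    """Return (absolute difference, change relative to single-process).

    A negative difference means fewer slow-script notices per hour with
    e10s than with single-process under the assumed ordering.
    """
    diff = e10s - single_process
    relative = diff / single_process
    return diff, relative


if __name__ == "__main__":
    diff, relative = compare(0.42, 0.57)
    print(f"difference: {diff:+.2f} per hour")
    print(f"relative to single-process: {relative:+.1%}")
```

Under that reading the e10s median is lower, which would be an improvement rather than a regression. Relative to the single-process value the change is about 26%; the "35%" figure may have come from dividing 0.15 by the e10s value 0.42 instead.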

Flags: needinfo?(vladan.bugzilla)

Roberto Agostino Vitillo (:rvitillo)

Comment 21

•

9 years ago

Jim, that comparison is not statistically significant.

Flags: needinfo?(vladan.bugzilla)

Jim Mathies [:jimm]

Comment 22

•

9 years ago

Hey Roberto, in the original report, we had a regression as such:

Median difference in histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.15, (0.26, 0.11)

In the current report we have:

Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is -0.15, (0.42, 0.57)

How is the first significant but the second not?

Flags: needinfo?(rvitillo)

Roberto Agostino Vitillo (:rvitillo)

Comment 23

•

9 years ago

In both cases the probability of the effect being purely by chance is pretty high so we can't say anything for sure.

Flags: needinfo?(rvitillo)

Jim Mathies [:jimm]

Comment 24

•

9 years ago

<jimm> rvitillo: fyi I ni'd you again on bug 1182638, had another question about those numbers.
<jimm> I don't understand why the bug was filed
<rvitillo> jimm: In both cases the probability of the effect being purely by chance is pretty high so we can't say anything for sure.
<jimm> ah ok
<rvitillo> we either need more data to be sure or the difference is so small that it doesn’t really matter
<rvitillo> Running an experiment on Beta should help with that
<jimm> ok, I'm going to close that out then. we can refile if we see solid proof of a regression down the road.
<jimm> thanks!

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → WORKSFORME

Jim Mathies [:jimm]

Updated

•

9 years ago

Whiteboard: [aurora experiment][mixed addons]

Jim Mathies [:jimm]

Updated

•

9 years ago

Blocks: 1249978

Chris Peterson [:cpeterson]

Updated

•

9 years ago

Blocks: 1251545