Closed Bug 1284933 Opened 8 years ago Closed 8 years ago

Change update orphaning Spark script to ignore orphaned users who haven't run Firefox for at least 120 minutes since they became orphaned

Categories

(Toolkit :: Application Update, defect)

Type: defect
Priority: Not set
Severity: normal

RESOLVED FIXED

People

(Reporter: spohl, Assigned: spohl)

Attachments

(3 files)

The Spark script should make sure that users have run Firefox for at least 120 minutes since they became orphaned. This ensures that users should have had enough time to check for updates and initiate the download/update. This will help reduce the number of reported orphaned users that may in fact be test farms, systems that were restored from VM snapshots etc.
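A minimal sketch of the proposed check, in plain Python rather than Spark. The record layout (`Session`, `Client`, `orphaned_since`) and the sample data are hypothetical and are not taken from the actual script; only the 120 minute threshold comes from this bug.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

MIN_RUNTIME_MINUTES = 120  # threshold proposed in this bug


@dataclass
class Session:
    day: date        # day the session was recorded (hypothetical field)
    minutes: float   # runtime of that session in minutes (hypothetical field)


@dataclass
class Client:
    client_id: str
    orphaned_since: date  # day the client became orphaned (hypothetical field)
    sessions: List[Session]


def runtime_since_orphaned(client: Client) -> float:
    """Sum the runtime of all sessions on or after the day the client became orphaned."""
    return sum(s.minutes for s in client.sessions if s.day >= client.orphaned_since)


def has_enough_runtime(client: Client, threshold: float = MIN_RUNTIME_MINUTES) -> bool:
    """True if the client ran Firefox for at least `threshold` minutes since being orphaned."""
    return runtime_since_orphaned(client) >= threshold


# Made-up sample data for illustration only.
clients = [
    # Ran a lot before being orphaned, but only 30 minutes afterwards: ignored.
    Client("a", date(2016, 6, 1), [Session(date(2016, 5, 30), 300), Session(date(2016, 6, 2), 30)]),
    # 70 + 50 = 120 minutes spread over two sessions after being orphaned: counted.
    Client("b", date(2016, 6, 1), [Session(date(2016, 6, 3), 70), Session(date(2016, 6, 10), 50)]),
]

counted = [c.client_id for c in clients if has_enough_runtime(c)]
print(counted)
```

Passing a different `threshold` (for example 1440) would give a comparison report with the same logic.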
Note: for release users the update check happens once every 24 hours, so it would also be good to have a report for people that haven't run in 24 hours, to eliminate the "is downloading", "is staged", etc. states for these users, at least for comparison.
Attached file Spark changes (deleted)
Robert, I believe this takes care of your comment as well.
Attachment #8769412 - Flags: review?(rvitillo)
Attachment #8769412 - Flags: feedback?(robert.strong.bugs)
Attached file Dashboard changes (deleted)
Attachment #8769413 - Flags: review?(chutten)
Attachment #8769413 - Flags: feedback?(robert.strong.bugs)
Attached image Difference between dashboards (deleted)
I ran the new Spark job again on the report from 6/29. As this screenshot illustrates, the change in orphaned users is quite dramatic. By requiring that users run Firefox for at least 2 hours before being considered orphaned, we reduced the number of orphaned users by about 45%.
Note that the two hours stems from our KPI in the update orphaning project document. Adding the URL to the document here.
And so people don't have to open the document, our KPI is: "The percentage of orphaned users who have been running Firefox version N for at least 120 minutes since an update for version N+3 has become available should be 2% by the end of 2016."
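One possible reading of this KPI, sketched in plain Python. The assumptions here are mine and not from the project document: a client counts as orphaned when it is three or more major versions behind the latest release, the denominator is all clients in the sample, and the `(version, minutes)` tuples are made-up data.

```python
from typing import Iterable, Tuple


def is_orphaned(client_version: int, latest_version: int, gap: int = 3) -> bool:
    """Assumption: orphaned means at least `gap` major versions behind the latest release."""
    return latest_version - client_version >= gap


def orphaned_percentage(
    clients: Iterable[Tuple[int, float]],
    latest_version: int,
    min_minutes: float = 120,
) -> float:
    """Percentage of all clients that are orphaned and have run at least `min_minutes`.

    Each client is a (major_version, minutes_run_since_update_became_available) pair.
    """
    clients = list(clients)
    if not clients:
        return 0.0
    orphaned = sum(
        1
        for version, minutes in clients
        if is_orphaned(version, latest_version) and minutes >= min_minutes
    )
    return 100.0 * orphaned / len(clients)


# Made-up sample: only the second client is both 3 versions behind and over 120 minutes.
sample = [(47, 10), (44, 200), (44, 50), (47, 500)]
kpi = orphaned_percentage(sample, latest_version=47)
print(kpi)
```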
Comment on attachment 8769414 [details]
Difference between dashboards
Could you add the screenshot of the 24 hour graph as well for comparison?
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #7)
> Comment on attachment 8769414 [details]
> Difference between dashboards
>
> Could you add the screenshot of the 24 hour graph as well for comparison?
Maybe I didn't understand what you meant by the 24 hours. What is the reference point for the 24 hours, i.e. 24 hours since when? It might be easiest if you could take a look at the Spark script[1] and tell me where you were thinking that we should split these users out.
[1] https://github.com/sapohl/data-pipeline/blob/7e303d1fe167dbab61335837aca57be2f301cd5e/reports/update-orphaning/Update%20orphaning%20analysis%20using%20longitudinal%20dataset.ipynb
(In reply to Stephen A Pohl [:spohl] from comment #6)
> And so people don't have to open the document, our KPI is:
>
> "The percentage of orphaned users who have been running Firefox version N
> for at least 120 minutes since an update for version N+3 has become
> available should be 2% by the end of 2016."
Do you know how the "at least 120 minutes since an update..." was decided upon?
I think it should be reworded to specify users have downloaded the new update since we don't force the user to restart to update after the download for quite some time (depends on version when the restart is prompted) or reworded in some similar fashion.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #9)
> (In reply to Stephen A Pohl [:spohl] from comment #6)
> > And so people don't have to open the document, our KPI is:
> >
> > "The percentage of orphaned users who have been running Firefox version N
> > for at least 120 minutes since an update for version N+3 has become
> > available should be 2% by the end of 2016."
> Do you know how the "at least 120 minutes since an update..." was decided
> upon?
>
> I think it should be reworded to specify users have downloaded the new
> update since we don't force the user to restart to update after the download
> for quite some time (depends on version when the restart is prompted) or
> reworded in some similar fashion.
This is the document that we (bsmedberg, mhowell, you and I) reviewed and signed off on...
What I mean by 24 hours is to have the same graph as you have for 120 minutes and instead of it being 120 minutes it would be for 1440 minutes.
(In reply to Stephen A Pohl [:spohl] from comment #10)
> (In reply to Robert Strong [:rstrong] (use needinfo to contact me) from
> comment #9)
> > (In reply to Stephen A Pohl [:spohl] from comment #6)
> > > And so people don't have to open the document, our KPI is:
> > >
> > > "The percentage of orphaned users who have been running Firefox version N
> > > for at least 120 minutes since an update for version N+3 has become
> > > available should be 2% by the end of 2016."
> > Do you know how the "at least 120 minutes since an update..." was decided
> > upon?
> >
> > I think it should be reworded to specify users have downloaded the new
> > update since we don't force the user to restart to update after the download
> > for quite some time (depends on version when the restart is prompted) or
> > reworded in some similar fashion.
>
> This is the document that we (bsmedberg, mhowell, you and I) reviewed and
> signed off on...
I know and it was reviewed very quickly since I wasn't given much time from creation to sign off and didn't have much time during that short period to review it. Do you know why 120 minutes was chosen?
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #11)
> What I mean by 24 hours is to have the same graph as you have for 120
> minutes and instead of it being 120 minutes it would be for 1440 minutes.
Isn't a check executed almost immediately when a user starts Firefox after not having run it for 24 hours? The 120 minutes refer to actual runtime, which could be spread over several days or even weeks if the user isn't using Firefox much. Are you asking that we look at 24 hours of runtime? I suspect that this would give the (false) impression that we have barely any orphaned users at all.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #12)
> Do you know why 120 minutes was chosen?
2 hours of runtime over the 12 weeks after users have become orphaned seemed like an amount of time after which we could reasonably expect that an update check has occurred, so that we can analyze what state these users are in (via UPDATE_CHECK_CODE_NOTIFY).
I misunderstood the direction you were taking.
For clarity, I am concerned about people that have it downloaded and/or staged as up to date. So, someone that seldom runs Firefox that runs, downloads (stages also if it is set to and they are able to), and then exits without restarting or restarts without it being immediately reported to telemetry. There may be other cases but that is the main one.
I think I was mistaken regarding the 24 hours. I think that the versions in question don't ask the user to restart for 48 hours. Having said that, running Firefox for 120 minutes in total is likely good enough as long as there are multiple runs to show they have restarted to apply the update.
Note: the reduction of 45% when only reporting users that run Firefox for 120 minutes during the period may be due to users with slow connections. It would be interesting to see what the difference is when using 240 minutes instead of 120, as well as what the spread is using 5 minute intervals. When I did that for the stub installer, most of the users that didn't download cancelled within the 1st minute. This could be similar, in that the majority of the users that fall into that 45% may only be using Firefox for a few minutes in total during the period. If that is the case, we might want to make using BITS to perform the download while Firefox is not running a higher priority to lessen orphaning.
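The spread suggested above (5 minute intervals, and 120 versus 240 minutes) could be explored along these lines. This is only a sketch in plain Python: the function names and the `runtimes` values are made up for illustration and are not results from the Spark job.

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_runtimes(minutes_list: Iterable[float], width: int = 5) -> Dict[int, int]:
    """Count clients per bucket; each key is the lower bound of a `width` minute interval."""
    counts = Counter(int(m // width) * width for m in minutes_list)
    return dict(sorted(counts.items()))


def count_at_least(minutes_list: Iterable[float], threshold: float) -> int:
    """Number of clients whose total runtime meets the threshold."""
    return sum(1 for m in minutes_list if m >= threshold)


# Hypothetical total runtimes (minutes) of orphaned clients over the period.
runtimes = [0.5, 3, 4, 12, 118, 121, 239, 240, 600]

histogram = bucket_runtimes(runtimes)
at_120 = count_at_least(runtimes, 120)
at_240 = count_at_least(runtimes, 240)

for lower_bound, count in histogram.items():
    print(f"{lower_bound:>4}-{lower_bound + 5:<4} min: {count}")
print("at least 120 minutes:", at_120)
print("at least 240 minutes:", at_240)
```

A large count in the lowest buckets would support the idea that most of the excluded users ran Firefox for only a few minutes in total.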
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #14)
> I misunderstood the direction you were taking.
>
> For clarity, I am concerned about people that have it downloaded and or
> staged as up to date.
as not reported as up to date.
Attachment #8769412 - Flags: review?(rvitillo) → review-
Attachment #8769413 - Flags: review?(chutten) → review+
Comment on attachment 8769412 [details]
Spark changes
Could you please change the comment "2 hours guarantees that a user should at least have detected an update by now." to something like:
2 hours so most systems have had a chance to perform an update check, download the update, and restart Firefox after the update has been downloaded.
Attachment #8769412 - Flags: feedback?(robert.strong.bugs) → feedback+
Comment on attachment 8769413 [details]
Dashboard changes
lgtm.
Could you run a one-off report so it is possible to see a breakdown of how long the users ran Firefox in the 12 week period for the systems in the 45% reduction?
Attachment #8769413 - Flags: feedback?(robert.strong.bugs) → feedback+
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #16)
> Comment on attachment 8769412 [details]
> Spark changes
>
> Could you please change the comment "2 hours guarantees that a user should
> at least have detected an update by now." to something like
>
> 2 hours so most systems have had a chance to perform an update check,
> download the update, and restart Firefox after the update has been
> downloaded.
Changed and rebased patch.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #17)
> Comment on attachment 8769413 [details]
> Dashboard changes
>
> lgtm.
>
> Could you run a one-off report so it is possible to see a breakdown of how
> long the users ran Firefox in the 12 week period for the systems in the 45%
> reduction?
We should handle this in a separate bug. We will need to figure out the time ranges that we want to group users into etc. and I would like to keep this separate from this bug here.
This has landed.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED