Closed Bug 836010 Opened 12 years ago Closed 12 years ago

When startup is determined to be slow, tell users about ways to improve their startup time

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 21
Tracking Status
relnote-firefox --- 21+

People

(Reporter: jaws, Assigned: dao)

References

(Blocks 1 open bug)

Details

Attachments

(6 files, 2 obsolete files)

We should keep track of startup times locally on the machine. If a user encounters multiple slow startups, we should show a notification bar that says something along the lines of "It seems that your Firefox is slow to start up. ... Find ways to improve the startup speed of Firefox"
Clicking on "Find ways to improve the startup speed of Firefox" could either take the user to a SUMO article or to about:support. A SUMO page is probably better for l10n as well as ability to update on the fly.
I talked with Matej (CC'd) and he had these first ideas for the strings that could be a little more whimsical. > "Firefox seems. slow. to. start." [See how you can make it _fast_] Where "See how you can make it _fast_" is the button to the SUMO article.
Some small tweaks to comment 2 and two options of how it could be formatted: Firefox seems slow. To. Start. [Learn how to _speed it up_] Firefox seems slow... to... start. [Learn how to _speed it up_] What I like about this approach is that it relies on formatting rather than language. That means that it could still be localized easily without the punctuation for languages where it would make sense or wouldn't be appropriate.
Attached patch patch (obsolete) (deleted) — Splinter Review
Comment on attachment 708335 [details] [diff] [review] patch Review of attachment 708335 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/locales/en-US/chrome/browser/browser.properties @@ +445,5 @@ > mixedContentBlocked.unblock.accesskey = D > + > +# LOCALIZATION NOTE - %S is brandShortName > +slowStartup.message = %S seems slow… to… start. > +slowStartup.helpButton.label = Learn How to Speed It Up The casing on this feels too awkward to me. I think this should be changed to Matej's proposed "Learn how to speed it up".
We use title capitalization in buttons, so "Learn how to speed it up" would definitely be wrong.
We should consider having an easy way to exit that notification that then lets us know not to show the warning again for awhile. Perhaps changes the threshold to the current startup speed, so if Firefox gets slower then it will warn again. I'm just worried about users having this bar every time they start Firefox and not having a way to turn it off
(In reply to Dão Gottwald [:dao] from comment #6) > We use title capitalization in buttons, so "Learn how to speed it up" would > definitely be wrong. I don't think we need to be 100% strict on this. The sentence capitalization is a clear win here.
(In reply to Tyler Downer [:Tyler] from comment #7) > We should consider having an easy way to exit that notification that then > lets us know not to show the warning again for awhile. Perhaps changes the > threshold to the current startup speed, so if Firefox gets slower then it > will warn again. I'm just worried about users having this bar every time > they start Firefox and not having a way to turn it off We wouldn't show it every time they start Firefox. With the current patch it would be every fifth start, and we can increase that number if wanted. (In reply to Jared Wein [:jaws] from comment #8) > (In reply to Dão Gottwald [:dao] from comment #6) > > We use title capitalization in buttons, so "Learn how to speed it up" would > > definitely be wrong. > > I don't think we need to be 100% strict on this. The sentence capitalization > is a clear win here. The clear win isn't clear to me, and I don't see why this button should be different from others.
This was supposed to be a feature of the Firefox health report, and so I suggest that if we're going to do this we ought to make that the main entrance point.
What do you mean by main entrance point?
This looks awesome guys. I agree with Tyler though. We don't want the messaging to be annoying to the end user if they choose not to do anything about slowness or worse, they try to fix it and make no progress. Even if we aren't showing it every time, there should be an opt out option.
There may be some overlap between about:healthreport and this feature. Perhaps one day the slow start detecting drops you into about:healthreport?slowStart or something. Or perhaps part of FHR is periodically asking "is this browser healthy" and that triggers an alert if not. I dunno. I think you should blaze forward without regard for FHR at this stage and we can (possibly) merge code later. Now, if you start writing a lot of code for "am I healthy," then we should probably have a discussion on the relationship between FHR and those features so we don't accumulate too much engineering debt. mconnor: feel free to override if I'm speaking out of ignorance.
Comment on attachment 708335 [details] [diff] [review] patch Review of attachment 708335 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/components/nsBrowserGlue.js @@ +436,5 @@ > + } > + > + Services.prefs.setIntPref("browser.slowStartup.averageTime", averageTime); > + Services.prefs.setIntPref("browser.slowStartup.samples", samples); > + }, As part of FHR we implemented https://hg.mozilla.org/mozilla-central/file/default/services/datareporting/sessions.jsm. So, there is a full history of every session for the past 180 days between this API (prefs storage) and FHR (SQLite storage). Since we now have this "session recorder" API/service, I think it makes sense for the tracking logic you added here to live in that, not in nsBrowserGlue.js. This service is always running assuming FHR is enabled in the build (which it is for official builds). So, you can rely on it for this feature.
Comment on attachment 708335 [details] [diff] [review] patch Review of attachment 708335 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/components/nsBrowserGlue.js @@ +413,5 @@ > + if (!startupInfo.firstPaint) { > + Cc["@mozilla.org/timer;1"] > + .createInstance(Ci.nsITimer) > + .initWithCallback(this._trackSlowStartup.bind(this), > + 1000, Ci.nsITimer.TYPE_ONE_SHOT); We should add a max-tries amount here, so that in pathological cases we don't keep checking this infinite times.
Attached image Slow Start Icon - OS X (deleted) —
(In reply to Gregory Szorc [:gps] from comment #14) > As part of FHR we implemented > https://hg.mozilla.org/mozilla-central/file/default/services/datareporting/ > sessions.jsm. So, there is a full history of every session for the past 180 > days between this API (prefs storage) and FHR (SQLite storage). > > Since we now have this "session recorder" API/service, I think it makes > sense for the tracking logic you added here to live in that, not in > nsBrowserGlue.js. What would this buy us? Would this simplify code or make it reusable for other purposes? Skimming the JSM didn't help me answer these questions.
Attached image Slow Start Icon - Linux (deleted) —
(In reply to Dão Gottwald [:dao] from comment #19) > > Since we now have this "session recorder" API/service, I think it makes > > sense for the tracking logic you added here to live in that, not in > > nsBrowserGlue.js. > > What would this buy us? Would this simplify code or make it reusable for > other purposes? Skimming the JSM didn't help me answer these questions. * nsBrowserGlue.js would be smaller * Easier test coverage of feature (xpcshell not mochitest) * Reusable by others It's probably not worth doing at this juncture.
Attached patch patch v2 (obsolete) (deleted) — Splinter Review
I stopped relying on nsIAppStartup's firstPaint property, as I couldn't figure out when it would become available. I'm now just using the time it takes until the first browser-delayed-startup-finished notification, which actually depends on MozAfterPaint.
Attachment #708335 - Attachment is obsolete: true
Attachment #709087 - Flags: review?(felipc)
Attached image screenshot (deleted) —
Attached patch patch v2 (deleted) — Splinter Review
reverted obsolete change that I made when experimenting with how to reliably access the firstPaint property
Attachment #709087 - Attachment is obsolete: true
Attachment #709087 - Flags: review?(felipc)
Attachment #709090 - Flags: review?(felipc)
I heard Greg mentioning yesterday that the lack of firstPaint data is a known race condition. If that is simply the case and the data always seems available on the next trial at retrieving it, we should keep using firstPaint because that metric is much more understood and has historical data for us to base which threshold to use. A random front-end code change is much more likely to sway the timing of browser-delayed-startup-finished then of firstPaint, which may suddenly make lots of users see that dialog. In any case, we should also add telemetry to measure this feature and keep an eye on it to make sure it doesn't explode.
(In reply to :Felipe Gomes from comment #25) > I heard Greg mentioning yesterday that the lack of firstPaint data is a > known race condition. If that is simply the case and the data always seems > available on the next trial at retrieving it, I tried that, it didn't help. > A random front-end code change is much more likely to sway the timing of > browser-delayed-startup-finished then of firstPaint, which may suddenly make > lots of users see that dialog. I can see how e.g. an add-on could break delayedStartup such that browser-delayed-startup-finished would never fire. If how ever it fires late because delayedStartup blocks the main thread for a long time, taking this into account in the time measurement seems quite reasonable to me.
sessionRestored is recorded during sessionrestore-windows-restored observer: https://mxr.mozilla.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#646 firstPaint is recorded at https://mxr.mozilla.org/mozilla-central/source/view/src/nsViewManager.cpp#338. Not sure where that gets called during startup. Search for "StartupTimeline::Record" in MXR for all the recordings. The race condition I was referring to was sessionRestored. In FHR I am currently trying to access sessionRestored during a sessionrestore-windows-restored observer. However, it only works sometimes because observer install order isn't consistent. If you are measuring firstPaint, I'd not do slow start detection until sessionrestore-windows-restored. firstPaint will definitely be defined then. You may also be able to get away with installing a timer in final-ui-startup. Either way, you should have code to handle the case where firstPaint == -1. bsmedberg could probably give a more solid recommendation.
(In reply to Dão Gottwald [:dao] from comment #26) > I can see how e.g. an add-on could break delayedStartup such that > browser-delayed-startup-finished would never fire. If how ever it fires late > because delayedStartup blocks the main thread for a long time, taking this > into account in the time measurement seems quite reasonable to me. Fair point. Is the time for firstPaint and browser-delayed-startup-finished really similar on your tests? I feel much less confindent on setting the threshold to 1min for browser-delayed-startup-finished. Is there any good reasoning to apply and reach this number (60sec)?
(In reply to :Felipe Gomes from comment #28) > (In reply to Dão Gottwald [:dao] from comment #26) > > I can see how e.g. an add-on could break delayedStartup such that > > browser-delayed-startup-finished would never fire. If how ever it fires late > > because delayedStartup blocks the main thread for a long time, taking this > > into account in the time measurement seems quite reasonable to me. > > Fair point. Is the time for firstPaint and browser-delayed-startup-finished > really similar on your tests? Yes, the numbers seemed reasonably close. > I feel much less confindent on setting the > threshold to 1min for browser-delayed-startup-finished. Is there any good > reasoning to apply and reach this number (60sec)? I'm not sure what exactly you're asking, but if browser-delayed-startup-finished is fired after one minute, something is definitely wrong and we can count that as a slow startup. 60 seconds is a conservative threshold and I expect that we'll carefully reduce it in the future, maybe once we've shipped this for the first time.
My point is that we based the 60s by looking at the telemetry histogram for firstPaint and noticing that 99.99% of reports are on 30s or less. I'm trying to look at the same number for delayed-startup-finished (I didn't know there was telemetry for that). Right now the dashboard is too slow to load so I can't see the histogram :( But the evolution loaded and the numbers for it in the main percentile seem around 4-5sec higher than firstPaint. So maybe 60s is also reasonable for this one but let's revisit this number again after we see what's the long tail for that measurement. Can you add a telemetry probe reporting when we've showed the slow warning to the users?
> Can you add a telemetry probe reporting when we've showed the slow warning > to the users? Wouldn't this be just another (less accurate) way to catch delayed-startup-finished regressions?
Comment on attachment 709090 [details] [diff] [review] patch v2 Review of attachment 709090 [details] [diff] [review]: ----------------------------------------------------------------- (In reply to Dão Gottwald [:dao] from comment #31) > > Can you add a telemetry probe reporting when we've showed the slow warning > > to the users? > > Wouldn't this be just another (less accurate) way to catch > delayed-startup-finished regressions? Possibly, I need to think more about that. But we can leave that for a follow-up. ::: browser/components/nsBrowserGlue.js @@ +424,5 @@ > + if (averageTime > Services.prefs.getIntPref("browser.slowStartup.timeThreshold")) > + this._showSlowStartupNotification(); > + averageTime = 0; > + samples = 0; > + } getting the average of the 5 runs will trigger the dialog for an unusual spike, where what we wanted is to show the dialog for when the N previous runs were all above the threshold. The worst scenario is: samples = 0, startup takes 5 seconds for a random (non-consistent) reason. We won't show anything, and the following startups only take 50ms. We will still show the "Slow startup" warning for a really fast startup due to the previously slow one 4 runs ago.
Attachment #709090 - Flags: review?(felipc)
(In reply to :Felipe Gomes from comment #32) > getting the average of the 5 runs will trigger the dialog for an unusual > spike, where what we wanted is to show the dialog for when the N previous > runs were all above the threshold. > > The worst scenario is: samples = 0, startup takes 5 seconds for a random > (non-consistent) reason. We won't show anything, and the following startups > only take 50ms. We will still show the "Slow startup" warning for a really > fast startup due to the previously slow one 4 runs ago. This case seems too constructed to me to really worry about it. Even assuming it existed in the real world, surely a single 5 minute startup is bad enough that we shouldn't ignore it, so the fact that it would affect the average and trigger the notification is intended. Showing the notification later than the single slow startup happened would indeed be inconvenient but not wrong per se.
I'm open to consider different ways to decide when to show the notification, if anybody wants to propose something, but it shouldn't be significantly more complex than what we have.
I was thinking that the samples pref could be a text array with the 5 previous times, like: "15032,28345,32053,20154,29353" then you do a .split(","), remove the oldest and push the new value. If all values are > then the threshold, then we'd show it. What do you think? If you prefer the current version over this suggestion please request review again.
Comment on attachment 709090 [details] [diff] [review] patch v2 Yeah, I still prefer this, since I don't want all values to be above the threshold. As described in comment 33, if some startups are bad enough to push the average above the threshold, we should notify.
Attachment #709090 - Flags: review?(felipc)
I prefer the current version because the approach that comment #35 proposes has the potential to show the warning multiple times in a row, which can get too annoying.
The telemetry data for browser-delayed-startup shows 99.85% on 30s or less, but it's worth noting there's a 0.16% report for NaN.
Attachment #709090 - Flags: review?(felipc) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf1502fe8053 (In reply to Jared Wein [:jaws] from comment #37) > I prefer the current version because the approach that comment #35 proposes > has the potential to show the warning multiple times in a row, which can get > too annoying. Well, this could be avoided by removing all records when notifying. Anyway, as I said I still prefer the current approach for other reasons.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 21
Blocks: 837637
Depends on: 837640
Depends on: 838713
Our nightly users aren't the real targets here, but has anyone checked the before/after on startup time metrics to see if this moved the needle? Is support seeing any extra traffic to the SUMO page? Should I stop spamming the bug with these questions and take it to email?
SUMO never gets much nightly traffic, maybe 1-2 questions a week, so we haven't seen this. Looking on Input (which gets more nightly traffic) we see a few comments like https://input.mozilla.org/opinion/3553794, which could possibly relate to user confusion (but it honestly isn't clear enough to tell). As this hits aurora we can plan an aurora survey around this feature, and when it hits beta we'll have more feedback as well.
We should consider, instead of pointing to a SUMO article, pointing to the Firefox Reset dialog, which will fix this for the majority of users.
(In reply to Tyler Downer [:Tyler] from comment #43) > We should consider, instead of pointing to a SUMO article, pointing to the > Firefox Reset dialog, which will fix this for the majority of users. It seems like if we want users to try resetting their profile as a first step, we should make the SUMO article say so.
(In reply to Dão Gottwald [:dao] from comment #44) > It seems like if we want users to try resetting their profile as a first > step, we should make the SUMO article say so. That’s a good point. There are many steps to take that will fix Firefox if reset doesn’t work. At the same time, it seems like we can do better than simply asking user to read an article that could potentially solve slowness, in the same browser that was experiencing slowness. Putting a reset button in the UI will give user a step that has result s/he can feel right away. If startup is still slow after reset, we should give user the next thing to try in the same location. My hypothesis: asking user to read an article is going to have a lower conversion rate (ie. percentage of user who will try the solution), than if we present the solution in the form of actionable buttons. So, perhaps, a link to “Learn more…” should be given when we’ve exhausted all the possible solutions. For example, if all we have is Reset, then when startup is still slow after reset, we should show “Learn more…”
I have to agree. If we are telling users that Firefox is slow they probably just want us to fix it rather than us sending them on a troubleshooting deep dive. Although I would be a little hesitant to link directly to the Profile Reset while it is still throwing away session data.
I don't think we need to optimize the current UI too much. The long term plan (see comment 13) is that Firefox Health Report will be the more comprehensive framework for this kind of thing. The notification bar added here in this bug is a quickie short-term solution (aka "better than nothing" :). That's not to say the current UI is firmly locked-down/frozen, just that I think we'd be better off focusing on how FHR can solve this better.
Blocks: 848460
(In reply to Bram Pitoyo [:bram] from comment #45) > (In reply to Dão Gottwald [:dao] from comment #44) > > It seems like if we want users to try resetting their profile as a first > > step, we should make the SUMO article say so. > > That’s a good point. There are many steps to take that will fix Firefox if > reset doesn’t work. > Actually there isn't really anything else to do in Firefox beyond a reset. It really does talk care of most problems (except malware and hardware issues). Putting a sumo article as an inbetween step will just decrease the amount of people who successfully complete the task.
Depends on: 868711
Depends on: 868400
Blocks: 872172
Depends on: 896025
We until we can offer the reset prompt (bug 872172) can we please change the support article people are sent to? Right now that are sent to https://support.mozilla.org/kb/firefox-takes-long-time-start-up which tells people to go to https://support.mozilla.org/kb/reset-firefox-easily-fix-most-problems We should send people to https://support.mozilla.org/kb/reset-firefox-easily-fix-most-problems directly. Should I reopen this bug or file a new one?
You should file a new bug.
Blocks: 1015619
Blocks: 1379245
Blocks: 1389590
Blocks: 1545858
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: