Closed Bug 671041 Opened 13 years ago Closed 13 years ago

batch 'o telemetry metrics for session restore

Categories

(Firefox :: Session Restore, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 13

People

(Reporter: Dolske, Assigned: dietrich)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 5 obsolete files)

Session Restore # of tabs restored Time to restore amount of cookie data (curse you google!) # of windows (do people use multiple windows anymore?) Panorama Actual usage Number of groups / number of named groups / stacked groups Number of tabs per group "Move to group" context menu item usage https://bugzilla.mozilla.org/show_bug.cgi?id=666538
Where does the session restore code live?
browser/components/sessionstore
(In reply to Justin Dolske [:Dolske] from comment #0) > Session Restore > > # of tabs restored I think # of windows & # of tabs both make sense > Time to restore This is a little tricky to define with cascaded session restore. > amount of cookie data (curse you google!) I think this was actually session storage, cookies isn't really a huge problem I think. But session storage isn't going to be too bad anymore (see bug 669603). So I think it's ok not to do this. Some others that could be interesting... * time until deferred session restore is triggered * amount of form data * how successful restoring forms is * # of child shistory entries - that might not be too important soon (see bug 700923) Is there anything else?
two bits i've been measuring: * time to gather data * time to write the data to file should also measure the save/restore apis duration, so we know how often add-ons are using this stuff and how long it takes.
Assignee: nobody → dietrich
Attached patch session saving IO telemetry (obsolete) (deleted) — Splinter Review
Attachment #579148 - Flags: review?(tglek)
Attachment #579148 - Flags: review?(paul)
Comment on attachment 579148 [details] [diff] [review] session saving IO telemetry Review of attachment 579148 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/components/sessionstore/src/nsSessionStore.js @@ +3667,5 @@ > if (this._lastSessionState) > oState.lastSessionState = this._lastSessionState; > > + reportTelemetry("SESSION_RESTORE_COLLECT_DATA", Date.now() - collectStart); > + I'm afraid this is going to turn into a useless data point. I don't know how much value we'd get from this alone. It's going to grow as a session grows, it'll depend on if tabs have been restored or not, if aUpdateAll/pinnedOnly are true/false, number of tabs, amount of data in the tab" Ideally we'd be able to associate all of that and have a compound data point, but telemetry doesn't work like that. The best we'll be able to tell is if somebody has _some_ abnormally large numbers. And only a single user could find that out since we can't correlate data on our side. Soooo, let's take it for that with the understanding that it's not the most useful point on its own. @@ +3690,2 @@ > this._writeFile(this._sessionFile, stateString.data); > + reportTelemetry("SESSION_RESTORE_WRITE_DATA", Date.now() - writeStart); This is going to lie, isn't it? The file is written async via NetUtil.asyncCopy. Let's do it in _writeFile where we have a callback set up already (currently notifies "sessionstore-state-write-complete")
(In reply to Paul O'Shannessy [:zpao] from comment #6) > Soooo, let's take it for that with the understanding that it's not the most > useful point on its own. Not useless, it tells us if collection stage is a problem, how much, and for what proportion of users (running builds w/ telemetry). We'll get buckets which will show us what % of users have pathological collection times vs minimal times, etc. You should think of this as malleable data. If we want to add more metrics, we can. If we're not getting anything useful, we remove it or change it. But more specifically - what points would provide the most value? ms-per-tab? As you can see from the patch, it's not hard to instrument these things, so we can add whatever bits provide the most value. > @@ +3690,2 @@ > > this._writeFile(this._sessionFile, stateString.data); > > + reportTelemetry("SESSION_RESTORE_WRITE_DATA", Date.now() - writeStart); > > This is going to lie, isn't it? The file is written async via > NetUtil.asyncCopy. Let's do it in _writeFile where we have a callback set up > already (currently notifies "sessionstore-state-write-complete") Good catch, will fix.
Comment on attachment 579148 [details] [diff] [review] session saving IO telemetry I recommend appending _MS to SESSION_RESTORE_COLLECT_DATA to make units more obvious. .now() is redundant
Attachment #579148 - Flags: review?(tglek) → review-
(In reply to Taras Glek (:taras) from comment #8) > > .now() is redundant how?
(In reply to Dietrich Ayala (:dietrich) from comment #9) > (In reply to Taras Glek (:taras) from comment #8) > > > > .now() is redundant > > how? s/.now()// and code will still work(and according to Brendan run faster).
Comment on attachment 579148 [details] [diff] [review] session saving IO telemetry Note I think we need more session store data. * amount of time per tab in collect * time to read data
Another set of data we need: * size of each page serialized * size of each tab history * total size
Summary: batch 'o telemetry metrics for session restore, panorama → batch 'o telemetry metrics for session restore
(In reply to Taras Glek (:taras) from comment #12) > Another set of data we need: > * size of each page serialized what size are you talking about? > * size of each tab history by size do you mean the number of history entries per tab? > * total size of what?
(In reply to Dietrich Ayala (:dietrich) from comment #13) > (In reply to Taras Glek (:taras) from comment #12) > > Another set of data we need: > > * size of each page serialized > > what size are you talking about? Sorry, that was unclear. Number of history entries serialized per tab. > > > * size of each tab history > > by size do you mean the number of history entries per tab? No i mean amount of data written per history entry. ie sizeof(google.com), sizeof(google.com/results), sizeof(firstresult) > > > * total size > > of what? of the session restore data written
Attached patch more (obsolete) (deleted) — Splinter Review
* time to read file from disk at startup * if the file contained parse-able JSON * time to collect data (inclusive) * time to serialize data to JSON string * time to write data to disk
Attachment #579148 - Attachment is obsolete: true
Comment on attachment 594111 [details] [diff] [review] more Review of attachment 594111 [details] [diff] [review]: ----------------------------------------------------------------- Drive-by comments: ::: browser/components/sessionstore/src/nsSessionStartup.js @@ +142,2 @@ > } > + Services.telemetry.getHistogramById("FX_SESSION_RESTORE_CORRUPT_FILE").add(corruptFile); Use reportTelemetry? Here and elsewhere in this file. ::: toolkit/components/telemetry/TelemetryHistograms.h @@ +305,5 @@ > +HISTOGRAM_BOOLEAN(FX_SESSION_RESTORE_CORRUPT_FILE, "Session restore: Whether the file read on startup contained parse-able JSON") > +HISTOGRAM(FX_SESSION_RESTORE_COLLECT_DATA_MS, 1, 30000, 10, EXPONENTIAL, "Session restore: Time to collect all window and tab data (ms)") > +HISTOGRAM(FX_SESSION_RESTORE_SERIALIZE_DATA_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to JSON serialize session data (ms)") > +HISTOGRAM(FX_SESSION_RESTORE_READ_FILE_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to read the session data from the file on disk (ms)") > +HISTOGRAM(FX_SESSION_RESTORE_WRITE_FILE_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to write the session data to the file on disk (ms)") The max time for these read and write ones seems a bit on the low side. Why not make them higher?
Comment on attachment 594111 [details] [diff] [review] more Review of attachment 594111 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/components/sessionstore/src/nsSessionStartup.js @@ +142,2 @@ > } > + Services.telemetry.getHistogramById("FX_SESSION_RESTORE_CORRUPT_FILE").add(corruptFile); reportTelemetry doesn't exist in this file, but it could be added, but if we're going to be spattering that everywhere (there's already an identically-named-but-slightly-different function in Places), maybe we should pull it out somewhere? ::: browser/components/sessionstore/src/nsSessionStore.js @@ +3678,5 @@ > if (this._lastSessionState) > oState.lastSessionState = this._lastSessionState; > > + reportTelemetry("FX_SESSION_RESTORE_COLLECT_DATA_MS", new Date() - collectStart); > + Do we care to differentiate between forced update all / pinned only / using cached data? I'm guessing not since the aggregate is probably most valuable. @@ +3700,2 @@ > this._writeFile(this._sessionFile, stateString.data); > + } I'll allow it :)
Attached patch comments addressed (obsolete) (deleted) — Splinter Review
Attachment #594111 - Attachment is obsolete: true
Attachment #596116 - Flags: review?(paul)
Attached patch comments addressed (obsolete) (deleted) — Splinter Review
(forgot to hg add telemetry.js, removed a dump)
Attachment #596116 - Attachment is obsolete: true
Attachment #596118 - Flags: review?(paul)
Attachment #596116 - Flags: review?(paul)
Can you use TelemetryStopwatch (bug 723561) instead of the custom telemetry.js?
Comment on attachment 596118 [details] [diff] [review] comments addressed canceling review, gonna use the new timestamp stuff.
Attachment #596118 - Flags: review?(paul)
Attached patch now with more less. (obsolete) (deleted) — Splinter Review
1. now uses TelemetryTimestamp 2. fixed bug where corrupt file wasn't reporting zpao: in nsSessionStartup, basically i just moved the try block to the place that actually can throw, instead of wrapping that whole section. in the status quo, any code after the evalInSandbox just wouldn't get executed.
Attachment #596118 - Attachment is obsolete: true
Attachment #596182 - Flags: review?(paul)
Comment on attachment 596182 [details] [diff] [review] now with more less. Review of attachment 596182 [details] [diff] [review]: ----------------------------------------------------------------- TelemetryStopwatch is so much easier to read. Me gusta. ::: browser/components/sessionstore/src/nsSessionStartup.js @@ +142,2 @@ > var s = new Cu.Sandbox("about:blank", {sandboxName: 'nsSessionStartup'}); > this._initialState = Cu.evalInSandbox("(" + iniString + ")", s); Tangential, but we should probably just take this out sometime soon - we've been using JSON.stringify/parse for a while (bug 387859). @@ -142,5 @@ > - // If this is a normal restore then throw away any previous session > - if (!doResumeSessionOnce) > - delete this._initialState.lastSessionState; > - } > - catch (ex) { debug("The session file is invalid: " + ex); } This debug statement has proved helpful in the past for helping figure out that there was an error loading the file, so can you leave it in?
Attachment #596182 - Flags: review?(paul) → review+
Attached patch for check-in (deleted) — Splinter Review
Paul, added more specific debug() lines for the possible failure points there.
Attachment #596182 - Attachment is obsolete: true
Attachment #596809 - Flags: review?(paul)
Attachment #596809 - Flags: review?(paul) → review+
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: