Closed
Bug 671041
Opened 13 years ago
Closed 13 years ago
batch 'o telemetry metrics for session restore
Categories
(Firefox :: Session Restore, defect)
Firefox
Session Restore
Tracking
()
RESOLVED
FIXED
Firefox 13
People
(Reporter: Dolske, Assigned: dietrich)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 5 obsolete files)
(deleted),
patch
|
zpao
:
review+
|
Details | Diff | Splinter Review |
Session Restore
# of tabs restored
Time to restore
amount of cookie data (curse you google!)
# of windows (do people use multiple windows anymore?)
Panorama
Actual usage
Number of groups / number of named groups / stacked groups
Number of tabs per group
"Move to group" context menu item usage
https://bugzilla.mozilla.org/show_bug.cgi?id=666538
Comment 1•13 years ago
|
||
Where does the session restore code live?
Comment 2•13 years ago
|
||
browser/components/sessionstore
Comment 3•13 years ago
|
||
(In reply to Justin Dolske [:Dolske] from comment #0)
> Session Restore
>
> # of tabs restored
I think # of windows & # of tabs both make sense
> Time to restore
This is a little tricky to define with cascaded session restore.
> amount of cookie data (curse you google!)
I think this was actually session storage, cookies isn't really a huge problem I think. But session storage isn't going to be too bad anymore (see bug 669603). So I think it's ok not to do this.
Some others that could be interesting...
* time until deferred session restore is triggered
* amount of form data
* how successful restoring forms is
* # of child shistory entries - that might not be too important soon (see bug 700923)
Is there anything else?
Assignee | ||
Comment 4•13 years ago
|
||
two bits i've been measuring:
* time to gather data
* time to write the data to file
should also measure the save/restore apis duration, so we know how often add-ons are using this stuff and how long it takes.
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → dietrich
Assignee | ||
Comment 5•13 years ago
|
||
Attachment #579148 -
Flags: review?(tglek)
Assignee | ||
Updated•13 years ago
|
Attachment #579148 -
Flags: review?(paul)
Comment 6•13 years ago
|
||
Comment on attachment 579148 [details] [diff] [review]
session saving IO telemetry
Review of attachment 579148 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/sessionstore/src/nsSessionStore.js
@@ +3667,5 @@
> if (this._lastSessionState)
> oState.lastSessionState = this._lastSessionState;
>
> + reportTelemetry("SESSION_RESTORE_COLLECT_DATA", Date.now() - collectStart);
> +
I'm afraid this is going to turn into a useless data point. I don't know how much value we'd get from this alone. It's going to grow as a session grows, it'll depend on if tabs have been restored or not, if aUpdateAll/pinnedOnly are true/false, number of tabs, amount of data in the tab"
Ideally we'd be able to associate all of that and have a compound data point, but telemetry doesn't work like that. The best we'll be able to tell is if somebody has _some_ abnormally large numbers. And only a single user could find that out since we can't correlate data on our side.
Soooo, let's take it for that with the understanding that it's not the most useful point on its own.
@@ +3690,2 @@
> this._writeFile(this._sessionFile, stateString.data);
> + reportTelemetry("SESSION_RESTORE_WRITE_DATA", Date.now() - writeStart);
This is going to lie, isn't it? The file is written async via NetUtil.asyncCopy. Let's do it in _writeFile where we have a callback set up already (currently notifies "sessionstore-state-write-complete")
Assignee | ||
Comment 7•13 years ago
|
||
(In reply to Paul O'Shannessy [:zpao] from comment #6)
> Soooo, let's take it for that with the understanding that it's not the most
> useful point on its own.
Not useless, it tells us if collection stage is a problem, how much, and for what proportion of users (running builds w/ telemetry). We'll get buckets which will show us what % of users have pathological collection times vs minimal times, etc.
You should think of this as malleable data. If we want to add more metrics, we can. If we're not getting anything useful, we remove it or change it.
But more specifically - what points would provide the most value? ms-per-tab? As you can see from the patch, it's not hard to instrument these things, so we can add whatever bits provide the most value.
> @@ +3690,2 @@
> > this._writeFile(this._sessionFile, stateString.data);
> > + reportTelemetry("SESSION_RESTORE_WRITE_DATA", Date.now() - writeStart);
>
> This is going to lie, isn't it? The file is written async via
> NetUtil.asyncCopy. Let's do it in _writeFile where we have a callback set up
> already (currently notifies "sessionstore-state-write-complete")
Good catch, will fix.
Comment 8•13 years ago
|
||
Comment on attachment 579148 [details] [diff] [review]
session saving IO telemetry
I recommend appending _MS to SESSION_RESTORE_COLLECT_DATA to make units more obvious.
.now() is redundant
Attachment #579148 -
Flags: review?(tglek) → review-
Assignee | ||
Comment 9•13 years ago
|
||
(In reply to Taras Glek (:taras) from comment #8)
>
> .now() is redundant
how?
Comment 10•13 years ago
|
||
(In reply to Dietrich Ayala (:dietrich) from comment #9)
> (In reply to Taras Glek (:taras) from comment #8)
> >
> > .now() is redundant
>
> how?
s/.now()// and code will still work(and according to Brendan run faster).
Updated•13 years ago
|
Attachment #579148 -
Flags: review?(paul)
Comment 11•13 years ago
|
||
Comment on attachment 579148 [details] [diff] [review]
session saving IO telemetry
Note I think we need more session store data.
* amount of time per tab in collect
* time to read data
Comment 12•13 years ago
|
||
Another set of data we need:
* size of each page serialized
* size of each tab history
* total size
Assignee | ||
Updated•13 years ago
|
Summary: batch 'o telemetry metrics for session restore, panorama → batch 'o telemetry metrics for session restore
Assignee | ||
Comment 13•13 years ago
|
||
(In reply to Taras Glek (:taras) from comment #12)
> Another set of data we need:
> * size of each page serialized
what size are you talking about?
> * size of each tab history
by size do you mean the number of history entries per tab?
> * total size
of what?
Comment 14•13 years ago
|
||
(In reply to Dietrich Ayala (:dietrich) from comment #13)
> (In reply to Taras Glek (:taras) from comment #12)
> > Another set of data we need:
> > * size of each page serialized
>
> what size are you talking about?
Sorry, that was unclear. Number of history entries serialized per tab.
>
> > * size of each tab history
>
> by size do you mean the number of history entries per tab?
No i mean amount of data written per history entry. ie sizeof(google.com), sizeof(google.com/results), sizeof(firstresult)
>
> > * total size
>
> of what?
of the session restore data written
Assignee | ||
Comment 15•13 years ago
|
||
* time to read file from disk at startup
* if the file contained parse-able JSON
* time to collect data (inclusive)
* time to serialize data to JSON string
* time to write data to disk
Attachment #579148 -
Attachment is obsolete: true
Comment 16•13 years ago
|
||
Comment on attachment 594111 [details] [diff] [review]
more
Review of attachment 594111 [details] [diff] [review]:
-----------------------------------------------------------------
Drive-by comments:
::: browser/components/sessionstore/src/nsSessionStartup.js
@@ +142,2 @@
> }
> + Services.telemetry.getHistogramById("FX_SESSION_RESTORE_CORRUPT_FILE").add(corruptFile);
Use reportTelemetry? Here and elsewhere in this file.
::: toolkit/components/telemetry/TelemetryHistograms.h
@@ +305,5 @@
> +HISTOGRAM_BOOLEAN(FX_SESSION_RESTORE_CORRUPT_FILE, "Session restore: Whether the file read on startup contained parse-able JSON")
> +HISTOGRAM(FX_SESSION_RESTORE_COLLECT_DATA_MS, 1, 30000, 10, EXPONENTIAL, "Session restore: Time to collect all window and tab data (ms)")
> +HISTOGRAM(FX_SESSION_RESTORE_SERIALIZE_DATA_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to JSON serialize session data (ms)")
> +HISTOGRAM(FX_SESSION_RESTORE_READ_FILE_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to read the session data from the file on disk (ms)")
> +HISTOGRAM(FX_SESSION_RESTORE_WRITE_FILE_MS, 1, 1000, 10, EXPONENTIAL, "Session restore: Time to write the session data to the file on disk (ms)")
The max time for these read and write ones seems a bit on the low side. Why not make them higher?
Comment 17•13 years ago
|
||
Comment on attachment 594111 [details] [diff] [review]
more
Review of attachment 594111 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/sessionstore/src/nsSessionStartup.js
@@ +142,2 @@
> }
> + Services.telemetry.getHistogramById("FX_SESSION_RESTORE_CORRUPT_FILE").add(corruptFile);
reportTelemetry doesn't exist in this file, but it could be added, but if we're going to be spattering that everywhere (there's already an identically-named-but-slightly-different function in Places), maybe we should pull it out somewhere?
::: browser/components/sessionstore/src/nsSessionStore.js
@@ +3678,5 @@
> if (this._lastSessionState)
> oState.lastSessionState = this._lastSessionState;
>
> + reportTelemetry("FX_SESSION_RESTORE_COLLECT_DATA_MS", new Date() - collectStart);
> +
Do we care to differentiate between forced update all / pinned only / using cached data? I'm guessing not since the aggregate is probably most valuable.
@@ +3700,2 @@
> this._writeFile(this._sessionFile, stateString.data);
> + }
I'll allow it :)
Assignee | ||
Comment 18•13 years ago
|
||
Attachment #594111 -
Attachment is obsolete: true
Attachment #596116 -
Flags: review?(paul)
Assignee | ||
Comment 19•13 years ago
|
||
(forgot to hg add telemetry.js, removed a dump)
Attachment #596116 -
Attachment is obsolete: true
Attachment #596118 -
Flags: review?(paul)
Attachment #596116 -
Flags: review?(paul)
Comment 20•13 years ago
|
||
Can you use TelemetryStopwatch (bug 723561) instead of the custom telemetry.js?
Assignee | ||
Comment 21•13 years ago
|
||
Comment on attachment 596118 [details] [diff] [review]
comments addressed
canceling review, gonna use the new timestamp stuff.
Attachment #596118 -
Flags: review?(paul)
Assignee | ||
Comment 22•13 years ago
|
||
1. now uses TelemetryTimestamp
2. fixed bug where corrupt file wasn't reporting
zpao: in nsSessionStartup, basically i just moved the try block to the place that actually can throw, instead of wrapping that whole section. in the status quo, any code after the evalInSandbox just wouldn't get executed.
Attachment #596118 -
Attachment is obsolete: true
Attachment #596182 -
Flags: review?(paul)
Comment 23•13 years ago
|
||
Comment on attachment 596182 [details] [diff] [review]
now with more less.
Review of attachment 596182 [details] [diff] [review]:
-----------------------------------------------------------------
TelemetryStopwatch is so much easier to read. Me gusta.
::: browser/components/sessionstore/src/nsSessionStartup.js
@@ +142,2 @@
> var s = new Cu.Sandbox("about:blank", {sandboxName: 'nsSessionStartup'});
> this._initialState = Cu.evalInSandbox("(" + iniString + ")", s);
Tangential, but we should probably just take this out sometime soon - we've been using JSON.stringify/parse for a while (bug 387859).
@@ -142,5 @@
> - // If this is a normal restore then throw away any previous session
> - if (!doResumeSessionOnce)
> - delete this._initialState.lastSessionState;
> - }
> - catch (ex) { debug("The session file is invalid: " + ex); }
This debug statement has proved helpful in the past for helping figure out that there was an error loading the file, so can you leave it in?
Attachment #596182 -
Flags: review?(paul) → review+
Assignee | ||
Comment 24•13 years ago
|
||
Paul, added more specific debug() lines for the possible failure points there.
Attachment #596182 -
Attachment is obsolete: true
Attachment #596809 -
Flags: review?(paul)
Updated•13 years ago
|
Attachment #596809 -
Flags: review?(paul) → review+
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Comment 25•13 years ago
|
||
Keywords: checkin-needed
Target Milestone: --- → Firefox 13
Comment 26•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•