Closed
Bug 828720
Opened 12 years ago
Closed 12 years ago
FHR performance Telemetry
Categories
(Firefox Health Report Graveyard :: Client: Desktop, defect)
Firefox Health Report Graveyard
Client: Desktop
Tracking
(Not tracked)
VERIFIED
FIXED
Firefox 21
People
(Reporter: vladan, Assigned: gps)
References
Details
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
rnewman
:
review+
vladan
:
review+
|
Details | Diff | Splinter Review |
Please add Telemetry probes to evaluate performance impact of FHR operations.
We should start by measuring the following:
- Time to do an FHR submission from start to end
- Time to read in data from previous days
- The size of FHR payloads
- Time to write out the backup file of an FHR submission
- Shutdown delay added -> I think this should be straightforward with the event loop spinning model, you can just time _waitForShutdown
Assignee | ||
Comment 1•12 years ago
|
||
Could you please point me to code in tree that does something similar so I can borrow best practices?
Reporter | ||
Comment 2•12 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #1)
> Could you please point me to code in tree that does something similar so I
> can borrow best practices?
We have documentation here: https://developer.mozilla.org/en-US/docs/Performance/Adding_a_new_Telemetry_probe
The FX_SESSION_RESTORE_* histograms look pretty good http://mxr.mozilla.org/mozilla-central/source/browser/components/sessionstore/src/SessionStore.jsm#3718
Comment 3•12 years ago
|
||
- Should report size of fhr db
Assignee | ||
Comment 4•12 years ago
|
||
Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsITelemetry.getHistogramById] Stack trace: resource://gre/modules/TelemetryStopwatch.jsm:121 < onResult()@resource://gre/modules/services/healthreport/healthreporter.jsm:599
I performed a |make -C toolkit/components/telemetry| after updating the JSON file. Do I need to relink libxul or something?
Reporter | ||
Comment 5•12 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #4)
> Created attachment 702537 [details] [diff] [review]
> Telemetry first stab
>
> Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE)
> [nsITelemetry.getHistogramById] Stack trace:
> resource://gre/modules/TelemetryStopwatch.jsm:121 <
> onResult()@resource://gre/modules/services/healthreport/healthreporter.jsm:
> 599
>
> I performed a |make -C toolkit/components/telemetry| after updating the JSON
> file. Do I need to relink libxul or something?
If you're adding new probes or changing the order of probes, you need to rebuild the users of these Telemetry probes. You'll also need to remake toolkit/library to rebuild xul.dll
Assignee | ||
Comment 6•12 years ago
|
||
I got most of the requested probes. The two I missed are:
* Compressed payload size
* Database size
The former will require some invasion in the generic BagheeraClient type. I'm not sure if the best way to get at the latter. nsIStorageConnection.dbFile.size? Doesn't that block while performing a stat()? How often should we collect database size?
I'm fine pushing the other two to a follow-up bug if Perf Team wants those in this patch in Nightly ASAP.
Anyway, rnewman gets code. vladan gets probe sanity (I confess to not putting too much thought into things like n_buckets).
Assignee: nobody → gps
Attachment #702537 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #702598 -
Flags: review?(vdjeric)
Attachment #702598 -
Flags: review?(rnewman)
Comment 7•12 years ago
|
||
Comment on attachment 702598 [details] [diff] [review]
Telemetry probes for FHR, v1
Review of attachment 702598 [details] [diff] [review]:
-----------------------------------------------------------------
General comment: lift all those magic constants out into nice, brief *consts*. We've already been bitten once by typos, and it makes it much easier to audit those histogram entries against the code.
::: services/healthreport/healthreporter.jsm
@@ +121,5 @@
> this._shutdownInitiated = false;
> this._shutdownComplete = false;
> this._shutdownCompleteCallback = null;
>
> + TelemetryStopwatch.start("HEALTHREPORT_INIT_MS", this);
Filed bug 831115 to cover the incorrect documentation for start() and stop().
I can only assume that this call is correct, because the telemetry docs and code disagree.
@@ +437,4 @@
> this._shutdownCompleteCallback = Async.makeSpinningCallback();
> this._shutdownCompleteCallback.wait();
> this._shutdownCompleteCallback = null;
> + TelemetryStopwatch.finish("HEALTHREPORT_SHUTDOWN_DELAY_MS", this);
Remember that wait() can throw. You want a try..finally here.
::: toolkit/components/telemetry/Histograms.json
@@ +2584,5 @@
> + "description": "Time (ms) that Firefox Health Report delays application shutdown by."
> + },
> + "HEALTHREPORT_GENERATE_JSON_PAYLOAD_MS": {
> + "kind": "exponential",
> + "high": "30000",
Why are these strings? (I see they're strings elsewhere in the file, but…)
Attachment #702598 -
Flags: review?(rnewman) → review+
Reporter | ||
Comment 8•12 years ago
|
||
Comment on attachment 702598 [details] [diff] [review]
Telemetry probes for FHR, v1
>+ "HEALTHREPORT_INIT_MS": {
>+ "kind": "exponential",
>+ "high": "30000",
>+ "n_buckets": 20,
>+ "description": "Time (ms) spent to initialize Firefox Health Report service."
>+ },
Some of the "highs" seem too high, I imagine anything more than 10 seconds would be extremely rare.
>+ "HEALTHREPORT_PAYLOAD_UNCOMPRESSED_BYTES": {
>+ "kind": "linear",
>+ "high": "2000000",
>+ "n_buckets": 200,
>+ "description": "Size (in bytes) of the raw Health Report payload."
>+ },
You probably want 202 buckets here. You always get buckets 0, 1, and MAX.
>+ try {
>+ yield this._saveLastPayload(payload);
>+ } catch (ex) {
>+ TelemetryStopwatch.cancel("HEALTHREPORT_SAVE_LAST_PAYLOAD_MS", this);
>+ throw ex;
>+ } finally {
>+ TelemetryStopwatch.finish("HEALTHREPORT_SAVE_LAST_PAYLOAD_MS", this);
>+ }
So this would call both cancel & finish in case of an error? Why not put the "finish" call after the yield?
Look at PLACES_DATABASE_FILESIZE_MB for DB size. It will be an additional stat but presumably that block will be in the disk cache after the DB is opened.
Attachment #702598 -
Flags: review?(vdjeric) → review+
Reporter | ||
Comment 9•12 years ago
|
||
.. and OS.File can do the stat off the main thread
Assignee | ||
Comment 10•12 years ago
|
||
I'll probably bake an OS.File-based .size() API into Sqlite.jsm to implement the database size telemetry probe. Follow-up bug.
I'll land this patch with review feedback soon. Been busy with other bugs.
Assignee | ||
Comment 11•12 years ago
|
||
Whiteboard: [fixed in services]
Assignee | ||
Comment 12•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in services]
Target Milestone: --- → mozilla21
Comment 13•12 years ago
|
||
Verified that I see these healthreport metrics gathered in about:telemetry using Aurora 21.0a2 2013-03-27.
Status: RESOLVED → VERIFIED
Updated•12 years ago
|
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
Target Milestone: mozilla21 → Firefox 21
Updated•6 years ago
|
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•