Closed
Bug 1156253
Opened 10 years ago
Closed 10 years ago
Use OS.File lz4 compression for archived telemetry pings
Categories
(Toolkit :: Telemetry, defect)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: gfritzsche, Assigned: Dexter)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 4 obsolete files)
(deleted),
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
gfritzsche
:
review+
|
Details | Diff | Splinter Review |
A quick check shows that the OS.File lz4 compression gives a ~70% reduction with the pings i have locally.
Slightly slower loading from the ping archive should not be a problem, so this seems like an easy win.
Comment 1•10 years ago
|
||
Yoric, are there any caveats with us OS.File's lz4 compression? I seem to remember an issue with OS.File not using a standard format
Flags: needinfo?(dteller)
Comment 2•10 years ago
|
||
we are using it for bookmark backups, the only drawback is that the user can't use a third party decompressor, that means it's not a good idea to use .lz4 extension, we used .jsonlz4 for bookmarks.
Comment 3•10 years ago
|
||
Indeed, this is not standard lz4. I'd like to add support for standard lz4 one of these days, but I have found 0 time for that so far.
Flags: needinfo?(dteller)
Comment 4•10 years ago
|
||
How about gzip instead then? We don't really want to support our own compression file format
Reporter | ||
Comment 5•10 years ago
|
||
We are already using this in other places and OS.File supports it with just providing an option to read/writeAtomic, so this would be a straight-forward solution.
What concerns would we have with lz4?
Backwards compability would already be required by other modules (experiments, bookmarks, crash manager).
Comment 6•10 years ago
|
||
As mreid pointed out in the "New Telemetry ping upload & expiry rules" doc, we already gzip pings before sending, so we might as well just save the gzip compressed stream to disk.
The drawback of using our own lz4 format is that it's harder for devs and users to inspect the saved pings.
Comment 7•10 years ago
|
||
Yoric, how hard would it be to add gzip support to read & writeAtomic? We already have the compression/decompression code in the codebase
Flags: needinfo?(dteller)
Comment 8•10 years ago
|
||
After looking at zlib.h, I believe that I can do this in a few hours for the regular path (i.e. JS code). The fast path version will probably be reasonable, too, but it will take me some time to remember exactly how that code works.
Flags: needinfo?(dteller)
Reporter | ||
Comment 9•10 years ago
|
||
Per the meeting today, we will start to use lz4 compression for now because it's trivial to do.
We will use a ".jsonlz4" suffix for the files, so we can later tell them apart.
We will look into moving over to gzip in a non-blocking follow-up bug.
Comment 10•10 years ago
|
||
fwiw, writing a decompression restartless add-on would be trivial (and less time consuming than adding gzip support to OS.File imo) and would solve all of the "not inspectable" complains.
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → alessio.placitelli
Assignee | ||
Comment 11•10 years ago
|
||
This patch adds lz4 compression for archived pings. New archived pings are compressed and old uncompressed pings are still getting loaded correctly.
It also adds test coverage for compressed/uncompressed ping loading.
Attachment #8601495 -
Flags: review?(gfritzsche)
Reporter | ||
Comment 12•10 years ago
|
||
Comment on attachment 8601495 [details] [diff] [review]
bug1156253.patch
Review of attachment 8601495 [details] [diff] [review]:
-----------------------------------------------------------------
I think it's more efficient to move the first patch from bug 1156355 over here.
Then the changes in this patch can be completely limited to the aborted session ping handling in TelemetryStorage.jsm (and tests).
::: toolkit/components/telemetry/TelemetryStorage.jsm
@@ +94,5 @@
> *
> * @param {string} id The pings id.
> * @param {number} timestampCreated The pings creation timestamp.
> * @param {string} type The pings type.
> + * @param {bool} compressed If |true|, expect the ping to be compressed.
I don't think we need to keep track of whether archived pings are compressed.
When loading we can just test through {path, path+"lz4"} and load the first that exists.
@@ +121,5 @@
> * @param {bool} overwrite If |true|, the file will be overwritten if it exists,
> * if |false| the file will not be overwritten and no error will be reported if
> * the file exists.
> + * @param {bool} compress If |true|, the file will use lz4 compression. Otherwise no
> + * compression will be used.
What is the default?
Attachment #8601495 -
Flags: review?(gfritzsche)
Reporter | ||
Updated•10 years ago
|
Status: NEW → ASSIGNED
Reporter | ||
Comment 13•10 years ago
|
||
This refactors the aborted-session file handling details into TelemetryStorage. This is relatively simple and just moving code around per the Telemetry redesign draft. Note that all use of SaveSerializer will be in TelemetryStorage after the other refactors, so i didn't take care of sharing that between modules now.
Attachment #8602211 -
Flags: review?(alessio.placitelli)
Assignee | ||
Comment 14•10 years ago
|
||
Comment on attachment 8602211 [details] [diff] [review]
Refactor aborted-session ping file details into TelemetryStorage
Review of attachment 8602211 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/TelemetryController.jsm
@@ +279,3 @@
> * @return {Promise} Promise that is resolved when the ping is saved.
> */
> + checkAbortedSessionPing: function addAbortedSessionPing() {
|function checkAbortedSessionPing|
@@ +712,4 @@
> },
>
> /**
> * Save an aborted-session ping to the pending pings and archive it.
This comment should probably be updated, along with the "@param" below.
@@ +716,5 @@
> *
> * @param {String} aFilePath The path to the aborted-session checkpoint ping.
> * @return {Promise} Promise that is resolved when the ping is saved.
> */
> + checkAbortedSessionPing: Task.async(function* addAbortedSessionPing() {
|function* checkAbortedSessionPing()|
::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +316,5 @@
> return [parseInt(io.readBytes), parseInt(io.writeBytes)];
> }
> };
>
> + /**
Nit: leading whitespace.
@@ +1887,5 @@
> yield this.savePendingPings();
> yield this._stateSaveSerializer.flushTasks();
>
> if (IS_UNIFIED_TELEMETRY) {
> + TelemetryStorage.removeAbortedSessionPing();
Can't we do this in |TelemetryController|? TelemetrySession is using TelemetryController for all the aborted-session related operations, but it's directly interacting with the storage for this.
::: toolkit/components/telemetry/TelemetryStorage.jsm
@@ +310,5 @@
> + * We are using this to synchronize saving to the file that TelemetrySession persists
> + * its state in.
> + */
> +function SaveSerializer() {
> + this._queuedOperations = [];
When are the other refactors taking place? I fear this might take some time and sharing the SaveSerializer (just exporting it) might be relatively cheap.
::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js
@@ +1355,5 @@
> yield OS.File.setDates(PENDING_PING_FILE, null, Date.now() - OVERDUE_PING_FILE_AGE);
> yield TelemetryController.reset();
>
> // Wait for the aborted-session ping.
> + while (true) {
Why is this needed/why are you changing this part?
Attachment #8602211 -
Flags: review?(alessio.placitelli)
Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #12)
> Comment on attachment 8601495 [details] [diff] [review]
> bug1156253.patch
>
> Review of attachment 8601495 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I think it's more efficient to move the first patch from bug 1156355 over
> here.
> Then the changes in this patch can be completely limited to the aborted
> session ping handling in TelemetryStorage.jsm (and tests).
>
> ::: toolkit/components/telemetry/TelemetryStorage.jsm
> @@ +94,5 @@
> > *
> > * @param {string} id The pings id.
> > * @param {number} timestampCreated The pings creation timestamp.
> > * @param {string} type The pings type.
> > + * @param {bool} compressed If |true|, expect the ping to be compressed.
>
> I don't think we need to keep track of whether archived pings are compressed.
> When loading we can just test through {path, path+"lz4"} and load the first
> that exists.
>
Ideally, in the future, we will be having the majority of compressed archived pings and a few uncompressed ones (archived before this patch lands). I think that testing through {path, path+"lz4"} would make us waste the first disk hit. Maybe we should try the lz4 first. What do you think?
Assignee | ||
Comment 16•10 years ago
|
||
Thanks for your review, Georg. I'm not keeping track of the compression status of archived pings anymore.
This stacks up on the other patch in this bug.
Attachment #8601495 -
Attachment is obsolete: true
Attachment #8602655 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 17•10 years ago
|
||
Attachment #8602676 -
Flags: review?(gfritzsche)
Reporter | ||
Comment 18•10 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #14)
> ::: toolkit/components/telemetry/TelemetryStorage.jsm
> @@ +310,5 @@
> > + * We are using this to synchronize saving to the file that TelemetrySession persists
> > + * its state in.
> > + */
> > +function SaveSerializer() {
> > + this._queuedOperations = [];
>
> When are the other refactors taking place? I fear this might take some time
> and sharing the SaveSerializer (just exporting it) might be relatively cheap.
I don't think that wins us anything. If we want to avoid duplicating we should just move the _stateSaveSerializer (and related state file handling details) over to TelemetryStorage too in a separate patch here.
> ::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js
> @@ +1355,5 @@
> > yield OS.File.setDates(PENDING_PING_FILE, null, Date.now() - OVERDUE_PING_FILE_AGE);
> > yield TelemetryController.reset();
> >
> > // Wait for the aborted-session ping.
> > + while (true) {
>
> Why is this needed/why are you changing this part?
Dropped, that was from an earlier version of the ping.
Reporter | ||
Comment 19•10 years ago
|
||
Attachment #8602707 -
Flags: review?(alessio.placitelli)
Reporter | ||
Updated•10 years ago
|
Attachment #8602211 -
Attachment is obsolete: true
Reporter | ||
Comment 20•10 years ago
|
||
Comment on attachment 8602676 [details] [diff] [review]
part 3 - Make ProfileAge use a different prefix
Review of attachment 8602676 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with below change.
::: toolkit/components/telemetry/TelemetryEnvironment.jsm
@@ +155,5 @@
>
> const EXPERIMENTS_CHANGED_TOPIC = "experiments-changed";
>
> +XPCOMUtils.defineLazyGetter(this, "gProfileAgeLogger",
> + () => Log.repository.getLoggerWithMessagePrefix(LOGGER_NAME, "ProfileAge::"));
We don't need to keep that object alive after we used the profile accessor.
Just move this into _updateProfile with a local var:
const logger = Log.repository.getLoggerWithMessagePrefix(LOGGER_NAME, "ProfileAge - "));
... new ProfileAge(null, logger);
Also note the prefix change as ProfileAge.jsm doesn't expect a "foo::" prefixed logger.
Attachment #8602676 -
Flags: review?(gfritzsche) → review+
Assignee | ||
Comment 21•10 years ago
|
||
Comment on attachment 8602707 [details] [diff] [review]
Refactor aborted-session ping file details into TelemetryStorage
Review of attachment 8602707 [details] [diff] [review]:
-----------------------------------------------------------------
Ok, there's no reason to move that right now then.
Attachment #8602707 -
Flags: review?(alessio.placitelli) → review+
Assignee | ||
Comment 22•10 years ago
|
||
Attachment #8602676 -
Attachment is obsolete: true
Attachment #8602714 -
Flags: review+
Reporter | ||
Comment 23•10 years ago
|
||
Comment on attachment 8602655 [details] [diff] [review]
part 2 - Add lz4 compression for archived pings. v2
Review of attachment 8602655 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/TelemetryStorage.jsm
@@ +166,5 @@
> * @param {bool} overwrite If |true|, the file will be overwritten if it exists,
> * if |false| the file will not be overwritten and no error will be reported if
> * the file exists.
> + * @param {bool} [compress=false] If |true|, the file will use lz4 compression.
> + * Otherwise no compression will be used.
No external caller uses that param. Please drop it from this public function.
@@ +296,5 @@
>
> /**
> * Loads a ping file.
> * @param {String} aFilePath The path of the ping file.
> + * @param {Boolean} [aCompressed=false] If |true|, expects the file to be compressed using lz4.
No external caller uses that param. Please drop it from this public function.
@@ +431,3 @@
> yield OS.File.makeDir(OS.Path.dirname(filePath), { ignoreExisting: true,
> from: OS.Constants.Path.profileDir });
> + yield TelemetryStorage.savePingToFile(ping, filePath, /*overwrite*/true, /*compress*/true);
* this.savePingToFile
* if you are worried about readability let's just have the internal savePingToFile() function use an - optional - options object?
@@ +443,5 @@
> */
> loadArchivedPing: function(id, timestampCreated, type) {
> this._log.trace("loadArchivedPing - id: " + id + ", timestampCreated: " + timestampCreated + ", type: " + type);
> const path = getArchivedPingPath(id, new Date(timestampCreated), type);
> + const pathCompressed = path + 'lz4';
Nit: this is the only time i see us using '' instead of "".
@@ +449,5 @@
> + // for the uncompressed version.
> + this._log.trace("loadArchivedPing - loading ping from: " + pathCompressed);
> + return this.loadPingFile(pathCompressed, /*compressed*/true).catch(() => {
> + this._log.trace("loadArchivedPing - compressed ping not found, loading: " + path);
> + return this.loadPingFile(path, /*compressed*/false);
We only need to try other suffixes when we didn't find the "jsonlz4" file.
If e.g. it's invalid there is something else going wrong.
A task + try/catch should make that handling more readable.
Finally, let's at least have a space between |/*compressed*/true| etc. or use an options argument.
@@ +459,5 @@
> *
> * @param {string} id The pings id.
> * @param {number} timestampCreated The pings creation timestamp.
> * @param {string} type The pings type.
> + * @param {bool} [compressed=true] If |true|, expect the ping to be compressed.
I don't think we need that. Let's just remove all instances of {pingFile, pingFile+"lz4"} that we find, otherwise we could e.g. fall back to an old uncompressed version.
@@ +467,5 @@
> + this._log.trace("_removeArchivedPing - id: " + id + ", timestampCreated: " +
> + timestampCreated + ", type: " + type + ", compressed: " + compressed);
> + let path = getArchivedPingPath(id, new Date(timestampCreated), type);
> + if (compressed) {
> + path += 'lz4';
Nit: '' vs. ""
@@ +543,5 @@
> * @param {bool} overwrite If |true|, the file will be overwritten if it exists,
> * if |false| the file will not be overwritten and no error will be reported if
> * the file exists.
> + * @param {bool} compress If |true|, the file will use lz4 compression. Otherwise no
> + * compression will be used.
As per above, maybe it's time we use an options argument for the internal savePingToFile function?
@@ +752,5 @@
>
> /**
> * Loads a ping file.
> * @param {String} aFilePath The path of the ping file.
> + * @param {Boolean} aCompressed If |true|, expects the file to be compressed using lz4.
As per above, maybe it's time we use an options argument for the internal loadPingFile function?
@@ +784,5 @@
> * Otherwise an object with the extracted data in the form:
> * { timestamp: <number>,
> * id: <string>,
> + * type: <string>,
> + * compressed: <bool> }
Do we need this anywhere?
@@ +826,5 @@
> return {
> timestamp: timestamp,
> id: uuid,
> type: type,
> + compressed: compressed,
Do we need this anywhere?
Attachment #8602655 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 24•10 years ago
|
||
As per IRC, I've left the comments and added a whitespace.
Attachment #8602655 -
Attachment is obsolete: true
Attachment #8602742 -
Flags: review?(gfritzsche)
Reporter | ||
Updated•10 years ago
|
Attachment #8602742 -
Flags: review?(gfritzsche) → review+
Assignee | ||
Comment 25•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 26•10 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/73dcdc97ecdb
https://hg.mozilla.org/integration/fx-team/rev/23bf90af772d
https://hg.mozilla.org/integration/fx-team/rev/7daf22db7a3a
Keywords: checkin-needed
Comment 27•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/73dcdc97ecdb
https://hg.mozilla.org/mozilla-central/rev/23bf90af772d
https://hg.mozilla.org/mozilla-central/rev/7daf22db7a3a
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in
before you can comment on or make changes to this bug.
Description
•