836719 - Perf review for FHR metrics/

This feature is too large for me to be able to perform a review in one pass, so I will start by asking a few questions on the design of metrics/. My apologies in advance if the questions have obvious answers, I'm trying to enter into that code.

Attachment #710214 - Flags: feedback?(rnewman)

Attachment #710214 - Flags: feedback?(gps)

Gregory Szorc [:gps]

Comment 2

•

12 years ago

Comment on attachment 710214 [details] [diff] [review] First look at metrics/ Review of attachment 710214 [details] [diff] [review]: ----------------------------------------------------------------- It sounds like most the comments are related to missing docs. Sorry about that. Sadly we had to skimp on the docs to meet deadlines. I have a patch or two in my patch queue that add docs. Just need to get things checked in... ::: services/metrics/collector.jsm @@ +73,5 @@ > } > > + // registerProvider + _popAndInitProvider looks quite complicated > + // and seems to duplicate promises. Why not something along the lines of > + // this._providerInitQueue = this._providerInitQueue.then(...) If you chain like this, won't an early rejection cause subsequent queued promises to not execute? We explicitly ignore rejections so failures are isolated. ::: services/metrics/dataprovider.jsm @@ +8,5 @@ > > this.EXPORTED_SYMBOLS = [ > "Measurement", > "Provider", > +// Nit: No final ',' Our (services) coding convention *always* uses a trailing comma in arrays and objects. The main reason is it allows copies, deletes, moves, and adds of elements in the container without having to touch surrounding lines. This makes editing much easier (yank whole lines instead of partial) and also reduces the burden of review since changed lines are fewer. Furthermore, the final "," is harmless, so there is no potential for harm. We think the benefits of the trailing comma have an obvious productivity win and encourage others to adopt this convention. @@ +53,5 @@ > * FUTURE: provide hook points for measurements to supplement with custom > * storage needs. > */ > this.Measurement = function () { > +// Nit: Could you name the constructor? This simplifies debugging and profiling. As noted in #perf, we stopped naming our functions after http://javascript-reverse.tumblr.com/post/34328529526/javascript-function-name-inference-aka-stop-function because less typing == win. As also noted in #perf, apparently the profiler doesn't support unnamed functions yet. We weren't aware of this limitation until today. Unfortunately, the horse has left the stable. I suggest the profiler gain this feature ASAP. @@ +87,5 @@ > +// one. Am I correct? > +// > +// If so, wouldn't it be interesting to simplify the code > +// until you have clients that need another serialization > +// format? We are planning on adding protocol buffer serialization in the near future. We decided to bite the technical bullet while the code was fresh in our heads and before we'd have to update many providers. While YAGNI could apply, we still intend to ship protocol buffers so I'd like to retain this. If nothing else, it does make testing a bit easier (less monkeypatching). @@ +94,4 @@ > > Measurement.prototype = Object.freeze({ > +// Shouldn't this be |this.Measurement.prototype|? > +// (same comment applies to all your prototypes) Stylistically I can see where you are coming from. But it doesn't matter since "this" here is the global/module scope. @@ +148,5 @@ > */ > serializer: function (format) { > +// I am wary of |JSON.stringify|, as it is bound to be executed on the > +// main thread and this might jank on large data sets. Do you have plans > +// to offer something asynchronous? This is the primary reason why I made so much noise in bug 827852 comment #3 which led to your filing of bug 832664. I also made noise a few months back when we wrote this code. Sadly there is nothing we can do until better APIs are made available. We can only feel better knowing that this will only occur at most once every 24h. It's worth noting that if we switch to protocol buffers we will presumably have more opportunities to not jank. @@ +213,5 @@ > +// > +// From the top of my head, here are a few possible alternatives: > +// - "raw" sql, in which providers add new tables to the schema using sqlite; > +// - using something without schema, e.g. json; or > +// - using something with lightweight schemas, e.g. IndexedDB. You should read storage.jsm. We want strong typing for validation and to foster an easier transition to protocol buffers. The idea of "raw" SQL had occurred to us. We have intentionally avoided it because a) there is yet no strong requirement for it b) we feel that centralizing common storage requirements will cut down on overall loc c) it would likely bloat the database d) it will reduce complexity and bugs in providers. We strongly prefer to solve problems generically in storage.jsm where they can be leveraged by all providers. As for JSON, the Perf Team told us we should use SQLite (mainly as opposed to file I/O). We figured using SQLite to store JSON seemed like an abuse of a database. While we do shoehorn JSON in the database in a few places (notably add-ons), it is the exception to the rule. We never gave IndexedDB a thorough consideration because the benefits of SQLite (notably the power of SQL) combined with blessing from the Perf Team seemed like the ideal solution. @@ +417,5 @@ > * in the array are the type functions, not instances. Instances of the > * `Measurement` are created at run-time by the `Provider` and are bound > * to the provider and to a specific storage backend. > */ > +// If I understand the code correctly, it should also define |onInit| and |onShutdown|. Nope. These are optional. @@ +469,5 @@ > */ > initPreferences: function (branchParent) { > +// I do not understand why we need to handle the provider's preferences. > +// Shouldn't they do that by themselves? > +// Also, it is unclear to me who calls |initPreferences| We figured it would be easier for preferences to "just work." The upside is all preferences live in the same branch. The downside is an extra Preferences object in memory. In reality, not many providers use this. So maybe we could lazy-load the prefs interface. This function is currently called by the entity constructing the Provider. Provider instances aren't meant to be user-instantiated. @@ +603,5 @@ > }); > }, > > _dateToDays: function (date) { > +// Nit: Why is this in the prototype? It is commonly used. I figured it was easier to put directly on the prototype than in some random exported symbol.

Attachment #710214 - Flags: feedback?(gps)

(dormant account)

Comment 3

•

12 years ago

(In reply to Gregory Szorc [:gps] from comment #2) > > The idea of "raw" SQL had occurred to us. We have intentionally avoided it > because a) there is yet no strong requirement for it b) we feel that > centralizing common storage requirements will cut down on overall loc c) it > would likely bloat the database d) it will reduce complexity and bugs in > providers. We strongly prefer to solve problems generically in storage.jsm > where they can be leveraged by all providers. > > As for JSON, the Perf Team told us we should use SQLite (mainly as opposed > to file I/O). We figured using SQLite to store JSON seemed like an abuse of > a database. While we do shoehorn JSON in the database in a few places > (notably add-ons), it is the exception to the rule. > > We never gave IndexedDB a thorough consideration because the benefits of > SQLite (notably the power of SQL) combined with blessing from the Perf Team > seemed like the ideal solution. I should clarify this. I recommended sqlite exclusively on the basis of robust persistence. I believe I warned against overusing it. I never suggested that one try to use sqlite to enforce data correctness. I would also never recommend a relational data storage model for performance. I envisioned FHR using sqlite as a simple key/value store. I was reluctant to recommend sqlite because it could result in badness, unfortunately sqlite overuse happened realizing my fears.

Brendan Eich [:brendan]

Comment 4

•

12 years ago

From comment 2: > ::: services/metrics/dataprovider.jsm > @@ +8,5 @@ > > > > this.EXPORTED_SYMBOLS = [ > > "Measurement", > > "Provider", > > +// Nit: No final ',' > > Our (services) coding convention *always* uses a trailing comma in arrays and objects. > The main reason is [good]. Agreed. Yoric, you may be thinking of C and C++, which formally ban trailing comma where some compilers allow it. ECMA-262 absolutely supports optional single trailing comma in array literal, for the good reasons given. This is good style, not bad style or nit fodder. /be

(dormant account)

Comment 5

•

12 years ago

(In reply to Taras Glek (:taras) from comment #3) > I envisioned FHR using sqlite as a simple key/value store. I was reluctant > to recommend sqlite because it could result in badness, unfortunately sqlite > overuse happened realizing my fears. I checked irc logs. Looks like I did not warn against overusing sqlite. I'm sorry, I meant to do that.

David Teller [:Yoric] - still alive but not very active

Assignee

Comment 6

•

12 years ago

(In reply to Gregory Szorc [:gps] from comment #2) > ::: services/metrics/collector.jsm > @@ +73,5 @@ > > } > > > > + // registerProvider + _popAndInitProvider looks quite complicated > > + // and seems to duplicate promises. Why not something along the lines of > > + // this._providerInitQueue = this._providerInitQueue.then(...) > > If you chain like this, won't an early rejection cause subsequent queued > promises to not execute? We explicitly ignore rejections so failures are > isolated. Well, not if you do something along the lines of: this._providerInitQueue = this._providerInitQueue.then( function doInit() { } ).then(null, function handleInitError() { } ); > > ::: services/metrics/dataprovider.jsm > @@ +8,5 @@ > > > > this.EXPORTED_SYMBOLS = [ > > "Measurement", > > "Provider", > > +// Nit: No final ',' > > Our (services) coding convention *always* uses a trailing comma in arrays > and objects. The main reason is it allows copies, deletes, moves, and adds > of elements in the container without having to touch surrounding lines. Ok, makes sense. I was accustomed to avoid this because some browsers don't like it, but that's probably a non-issue on m-c. > @@ +53,5 @@ > > * FUTURE: provide hook points for measurements to supplement with custom > > * storage needs. > > */ > > this.Measurement = function () { > > +// Nit: Could you name the constructor? This simplifies debugging and profiling. [...] > Unfortunately, the horse has left the stable. I suggest the profiler gain > this feature ASAP. Yes, definitely. And thanks for the info. > > @@ +87,5 @@ > > +// one. Am I correct? > > +// > > +// If so, wouldn't it be interesting to simplify the code > > +// until you have clients that need another serialization > > +// format? > > We are planning on adding protocol buffer serialization in the near future. > We decided to bite the technical bullet while the code was fresh in our > heads and before we'd have to update many providers. > > While YAGNI could apply, we still intend to ship protocol buffers so I'd > like to retain this. If nothing else, it does make testing a bit easier > (less monkeypatching). I have some (limited) reservations about putting this in a v1. As far as I can tell, everything on top of this mechanism (|getJSONPayload|, |_uploadData|, etc.) hardcodes the kind of serialization at a higher level, so this looks like you are going to need some rewrites anyway to make use of the feature. I will let you balance these reservations against your confidence that you are going to need it soon. Your call. > > @@ +94,4 @@ > > > > Measurement.prototype = Object.freeze({ > > +// Shouldn't this be |this.Measurement.prototype|? > > +// (same comment applies to all your prototypes) > > Stylistically I can see where you are coming from. But it doesn't matter > since "this" here is the global/module scope. By the way, I haven't checked: is that also true on B2G with the B2G module hack? > > @@ +148,5 @@ > > */ > > serializer: function (format) { > > +// I am wary of |JSON.stringify|, as it is bound to be executed on the > > +// main thread and this might jank on large data sets. Do you have plans > > +// to offer something asynchronous? > > This is the primary reason why I made so much noise in bug 827852 comment #3 > which led to your filing of bug 832664. I also made noise a few months back > when we wrote this code. > > Sadly there is nothing we can do until better APIs are made available. We > can only feel better knowing that this will only occur at most once every > 24h. At the moment, my concern is whether your code will be easy to move to something async. Given that |serializer| looks like a public API, and given that it returns something synchronous, I am wondering whether you will be able to move without breaking some public APIs. > It's worth noting that if we switch to protocol buffers we will presumably > have more opportunities to not jank. Good to know. > > @@ +213,5 @@ > > +// > > +// From the top of my head, here are a few possible alternatives: > > +// - "raw" sql, in which providers add new tables to the schema using sqlite; > > +// - using something without schema, e.g. json; or > > +// - using something with lightweight schemas, e.g. IndexedDB. > > You should read storage.jsm. I must confess that I have attempted to, and that I have given up around line 50 of your 250 loc schema. Actually, the complexity of this schema is one of the main reasons for this question. > We want strong typing for validation and to foster an easier transition to > protocol buffers. > > The idea of "raw" SQL had occurred to us. We have intentionally avoided it > because a) there is yet no strong requirement for it b) we feel that > centralizing common storage requirements will cut down on overall loc c) it > would likely bloat the database d) it will reduce complexity and bugs in > providers. We strongly prefer to solve problems generically in storage.jsm > where they can be leveraged by all providers. At this stage, I do not understand which problems are solved by your SQL schema or where you have strong typing involved (unless perhaps you count foreign keys as a kind of strong typing, which would make sense). Can you point me to a part of the documentation that would help me convince myself? > As for JSON, the Perf Team told us we should use SQLite (mainly as opposed > to file I/O). We figured using SQLite to store JSON seemed like an abuse of > a database. While we do shoehorn JSON in the database in a few places > (notably add-ons), it is the exception to the rule. I understand your rejection of JSON in a database, although I suspect it makes sense in the context. Of course, this depends very much on the question of exactly which problems your storage solves/needs to solve. > We never gave IndexedDB a thorough consideration because the benefits of > SQLite (notably the power of SQL) combined with blessing from the Perf Team > seemed like the ideal solution. Well, IndexedDB is basically a variant on JSON in sqlite, so the same arguments pro/against apply. > @@ +417,5 @@ > > * in the array are the type functions, not instances. Instances of the > > * `Measurement` are created at run-time by the `Provider` and are bound > > * to the provider and to a specific storage backend. > > */ > > +// If I understand the code correctly, it should also define |onInit| and |onShutdown|. > > Nope. These are optional. Fair enough. Whenever you get around to your next round of documentation, I suggest you mention |onInit| and |onShutdown| as possible hooks at this point in the documentation. > @@ +469,5 @@ > > */ > > initPreferences: function (branchParent) { > > +// I do not understand why we need to handle the provider's preferences. > > +// Shouldn't they do that by themselves? > > +// Also, it is unclear to me who calls |initPreferences| > > We figured it would be easier for preferences to "just work." The upside is > all preferences live in the same branch. The downside is an extra > Preferences object in memory. > > In reality, not many providers use this. So maybe we could lazy-load the > prefs interface. I guess I will have to wait until I see the higher layers to understand this "just work" factor. For the moment, I see lines of code that I nag me a little because it looks like the feature is not really needed at this layer and could be removed for the purpose of minimizing the scope of the module. > This function is currently called by the entity constructing the Provider. > Provider instances aren't meant to be user-instantiated. Ok. In your next round of documentation, could you add one line to that effect? > @@ +603,5 @@ > > }); > > }, > > > > _dateToDays: function (date) { > > +// Nit: Why is this in the prototype? > > It is commonly used. I figured it was easier to put directly on the > prototype than in some random exported symbol. I don't get it. Why should this be exported at all?

David Teller [:Yoric] - still alive but not very active

Assignee

Updated

•

12 years ago

Summary: Perf review for FHR → Perf review for FHR metrics/

David Teller [:Yoric] - still alive but not very active

Assignee

Updated

•

12 years ago

Blocks: 839822

David Teller [:Yoric] - still alive but not very active

Assignee

Updated

•

12 years ago

Blocks: 841323

David Teller [:Yoric] - still alive but not very active

Assignee

Comment 7

•

12 years ago

The Perf team had too many uncollated notes about FHR, so we had a meeting to exchange notes and ideas. During the meeting, I believe that we have come up with an alternative design for storage that might satisfy everybody. I will let vladan and gps determine together whether this design is applicable. The main point is that sqlite is extremely tailored for data safety. This makes it a great tool for data that we cannot afford to lose no matter what happens, but this also makes it over-expensive and over-complex for data that we could afford to lose in some rare cases. It is our understanding that FHR does not actually need this extreme level of data safety. Once we loosen this restriction, many things can be simplified or moved further off the main thread, hence removing many opportunities for jank.

David Teller [:Yoric] - still alive but not very active

Assignee

Updated

•

12 years ago

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Mike Connor [:mconnor]

Updated

•

12 years ago

Component: Metrics and Firefox Health Report → Client: Desktop

Product: Mozilla Services → Firefox Health Report

Richard Newman [:rnewman]

Updated

•

11 years ago

Attachment #710214 - Flags: feedback?(rnewman)

BMO Automation

Updated

•

6 years ago

Product: Firefox Health Report → Firefox Health Report Graveyard

Bugzilla

Perf review for FHR metrics/

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

Tracking

(Not tracked)

People

(Reporter: Yoric, Assigned: Yoric)

References

Details

(Keywords: perf, Whiteboard: [Snappy:P1])

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Updated

Updated

Updated

Comment 7

Updated

Updated

Updated

Updated

Attachment

General

Description

File Name

Content Type