Closed
Bug 1211599
Opened 9 years ago
Closed 9 years ago
Enforce 100-bucket limit on Telemetry histograms
Categories
(Toolkit :: Telemetry, defect, P4)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla45
People
(Reporter: vladan, Assigned: chutten)
References
(Blocks 1 open bug)
Details
(Whiteboard: [measurement:client])
Attachments
(2 files, 5 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
chutten
:
review+
|
Details | Diff | Splinter Review |
As discussed on fhr-dev, we should standardize on a maximum of 100 buckets in a histogram and enforce it in the histogram generation script.
Currently, there are 176 histograms with n_buckets > 100, so we may have to whitelist them.
Setting the maximum threshold higher would be of little help:
# of histograms with 100-199 buckets: 12
# of histograms with 200-299 buckets: 24
# of histograms with 400 buckets: 1
# of histograms with 500 or 512 buckets: 5
# of histograms with 700 buckets: 1
# of histograms with 1,000 buckets: 131
# of histograms with 10,000 buckets: 2
The 1,000 bucket histograms are the biggest challenge.
Bug 1167292 created the biggest histograms, with 10,000 buckets in 2 histograms
Updated•9 years ago
|
Priority: -- → P4
Whiteboard: [measurement:client]
Reporter | ||
Comment 1•9 years ago
|
||
Chris: can you add a 100 bucket maximum in toolkit/components/telemetry/histogram_tools.py? Whitelist the existing histograms with more than 100 buckets.. you can put their names in some kind of whitelist file in the same directory
Georg: any objections?
Assignee: nobody → chutten
Assignee | ||
Comment 2•9 years ago
|
||
Just a simple JSON file with the 185+ histograms who have numeric buckets > 100, and the code to throw an exception when one's seen.
Also, since this works in other places that might not have the whitelist present, default to allowing everything in the event the whitelist isn't found. Complain, but allow.
Attachment #8678277 -
Flags: review?(nfroyd)
Reporter | ||
Comment 3•9 years ago
|
||
Comment on attachment 8678277 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Review of attachment 8678277 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/bucket-whitelist.json
@@ +1,4 @@
> +{
> + "MEMORY_RESIDENT": true,
> + "MEMORY_JS_MAIN_RUNTIME_TEMPORARY_PEAK": true,
> + "MEMORY_JS_GC_HEAP": true,
Add a JS comment to the header warning people not to edit the whitelist without Perf team's approval (include my IRC nick or email address)
::: toolkit/components/telemetry/histogram_tools.py
@@ +244,5 @@
> self._high = try_to_coerce_to_number(high)
> self._n_buckets = try_to_coerce_to_number(n_buckets)
> + if self._n_buckets > 100 and type(self._n_buckets) is int:
> + try:
> + with open(os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json'), 'r') as f:
this line is pretty ugly, can you figure out the path on a separate line and put it in a temporary variable
@@ +250,5 @@
> + whitelist = json.load(f)
> + except ValueError, e:
> + raise BaseException, 'error parsing bucket whitelist'
> + if self._name not in whitelist:
> + raise KeyError, 'Non-whitelisted histogram %s cannot have > 100 buckets' % self._name
I wouldn't want people to just whitelist their large histogram after getting this message :) How about "New histograms are not allowed to have more than 100 buckets. Histograms with hundreds of buckets increase mobile memory use, slow down histogram aggregation, etc. Contact :vladan or Perf team if you think an exception needs to be made"
Assignee | ||
Comment 4•9 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #3)
> Comment on attachment 8678277 [details] [diff] [review]
> 0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
>
> Review of attachment 8678277 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: toolkit/components/telemetry/bucket-whitelist.json
> @@ +1,4 @@
> > +{
> > + "MEMORY_RESIDENT": true,
> > + "MEMORY_JS_MAIN_RUNTIME_TEMPORARY_PEAK": true,
> > + "MEMORY_JS_GC_HEAP": true,
>
> Add a JS comment to the header warning people not to edit the whitelist
> without Perf team's approval (include my IRC nick or email address)
JSON files aren't permitted to have comments and still be JSON files. I double-checked and python's json.load unfortunately doesn't seem to like comments (ie, it's a strict JSON parser, not a JS object parser).
>
> ::: toolkit/components/telemetry/histogram_tools.py
> @@ +244,5 @@
> > self._high = try_to_coerce_to_number(high)
> > self._n_buckets = try_to_coerce_to_number(n_buckets)
> > + if self._n_buckets > 100 and type(self._n_buckets) is int:
> > + try:
> > + with open(os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json'), 'r') as f:
>
> this line is pretty ugly, can you figure out the path on a separate line and
> put it in a temporary variable
Sure.
> @@ +250,5 @@
> > + whitelist = json.load(f)
> > + except ValueError, e:
> > + raise BaseException, 'error parsing bucket whitelist'
> > + if self._name not in whitelist:
> > + raise KeyError, 'Non-whitelisted histogram %s cannot have > 100 buckets' % self._name
>
> I wouldn't want people to just whitelist their large histogram after getting
> this message :) How about "New histograms are not allowed to have more than
> 100 buckets. Histograms with hundreds of buckets increase mobile memory use,
> slow down histogram aggregation, etc. Contact :vladan or Perf team if you
> think an exception needs to be made"
Agreed. I'll change it up.
Assignee | ||
Comment 5•9 years ago
|
||
Updated error message, split path calculation to separate line for clarity.
Attachment #8678277 -
Attachment is obsolete: true
Attachment #8678277 -
Flags: review?(nfroyd)
Attachment #8678305 -
Flags: review?(vladan.bugzilla)
Attachment #8678305 -
Flags: review?(nfroyd)
Reporter | ||
Comment 6•9 years ago
|
||
Comment on attachment 8678305 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Review of attachment 8678305 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/histogram_tools.py
@@ +244,5 @@
> self._high = try_to_coerce_to_number(high)
> self._n_buckets = try_to_coerce_to_number(n_buckets)
> + if self._n_buckets > 100 and type(self._n_buckets) is int:
> + try:
> + whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json');
semicolons in python?
@@ +247,5 @@
> + try:
> + whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json');
> + with open(whitelist_path, 'r') as f:
> + try:
> + whitelist = json.load(f)
are you loading the whitelist file for every histogram?
Attachment #8678305 -
Flags: review?(vladan.bugzilla) → review-
Assignee | ||
Comment 7•9 years ago
|
||
Now we're only parsing it the once, and I've removed a spurious semi.
Attachment #8678305 -
Attachment is obsolete: true
Attachment #8678305 -
Flags: review?(nfroyd)
Attachment #8678333 -
Flags: review?(vladan.bugzilla)
Attachment #8678333 -
Flags: review?(nfroyd)
Reporter | ||
Comment 8•9 years ago
|
||
Comment on attachment 8678333 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Review of attachment 8678333 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/bucket-whitelist.json
@@ +1,2 @@
> +{
> + "MEMORY_RESIDENT": true,
why not make this an array? the "true" values seems redundant
::: toolkit/components/telemetry/histogram_tools.py
@@ +77,5 @@
> always_allowed_keys = ['kind', 'description', 'cpp_guard', 'expires_in_version',
> 'alert_emails', 'keyed', 'releaseChannelCollection']
>
> +try:
> + whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json')
are realpath & abspath needed?
@@ +82,5 @@
> + with open(whitelist_path, 'r') as f:
> + try:
> + whitelist = json.load(f)
> + except ValueError, e:
> + raise BaseException, 'error parsing bucket whitelist'
include the path & filename in the error message to make it easier to debug
@@ +84,5 @@
> + whitelist = json.load(f)
> + except ValueError, e:
> + raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> + whitelist = None
declare this at the top of the file, and use a more descriptive name like n_buckets_whitelist
@@ +85,5 @@
> + except ValueError, e:
> + raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> + whitelist = None
> + print 'Unable to load whitelist. Assume all histograms are acceptable.'
ditto on including path/filename where script looked for the whitelist
@@ +255,5 @@
> self._high = try_to_coerce_to_number(high)
> self._n_buckets = try_to_coerce_to_number(n_buckets)
> + if self._n_buckets > 100 and type(self._n_buckets) is int:
> + if whitelist is None:
> + print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name
this is going to print ~190 error messages, just put up one big scary message in the IOError handler instead
Attachment #8678333 -
Flags: review?(vladan.bugzilla)
Assignee | ||
Comment 9•9 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #8)
> Comment on attachment 8678333 [details] [diff] [review]
> 0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
>
> Review of attachment 8678333 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: toolkit/components/telemetry/bucket-whitelist.json
> @@ +1,2 @@
> > +{
> > + "MEMORY_RESIDENT": true,
>
> why not make this an array? the "true" values seems redundant
This way I can test for membership instead of having to indexOf each time. I could convert from list to set inside Python to accomplish a similar thing, but this way it is easier to extend in the future if we want this whitelist to mean more than just "presence in this list means we're good for > 100 buckets". Something like "MEMORY_RESIDENT": {n_buckets_limit: 400, can_access_awesome_thing1: 'read-only'}, for instance.
> ::: toolkit/components/telemetry/histogram_tools.py
> @@ +77,5 @@
> > always_allowed_keys = ['kind', 'description', 'cpp_guard', 'expires_in_version',
> > 'alert_emails', 'keyed', 'releaseChannelCollection']
> >
> > +try:
> > + whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json')
>
> are realpath & abspath needed?
For traversing symlinks it would be. I'm not sure if it's relevant, I copypasta'd from the many SCRIPT_DIR examples in other python files: https://dxr.mozilla.org/mozilla-central/search?q=SCRIPT_+realpath&redirect=true&case=true
> @@ +82,5 @@
> > + with open(whitelist_path, 'r') as f:
> > + try:
> > + whitelist = json.load(f)
> > + except ValueError, e:
> > + raise BaseException, 'error parsing bucket whitelist'
>
> include the path & filename in the error message to make it easier to debug
Okily dokily.
> @@ +84,5 @@
> > + whitelist = json.load(f)
> > + except ValueError, e:
> > + raise BaseException, 'error parsing bucket whitelist'
> > +except IOError:
> > + whitelist = None
>
> declare this at the top of the file, and use a more descriptive name like
> n_buckets_whitelist
Alrighty.
> @@ +85,5 @@
> > + except ValueError, e:
> > + raise BaseException, 'error parsing bucket whitelist'
> > +except IOError:
> > + whitelist = None
> > + print 'Unable to load whitelist. Assume all histograms are acceptable.'
>
> ditto on including path/filename where script looked for the whitelist
Makes sense.
> @@ +255,5 @@
> > self._high = try_to_coerce_to_number(high)
> > self._n_buckets = try_to_coerce_to_number(n_buckets)
> > + if self._n_buckets > 100 and type(self._n_buckets) is int:
> > + if whitelist is None:
> > + print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name
>
> this is going to print ~190 error messages, just put up one big scary
> message in the IOError handler instead
I figured if it was in automation no one would be looking anyway, and if it wasn't in automation we'd want the user to be scared. I'll try to be just as scary with one line. 'Tis the season, after all.
Comment 10•9 years ago
|
||
Comment on attachment 8678333 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Review of attachment 8678333 [details] [diff] [review]:
-----------------------------------------------------------------
Vladan's review seems good enough to me, and it sounds like there are some revisions that are going to be made anyway.
::: toolkit/components/telemetry/histogram_tools.py
@@ +85,5 @@
> + except ValueError, e:
> + raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> + whitelist = None
> + print 'Unable to load whitelist. Assume all histograms are acceptable.'
Nit: I think the error reads slightly better as "Assuming all histograms..."
@@ +258,5 @@
> + if whitelist is None:
> + print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name
> + elif self._name not in whitelist:
> + raise KeyError, ('New histograms are not permitted to have more than 100 buckets. '
> + 'Histograms with large numbers of buckets use disproportionately-high amounts of resources. '
Nit: no hyphen here.
Attachment #8678333 -
Flags: review?(nfroyd)
Assignee | ||
Comment 11•9 years ago
|
||
Attachment #8678859 -
Flags: review?(vladan.bugzilla)
Attachment #8678859 -
Flags: review?(nfroyd)
Reporter | ||
Updated•9 years ago
|
Attachment #8678333 -
Attachment is obsolete: true
Reporter | ||
Comment 12•9 years ago
|
||
Comment on attachment 8678859 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Review of attachment 8678859 [details] [diff] [review]:
-----------------------------------------------------------------
(In reply to Chris H-C :chutten from comment #9)
> (In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #8)
> > why not make this an array? the "true" values seems redundant
>
> This way I can test for membership instead of having to indexOf each time. I
> could convert from list to set inside Python to accomplish a similar thing,
> but this way it is easier to extend in the future if we want this whitelist
> to mean more than just "presence in this list means we're good for > 100
> buckets". Something like "MEMORY_RESIDENT": {n_buckets_limit: 400,
> can_access_awesome_thing1: 'read-only'}, for instance.
You have a point about future extensibility, but then the values in the dict should be {} or {n_buckets: true}. I suggest for this patch we just keep a list in the file and convert it to a set as you said.
> > are realpath & abspath needed?
>
> For traversing symlinks it would be. I'm not sure if it's relevant, I
> copypasta'd from the many SCRIPT_DIR examples in other python files:
> https://dxr.mozilla.org/mozilla-central/
> search?q=SCRIPT_+realpath&redirect=true&case=true
Ok, let's keep it
Attachment #8678859 -
Flags: review?(vladan.bugzilla) → review+
Assignee | ||
Comment 13•9 years ago
|
||
Bug 1211599 - Only allow whitelisted histograms to have > 100 buckets.
Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8678896 [details]
MozReview Request: Bug 1211599 - Only allow whitelisted histograms to have > 100 buckets.
mozreview test publish. Feel free to ignore.
Attachment #8678896 -
Attachment is obsolete: true
Updated•9 years ago
|
Attachment #8678859 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 15•9 years ago
|
||
Patch updated as asked. Now to figure out try.
Attachment #8678859 -
Attachment is obsolete: true
Attachment #8680018 -
Flags: review+
Assignee | ||
Comment 16•9 years ago
|
||
Clean trybuild of win32 and xpcshell tests on Win7: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3d86e92d4cd2
Status: NEW → ASSIGNED
Keywords: checkin-needed
Comment 17•9 years ago
|
||
Keywords: checkin-needed
Comment 18•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox45:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Comment 19•9 years ago
|
||
bugherder uplift |
status-b2g-v2.5:
--- → fixed
Comment 20•9 years ago
|
||
removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion
status-b2g-v2.5:
fixed → ---
You need to log in
before you can comment on or make changes to this bug.
Description
•