Closed
Bug 1172688
Opened 9 years ago
Closed 9 years ago
Add telemetry for when gethash calls timeout
Categories
(Toolkit :: Safe Browsing, defect)
Toolkit
Safe Browsing
Tracking
()
RESOLVED
FIXED
mozilla48
Tracking | Status | |
---|---|---|
firefox48 | --- | fixed |
People
(Reporter: francois, Assigned: dimi)
References
Details
(Whiteboard: tpe-seceng)
Attachments
(1 file, 2 obsolete files)
MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
(deleted),
text/x-review-board-request
|
benjamin
:
review+
francois
:
review+
|
Details |
It would be useful to know how often these completions timeout when trying to reach Google's servers.
Reporter | ||
Updated•9 years ago
|
Updated•9 years ago
|
Component: DOM: Security → Safe Browsing
Product: Core → Toolkit
Reporter | ||
Comment 1•9 years ago
|
||
This is probably going to be similar to bug 1150921.
Whiteboard: tpe-seceng
Assignee | ||
Updated•9 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•9 years ago
|
||
Hi François,
Do we measure this by how many success gethash request with a timeout or the time between each timeout ?
Flags: needinfo?(francois)
Reporter | ||
Comment 4•9 years ago
|
||
I don't think we have a need for the actual time it took. We just want to make sure that requests don't time out very often.
So I would suggest: # timeouts / (# timeouts + # completed requests)
which means we also need to add telemetry for the requests that succeeded.
Flags: needinfo?(francois)
Assignee | ||
Comment 5•9 years ago
|
||
Attachment #8732696 -
Flags: review?(francois)
Reporter | ||
Comment 6•9 years ago
|
||
Comment on attachment 8732696 [details] [diff] [review]
Add telemetry for gethash
Review of attachment 8732696 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js
@@ +553,5 @@
> // Notify the RequestBackoff once a response is received.
> this._completer.finishRequest(this.gethashUrl, httpStatus);
>
> if (success) {
> + Services.telemetry.getHistogramById("URLCLASSIFIER_COMPLETE_TIMEOUT").add(0);
What happens when the request doesn't timeout but still fails? For example, if it returns a 503 instead of a 200.
I would say that as far as the telemetry goes, we should consider that a non-timeout event and mark it as such. We're only interested in seeing how often the gethash endpoint fails to answer at all.
Attachment #8732696 -
Flags: review?(francois)
Assignee | ||
Comment 7•9 years ago
|
||
Modification of this patch:
- Record all gethash requests in telemetry instead of just record success requests
Attachment #8732696 -
Attachment is obsolete: true
Attachment #8733220 -
Flags: review?(francois)
Reporter | ||
Comment 8•9 years ago
|
||
Comment on attachment 8733220 [details] [diff] [review]
Add telemetry for gethash v2
Review of attachment 8733220 [details] [diff] [review]:
-----------------------------------------------------------------
That looks fine to me. I think Benjamin needs to review all new telemetry probes.
Attachment #8733220 -
Flags: review?(francois) → review?(benjamin)
Comment 9•9 years ago
|
||
Comment on attachment 8733220 [details] [diff] [review]
Add telemetry for gethash v2
If this is an exploratory measurement, it should expire (typically in 6 months).
If this is an operational alerting metric, I think we should be collecting from the release population, and you should talk a bit about how you plan on monitoring this with either daily or realtime alerting.
Attachment #8733220 -
Flags: review?(benjamin) → feedback-
Reporter | ||
Comment 10•9 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #9)
> If this is an exploratory measurement, it should expire (typically in 6
> months).
>
> If this is an operational alerting metric, I think we should be collecting
> from the release population, and you should talk a bit about how you plan on
> monitoring this with either daily or realtime alerting.
The question we're trying to answer is: Is the 10s timeout reasonable and (especially) does this happen often in practice?
Is that what you'd consider an exploratory measurement? If so, can we run it for 1 year instead of just 6 months?
Flags: needinfo?(benjamin)
Comment 11•9 years ago
|
||
Please run it for 6 months first and if you still need more data you can extend it.
You could instead add a metric that measures the total time a gethash call takes, which would give you a distribution and not just a boolean.
Flags: needinfo?(benjamin)
Reporter | ||
Comment 12•9 years ago
|
||
Dimi, I think we should go with the 6 months that Benjamin suggested.
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1237198&attachment=8735634 suggests that all we need to do is add "expires_in_version" with whatever release will come 6 months after Fx 48.
Assignee | ||
Comment 13•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/44105/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/44105/
Attachment #8737773 -
Flags: review?(francois)
Attachment #8737773 -
Flags: review?(benjamin)
Assignee | ||
Updated•9 years ago
|
Attachment #8733220 -
Attachment is obsolete: true
Reporter | ||
Comment 14•9 years ago
|
||
Comment on attachment 8737773 [details]
MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
https://reviewboard.mozilla.org/r/44105/#review40815
Attachment #8737773 -
Flags: review?(francois) → review+
Assignee | ||
Comment 15•9 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #13)
> Created attachment 8737773 [details]
> MozReview Request: Bug 1172688 - Add telemetry for when gethash calls
> timeout. r?francois, bsmedberg
>
> Review commit: https://reviewboard.mozilla.org/r/44105/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/44105/
Forget to say, based on comment 4, i still use boolean for now instead of metric suggested in comment 11
Reporter | ||
Comment 16•9 years ago
|
||
I think the boolean is fine for this initial exploration.
Updated•9 years ago
|
Attachment #8737773 -
Flags: review?(benjamin) → review+
Comment 17•9 years ago
|
||
Comment on attachment 8737773 [details]
MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
https://reviewboard.mozilla.org/r/44105/#review41015
data-review=me with that change or something equivalent
::: toolkit/components/telemetry/Histograms.json:3364
(Diff revision 1)
> + "URLCLASSIFIER_COMPLETE_TIMEOUT": {
> + "alert_emails": ["gcp@mozilla.com", "francois@mozilla.com"],
> + "expires_in_version": "52",
> + "kind": "boolean",
> + "bug_numbers": [1172688],
> + "description": "True if gethash lookup timeout"
Suggest more detail: "this metric is recorded every time a gethash lookup is performed. `true` is recorded if the lookup times out."
Assignee | ||
Comment 18•9 years ago
|
||
Comment on attachment 8737773 [details]
MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/44105/diff/1-2/
Attachment #8737773 -
Attachment description: MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r?francois, bsmedberg → MozReview Request: Bug 1172688 - Add telemetry for when gethash calls timeout. r=francois, bsmedberg
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 19•9 years ago
|
||
Keywords: checkin-needed
Comment 20•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
You need to log in
before you can comment on or make changes to this bug.
Description
•