
Assignee

Comment 1

•

3 years ago

I flagged this during discussion with chutten; unfortunately, the telemetry API doesn't provide an easy way to disable the timer and still produce reliable results. I need the 0's during no-serviceworker times to make the result meaningful (for example, that is how we found that 85% of the time we have no SW running). If I disable the timer, then to keep the stats right I need to do two things when we start a SW (go from 0->1): restart the timer (easy), and submit N(seconds)/10 0's to Telemetry (the 0's we didn't record). That means calling the API N/10 times in a row, or creating an array of 0's and passing that in (and internally it basically calls the record method with 0 N times, so there's no real perf win). We'd also need to notice shutdown and submit 0's then if we're in a stopped-timer state.
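The backfill arithmetic described above can be sketched as follows. This is a minimal Python illustration, not Firefox code: the 10-second sampling interval is inferred from the N/10 in the comment, and `missed_zero_samples`, `backfill_zeros`, and the `record_samples` callback are hypothetical names standing in for whatever the real Telemetry call would be.

```python
SAMPLE_INTERVAL_S = 10  # assumed: one sample per 10 seconds, per the N/10 above


def missed_zero_samples(idle_seconds: float) -> int:
    """Number of 0 samples a stopped timer would have recorded while idle."""
    if idle_seconds < 0:
        raise ValueError("idle_seconds must be non-negative")
    return int(idle_seconds // SAMPLE_INTERVAL_S)


def backfill_zeros(idle_seconds: float, record_samples) -> int:
    """Submit the skipped zeros as a single batch and return how many were sent.

    `record_samples` is a hypothetical callable that accepts a list of values.
    """
    count = missed_zero_samples(idle_seconds)
    if count:
        record_samples([0] * count)
    return count
```

The same helper would cover the shutdown case: call it with the time spent in the stopped-timer state before exiting.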

Flags: needinfo?(rjesup)

Assignee

Comment 2

•

3 years ago

Not related to the timer: I don't see where the SERVICE_WORKER_FETCH_RUNNING probe defined in Histograms.json is used; I wonder if it's a leftover from a previous version of the patch.

Yes, I believe so

Assignee

Comment 3

•

3 years ago

The only way I see to solve this without a performance/pageload impact would be to submit the 0's by creating an idle runnable, and even then I'm still worried it would run long enough in some cases (like a new SW after N days without one running) that it could impact pageload. Perhaps use a large fixed delay (10s? 30s?), then use an idle runnable.

Assignee

Updated

•

3 years ago

Flags: needinfo?(chutten)

Assignee

Comment 4

•

3 years ago

I suppose we could use some type of increasing delay the longer we've had 0 SWs, and then iterate N/10 times on every wakeup, with an upper limit to cap the maximum jank. Note that none of these ideas exactly as described above would help the cases where a SW remains running, though we could trivially extend them to "# of SWs hasn't changed", with a trigger to cut the delay immediately on any change in the number (instead of only on 0->1 transitions).
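A rough sketch of that increasing-delay idea, in Python for illustration only. The class name `BackfillScheduler`, the doubling factor, and the default base and maximum delays are all assumptions; the comment only specifies a growing delay, N/10 samples per wakeup, an upper limit, and a reset on any change in the SW count.

```python
SAMPLE_INTERVAL_S = 10  # assumed: one sample per 10 seconds, per the N/10 above


class BackfillScheduler:
    """Grows the wakeup delay while the SW count is unchanged; resets on change."""

    def __init__(self, base_delay_s=10.0, max_delay_s=600.0, factor=2.0):
        self.base_delay_s = base_delay_s
        self.max_delay_s = max_delay_s  # caps the batch size, and so the jank
        self.factor = factor
        self.delay_s = base_delay_s
        self.sw_count = 0

    def on_wakeup(self):
        """Return (samples covering the elapsed delay, next delay in seconds)."""
        n = int(self.delay_s // SAMPLE_INTERVAL_S)
        samples = [self.sw_count] * n
        self.delay_s = min(self.delay_s * self.factor, self.max_delay_s)
        return samples, self.delay_s

    def on_count_changed(self, new_count):
        """Any change in the SW count cuts the delay back to the base value."""
        self.sw_count = new_count
        self.delay_s = self.base_delay_s
        return self.delay_s
```

With a 600-second cap, no single wakeup submits more than 60 samples, which is the sense in which the upper limit bounds the work done per wakeup.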

Florian Quèze [:florian]

Reporter

Comment 5

•

3 years ago

Is the telemetry API slow enough to make it noticeably slow if we add the same value a few thousand times in a row? And if the answer is yes, can we call it off main thread?

Updated

•

3 years ago

Keywords: regression

Comment 6

•

3 years ago

Set release status flags based on info from the regressing bug 1740335

status-firefox96: --- → unaffected

status-firefox97: --- → affected

status-firefox98: --- → affected

status-firefox-esr91: --- → unaffected

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

Has Regression Range: --- → yes

Chris H-C :chutten

Comment 7

•

3 years ago

(In reply to Florian Quèze [:florian] from comment #5)

Is the telemetry API slow enough to make it noticeably slow if we add the same value a few thousand times in a row? And if the answer is yes, can we call it off main thread?

Telemetry (and Glean) APIs are threadsafe and are in the performance class of "cheap but not free".

In today's m-c Telemetry, Histograms are given samples proportional to the number of paints or vsync intervals (the latter even on the release channel), amounting to hundreds of millions of samples per major version and easily tens of thousands of samples per subsession, if not more. (Check about:telemetry and use the search box for "rasterize" and you'll see your own counts. Refresh the page to watch them increase.) There are no doubt Histograms with even more samples in them, but those are two that came to mind :)

Telemetry does take a lock on the calling thread and perform some small in-memory calculations per sample to figure out what bucket it should be in and then to increment the count in that bucket. There are some value and bounds checks as well, but they go by quickly.

In short: we should be fine to receive a whole whack-load of zeroes. If you learn that Telemetry isn't up to the task, do let us know and we'll cut a better API for you or I can help you encode your information in a slightly different way (like maybe we count lengths of time that we're at zero SWs to keep 0s out of the main SW probe). Please use the array API if you plan on sending us a bunch of zeroes so you only pick up Telemetry's Histogram lock once per batch. And by all means do so off the main thread. It shouldn't run long, but as advertised these operations are "cheap but not free".

(( In the case of Glean, the Glean SDK has its own dispatcher to take things off of the instrumenting thread so dispatching yourself wouldn't be needed. ))
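To make the per-sample cost and the benefit of the array API concrete, here is a toy Python model of a locked histogram. It is not Telemetry's implementation: the `Histogram` class, its method names, and the bucket bounds are hypothetical, and the `lock_acquisitions` counter exists only to show that a batch takes the lock once.

```python
import bisect
import threading


class Histogram:
    """Toy thread-safe histogram; bucket i counts values >= bounds[i]."""

    def __init__(self, bucket_lower_bounds):
        self._bounds = sorted(bucket_lower_bounds)
        self._counts = [0] * len(self._bounds)
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # counts lock takes made by accumulate paths

    def _bucket_index(self, value):
        # Find the last bucket whose lower bound is <= value.
        return max(bisect.bisect_right(self._bounds, value) - 1, 0)

    def accumulate(self, value):
        """One sample: one lock acquisition."""
        with self._lock:
            self.lock_acquisitions += 1
            self._counts[self._bucket_index(value)] += 1

    def accumulate_many(self, values):
        """A whole batch: still one lock acquisition."""
        with self._lock:
            self.lock_acquisitions += 1
            for value in values:
                self._counts[self._bucket_index(value)] += 1

    def snapshot(self):
        with self._lock:
            return dict(zip(self._bounds, self._counts))


if __name__ == "__main__":
    hist = Histogram([0, 1, 2, 4, 8])
    # Submit the batch of zeroes off the main thread, as suggested above.
    worker = threading.Thread(target=hist.accumulate_many, args=([0] * 5000,))
    worker.start()
    worker.join()
    print(hist.snapshot())
```

The per-sample work inside the lock is a bucket lookup and an increment, which matches the "cheap but not free" description; batching removes only the repeated lock acquisition, not that work.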

Flags: needinfo?(chutten)

Updated

•

3 years ago

status-firefox97: affected → wontfix

Cristina Cozmuta (:CrissCozmuta)

Assignee

Comment 8

•

3 years ago

Attached file Bug 1752387: Don't use a timer for recording ServiceWorker running telemetry r=chutten,#dom-worker-reviewers (deleted)

Phabricator Automation

Updated

•

3 years ago

Assignee: nobody → rjesup

Status: NEW → ASSIGNED

Jens Stutte [:jstutte]

Updated

•

3 years ago

Severity: -- → S4

Priority: -- → P3

Pulsebot

Comment 9

•

3 years ago

Pushed by rjesup@wgate.com: https://hg.mozilla.org/integration/autoland/rev/068ad1f32fba Don't use a timer for recording ServiceWorker running telemetry r=chutten,dom-worker-reviewers,edenchuang

Comment 10

•

3 years ago

Backed out for causing failures on nsISupportsImpl.cpp:43. CLOSED TREE

Backout link : https://hg.mozilla.org/integration/autoland/rev/a66c1dc9f44dd80542a1a58f00d14946d3badf17

Push with failures : https://treeherder.mozilla.org/jobs?repo=autoland&duplicate_jobs=visible&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&revision=068ad1f32fba801fc0c73525c5d8130285c4d086&selectedTaskRun=XUZpS77GTl6W9Nasv9zVnQ.0

Link to failure log : https://treeherder.mozilla.org/logviewer?job_id=367088428&repo=autoland&lineNumber=2579

Flags: needinfo?(rjesup)

Narcis Beleuzu [:NarcisB]

Comment 11

•

3 years ago

Please also check:

assertion failure in MOZ_Crash -> https://treeherder.mozilla.org/logviewer?job_id=367094551&repo=autoland&lineNumber=4418
bc failure on browser_serviceworker_fetch_new_process.js -> https://treeherder.mozilla.org/logviewer?job_id=367101162&repo=autoland&lineNumber=5462
bc failure on browser_toolbarKeyNav.js -> https://treeherder.mozilla.org/logviewer?job_id=367105002&repo=autoland&lineNumber=2007

Pulsebot

Comment 12

•

3 years ago

Pushed by rjesup@wgate.com: https://hg.mozilla.org/integration/autoland/rev/3f4d90aeea3b Don't use a timer for recording ServiceWorker running telemetry r=chutten

Cristian Tuns

Comment 13

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/3f4d90aeea3b

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

status-firefox99: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 99 Branch

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

Flags: needinfo?(rjesup)

Comment 14

•

3 years ago

The patch landed in nightly and beta is affected.
:jesup, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit the auto_nag documentation.

Flags: needinfo?(rjesup)

Updated

•

3 years ago

status-firefox98: affected → wontfix

Assignee

Comment 15

•

3 years ago

Comment on attachment 9261815 [details]
Bug 1752387: Don't use a timer for recording ServiceWorker running telemetry r=chutten,#dom-worker-reviewers

Beta/Release Uplift Approval Request

User impact if declined: Increased power use by idle Firefox instances (including on mobile)
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: Yes
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: None
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): Relatively simple patch that removes a timer. We certainly don't need to include it, but power use, especially on mobile, is an issue, and mobile platforms do flag apps that use power in the background.
String changes made/needed: none

Flags: needinfo?(rjesup)

Attachment #9261815 - Flags: approval-mozilla-beta?

Comment 16

•

3 years ago

Comment on attachment 9261815 [details]
Bug 1752387: Don't use a timer for recording ServiceWorker running telemetry r=chutten,#dom-worker-reviewers

We are still in early betas and that looks like a valuable fix to uplift. Approved for 98 beta 5, thanks.

Attachment #9261815 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

https://hg.mozilla.org/releases/mozilla-beta/rev/e090b88bb634

Comment 17

•

3 years ago

bugherder uplift