Open Bug 1825600 Opened 2 years ago Updated 1 year ago

Telemetry has 2 timers firing every hour while the user is idle

Categories

(Toolkit :: Telemetry, defect, P4)

defect

Tracking

()

People

(Reporter: florian, Unassigned)

References

(Blocks 2 open bugs)

Details

See this long profile showing a Firefox with only an about:blank tab, sitting there for a full day: https://share.firefox.dev/3nAqddb

One timer has its duration set at https://searchfox.org/mozilla-central/rev/0907529ff72c456ddb47839f5f7ba16291f28dce/toolkit/components/telemetry/app/TelemetryScheduler.sys.mjs#39-43 and the other at https://searchfox.org/mozilla-central/rev/0907529ff72c456ddb47839f5f7ba16291f28dce/toolkit/components/telemetry/pings/EventPing.sys.mjs#29

Chutten's comments in #telemetry when I asked about these timers and if we really need this code to run repeatedly:

  • At the very least EventPing could/should use TelemetryScheduler to schedule its tick
  • EventPing could be silenced with a little more information sharing with TelemetryEvent.cpp and some precise validation analyses to ensure nothing went awry.
  • TelemetryScheduler runs two things: every 5min or so, save "aborted-session" ping payloads in case we crash so we can send the previous session's "main" ping data at the start of the next one (if absolutely nothing's happening, then all we're updating is sessionLength and susbsessionLength which aren't as important as they once were) and send the midnight "main" ping near local midnight (some subtlety around here due to local clocks shifting whenever they'd like, but might be acceptable to change to some other mechanism that only wakes once near local midnight)

EventPing

If EventPing were told not only when the event storage went from nearly full to completely full, but also when it went from completely empty to not empty, it could avoid resetting the timer to fire every 60min to send whatever events collected in the storage. The periodic behaviour is to put an upper bound on how long events need to wait to be submitted. If TelemetryEvent notified on EVENT_STORAGE_NO_LONGER_EMPTY, it could use that as a trigger instead of the previous ping submission attempt.

Doing this would require analysis as well to ensure that such a change didn't affect event submission latency or system reliability.

TelemetryScheduler

Nothing else to add except that the "main" ping triggered near local midnight cannot change behaviour without affecting KPIs. The "aborted-session" payload's up-to-date-ness doesn't matter as much so long as we ensure it's still present. Both of these would need similar analysis work to the EventPing changes in addition to impl and test .

Severity: -- → S4
Priority: -- → P4
Blocks: 1736916
You need to log in before you can comment on or make changes to this bug.