Closed Bug 1191757 Opened 9 years ago Closed 9 years ago

Follow-up on investigative Telemetry client probes

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
normal

Tracking

RESOLVED FIXED
Iteration:
42.3 - Aug 10
Tracking Status
firefox42 --- affected

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

Details

(Whiteboard: [unifiedTelemetry] [data-validation])

We landed some investigative Telemetry client probes to track client side misbehavior. We need to follow up on some basic checks with them.
* bug 1190302 - no data yet for TELEMETRY_SESSIONDATA_FAILED_*
* bug 1186955 - no data yet for TELEMETRY_PING_SIZE_EXCEEDED_*, TELEMETRY_DISCARDED_*_SIZE_MB
* bug 1186492 - oddly, no data yet for TELEMETRY_PING_EVICTED_FOR_SERVER_ERRORS, I expected to see some before bug 1186955 landed due to oversized pings

We do have data for bug 1187340, bug 1168835:
* pending ping load failures: http://bit.ly/1EambFu
* pending ping parse failures: http://bit.ly/1IPaytb
* pending ping sizes: http://bit.ly/1DwtUmi
(In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> TELEMETRY_PING_EVICTED_FOR_SERVER_ERRORS, i expected to see some before bug
> 1186955 landed due to oversized pings

Local testing shows that the server correctly returns 4xx errors (411 for a 4MB payload) and that the client handles this correctly, so we may simply not have had any such submissions yet, or none before bug 1186955 landed.
(In reply to Georg Fritzsche [:gfritzsche] from comment #1)
> We do have data for bug 1187340, bug 1168835:
> * pending ping load failures: http://bit.ly/1EambFu
> * pending ping parse failures: http://bit.ly/1IPaytb
> * pending ping sizes: http://bit.ly/1DwtUmi

There is some interesting data here:
* We do have a lot of parse failures; I'm not sure yet what to make of this. However, we can track this in analysis to see if that explains the missing pings.
* We have very few actual disk load failures, but 2 of 3 submissions have very high load failure counts. This seems to point to us repeatedly trying to load from disk after failures from the ping send task. I think we have to prioritize bug 1189425 to avoid clients getting stuck.
* The pending ping size distribution shows that we have a long tail of pretty big pings. 88.86% are <1MB, a further 10.56% are 1MB<=size<2MB. That means we will get a lot of pings evicted soon, as bug 1186955 introduces a 1MB ping size limit. We can't uplift that bug in this form and probably have to consider increasing that limit to 2MB temporarily.
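To make the size argument concrete, here is a small sketch of the arithmetic. It takes the two quoted shares at face value and assumes they describe the sizes the limit would apply to; the bucket list and the function name are illustrative, not part of Telemetry.

```python
# Shares quoted in the comment above:
# (lower bound in MB inclusive, upper bound in MB exclusive, percent of submissions).
BUCKETS = [
    (0, 1, 88.86),
    (1, 2, 10.56),
]


def percent_at_or_above(limit_mb):
    """Percent of submissions whose size is at or above limit_mb.

    Assumes limit_mb falls on a bucket boundary; everything not covered
    by a bucket entirely below the limit counts as affected.
    """
    below = sum(pct for _lo, hi, pct in BUCKETS if hi <= limit_mb)
    return round(100.0 - below, 2)


affected_at_1mb = percent_at_or_above(1)
affected_at_2mb = percent_at_or_above(2)
print(affected_at_1mb, affected_at_2mb)
```

Under these assumptions, about 11.14% of submissions sit at or above a 1MB limit, while only about 0.58% sit at or above 2MB, which is the motivation for considering a temporary 2MB limit.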
(In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> * The pending ping size distribution shows that we have a long tail of
> pretty big pings.
> 88.86% are <1MB, a further 10.56% are 1MB<=size<2MB. That means we will
> get a lot of pings evicted
> soon, as bug 1186955 introduces a 1MB ping size limit.
> We can't uplift that bug in this form and probably have to consider
> increasing that limit to 2MB temporarily.

Worth noting that this only measures the sizes of persisted pings on disk at startup, so this may be biased.
(In reply to Georg Fritzsche [:gfritzsche] from comment #4)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> > * The pending ping size distribution shows that we have a long tail of
> > pretty big pings.
> > 88.86% are <1MB, a further 10.56% are 1MB<=size<2MB. That means we will
> > get a lot of pings evicted
> > soon, as bug 1186955 introduces a 1MB ping size limit.
> > We can't uplift that bug in this form and probably have to consider
> > increasing that limit to 2MB temporarily.
>
> Worth noting that this only measures the sizes of persisted pings on disk at
> startup, so this may be biased.

This is actually the size of the whole pending ping directory, so it does not immediately indicate brokenness. The individual ping size is in TELEMETRY_DISCARDED_*_SIZE_MB and those have no data yet.
(In reply to Georg Fritzsche [:gfritzsche] from comment #5)
> The individual ping size is in TELEMETRY_DISCARDED_*_SIZE_MB and those have
> no data yet.

Is this measuring the compressed size (as stored on disk) or the raw uncompressed payload size?
(In reply to Mark Reid [:mreid] from comment #6)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #5)
> > The individual ping size is in TELEMETRY_DISCARDED_*_SIZE_MB and those have
> > no data yet.
>
> Is this measuring the compressed size (as stored on disk) or the raw
> uncompressed payload size?

Depends:
* TELEMETRY_DISCARDED_PENDING_PINGS_SIZE_MB - pending ping on-disk size
* TELEMETRY_DISCARDED_ARCHIVED_PINGS_SIZE_MB - ditto for archived
* TELEMETRY_DISCARDED_SEND_PINGS_SIZE_MB - ping size after serializing, before compression & sending to the server
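The distinction between "after serializing, before compression" and a compressed size can be illustrated with a rough sketch. Everything here is an assumption for illustration only: the sample ping is made up, gzip is used as a stand-in compressor, and 1MB is taken as 1024 * 1024 bytes; none of this is the actual client code.

```python
import gzip
import json

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def ping_sizes_mb(ping):
    """Return (serialized_mb, compressed_mb) for a ping-like dict.

    serialized_mb: size of the JSON text encoded as UTF-8, before compression.
    compressed_mb: size of the same bytes after gzip (stand-in compressor).
    """
    raw = json.dumps(ping).encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw) / BYTES_PER_MB, len(compressed) / BYTES_PER_MB


# Hypothetical, highly repetitive payload; real pings look different.
sample_ping = {
    "type": "main",
    "payload": {"histograms": {f"PROBE_{i}": [0] * 50 for i in range(2000)}},
}

serialized_mb, compressed_mb = ping_sizes_mb(sample_ping)
print(f"serialized: {serialized_mb:.3f} MB, compressed: {compressed_mb:.3f} MB")
```

For repetitive JSON like this, the compressed size is much smaller than the serialized size, so a limit checked before compression is stricter than the same limit checked afterwards.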
Blocks: 1122482
No longer blocks: 1120356
Blocks: 1201045
The measurements here seem OK, except for pending ping parse failures. That issue is tracked in bug 1201045.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED