Closed Bug 1318326 Opened 8 years ago Closed 7 years ago

|get_pings_properties| does not return all the available keyedHistograms

Categories

(Data Platform and Tools :: General, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Dexter, Assigned: amiyaguchi)

References

Details

(Whiteboard: [measurement:client:tracking])

Attachments

(1 file)

While validating some data in bug 1308705, I found that |get_pings_properties| [1] is not returning all the available keyed histograms from the pings; the cause is not yet known. If we extract the keyed histogram data manually (e.g. for SEARCH_COUNTS) instead of using get_pings_properties, all the expected data is returned.
[1] https://github.com/mozilla/python_moztelemetry/blob/master/moztelemetry/spark.py#L345
Here's a sample notebook that highlights the problem: https://gist.github.com/Dexterp37/402c873f2712002f8ad621c08eeb77c3
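For readers without access to the notebook, the manual extraction mentioned in the report can be sketched roughly as below. This is only an illustration, not the notebook's code: the helper name `extract_keyed_histogram`, the sample ping, and the assumption that keyed histograms live under `payload.keyedHistograms` in a plain dict are all hypothetical.

```python
def extract_keyed_histogram(ping, name):
    """Return the raw keyed histogram `name` from the top-level payload.

    Assumes (hypothetically) that `ping` is a plain dict and that keyed
    histograms are stored under ping["payload"]["keyedHistograms"].
    Returns an empty dict when any level is missing.
    """
    payload = ping.get("payload", {})
    return payload.get("keyedHistograms", {}).get(name, {})


# Hypothetical, heavily trimmed ping used only for illustration.
sample_ping = {
    "payload": {
        "keyedHistograms": {
            "SEARCH_COUNTS": {
                "google.urlbar": {"sum": 3, "values": {"0": 3, "1": 0}},
            }
        }
    }
}

search_counts = extract_keyed_histogram(sample_ping, "SEARCH_COUNTS")
```

Reading the dict directly like this bypasses whatever filtering get_pings_properties applies, which is why it can serve as a baseline when comparing counts.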
Whiteboard: [measurement:client:tracking]
Assignee: nobody → amiyaguchi
Priority: -- → P2
Anthony, any news here?
Flags: needinfo?(amiyaguchi)
Component: Metrics: Pipeline → Telemetry APIs for Analysis
Product: Cloud Services → Data Platform and Tools
Hi Georg, there isn't any news as of yet. I'll be bumping this to a P1 as of next week though.
Flags: needinfo?(amiyaguchi)
Priority: P2 → P1
This bug seems to be caused by how keyedHistograms located in the content processes are handled [1]. If a keyedHistogram is found in the payload but not in the content process, its values are ignored. I've written a few test cases that verify this behavior. I noticed that the main ping docs reference bug 1218576 for histograms and keyedHistograms in child payloads. It seems the logic in _get_ping_properties should account for the swap-over from child-process to aggregated content-process telemetry, with modifications to avoid KeyError exceptions.
[1] https://github.com/mozilla/python_moztelemetry/blob/fbfff402167e62616df4465de1a986a87417e305/moztelemetry/spark.py#L179-L188
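The failure mode described above, and one tolerant way to avoid it, might look like the sketch below. It is not the actual moztelemetry code or the eventual patch. It assumes that "the payload" means the top-level `payload.keyedHistograms` and that aggregated content-process data sits under `payload.processes.content.keyedHistograms`; the function name `keyed_histogram_by_process` and the sample ping are hypothetical.

```python
def keyed_histogram_by_process(ping, name):
    """Collect keyed histogram `name` from each place it may appear.

    Uses dict.get() at every level so that a histogram present in only
    one location is still returned instead of raising KeyError or being
    dropped. The paths used here are assumptions, not a confirmed layout.
    """
    payload = ping.get("payload", {})
    parent = payload.get("keyedHistograms", {}).get(name)
    content = (
        payload.get("processes", {})
        .get("content", {})
        .get("keyedHistograms", {})
        .get(name)
    )

    result = {}
    if parent is not None:
        result["parent"] = parent
    if content is not None:
        result["content"] = content
    return result


# Hypothetical ping: the histogram exists in the top-level payload only.
parent_only_ping = {
    "payload": {
        "keyedHistograms": {
            "SEARCH_COUNTS": {"google.urlbar": {"sum": 2, "values": {"0": 2}}}
        },
        "processes": {"content": {"keyedHistograms": {}}},
    }
}

found = keyed_histogram_by_process(parent_only_ping, "SEARCH_COUNTS")
```

In this sketch a histogram that exists only in the top-level payload still comes back under the "parent" key, which is the case the comment says was being ignored.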
It looks like the above PR will fix the majority of the encountered issues. However, about 2000 keyedHistograms are still missing when using get_pings_properties. The updated results of the original notebook can be found at [1].
[1] https://gist.github.com/acmiyaguchi/6c09ecd9183138ddf5e02380c4dc5782
I've added handling for the edge case behind the 2000 or so missing pings, and updated the tests to account for the changes in bug 1363934. This should be ready to go.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Depends on: 1397955
Component: Telemetry APIs for Analysis → General