Closed Bug 1136719 Opened 10 years ago Closed 7 years ago

Unified FHR+Telemetry Data format cleanup

Categories

(Toolkit :: Telemetry, defect, P4)

Tracking

()

RESOLVED WONTFIX

People

(Reporter: mreid, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client] [needs breakdown])

As we start receiving and using the unified data, we will likely come across things we would like to tweak and fix. Please add such things here. One to start with: the "duration" type measurements are all over the place in terms of units and data type. We have "uptime" in minutes (an int value), "subsessionDuration" in seconds (a float value), and "firstPaint" in milliseconds (an int value). It would be easier and less error-prone if all such measures used the same units and type.
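To illustrate what consumers currently have to do, here is a minimal sketch that converts the three measurements named above to one unit. The choice of integer milliseconds as the target, and the names `MS_PER_UNIT` and `to_milliseconds`, are assumptions made for this example and not something the bug settled on.

```python
# Milliseconds per unit for each measurement, using the units described above:
# "uptime" is in minutes, "subsessionDuration" in seconds, "firstPaint" in ms.
# The integer-millisecond target is an assumption made for this sketch.
MS_PER_UNIT = {
    "uptime": 60_000,
    "subsessionDuration": 1_000,
    "firstPaint": 1,
}


def to_milliseconds(name, value):
    """Convert a duration measurement to integer milliseconds."""
    return round(value * MS_PER_UNIT[name])


normalized = {
    "uptime": to_milliseconds("uptime", 3),
    "subsessionDuration": to_milliseconds("subsessionDuration", 12.5),
    "firstPaint": to_milliseconds("firstPaint", 842),
}
```

Every consumer needs a table like this today; a single unit and type in the format itself would make it unnecessary.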
Blocks: 1122482
I would also push for using timestamps instead of date strings (in Heka we use nanoseconds since the epoch). A big performance hit in processing our textual data (JSON or not) is the date/time parsing and conversion. Switching the time variable from "time":"2014-04-28T22:18:12.704Z" to "time":1.398723492704e+18 in the FxA logs increased processing speed by 1.18x.
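For reference, the two representations of the example value are related as in the sketch below. The function name `iso_to_epoch_ns` is made up for this illustration, and the code says nothing about how Heka or the FxA pipeline actually performs the conversion.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_epoch_ns(text):
    """Convert a UTC timestamp such as 2014-04-28T22:18:12.704Z to
    integer nanoseconds since the Unix epoch."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    delta = parsed.replace(tzinfo=timezone.utc) - EPOCH
    # Integer arithmetic avoids the rounding that float seconds would introduce.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


ns = iso_to_epoch_ns("2014-04-28T22:18:12.704Z")
```

With a numeric timestamp in the payload, this string parsing step is skipped entirely on the processing side.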
There are several pieces of data repeated in the top-level "application" section and the "environment.build" section:

application:
  architecture: <string>, // build architecture, e.g. x86
  buildId: <string>, // "20141126041045"
  name: <string>, // "Firefox"
  version: <string>, // "35.0"
  vendor: <string>, // "Mozilla"
  platformVersion: <string>, // "35.0"
  xpcomAbi: <string>, // e.g. "x86-msvc"

environment.build:
  architecture: <string>, // e.g. "x86", build architecture for the active build
  buildId: <string>, // e.g. "20141126041045"
  applicationName: <string>, // "Firefox"
  version: <string>, // e.g. "35.0"
  vendor: <string>, // e.g. "Mozilla"
  platformVersion: <string>, // e.g. "35.0"
  xpcomAbi: <string>, // e.g. "x86-msvc"
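The overlap between the two sections can be checked mechanically on a decoded ping. The following is a rough sketch: the function `repeated_fields` and the sample ping are hypothetical, with values taken from the example comments in the listing, and it assumes "name" in "application" corresponds to "applicationName" in "environment.build".

```python
# Assumption: "name" in the application section is the same datum as
# "applicationName" in environment.build; all other keys match by name.
RENAMED = {"name": "applicationName"}


def repeated_fields(ping):
    """Return (application key, environment.build key) pairs that carry
    the same value in both sections."""
    app = ping.get("application", {})
    build = ping.get("environment", {}).get("build", {})
    pairs = []
    for key, value in app.items():
        build_key = RENAMED.get(key, key)
        if build_key in build and build[build_key] == value:
            pairs.append((key, build_key))
    return sorted(pairs)


# Hypothetical sample built from the example values in the listing.
common = {
    "architecture": "x86",
    "buildId": "20141126041045",
    "version": "35.0",
    "vendor": "Mozilla",
    "platformVersion": "35.0",
    "xpcomAbi": "x86-msvc",
}
sample_ping = {
    "application": {**common, "name": "Firefox"},
    "environment": {"build": {**common, "applicationName": "Firefox"}},
}
duplicates = repeated_fields(sample_ping)
```

For this sample, all seven fields of the "application" section show up again under "environment.build".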
(In reply to Mike Trinkala [:trink] from comment #1)
> A big performance hit in processing our textual data (JSON or not) is the
> date/time parsing and conversion. Switching from the time variable from
> "time":"2014-04-28T22:18:12.704Z"' to '"time":1.398723492704e+18 in the FxA
> logs increase processing speed by 1.18x.

Note that the date/time strings were specifically requested in the design phase. See e.g. bug 1120981, comment 9 ff.
We have to process every log, but we only manually/visually inspect 1 in several billion. Does that justify taking the performance hit just to avoid using a reader that outputs a human-readable format when visual inspection is required?
Flags: needinfo?(bcolloran)
Non-human-readable strings can be parsed by machines in about a billionth of the time it takes a human, so that's a good trade of human time for machine time :-) In any case, many (most?) of these strings should actually be just date stamps like "2014-04-28", without the time and timezone (see https://bugzilla.mozilla.org/show_bug.cgi?id=1134661#c9). Perhaps removing the extra time information will speed up parsing?
Flags: needinfo?(bcolloran)
Blocks: 1201022
No longer blocks: 1122482
This needs a review of the problems mentioned and breaking them down into specific bugs.
Priority: -- → P3
Whiteboard: [measurement:client] [needs breakdown]
Version: 37 Branch → Trunk
Priority: P3 → P4
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX