Closed Bug 1245514 Opened 9 years ago Closed 9 years ago

Histograms.json should adhere to a schema

Categories

(Toolkit :: Telemetry, defect, P4)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: rvitillo, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Parsing Histograms.json is no fun due to the initial design that allowed values to be expressions, see e.g.: - fields that in some declarations are strings and in others are bool [1] [2] for no good reason - fields that contain expressions [3] We should stick to a strict schema and get rid of computed values. [1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json?case=true&from=Histograms.json#9074 [2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json?case=true&from=Histograms.json#9276 [3] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json?case=true&from=Histograms.json#5443
Don't forget that it's actually much, much worse than just having calculations in those fields. There are 4 n_values fields that reference _C++ constants_: GC_REASON_2, GC_MINOR_REASON, GC_MINOR_REASON_LONG, STARTUP_MEASUREMENT_ERRORS
(In reply to Chris H-C :chutten from comment #1) > Don't forget that it's actually much, much worse than just having > calculations in those fields. > > There are 4 n_values fields that reference _C++ constants_: GC_REASON_2, > GC_MINOR_REASON, GC_MINOR_REASON_LONG, STARTUP_MEASUREMENT_ERRORS Does anyone remember a reason behind those? Given that we shouldn't just change buckets arbitrarily anyway i don't see why we should allow this. If Python has an easy schema integration we could just integrate the schema check into our histogram python scripts and enforce this.
Blocks: 1201022
Priority: -- → P4
Whiteboard: [measurement:client]
(In reply to Georg Fritzsche [:gfritzsche] from comment #2) > (In reply to Chris H-C :chutten from comment #1) > > Don't forget that it's actually much, much worse than just having > > calculations in those fields. > > > > There are 4 n_values fields that reference _C++ constants_: GC_REASON_2, > > GC_MINOR_REASON, GC_MINOR_REASON_LONG, STARTUP_MEASUREMENT_ERRORS > > Does anyone remember a reason behind those? > Given that we shouldn't just change buckets arbitrarily anyway i don't see > why we should allow this. Because this ensures that when you add new constants on the C++ side, the histograms automatically update. One can argue that this is not the desired behavior, but adding new C++ constants and not getting appropriate measurements out of the histograms is not desirable behavior either.
(In reply to Nathan Froyd [:froydnj] from comment #3) > (In reply to Georg Fritzsche [:gfritzsche] from comment #2) > > (In reply to Chris H-C :chutten from comment #1) > > > Don't forget that it's actually much, much worse than just having > > > calculations in those fields. > > > > > > There are 4 n_values fields that reference _C++ constants_: GC_REASON_2, > > > GC_MINOR_REASON, GC_MINOR_REASON_LONG, STARTUP_MEASUREMENT_ERRORS > > > > Does anyone remember a reason behind those? > > Given that we shouldn't just change buckets arbitrarily anyway i don't see > > why we should allow this. > > Because this ensures that when you add new constants on the C++ side, the > histograms automatically update. One can argue that this is not the desired > behavior, but adding new C++ constants and not getting appropriate > measurements out of the histograms is not desirable behavior either. Maybe we could assert matching the C++ constants in a different way (e.g. by adding tests or a specialized field)? The way they are written now we could not validate or fully describe histograms without a matching build and further work.
(In reply to Georg Fritzsche [:gfritzsche] from comment #4) > Maybe we could assert matching the C++ constants in a different way (e.g. by > adding tests or a specialized field)? Adding appropriate static_asserts in Telemetry.cpp: static_assert(gHistograms[GC_REASON_2].bucketCount <= JAVASCRIPT_ENUM_COUNT, "message"); seems like a reasonable way to handle it, with a comment about changing the histogram name if the assert fires, or something like that. That condition might need to be massaged a bit...
Depends on: 920169
Depends on: 1245910
Blocks: 1251580
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.