crash_id missing from BigQuery crash_v4 table schema
Categories
(Data Platform and Tools :: General, defect, P1)
Tracking
(Not tracked)
People
(Reporter: tdsmith, Assigned: amiyaguchi)
References
Details
(Whiteboard: [dataquality])
Attachments
(1 file)
(deleted),
text/x-github-pull-request
Crash pings have a payload.crashId field that isn't represented in the telemetry_stable.crash_v4 schema.
It looks like we only considered the payload.metadata members in bug 1595904; it'd be nice to add this one as well since it's the join key to other datasets.
Comment 1•5 years ago
Query that verifies this in the last stable day: https://sql.telemetry.mozilla.org/queries/67144/source
A quick search of the crash schema also shows that this field is missing from the payload: https://github.com/mozilla-services/mozilla-pipeline-schemas/blob/master/schemas/telemetry/crash/crash.4.schema.json
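The schema check described above can also be done programmatically. The following is a minimal sketch, assuming the schema is ordinary JSON Schema with nested "properties" objects; EXAMPLE_SCHEMA is a hypothetical, heavily simplified stand-in for crash.4.schema.json, and schema_declares is an illustrative helper rather than part of any Mozilla tooling.

```python
from typing import Any

# Hypothetical, heavily simplified stand-in for crash.4.schema.json;
# the real schema declares many more fields.
EXAMPLE_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "payload": {
            "type": "object",
            "properties": {
                "metadata": {"type": "object", "properties": {}},
            },
        },
    },
}


def schema_declares(schema: dict[str, Any], dotted_path: str) -> bool:
    """Return True if each segment of dotted_path is declared under nested "properties"."""
    node = schema
    for segment in dotted_path.split("."):
        properties = node.get("properties", {})
        if segment not in properties:
            return False
        node = properties[segment]
    return True


print(schema_declares(EXAMPLE_SCHEMA, "payload.metadata"))  # True
print(schema_declares(EXAMPLE_SCHEMA, "payload.crashId"))   # False
```

To run this against the real schema, load the JSON file with json.load and pass the resulting dict in place of EXAMPLE_SCHEMA.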
Comment 2•5 years ago
Comment 3•5 years ago
(In reply to Tim Smith 👨🔬 [:tdsmith] from comment #0)
Crash pings have a payload.crashId field that isn't represented in the telemetry_stable.crash_v4 schema.
I don't remember exactly what's in that ID and I'm not sure if it should be publicly visible. It might be either the crash ID (as generated locally on the machine) or the submission ID (generated by Socorro) but I think it's the former. In that case it shouldn't be possible to correlate it with the data on crash-stats.mozilla.org because we don't store it there. The proper way to correlate crash pings with data in crash-stats.mozilla.org is via the minidump hash (minidumpSha256Hash field).
We designed that field so that it can be public (in telemetry) because the corresponding field is not on crash-stats.mozilla.org. Generally speaking we don't want to make it possible to correlate telemetry data to crash reports via public information.
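For illustration, a hash of this kind can be computed as below. This is only a sketch: it assumes, going by the field name, that minidumpSha256Hash is the hex-encoded SHA-256 digest of the raw minidump bytes, and minidump_sha256 is a hypothetical helper name, not a function from the crash reporter.

```python
import hashlib


def minidump_sha256(minidump_bytes: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes.

    Assumption: the ping's minidumpSha256Hash field is computed this way
    over the raw minidump file contents.
    """
    return hashlib.sha256(minidump_bytes).hexdigest()


# Placeholder bytes standing in for a real minidump file.
fake_minidump = b"MDMP" + bytes(16)
print(minidump_sha256(fake_minidump))
```

Because the digest reveals nothing about the dump's contents, it can serve as a correlation key without exposing the crash ID itself.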
Comment 4•5 years ago
The schema deploy has happened and the table now includes the payload.crash_id column, but its values are currently missing as per this query. It looks like there hasn't been a new deploy of the ingestion service yet, so this field is still unavailable in the column.
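Note that the ping field payload.crashId appears in the table as payload.crash_id. A rough sketch of that kind of camelCase to snake_case mapping follows; to_snake_case is a hypothetical helper written for illustration, and the pipeline's actual column-naming rules may differ in edge cases (for example around digits or runs of capitals).

```python
import re


def to_snake_case(name: str) -> str:
    """Insert an underscore before each capital letter (except at the start), then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(to_snake_case("crashId"))             # crash_id
print(to_snake_case("minidumpSha256Hash"))  # minidump_sha256_hash
```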
Comment 5•5 years ago
Appears to be fixed now, thanks Anthony.