Closed Bug 1604666 Opened 5 years ago Closed 5 years ago

crash_id missing from BigQuery crash_v4 table schema

Categories

(Data Platform and Tools :: General, defect, P1)

Points: 1

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: tdsmith, Assigned: amiyaguchi)

References

Details

(Whiteboard: [dataquality])

Attachments

(1 file)

Crash pings have a payload.crashId field that isn't represented in the telemetry_stable.crash_v4 schema.

It looks like we only considered the payload.metadata members in bug 1595904; it'd be nice to add this one as well since it's the join key to other datasets.

Query that verifies this in the last stable day: https://sql.telemetry.mozilla.org/queries/67144/source

A quick search of the crash schema also shows that this field is missing from the payload: https://github.com/mozilla-services/mozilla-pipeline-schemas/blob/master/schemas/telemetry/crash/crash.4.schema.json
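For anyone who wants to repeat the schema check without eyeballing the JSON, here is a minimal sketch. It assumes the schema nests fields under standard JSON Schema `properties` keys; the `schema_has_property` helper and the trimmed `toy_schema` are hypothetical stand-ins, not the real crash.4.schema.json.

```python
from typing import Any, Mapping


def schema_has_property(schema: Mapping[str, Any], dotted_path: str) -> bool:
    """Return True if every segment of dotted_path is declared under nested `properties`."""
    node = schema
    for segment in dotted_path.split("."):
        properties = node.get("properties", {})
        if segment not in properties:
            return False
        node = properties[segment]
    return True


# Hypothetical, heavily trimmed stand-in for the crash ping schema.
toy_schema = {
    "type": "object",
    "properties": {
        "payload": {
            "type": "object",
            "properties": {
                "metadata": {"type": "object", "properties": {}},
                "minidumpSha256Hash": {"type": "string"},
            },
        }
    },
}

# The field reported in this bug is absent; a declared field is found.
print(schema_has_property(toy_schema, "payload.crashId"))
print(schema_has_property(toy_schema, "payload.minidumpSha256Hash"))
```

To run it against the real schema, load the linked file with `json.load` and pass the resulting dict in place of `toy_schema`.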

Assignee: nobody → amiyaguchi

(In reply to Tim Smith 👨‍🔬 [:tdsmith] from comment #0)

> Crash pings have a payload.crashId field that isn't represented in the telemetry_stable.crash_v4 schema.

I don't remember exactly what's in that ID and I'm not sure if it should be publicly visible. It might be either the crash ID (as generated locally on the machine) or the submission ID (generated by Socorro) but I think it's the former. In that case it shouldn't be possible to correlate it with the data on crash-stats.mozilla.org because we don't store it there. The proper way to correlate crash pings with data in crash-stats.mozilla.org is via the minidump hash (minidumpSha256Hash field).

We designed that field so that it can be public (in telemetry) because the corresponding field is not on crash-stats.mozilla.org. Generally speaking we don't want to make it possible to correlate telemetry data to crash reports via public information.
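To illustrate the correlation described above, here is a rough sketch of matching crash pings to crash reports on the minidump hash. The `join_on_minidump_hash` helper, the report-side key name `minidump_sha256_hash`, and the sample records are all assumptions made for illustration; only the ping-side `minidumpSha256Hash` field name comes from this bug.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def join_on_minidump_hash(
    pings: Iterable[Record], reports: Iterable[Record]
) -> List[Tuple[Record, Record]]:
    """Pair each crash ping with the crash reports that share its minidump hash."""
    # Index reports by hash; the key name on the report side is hypothetical.
    reports_by_hash: Dict[str, List[Record]] = {}
    for report in reports:
        digest = report.get("minidump_sha256_hash")
        if digest:
            reports_by_hash.setdefault(digest, []).append(report)

    pairs: List[Tuple[Record, Record]] = []
    for ping in pings:
        digest = ping.get("payload", {}).get("minidumpSha256Hash")
        for report in reports_by_hash.get(digest, []):
            pairs.append((ping, report))
    return pairs


# Made-up sample records; the hashes are placeholders, not real digests.
sample_pings = [
    {"payload": {"crashId": "local-1", "minidumpSha256Hash": "aaa"}},
    {"payload": {"crashId": "local-2", "minidumpSha256Hash": "bbb"}},
    {"payload": {"crashId": "local-3"}},  # ping without a minidump hash
]
sample_reports = [
    {"report_id": "r-1", "minidump_sha256_hash": "bbb"},
    {"report_id": "r-2", "minidump_sha256_hash": "ccc"},
]

matched = join_on_minidump_hash(sample_pings, sample_reports)
print([(p["payload"]["crashId"], r["report_id"]) for p, r in matched])
```

Pings with no hash, or with a hash that no report carries, simply drop out of the result, which is the behaviour of an inner join.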

The schema deploy has happened and the table now includes the payload.crash_id column, but its values are currently missing as per this query. It looks like there hasn't been a new deploy of the ingestion service yet, so this field is still unavailable in the column.
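Note that the ping field is payload.crashId while the column is payload.crash_id. A simplified sketch of that kind of camelCase to snake_case renaming follows; the `to_snake_case` helper is hypothetical and the real normalization rules in the pipeline may differ for edge cases.

```python
import re


def to_snake_case(name: str) -> str:
    """Insert an underscore before each capital that follows a lowercase letter or digit."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


# Field names taken from this bug; the conversion rule itself is an assumption.
for field in ("crashId", "minidumpSha256Hash"):
    print(field, "->", to_snake_case(field))
```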

Points: --- → 1
Priority: -- → P1

Appears to be fixed now, thanks Anthony.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [data-platform-health]
Whiteboard: [data-platform-health] → [data-quality]
Component: Datasets: General → General
Whiteboard: [data-quality] → [dataquality]