Closed Bug 1592263 Opened 5 years ago Closed 4 years ago

Make slow SQL telemetry available in BigQuery

Categories

(Data Platform and Tools :: General, task, P2)

Points:
2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lina, Assigned: ascholtz)

References

(Depends on 1 open bug)

Details

(Whiteboard: [dataquality])

Attachments

(1 file)

(I'm not sure if this component is the right place for this bug; please feel free to move it if it's not!)

I wanted to whip up a dashboard showing slow SQL statements, to keep an eye on the performance of the new bookmark sync and other Places queries.

It doesn't look like the slow SQL payload has a schema definition yet, which might be why I wasn't able to query moz-fx-data-shared-prod.telemetry_live.main_v4.payload.slow_sql. Do we need to add one?

However, I'm also wondering if the current format of the slow SQL data can be fed into BigQuery. Looking at the docs, and at Services.telemetry.slowSQL in Desktop, I see a structure like this:

{ "mainThread": Stats, "otherThreads": Stats }

Here, Stats is an object whose keys are the sanitized SQL statements, and each value is a two-element array: [hitCount, totalTime]. For a complete example:

{
  "mainThreads": {},
  "otherThreads": {
    "UPDATE moz_places SET frecency = ...": [2, 271]
  }
}

Are those dynamic keys going to be a problem for the pipeline? Do we need to transform them into something (an array of { sql, hitCount, totalTime } objects?) on the backend, so that we can validate and get them into BQ?

Thanks!

> Do we need to transform them into something (an array of { sql, hitCount, totalTime } objects?) on the backend, so that we can validate and get them into BQ?

Or, can we store them as JSON strings, and define SQL functions to extract the statements and values, like we do for histograms?
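
Here's a rough sketch of the kind of function I'm imagining, as a BigQuery JavaScript UDF. The name extract_slow_sql and the output shape are made up for this example, and it assumes the stored string is valid JSON:

CREATE TEMP FUNCTION extract_slow_sql(stats STRING)
RETURNS ARRAY<STRUCT<statement STRING, hit_count FLOAT64, total_time FLOAT64>>
LANGUAGE js AS """
  // Flatten {"<sanitized SQL>": [hitCount, totalTime], ...} into an array of
  // structs. FLOAT64 is used because JS UDFs can't return INT64 directly.
  if (stats === null) return [];
  var parsed = JSON.parse(stats);
  return Object.keys(parsed).map(function(sql) {
    return {statement: sql, hit_count: parsed[sql][0], total_time: parsed[sql][1]};
  });
""";

SELECT
  extract_slow_sql('{"UPDATE moz_places SET frecency = ...": [2, 271]}')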

This is indeed going into additional_properties in main pings, since it's not fully defined in the schema. For example:

SELECT
  JSON_EXTRACT(additional_properties, '$.payload.slowSQL')
FROM
  `moz-fx-data-shared-prod.telemetry.main`
WHERE
  DATE(submission_timestamp) = "2019-10-30"
LIMIT 100

This returns mostly nulls, but I got one hit: {"mainThread":{"UPDATE moz_cache SET DataSiz....

We could choose to store these as JSON strings. But it also sounds like the structure is defined enough that we could express it in JSON Schema and have that flow to fields in BQ. For the dynamic keys, each map would show up as a REPEATED key/value STRUCT, where the key is the SQL statement and the value would likely be an anonymous tuple.
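
If it flows to BQ that way, querying the map would look much like keyed histograms do today. A rough sketch (the main_thread/key/value field names are my guess at what the schema deploy would produce, not the final schema):

SELECT
  slow.key AS statement,
  slow.value
FROM
  `moz-fx-data-shared-prod.telemetry.main`,
  UNNEST(payload.slow_sql.main_thread) AS slow
WHERE
  DATE(submission_timestamp) = "2019-10-30"
LIMIT 100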

Here's a query I've created, using the data in additional_properties:

https://sql.telemetry.mozilla.org/queries/65832

Defining a schema for slowSQL would go a long way toward making it easier to analyze. Something in the pipeline strips the escape sequences for newlines, backslashes, and quotes, so we can't just pass additional_properties.slowSQL to JSON.parse in a SQL function.
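
A contrived example of the failure mode (the literal here just stands in for what the pipeline emits after stripping): once the escapes are gone, quotes inside a statement make the string invalid JSON, so extraction comes back NULL instead of the value:

-- The quotes around "name" should be \" but the escapes were stripped,
-- so this is no longer parseable JSON.
SELECT
  JSON_EXTRACT('{"mainThread": {"SELECT "name" FROM t": [1, 5]}}', '$.mainThread')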

It would also be helpful to normalize statements before recording them (bug incoming), and to record the DB name in a separate field instead of as a comment in the SQL string. I added some more notes at the top of the query.

Depends on: 1594264

For reference, here is some PySpark code to process the slowSQL payload. It uses raw pings from the old AWS workflow to get the payload. CMD 4 onward could be useful for slowSQL analyses.

Points: --- → 2
Priority: -- → P2
Component: Datasets: Main Summary → General
Whiteboard: [data-quality]
Assignee: nobody → ascholtz

The schema changes have been deployed, and payload.slow_sql can now be queried in BigQuery.
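
For example, something along the lines of the redash query above can now run against the structured field directly. This sketch assumes the [hitCount, totalTime] tuple landed as a repeated INT64 value under key/value structs, as discussed above:

SELECT
  slow.key AS statement,
  SUM(slow.value[OFFSET(0)]) AS total_hits,
  SUM(slow.value[OFFSET(1)]) AS total_time_ms
FROM
  `moz-fx-data-shared-prod.telemetry.main`,
  UNNEST(payload.slow_sql.main_thread) AS slow
WHERE
  DATE(submission_timestamp) = "2020-08-01"
GROUP BY
  statement
ORDER BY
  total_time_ms DESC
LIMIT 100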

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Whiteboard: [data-quality] → [dataquality]
