Closed
Bug 1367023
Opened 7 years ago
Closed 6 years ago
Need a Slow SQL periodic report for Sqlite consumers perf triage
Categories
(Data Platform and Tools :: General, enhancement, P3)
Data Platform and Tools
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mak, Assigned: ccd, NeedInfo)
References
(Blocks 1 open bug, )
Details
(Whiteboard: [DataPlatform])
Attachments
(1 file)
(deleted),
text/plain
|
Details |
From some time the Slow SQL telemetry dashboard has been down, as a result we are lacking any kind of data to identify and prioritize bottlenecks and janks due to Sqlite usage in the product. We still have quite a few components using Sqlite and none of the current Talos tests are good to evaluate their impact. While it's possible a dashboard is not the right solution to this problem, having some kind of periodic report would be of great help to identify new and existing performance issues in the way consumers use Sqlite. Thus, I'd like to kick-off a discussion about what we can do to have this kind of periodic report. What I care about is something like a top-tenish of SQL offenders on and off the main-thread, where by that we care both about which queries take more time on a single execution, but also which queries are executed the most often in short timeframes (blocks of 1 hour or such). Ni? mreid to report what we had available in the previous Slow SQL dashboard. PS: I'm not 100% sure this is in the right component, there are a lot of components about data collection and I couldn't find a documentation about which one to use for new requirements :(
Flags: needinfo?(mreid)
Reporter | ||
Updated•7 years ago
|
Blocks: StorageJank
Updated•7 years ago
|
Component: Metrics: Data Tools → Datasets: General
Product: Cloud Services → Data Platform and Tools
Comment 1•7 years ago
|
||
This is the notebook that previously powered the SlowSQL dashboard. Historically, each day's data was processed, then the current week's daily files were combined into a weekly summary, which was published to a web-facing S3 bucket. The SlowSQL dashboard consumed these weekly summary files directly.
Flags: needinfo?(mreid)
Reporter | ||
Comment 2•7 years ago
|
||
bug 1399879 is one case where this report would have helped us noticing a bug in the wild.
Comment 3•7 years ago
|
||
Marking as P3, since I do not anticipate our team being able to handle this in Q4. Let me know if this is in error.
Points: --- → 3
Priority: -- → P3
Reporter | ||
Comment 4•6 years ago
|
||
Bug 1461753 is another case where this could have been useful (if done weekly or such), since we started having a lot of queries taking more time than usual and causing other queries on the main-thread to be slow. We currently still don't have a good way to monitor increase/decrease in the number and time of slow sql queries.
Updated•6 years ago
|
Whiteboard: [DataPlatform]
We ran into this need again with bug 1462046, is it something you might be able to help with? Thanks.
Flags: needinfo?(rharter)
Flags: needinfo?(mreid)
Comment 6•6 years ago
|
||
Adding Corey and myself to the bug. I'm sure we can help.
Comment 7•6 years ago
|
||
Thanks, Liz. :joy and Corey will follow up with you to figure out what analysis needs to be done.
Flags: needinfo?(rharter)
Updated•6 years ago
|
Flags: needinfo?(mreid)
Assignee | ||
Comment 8•6 years ago
|
||
V1 version of the SlowSQL dataset awaiting to be published on re.dash for subsequent dashboarding and other potential uses: https://bugzilla.mozilla.org/show_bug.cgi?id=1465272
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 9•6 years ago
|
||
Corey, if we can we help you anyway with this, just let us know.
Assignee | ||
Comment 10•6 years ago
|
||
Working on getting the dataset publishing expedited on #fx-metrics. Looks like the Bugzilla template referenced in docs.telemetry.mozilla.org was off, which is why it has been taking so long!
Assignee | ||
Comment 11•6 years ago
|
||
Initial draft of Slow SQL dashboard on re.dash: https://sql.telemetry.mozilla.org/dashboard/slowsql_1
Assignee | ||
Comment 12•6 years ago
|
||
Edits as per Marco and I's meeting added
Reporter | ||
Comment 13•6 years ago
|
||
Would it be possible to sort the app version in the select fields? finding a version in the middle of many is not trivial.
Assignee: nobody → cdowhygelund
Status: NEW → ASSIGNED
Flags: needinfo?(cdowhygelund)
Assignee | ||
Comment 14•6 years ago
|
||
The ascending sort on app version has been implemented. In addition the weekly update job has been implemented and successfully run.
Flags: needinfo?(cdowhygelund)
Assignee | ||
Comment 15•6 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #13) > Would it be possible to sort the app version in the select fields? finding a > version in the middle of many is not trivial. Implemented
Reporter | ||
Comment 16•6 years ago
|
||
I think for further improvements of bug fixes we could file separate bugs. The current state looks good enough. Is the dashboard position final, should it be moved? Could it be linked on the main telemetry.mozilla.org page? Otherwise, I think we can mark this as fixed.
Flags: needinfo?(cdowhygelund)
Assignee | ||
Comment 17•6 years ago
|
||
I believe the location is final with respect to this bug. Moving it to telemetry.mozilla.org should be a separate bug ticket.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(cdowhygelund)
Resolution: --- → FIXED
Assignee | ||
Comment 18•5 years ago
|
||
:mak Is this dashboard still being utilized? The underlying job is utilizing a deprecated library, in addition to the dashboard relying on AWS infrastructure which is being decommissioned. Therefore, it needs to be transitioned over to GCP if it is being used.
Flags: needinfo?(mak)
Reporter | ||
Comment 19•5 years ago
|
||
Lina is working on something related in bug 1592263. Having a way to visualize slow queries is always useful, though it may be enough to be able to query the data as Lina is doing, provided that method is not going away. If so, we could get rid of this dashboard.
ni? Lina just as an heads up.
Flags: needinfo?(mak) → needinfo?(lina)
Assignee | ||
Comment 20•5 years ago
|
||
For reference, the slowSQL job used to populate this deprecated dashboard.
Updated•2 years ago
|
Component: Datasets: General → General
You need to log in
before you can comment on or make changes to this bug.
Description
•