Closed
Bug 1272354
Opened 9 years ago
Closed 8 years ago
Add telemetry for DLC service
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox55 fixed)
RESOLVED
FIXED
Firefox 55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: sebastian, Assigned: sebastian)
References
Details
Attachments
(5 files)
(deleted),
text/x-review-board-request
|
Grisha
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
Grisha
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
Grisha
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
Grisha
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
Grisha
:
review+
liuche
:
review+
|
Details |
Now that DLC is shipping and we are enabling catalog sync (bug 1271352), it would be interesting to see how this performs in the wild. Especially to implement a better scheduling (bug 1257492).
Comment 1•8 years ago
|
||
Before we move l10n into DLC (bug 945123), we should know how DLC performs in practice.
I'll mark bug 945123 to block switching on ICU for Android, so this would be really good to have soon, and even think about potentially uplifting it to get more reliable data.
Blocks: 945123
Comment 2•8 years ago
|
||
Sebastian, is this something we might have soon? (I realize there are always competing priorities)
Flags: needinfo?(s.kaspari)
Assignee | ||
Comment 3•8 years ago
|
||
Yeah, I can squeeze this in. Let me know if you have specific questions / probe requirements. From IRC I remember that :Pike is interested in "install time until DLC ready". In addition to that I wanted to have some simple probes here for download completed/failed, synchronization of catalog completed/failed and maybe the type of network.
Assignee: nobody → s.kaspari
Status: NEW → ASSIGNED
Flags: needinfo?(s.kaspari)
Comment 4•8 years ago
|
||
Thanks. Yeah, what I said on irc is my main point of interest, the other things sound like good additional information to me, too.
Assignee | ||
Comment 5•8 years ago
|
||
I looked into this a bit today and one of the things I'm still not sure about is whether we want to track this (success, failure, time etc.) for every item in the catalog or generate more something like a summarized report. Currently we have 48 files in the catalog (fonts + hyphenation dictionaries) and my only concern here is that this could result in a lot of telemetry data. Although we do not perform downloads often (as there are not many updates to the current set of files). Synchronizing the catalog (or more like checking that there's no new data available) is something we do more often though.
Ignoring probe sizes I'd track for success: content id, time, network type. And for errors additionally the type of error (We already handle specific error type classes internally). For the catalog sync I'd track the same type of data but instead of content id I'd track the number of updated records.
Comment 6•8 years ago
|
||
(In reply to Sebastian Kaspari (:sebastian) from comment #5)
> I looked into this a bit today and one of the things I'm still not sure
> about is whether we want to track this (success, failure, time etc.) for
> every item in the catalog or generate more something like a summarized
> report. Currently we have 48 files in the catalog (fonts + hyphenation
> dictionaries) and my only concern here is that this could result in a lot of
> telemetry data. Although we do not perform downloads often (as there are not
> many updates to the current set of files). Synchronizing the catalog (or
> more like checking that there's no new data available) is something we do
> more often though.
One efficient way to track this would be using a categorical histogram [1] for each success & failure scenario.
For times i'd assume monitoring this in one histogram is a good start, if this data shows real issues then this could be followed up by QA, more detailed measurements, ...
Are you looking into getting this from opt-in Telemetry or opt-out (i.e. beta vs. release population)?
1: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Choosing_a_Histogram_Type
Comment 7•8 years ago
|
||
FWIW, i'm just limiting categorical histograms to 50 labels today.
This sounds like a potential reason to increase that limit or allow overriding it.
Assignee | ||
Comment 8•8 years ago
|
||
Thanks for the hints!
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> Are you looking into getting this from opt-in Telemetry or opt-out (i.e.
> beta vs. release population)?
Oh, yeah, this is an interesting problem. Most of the initial and interesting events happen way before the user had a chance to opt-in - so we will not be able to measure that. That's a problem.
Comment 9•8 years ago
|
||
On pre-release (Beta, Aurora, Nightly) that is not a problem (we default to "opt-in" data being enabled there).
So the question is mainly: do you need the data from most release users or is the beta population enough to track the status?
The "core" ping is what we get for all release users, but is designed for very minimal b/w impact.
The richer Telemetry data we only get for pre-release users (and the few release users that opted in).
Comment 10•8 years ago
|
||
:sebastian, any update on this? It's a blocked on my DLC l10n in Fennec list.
Flags: needinfo?(s.kaspari)
Comment 11•8 years ago
|
||
Hi Sebastian
I just walked by and have no idea what is downloadable content and I found this page : https://wiki.mozilla.org/Mobile/Fennec/Android/Downloadable_Content.
Is this page still valid? Thank you!
Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #10)
> :sebastian, any update on this? It's a blocked on my DLC l10n in Fennec list.
Sorry that this is still taking time. This week I'm still busy getting Focus for Android out of the door.
(In reply to Nevin Chen [:nechen] from comment #11)
> I just walked by and have no idea what is downloadable content and I found
> this page :
> https://wiki.mozilla.org/Mobile/Fennec/Android/Downloadable_Content.
>
> Is this page still valid? Thank you!
Yes, this page is still valid. Although not as detailed as I would like it to be. If you are going to work on DLC bugs then let's jump on Vidyo and I can onboard you. :)
Flags: needinfo?(s.kaspari)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 18•8 years ago
|
||
Those patches add some basic telemetry to monitor successful and failed downloads as well as catalog synchronizations. This could be even more detailed. However this will give us a good overview and based on the data we can decide what details might be interesting to see.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 24•8 years ago
|
||
mozreview-review |
Comment on attachment 8854025 [details]
Bug 1272354 - (DLC) Download Action: If there's no download scheduled then exit early.
https://reviewboard.mozilla.org/r/126018/#review128762
Attachment #8854025 -
Flags: review?(gkruglov) → review+
Comment 25•8 years ago
|
||
mozreview-review |
Comment on attachment 8854026 [details]
Bug 1272354 - (DLC) DownloadAction: Move network checks inside download loop.
https://reviewboard.mozilla.org/r/126020/#review128764
::: commit-message-79c77:4
(Diff revision 2)
> +Bug 1272354 - (DLC) DownloadAction: Move network checks inside download loop. r?grisha
> +
> +This has two advantages:
> +* If the network changes while we are downloading content then we do not continue on an unmetered network.
I think you meant "... on a metered network."?
Attachment #8854026 -
Flags: review?(gkruglov) → review+
Comment 26•8 years ago
|
||
mozreview-review |
Comment on attachment 8854027 [details]
Bug 1272354 - (DLC) Download Action: Add telemetry to track download success/failure.
https://reviewboard.mozilla.org/r/126022/#review128766
It seems like a histogram event to record % success might be interesting. E.g. scheduled 13 things to download, succeeded with 10, record 0.76. Although, rescheduling files might muddy that ratio a bit.
::: mobile/android/base/java/org/mozilla/gecko/dlc/DownloadAction.java:82
(Diff revision 2)
>
> File destinationFile = getDestinationFile(context, content);
> if (destinationFile.exists() && verify(destinationFile, content.getChecksum())) {
> Log.d(LOGTAG, "Content already exists and is up-to-date.");
> catalog.markAsDownloaded(content);
> + DownloadContentTelemetry.eventDownloadSuccess(content);
I'm not familiar with the scheduler - under what conditions would we have a scheduled file already present and matching checksums locally? If there isn't a good reason why this might happen, perhaps that's not a "success" (even though "nothing to do" is a great outcome for this iteration of the loop!)
Attachment #8854027 -
Flags: review?(gkruglov) → review+
Comment 27•8 years ago
|
||
mozreview-review |
Comment on attachment 8854028 [details]
Bug 1272354 - (DLC) Sync Action: Add telemetry to track sync success/failure.
https://reviewboard.mozilla.org/r/126024/#review128768
Attachment #8854028 -
Flags: review?(gkruglov) → review+
Comment 28•8 years ago
|
||
mozreview-review |
Comment on attachment 8854029 [details]
Bug 1272354 - Add in-tree documentation of DLC telemetry.
https://reviewboard.mozilla.org/r/126026/#review128770
Looks great!
::: mobile/android/docs/downloadcontenttelemetry.rst:17
(Diff revision 2)
> +All UI events use action ``action.1`` and method ``service``. Additional information is structured as JSON blobs (``extras`` field).
> +
> +Downloading content
> +===================
> +
> +Telemetry about downloaded content has the extra field ``action`` set to ``dlc_download``. The ``result`` field will be set to "success" or ``failure`` depending on whether the content could be downloaded successfully. The ``content`` field will contain the UUID of the content that was downloaded or failed to be downloaded. This is helpful to identify potentially corrupt content.
s/"success"/``success``
::: mobile/android/docs/downloadcontenttelemetry.rst:56
(Diff revision 2)
> + }
> +
> +Synchronizing catalog
> +=====================
> +
> +The app has a local catalog of content that can be downloaded. This catalog can be updated and is synchronized from a server. Telemetry about synchronizing the catalog has the extra field ``action`` set to ``dlc_sync``. Like for content downloads the ``result`` field will be set to "success" or ``failure`` depending on whether the synchronization was successful or not.
s/"success"/``success``
::: mobile/android/docs/downloadcontenttelemetry.rst:70
(Diff revision 2)
> +.. code-block:: js
> +
> + extras: {
> + "action": "dlc_sync",
> + "result": "success",
> + "updated": false,
This is just "fetched catalog size > 0", right? Would it be more useful to just record the number itself?
Attachment #8854029 -
Flags: review?(gkruglov) → review+
Comment 29•8 years ago
|
||
mozreview-review |
Comment on attachment 8854029 [details]
Bug 1272354 - Add in-tree documentation of DLC telemetry.
https://reviewboard.mozilla.org/r/126026/#review129060
::: mobile/android/docs/downloadcontenttelemetry.rst:17
(Diff revision 2)
> +All UI events use action ``action.1`` and method ``service``. Additional information is structured as JSON blobs (``extras`` field).
> +
> +Downloading content
> +===================
> +
> +Telemetry about downloaded content has the extra field ``action`` set to ``dlc_download``. The ``result`` field will be set to "success" or ``failure`` depending on whether the content could be downloaded successfully. The ``content`` field will contain the UUID of the content that was downloaded or failed to be downloaded. This is helpful to identify potentially corrupt content.
If you could clarify that the "content" is described in the "Synchronizing catalog" section, that'd be helpful.
::: mobile/android/docs/downloadcontenttelemetry.rst:19
(Diff revision 2)
> +Downloading content
> +===================
> +
> +Telemetry about downloaded content has the extra field ``action`` set to ``dlc_download``. The ``result`` field will be set to "success" or ``failure`` depending on whether the content could be downloaded successfully. The ``content`` field will contain the UUID of the content that was downloaded or failed to be downloaded. This is helpful to identify potentially corrupt content.
> +
> +Telemetry extras send for a successful content download might look like this:
Nit: sent
Attachment #8854029 -
Flags: review?(liuche) → review+
Assignee | ||
Comment 30•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8854027 [details]
Bug 1272354 - (DLC) Download Action: Add telemetry to track download success/failure.
https://reviewboard.mozilla.org/r/126022/#review128766
> I'm not familiar with the scheduler - under what conditions would we have a scheduled file already present and matching checksums locally? If there isn't a good reason why this might happen, perhaps that's not a "success" (even though "nothing to do" is a great outcome for this iteration of the loop!)
For example:
* An entry in the catalog could change. In this case we might trigger a download but the actual file has not changed.
* Our process is killed after the download but before we could update the catalog on disk
* We ship a very basic catalog in the app. After updating the catalog from the server we might end up with a somehow different catalog but we already have the file locally.
* And: If there's already a file in the location we want to put it: Let's just test if its already the right one.. :)
The reason why I want to report this as success is that if we haven't been able to mark this file as correctly downloaded in the catalog then it probably wasn't reported to telemetry either.
Assignee | ||
Comment 31•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8854029 [details]
Bug 1272354 - Add in-tree documentation of DLC telemetry.
https://reviewboard.mozilla.org/r/126026/#review128770
> This is just "fetched catalog size > 0", right? Would it be more useful to just record the number itself?
Good question. Right now I don't think the actual number if meaningful - the value heavily depends on when the sync happened and at what time we updated the catalog. But let's see - this is the first iteration.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 36•8 years ago
|
||
Pushed by s.kaspari@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/2512c717a261
(DLC) Download Action: If there's no download scheduled then exit early. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/1b8c117b4c96
(DLC) DownloadAction: Move network checks inside download loop. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/ff4b04a63533
(DLC) Download Action: Add telemetry to track download success/failure. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/1f4417a2d084
(DLC) Sync Action: Add telemetry to track sync success/failure. r=Grisha
https://hg.mozilla.org/integration/autoland/rev/65632edcba8b
Add in-tree documentation of DLC telemetry. r=Grisha,liuche
Comment 37•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2512c717a261
https://hg.mozilla.org/mozilla-central/rev/1b8c117b4c96
https://hg.mozilla.org/mozilla-central/rev/ff4b04a63533
https://hg.mozilla.org/mozilla-central/rev/1f4417a2d084
https://hg.mozilla.org/mozilla-central/rev/65632edcba8b
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•