Closed
Bug 1053315
Opened 10 years ago
Closed 10 years ago
FHR stops reporting if it fails to initialize
Categories
(Firefox Health Report Graveyard :: Client: Desktop, defect, P1)
Firefox Health Report Graveyard
Client: Desktop
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 34
People
(Reporter: gps, Assigned: gps)
References
Details
(Keywords: regression)
Attachments
(1 file)
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
In bug 855413 comment 11, bsmedberg says that 0.3% of FHR submissions are setting the "notInitialized" flag. This flag is set when the client fails to initialize. We send a skeleton payload in this scenario so the server has a rough idea of how many clients are in an error state.
I believe I've found a bug introduced by bug 854018 that will cause us to not submit payloads in the case of initialization failure.
Here's the relevant code for doing the upload:
return Task.spawn(function doUpload() {
try {
// The test for upload locking monkeypatches getJSONPayload.
// If the next two lines change, be sure to verify the test is
// accurate!
this._uploadInProgress = true;
let payload = yield this.getJSONPayload();
let histogram = Services.telemetry.getHistogramById(TELEMETRY_PAYLOAD_SIZE_UNCOMPRESSED);
histogram.add(payload.length);
let lastID = this.lastSubmitID;
yield this._state.addRemoteID(id);
<----- Bug starts here
let hrProvider = this.getProvider("org.mozilla.healthreport");
if (hrProvider) {
let event = lastID ? "continuationUploadAttempt"
: "firstDocumentUploadAttempt";
hrProvider.recordEvent(event, now);
}
<----- Bug ends here
TelemetryStopwatch.start(TELEMETRY_UPLOAD, this);
let result;
try {
let options = {
deleteIDs: this._state.remoteIDs.filter((x) => { return x != id; }),
telemetryCompressed: TELEMETRY_PAYLOAD_SIZE_COMPRESSED,
};
result = yield client.uploadJSON(this.serverNamespace, id, payload,
options);
TelemetryStopwatch.finish(TELEMETRY_UPLOAD, this);
} catch (ex) {
TelemetryStopwatch.cancel(TELEMETRY_UPLOAD, this);
if (hrProvider) {
hrProvider.recordEvent("uploadClientFailure", now);
}
throw ex;
}
yield this._onBagheeraResult(request, false, now, result);
} finally {
this._uploadInProgress = false;
}
}.bind(this));
The bug is the code for recording the upload attempt metric assumes that FHR is initialized and working properly. If FHR failed to initialize (such as the SQLite database being corrupt), that code will raise and we'll never upload.
So, if FHR fails to initialize, we'll stop submitting payloads for these clients. It will look to metrics like people stopped using Firefox.
This has been a bug since Firefox 23. Assuming this bug is legit, the only way for us to know the impact is if a) we fix it b) we compare FHR reporting against other pings, such as the blocklist.
Flags: firefox-backlog+
Assignee | ||
Comment 1•10 years ago
|
||
When I initially filed this, I thought it could be wide-impacting. I'm now having second thoughts. More accurately, I'm not sure what the impact is.
The question resides in whether the FHR provider gets registered. Registration requires a SQLite connection. So, if we fail to establish a connection to SQLite at all, we're good: we don't have a bug. However, if we do manage to bind to SQLite and get the provider registered and the INSERT into SQLite throws, we get the bug.
So, our surface area isn't as large as I initially thought.
However, a lot of SQLite corruption issues aren't detected until we attempt to write to the database or read certain pages (opening the database can work just fine). Early in FHR days (~Firefox 21) when I was looking at stacks submitted from clients, we were seeing these SQLite errors while writing individual FHR events. So, I think we still have sufficient exposure to this bug to warrant fixing and uplift.
Assignee | ||
Comment 2•10 years ago
|
||
If recording FHR data during uploading raised an exception, it could
potentially abort the upload. This would appear to Mozilla as clients
that suddenly stopped using Firefox.
This patch adds explicit exception trapping around event record to
ensure this doesn't happen.
Attachment #8472504 -
Flags: review?(benjamin)
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → gps
Status: NEW → ASSIGNED
Comment 3•10 years ago
|
||
Hi Gregory, can you provide a point value and if the bug should be marked as [qa+] or [qa-] for verification.
Iteration: --- → 34.2
QA Whiteboard: [qa?]
Flags: needinfo?(gps)
Assignee | ||
Comment 4•10 years ago
|
||
Automated test coverage of this should be fine.
Points: --- → 1
QA Whiteboard: [qa?] → [qa-]
Flags: needinfo?(gps)
Updated•10 years ago
|
Attachment #8472504 -
Flags: review?(benjamin) → review+
Assignee | ||
Comment 5•10 years ago
|
||
Comment 6•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 34
Updated•6 years ago
|
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•