Closed
Bug 917883
Opened 11 years ago
Closed 10 years ago
Use AsyncShutdown instead of spinning the event loop in healthreport.jsm
Categories
(Firefox Health Report Graveyard :: Client: Desktop, defect)
Firefox Health Report Graveyard
Client: Desktop
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 32
People
(Reporter: Yoric, Assigned: Yoric)
References
(Blocks 1 open bug)
Details
(Whiteboard: [Async:team][Async Shutdown])
Attachments
(1 file, 8 obsolete files)
(deleted),
patch
|
Yoric
:
review+
|
Details | Diff | Splinter Review |
We need to consolidate the shutdown procedure of healthreport.jsm into AsyncShutdown. This should simplify the code, improve shutdown debugging information, and minimize the risk of introducing deadlocks with misbehaving add-ons.
Assignee | ||
Comment 1•11 years ago
|
||
Here is a first prototype, which passes tests.
Gps, can you tell me what you think of this refactoring? Is that compatible with the design of FHR?
Assignee: nobody → dteller
Attachment #807177 -
Flags: feedback?(gps)
Assignee | ||
Comment 2•11 years ago
|
||
Comment on attachment 807177 [details] [diff] [review]
Shutting down FHR without spinning the event loop
In the absence of news from gps, adding f? rnewman.
Attachment #807177 -
Flags: feedback?(rnewman)
Comment 3•11 years ago
|
||
Comment on attachment 807177 [details] [diff] [review]
Shutting down FHR without spinning the event loop
Review of attachment 807177 [details] [diff] [review]:
-----------------------------------------------------------------
I think this looks good. I'll have to look into the implementation of AsyncShutdown to be sure we're not giving up any robustness guarantees as part of this change.
::: services/healthreport/healthreporter.jsm
@@ +580,5 @@
>
> + this._log.warn("Shutdown complete.");
> + this._shutdownComplete = true;
> + this._deferredShutdown.resolve();
> + TelemetryStopwatch.finish(TELEMETRY_SHUTDOWN, this);
I think we may want to surround this entire function with a try..catch so there is no opportunity for deferredShutdown to not resolve.
How does AsyncShutdown handle an unresolved deferredShutdown?
@@ +586,5 @@
> },
>
> + _onStorageClose: function() {
> + // Do nothing.
> + // This method is overloaded by the test suite.
This comment is misleading because it is incomplete.
The purpose of this function is to provide a "hook" point so the tests can verify certain actions occurred and so errors can be injected at certain points. Although, we may not always be using the latter.
These functions contribute to a robust test suite testing many failure modes. We either need to inline the functionality the tests need or we preserve these APIs. I vote for the latter because hook points are generic and don't require code modifications to facilitate new kinds of tests.
@@ +591,5 @@
> },
>
> + _onProviderManagerShutdown: function() {
> + // Do nothing.
> + // This method is overloaded by the test suite.
Ditto.
Attachment #807177 -
Flags: feedback?(gps) → feedback+
Assignee | ||
Comment 4•11 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #3)
> Comment on attachment 807177 [details] [diff] [review]
> Shutting down FHR without spinning the event loop
>
> Review of attachment 807177 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I think this looks good. I'll have to look into the implementation of
> AsyncShutdown to be sure we're not giving up any robustness guarantees as
> part of this change.
>
> ::: services/healthreport/healthreporter.jsm
> @@ +580,5 @@
> >
> > + this._log.warn("Shutdown complete.");
> > + this._shutdownComplete = true;
> > + this._deferredShutdown.resolve();
> > + TelemetryStopwatch.finish(TELEMETRY_SHUTDOWN, this);
>
> I think we may want to surround this entire function with a try..catch so
> there is no opportunity for deferredShutdown to not resolve.
>
> How does AsyncShutdown handle an unresolved deferredShutdown?
Once bug 917764 has landed, it will crash after one minute of inactivity, with a report detailing which blockers have failed to resolve/reject.
> @@ +586,5 @@
> > },
> >
> > + _onStorageClose: function() {
> > + // Do nothing.
> > + // This method is overloaded by the test suite.
>
> This comment is misleading because it is incomplete.
Sure. Still better than what I had to work with, though :)
But I'll add something more precise – unless you want to take over that bug.
Comment 5•11 years ago
|
||
Comment on attachment 807177 [details] [diff] [review]
Shutting down FHR without spinning the event loop
gps showed up :D
Attachment #807177 -
Flags: feedback?(rnewman)
Assignee | ||
Comment 6•11 years ago
|
||
Applied feedback.
Attachment #807177 -
Attachment is obsolete: true
Attachment #818925 -
Flags: review?(gps)
Assignee | ||
Comment 7•11 years ago
|
||
Comment 8•11 years ago
|
||
Comment on attachment 818925 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v2
Review of attachment 818925 [details] [diff] [review]:
-----------------------------------------------------------------
r+ conditional on removing references to the removed telemetry probe.
::: services/healthreport/healthreporter.jsm
@@ -621,5 @@
> - this._shutdownCompleteCallback = Async.makeSpinningCallback();
> - this._shutdownCompleteCallback.wait();
> - this._shutdownCompleteCallback = null;
> - } finally {
> - TelemetryStopwatch.finish(TELEMETRY_SHUTDOWN_DELAY, this);
Since we're removing TELEMETRY_SHUTDOWN_DELAY, I reckon we should remove the const from this file and the probe from TelemetryHistograms.json.
Attachment #818925 -
Flags: review?(gps) → review+
Assignee | ||
Comment 9•11 years ago
|
||
Applied feedback.
Try: https://tbpl.mozilla.org/?tree=Try&rev=fa3202ba25c3
Attachment #818925 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 10•11 years ago
|
||
Another try run had crashes, believe it's this bug (out of the 5 therein):
https://tbpl.mozilla.org/?tree=Try&rev=2c426f843d08
https://tbpl.mozilla.org/php/getParsedLog.php?id=29570855&full=1&branch=try#error0
Keywords: checkin-needed
Assignee | ||
Comment 11•11 years ago
|
||
Try: https://tbpl.mozilla.org/?tree=Try&rev=9087e350ce2e
For the moment, I cannot reproduce the crashes, so they may be unrelated to this bug.
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 12•11 years ago
|
||
Keywords: checkin-needed
Comment 13•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 28
Comment 14•11 years ago
|
||
Backed out for causing bug 931642.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff49c9feb466
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 28 → ---
Updated•11 years ago
|
Whiteboard: [Async][Async Shutdown] → [Async:team][Async Shutdown]
Assignee | ||
Comment 16•11 years ago
|
||
I'm working on it semi-actively, but so far I haven't found the root of the problem: some sqlite database that looks closed is not always closed by the time we hit shutdown.
Flags: needinfo?(dteller)
Assignee | ||
Comment 17•11 years ago
|
||
A new version that apparently doesn't cause bug 931642.
https://tbpl.mozilla.org/?tree=Try&rev=738f8442bd3b
https://tbpl.mozilla.org/?tree=Try&rev=0024f6da9249
Attachment #820223 -
Attachment is obsolete: true
Attachment #8394434 -
Flags: review?(gps)
Assignee | ||
Comment 18•11 years ago
|
||
Comment on attachment 8394434 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v4
Since gps is busy, rnewman might be available.
Attachment #8394434 -
Flags: review?(rnewman)
Comment 19•11 years ago
|
||
Comment on attachment 8394434 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v4
Review of attachment 8394434 [details] [diff] [review]:
-----------------------------------------------------------------
I don't have the bandwidth to give this adequate review right now, and gps is better qualified to spot any errors in this regardless.
If you're confident that this is an incremental change over the version you landed (interdiff won't tell me) to fix the errors without introducing new problems, and you've done enough manual testing to be confident, you may consider my f+ to be rs=me.
::: services/healthreport/healthreporter.jsm
@@ +121,5 @@
>
> try {
> this._s = yield CommonUtils.readJSON(this._filename);
> + } catch (ex if ex instanceof OS.File.Error
> + && ex.becauseNoSuchFile) {
Nit: && at end of line.
@@ +328,5 @@
> + } catch (ex) {
> + this._log.error("Error deleting last payload: " +
> + CommonUtils.exceptionStr(ex));
> + }
> + // As soon as we have could storage, we need to register cleanup or
s/have could/could have
@@ +331,5 @@
> + }
> + // As soon as we have could storage, we need to register cleanup or
> + // else bad things happen on shutdown.
> + Services.obs.addObserver(this, "quit-application", false);
> + // The database needs to be shut down by the end of shutdown
Newlines before comments throughout, please. (Yeah, I know this is copypasta, but let's fix while we're breaking blame.)
Attachment #8394434 -
Flags: review?(rnewman) → feedback+
Assignee | ||
Comment 20•11 years ago
|
||
Well, this is blocking me and neither gps nor you has the bandwidth. How do you suggest I proceed?
Comment 21•11 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on April 9th-16th) from comment #20)
> Well, this is blocking me and neither gps nor you has the bandwidth. How do
> you suggest I proceed?
I gave some suggestions on how to proceed in the second paragraph of Comment 19, but I'm now back from my PTO. I will try to find time to give this full attention later today or tomorrow.
Assignee | ||
Comment 22•11 years ago
|
||
Thanks, I appreciate.
Comment 23•11 years ago
|
||
Comment on attachment 8394434 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v4
Review of attachment 8394434 [details] [diff] [review]:
-----------------------------------------------------------------
Back to you with some questions, David.
::: services/healthreport/healthreporter.jsm
@@ +311,5 @@
> }
>
> this._initializeStarted = true;
>
> + return Task.spawn(function*() {
function* is incredibly poorly documented:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*
so I honestly have no idea if this code is correct. For example, line 354 is simply:
return;
which seems incorrect if init() is supposed to return a promise.
Could you point me to better documentation?
@@ +322,5 @@
> + yield this._deleteOldLastPayload();
> + this._state._s.removedOutdatedLastpayload = true;
> + // Normally we should save this to a file but it directly conflicts with
> + // the "application re-upgrade" decision in HealthReporterState::init()
> + // which specifically does not save the state to a file.
Remove double spaces.
@@ +355,5 @@
> + }
> +
> + yield this._initializeProviderManager();
> + yield this._onProviderManagerInitialized();
> + this._initializedDeferred.resolve();
So we no longer call and return onInit(), which means that _initializedDeferred is not the promise that's returned from this method, right? That seems wrong.
@@ +570,4 @@
> },
>
> + onInit: function() {
> + return this._initializedDeferred.promise;
This is no longer called.
Attachment #8394434 -
Flags: review?(gps)
Assignee | ||
Comment 24•11 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #23)
> > this._initializeStarted = true;
> >
> > + return Task.spawn(function*() {
>
> function* is incredibly poorly documented:
>
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/
> function*
>
> so I honestly have no idea if this code is correct. For example, line 354 is
> simply:
>
> return;
>
> which seems incorrect if init() is supposed to return a promise.
>
> Could you point me to better documentation?
A function* always returns an iterator, regardless of its return statements. In turn, Task.spawn returns a Promise, regardless of what the function*() returns.
Doing |return somePromise;| in a |Task.spawn(function*() { ... }| is essentially a shorthand for
let value = yield somePromise;
return value;
> @@ +355,5 @@
> > + }
> > +
> > + yield this._initializeProviderManager();
> > + yield this._onProviderManagerInitialized();
> > + this._initializedDeferred.resolve();
>
> So we no longer call and return onInit(), which means that
> _initializedDeferred is not the promise that's returned from this method,
> right? That seems wrong.
The two promises are pretty much equivalent, since both of them are resolved immediately after this._initializedDeferred, but right, let's lift the ambiguity by calling onInit() again.
Assignee | ||
Comment 25•11 years ago
|
||
Applied feedback.
Attachment #8394434 -
Attachment is obsolete: true
Attachment #8408247 -
Flags: review?(rnewman)
Assignee | ||
Comment 26•11 years ago
|
||
Comment 27•11 years ago
|
||
Comment on attachment 8408247 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v5
Review of attachment 8408247 [details] [diff] [review]:
-----------------------------------------------------------------
This looks good to me. Fingers crossed.
Attachment #8408247 -
Flags: review?(rnewman) → review+
Comment 28•11 years ago
|
||
Yoric: please make sure that the xpcshell tests pass -- that Try push didn't run 'em.
Assignee | ||
Comment 29•11 years ago
|
||
Same code, plus unbitrotting.
Try: https://tbpl.mozilla.org/?tree=Try&rev=7d7771ab0c6f
Attachment #8408247 -
Attachment is obsolete: true
Attachment #8414638 -
Flags: review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 30•11 years ago
|
||
Keywords: checkin-needed
Whiteboard: [Async:team][Async Shutdown] → [Async:team][Async Shutdown][fixed-in-fx-team]
Comment 31•11 years ago
|
||
OSX 10.6 is still seeing frequent shutdown crashes with this patch. Backed out.
https://hg.mozilla.org/integration/fx-team/rev/5b99fe773ae9
https://tbpl.mozilla.org/php/getParsedLog.php?id=38745484&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=38747063&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=38748254&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=38748336&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=38749588&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=38748212&tree=Fx-Team
Whiteboard: [Async:team][Async Shutdown][fixed-in-fx-team] → [Async:team][Async Shutdown]
Comment 32•11 years ago
|
||
This was also apparently the cause for Valgrind failures like the below:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38754765&tree=Fx-Team
Assignee | ||
Comment 33•11 years ago
|
||
Weird.
I guess it could be a case of bug 985655.
Assignee | ||
Comment 34•11 years ago
|
||
Indeed, looking more carefully, it can be a case of bug 985655 aka "shutdown is very fragile, any change can break stuff." Let's see if we can make this specific part of shutdown more robust.
Assignee | ||
Comment 35•11 years ago
|
||
New patch, now a dependency on the brand new Sqlite.shutdown instead of the coarser grained AsyncShutdown.profileBeforeChange.
Attachment #8414638 -
Attachment is obsolete: true
Assignee | ||
Comment 36•11 years ago
|
||
Full patch, this time.
Try: https://tbpl.mozilla.org/?tree=Try&rev=979b032dd4d5
Attachment #8420986 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Attachment #8420992 -
Flags: review?(rnewman)
Comment 37•11 years ago
|
||
Comment on attachment 8420992 [details] [diff] [review]
Shutting down FHR without spinning the event loop, v8
Review of attachment 8420992 [details] [diff] [review]:
-----------------------------------------------------------------
Something of a rubberstamp, given that I'm not familiar with the Barrier mechanism, but it looks fine to me. I imagine problems will be immediately apparent.
::: services/metrics/storage.jsm
@@ +686,5 @@
> };
>
> +// Expose an asynchronous barrier `shutdown` that clients may use to
> +// perform last minute cleanup and shutdown requests before this module
> +// is shutdown.
s/shutdown./shut down.
Attachment #8420992 -
Flags: review?(rnewman) → review+
Assignee | ||
Comment 38•11 years ago
|
||
Applied feedback.
Attachment #8420992 -
Attachment is obsolete: true
Attachment #8426885 -
Flags: review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 39•11 years ago
|
||
Keywords: checkin-needed
Whiteboard: [Async:team][Async Shutdown] → [Async:team][Async Shutdown][fixed-in-fx-team]
Comment 40•10 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Whiteboard: [Async:team][Async Shutdown][fixed-in-fx-team] → [Async:team][Async Shutdown]
Target Milestone: --- → Firefox 32
Assignee | ||
Updated•10 years ago
|
Blocks: AsyncShutdown
Updated•6 years ago
|
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•