Open Bug 722648 Opened 12 years ago Updated 2 years ago

We should have an async version of nsIObserver

Categories

(Core :: XPCOM, defect)

x86
All
defect

Tracking

()

People

(Reporter: espindola, Unassigned)

References

Details

(Whiteboard: [Snappy:P2])

Attachments

(1 file)

Currently when code uses notifyObservers, it expects the observers to have acted by the time the call returns. For example, after 

observerService->NotifyObservers(nsnull, NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID, nsnull);

we expect the threads to be gone. What this means on the other side is that if

FooBar::Observe(nsISupports *, const char *aTopic, const PRUnichar *)

uses an async API as AsyncClose, they have to spin the loop before returning. It would be nice to have a nsIObserverAsync which calls

FooBar::Observe(nsISupports *, const char *aTopic, const PRUnichar *, Continuation C)

And we can now just pass C to AsyncClose (or wrap it if we have some work to do).
Component: Event Handling → XPCOM
OS: Mac OS X → All
QA Contact: events → xpcom
Whiteboard: [Snappy:P2]
Have you got a better example? I really don't want to expand on our generic observer mechanisms which are already rather silly. Can we use custom-built notifications in the cases where this is important?

shutdown-threads pretty much has to be synchronous since the next thing we do is forcefully kill off the threads...
I should have been clear, the API would stay synchronous on the caller side. The idea is just to centralize the loop spinning in one place. Instead of having the Observe methods spin, the messaging system would.
Also, calling nsThread::Shutdown internally spins the event loop anyway. Not sure how you could unify the spots where the event loop spins.
See bug 819060 for a slightly different explanation of the same issue and essentially the same API.
Generally, I believe that Rafael and myself are hitting the same issue when attempting to refactor some code and make it asynchronous (in this case shutdown code). We are faced with code that behaves as follows:
- notify all observers;
- once all observers have returned, since observer service is synchronous, we know that this "ring layer" is complete, proceed.

We wish to rewrite the code and make it behave as follows:
- notify all observers;
- once all synchronous observers have returned and all asynchronous observers have called their callback, we know that this "ring layer" is complete, proceed.

At the moment, the main scenario is the following: ensuring that off-main thread writes (and we are starting to have many of these from many different modules and threads, both in JS and in C++) are complete before proceeding to shutting down threads.

While we could most likely spin the event loop to get this done, I believe that the path of the asynchronous observer is much cleaner and the least risky.
(In reply to David Rajchenbach Teller [:Yoric] from comment #6)
> Generally, I believe that Rafael and myself are hitting the same issue when
> attempting to refactor some code and make it asynchronous (in this case
> shutdown code).

Exactly. I first proposed this during a similar work making sure pending sql queries are executed, but unfortunately never managed to make this bug my priority.
Unless there is an objection, I will try and work on this soonish.
Assignee: nobody → dteller
(In reply to comment #8)
> Unless there is an objection, I will try and work on this soonish.

Can you please share an initial API interface before going ahead and implementing this?  Thanks!
Attached file Interface draft (deleted) —
Here is a first draft of the interface
(In reply to comment #10)
> Created attachment 692438 [details]
>   --> https://bugzilla.mozilla.org/attachment.cgi?id=692438&action=edit
> Interface draft
> 
> Here is a first draft of the interface

Please get sr on this.  Benjamin might be a good person to ask.  Thanks!
I have mixed opinions on this proposal.

Some things to consider:

* Application shutdown is a sensitive topic. Part of me wants to believe that application shutdown and the complicated heuristics involved warrant a dedicated API beyond mere notifications. A proper service management framework complete with full dependencies comes to mind. On the simpler end of the spectrum, a "shutdown mutex" or similar would go a long way. Perhaps those are implemented as async observers. While I think async observers are generally useful and we should implement them, I worry about falling into the trap that async observers are the best way to solve shutdown.

* Any sync observer could technically be made async by passing a function/callback into the subject of the notification.

* Having an explicit async observer API does help solve the problem of counting and gating on all callbacks being invoked.

* What about timeouts? I can see many facilities wanting the ability to time out before *all* receivers have fired the callback. Shutdown especially.

* What about passing data in the async callback? This would necessitate 2 classes of callback to the notifier: 1 gets called as each observee calls them. 1 gets called at most once to single completion. They could be modeled as separate callbacks or as a boolean isComplete argument to a unified callback.

* Should there be a way to signal the notifier after just 1 (vs all) async callbacks are complete? Note that if you incorporate the above you get this for free.

* Can we merge the "subject" and "data" arguments?

* At the risk of confusing the API differences between sync and async, I'd /really/ like notify() to take the topic argument before payload. The current implementation confuses me every time I use it.

* I generally don't like how the observer service is used in practice because statically named notification channels effectively constitute global variables and force services to be coded as tightly coupled singletons rather than loosely coupled entities that can be instantiated multiple times. Case in point, we can't instantiate multiple Sync services in the same application because notifications from 1 instance are indistinguishable from the other. Therefore, I tend to prefer strong binding of one object to another via custom addListener() methods, etc. i.e. I like the ability to hang subscriptions/mailboxes off any arbitrary object. Unfortunately, it is tedious to reinvent this wheel over and over. If I were rewriting observers today, observer notifications would be bound to specific object instances so each type didn't need to reinvent subscriptions. We would have a "global app object" to facilitate global channels (like the existing observer service). I don't expect this to gain much traction. But, it's something I hope you think about.
(In reply to Gregory Szorc [:gps] from comment #12)
> I have mixed opinions on this proposal.
> 
> Some things to consider:
> 
> * Application shutdown is a sensitive topic. Part of me wants to believe
> that application shutdown and the complicated heuristics involved warrant a
> dedicated API beyond mere notifications. A proper service management
> framework complete with full dependencies comes to mind. On the simpler end
> of the spectrum, a "shutdown mutex" or similar would go a long way. Perhaps
> those are implemented as async observers. While I think async observers are
> generally useful and we should implement them, I worry about falling into
> the trap that async observers are the best way to solve shutdown.

Full dependencies? This sounds interesting, but that sounds like scope creep much beyond my current objectives. Could you file a bug on that topic?

> * Any sync observer could technically be made async by passing a
> function/callback into the subject of the notification.

Well, theoretically, yes, but this would break all synchronous observers for the same topic.

> * Having an explicit async observer API does help solve the problem of
> counting and gating on all callbacks being invoked.
> 
> * What about timeouts? I can see many facilities wanting the ability to time
> out before *all* receivers have fired the callback. Shutdown especially.

Good point. Sounds like a followup bug.

> * What about passing data in the async callback? This would necessitate 2
> classes of callback to the notifier: 1 gets called as each observee calls
> them. 1 gets called at most once to single completion. They could be modeled
> as separate callbacks or as a boolean isComplete argument to a unified
> callback.

That sounds a little fishy. The notifier would receive which data exactly? An array of responses? I would like to avoid this for the current bug.

> * Should there be a way to signal the notifier after just 1 (vs all) async
> callbacks are complete? Note that if you incorporate the above you get this
> for free.

I don't think it really makes sense for observers. Observers are conjunctions and you want a disjunction here. I think that this should go in another class.

> * Can we merge the "subject" and "data" arguments?
> 
> * At the risk of confusing the API differences between sync and async, I'd
> /really/ like notify() to take the topic argument before payload. The
> current implementation confuses me every time I use it.
> 
> * I generally don't like how the observer service is used in practice
> because statically named notification channels effectively constitute global
> variables and force services to be coded as tightly coupled singletons
> rather than loosely coupled entities that can be instantiated multiple
> times. Case in point, we can't instantiate multiple Sync services in the
> same application because notifications from 1 instance are indistinguishable
> from the other. Therefore, I tend to prefer strong binding of one object to
> another via custom addListener() methods, etc. i.e. I like the ability to
> hang subscriptions/mailboxes off any arbitrary object. Unfortunately, it is
> tedious to reinvent this wheel over and over. If I were rewriting observers
> today, observer notifications would be bound to specific object instances so
> each type didn't need to reinvent subscriptions. We would have a "global app
> object" to facilitate global channels (like the existing observer service).
> I don't expect this to gain much traction. But, it's something I hope you
> think about.

Well, that's largely why we introduced promises in our codebase. So, this sounds good, but this sounds like another bug.
> > * Any sync observer could technically be made async by passing a
> > function/callback into the subject of the notification.
> 
> Well, theoretically, yes, but this would break all synchronous observers for
> the same topic.

not necessarily you could pass in an object implementing an interface like

interface nsIShutdownBlocker : nsISupports
{
  void WillNotify();
  voidNotify();
};

the implementation of those methods would more or less be a semaphore with WillNotify() - and Notify() + then once all observers would notified the main thread would block on the semaphore.

its of course a bit of a hack, but given all the existing abuse of the observer infrastructure we already have its hard to feel bad about a bit more.

> > * What about timeouts? I can see many facilities wanting the ability to time
> > out before *all* receivers have fired the callback. Shutdown especially.
> 
> Good point. Sounds like a followup bug.

other than shutdown you could just roll your own with nsITimer or something, but I don't particularly advocate that.

> * Can we merge the "subject" and "data" arguments?

I'm not convinced its worth the compat breakage, when it would really make more sense to move many of the observers to there own mechanism.

> * At the risk of confusing the API differences between sync and async, I'd
> /really/ like notify() to take the topic argument before payload. The
> current implementation confuses me every time I use it.
> 
> * I generally don't like how the observer service is used in practice
> because statically named notification channels effectively constitute global
> variables and force services to be coded as tightly coupled singletons
> rather than loosely coupled entities that can be instantiated multiple
> times. Case in point, we can't instantiate multiple Sync services in the
> same application because notifications from 1 instance are indistinguishable
> from the other. Therefore, I tend to prefer strong binding of one object to
> another via custom addListener() methods, etc. i.e. I like the ability to
> hang subscriptions/mailboxes off any arbitrary object. Unfortunately, it is
> tedious to reinvent this wheel over and over. If I were rewriting observers
> today, observer notifications would be bound to specific object instances so
> each type didn't need to reinvent subscriptions. We would have a "global app
> object" to facilitate global channels (like the existing observer service).

is implementing your own really that bad? you more or less need to just write wrappers around nsTObserverArray.  I suppose we could write some sort of mix in thing, but roling your own  wrappers allows the callback to take arguments whose type is specific to what they're being called for (I haven't thought about if you could do that with templates somehow).

> I don't expect this to gain much traction. But, it's something I hope you
> think about.

we already have a bunch of  things in the tree that have there own notification system.  That's not to say we shouldn't have more.
(In reply to David Rajchenbach Teller [:Yoric] from comment #13)
> (In reply to Gregory Szorc [:gps] from comment #12)
> > I have mixed opinions on this proposal.
> > 
> > Some things to consider:
> > 
> > * Application shutdown is a sensitive topic. Part of me wants to believe
> > that application shutdown and the complicated heuristics involved warrant a
> > dedicated API beyond mere notifications. A proper service management
> > framework complete with full dependencies comes to mind. On the simpler end
> > of the spectrum, a "shutdown mutex" or similar would go a long way. Perhaps
> > those are implemented as async observers. While I think async observers are
> > generally useful and we should implement them, I worry about falling into
> > the trap that async observers are the best way to solve shutdown.
> 
> Full dependencies? This sounds interesting, but that sounds like scope creep
> much beyond my current objectives. Could you file a bug on that topic?

It's definitely scope creep! As Gecko behaves more and more like a real OS (and hey, Firefox OS *is* an operating system), I think we'll find our current solution is far from adequate.

> > * Any sync observer could technically be made async by passing a
> > function/callback into the subject of the notification.
> 
> Well, theoretically, yes, but this would break all synchronous observers for
> the same topic.

I'm not advocating changing existing behavior. Instead, a quick and dirty shutdown solution would be to introduce new versions of the existing shutdown notifications that passed a callback as part of the observer payload. No new special async observer service required.

(In reply to Trevor Saunders (:tbsaunde) from comment #14)
> is implementing your own really that bad? you more or less need to just
> write wrappers around nsTObserverArray.  I suppose we could write some sort
> of mix in thing, but roling your own  wrappers allows the callback to take
> arguments whose type is specific to what they're being called for (I haven't
> thought about if you could do that with templates somehow).
> 
> > I don't expect this to gain much traction. But, it's something I hope you
> > think about.
> 
> we already have a bunch of  things in the tree that have there own
> notification system.  That's not to say we shouldn't have more.

I'm just saying that I find it to be a common pattern and AFAIK there is no super simple way to easily roll this out on your new JS types. There is a lot of boilerplate code involved and it's easy to do things wrong. I think it's a pattern begging for a well-defined solution.

Promises can solve the problem of one-shot observers. But any recurring event will need something more involved.
Are there any existing cases where this API would be useful, other than shutdown? It's not entirely clear to me that this needs a broad generic solution.

Additional fodder for thought: perhaps the more interesting problem to solve is making it easy for the *caller* [ie, the thing calling .notify()] to know when the callees are done with their possibly-async work. Ideally without spinning the event loop at all. ;-) The caller would need to provide on "onCompletion" callback, and the callees would need to make some API call when finished. The service would track how many observers were notified, and invoke the onCompletion callback once they've all signaled their work is done.

Still a lot of details  to work out, and I'm not sure if that problem is in any urgent need of a solution either.
(In reply to Justin Dolske [:Dolske] from comment #16)
> Are there any existing cases where this API would be useful, other than
> shutdown? It's not entirely clear to me that this needs a broad generic
> solution.

I can think of at least one case: by the middle of the year, Sync will be entirely asynchronous. As are all of its Places calls… apart from the observer notification interface between them.

That *might* not be a problem: if we can adjust or eliminate the various notifications such that we don't need to wait for our operations to complete prior to returning, then all is well. But if we need to, for example, ensure that we've updated some bookmark tracker state (using async DB calls, or async file operations) before Places proceeds to the next phase of deletion, then we will either need async observers in Places, or to spin the event loop in an observer.

We can special-case the former by being extra nice to mak, or we can establish a common approach for the tree.

> Additional fodder for thought: perhaps the more interesting problem to solve
> is making it easy for the *caller* [ie, the thing calling .notify()] to know
> when the callees are done with their possibly-async work. Ideally without
> spinning the event loop at all. ;-) The caller would need to provide on
> "onCompletion" callback, and the callees would need to make some API call
> when finished. The service would track how many observers were notified, and
> invoke the onCompletion callback once they've all signaled their work is
> done.

That was Greg's idea in Comment 15, and pretty much isomorphic to what I think of when I think about "async observers" -- Rafael's proposal to centralize spinning is really an adapter layer for this.

There are some niceties around 'mutating' current APIs rather than allowing an async version to exist in parallel. And it doesn't really provide a nice reusable solution for common issues like timeouts and different sorts of barriers.

And there's a risk that callers will carry certain invalid expectations if they use an API that looks synchronous. But such is life.

> Still a lot of details  to work out, and I'm not sure if that problem is in
> any urgent need of a solution either.

We'll probably hit it pretty hard -- and keep spinning the event loop -- in a couple of months.
(In reply to Justin Dolske [:Dolske] from comment #16)
> Additional fodder for thought: perhaps the more interesting problem to solve
> is making it easy for the *caller* [ie, the thing calling .notify()] to know
> when the callees are done with their possibly-async work. Ideally without
> spinning the event loop at all. ;-) The caller would need to provide on
> "onCompletion" callback, and the callees would need to make some API call
> when finished. The service would track how many observers were notified, and
> invoke the onCompletion callback once they've all signaled their work is
> done.

That was the initial idea. Is there something in the attached draft that doesn't match this description?

> Still a lot of details  to work out, and I'm not sure if that problem is in
> any urgent need of a solution either.

Well, we have several refactorings for the shutdown that are somewhat blocked by the absence of such a facility. Generally, as mentioned: don't shutdown service Foo before all its users are done with it.
Ah, sorry, I assumed the draft was about spinning the event loop.
(Spinning event loop in a such generic API is definitely not acceptable. We have enough
 problems with event loop spinning already now.)
(In reply to Olli Pettay [:smaug] from comment #20)
> (Spinning event loop in a such generic API is definitely not acceptable. We
> have enough
>  problems with event loop spinning already now.)

I believe that we all agree on this point.
Comment on attachment 692438 [details]
Interface draft

I think this is overdesigned for the problem at hand. Let's solve the minimal shutdown problem with a 'delay shutdown until I say go' interface.
Attachment #692438 - Flags: feedback?(benjamin) → feedback-
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #22)
> I think this is overdesigned for the problem at hand. Let's solve the
> minimal shutdown problem with a 'delay shutdown until I say go' interface.

Which part of shutdown?

I tend to believe that the FHR shutdown problem is not the same as the Telemetry shutdown problem which is also not the same as the async storage shutdown problem. Or, more precisely, that they do not want to delay the same part of shutdown.

What would you suggest, then?
Flags: needinfo?(benjamin)
If they aren't the same problem, what precisely is different about them? I don't see the details in this bug. I assumed that they all need to delay profile-before-change to complete pending async I/O. There really shouldn't be any pending network activity or disk I/O after that point anyway, since we're going to just exit().
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #24)
> I assumed that they all need to delay profile-before-change to complete
> pending async I/O.

Maybe a dumb idea but if we know that all the windows are gone we could just allow sync (from the main thread) IO at some point during shutdown before we exit()...
(In reply to comment #25)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #24)
> > I assumed that they all need to delay profile-before-change to complete
> > pending async I/O.
> 
> Maybe a dumb idea but if we know that all the windows are gone we could just
> allow sync (from the main thread) IO at some point during shutdown before we
> exit()...

That's a good idea specially if we end up doing async IO and just waiting on it on the main thread anyways.
(In reply to :Ehsan Akhgari from comment #26)
> (In reply to comment #25)
> > (In reply to Benjamin Smedberg  [:bsmedberg] from comment #24)
> > > I assumed that they all need to delay profile-before-change to complete
> > > pending async I/O.
> > 
> > Maybe a dumb idea but if we know that all the windows are gone we could just
> > allow sync (from the main thread) IO at some point during shutdown before we
> > exit()...
> 
> That's a good idea specially if we end up doing async IO and just waiting on
> it on the main thread anyways.

Let me throw out some counter points on why I believe this is not a good idea:

1) A (theoretical) benefit of asynchronous I/O is that multiple I/O operations can occur in parallel. If we perform synchronous I/O on the main thread, we're potentially slowing down shutdown because we're not maxing out the abilities of the storage device. The flip side is if we queue up many I/O ops, this may make things slower anyway, depending on the I/O load. That being said, I tend to say "throw I/O at the kernel and let it sort it out." Maybe Gecko itself is the I/O scheduler in some cases!

2) Performing synchronous I/O would require there to be 2 versions of APIs that could be called at shutdown: sync and async. For something like FHR where we are async all the way down the stack and many of its operations have a very slight chance of occurring on shutdown, this would require essentially two parallel versions of the code base. This would be a nightmare to support. Furthermore, it would open the door for accidental use of synchronous APIs (think add-ons). We should be encouraging async APIs everywhere and should be encouraging the active removal of synchronous I/O APIs.

3) How do you handle the case of waiting on previously-initiated operations to complete? Imagine my feature has a series of async events it needs to perform. Let's say this involves 3 I/O operations, each occurring sequentially after the other because they depend on each other. You initiate op 1. Before it finishes, a shutdown is initiated. How do you handle that? You essentially have an in-progress transaction. You need to roll back or see it to completion. Either way, you have an outstanding async operation you need to wait on to finish or your data is in a corrupt state. How are you going to do that? The answers are a) async callback to shutdown manager (essentially this bug) or b) spin the event loop. Now, I know you could technically see a truly aborted shutdown any time due to crashes, abnormal termination, etc and thus you will need to handle these scenarios. But, IMO we shouldn't be increasing the likelihood of them by not having sufficient APIs (status quo).

tl;dr not offering an async "shutdown can proceed" API is inviting data corruption and/or a world of development hurt.
(In reply to comment #27)
> (In reply to :Ehsan Akhgari from comment #26)
> > (In reply to comment #25)
> > > (In reply to Benjamin Smedberg  [:bsmedberg] from comment #24)
> > > > I assumed that they all need to delay profile-before-change to complete
> > > > pending async I/O.
> > > 
> > > Maybe a dumb idea but if we know that all the windows are gone we could just
> > > allow sync (from the main thread) IO at some point during shutdown before we
> > > exit()...
> > 
> > That's a good idea specially if we end up doing async IO and just waiting on
> > it on the main thread anyways.
> 
> Let me throw out some counter points on why I believe this is not a good idea:
> 
> 1) A (theoretical) benefit of asynchronous I/O is that multiple I/O operations
> can occur in parallel. If we perform synchronous I/O on the main thread, we're
> potentially slowing down shutdown because we're not maxing out the abilities of
> the storage device. The flip side is if we queue up many I/O ops, this may make
> things slower anyway, depending on the I/O load. That being said, I tend to say
> "throw I/O at the kernel and let it sort it out." Maybe Gecko itself is the I/O
> scheduler in some cases!

I don't think there are too many kernels that can do actual parallel I/O (except in very specific cases such as RAID etc.) so the benefits here are sort of theoretical as you mentioned.  :-)

> 2) Performing synchronous I/O would require there to be 2 versions of APIs that
> could be called at shutdown: sync and async. For something like FHR where we
> are async all the way down the stack and many of its operations have a very
> slight chance of occurring on shutdown, this would require essentially two
> parallel versions of the code base. This would be a nightmare to support.
> Furthermore, it would open the door for accidental use of synchronous APIs
> (think add-ons). We should be encouraging async APIs everywhere and should be
> encouraging the active removal of synchronous I/O APIs.

Hmm, not sure what APIs you're talking about.  There should not be a need to change the high-level APIs (or perhaps I lack context here.)

> 3) How do you handle the case of waiting on previously-initiated operations to
> complete? Imagine my feature has a series of async events it needs to perform.
> Let's say this involves 3 I/O operations, each occurring sequentially after the
> other because they depend on each other. You initiate op 1. Before it finishes,
> a shutdown is initiated. How do you handle that? You essentially have an
> in-progress transaction. You need to roll back or see it to completion. Either
> way, you have an outstanding async operation you need to wait on to finish or
> your data is in a corrupt state. How are you going to do that? The answers are
> a) async callback to shutdown manager (essentially this bug) or b) spin the
> event loop. Now, I know you could technically see a truly aborted shutdown any
> time due to crashes, abnormal termination, etc and thus you will need to handle
> these scenarios. But, IMO we shouldn't be increasing the likelihood of them by
> not having sufficient APIs (status quo).

Hmm, you can join the thread which is potentially doing pending async operations, right?

I think Ben's main point was that performing asynchronous IO and then waiting on it on the main thread is not better than performing the IO itself on the main thread.  Actually it's slightly worse because of context switch overhead, and potentially more complicated code to handle the threading etc.
(In reply to :Ehsan Akhgari from comment #28)
> I don't think there are too many kernels that can do actual parallel I/O
> (except in very specific cases such as RAID etc.) so the benefits here are
> sort of theoretical as you mentioned.  :-)

But if you schedule a number of I/O operations at the same time, the I/O scheduler does attempt to perform some optimization (minimizing drive seek overhead, etc). Linux has what, 3 or 4 I/O schedulers? It's not too theoretical.
 
> > 2) Performing synchronous I/O would require there to be 2 versions of APIs that
> > could be called at shutdown: sync and async. For something like FHR where we
> > are async all the way down the stack and many of its operations have a very
> > slight chance of occurring on shutdown, this would require essentially two
> > parallel versions of the code base. This would be a nightmare to support.
> > Furthermore, it would open the door for accidental use of synchronous APIs
> > (think add-ons). We should be encouraging async APIs everywhere and should be
> > encouraging the active removal of synchronous I/O APIs.
> 
> Hmm, not sure what APIs you're talking about.  There should not be a need to
> change the high-level APIs (or perhaps I lack context here.)

I'm talking about application/feature APIs. Take FHR for example. We have APIs like "incrementDailyCounter() -> promise". If I needed to call that on shutdown, now I'd need a synchronous version that used synchronous I/O under the hood in addition to the current implementation which uses async for everything.

> > 3) How do you handle the case of waiting on previously-initiated operations to
> > complete? Imagine my feature has a series of async events it needs to perform.
> > Let's say this involves 3 I/O operations, each occurring sequentially after the
> > other because they depend on each other. You initiate op 1. Before it finishes,
> > a shutdown is initiated. How do you handle that? You essentially have an
> > in-progress transaction. You need to roll back or see it to completion. Either
> > way, you have an outstanding async operation you need to wait on to finish or
> > your data is in a corrupt state. How are you going to do that? The answers are
> > a) async callback to shutdown manager (essentially this bug) or b) spin the
> > event loop. Now, I know you could technically see a truly aborted shutdown any
> > time due to crashes, abnormal termination, etc and thus you will need to handle
> > these scenarios. But, IMO we shouldn't be increasing the likelihood of them by
> > not having sufficient APIs (status quo).
> 
> Hmm, you can join the thread which is potentially doing pending async
> operations, right?

How would you do this [from JavaScript]? Does the C++ doing the shutdown know that my JS feature is waiting on a callback? All it may know is there is an outstanding I/O op. My argument is the JS (and I guess any C++ feature too) needs an opportunity to react to shutdown in a graceful way (read: needs to be able to wait for the last I/O operation to complete and then do whatever it can to "finish real soon"). If you don't give me this ability, then we may spend a lot of time at next app startup performing corruption detection due to an apparent ungraceful shutdown!
(In reply to comment #29)
> > Hmm, not sure what APIs you're talking about.  There should not be a need to
> > change the high-level APIs (or perhaps I lack context here.)
> 
> I'm talking about application/feature APIs. Take FHR for example. We have APIs
> like "incrementDailyCounter() -> promise". If I needed to call that on
> shutdown, now I'd need a synchronous version that used synchronous I/O under
> the hood in addition to the current implementation which uses async for
> everything.

Isn't there some promise.wait() API of some sort?  If not, then this is just a problem in the API I'd say.

> > Hmm, you can join the thread which is potentially doing pending async
> > operations, right?
> 
> How would you do this [from JavaScript]?

Using the promise.wait() hypothetical API I mentioned above.

> Does the C++ doing the shutdown know
> that my JS feature is waiting on a callback? All it may know is there is an
> outstanding I/O op. My argument is the JS (and I guess any C++ feature too)
> needs an opportunity to react to shutdown in a graceful way (read: needs to be
> able to wait for the last I/O operation to complete and then do whatever it can
> to "finish real soon"). If you don't give me this ability, then we may spend a
> lot of time at next app startup performing corruption detection due to an
> apparent ungraceful shutdown!

One thing to note is that there's just no way around the requirement of having a way to wait for the asynchronous API.  I don't think that over complicating the shutdown process because we don't have a way to solve that API problem is an acceptable trade-off at all.
(In reply to :Ehsan Akhgari from comment #30)
> (In reply to comment #29)
> > > Hmm, not sure what APIs you're talking about.  There should not be a need to
> > > change the high-level APIs (or perhaps I lack context here.)
> > 
> > I'm talking about application/feature APIs. Take FHR for example. We have APIs
> > like "incrementDailyCounter() -> promise". If I needed to call that on
> > shutdown, now I'd need a synchronous version that used synchronous I/O under
> > the hood in addition to the current implementation which uses async for
> > everything.
> 
> Isn't there some promise.wait() API of some sort?  If not, then this is just
> a problem in the API I'd say.

I would strongly advise against adding a .wait() API. AFAIK the only way to implement that would be through a nested event loop. And, my guess is it would be abused, much like current Sync abuses nested event loop spinning.
(In reply to comment #31)
> (In reply to :Ehsan Akhgari from comment #30)
> > (In reply to comment #29)
> > > > Hmm, not sure what APIs you're talking about.  There should not be a need to
> > > > change the high-level APIs (or perhaps I lack context here.)
> > > 
> > > I'm talking about application/feature APIs. Take FHR for example. We have APIs
> > > like "incrementDailyCounter() -> promise". If I needed to call that on
> > > shutdown, now I'd need a synchronous version that used synchronous I/O under
> > > the hood in addition to the current implementation which uses async for
> > > everything.
> > 
> > Isn't there some promise.wait() API of some sort?  If not, then this is just
> > a problem in the API I'd say.
> 
> I would strongly advise against adding a .wait() API. AFAIK the only way to
> implement that would be through a nested event loop. And, my guess is it would
> be abused, much like current Sync abuses nested event loop spinning.

Oh I was imagining a .wait() API which is implemented on top of pthread_join...
I'm not sure what you guys are actually talking about now, but nsIThread.shutdown() *is* the method which joins a thread, and you are required to call that on all worker threads at latest before xpcom-shutdown-threads. You are certainly allowed to call it earlier in the shutdown sequence. That method internally spins the event loop.
I have an additional scenario.

Once bug 838577 has landed, Session Restore will collect asynchronously the data it needs for saving sessionstore.js. So, from the point of view of Session Restore, shutdown should look like:
1. quit-application-granted;
2. collect window data asynchronously;
3. once 2. is complete, proceed with:
  - closing the windows;
  - writing down the data collected during 2.;
  - rest of shutdown.

This problem cannot be solved by a IO-specific hack.
Blocks: 850712
Note that there are cases where spinning the event loop will just kill the process. In particular, when windows is logging off/shutting down, any further calls to wait on the native event loop will terminate the process. It may be acceptable to spin only the "XPCOM" event loop, but we currently have no method for doing so, IIRC. So if you are only waiting for storage events to complete, then we can probably fix that case. If you are waiting for anything other kind of thing that might require a native event loop (including timers, currently), you just need to refactor the code to collect the relevant data before shutdown instead of during.
"you just need to refactor the code to collect the relevant data before shutdown instead of during."

I read this to mean that we need to perform preemptive operations before shutdown to minimize the possibility of performing them at actual shutdown time.

Even if we do this, there is still a window of time between initiating these async operations before shutdown and a shutdown occurring. What do you do when outstanding async operations fall through that window? Sure, you can more frequently perform preemptive operations, but that has perf impact.
Just for clarification: I am strongly against spinning the event loop to solve our issues. That's a hornet's nest. As far as I understand, so are gps, smaug, dolske and espindola (who, sadly, just left Moz).

Benjamin, we have code that needs to be executed during shutdown, just because that is the only place where it has any meaning. My understanding is that what you suggest would require us to seriously re-architecture the shutdown process. If your objective is to avoid rearchitecturing shutdown, that doesn't sound like a win situation.
I don't really care about event loop spinning during shutdown (if it doesn't cause slower shutdowns).
It is the non-shutdown case when spinning event loop is usually really bad.
For the sake of fairness, s/usually/very occasionally, no? 

Sync spins the event loops dozens of times each sync, dozens of times each day, for millions of users. And for all of the urban legend horrors, it seems to work. 

But let's not get too bogged down in that.
> Just for clarification: I am strongly against spinning the event loop to
> solve our issues.

I don't have any clue what this means. As far as I've read all of the proposals here, they all involve spinning the event loop while waiting for an event from another thread. They just move the loop-spinning process from each client into the core shutdown code. XPCOM shutdown already spins the event loop until it is empty several times during shutdown.
So that means that the clarification was necessary. Benjamin, if you look at the attached proposal, you will notice that it can be implemented without spinning the event loop. It is just a simple API meant to simplify the refactoring of various tasks and make them asynchronous.
I don't see how it can be implemented without spinning the event loop. Let's say we're in XPCOM shutdown:

observerservice->AsyncNotifyObservers(nullptr, "profile-before-change", nullptr, completionobject);

{
  // Wait for completionObject->Run(). 
  // The only way to do this is to spin the event loop, right?
}

exit(0);
Well, the point is that we do not wait for |completionObject->Run()|. Instead, we put all the code of the rest of the shutdown in |completionObject->Run()|.
That doesn't make any sense either, though. If the stack unwinds from main() the program exits anyway, so something has to wait, and the only way we have to wait is via spinning the event loop.
Benjamin and myself pursued the clarification through IRC and I believe that we have now resolved the misunderstanding.

The idea of nsIAsyncObserver is not necessarily to remove any loop spinning that we currently have in the code. It might be possible to remove some (not all) of them at a later stage, but that's not the current objective.

The idea is to add further constraints. Currently, once a shutdown event A is triggered, the only constraint before triggering the next event B is that all the observers of A have finished their sequential execution. With nsIAsyncObserver, we can add the further constraint that they have finished their asynchronous execution. This ensures that code that relies on asynchronous or off main thread execution has a chance to complete before event B is fired.

Benjamin, with this in mind, could you take a second look at the attachment you f-ed?
Attachment #692438 - Attachment mime type: text/plain → text/javascript
Attachment #692438 - Flags: feedback?(benjamin)
Not working actively on this at the moment.
Assignee: dteller → nobody
Attachment #692438 - Flags: feedback?(benjamin)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: