Open Bug 1312279 Opened 8 years ago Updated 2 years ago

[e10s] if it hits the beforeunload timeout, removeTab should warn that calling it from sync message handlers won't work correctly

Categories

(Firefox :: Tabbed Browser, defect, P3)

51 Branch
defect

Tracking

()

People

(Reporter: ngonzale, Unassigned)

References

Details

(Keywords: regression)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0
Build ID: 20161019084923

Steps to reproduce:

I call gBrowser.removeCurrentTab to close the current tab. The problem occurs when using e10s and in a tab showing a non local page (i.e. not on about:addons for example).


Actual results:

Tab closes after a ~5s delay.


Expected results:

Tab should close immediately, as when clicking on the close tab button or when using the corresponding keyboard shortcut.
Can you provide more detail about Steps to reproduce?
  * where do you call it?
  * with what parameter do you call it?
  * what's the URL of "non local page" ? 

also, does the issue happen with clean profile?

I don't see the issue with Firefox 49.0.2 clean profile, on OSX, with the following steps:
  1. open https://www.mozilla.org in tab
  2. checking "Enable browser chrome and add-on debugging toolboxes"
  3. open Scratchpad
  4. choose "Browser" from "Environment"
  5. enter gBrowser.removeCurrentTab() in Scratchpad
  6. click "Run"

what's the difference between yours?
Flags: needinfo?(ngonzale)
Indeed, I also cannot reproduce this problem by following your steps.

In fact, I encounter the problem when I call removeCurrentTab() in the handler of a (synchronous) message originating from a frame script. The extension I have attached illustrates that behavior. To reproduce it (Firefox Developer Edition (51.0) with e10s enabled, clean profile, OSX):
  1. Install the attached extension
  2. Open https://www.mozilla.org in a tab
  3. Click on that tab -> the mousedown event handler sends a synchronous message whose handler calls gBrowser.removeCurrentTab()
  4. The tab closes after a delay of ~5s

If I call removeCurrentTab with the "skipPermitUnload: true" parameter (gBrowser.removeCurrentTab({skipPermitUnload: true}), the tab closes with no apparent delay.
thanks.
looks like, removeCurrentTab function is taking 5 seconds.
the function doesn't return until the tab get removed.

also, the issue happens only on e10s
and the issue happens regardless of the number of remaining tabs with web content.
Flags: needinfo?(ngonzale)
Summary: removeCurrentTab closes tab after a ~5s delay → [e10s] removeCurrentTab closes tab after a ~5s delay
There are 2 changes regarding this regression

1. removeCurrentTab doesn't work at all, after bug 967873 (firefox 45)
it doesn't return for 5 seconds, and the tab isn't closed after it returns.

regression range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=961911623a6f2ec1d036c7b12a5117ebbeff45d8&tochange=8e0bc70119606b70d74f1aa19d84e697ac4793c7


2. removeCurrentTab works after 5 seconds, after bug 1211637 (firefox 52, and uplifted to 50 and 51)

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=62f79d676e0e11b3ad59a5425b3ebb3ec5bbefb5&tochange=e2d2897e4a7449759267bf168e6b37f7b8c3a94b
Blocks: 967873, 1211637
Status: UNCONFIRMED → NEW
Component: Untriaged → General
Ever confirmed: true
Keywords: regression
(In reply to Nicolás González-Deleito from comment #2)
Sorry for not noticing this earlier after our twitter conversation. I wish twitter had a way of marking notifications as unread. :-\

> In fact, I encounter the problem when I call removeCurrentTab() in the
> handler of a (synchronous) message originating from a frame script. The
> extension I have attached illustrates that behavior.

Ah, OK. This is an important detail. Making the message asynchronous would also prevent this problem. Why do you need a synchronous message that has side effects like closing tabs? Some general advice below...


Generally you want synchronous messages because you need information immediately to make a decision (e.g. whether to call preventDefault on an event). You should be able to gather that information without the closing-tab side-effect and then send a separate (async) message to close the tab.

Alternatively, you could send messages from the parent to the child so that whatever information you need is (asynchronously) updated in the child, so you don't need to send messages from the child to the parent to make your decision.


The issue here is that you're sending a sync message to the parent, and the child is now blocked on the message. Then the parent, in order to close the tab, has to ask the tab whether it has any beforeunload handlers before deciding whether it can close the tab. This requires sending a message to the child. But the child is blocked on your previous message, so it won't process the incoming message until it's no longer blocked. We time out after 5 seconds, at which point we will assume it's OK to close the tab (the message we're sending to the kid is async, so the parent isn't hung), and then we unblock the child process.


Bill, two questions:
- Is there anything we can do to make this kind of problem more obvious for add-ons? Maybe report sync child-to-parent messages that take more than ~500ms to return control or something? I have a sense of deja vu about this bug, but I also thought sync messages from child to parent weren't allowed, so maybe I'm just hallucinating.
- should we lower the timeout?
Flags: needinfo?(wmccloskey)
Flags: needinfo?(ngonzale)
(In reply to :Gijs Kruitbosch from comment #6)
> Generally you want synchronous messages because you need information
> immediately to make a decision (e.g. whether to call preventDefault on an
> event). You should be able to gather that information without the
> closing-tab side-effect and then send a separate (async) message to close
> the tab.

Or, which IIRC we've done in a few places, you can use setTimeout() to schedule the closing of the tab ASAP without blocking the returning of the message on it.
Oh right, the earlier bug I was thinking of was bug 1238856. Took me a while to find it. It's basically the same problem, but a different add-on (and CPOWs rather than explicit sync messages).
(In reply to :Gijs Kruitbosch from comment #6)
> (In reply to Nicolás González-Deleito from comment #2)
> Sorry for not noticing this earlier after our twitter conversation. I wish
> twitter had a way of marking notifications as unread. :-\

No problem ;-)

> 
> > In fact, I encounter the problem when I call removeCurrentTab() in the
> > handler of a (synchronous) message originating from a frame script. The
> > extension I have attached illustrates that behavior.
> 
> Ah, OK. This is an important detail. Making the message asynchronous would
> also prevent this problem. Why do you need a synchronous message that has
> side effects like closing tabs? Some general advice below...

Sorry for not having noticed earlier that the problem comes from the way I am calling removeCurrentTab()...

I need to make a synchronous call because, as you very precisely point out below, I need to decide whether to call preventDefault() on an event or not. Making a synchronous call seemed to be the easiest solution when I made the legacy add-on code I am maintaining compatible with e10s. I was (and still am) intending to have a closer look at how to get rid of / avoid as much as possible sending synchronous messages, but this might require some significant architectural changes... Anyhow, your description of the problem was crystal clear and your suggestion of using setTimeout() to let the synchronous message return nicely works.

Thanks!

> 
> 
> Generally you want synchronous messages because you need information
> immediately to make a decision (e.g. whether to call preventDefault on an
> event). You should be able to gather that information without the
> closing-tab side-effect and then send a separate (async) message to close
> the tab.
> 
> Alternatively, you could send messages from the parent to the child so that
> whatever information you need is (asynchronously) updated in the child, so
> you don't need to send messages from the child to the parent to make your
> decision.
> 
> 
> The issue here is that you're sending a sync message to the parent, and the
> child is now blocked on the message. Then the parent, in order to close the
> tab, has to ask the tab whether it has any beforeunload handlers before
> deciding whether it can close the tab. This requires sending a message to
> the child. But the child is blocked on your previous message, so it won't
> process the incoming message until it's no longer blocked. We time out after
> 5 seconds, at which point we will assume it's OK to close the tab (the
> message we're sending to the kid is async, so the parent isn't hung), and
> then we unblock the child process.
> 
> 
> Bill, two questions:
> - Is there anything we can do to make this kind of problem more obvious for
> add-ons? Maybe report sync child-to-parent messages that take more than
> ~500ms to return control or something? I have a sense of deja vu about this
> bug, but I also thought sync messages from child to parent weren't allowed,
> so maybe I'm just hallucinating.
> - should we lower the timeout?
(In reply to :Gijs Kruitbosch from comment #6)
> Bill, two questions:
> - Is there anything we can do to make this kind of problem more obvious for
> add-ons? Maybe report sync child-to-parent messages that take more than
> ~500ms to return control or something? I have a sense of deja vu about this
> bug, but I also thought sync messages from child to parent weren't allowed,
> so maybe I'm just hallucinating.

Sync messages are *only* allowed in the child->parent direction (except for CPOWs).

I can't think of any good solutions here. We could warn about sync message handlers that call removeTab. We would have to add something like Cu.isRunningSyncMessageHandler. That wouldn't be too hard, but I'm not sure it's worth it.

Easier would be to warn in the console about sync message handlers calling removeTab if we hit a permitUnload timeout.

Another possibility is to make removeTab return immediately even though we haven't closed the tab yet. That seems like it might break more add-ons than it would fix, though.

> - should we lower the timeout?

Probably, but not by much. Sadly, it's not uncommon for the content process to take a second or so to respond. The worst that would happen is that we close the tab without calling the beforeunload handler I guess. But I wouldn't feel comfortable with a timeout below 2 seconds or so. And that's still a pretty crappy user experience.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(ngonzale)
(In reply to Bill McCloskey (:billm) from comment #10)
> Easier would be to warn in the console about sync message handlers calling
> removeTab if we hit a permitUnload timeout.

Let's go with this.
Summary: [e10s] removeCurrentTab closes tab after a ~5s delay → [e10s] if it hits the beforeunload timeout, removeTab should warn that calling it from sync message handlers won't work correctly
Component: General → Tabbed Browser
Priority: -- → P3
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: