Closed Bug 1556627 Opened 5 years ago Closed 5 years ago

nsFocusManager::SetFocus doesn't work for OOP iframes when the iframe isn't focused

Categories

(Core :: DOM: Core & HTML, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla75
Fission Milestone M4.1
Tracking Status
firefox75 --- fixed

People

(Reporter: Jamie, Assigned: hsivonen)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 3 obsolete files)

Screen readers often set focus in response to user actions (e.g. the user moving the screen reader's review cursor or choosing to interact with a control). Other a11y clients might need this too. This is done via Accessible::TakeFocus. For out-of-process iframes, this doesn't work unless the iframe document already has the focus; e.g. you tab into it or click inside it.

Henri, are you able to provide some guidance here as to how we can do this? I see that bug 1533716 made "out-of-process iframes able to request focus". Can Accessible::TakeFocus just call SendRequestFocus on the BrowserChild actor before calling nsFocusManager::SetFocus? Does that even sound reasonable?

Flags: needinfo?(hsivonen)

(In reply to James Teh [:Jamie] from comment #0)

Henri, are you able to provide some guidance here as to how we can do this? I see that bug 1533716 made "out-of-process iframes able to request focus". Can Accessible::TakeFocus just call SendRequestFocus on the BrowserChild actor before calling nsFocusManager::SetFocus? Does that even sound reasonable?

SendRequestFocus isn't meant to be called directly. It's meant to be called via nsIWidget::SetFocus (when nsIWidget is actually PuppetWidget). However, I'm not aware of any particular harm rom calling SendRequestFocus directly on BrowserChild.

Flags: needinfo?(hsivonen) → needinfo?(enndeakin)

Calling nsFocusManager::SetFocus as you currently do should just work to focus the frame. You are calling this from within the out-of-process iframe? Do you have a testcase where this doesn't work?

Flags: needinfo?(enndeakin)

data:text/html,<button>a</button><iframe fission src="data:text/html,&lt;button&gt;b&lt;/button&gt;"></iframe>

  1. Focus the "a" button.
  2. Get an accessible for the "b" button (from within the content process) and call Accessible::TakeFocus.
    • Expected: The "b" button should get focus.
    • Actual: It doesn't. The "a" button remains focused.

I don't quite know how to test this without an a11y client; we don't have a11y testing set up for Fission yet.

Flags: needinfo?(enndeakin)

The issue here is that the checks at https://searchfox.org/mozilla-central/source/dom/base/nsFocusManager.cpp#1223 to determine whether what is being focused is in the active window are using the docshell tree to scan for this, which doesn't provide any useful information for an iframe in a separate process.

I'm going to change this into a bug to fix this.

Blocks: fission
Component: Disability Access APIs → DOM: Core & HTML
Flags: needinfo?(enndeakin)
Summary: Fission a11y: Accessible::TakeFocus doesn't work for OOP iframes when the iframe isn't focused → nsFocusManager::SetFocus doesn't work for OOP iframes when the iframe isn't focused
Blocks: a11y-fission
Fission Milestone: --- → ?
No longer blocks: fission
Fission Milestone: ? → M4

Adding ni for Henri to look into this when he's back from PTO.

Flags: needinfo?(hsivonen)

(In reply to Neil Deakin from comment #4)

The issue here is that the checks at https://searchfox.org/mozilla-central/source/dom/base/nsFocusManager.cpp#1223 to determine whether what is being focused is in the active window are using the docshell tree to scan for this, which doesn't provide any useful information for an iframe in a separate process.

Could this also be the reason why text fields in out-of-process iframes can't be focused by mouse?

(In reply to James Teh [:Jamie] from comment #3)

data:text/html,<button>a</button><iframe fission src="data:text/html,&lt;button&gt;b&lt;/button&gt;"></iframe>

  1. Focus the "a" button.
  2. Get an accessible for the "b" button (from within the content process) and call Accessible::TakeFocus.
    • Expected: The "b" button should get focus.
    • Actual: It doesn't. The "a" button remains focused.

I don't quite know how to test this without an a11y client;

How do I perform step 2 above with an accessibility client on Mac, Windows, or Ubuntu?

Alternatively, can you check if the builds from this try run resolve the problem?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5c7dd607a44fc311522f9c6b72ed1c8475f1564f

Flags: needinfo?(hsivonen) → needinfo?(jteh)

Could this also be the reason why text fields in out-of-process iframes can't be focused by mouse?

(The patch in the try run does not fix the unfocusability of text fields by mouse.)

(In reply to Henri Sivonen (:hsivonen) from comment #7)

(The patch in the try run does not fix the unfocusability of text fields by mouse.)

The mouse part is bug 1562156. These are probably duplicates of each other.

(In reply to Henri Sivonen (:hsivonen) from comment #6)

  1. Get an accessible for the "b" button (from within the content process) and call Accessible::TakeFocus.

How do I perform step 2 above with an accessibility client on Mac, Windows, or Ubuntu?

New test case (because apparently fission.oopif.attribute is deprecated now). You'll need the NVDA screen reader on Windows.

  1. Set pref fission.autostart to true and restart Firefox.
  2. Open this test case:
    data:text/html,a<iframe src="https://google.com/notfound"></iframe>
  3. Focus the top level document (e.g. by tabbing once from the address bar).
  4. Press NVDA+control+z to open the NVDA Python console. (The "NVDA" key defaults to the insert key. You can also set this to be the caps lock key in NVDA menu -> Preferences -> Settings -> Keyboard.)
  5. Enter this command:
    nav.firstChild.next.firstChild.firstChild.setFocus()
  6. Switch back to Firefox.
    • Expected: The "Google" link in the iframe should be focused.
    • Actual: it isn't.

Alternatively, can you check if the builds from this try run resolve the problem?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5c7dd607a44fc311522f9c6b72ed1c8475f1564f

Unfortunately, they don't. No change.

Flags: needinfo?(jteh)

Thank you for the NVDA steps. I'm now pretty convinced that bug 1562156 is a duplicate of this bug.

In the non-Fission case, when we enter SetFocus, mFocusedWindow and mActiveWindow are non-null. They are null in the Fission case. Hence...

(In reply to Neil Deakin from comment #4)

The issue here is that the checks at https://searchfox.org/mozilla-central/source/dom/base/nsFocusManager.cpp#1223 to determine whether what is being focused is in the active window are using the docshell tree to scan for this, which doesn't provide any useful information for an iframe in a separate process.

It's insufficient to just scan the BrowsingContext tree, since the starting point is wrong: We have null mActiveWindow instead of the corresponding BrowsingContext*.

From the point of view of auditing all uses of mFocusedWindow and mActiveWindow for using the BrowsingContext tree instead, we have a pretty daunting number of references to mFocusedWindow and mActiveWindow. Considering that sequential focus traversal works, we might get away with a partial migration to the BrowsingContext tree.

How synchronous does this stuff need to be? Could we maintain asynchronously-updating information about the focused BrowsingContext in the BrowsingContext tree and compute the active BrowsingContext from it on demand? Could we just trigger the information syncing whenever we assign to mFocusedWindow and hope that the IPC racing isn't too terrible?

Do we have any example of a synced property that is true for at most one BrowsingContext at a time? I wonder how much cross-process blur should be explicit and how much it should be a side effect of the focused property sync...

Flags: needinfo?(enndeakin)
Flags: needinfo?(afarre)

(In reply to Henri Sivonen (:hsivonen) from comment #10)

Do we have any example of a synced property that is true for at most one BrowsingContext at a time?

Instead of syncing a property on the browsing contexts themselves, we should probably be syncing the id of the focused browsing context to all processes. Do we have an existing example of that kind of broadcast?

So we have something similar. BrowsingContextGroup has a list of ContentParent that actually has a BrowsingContext from that group. We could put that property on the group instead, and use BrowsingContextGroup::EachParent to send a message to all concerned ContentChild(ren).

Flags: needinfo?(afarre)

Does this make sense as a plan:

In nsFocusManager, replace mActiveWindow and mFocusedWindow with BrowsingContexts, which can be in-process or out-of-process.

Since mFocusedWindow is documented to always be the same as or descendent of mActiveWindow and mActiveWindow is required to be a top-level window, only allow setting mFocusedWindow and compute mActiveWindow from it. (I.e. mActiveWindow is just a cached pre-computed value.)

mFocusedWindow can only change by a new focuses item grabbing focus. When mFocusedWindow changes, the content process in which the new mFocusedWindow resides (i.e. the content process that is taking focus) runs the blurring for the blur-eligible in-process BrowsingContexts and sends an IPC message to the chrome process indicating the id of the newly-focused BrowsingContext.

The chrome process broadcasts the newly-focused BrowsingContext id to all other content processes (on a per-process basis, not on a per-BrowserParent basis!). When the content process receives the id of the newly-focused BrowsingContext from the chrome process, it checks if any of its in-process BrowsingContexts become blurred and runs the blurring steps for those.

There is an opportunity for to content processes to start grabbing focus simultaneously. As far as I can tell, this cannot happen as a result of the user event. It can only happen by JS (or compromised C++) grabbing focus without reacting to user event. The probability of two processes grabbing focus simultaneously from a timer or such seems low enough that I propose we don't implement any particular protocol for dealing with the situation but just let the consequences be whatever the consequences are for the focused BrowsingContext id broadcasts arriving in the order they happen to arrive in.

Flags: needinfo?(afarre)

Yes, this sounds reasonable, and the trade off on races due to simultaneous content child interaction is something that's just inherently fission-y. We've made the decision that this is inevitable in other cases, and I think that it should be in this case as well.

Flags: needinfo?(afarre)

What do you think of comment 14, Neil?

Is that a more complex way of saying that the chrome process gets told and keeps track of which descendant BrowsingContext is focused? Why does it need to broadcast to all other processes rather than just the one that is focused?

Flags: needinfo?(enndeakin)

(In reply to Neil Deakin from comment #17)

Is that a more complex way of saying that the chrome process gets told and keeps track of which descendant BrowsingContext is focused?

No, the chrome process would just be a the broadcast conduit and the content processes would all maintain their understanding of which BrowsingContext is focused. My understanding is that they all maintain the full BrowsingContext tree anyway.

Or do they? (ni farre for this.)

Why does it need to broadcast to all other processes rather than just the one that is focused?

All the processes that own BrowsingContexts in the chain that gets blurred need to know that the focus went elsewhere. It seems simpler to broadcast the id of the newly-focused BrowsingContext to every process than to make the chrome process keep track of which processes strictly need to know.

Flags: needinfo?(afarre)

All the processes that own BrowsingContexts in the chain that gets blurred need to know that the focus went elsewhere. It seems simpler to broadcast the id of the newly-focused BrowsingContext to every process than to make the chrome process keep track of which processes strictly need to know.

OK, so that means that only those processes that have browsing contexts that are ancestors of the browsing context being blurred and not ancestors of those being being focused would respond. That seems like it might work.

Does every chrome subframe (sidebar for example) always have a separate browsing context as well? I assume the intent is to use the same message passing here as well?

(In reply to Neil Deakin from comment #19)

Does every chrome subframe (sidebar for example) always have a separate browsing context as well? I assume the intent is to use the same message passing here as well?

I don't know. However, other sidebar and popup breakage that I've caused while making Fission changes indicates that my understanding of chrome-level browsing contexts is insufficient.

farre, how do sidebars and popups as well as multiple top-level browser windows fit the browsing context tree? More generally, what's the latest documentation that I should read about how the browsing context tree works?

It shouldn't be that different from how it was with nsDocShell. The only big difference is that we only have same type trees, so you're never expected to be able to from a content browsing context be able to traverse upwards in the tree and get a chrome browsing context.

And in the grandes scheme of things this isn't really an issue, with bc's, but more with how focus handling should be dealt with in a post-fission world. Tagging in :nika for thoughts.

Flags: needinfo?(nika)
Flags: needinfo?(hsivonen)

(In reply to Henri Sivonen (:hsivonen) from comment #18)

(In reply to Neil Deakin from comment #17)

Is that a more complex way of saying that the chrome process gets told and keeps track of which descendant BrowsingContext is focused?

No, the chrome process would just be a the broadcast conduit and the content processes would all maintain their understanding of which BrowsingContext is focused. My understanding is that they all maintain the full BrowsingContext tree anyway.

Yes and no. Content processes can see the entire tree, but that doesn't hold the entire state of the browser. It can only see the subtree for each content tab (sidebar, extension popup, etc) loaded in that process. It's possible (common?) for no BrowsingContext in the entire content-visible tree to have focus (e.g. when the URL bar is focused, or focus is in a different tab). I'm not entirely sure how focus works in that situation (it's possible the subtree would still have a focused node which doesn't correspond to the OS's focus).

In the parent process, you can see every BrowsingContext, including those for browser UI, so it would be more reasonable to keep track of which BrowsingContext is focused there.

From the point of view of a content process, I'm guessing what is important is whether or not you have focus. I imagine a system where the parent process keeps track of which BC has focus, and in the content process, the BC which has focus is also tracked. However, in content, the focus status of BCs which aren't local is invisible. Calls like "focus()" and "blur()" would update local state as much as is reasonable, and then the parent process would be notified. When the parent process receives the message from content, it would update its local state, and potentially notify other content processes that they have been focused or blurred, updating their local state async. We may need to use the same epoch mechanism as is used for BC field syncing to ensure that focus state in every process is consistent.

I think that would appear sufficiently sync to make us web-compatible, and there'd be a coherent view of focus in the parent process.

Flags: needinfo?(nika)

(In reply to :Nika Layzell (ni? for response) from comment #23)

Yes and no. Content processes can see the entire tree, but that doesn't hold the entire state of the browser. It can only see the subtree for each content tab (sidebar, extension popup, etc) loaded in that process. It's possible (common?) for no BrowsingContext in the entire content-visible tree to have focus (e.g. when the URL bar is focused, or focus is in a different tab). I'm not entirely sure how focus works in that situation (it's possible the subtree would still have a focused node which doesn't correspond to the OS's focus).

Ah, so I was wrong about the entire tree getting synced. Now that I think about it a bit more, it makes sense from the security perspective not to sync the parts that a given content process doesn't need to know.

In the parent process, you can see every BrowsingContext, including those for browser UI, so it would be more reasonable to keep track of which BrowsingContext is focused there.

FWIW, I have no idea what mActiveWindow means in the chrome context. (In the Web context its the top-level ancestor of mFocusedWindow or, if mFocuseWindow is the top-level thing in the Web hierarchy, mFocusedWindow itself.)

From the point of view of a content process, I'm guessing what is important is whether or not you have focus. I imagine a system where the parent process keeps track of which BC has focus, and in the content process, the BC which has focus is also tracked. However, in content, the focus status of BCs which aren't local is invisible. Calls like "focus()" and "blur()" would update local state as much as is reasonable, and then the parent process would be notified. When the parent process receives the message from content, it would update its local state, and potentially notify other content processes that they have been focused or blurred, updating their local state async. We may need to use the same epoch mechanism as is used for BC field syncing to ensure that focus state in every process is consistent.

I take it that the reasonable way to do the syncing is to send the ID of the most-recently-focused BrowsingContext across IPC and not to make "focused" a field of the BrowsingContext itself, since an outside-the-tree "the focused BrowsingContext is that one" makes sure there can't be more than one, whereas with a "focused" field on the BrowsingContext objects, there'd be more work to maintain the invariant that no more than one BrowsingContext can have the field set to true.

What code should I read to see a sample of the closest thing to something like this that currently exists?

Flags: needinfo?(hsivonen) → needinfo?(nika)

FWIW, I have no idea what mActiveWindow means in the chrome context. (In the Web context its the top-level ancestor of mFocusedWindow or, if mFocuseWindow is the top-level thing in the Web hierarchy, mFocusedWindow itself.)

mActiveWindow is the topmost raised chrome window. mFocusedWindow is either equal to mActiveWindow or a descendant of it.

(In reply to Henri Sivonen (:hsivonen) from comment #24)

I take it that the reasonable way to do the syncing is to send the ID of the most-recently-focused BrowsingContext across IPC and not to make "focused" a field of the BrowsingContext itself, since an outside-the-tree "the focused BrowsingContext is that one" makes sure there can't be more than one, whereas with a "focused" field on the BrowsingContext objects, there'd be more work to maintain the invariant that no more than one BrowsingContext can have the field set to true.

That sounds like a reasonable approach, and probably how I'd do something like that as well :-)

What code should I read to see a sample of the closest thing to something like this that currently exists?

I'm not sure if there's something exactly like this for BrowsingContext right now The closest is probably the BrowsingContextField synchronization logic. Here's a summary of what it does:

Flags: needinfo?(nika)
Flags: needinfo?(afarre)
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED

The patch now has the line-by-line conversion to BrowsingContext without any epoch checking, because it was unclear to me what the code should do differently on an epoch mismatch.

Unsurprisingly, in its current state, the patch breaks SetFocus even in the non-Fission case.

So far, the obvious non-Fission change is that RaiseWindow no longer sets the active windows, which is now computed from the focused browsing context. This changes when (or if?) a window becomes "active".

It is so far unclear to me how the old code actually upheld the claimed invariant that mActiveWindow was mFocusedWindow or its ancestor.

If you see errors in the patch (there have to be some), I welcome those getting pointed out to me.

The patch I uploaded earlier today appears to break blurring. We never run LOGCONTENT("Element %s has been blurred", element.get());

Additionally, it breaks remote process Activate/Deactivate behavior somehow, which breaks keyboard event routing.

Part of the problem seems to be that tracking cross-process focused BrowsingContext loses tracking of the best approximation of in-process ancestor DOM window. I bet that mFocusedWindow in the chrome process has had more subtle semantics than have been documented. Sigh.

What's the expected null/non-null behavior of mFocusedWindow in the chrome process when focus is in Web content?

Flags: needinfo?(enndeakin)

If focus is in a child process, then mFocusedWindow in the chrome process should be the window containing its embedding <browser> element. mFocusedElement should be that <browser> element.

mFocusedWindow should only be null in the chrome process when some other application is active (or in-between blurring one thing and focusing another). If it was it would mean that the focus was on the top level window with no element focused.

Flags: needinfo?(enndeakin)

(In reply to Neil Deakin from comment #32)

If focus is in a child process, then mFocusedWindow in the chrome process should be the window containing its embedding <browser> element. mFocusedElement should be that <browser> element.

mFocusedWindow should only be null in the chrome process when some other application is active (or in-between blurring one thing and focusing another). If it was it would mean that the focus was on the top level window with no element focused.

Thank you.

How does this apply to out-of-process iframes? Clearly, with in-process iframes, the ancestor iframes can be mFocusedElement, since there's only one per process. However, when process boundaries can traverse the same process multiple times in a hierarchy (example.com frames example.org which frames example.com again, which frames example.org again, and focus is in there), will it be a problem that the example.com process can't have mFocusedElement set to both out-of-process iframes?

Currently, the way BrowserParent::Activate and BrowserParent::Deactivate get called seems to assume that each process shows up only once in the framing hierarchy and the process containing the out-of-process iframe has mFocusedElement set to it.

What benefit do we get from maintaining mFocusedElement pointed to the <browser> or out-of-process iframe? How bad would it be to figure out dynamically every time that if browsing context A has focus is out-of-process and its parent B is in this process, the frame that logically contains B is in this process and is focused?

(Also, currently IME focus depends on Activate and Deactivate happening in an orderly fashion. I'm worried that once focusing random stuff by mouse works correctly, it'll be infeasible to maintain proper temporal order of the chrome process observing the Activates and Deactivates and IME focus source of truth will have to change again.)

Flags: needinfo?(enndeakin)

Is there a profound reason why CanonicalBrowsingContext doesn't have a getter for BrowserParent, or is it just a matter of no one having added one yet?

Flags: needinfo?(afarre)

The current patch makes it possible to use BrowsingContexts to focus the top level of Web content by click and enter text. Focusing iframes is still broken.

(In reply to Henri Sivonen (:hsivonen) from comment #34)

Is there a profound reason why CanonicalBrowsingContext doesn't have a getter for BrowserParent, or is it just a matter of no one having added one yet?

I hope there isn't a profound reason not to add such a getter, since I think I really need one. Bug 1585950.

Flags: needinfo?(afarre)

The most pressing issues I see:

  • After making focused BrowsingContext propagate across processes, the "active" concept does not work well, because the the existing code seems to assume that top-level BrowsingContext in the front-most tab should count as "active" even when focus is in chrome, so the notion that "active" must be an ancestor of "focused" but not across the chrome/content boundary seems not quite true.
  • I'm a having trouble getting the focused BrowsingContext synced to newly-spawned processes for out-of-process iframes. (Hopefully simpler to fix than the first point.)

(In reply to Henri Sivonen (:hsivonen) from comment #37)

  • After making focused BrowsingContext propagate across processes, the "active" concept does not work well, because the the existing code seems to assume that top-level BrowsingContext in the front-most tab should count as "active" even when focus is in chrome, so the notion that "active" must be an ancestor of "focused" but not across the chrome/content boundary seems not quite true.

What non-chrome-process top-level content can we have participating in focus apart from the Web content in the currently-active tab of the front-most native window? There can be a sidebar with extension-provided non-chrome-process content. There can be an extension-provided pop-up that's semantically not a titlebarful native window but in a native window manager thingy that's somehow tied to the titlebarful window.

Do we support "focus-follows mouse" still on some X11 configs allowing keystrokes to go to non-frontmost window? What configs?

Are there more cases for content process top-level focus than:

  1. Focus is in chrome. No Web content is receiving keystrokes.
  2. Focus is in the selected browser tab of the frontmost titlebarful native window.
  3. Focus is in the sidebar of the frontmost titlebarful native window.
  4. Focus is in a titlebarless popup associated with the frontmost titlebarful native window.

Is case 4 either always fully-chrome (e.g. awesomebar popup) or always fully-content (popups generated by extension icons in the toolbar), or are there subtle transitions within those windows?

An even larger part that I though is about the "active"/"activate" concept than about focus per se.

AFAICT:

Before e10s, we had the notion of raising a window where window meant an actual GUI window. With e10s, we made the internal APIs model the action of switching to a tab hosting out-of-process Web content as raising the window of the puppet widget hosting the top-level Web content in that process. Each process considers one widget--real or puppet--to be the raised one. This works, since there was only at most one of these process boundaries "active" at a time per process.

With Fission, one process can simultaneously have more than one process boundary integration point on the logical path from the native window to the focused element. Thus, the notion that there is such a thing as the raised DOM window per process no longer works. If site A frames site B frames site A and focus is in the innermost frame, the focus handling for the process hosting site A is broken, because the process-wide globals (fields of a singleton more precisely) clash for the content hosted by the two BrowserChild objects in the same process. We generally assume that we can obtain a pointer to a singleton nsFocusManager without providing any context, so refactoring things such that there'd be one nsFocusManager per BrowserChild rather than per process would be a big deal.

Handing off raisedness by click worked under e10s, too. When you click Web content in a non-topmost GUI window in non-Fission e10s, chrome knows which native GUI window to raise and which BrowserParent to tell to raise its puppet/DOM window. But most importantly: Telling the BrowserParent to raise its puppet window is one level of IPC.

In contrast, when in Fission you click an out-of-process iframe, the click goes directly to the out-of-process iframe. There's no opportunity for a chain of "raise DOM window" operations to propagate down first before we react to the mouse click. We want to process the mouse click synchronously and let the "raise DOM window" chain to happen afterwards somehow. Not only does the existing code expect there to be at most one raised DOM window per process, it expects the DOM windows to have been raised in a top-down sequence before the mouse click is processed.

Merely changing the "focused DOM window" to be a "focused BrowsingContext" is not enough. Trusting the comments about what "active (DOM) window" means relative to the focused one doesn't work at all, because the "active" concept both assumes there's at most one per process and assumes there is one per process boundary.

It seems we need a way for a BrowserChild that's not "active" to be able to put itself and the BrowserChild objects that are its ancestors in Web hierarchy and that reside in the same process in the "active" state synchronously and then let the other BrowserChild objects in the chain logically but residing in other processes to become "active" asynchronously after the fact. And we need to support more than one "active" BrowserChild and more than one focused remote frame host in one process.

To take a step back:

Since the concept of "active" DOM window is an artifact of process boundaries and doesn't exist on in-process iframe boundaries, should be really be extending the concept of "active" to more than one per process or should be try to eliminate the concept of "active" at process boundaries and instead only have the concepts that are equal across all frame boundaries (each DOM window knowing its focused element, which may be an iframe, and which is remembered even if the DOM window is not "raised") and concepts that are synced across all processes (the deepest "raised" BrowsingContext)?

The explanations here help understand the main issue here, thanks. The issue is a case where A and B are processes, and we have a frame tree with a structure like A -> B -> A, this causes an issue as there is only one focus manager per process.

However, the actual focus in only in the one of those 'A' subframes, so nsFocusManager's mFocusedWindow and mFocusedElement would point to those.

So for example if a button is the higher A was focused, mFocusedWindow would be A and mFocusedElement is the button. The current focus for a window is stored at https://searchfox.org/mozilla-central/rev/5cb522c7baba24e55874809e0e206b001494c1e9/dom/base/nsPIDOMWindow.h#636 whether the window is focused or not. If the higher A was focused, this would point to the button.

If however the descendant A subframe was focused, nsFocusedManager::mFocusedWindow would be the lower A, and nsFocusedManager::mFocusedElement would be whatever element in that was focused. The ancestor A's nsPIDOMWindow::mFocusedElement would point at the subframe.

You could change all this to use browsing contexts instead if that would be easier.

The trickier case in with the active window. The original intent of nsFocusManager::mActiveWindow was to indicate the top-level chrome window that was in front. However with e10s, this was repurposed for child processes to also indicate which tab was in front.

I think an analysis of what the callers are doing when getting the active window would be needed. Many callers just want to know whether they are inside an active window so they can update UI state (window decorations, highlight colours, etc). Those don't actually need to know which window is active, but just a boolean indicating active versus inactive. Much of the rest are chrome process only so the issue wouldn't apply. Are there any content process uses of the active window that do more than that?

Flags: needinfo?(enndeakin)

(In reply to Neil Deakin from comment #40)

I think an analysis of what the callers are doing when getting the active window would be needed. Many callers just want to know whether they are inside an active window so they can update UI state (window decorations, highlight colours, etc). Those don't actually need to know which window is active, but just a boolean indicating active versus inactive. Much of the rest are chrome process only so the issue wouldn't apply. Are there any content process uses of the active window that do more than that?

AFAICT, there are three cases:

  1. Checking if current focus or about-to-be-focused thing is in the frontmost GUI window (and for Web content, tab).
  2. Checking things related to actual native GUI window raisedness in the chrome process only.
  3. Checking things related to either the actual native GUI window raisedness in the chrome process or raisedness of a process boundary integration point in a content process.

For the purpose of item #1 it should logically be enough to track the BrowsingContext that is in the frontmost GUI window and has focus, since it should be always possible to walk up from that one. However, doing only that totally breaks cases #2 and #3, because the relationship between "focused" and "active" gets out of sync during raising a window. Specifically, the code wants the window being raised to become "active" synchronously, and then activation / pseudo-raise asynchronously propagates down.

The async top-down propagation is fine for switching tabs/windows from the top and for sequential tabbing, but, as noted, it doesn't at all fit the notion that a random element at the bottom wants to get focused bottom-up. For the start of the bottom-up focusing action, checking what's allowed and what needs to happen, it should be enough for the content process to have an ahead-of-time eagerly-synced copy of the BrowsingContext tree as well as the currently-focused BrowsingContext. However, getting everything else in the right state afterwards, considering that things are built to sync top-down is unresolved. (Hand-waved in comment 14, but in addition to running blurring, the end state should get the "active" status of everything right.)

If however the descendant A subframe was focused, nsFocusedManager::mFocusedWindow would be the lower A, and nsFocusedManager::mFocusedElement would be whatever element in that was focused. The ancestor A's nsPIDOMWindow::mFocusedElement would point at the subframe.

Right, but I'm not yet convinced that there isn't an assumption that when processing things related to the ancestor A, nsFocusedManager::mFocusedElement should point to the subframe (which it can't, since the global is pointing to something in the descendant A).

I spoke with Neil by voice. I'm going to try the following:

  1. Retain current mActiveWindow for chrome-process native windows raising/lowering.
  2. Have the cross-process-synced focused BrowsingContext as planned previously. Use its Top() for making decisions about whether the about-to-be-focused element is in the "active" window.
  3. Try to get mActiveWindow out of the way in content processes when BrowserChild::RecvActivate() does what it needs to do. (I.e. try to make mActiveWindow irrelevant for the puppet widget case.)
  4. When the focused BrowsingContext changes, have each process that experiences the change compute with BrowsingContexts that are on the path from the newly-focused BrowsingContext to the top are in-process and call Activate on them.
  5. Make chrome-process IME/keyboard focus work based on the BrowserParent closest to the focused BrowsingContext instead of the current expectation of Activate/Deactivate being neatly stack-friendly.
  6. Starting IME state management in the newly-focused content process is ???? but hopefully comes out naturally from making item #2 on this list actually work.

When the focused BrowsingContext changes, have each process that experiences the change compute with BrowsingContexts that are on the path from the newly-focused BrowsingContext to the top are in-process and call Activate on them.

Hmm. I wonder if instead the process hosting the nearest common ancestor of the previous and new focused BrowsingContext should call Deactivate and Activate and let them propagate down the usual way even though the focus has already changed ahead of time.

Oh, and the very first thing to try is to remove mFocusedElement to and see if anything breaks to see if it genuinely a cached value as it is supposed to be. After this is all done, if there's an actual perf issue, we can reoptimize, but it seems that refactoring is easier when not having to maintain cache invariants.

When we spawn a new out-of-process iframe, has the CanonicalBrowsingContext for that iframe been added to the BrowsingContextGroup by the time BrowsingContextGroup::EnsureSubscribed runs to subscribe the process to that BrowsingContextGroup? (I'm trying to figure out what's the right point to inform a newly-spawned process about what the currently-focused BrowsingContext is. Can't happen before the new process has the requisite reflector BrowsingContext objects.)

Flags: needinfo?(nika)

(In reply to Henri Sivonen (:hsivonen) from comment #45)

When we spawn a new out-of-process iframe, has the CanonicalBrowsingContext for that iframe been added to the BrowsingContextGroup by the time BrowsingContextGroup::EnsureSubscribed runs to subscribe the process to that BrowsingContextGroup? (I'm trying to figure out what's the right point to inform a newly-spawned process about what the currently-focused BrowsingContext is. Can't happen before the new process has the requisite reflector BrowsingContext objects.)

Yes. BrowisngContext (and CanonicalBrowsingContext by extension) objects are always added to their groups as soon as they're created, and don't become cross-process until they navigate :-) (https://searchfox.org/mozilla-central/rev/7cc0f0e89cb40e43bf5c96906f13d44705401042/docshell/base/BrowsingContext.cpp#143,178). EnsureSubscribed is the earliest moment that a content process is informed about the existence of these contexts, if it was not aware before.

Flags: needinfo?(nika)

Try to get mActiveWindow out of the way in content processes when BrowserChild::RecvActivate() does what it needs to do. (I.e. try to make mActiveWindow irrelevant for the puppet widget case.)

Still trying to figure out what this means in practice. That is, what should nsFocusManager::WindowRaised do for out-of-process iframes? In particular, if example.com frames example.org frames example.com and focused was in the inner example.com before the browser tab was switched away form, how should the state be restored upon switching back to the tab without colliding with the top-level example.com state?

Or should we not "raise" and "lower" out-of-process iframes at all? If so, how should we "activate" and "deactivate" them?

Enough of our focus logic relies on the notion of being able to reuse the notion of "raise" the window of the out-of-process internal of a the frameloader when focus moves across a process boundary that I'm really shy to try to do away with all that.

Still, for out-of-process iframes, equality checking with mActiveWindow won't work as a check for whether the DOM window is already "raised", so we need something else. Once we reach a stable state, all this is computable from where in the BrowsingContext tree the focus is. But I have a hard time understanding what we need to do during focus move or a tab switch to keep the existing code happy enough.

Going back to the earlier patch but with more restraint. This time trying not to change the parent process logic and not trusting the comments:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=222e656aecb9174d6e4c78b1830ee245d06aa26c

So the comments about the relationship of mActiveWindow and mFocusedWindow don't hold. That is, there are significant periods of time when the documented relationship doesn't hold, and our tests depend on that. Likewise, the documented relationship of mFocusedElement and mFocusedWindow->mFocusedElement doesn't hold, either, at important times in ways that our tests depend on.

It's not enough to just say that XUL is weird and assume that these things hold for Web content: Test cases fall over that way, too. So special-casing the chrome process doesn't work.

I think I'm going to need to special-case out-of-process iframes instead, since there are fewer pre-existing assumptions for those.

Moreover, it seems that whenever I make a larger logical set of changes at once, too many things fall over to figure out what exactly broke what. Therefore, I'm now going to try in ridiculously small steps.

Here's one:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bb32d6d987ed02baecb57da17f5e17dba1c9378d

(In reply to Henri Sivonen (:hsivonen) from comment #53)

Here's one:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bb32d6d987ed02baecb57da17f5e17dba1c9378d

Even this tiny change is all orange. :-(

The only test that fails now is a test that tests mozbrowser ignoreuserfocus. I haven't yet verified this, but I'm guessing that BrowsingContext::Top() doesn't cross a mozbrowser boundary, but the test case probably expects everything other than the ignoreuserfocus part to behave "normally".

smaug, you added this for the purpose of the Gaia onscreen keyboard. Does ignoreuserfocus have use cases that apply to the use of mozbrowser in dev tools?

farre, what should I know about mozbrowser and the BrowsingContext tree? Does Fission support in-process mozbrowser, or is it always out-of-process in Fission?

Flags: needinfo?(bugs)
Flags: needinfo?(afarre)

So, my assumptions for BrowsingContext::Top() don't hold even within content processes:
https://searchfox.org/mozilla-central/rev/8a63fc190b39ed6951abb4aef4a56487a43962bc/dom/base/nsFrameLoader.cpp#248-251

It says that in addition to mozbrowser we could create a top-level BrowsingContext in a content process for "a xul browser element with a remote="true" marker". Do all content processes have a hidden XUL document with browser remote=true at the top of the docshell hierarchy, or is there something else that involves XUL in a content process?

(In reply to Henri Sivonen (:hsivonen) from comment #58)

It says that in addition to mozbrowser we could create a top-level BrowsingContext in a content process for "a xul browser element with a remote="true" marker". Do all content processes have a hidden XUL document with browser remote=true at the top of the docshell hierarchy, or is there something else that involves XUL in a content process?

It's for about:addons:
https://searchfox.org/mozilla-central/rev/8a63fc190b39ed6951abb4aef4a56487a43962bc/dom/base/nsFrameLoader.cpp#229-232

Thanks, Nika, for writing this down!

farre, what should I know about mozbrowser and the BrowsingContext tree? Does Fission support in-process mozbrowser, or is it always out-of-process in Fission?

Redirecting needinfo to Nika who wrote the comment:
Do mozbrowser and the about:addons XUL browser remote=true always create create an out-of-process iframe boundary?

If I have a BrowsingContext from below a mozbrowser or browser remote=true boundary in a content process, is there any way of discovering the top-most content-process BrowsingContext?

Flags: needinfo?(afarre) → needinfo?(nika)

(In reply to Henri Sivonen (:hsivonen) from comment #60)

farre, what should I know about mozbrowser and the BrowsingContext tree? Does Fission support in-process mozbrowser, or is it always out-of-process in Fission?

Fission supports both in-process and cross-process <browser> elements. Whether or not browser is cross-process is controlled by the remote attribute, as before. Fission does not support <browser> nested within a content process.

Redirecting needinfo to Nika who wrote the comment:
Do mozbrowser and the about:addons XUL browser remote=true always create create an out-of-process iframe boundary?

A <browser remote=true> element can never appear within a content process, so you can't get an out-of-process iframe boundary at one. The reason why about:addons is able to contain these attributes is that it loaded in the parent process, not a content process. <iframe mozbrowser> within content processes is also no longer supported, and support is generally being removed. I believe the last remaining user is Responsive Design Mode, which is being rewritten to remove the use.

The <browser elements within about:addons which have a remote attribute on them are then treated as another browser boundary, and can be out-of-process. As they are browser boundaries, the BrowsingContext trees within them are split. Say we've loaded about:addons in a tab and it has a single nested browser element with an iframe, the 3 separate BrowsingContext trees would look something like this:

browser.xhtml BrowsingContext
(parent process, type=chrome, toplevel)

--------

about:addons BrowsingContext
(parent process, type=content, toplevel)

--------

moz-extension://addonPrefs BrowsingContext
(content process, type=content, toplevel)
                   |
                   v
moz-extension://addonSubframe BrowsingContext
(content process, type=content, nested)

If I have a BrowsingContext from below a mozbrowser or browser remote=true boundary in a content process, is there any way of discovering the top-most content-process BrowsingContext?

BrowsingContext.top will work in that situation for you.

Flags: needinfo?(nika)

(In reply to :Nika Layzell (ni? for response) from comment #63)

A <browser remote=true> element can never appear within a content process

I'm happy that that's the case. It wasn't clear from the code comments.

<iframe mozbrowser> within content processes is also no longer supported, and support is generally being removed.

OK. Good. That means that dom/html/test/test_ignoreuserfocus.html is doing its testing in a no-longer-supported way.

I believe the last remaining user is Responsive Design Mode, which is being rewritten to remove the use.

If I come across Responsive Design Mode failures, I'll ignore them for now. I'm curious, is Responsive Design Mode being rewritten to remove the use of mozbrowser or being rewritten to move mozbrowser into the chrome process?

Thanks!

Flags: needinfo?(bugs)

Nika, pre-Fission, can extension-provided sidebars or popups have their content allocated to a process that also hosts Web content? (I take it that post-Fission, these shouldn't go into the same process, right?)

Flags: needinfo?(nika)

(In reply to Henri Sivonen (:hsivonen) from comment #75)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=325ac9d8799e05ba5644c05d396dfea2fb5ff6a7

These failures may be attributable to the previous patch, whose try run failed due to infra issues. Specifically, AFAICT, the changes to WindowRaised in this changeset should have no effect on the executions of browser_accesskeys.js due to mActiveWindow being nullptr in the executions that are omitted due to the parent process check.

Rolling back the testing to re-try the runs that didn't have results yesterday due to infra failure. This is before introducing mActiveBrowsingContext:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=abce55d3da78e56edb0ead003e4c476286ee5662

(In reply to Henri Sivonen (:hsivonen) from comment #81)

Rolling back the testing to re-try the runs that didn't have results yesterday due to infra failure. This is before introducing mActiveBrowsingContext:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=abce55d3da78e56edb0ead003e4c476286ee5662

This step was still OK.

(In reply to Henri Sivonen (:hsivonen) from comment #82)

With mActiveBrowsingContext:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb1671a01b31e8370f8b67d54694e1f4802ce156

This is where breakage starts.

(In reply to Henri Sivonen (:hsivonen) from comment #64)

OK. Good. That means that dom/html/test/test_ignoreuserfocus.html is doing its testing in a no-longer-supported way.

I expect that the ignoreuserfocus attribute is probably exercising some broken focus logic with fission, but the attribute itself can probably be completely removed, as there don't appear to be an users outside of the test.

I'm curious, is Responsive Design Mode being rewritten to remove the use of mozbrowser or being rewritten to move mozbrowser into the chrome process?

RDM already uses mozbrowser in the parent process, but within a "content" docshell. We never use mozbrowser in a content process.

The refactoring being performed is to remove the use of mozbrowser, and instead directly use the browser element within the browser.xhtml UI.

(In reply to Henri Sivonen (:hsivonen) from comment #79)

Nika, pre-Fission, can extension-provided sidebars or popups have their content allocated to a process that also hosts Web content? (I take it that post-Fission, these shouldn't go into the same process, right?)

The process which hosts extension-provided sidebars should always be the extension process. This process can actually host web content right now, as (for example) a moz-extension page can load a https:// iframe element.

Flags: needinfo?(nika)

(In reply to :Nika Layzell (ni? for response) from comment #85)
...

Thanks. It seems that none of those things require special-casing here, then!

Blocks: 1589109

If I'm in a content process and I have two BrowsingContexts neither for which IsInProcess() returns true, how do I check if the two BrowsingContexts are both assigned to the same process or if they are assigned to two different processes?

Flags: needinfo?(nika)

All orange reached. :-(

(In reply to Henri Sivonen (:hsivonen) from comment #97)

All orange reached. :-(

Fortunately, just a silly mistake.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3bee7f58d85ee0950338e14171f0f4f9884c8b22

(In reply to Henri Sivonen (:hsivonen) from comment #94)

If I'm in a content process and I have two BrowsingContexts neither for which IsInProcess() returns true, how do I check if the two BrowsingContexts are both assigned to the same process or if they are assigned to two different processes?

You intentionally can't check this from within a content process. If you're in the parent process, your BrowsingContext is actually a CanonicalBrowsingContext, and you can compare CanonicalBrowsingContext::OwnerProcessId() (https://searchfox.org/mozilla-central/rev/35873cfc312a6285f54aa5e4ec2d4ab911157522/docshell/base/CanonicalBrowsingContext.h#39).

Flags: needinfo?(nika)

(In reply to :Nika Layzell (ni? for response) from comment #100)

(In reply to Henri Sivonen (:hsivonen) from comment #94)

If I'm in a content process and I have two BrowsingContexts neither for which IsInProcess() returns true, how do I check if the two BrowsingContexts are both assigned to the same process or if they are assigned to two different processes?

You intentionally can't check this from within a content process.

Thanks. I guess I'll have to send over all the info in one IPC message to the parent and let it split the message when sending it on to other content processes.


https://treeherder.mozilla.org/#/jobs?repo=try&revision=f13cccff2ee754d3cfdfc35e05cdb223e402f30f

Blocks: 1594337
No longer blocks: 1594337

I have code like this in the parent:

mozilla::ipc::IPCResult ContentParent::RecvSetFocusedBrowsingContext(
    BrowsingContext* aContext) {
  if (!aContext || aContext->IsDiscarded()) {
    MOZ_LOG(
        BrowsingContext::GetLog(), LogLevel::Debug,
        ("ParentIPC: Trying to send a message to dead or detached context"));
    return IPC_OK();
  }

  // XXX Set focused BrowserParent

  aContext->Group()->EachOtherParent(this, [&](ContentParent* aParent) {
    Unused << aParent->SendSetFocusedBrowsingContext(aContext);
  });

  return IPC_OK();
}

This ends up sending an IPC message back to the same ContentChild that the current IPC message was received from. How am I using EachOtherParent wrong? I do see EachOtherParent exclude one parent. Can mSubscribers somehow contain the same process twice with different ContentParent keys?

Flags: needinfo?(nika)

I'm on leave, so redirecting to Andreas

Flags: needinfo?(nika) → needinfo?(afarre)

Roll some unfixed bugs from Fission Milestone M4 to M5

0ee3c76a-bc79-4eb2-8d12-05dc0b68e732

Fission Milestone: M4 → M5

This belongs to M4.1 milestone because it blocks some M-fis tests.

Fission Milestone: M5 → M4.1
Priority: P3 → P2

(In reply to Henri Sivonen (:hsivonen) from comment #113)

I have code like this in the parent:

mozilla::ipc::IPCResult ContentParent::RecvSetFocusedBrowsingContext(
    BrowsingContext* aContext) {
  if (!aContext || aContext->IsDiscarded()) {
    MOZ_LOG(
        BrowsingContext::GetLog(), LogLevel::Debug,
        ("ParentIPC: Trying to send a message to dead or detached context"));
    return IPC_OK();
  }

  // XXX Set focused BrowserParent

  aContext->Group()->EachOtherParent(this, [&](ContentParent* aParent) {
    Unused << aParent->SendSetFocusedBrowsingContext(aContext);
  });

  return IPC_OK();
}

This ends up sending an IPC message back to the same ContentChild that the current IPC message was received from. How am I using EachOtherParent wrong? I do see EachOtherParent exclude one parent. Can mSubscribers somehow contain the same process twice with different ContentParent keys?

This is extremely fishy. One reason could be that this isn't subscribed to the group. So this isn't really the solution, but could you try calling aContext->Group()->EnsureSubscribed(this) before calling EachOtherParent?

Flags: needinfo?(afarre)

Thanks. It seems that EachOtherParent is OK. I think what happens is this:

  1. Content process C1 claims in-process BrowsingContext BC as focused.
  2. The parent process starts switching processes for BC to content process C2.
  3. The parent process receives the message from step 1 and distributes the message to other content processes, including C2.
  4. The process switch completes from the parent process perspective.
  5. Process C2 starts focusing BC, which is now in-process from its perspective.
  6. C2 sends an IPC message to broadcast BC having obtained focus in C2.
  7. C2 receives the message from step #3 that BC is being focused in another process (C1), but it has already become an in-process BrowsingContext for C2.

I guess the next step is to figure out what the right criterion for rejecting some messages is. I'm going to try rejecting messages that carry a BrowsingContext that's supposed to be in another process but is, in fact, in-process.

Blocks: 1580618

Obtaining a BrowserParent from CanonicalBrowsingContext isn't working as I expected: bug 1585950 comment 7.

Current problem: Distinguishing between a top-level BrowsingContext in a content process becoming "not active" due to tab switch and a top-level BrowsingContext becoming "not active" due to being moved to another content process. I don't want to send IPC messages in the latter case so that they don't arrive late and undo the (logically same but different-process) BrowsingContext becoming "active" in the process it just moved to.

Progress! I can now click-focus a text field in an out-of-process iframe provided that:

  1. I have switched away from the browser tab and back once first and
  2. I have then managed to provoke the out-of-process iframe to get painted/composited.

Next steps:

  1. When an out-of-process iframe is created, propagate the current "active browsing context" to its process, since the process couldn't have received the information when the currently-active browsing context became active.
  2. When a top-level browsing context becomes active after having been deactivated, make its out-of-process iframes also run enough activation steps to restart painting/compositing.

Locally, I can now focus a text field in a previously-unfocused out-of-process iframe by mouse click! (Subsequent keyboard and IME events don't go to the right place, yet.) Let's see how orange this is on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d222e8d522ad6fde13c38c0ca5842fafc7b95593

Current status: The IME code doesn't like it when focus changes from one out-of-process iframe to another out-of-process iframe such that both are hosted by the same process. That is, the focus moves inside one process, but the IME connection needs to move from one BrowserChild to another BrowserChild.

Blocks: 1558520

If I have a composition open in one out-of-process iframe and I click to focus a different out-of-process iframe hosted by the same process and the previous one, the content process requests IME to commit composition. Then TextComposition::RequestToCommit runs the branch that sets the commit type to eCompositionCommitAsIs. Then ContentCacheInParent::OnCompositionEvent asserts that the type should instead be either eCompositionChange or eCompositionCommit.

Masayuki, what's the significance of https://searchfox.org/mozilla-central/rev/42c2ecdc429115c32e6bcb78bf087a228a051044/widget/ContentCache.cpp#1122 ? Specifically, why isn't eCompositionCommitAsIs OK?

Flags: needinfo?(masayuki)

(In reply to Henri Sivonen (:hsivonen) from comment #144)

Masayuki, what's the significance of https://searchfox.org/mozilla-central/rev/42c2ecdc429115c32e6bcb78bf087a228a051044/widget/ContentCache.cpp#1122 ? Specifically, why isn't eCompositionCommitAsIs OK?

Ah, because it does not assume that widget does not useeCompositionCommitAsIs in that case. If you hit the assertion, you need to create new path for it because eCompositionCommitAsIs event's mData isn't available. Probably, you need to do *mCommitStringByRequest = mCompositionString; in that case.

Flags: needinfo?(masayuki)

Looks like this makes a bunch on known-intermittent oranges more likely to fail in Fission. I suggest we disable those tests in Fission for now.

Bug: On Ubuntu 18.04, after having used Korean input in one out-of-process iframe and then clicking to focus a different out-of-focus iframe that belongs to the same process and the previous one, the first syllable typed in the second iframe isn't shown until the second syllable starts.

(In reply to Henri Sivonen (:hsivonen) from comment #149)

Bug: On Ubuntu 18.04, after having used Korean input in one out-of-process iframe and then clicking to focus a different out-of-focus iframe that belongs to the same process and the previous one, the first syllable typed in the second iframe isn't shown until the second syllable starts.

Other than that bug, this now seems to work across Windows, Mac, and Linux across a variety of IMEs (only IBus-based ones tested on Linux). It's probably the best to leave the Ubuntu Hangul first syllable case as a follow-up, considering that the Hangul IME on Ubuntu 18.04 seems buggy in other apps in other ways.

Attachment #9094890 - Attachment is obsolete: true

I just tested manually with a build (though haven't tried enabling a11y tests skipped because of this yet) and this seems to be working nicely for a11y. Thanks.

(In reply to Henri Sivonen (:hsivonen) from comment #148)

Looks like this makes a bunch on known-intermittent oranges more likely to fail in Fission. I suggest we disable those tests in Fission for now.

I think this approach sounds reasonable! But I'd like to ensure :kmag and :cpeterson are on the same page as they are still tracking Fission disabled tests.

Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(cpeterson)

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #155)

(In reply to Henri Sivonen (:hsivonen) from comment #148)

Looks like this makes a bunch on known-intermittent oranges more likely to fail in Fission. I suggest we disable those tests in Fission for now.

I think this approach sounds reasonable! But I'd like to ensure :kmag and :cpeterson are on the same page as they are still tracking Fission disabled tests.

Since this bug is blocking other a11y Fission work, I'm OK with temporarily skipping those tests in Fission so you can land this bug fix. But we are trying to fix all tests that are skipped or failing in Fission (for our Fission M4.1 milestone targeting January-February 2020), so we will need to revisit those same tests soon.

Flags: needinfo?(cpeterson)

Thanks. I've requested review on this patch and will create a separate patch about disabling intermittents-turned-perma-orange.

Yeah, I think that's find for now since this is blocking other work.

Flags: needinfo?(kmaglione+bmo)

I've started to review the patch, but I also did some simple testing. It seems to work well so far when Fission is disabled. I created a simple test case with the following structure:

<html id="outer">
<iframe src="some-remote-page">
<html id="outerinput">
<input id="innerinput">
</html>
</iframe>
<input id="outerinput">
</html>

With Fission disabled, it works ok. But with Fission enabled, I need to click the inner input twice to focus it. The first click seems to focus the 'outer' html page, while the second click focuses the 'inner' html page. I can move focus between outer and inner fine.

If 'innerinput' is focused, switching to another window doesn't send a blur event nor disable the focus ring or caret.

(In reply to Neil Deakin (away Dec 12 - Dec 20) from comment #162)

I've started to review the patch, but I also did some simple testing. It seems to work well so far when Fission is disabled. I created a simple test case with the following structure:

<html id="outer">
<iframe src="some-remote-page">
<html id="outerinput">
<input id="innerinput">
</html>
</iframe>
<input id="outerinput">
</html>

With Fission disabled, it works ok. But with Fission enabled, I need to click the inner input twice to focus it.

That's weird. Is the test case online somewhere? I used a setup that looks like yours: https://hsivonen.fi/fission-host.html , and I don't see this problem.

If 'innerinput' is focused, switching to another window doesn't send a blur event nor disable the focus ring or caret.

That's indeed a bug.

That's weird. Is the test case online somewhere? I used a setup that looks like yours: https://hsivonen.fi/fission-host.html , and I don't see this problem.

It looks like it happens only when chrome is focused to begin with. I had the url or search bar focused, then need to click on the inner textbox twice.

I suspect the IME content cache might require widget-level focus management even with the puppet widget:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=49ed06590f629a63a698894f3f4888c87cb58984

Attachment #9117202 - Attachment is obsolete: true

I'm going to mark this blocking Fission dogfooding because it seems like it could be related to the issue we see on a few sites where you can't enter information into a form.

Sadly, it looks like

Assertion failure: !remoteHasFocus (IMEContentObserver should have been destroyed by OnFocusMovedBetweenBrowsers.), at /builds/worker/workspace/build/src/dom/events/IMEStateManager.cpp:449

seen in https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=285600258&revision=909a8c844c1664bed1d5f2a8adf56e379fd62200 is intermittent.

(In reply to Henri Sivonen (:hsivonen) from comment #176)

Sadly, it looks like

Assertion failure: !remoteHasFocus (IMEContentObserver should have been destroyed by OnFocusMovedBetweenBrowsers.), at /builds/worker/workspace/build/src/dom/events/IMEStateManager.cpp:449

seen in https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=285600258&revision=909a8c844c1664bed1d5f2a8adf56e379fd62200 is intermittent.

Not having seen what happens via a debugger, I think the most plausible explanation is that this happens when the window global of a CanonicalBrowsingContext changes but the connected BrowserParent remains the same. IMEStateManager experiences whatever caused that as something that should have cause a cross-process IME focus change.

I'll try synthesizing a focus change to chrome and back to the same BrowserParent when the window global of a CanonicalBrowsingContext changes.

Or maybe I'll just call DestroyIMEContentObserver(); in that case and see what happens. I'm a bit worried disconnecting the IME too often in case it fails to connect back. There was some of that earlier.

(In reply to Henri Sivonen (:hsivonen) from comment #178)

Or maybe I'll just call DestroyIMEContentObserver(); in that case and see what happens. I'm a bit worried disconnecting the IME too often in case it fails to connect back. There was some of that earlier.

That can't work, because the crashing process is Web content. The assertion message has been written as if the assertion was relevant only to the chrome process.

We take this branch in a content process, because there is an out-of-process iframe, so remoteHasFocus is true.

Masayuki, is it OK to skip over the assertion if we aren't in the chrome process. If not, why do we need sActiveIMEContentObserver to already having been destroyed if remoteHasFocus and what kind of mechanism should destroy it in the out-of-process iframe case?

Flags: needinfo?(masayuki)

(In reply to Henri Sivonen (:hsivonen) from comment #180)

Masayuki, is it OK to skip over the assertion if we aren't in the chrome process. If not, why do we need sActiveIMEContentObserver to already having been destroyed if remoteHasFocus and what kind of mechanism should destroy it in the out-of-process iframe case?

If there are two or more IMEContentObserver exist, chrome process may receive outdated (i.e., blurred) content data after new editor gets focus. In non-Fission world, when aContent is a remote process's content, the running process is always chrome. Therefore, the assertion was correct. I mean that chrome process should have already destroyed its IMEContentObserver at its blur handling.

On the other hand, in Fission world, in this case, the running process and aContent's process are both content processes. Ideally, running process should've handled blur and destroyed its IMEContentObserver before aContent gets focus and creates IMEContentObserver. But I guess that if the running process is busy, its blur handling might be delayed. I.e., IMEContentObserver may keep alive even after aContent gets focus, right? If so, it's okay to use the assertion only in chrome process. However, I think that chrome process needs to reject notifications from old focused content process. Looks like that the receiver methods don't have the sender process information so that I guess it's not checked. I think that we need to do it if the assertion detects the race issue detected.

Flags: needinfo?(masayuki)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0c9a2bcf9f91d84ce3bf2bfbe6642439ad6279f

(In reply to Masayuki Nakano [:masayuki] (he/him)(JST, +0900) from comment #182)

On the other hand, in Fission world, in this case, the running process and aContent's process are both content processes. Ideally, running process should've handled blur and destroyed its IMEContentObserver before aContent gets focus and creates IMEContentObserver. But I guess that if the running process is busy, its blur handling might be delayed. I.e., IMEContentObserver may keep alive even after aContent gets focus, right?

I'm not really sure what happens.

If so, it's okay to use the assertion only in chrome process. However, I think that chrome process needs to reject notifications from old focused content process. Looks like that the receiver methods don't have the sender process information so that I guess it's not checked. I think that we need to do it if the assertion detects the race issue detected.

Thank you. I filed bug 1612802 for this.

The bc tests fail due to the parent process trying to deserialize a non-existing BrowsingContext from IPC:
Hit MOZ_CRASH(Attempt to deserialize absent BrowsingContext) at /builds/worker/workspace/build/src/docshell/base/BrowsingContext.cpp:1510

I don't know what the exact IPC send/receive situation is. Is there some known pattern with a known fix that causes this problem?

Flags: needinfo?(afarre)

Testing hypothesis that the crash comes from IPC WindowLowered, which may even be useless:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ffc224c7ce6023f59453c109dc1a87ebb0a82367

Blocks: 1614268

Neil, would you be OK with reviewing this for the purpose of landing with known bugs (bug 1613054 and out-of-process iframes failing to deactivate when switching windows)? This patch keeps blocking other Fission work and the patch keeps rotting when other Fission work is done, so I think it would make sense to land what exists now and then pursue the known bugs as separate patches.

Flags: needinfo?(enndeakin)

I can review later this week.

Flags: needinfo?(enndeakin)

Thanks. I filed bug 1614297 as a follow-up for the oop iframe deactivation upon window lowering.

No longer blocks: 1588091
Pushed by hsivonen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a80f94ffe726 Make nsFocusManager::SetFocus work in Fission. r=NeilDeakin,farre,masayuki

Note to sheriff:
If toolkit/components/reader/test/browser_readerMode_with_anchor.js fails on Treeherder, let's rather disable it for Fission than back this out.

(In reply to Henri Sivonen (:hsivonen) from comment #207)

Note to sheriff:
If toolkit/components/reader/test/browser_readerMode_with_anchor.js fails on Treeherder, let's rather disable it for Fission than back this out.

The same goes for any Fission perma-orange really.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

Thanks for the reviews!

To make it easier to follow along without reading the Phabricator comments: The work to address known problems continues in bug 1614297 and bug 1615548.

Regressions: 1569106
Regressions: 1616373
Regressions: 1617222
Blocks: 1617788
Regressions: 1618437
Depends on: 1652829
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: