Open Bug 1642889 Opened 4 years ago Updated 3 years ago

Enable focus-restoration-in-different-site-iframes.html on Linux

Categories

(Core :: DOM: UI Events & Focus Handling, defect, P3)

Unspecified
Linux
defect

ASSIGNED

People

(Reporter: hsivonen, Assigned: hsivonen)

Attachments

(1 obsolete file)

I'm going to land a new test focus-restoration-in-different-site-iframes.html that works on Linux locally. However, it fails on Linux on treeherder. This bug is about figuring out why it fails in CI and making it pass there.

For now, the test working locally on Linux and working in Windows opt builds in CI should be good enough to make progress.

Henri, can we re-enable this Linux test now? The dependent bug 1634363 has been fixed.

Fission Milestone: --- → M7
Flags: needinfo?(hsivonen)
Priority: -- → P3
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Flags: needinfo?(hsivonen)

(In reply to Chris Peterson [:cpeterson] from comment #1)

Henri, can we re-enable this Linux test now? The dependent bug 1634363 has been fixed.

No, still fails on CI for reasons not yet understood.

Still fails. Now also on Mac! :-(

On the bright side, I can now repro locally.

I added some similar tests to focus-restoration-in-different-site-frames-window.html in bug 1674702.
I got a similar failure on try, like https://treeherder.mozilla.org/logviewer?job_id=321450119&repo=try&lineNumber=6603, but I could not reproduce it locally.

Trying with different timeouts, because I saw a similar problem when mixing step_timeout and setTimeout in another test.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fc60ba65ca6aa02e07eed551ddf9cbffd06923c4
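
One way mixing the two timer APIs can go wrong: testharness.js's `step_timeout` multiplies the requested delay by the harness's timeout multiplier, while a plain `setTimeout` does not. The following is only a sketch of that arithmetic; the timer labels, delays, and multiplier values are hypothetical and are not taken from the actual test.

```javascript
// Model of how the two timer APIs arrive at their effective delay.
// Assumption: step_timeout scales by the harness timeout multiplier,
// setTimeout uses the requested delay as-is.
function effectiveDelay(kind, requestedMs, timeoutMultiplier) {
  return kind === "step_timeout" ? requestedMs * timeoutMultiplier : requestedMs;
}

// Returns the timer labels in the order in which they would fire.
function firingOrder(timers, timeoutMultiplier) {
  return timers
    .map((t) => ({ label: t.label, at: effectiveDelay(t.kind, t.ms, timeoutMultiplier) }))
    .sort((a, b) => a.at - b.at)
    .map((t) => t.label);
}

// Hypothetical pair of timers: one scaled, one not.
const timers = [
  { label: "collect-log", kind: "step_timeout", ms: 500 },
  { label: "close-window", kind: "setTimeout", ms: 1000 },
];

console.log(firingOrder(timers, 1)); // multiplier 1: collect-log fires first
console.log(firingOrder(timers, 4)); // multiplier 4: close-window fires first
```

With a multiplier above 2, the two timers in this made-up example swap order, which is the kind of surprise that using a single timer API throughout avoids.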

(In reply to Henri Sivonen (:hsivonen) from comment #12)

3000, then 3000:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5d3fa595629fd5dd5f78ee08a070635320752b6

With those timeouts, not failing in non-WR. Still failing in WR, with Fission and without.

(In reply to Henri Sivonen (:hsivonen) from comment #14)

4000, then 4000:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=04db761ba54adf3f2cf034b5e46d3c9d923ddace

Still fails. It seems that making the timeouts very long doesn't help even though the failure mode looks like the last phase of the test doesn't run.

Pernosco self-serve detects the wrong test as having failed, so I can't request the right repro. Let's see if that can be fixed at the Pernosco end. Suspending this pending an outcome on that point.

I now have a failure repro and a success repro.

When we open the window that just serves as the "other" window, in the success case, as expected, our main window gets a lowering notification from Gtk and we run nsFocusManager::WindowLowered. In the failure case, we proceed to closing the "other" window before our main window runs nsFocusManager::WindowLowered.

The Gtk notification requires the main event loop of the parent process to spin.

Perhaps the right test fix is to make the "other" window wait before it asks to be closed.
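
To make the difference between the two repros concrete, here is a minimal sketch that classifies a run by the relative order of the two events described above. The log entry names are hypothetical; a real trace would come from logging or from Pernosco, not from this format.

```javascript
// Classifies a run by whether the main window was lowered before the
// "other" window asked to be closed. Entry names are made up for illustration.
function classifyRun(log) {
  const lowered = log.indexOf("main:WindowLowered");
  const closed = log.indexOf("other:close-requested");
  if (closed === -1) return "incomplete";
  if (lowered !== -1 && lowered < closed) return "lowered-before-close";
  return "close-before-lowered";
}

// Success repro: the Gtk notification arrives while the pop-up is still open.
console.log(classifyRun(["other:opened", "main:WindowLowered", "other:close-requested"]));
// Failure repro: the pop-up asks to be closed first.
console.log(classifyRun(["other:opened", "other:close-requested", "main:WindowLowered"]));
```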

Attachment #9190315 - Attachment description: Bug 1642889 - Increase wait times in focus-restoration-in-different-site-iframes.html. → Bug 1642889 - Make the pop-up stay open for longer to make focus-restoration-in-different-site-iframes.html pass Linux.

(In reply to Henri Sivonen (:hsivonen) from comment #17)

The Gtk notification requires the main event loop of the parent process to spin.

AFAICT, we must be processing at least two full event queues' worth of events between the new window opening and closing, and apparently it's still possible for gtk_window_focus_out_event to be called too late.

Given that there are more of these (bug 1682685), perhaps we need to pursue a fix that makes Gecko take care of this internally and ignore the late native toolkit notification.

I'm going to try making WindowShown call WindowLowered.

(In reply to Henri Sivonen (:hsivonen) from comment #22)

I'm going to try making WindowShown call WindowLowered.

This is insufficient: Now we're missing WindowRaised after the opened window closes. It seems that this isn't a matter of Gtk being slow; rather, the newly-opened window somehow isn't considered the new frontmost window at all. This explains why adding more wait time didn't help.

Perhaps this is the wrong approach and the right one would be figuring out what makes Gtk act the way it does.

My hypothesis is that in CI, something happens that makes Gtk consider the newly-opened native window not to be the topmost one i.e. the newly-opened native window, for Gtk purposes, opens behind the native window that already existed.

karlt, is there something that I can look for in Pernosco to verify that this happened? Or is this even a known failure mode?

Flags: needinfo?(karlt)

From Neil on bug 1682685:

I wasn't able to reproduce this problem myself unfortunately using the command above. I've not seen this type of issue on Mac, but it is common on Linux for the gtk window container focus events to get lost in certain situations when opening and closing windows rapidly during automated tests, but a good fix has not been found.

I don't know where to go from here considering that changing the "rapidly" part didn't help.

Attachment #9190315 - Attachment description: Bug 1642889 - Make the pop-up stay open for longer to make focus-restoration-in-different-site-iframes.html pass Linux. → Bug 1642889 - Call WindowLowered from WindowShow to paper over the native toolkit calling WindowLowered too late.

What is the delay before forwarding messages from focus-restoration-in-different-site-iframes-outer.sub.html to its opener waiting for?

(In reply to Henri Sivonen (:hsivonen) from comment #17)

Perhaps the right test fix is to make the "other" window wait before it asks to be closed.

Sounds good. Can focus-restoration-in-different-site-iframes-other.html wait for focus before telling its opener to proceed?

(In reply to Henri Sivonen (:hsivonen) from comment #26)

My hypothesis is that in CI, something happens that makes Gtk consider the newly-opened native window not to be the topmost one i.e. the newly-opened native window, for Gtk purposes, opens behind the native window that already existed.

Gecko usually aims to reflect the OS toplevel window activation, as indicated by GTK (or OS) events. If Gecko were to try to second-guess how and when the OS is going to change window activation, then there would be risk of state changes getting out of order and risk of confusion with keyboard input sent to widgets in windows that the OS is showing as inactive.

Window activation is mostly orthogonal to z-order, and z-order is not exposed to content afaik.

Usually the OS will grant a window activation request if the same app already has window activation on one of its windows. On close of the active window, I don't know whether or not Gecko chooses a window to subsequently receive focus, but I would have expected the window manager to make this decision. Such a decision would be window-manager-specific, but some window managers are likely to choose a most-recently-active window.

Can focus-restoration-in-different-site-iframes-inner.html wait for its expected last event instead of adding an arbitrary pause?

karlt, is there something that I can look for in Pernosco to verify that this happened? Or is this even a known failure mode?

MOZ_LOG=Widget:4,WidgetFocus:5 or the pernosco equivalent might be helpful for tracking GTK events.

Flags: needinfo?(karlt)

(In reply to Karl Tomlinson (:karlt) from comment #28)

What is the delay before forwarding messages from focus-restoration-in-different-site-iframes-outer.sub.html to its opener waiting for?

It's waiting for all the event handlers in that document itself, which may still be appending entries to the outer log.

(In reply to Henri Sivonen (:hsivonen) from comment #17)

Perhaps the right test fix is to make the "other" window wait before it asks to be closed.

Sounds good. Can focus-restoration-in-different-site-iframes-other.html wait for focus before telling its opener to proceed?

I tried making it wait for a time that, if things were working reasonably at all, should have been enough time. But it still failed. (WPT doesn't have a way to wait for focus per se, AFAICT.)

(In reply to Henri Sivonen (:hsivonen) from comment #26)

My hypothesis is that in CI, something happens that makes Gtk consider the newly-opened native window not to be the topmost one i.e. the newly-opened native window, for Gtk purposes, opens behind the native window that already existed.

Gecko usually aims to reflect the OS toplevel window activation, as indicated by GTK (or OS) events. If Gecko were to try to second-guess how and when the OS is going to change window activation, then there would be risk of state changes getting out of order and risk of confusion with keyboard input sent to widgets in windows that the OS is showing as inactive.

Window activation is mostly orthogonal to z-order, and z-order is not exposed to content afaik.

Usually the OS will grant a window activation request if the same app already has window activation on one of its windows. On close of the active window, I don't know whether or not Gecko chooses a window to subsequently receive focus, but I would have expected the window manager to make this decision. Such a decision would be window-manager-specific, but some window managers are likely to choose a most-recently-active window.

We have plain GNOME/Mutter in CI, right?

Can focus-restoration-in-different-site-iframes-inner.html wait for its expected last event instead of adding an arbitrary pause?

No, since the point is for the test to proceed even if the browser doesn't pass and also to catch events that are not supposed to be there.

karlt, is there something that I can look for in Pernosco to verify that this happened? Or is this even a known failure mode?

MOZ_LOG=Widget:4,WidgetFocus:5 or the pernosco equivalent might be helpful for tracking GTK events.

Thanks.

Is there anything I can do in Pernosco to look at Gtk internals to figure out what's going on with the window activation state?

(In reply to Henri Sivonen (:hsivonen) (away from Bugzilla until 2021-01-11) from comment #29)

(In reply to Karl Tomlinson (:karlt) from comment #28)

(In reply to Henri Sivonen (:hsivonen) from comment #17)

Perhaps the right test fix is to make the "other" window wait before it asks to be closed.

Sounds good. Can focus-restoration-in-different-site-iframes-other.html wait for focus before telling its opener to proceed?

I tried making it wait for a time that, if things were working reasonably at all, should have been enough time. But it still failed.

0.5 seconds is not necessarily long enough, in a debug build, to be certain that something is not going to happen, but it is very hard to know how long to wait for things that we don't know are going to happen.

(WPT doesn't have a way to wait for focus per se, AFAICT.)

I'm not aware of any built-in support for browser window focus, but is there a reason why the contents of this window can't be set up to receive focus or "focusin" events given that seems to be possible for the inner window?
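
A minimal sketch of what waiting for a focus event with a fallback timeout could look like. `waitForEvent` is a hypothetical helper, not part of testharness.js, and a plain `EventTarget` stands in for the pop-up's `window` so the snippet runs outside a browser. It says nothing about whether the focus event actually arrives in CI.

```javascript
// Resolves to true if `type` fires on `target` within `timeoutMs`,
// and to false if the timeout elapses first.
function waitForEvent(target, type, timeoutMs) {
  return new Promise((resolve) => {
    const onEvent = () => {
      clearTimeout(timer);
      resolve(true);
    };
    const timer = setTimeout(() => {
      target.removeEventListener(type, onEvent);
      resolve(false);
    }, timeoutMs);
    target.addEventListener(type, onEvent, { once: true });
  });
}

// Stand-in for the pop-up's window object (assumption for this sketch).
const fakeWindow = new EventTarget();
waitForEvent(fakeWindow, "focus", 50).then((gotFocus) => {
  console.log(gotFocus ? "focused: tell opener to proceed" : "no focus: report and proceed anyway");
});
fakeWindow.dispatchEvent(new Event("focus"));
```

The timeout branch keeps the property that the test proceeds even when the browser never delivers the event.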

We have plain GNOME/Mutter in CI, right?

That would be my guess.
I assume we have whatever is the default for that version of Ubuntu.

Can focus-restoration-in-different-site-iframes-inner.html wait for its expected last event instead of adding an arbitrary pause?

No, since the point is for the test to proceed even if the browser doesn't pass and also to catch events that are not supposed to be there.

WPT has its own (reasonably short) timeout and so will proceed.
It also has clean-up methods that can be used to ensure for example that windows are closed after a timeout.
A single event loop after the last expected event would hopefully be sufficient to notice unexpected related events.
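
As a rough sketch of that last idea: record events until the expected last one arrives, then wait one more macrotask turn before finalizing the log, so that unexpected events queued right behind it still show up. `collectUntil` is a hypothetical helper, and an `EventTarget` stands in for the inner document. If the last event never arrives, the promise stays pending and the harness's own timeout would have to end the test.

```javascript
// Collects event types until `lastExpected` arrives, then waits one more
// macrotask turn so that stray trailing events are still logged.
function collectUntil(target, types, lastExpected) {
  return new Promise((resolve) => {
    const log = [];
    const record = (event) => {
      log.push(event.type);
      if (event.type === lastExpected) {
        setTimeout(() => {
          for (const type of types) target.removeEventListener(type, record);
          resolve(log);
        }, 0);
      }
    };
    for (const type of types) target.addEventListener(type, record);
  });
}

// Stand-in for the inner document (assumption for this sketch).
const target = new EventTarget();
const done = collectUntil(target, ["focusin", "focusout"], "focusin");
target.dispatchEvent(new Event("focusin"));
target.dispatchEvent(new Event("focusout")); // unexpected trailing event, still captured
done.then((log) => console.log(log));
```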

Is there anything I can do in Pernosco to look at Gtk internals to figure out what's going on with the window activation state?

I would work back from Pernosco's logging panel.

OS: Unspecified → Linux

Linux bugs don't block our Fission Beta experiment (on Windows and macOS), so I am moving Linux Fission bugs from Fission milestone M7 to M7a.

This wpt failure does not appear to be Fission-specific. Do we need to track this bug for Fission?

The test is disabled on all Linux builds and all debug builds, and is annotated as failing intermittently on all platforms:

https://searchfox.org/mozilla-central/rev/6f8f3d0e9022b6f0405da26ec940a89455416202/testing/web-platform/meta/focus/focus-restoration-in-different-site-iframes.html.ini

[focus-restoration-in-different-site-iframes.html]
  disabled:
    if (os == "linux") or debug: https://bugzilla.mozilla.org/show_bug.cgi?id=1642889
  [Check result]
    expected:
      if os == "mac": ["FAIL", "PASS"]
      [PASS, FAIL]
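
Read as plain conditions, the metadata above amounts to roughly the following. This is only a sketch: `run` is a hypothetical description of a CI job, and it assumes, as I understand the metadata format, that the first status in a list is the primary expected one and the rest are known intermittents.

```javascript
// Mirrors the `disabled` condition quoted above.
function isDisabled(run) {
  return run.os === "linux" || run.debug;
}

// Mirrors the `expected` condition for the "Check result" subtest.
function expectedStatuses(run) {
  return run.os === "mac" ? ["FAIL", "PASS"] : ["PASS", "FAIL"];
}

// Hypothetical job descriptions.
console.log(isDisabled({ os: "linux", debug: false })); // disabled on Linux opt
console.log(isDisabled({ os: "win", debug: true }));    // disabled on Windows debug
console.log(expectedStatuses({ os: "mac", debug: false }));
```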

Fission Milestone: M7 → M7a

Removing the Fission milestone because the test is not skipped only under Fission. Note, however, that the failure becomes more easily reproducible with Fission.

No longer blocks: fission-focus
Fission Milestone: M7a → ---
Attachment #9190315 - Attachment is obsolete: true

FWIW window.focus() is async in Chrome (but IIUC we're not clear whether asynchronicity is the only issue here).
