Race condition between removing frame and next context browser check
Categories
(Remote Protocol :: Marionette, defect, P1)
Tracking
(Fission Milestone:M7, firefox83 fixed)
Tracking | Status | |
---|---|---|
firefox83 | --- | fixed |
People
(Reporter: whimboo, Assigned: whimboo)
References
Details
(Whiteboard: [marionette-fission-mvp][simple])
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
Details |
As noticed on bugs like bug 1665785 there is a race when a command like "Element Click" returns, because the current frame has been removed, and the next command fails to correctly check for the browsing context.
There might be a delay in syncing this information between the child and parent process, that we do not take into account right now. Listening for another event or observer notification in the parent process might help.
Assignee | ||
Comment 1•4 years ago
|
||
This should basically be fixed by bug 1394354. In failing cases synthesizing the click was delayed and as such the WebDriver:ElementClick
command returned after the 200ms timeout for the beforeunload
check. Final work to make it completely free of race conditions will happen on bug 1666755, but that's not needed right now.
Updated•4 years ago
|
Assignee | ||
Comment 2•4 years ago
|
||
There was one more failure over on bug 1665549:
[task 2020-09-28T07:46:40.731Z] 07:46:40 INFO - PID 4391 | 1601279200727 Marionette DEBUG 0 -> [0,52,"WebDriver:ElementClick",{"id":"870ae9af-1117-4fd3-a605-95cc77013ad3"}]
[task 2020-09-28T07:46:40.747Z] 07:46:40 INFO - PID 4391 | 1601279200741 Marionette TRACE [38] Frame with id 8589934594 got removed
[task 2020-09-28T07:46:40.747Z] 07:46:40 INFO - PID 4391 | 1601279200741 Marionette TRACE Canceled page load listener because current frame has been removed
[task 2020-09-28T07:46:40.748Z] 07:46:40 INFO - PID 4391 | 1601279200741 Marionette DEBUG 0 <- [1,52,null,{"value":null}]
[task 2020-09-28T07:46:40.748Z] 07:46:40 INFO - PID 4391 | 1601279200742 webdriver::server DEBUG <- 200 OK {"value":null}
[task 2020-09-28T07:46:40.749Z] 07:46:40 INFO - PID 4391 | 1601279200743 webdriver::server DEBUG -> POST /session/5d3d095b-f97b-4311-bb66-bf38dbb55c3e/elements {"using": "css selector", "value": "foo"}
[task 2020-09-28T07:46:40.750Z] 07:46:40 INFO - PID 4391 | 1601279200744 Marionette DEBUG 0 -> [0,53,"WebDriver:FindElements",{"using":"css selector","value":"foo"}]
[task 2020-09-28T07:46:40.750Z] 07:46:40 INFO - PID 4391 | 1601279200744 Marionette TRACE Received DOM event click for [object HTMLInputElement]
While the frame gets closed by the click, the command still seems to return too early so that the next command succeeds instead of failing.
Comment 3•4 years ago
|
||
Tracking marionette-fission-mvp bugs for Fission M7 Beta milestone. We would like Marionette tests to work with Fission before we ride the trains to Beta.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
Note that bug 1666755 will reduce the number of failures in CI, but will not fix it completely:
[task 2020-10-12T10:49:04.151Z] 10:49:04 INFO - PID 13689 | 1602499744148 Marionette DEBUG 0 -> [0,52,"WebDriver:ElementClick",{"id":"6e6decef-542f-4306-91f1-73dc2a1d92f3"}]
[task 2020-10-12T10:49:04.168Z] 10:49:04 INFO - PID 13689 | 1602499744160 Marionette TRACE [38] Frame with id 10737418242 got removed
[task 2020-10-12T10:49:04.169Z] 10:49:04 INFO - PID 13689 | 1602499744160 Marionette TRACE Canceled page load listener because current frame has been removed
[task 2020-10-12T10:49:04.169Z] 10:49:04 INFO - PID 13689 | 1602499744160 Marionette DEBUG 0 <- [1,52,null,{"value":null}]
[task 2020-10-12T10:49:04.170Z] 10:49:04 INFO - PID 13689 | 1602499744160 webdriver::server DEBUG <- 200 OK {"value":null}
[task 2020-10-12T10:49:04.171Z] 10:49:04 INFO - PID 13689 | 1602499744161 webdriver::server DEBUG -> POST /session/e2dc90d6-b27e-4245-a962-f3e68d7548a7/element {"using": "css selector", "value": "foo"}
[task 2020-10-12T10:49:04.172Z] 10:49:04 INFO - PID 13689 | 1602499744161 Marionette DEBUG 0 -> [0,53,"WebDriver:FindElement",{"using":"css selector","value":"foo"}]
[task 2020-10-12T10:49:04.172Z] 10:49:04 INFO - PID 13689 | 1602499744162 Marionette TRACE Received DOM event click for [object HTMLInputElement]
Note that the click event happens way later. So I wonder if it's the wrong frame that we actually observing here.
Assignee | ||
Comment 6•4 years ago
|
||
Note that this isn't fixed yet. The command returns too early due to the closing frame, which is received even before the click happens. Even we setup listeners before the navigation we should wait to handle them until the click has happened. I will take a look at this.
Assignee | ||
Comment 7•4 years ago
|
||
Rejecting the waitForNavigationCompleted
Promise due to the frame being removed happens via the onFrameRemoved
message handler:
https://searchfox.org/mozilla-central/rev/e0eb861a187f0bb6d994228f2e0e49b2c9ee455e/testing/marionette/navigate.js#309-328
That callback is invoked when the parent process receives the Marionette:FrameRemoved
message via the message manager, which is actually sent by the framescript (listener.js). But that all the time, also when the actors are enabled.
To get the framescript finally removed we need to get away from using any of its methods and events / messages. So directly checking for a browsing context going away in the parent process should probably be the right way to do.
waitForNavigationCompleted
knows to which browsing context it belongs. So handling the browsing-context-discarded
observer notification should make it work. The only part that is not clear to me is how to differentiate between the frame being closed, and the browsing context exchanged due to a remoteness change?
Nika, if you have any advice I would be happy to hear. Otherwise I will play around tomorrow and check how this can be done. Thanks.
Assignee | ||
Comment 8•4 years ago
|
||
The try build looks great. So far not a single failing test:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b547cb034ba3a8644ee4b278ca4efa5bc82349ea
I will have a look now in how to figure out that the browsing context has just been swapped due to a remoteness change.
Assignee | ||
Comment 9•4 years ago
|
||
Turned out this is simple, and checking the currentWindowGlobal
is enough. If the frame has been closed its value will be null
. Here a try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aab88e9b507542f412ba33ff5330ddd9971d14db
Something we would still have to check is how to get the new browsing context id, but that's for a follow-up bug.
Assignee | ||
Comment 10•4 years ago
|
||
When waiting for a navigation to complete the
"browsing-context-discarded" observer notification
is used to determine if the currently selected
browsing context has been closed or just replaced
due to a remoteness change.
Assignee | ||
Updated•4 years ago
|
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
bugherder |
Updated•2 years ago
|
Description
•