NoSuchElement error instead of StaleElement error returned during / after a cross-group navigation
Categories
(Remote Protocol :: Marionette, defect, P1)
Tracking
(firefox85 wontfix, firefox86 wontfix, firefox87 fixed)
People
(Reporter: whimboo, Assigned: whimboo)
References
()
Details
(Whiteboard: [not-a-fission-bug])
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-release-
|
Details |
Follow-up from bug 1684827.
When checking for element staleness during / after a cross-group navigation a NoSuchElement
error is returned instead of a StaleElementReference
error. The reason for that is the new browsing context id the top-level browsing context gets. For details see bug 1674329.
These changes of the browsing context including its id can only happen for the top-level browsing context, and not for child browsing contexts. As such I didn't notice this problem on bug 1684827.
When implementing the fix we have to make sure to add some wdspec tests that cover this scenario.
Thank you James Nord for reporting the issue!
Assignee | ||
Comment 1•4 years ago
|
||
The trace log as provided on the github issue is interesting:
Marionette TRACE [8589934593] MarionetteEvents actor created for window id 8589934595
Marionette TRACE Received event beforeunload for http://127.0.0.1:60596/job/tidy_broccoli/configure
Marionette TRACE Received DOM event click for [object HTMLButtonElement]
Marionette TRACE Remoteness change detected. Set new top-level browsing context to 37
Marionette TRACE [37] Frame script loaded
Marionette ERROR [37] No reply from Marionette:Register
Marionette TRACE [37] MarionetteEvents actor created for window id 26
Marionette TRACE Received event beforeunload for about:blank
Marionette TRACE Received event pagehide for about:blank
Marionette TRACE [37] MarionetteEvents actor created for window id 10737418241
Marionette TRACE Received event DOMContentLoaded for http://127.0.0.1:60596/job/tidy_broccoli/
Marionette TRACE Received event pageshow for http://127.0.0.1:60596/job/tidy_broccoli/
The page with the button that get clicked is http://127.0.0.1:60596/job/tidy_broccoli/configure
. The click causes a navigation, but due to some reason there is some interaction with about:blank
, which gets unloaded (pagehide
).
James, which code is the event handler of the button running to trigger the navigation?
Assignee | ||
Comment 2•4 years ago
|
||
I can replicate the failure now with the following Marionette unit test:
Using the file-url
element after navigating to about:robots
triggers the NoSuchElement
error. But that's a navigation that switches the process between the content and the parent process. The interesting part is why does it happen for a content process only cross-group navigation.
The documentation for cross-group navigation gives some examples of when it is used.
James, could it be that the page is using some Cross-Origin-Opener-Policy headers?
Comment 3•4 years ago
|
||
The page with the button that get clicked is http://127.0.0.1:60596/job/tidy_broccoli/configure. The click causes a navigation, but due to some reason there is some interaction with about:blank, which gets unloaded (pagehide).
funky.... the submit does not go via about:blank however when clicking the button (to submit the form)
Firefox POSTs to http://127.0.0.1:60596/job/tidy_broccoli/configSubmit
which returns a 302 redirect to http://127.0.0.1:60596/job/tidy_broccoli/
the 302 Redirect has zero content. (is that being inferred as about:blank
?)
James, which code is the event handler of the button running to trigger the navigation?
it is Jenkins, and it's funky form handling :-p
there is https://github.com/jenkinsci/jenkins/blob/77824bd6965145afaa68c075bb851e4030035595/war/src/main/webapp/scripts/hudson-behavior.js#L1100-L1117 (that is attached to the form)
there is event-min attached to the actual button https://github.com/jenkinsci/jenkins/tree/77824bd6965145afaa68c075bb851e4030035595/war/src/main/webapp/scripts/yui/event (the corresponding non minimized is in the same tree).
James, could it be that the page is using some Cross-Origin-Opener-Policy headers?
Cross-Origin-Opener-Policy header
-> same-origin
Comment 4•4 years ago
|
||
if you want to see what is going on in the SUT you can see with the following steps.
- download https://get.jenkins.io/war-stable/2.263.3/jenkins.war
- download install a JRE (8 is better)
- run
JENKINS_HOME=/tmp/jenkins_home; java -jar jenkins.war --httpListenAddress=127.0.0.1 --httpPort=8080
- go to http://127.0.0.1/ follow the install wizard and accept the recommended plugins
- click "New Item" in the left hand menu
- enter a name (
tidy_broccoli
) - click "folder" to create a folder type
this leads you to http://127.0.0.1:60596/job/tidy_broccoli/configure
then you just need to click save
Assignee | ||
Comment 5•4 years ago
|
||
Thanks for the steps. I hope that I will find time later today to continue on this bug. Would you mind checking in the meantime if you see the same behavior with the Firefox preference marionette.actors.enabled
is set to false
?
Assignee | ||
Comment 6•4 years ago
|
||
I just run with actors disabled, and the same behavior can be seen. As such this bug doesn't have to block bug 1669172.
Assignee | ||
Comment 7•4 years ago
|
||
To solve the missing reference to the element we might want to add the browserId
beside the ContentDOMReference object to the ReferenceStore. This id will be unique and remain constant for the whole lifetime of the tab:
By having that extra ID we can always check if the currently selected top-level browsing context has been used in the same tab before.
And this will also help to fix a leak when cleaning-up the ReferenceStore for a closed tab. Right now we pass-in a browsing context, but after such a cross-group navigation elements from former top-level browsing contexts aren't removed and remain forever in the cache.
Comment 8•4 years ago
|
||
This Marionette bug blocks the marionette-fission meta bug but doesn't have a [marionette-fission-mvp]
or [marionette-fission-reserve]
whiteboard tag. I'm going to assume it's not a blocker for Fission MVP for now.
Assignee | ||
Comment 9•4 years ago
|
||
Cross-group navigations cause a swap of the current top-level
browsing context. The only unique identifier is the browser id
the browsing context actually lives in. If it's equal also
treat the browsing context as the same.
Assignee | ||
Comment 10•4 years ago
|
||
Actually the proposed fix doesn't work. The browsing context of the element might not be around anymore given that a garbage collection could have destroyed it already. This can happen at any time when there is a bit of delay between the navigation and the element usage. Looks like we will indeed have to store the browserId
alongside the ContentDOMReference
in our ReferenceStore
.
Assignee | ||
Comment 11•4 years ago
|
||
John, would you mind testing the test build if it fixes your problem? Not sure on which platform you are on, so here the steps how to get to them:
- https://treeherder.mozilla.org/jobs?repo=try&revision=b28fe21d4cb7bbd8f57600d34b8ed07c34e2a393&searchStr=build
- Select the
(Ba)
opt job from your platform - Click on
Artifacts
at the lower pane - Find the Firefox build which is named like
target.dmg
,target.tar.bz2
, ortarget.zip
.
Comment 12•4 years ago
|
||
Henrik - I see 2 options "Windows 2012 x64 opt" and "Windows 2012 x64 shippable opt" does it matter which I use?
(running windows 10 x64)
Assignee | ||
Comment 13•4 years ago
|
||
It doesn't matter, but shippable is what we use for Nightlies.
Comment 14•4 years ago
|
||
I used the "shippable opt" version and can confirm that the issue is fixed.
Assignee | ||
Comment 15•4 years ago
|
||
Wonderful. Thanks a lot for the verification.
Assignee | ||
Updated•4 years ago
|
Comment 16•4 years ago
|
||
Comment 17•4 years ago
|
||
bugherder |
Assignee | ||
Comment 18•4 years ago
|
||
Comment on attachment 9202028 [details]
Bug 1690308 - [marionette] Report StaleElementReference error after cross-group navigations.
Beta/Release Uplift Approval Request
- User impact if declined: For users of WebDriver a wrong error type will be unexpectedly thrown when trying to interact with elements after a cross-group navigation causing a new top-level browsing context.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): An additional check for the browsing context's browerId has been added to check for equality of top-level browsing contexts.
- String changes made/needed:
Updated•4 years ago
|
Comment 19•4 years ago
|
||
Comment on attachment 9202028 [details]
Bug 1690308 - [marionette] Report StaleElementReference error after cross-group navigations.
We already merged mozilla-beta to mozilla-release and it is not end-user facing nor a P1, it's a bit late for an uplift, sorry.
Comment 20•4 years ago
|
||
The patch landed in nightly and beta is affected.
:whimboo, is this bug important enough to require an uplift?
If not please set status_beta
to wontfix
.
For more information, please visit auto_nag documentation.
Comment 21•4 years ago
|
||
Thanks all for the rapid fix.
Assignee | ||
Comment 22•4 years ago
|
||
As given by the denied approval request this is a wontfix for 86.
Updated•2 years ago
|
Description
•