1125377 - Marionette fails during getWindowHandles with e10s turned on when accessing a CPOW

Chris Manchester (limited bugmail, email directly)

Assignee

Description

•

10 years ago

This happens intermittently on mn-e10s jobs (bug 1124966), but constantly in the test suite :erahm is working on that ports the Are We Slim Yet tests to use marionette. He provided a test case that reproduces this that I'll upload shortly. Basically, it opens 15 tabs, forces a gc, closes all but one tab, and repeats the process. The second time through the test fails when figuring out how many tabs to close based on marionette.window_handles. In marionette-server.js, the failure occurs when attempting to retrieve a window ID from a CPOW (we check whether the CPOW is null, and then QI it. If fails in QI, not the null check). I can catch it in the browser toolbox debugger, but inspecting the window there crashes the tab. I'm going to look into this some more, but I'm pretty over-subscribed for the next two weeks. We may need help from the e10s team figuring this one out, and we may need to stop keeping track of listener scripts in marionette via window ids (something we might need to do if CPOWs go away anyways).

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 1

•

10 years ago

Attached patch Marionette test case reproducing the issue (deleted) — Details — Splinter Review

This is the test. Run with --e10s to reproduce.

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Assignee: nobody → cmanchester

Status: NEW → ASSIGNED

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Summary: Marionette fails during get_window_handles when accessing a CPOW → Marionette fails during get_window_handles with e10s turned on when accessing a CPOW

David Burns :automatedtester

Updated

•

10 years ago

Keywords: ateam-marionette-server

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 2

•

10 years ago

From a quick test, translating use of a CPOW to use messages (a new command or small script to load into each browser we need to identify) effectively works around this.

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Blocks: 1125484

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Summary: Marionette fails during get_window_handles with e10s turned on when accessing a CPOW → Marionette fails during getWindowHandles with e10s turned on when accessing a CPOW

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 3

•

10 years ago

I see the same now for my work on handling chrome windows on bug 1123401. The failure might be a bit different, but it also happens given that the latest nightly has e10s turned on, and we do not disable it anymore by default. https://travis-ci.org/mozilla/firefox-ui-tests/builds/48325279 File "/home/travis/build/mozilla/firefox-ui-tests/client/marionette/marionette.py", line 1293, in execute_script filename=os.path.basename(frame[0])) File "/home/travis/build/mozilla/firefox-ui-tests/client/marionette/decorators.py", line 36, in _ return func(*args, **kwargs) File "/home/travis/build/mozilla/firefox-ui-tests/client/marionette/marionette.py", line 634, in _send_message self._handle_error(response) File "/home/travis/build/mozilla/firefox-ui-tests/client/marionette/marionette.py", line 689, in _handle_error raise errors.JavascriptException(message=message, status=status, stacktrace=stacktrace)JavascriptException: JavascriptException: NS_ERROR_FAILURE: stacktrace: execute_script @windows.py, line 284 inline javascript, line 1 src: " var win = window.open();"

Blocks: 1123401

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 4

•

10 years ago

The simplest command to execute for reproduction for me is: def test_failure(self): self.marionette.execute_script(""" window.open(); """) Then it fails with the above mentioned error.

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 5

•

10 years ago

Henrik, this doesn't look like the same issue. I think this issue could be bug 1047603.

No longer blocks: 1123401

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 6

•

10 years ago

Hi Mike. Because this works most of the time, and working around it by sending messages instead of using a CPOW works around it, it seems like this could be an issue with CPOWs and of interest to e10s. Alternatively, this code could be making a dangerous assumption about CPOWs that I need to fix. I don't expect you're familiar with marionette, but the code that hits it is: https://hg.mozilla.org/mozilla-central/file/6b29617a738f/testing/marionette/marionette-server.js#l1387 When run as a part of the test case attached to this bug (described in comment 0), calling QueryInterface throws undefined. Do you have a sense of whether this sort of thing should be expected to reliably work? Thanks in advance, and please let me know if there might be someone else on your team I should contact about this.

Flags: needinfo?(mconley)

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 7

•

10 years ago

When I say "throws undefined", I mean that when we are in a catch block starting with "catch (e) {...", "e" within that block is undefined.

Mike Conley (:mconley) (:⚙️)

Comment 8

•

10 years ago

It's possible that the contentWindow on the content side no longer exists, and you're attempting to access a "dead CPOW". Do you see warnings about dead CPOWs in the browser console when you hit this?

Flags: needinfo?(mconley) → needinfo?(cmanchester)

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 9

•

10 years ago

Thanks, I do have: "JavaScript error: , line 0: Error: operation not possible on dead CPOW" just before the exception. Is there a rule of thumb for safe handling of CPOWs that might be dead, or is it a question of individual callers knowing when the wrapped object must be alive (I'm not sure how to do that in this case, but I imagine it's possible)?

Flags: needinfo?(cmanchester) → needinfo?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 10

•

10 years ago

(In reply to Chris Manchester [:chmanchester] from comment #9) > Thanks, I do have: "JavaScript error: , line 0: Error: operation not > possible on dead CPOW" just before the exception. Is there a rule of thumb > for safe handling of CPOWs that might be dead, or is it a question of > individual callers knowing when the wrapped object must be alive (I'm not > sure how to do that in this case, but I imagine it's possible)? CPOWs are wonderful and dangerous things. :) While they do the job of transparently proxying synchronous calls to and from a remote object, they do not hold a reference to that object. That means if the object is garbage collected on the content side, the CPOW is now "dead" and cannot be operated upon. In order for the CPOW to stay alive, something needs to be holding a reference to the thing you want to keep alive in the content process. I believe you can check the dead-ness of a CPOW with Cu.isDeadWrapper, so that might help you avoid the exception. But if you really need those contentWindows to not be dead, you need something on the content side to hold a reference to them.

Flags: needinfo?(mconley)

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 11

•

10 years ago

(In reply to Mike Conley (:mconley) - Needinfo me! from comment #10) > (In reply to Chris Manchester [:chmanchester] from comment #9) > > Thanks, I do have: "JavaScript error: , line 0: Error: operation not > > possible on dead CPOW" just before the exception. Is there a rule of thumb > > for safe handling of CPOWs that might be dead, or is it a question of > > individual callers knowing when the wrapped object must be alive (I'm not > > sure how to do that in this case, but I imagine it's possible)? > > CPOWs are wonderful and dangerous things. :) While they do the job of > transparently proxying synchronous calls to and from a remote object, they > do not hold a reference to that object. That means if the object is garbage > collected on the content side, the CPOW is now "dead" and cannot be operated > upon. > > In order for the CPOW to stay alive, something needs to be holding a > reference to the thing you want to keep alive in the content process. > > I believe you can check the dead-ness of a CPOW with Cu.isDeadWrapper, so > that might help you avoid the exception. But if you really need those > contentWindows to not be dead, you need something on the content side to > hold a reference to them. Guarding with isDeadWrapper does the trick, thanks! In our case we're using the window IDs just to figure out what windows are around, so avoiding dead ones is probably enough.

Brad Lassey [:blassey] (use needinfo?)

Updated

•

10 years ago

Blocks: e10s-harness

tracking-e10s: --- → +

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 12

•

10 years ago

Things aren't quite as easy to fix as I've said in comment 11, but I have a fix for this based on mapping browsers to outerWindowId's rather than always going through a CPOW (waiting on try results now). I noticed this same feature is being added in bug 1077168, so maybe we can piggyback on that when it lands.

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 13

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f7fb8243466f https://treeherder.mozilla.org/#/jobs?repo=try&revision=ea50f94c6909

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 14

•

10 years ago

Attached file MozReview Request: bz://1125377/chmanchester (obsolete) (deleted) — Details

/r/3877 - Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. /r/3879 - Bug 1125377 - Marionette test case reproducing the issue. Pull down these commits: hg pull review -r d7bb50999c3557513af8a6501b65c41abe53c8ba

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 15

•

10 years ago

I've been working on this concurrently with bug 1096488 (it's sort of a flavor of the same problem, assuming we can always access an outerWindowId safely, and assuming the outerWindowId we're using as a listenerId never changes), but I think this should land first and can race with bug 1077168, which will afford a simplification but isn't strictly needed. The second commit above is erahm's test, it takes about 60 seconds to run but might be a worthy regression test.

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 16

•

10 years ago

Comment on attachment 8564438 [details] MozReview Request: bz://1125377/chmanchester /r/3877 - Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. /r/3879 - Bug 1125377 - Marionette test case reproducing the issue. Pull down these commits: hg pull review -r d7bb50999c3557513af8a6501b65c41abe53c8ba

Attachment #8564438 - Flags: review?(jgriffin)

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 17

•

10 years ago

Comment on attachment 8564438 [details] MozReview Request: bz://1125377/chmanchester /r/3877 - Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. /r/3879 - Bug 1125377 - Marionette test case reproducing the issue. Pull down these commits: hg pull review -r 0e1246544cb80f6b21b4bf79d08a85b19cac4b19

Jonathan Griffin (:jgriffin)

Updated

•

10 years ago

Attachment #8564438 - Flags: review?(jgriffin)

Jonathan Griffin (:jgriffin)

Comment 18

•

10 years ago

Comment on attachment 8564438 [details] MozReview Request: bz://1125377/chmanchester https://reviewboard.mozilla.org/r/3875/#review3115 ::: testing/marionette/marionette-server.js (Diff revision 2) > register: function BO_register(uid, id) { With this patch, this method no longer uses the 'id' parameter, so we should adjust it and all call sites. Very nice, I didn't know about browser.permanentKey.

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 19

•

10 years ago

Try with update from review: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e6237c792814 https://treeherder.mozilla.org/#/jobs?repo=try&revision=85f807b89cdd

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 20

•

10 years ago

Comment on attachment 8564438 [details] MozReview Request: bz://1125377/chmanchester /r/3877 - Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. /r/3879 - Bug 1125377 - Marionette test case reproducing the issue. Pull down these commits: hg pull review -r 643f714fe9b114ca20121e25ae3a5eb8060019da

Attachment #8564438 - Flags: review?(jgriffin)

Jonathan Griffin (:jgriffin)

Comment 21

•

10 years ago

Comment on attachment 8564438 [details] MozReview Request: bz://1125377/chmanchester https://reviewboard.mozilla.org/r/3875/#review3267 Ship It!

Attachment #8564438 - Flags: review?(jgriffin) → review+

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 22

•

10 years ago

Thanks for the review! https://hg.mozilla.org/integration/mozilla-inbound/rev/23c081755b11

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 23

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/23c081755b11

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

status-firefox38: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla38

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 24

•

10 years ago

This took care of the issue in AWSY and the test in comment 1 but not the cause of the intermittents. The failure is at the point we go to access the contentWindowAsCPOW of a newly opened window so reopening. Bug 1077168 will allow us to get our window ID without using a CPOW at all, so that should take care of this. Short of that we could just try/catch around that point, which is unfortunate, but this really seems like a timing quirk that needs to be worked around.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 25

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b0f90421f48 https://treeherder.mozilla.org/#/jobs?repo=try&revision=8bd8db137b12

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Depends on: 1136378

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

10 years ago

Status: REOPENED → RESOLVED

Closed: 10 years ago → 10 years ago

Resolution: --- → FIXED

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 26

•

9 years ago

Attachment #8564438 - Attachment is obsolete: true

Attachment #8619216 - Flags: review+

Attachment #8619217 - Flags: review+

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 27

•

9 years ago

Attached file MozReview Request: Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. (deleted) — Details

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 28

•

9 years ago

Attached file MozReview Request: Bug 1125377 - Marionette test case reproducing the issue. (deleted) — Details

BMO Automation

Updated

•

2 years ago

Product: Testing → Remote Protocol

Marionette test case reproducing the issue 10 years ago Chris Manchester (limited bugmail, email directly) (deleted), patch		Details \| Diff \| Splinter Review
MozReview Request: bz://1125377/chmanchester 10 years ago Chris Manchester (limited bugmail, email directly) (deleted), text/x-review-board-request	jgriffin : review+	Details
MozReview Request: Bug 1125377 - Store window ids in a weak map keyed by browser objects to prevent attempts to access dead windows in marionette's getWindowHandles. 9 years ago Chris Manchester (limited bugmail, email directly) (deleted), text/x-review-board-request	jgriffin : review+	Details
MozReview Request: Bug 1125377 - Marionette test case reproducing the issue. 9 years ago Chris Manchester (limited bugmail, email directly) (deleted), text/x-review-board-request	jgriffin : review+	Details