Closed Bug 1525126 Opened 6 years ago Closed 5 years ago

Add Android support to geckodriver

Categories

(Testing :: geckodriver, enhancement, P1)

Version 3
enhancement

Tracking

(firefox71 fixed)

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: nalexander, Assigned: whimboo)

References

(Blocks 5 open bugs)

Details

Attachments

(4 files, 6 obsolete files)

(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details

chromedriver has great support for testing Chrome on Android and arbitrary WebView-consuming Apps. From the documentation:

Android-specific Desired Capabilities
The following capabilities are applicable to both Chrome and WebView apps:

    androidPackage: The package name of the Chrome or WebView app.
    androidDeviceSerial: (Optional) The device serial number on which to launch the app (See Multiple Devices section below).
    androidUseRunningApp: (Optional) Attach to an already-running app instead of launching the app with a clear data directory.

The following capabilities are only applicable to WebView apps.

    androidActivity: Name of the Activity hosting the WebView.
    androidProcess: (Optional) Process name of the Activity hosting the WebView (as given by ps). If not given, the process name is assumed to be the same as androidPackage.

These are passed as capabilities to the Web Driver instance -- similar, I think, to firefoxOptions.

What's great about these parameters is that they take care of a bunch of frustrating plumbing:

  • killing and starting Apps
  • clearing local data
  • configuring preferences and profiles
  • mapping ports as necessary

This is all kinda-sorta do-able right now with geckodriver. One uses adb to kill an App, and then pushes profiles to the device, and then launches with a bunch of configuration, and then maps ports, and tries to ensure that Marionette has started, and then invokes geckodriver with a --connect-existing flag, and then tries to figure out how to shut everything down.

I propose to bring all of that "in house": let's make it as easy to use geckodriver on Android as it is to use chromedriver on Android.

The capabilities will need to follow the extension capabilities naming described in webdriver spec[1]/

It might be good to port over the mozdevice[2] python package, well the parts that we need, to its own crate that we can then just consume that from geckodriver.

[1] https://w3c.github.io/webdriver/#dfn-extension-capability
[2] https://searchfox.org/mozilla-central/source/testing/mozbase/mozdevice

This definitely seems like an important thing to do in order to make GeckoView as easy to test as WebView.

So, interestingly, the way that Chromedriver works here is to require the user to start the adb server, but then to have a custom implentation of the protocol using the local server for transport. This is different to mozdevice which runs the adb binary in a subprocess for each command and then parses the stdout.

Some relevant user documentation can be found on
http://chromedriver.chromium.org/getting-started/getting-started---android.
As jgraham notes, it requires the user to start the adb server
manually. That seems like something we can improve on.

(In reply to David Burns :automatedtester from comment #1)

The capabilities will need to follow the extension capabilities
naming described in webdriver spec[1]/

This only applies to the outer object, so as long as we ensure the
new Android capabilities are located on the moz:firefoxOptions
object, as they are with goog:chromeOptions, we can mirror the
chromedriver configuration exactly.

Priority: -- → P3

So, interestingly, the way that Chromedriver works here is to require the user to start the adb server, but then to have a custom implentation of the protocol using the local server for transport. This is different to mozdevice which runs the adb binary in a subprocess for each command and then parses the stdout.

Yeah, I deep-dived into how ChromeDriver does this, and I can't quite understand why they elect to talk to adb over TCP. adb is notoriously flaky, but it's my understanding that the really flaky part is the USB protocol layer, not the command driver layer. There's a little bit of locking in ChromeDriver to stop two sessions trying to drive the same Android App (really, device) simultaneously, but it's just internal locking -- it wouldn't stop you running two chromedriver processes, for example. The real reason that I thought they would do this is that they would know when a device disappears... but the open CDP connection tells them that already. (And, for geckodriver, the open Marionette connection will tell them as well.)

I started to investigate talking adb over TCP in Rust and it's easy enough, if a little tedious. I think it's better just to start invoking adb ... for a while and see how it goes. It's not particularly complicated. So yes, it looks like porting some small pieces of mozdevice into a Rust library. I'll put together something and get some feedback.

This is partial work on supporting android{PackageName,Activity,...}
in the moz:firefoxOptions capabilities block.

Not all of these make sense for GeckoView; I'll sort that out as I
develop the rest of the commit sequence.

Blocks: 1560046

Thanks Nick for your time yesterday! We are kinda looking forward to get this enhancement landed for geckodriver.

As it turned out bug 1298921 is no longer a blocker, and that quitting the browser on Android will be done via adb by killing the process.

Assignee: nobody → nalexander
Status: NEW → ASSIGNED
No longer depends on: 1298921
Priority: P3 → P1

Denis asked about setting environment variables for GeckoDriver-started Android invocations. Just noting it here, although it might become follow-up.

Nick, in regards of our discussions during the last All Hands, would you be able to supply an updated patch? I vaguely remember that you wanted to add/change something compared to the code which is attached to this bug. Thanks!

Flags: needinfo?(nalexander)

This implementation speaks the ADB wire protocol over TCP/IP. This in
constrast to the Python implementation, which generally invokes adb
on the command line. In thousands of runs across multiple devices,
this implementation has proved surprisingly robust.

This allows to port-forward using adb. Connecting to such ports
always establishes a TCP connection to the ADB daemon, but not all
such connections are to bound ports. Waiting for the Marionette
handshake ensures the connection is real, and not just partially
forwarded.

Depends on D39812

See Bug 1298921 for
some discussion of the underlying issue. This is more-or-less what
chromedriver does.

Depends on D39814

This is optional: it's just useful to have confirmation when invoking
large chains of wrappers (i.e., browsertime) that the invoked
geckodriver is the expected version.

Depends on D39815

whimboo: hey, here's an updated series, quite a bit cleaner. This builds but I haven't tested it at all (on Desktop or on Android). I can run browsertime against it pretty easily tomorrow, but you might get to it first. See what you think?

Flags: needinfo?(nalexander)

Thanks Nick! I will have a look at it soon. First have to finish bug 1559592 to unblock Fission work.

Blocks: 1520585
Flags: needinfo?(hskupin)

In testing this, I see:

./mach browsertime -- --firefox.geckodriverPath /Users/nalexander/Mozilla/gecko/target/debug/geckodriver --firefox.binaryPath /Applications/Firefox\ Nightly.app/Contents/MacOS/firefox-bin https://reddit.com -n 2 -vvv
...
[2019-08-01 16:13:45] DEBUG: [browsertime] Telling browser to quit.
1564701225622	webdriver::server	DEBUG	-> DELETE /session/cbd8f31c-574f-1948-be8c-4c0fc06c1f19
1564701225624	Marionette	DEBUG	0 -> [0,406,"Marionette:Quit",{"flags":["eForceQuit"]}]
1564701225625	Marionette	INFO	Stopped listening on port 65249
1564701225681	Marionette	DEBUG	0 <- [1,406,null,{"cause":"shutdown"}]
1564701225697	webdriver::server	DEBUG	Deleting session
JavaScript error: chrome://formautofill/content/FormAutofillFrameScript.js, line 86: Error: Unexpected event type
1564701225709	Marionette	DEBUG	0 -> [0,407,"Marionette:Quit",{"flags":["eForceQuit"]}]
1564701225709	Marionette	DEBUG	0 <- [1,407,{"error":"invalid session id","message":"Tried to run command without establishing a connection","stacktrace":"WebDriv ... t@chrome://marionette/content/server.js:249:9\n_onJSONObjectReady/<@chrome://marionette/content/transport.js:503:20\n"},null]
1564701225722	Marionette	DEBUG	Closed connection 0

The form autofill issue doesn't appear related. My work-around does not apply to Desktop at all, so I guess this is independent?

./mach browsertime -- --firefox.geckodriverPath /Users/nalexander/Mozilla/gecko/target/debug/geckodriver --android https://reddit.com -n 2 -vvv
...
1564703414394	webdriver::server	DEBUG	-> DELETE /session/c815aed9-d836-4bcf-8407-563bbd975615
1564703414397	webdriver::server	DEBUG	<- 500 Internal Server Error {"value":{"error":"unsupported operation","message":"Only supported in Firefox","stacktrace":"WebDriverError@chrome://marionette/content/error.js:175:5\nUnsupportedOperationError@chrome://marionette/content/error.js:493:5\nassert.that/<@chrome://marionette/content/assert.js:428:13\nassert.firefox@chrome://marionette/content/assert.js:97:57\nGeckoDriver.prototype.quit@chrome://marionette/content/driver.js:3466:10\ndespatch@chrome://marionette/content/server.js:305:40\nexecute@chrome://marionette/content/server.js:275:16\nonPacket/<@chrome://marionette/content/server.js:248:20\nonPacket@chrome://marionette/content/server.js:249:9\n_onJSONObjectReady/<@chrome://marionette/content/transport.js:503:20\n"}}
[2019-08-01 16:50:14] ERROR: [browsertime] BrowserError: Only supported in Firefox
    at SeleniumRunner.stop (/Users/nalexander/Mozilla/gecko/tools/browsertime/node_modules/browsertime/lib/core/seleniumRunner.js:415:15)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
[2019-08-01 16:50:14] DEBUG: [browsertime] Telling browser to quit.

That's as expected, 'cuz my work-around doesn't supress the Marionette:Quit message. (Perhaps it should!)

Attachment #9041515 - Attachment is obsolete: true

whimboo, others: the browsertime-in-automation MVP is getting close to wanting this. Can I gently nudge it up the priority queue?

Hey Nick. This bug is still on my radar and #2 on my priority list. Once I'm done with bug 1559592, it will be next for me. So early next week, or even already tomorrow.

Just a question ahead... I assume your patches still work, and it mostly needs a refactoring to have pretty and maintainable code? Or what else will be necessary?

Keeping the needinfo flag for now.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #19)

Hey Nick. This bug is still on my radar and #2 on my priority list. Once I'm done with bug 1559592, it will be next for me. So early next week, or even already tomorrow.

Great!

Just a question ahead... I assume your patches still work, and it mostly needs a refactoring to have pretty and maintainable code? Or what else will be necessary?

Yes, these patches are working: in fact, I want them for Bug 1574182 (browsertime), and I've built most of Bug 1534533 (toolchain geckodriver) in order to incorporate them into browsertime :)

I did a significant pass so I expect little refactoring will be necessary (but of course, it's your call). Updating the list I posted on an abandoned early revision:

  1. done there's general clean up required -- it doesn't lint, there are comments, etc.

  2. done there's a Pre: part about waiting for connections to actually handshake rather than just TCP open, and backing off while we wait. It's in the patches but should be separated out as its own commit. This is necessary because ADB port-forwarded TCP ports open even when there's nothing listening on the Android device, i.e., before Marionette has bound the port.

  3. done there's a work-around for Bug 1298921 in place that probably doesn't make sense for non-Android vehicles. Maybe the smart way to address this is to add an Android-specific feature flag (sibling to androidPackage, say) that controls this behaviour (and default it on, since things are broken right now).

  4. for follow-up the code as written uses command line arguments to specify -profile, etc. That's not the GeckoView approach: see Bug 1533385 and the GeckoView automation docs. So this should also get a flag for Fennec and some code to control the GeckoView configuration file.

  5. for follow-up there's no support for connecting to a running Marionette instance. I haven't thought about that at all.

  6. needed? I haven't written any documentation.

Keeping the needinfo flag for now.

Attachment #9081504 - Attachment is obsolete: true

whimboo: two things.

  1. First, it might be that the bzip2 dependency I think I remove in the WIP for Bug 1534533 might already have happened, and therefore we might need a ./mach vendor rust or equivalent for this stack.

  2. While upgrading to selenium-webdriver 4.0.0 alpha-something, I uncovered an issue with how moz:firefoxOptions is validated. I gather that using deprecated capabilities matching, some additional validation was ignored. With newer Selenium, that validation happens, and the following patch is needed to validate the new moz:firefoxOptions fields:

diff --git a/testing/geckodriver/src/capabilities.rs b/testing/geckodriver/src/capabilities.rs
--- a/testing/geckodriver/src/capabilities.rs
+++ b/testing/geckodriver/src/capabilities.rs
@@ -181,34 +181,26 @@ impl<'a> BrowserCapabilities for Firefox
                 );
                 for (key, value) in data.iter() {
                     match &**key {
-                        "binary" => {
+                        "binary" | "profile" | "androidActivity" | "androidPackage" | "androidDeviceSerial" => {
                             if !value.is_string() {
                                 return Err(WebDriverError::new(
                                     ErrorStatus::InvalidArgument,
-                                    "binary path is not a string",
+                                    format!("{} is not a string", key),
                                 ));
                             }
                         }
-                        "args" => {
+                        "args" | "androidIntentArguments" => {
                             if !try_opt!(
                                 value.as_array(),
                                 ErrorStatus::InvalidArgument,
-                                "args is not an array"
+                                format!("{} is not an array", key)
                             )
                             .iter()
                             .all(|value| value.is_string())
                             {
                                 return Err(WebDriverError::new(
                                     ErrorStatus::InvalidArgument,
-                                    "args entry is not a string",
-                                ));
-                            }
-                        }
-                        "profile" => {
-                            if !value.is_string() {
-                                return Err(WebDriverError::new(
-                                    ErrorStatus::InvalidArgument,
-                                    "profile is not a string",
+                                    format!("{} entry is not a string", key),
                                 ));
                             }
                         }

Hm, validation of the custom capabilities should happen at any time, and not only for alwaysMatch and firstMatch. So I'm not sure what's up, but I will keep it in mind. The changes as listed in the last comment can't be taken 1:1, but I will make sure to incorporate them adequately.

Flags: needinfo?(hskupin)
Blocks: 1559120

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #22)

Hm, validation of the custom capabilities should happen at any time, and not only for alwaysMatch and firstMatch. So I'm not sure what's up, but I will keep it in mind. The changes as listed in the last comment can't be taken 1:1, but I will make sure to incorporate them adequately.

Can you suggest what's actually needed here? This just looks like argument validation -- no idea why we do that twice.

No longer blocks: 1520585

(In reply to Nick Alexander :nalexander [he/him] from comment #23)

Can you suggest what's actually needed here? This just looks like argument validation -- no idea why we do that twice.

I will have a look when I'm at this revision of the patch series. Currently making it still through mozdevice. I should have an update soon.

Assignee: nalexander → hskupin

The refactoring of the mozdevice code actually takes longer than expected, and as such we made the decision to just clean-up the current patch, and then follow-up with the refactoring of the package. It's actually more important to get Nick and other teams unblocked.

Note that we still want to wait with landing this patch until the geckodriver 0.25 release (bug 1520585) is out. Nick, for browsertime to work do you need an official release with these patches, or are taskcluster builds enough for now?

Flags: needinfo?(nalexander)
Blocks: 1578424

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #25)

The refactoring of the mozdevice code actually takes longer than expected, and as such we made the decision to just clean-up the current patch, and then follow-up with the refactoring of the package. It's actually more important to get Nick and other teams unblocked.

I have no strong opinions about the existing mozdevice expression: re-write or throw it away and start again. The API is pretty simple; there's probablyjust:

  1. run command
  2. run shell command
  3. push
  4. (eventually) pull

If you prefer, maybe write a version that uses adb on the command line to achieve that? Personally I have found the TCP connection to be much more robust, but there are all sorts of weird things in that protocol (and the current expression, which isn't tidy).

Note that we still want to wait with landing this patch until the geckodriver 0.25 release (bug 1520585) is out. Nick, for browsertime to work do you need an official release with these patches, or are taskcluster builds enough for now?

TC builds are fine by us: just as long as the bits are in the tree so that we can fetch them with the new toolchain tasks.

Flags: needinfo?(nalexander)
Blocks: 1575021

(In reply to Nick Alexander :nalexander [he/him] from comment #26)

If you prefer, maybe write a version that uses adb on the command line to achieve that? Personally I have found the TCP connection to be much more robust, but there are all sorts of weird things in that protocol (and the current expression, which isn't tidy).

I moved this conversation over to bug 1578424.

Note that we still want to wait with landing this patch until the geckodriver 0.25 release (bug 1520585) is out. Nick, for browsertime to work do you need an official release with these patches, or are taskcluster builds enough for now?

TC builds are fine by us: just as long as the bits are in the tree so that we can fetch them with the new toolchain tasks.

Perfect! That gives us more time for testing and improvements.

Blocks: 1578698

Henrik asked for some capabilities objects for GVE invocations. Here's how I get them. Just FYI, you can find some geckodriver binaries with these patches (slightly rebased and extended) in https://bugzilla.mozilla.org/show_bug.cgi?id=1577850#c2. To add additional logging, I used this patch:

diff --git a/testing/webdriver/src/command.rs b/testing/webdriver/src/command.rs
--- a/testing/webdriver/src/command.rs
+++ b/testing/webdriver/src/command.rs
@@ -453,6 +453,13 @@ impl<'de> Deserialize<'de> for NewSessio
         D: Deserializer<'de>,
     {
         let value = serde_json::Value::deserialize(deserializer)?;
+        let mut debug_value = value.clone();
+        if let Some(profile) = debug_value.pointer_mut("/desiredCapabilities/moz:firefoxOptions/profile") {
+            info!("let Some");
+            *profile = serde_json::Value::String(format!("<string of length {}>", profile.as_str().unwrap().len()));
+        }
+        info!("value: {}", debug_value.to_string());
+
         if let Some(caps) = value.get("capabilities") {
             if !caps.is_object() {
                 return Err(de::Error::custom("capabilities must be objects"));

To test everything in GVE, an artifact Android build is fine.

  1. ./mach build and ./mach android install-geckoview-example
  2. cd testing/geckodriver && cargo build && cd (to produce $topsrcdir/target/debug/geckodriver)
  3. ./mach browsertime --setup (the dependencies aren't all that important, TBH, but this is supposed to work, modulo some small issues I need to address)
  4. ./mach browsertime -- --firefox.geckodriverPath /Users/nalexander/Mozilla/gecko/target/debug/geckodriver --android https://reddit.com -n 1 -vv produces output like:
{
    "desiredCapabilities": {
        "browserName": "firefox",
        "moz:firefoxOptions": {
            "androidActivity": "org.mozilla.geckoview_example.GeckoViewActivity",
            "androidIntentArguments": [],
            "androidPackage": "org.mozilla.geckoview_example",
            "args": [
                "-no-remote"
            ],
            "binary": "/Applications/Firefox.app/Contents/MacOS/firefox-bin",
            "profile": "<string of length 765832>"
        },
        "pageLoadStrategy": "normal"
    }
}

You can see that the output is "old-style" desiredCapabilities. That's 'cuz our fork of browsertime is still based on selenium-webdriver 3.6.0. I have WIP to pull our version forward to 4.0.0-alpha-something, which uses modern negotiation. When I apply that work in progress, and force more logging, I get output like:

{
    "capabilities": {
        "alwaysMatch": {
            "browserName": "firefox",
            "moz:firefoxOptions": {
                "androidActivity": "org.mozilla.geckoview_example.GeckoViewActivity",
                "androidIntentArguments": [],
                "androidPackage": "org.mozilla.geckoview_example",
                "args": [
                    "-no-remote"
                ],
                "binary": "/Applications/Firefox.app/Contents/MacOS/firefox-bin",
                "prefs": {
                    "app.update.disabledForTesting": true,
                    ...
                    "xpinstall.whitelist.add.36": ""
                },
                "profile": "<string of length 758064>"
            },
            "pageLoadStrategy": "normal"
        }
    },
    "desiredCapabilities": {
        "browserName": "firefox",
        "moz:firefoxOptions": {
            "androidActivity": "org.mozilla.geckoview_example.GeckoViewActivity",
            "androidIntentArguments": [],
            "androidPackage": "org.mozilla.geckoview_example",
            "args": [
                "-no-remote"
            ],
            "binary": "/Applications/Firefox.app/Contents/MacOS/firefox-bin",
            "prefs": {
                "app.update.disabledForTesting": true,
                ...
                "xpinstall.whitelist.add.36": ""
            },
            "profile": "<string of length 758064>"
        },
        "pageLoadStrategy": "normal"
    }
}

(I have pretty-printed all the JSON here, 'cuz the second one is hard to parse by eye.) Hope that helps, Henrik!

Flags: needinfo?(hskupin)

Also, I see that I'm ignoring args: I set/add arguments by hand to am start. We might want to pass them through? It's unclear what they even mean: -no-remote is very Desktop oriented and probably doesn't make sense for GeckoView. (I'm not even sure it parses, 'cuz GV command line handling is very different to Desktop command line handling.)

(In reply to Nick Alexander :nalexander [he/him] from comment #29)

Also, I see that I'm ignoring args: I set/add arguments by hand to am start. We might want to pass them through?

What is the behavior of the chromedriver here? Can you also show two examples of the generated capabilities similar to Firefox? I would tend to say that we should re-use the args capability for Android, and just get rid of our custom androidIntentArguments capability. We can then set different defaults for Android.

Flags: needinfo?(hskupin) → needinfo?(nalexander)
Blocks: 1579001

(In reply to Nick Alexander :nalexander [he/him] from comment #20)

  1. for follow-up the code as written uses command line arguments to specify -profile, etc. That's not the GeckoView approach: see Bug 1533385 and the GeckoView automation docs. So this should also get a flag for Fennec and some code to control the GeckoView configuration file.

Nick, would you mind filing a follow-up bug for that with all the details?

  1. for follow-up there's no support for connecting to a running Marionette instance. I haven't thought about that at all.

There is bug 1579001 now.

  1. needed? I haven't written any documentation.

Yes, I have to do that.

Depends on: 1579022
No longer blocks: 1579001
Blocks: 1579001
No longer depends on: 1579022

When reading the Chromedriver docs for the device serial I noticed that if multiple devices are attached, the driver selects a random unused device. To keep compatibility we should handle it the same way, which means that we cannot simply return a failure.

Attachment #9081503 - Attachment is obsolete: true

This implementation speaks the ADB wire protocol over TCP/IP. This is
in constrast to the Python implementation, which generally invokes adb
on the command line. In thousands of runs across multiple devices,
this implementation has proved surprisingly robust.

This allows to port-forward using adb. Connecting to such ports
always establishes a TCP connection to the ADB daemon, but not all
such connections are to bound ports. Waiting for the Marionette
handshake ensures the connection is real, and not just partially
forwarded.

Depends on D39812

Depends on D44895

This patch allows geckodriver to interact with processes of
GeckoView packages on Android devices while running itself
on a host machine. The connection to and from the application
under test is done via adb forward/reverse ports.

Depends on D39813

Depends on D44896

I just pushed some updated patches. They contain a lot of clean-ups, and a bit of refactoring. I will still have to check some more mozdevice features in more detail, especially the push_dir command.

Sadly I'm not able to test the patches with browsertime because it cannot start Firefox:

[2019-09-05 22:32:10] INFO: [browsertime] Running tests using Firefox - 1 iteration(s)
[2019-09-05 22:32:10] INFO: [browsertime] Browser failed to start, trying one more time: Could not locate Firefox on the current system
[2019-09-05 22:32:11] INFO: [browsertime] Browser failed to start, trying one more time: Could not locate Firefox on the current system
[2019-09-05 22:32:11] INFO: [browsertime] Browser failed to start, trying one more time: Could not locate Firefox on the current system
[2019-09-05 22:32:11] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries

Nevertheless I created a small Python script which leverages geckodriver directly, and it starts GeckoView example, navigates to https://example.com, and forces a shutdown of the app.

Nick, would you mind testing with browsertime?

Attachment #9081502 - Attachment is obsolete: true
Attachment #9081501 - Attachment is obsolete: true
Attachment #9081500 - Attachment is obsolete: true

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #32)

When reading the Chromedriver docs for the device serial I noticed that if multiple devices are attached, the driver selects a random unused device. To keep compatibility we should handle it the same way, which means that we cannot simply return a failure.

This is unhelpful behaviour; we shouldn't copy it. Nothing here is really specified so I think we want to be guided by chromedriver but not identical.

As for testing browsertime: you're seeing a dumb limitation of selenium-webdriver (for Node.js) which doesn't know anything about Android support in geckodriver (yet). So it's trying to find Firefox and apparently can't on your device. Pass a dummy path, like --firefox.binaryPath=/usr/bin/true or whatever and the check will be skipped. (And that Firefox binary won't be executed in the Android branch.)

But yes, I can try to test later today as well.

Flags: needinfo?(nalexander)

While I'm here, for point 5. of https://bugzilla.mozilla.org/show_bug.cgi?id=1525126#c20 -- connecting to a running GeckoView App -- I should note that this is basically impossible right now. The reason is that all Apps collide over which Marionette port to choose; the only "real" solution is to randomize the ports. (Right now this uses 2829 just so that it's not the default 2828.) The real solution to this is to use an abstract Unix socket (like /data/data/$PACKAGE/marionette...), which is how chromedriver and the Remote Debugger know where to look for package-specific sockets of this type.

A temporary workaround would be to make the chosen port a function of the Android package name (say, 2829 + sha1(package_name) % 997) and have GeckoView and geckodriver agree on that function. Then at least some collisions are avoided.

I looked at making Marionette uses abstract Unix sockets and it's a pain; there are many assumptions on the form of the Marionette port pref, such that it can't generalize to a socket address easily. There's tickets about this; see, for example, https://bugzilla.mozilla.org/show_bug.cgi?id=1533704#c16.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #31)

(In reply to Nick Alexander :nalexander [he/him] from comment #20)

  1. for follow-up the code as written uses command line arguments to specify -profile, etc. That's not the GeckoView approach: see Bug 1533385 and the GeckoView automation docs. So this should also get a flag for Fennec and some code to control the GeckoView configuration file.

Nick, would you mind filing a follow-up bug for that with all the details?

Yes: I implemented this in the patch for Bug 1577850. That might just get folded into the patches for this ticket: it's essential to support Fenix.

  1. for follow-up there's no support for connecting to a running Marionette instance. I haven't thought about that at all.

There is bug 1579001 now.

  1. needed? I haven't written any documentation.

Yes, I have to do that.

Great! I'm happy to help with this.

The way I see things, the real issue that we need to hammer out is how to separate the bits that should be "capabilities" from the bits that are just "options". When that's in place, everything else looks like implementation details from where I sit.

Thanks so much for driving this, :whimboo!

(In reply to Nick Alexander :nalexander [he/him] from comment #37)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #32)

When reading the Chromedriver docs for the device serial I noticed that if multiple devices are attached, the driver selects a random unused device. To keep compatibility we should handle it the same way, which means that we cannot simply return a failure.

This is unhelpful behaviour; we shouldn't copy it. Nothing here is really specified so I think we want to be guided by chromedriver but not identical.

Ok, so we can leave it as it is right now. If problems are coming up in the future, we could always re-evaluate that decision.

As for testing browsertime: you're seeing a dumb limitation of selenium-webdriver (for Node.js) which doesn't know anything about Android support in geckodriver (yet). So it's trying to find Firefox and apparently can't on your device. Pass a dummy path, like --firefox.binaryPath=/usr/bin/true or whatever and the check will be skipped. (And that Firefox binary won't be executed in the Android branch.)

That works great now! Thanks.

(In reply to Nick Alexander :nalexander [he/him] from comment #39)

The way I see things, the real issue that we need to hammer out is how to separate the bits that should be "capabilities" from the bits that are just "options". When that's in place, everything else looks like implementation details from where I sit.

Capability-wise isn't it just the platform which should be Android here? I'm not sure about the browser name. By default it should also be Firefox (internal name Fenix), but might also be something else like GeckoView_Example. Otherwise and AFAICS all of the Android options should be under moz:firefoxOptions.

Andreas, do you agree on that, or did I miss something?

Flags: needinfo?(ato)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment
#40)

(In reply to Nick Alexander :nalexander [he/him] from comment #39)

The way I see things, the real issue that we need to hammer
out is how to separate the bits that should be "capabilities"
from the bits that are just "options". When that's in place,
everything else looks like implementation details from where I
sit.

Capability-wise isn't it just the platform which should be
Android here? I'm not sure about the browser name. By default it
should also be Firefox (internal name Fenix), but might also be
something else like GeckoView_Example. Otherwise and AFAICS all
of the Android options should be under moz:firefoxOptions.

Andreas, do you agree on that, or did I miss something?

That sounds about right!

Anything put under moz:firefoxOptions, and this includes the
Android configuration if I’m not mistaken, is treated as plain
configuration not subject to capabilities matching.

WebDriver remote ends and intermediary nodes abuse extension
capabilities to pass around a lot of configuration. To give a
couple of examples, this extends to things such as authentication
tokens for running tests in WebDriver-as-service clouds that are
picked up by the service provider, or intermediary node, that
sits between a WebDriver client and the remote end or geckodriver.

WebDriver uses terminology such as “implementation defined” about
extension capabilities, which we choose to interpret as “treat
moz:firefoxOptions as plain configuration”. Chrome chooses to do
the same for their Android options.

To my recollection the only two new capability configurations we
need to support are:

  1. recognising android (lower-case to match other recommended
    values suggested under matching capabilities in platformName;
  2. and possibly allowing matching on a custom browserName/application
    name as whimboo suggests.
Flags: needinfo?(ato)
Depends on: 1298921

Running browsertime with the --verbose flag against the android emulator I can see the following capabilities returned from geckodriver:

1568042100346 webdriver::server DEBUG <- 200 OK {"value":{"sessionId":"077ce69e-82a4-4718-9047-2473dba31fd1","capabilities":{"acceptInsecureCerts":false,"browserName":"firefox","browserVersion":"71.0a1","moz:accessibilityChecks":false,"moz:buildID":"20190905034828","moz:geckodriverVersion":"0.24.0","moz:headless":false,"moz:processID":3051,"moz:profile":"/storage/emulated/0/org.mozilla.geckoview_example-geckodriver-profile","moz:shutdownTimeout":60000,"moz:useNonSpecCompliantPointerOrigin":false,"moz:webdriverClick":true,"pageLoadStrategy":"normal","platformName":"linux","platformVersion":24,"rotatable":false,"setWindowRect":false,"strictFileInteractability":false,"timeouts":{"implicit":0,"pageLoad":300000,"script":30000},"unhandledPromptBehavior":"dismiss and notify"}}}

So the browserName is firefox because we do not yet correctly set it, and that will be done with bug 1559120. I hope that this can wait until then.

The platformName clearly needs a fix s/linux/android/. The version I have to check if that is the correct one.

Fixed:

  • geckodriver: The Android package didn't get force stopped when no connection to Marionette can be established
  • mozdevice: Added more unit tests to verify the correct behavior of the API
  • mozdevice: The data length of adb responses was evaluated as decimal but not hexadecimal, which caused responses to be read only partly. That also means that remaining data was left in the stream, which can cause problems for follow-up commands. To fix that I replaced the code with a new hex.rs module for encoding and decoding the data length
  • mozdevice: Added utility function to remove all forwarded and reversed connections, which is mainly used for the unit tests
  • mozdevice: Setting port forwards were incorrectly using host-serial:{} instead of just host:
  • mozdevice: Not only a OKAYOKAY status can be returned but also OKAYFAIL.

Todo:

  • When starting GV_example via geckodriver directly Services.appinfo.name reports Fennec instead of Firefox. Using browsertime for the exact some package/activity it correctly returns firefox
  • Setting reverse port forwards doesn't work due to strange ADB errors like: bad forward: tcp:2829:tcp:2829. Those ports aren't in use, so not sure why it's failing. We can most likely postpone this work and handle it with the refactoring

There are more todo's as listed in the source. I will continue tomorrow. The updated patches I'm just going to upload.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #43)

  • When starting GV_example via geckodriver directly Services.appinfo.name reports Fennec instead of Firefox. Using browsertime for the exact some package/activity it correctly returns firefox

I was able to figure that out together with Nick yesterday. The underlying reason is that webdriver sets the browserName capability for capability matching:

https://github.com/SeleniumHQ/selenium/blob/master/javascript/node/selenium-webdriver/firefox.js#L266

this.setBrowserName(Browser.FIREFOX);

As such the exact same value is returned in the capabilities sent by Marionette back to geckodriver. Given that in case of geckoview the Services.appinfo.name is Fennec, we seem to force keeping the incoming capability value in the returned capabilities even it doesn't match at all.

Andreas, is that a bug or really wanted? For me it feels broken.

Flags: needinfo?(ato)

Oh, and the WebDriver spec says:

If value is not a string equal to the "browserName" entry in matched capabilities, return success with data null.

There is a chance the remote end will need to start a browser process to correctly determine the browserName. Lightweight checks are preferred before this is done.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #45)

The underlying reason is that webdriver sets the browserName capability for capability matching:

https://github.com/SeleniumHQ/selenium/blob/master/javascript/node/selenium-webdriver/firefox.js#L266

this.setBrowserName(Browser.FIREFOX);

As such the exact same value is returned in the capabilities sent
by Marionette back to geckodriver. Given that in case of geckoview
the Services.appinfo.name is Fennec, we seem to force keeping
the incoming capability value in the returned capabilities even it
doesn't match at all.

Bear in mind that Marionette does not do any capabilities matching.
It treats the capabilities passed to WebDriver:NewSession as
configuration, and runs only rudimentary type- and bounds checks
to ensure the sanity of the data.

Marionette picks up on the data it needs to configure itself and
assumes the remaining data is correct:
https://searchfox.org/mozilla-central/rev/588814f2edddf0e132d77d326ddae50911e8bad1/testing/marionette/capabilities.js#585

I think perhaps there’s an argument that Marionette should only
care for the specific capabilities it needs for setting various
options, and ignore any additional stuff passed on by geckodriver.
This would let the original value for browserName from
Services.appinfo.name to be passed back:
https://searchfox.org/mozilla-central/rev/588814f2edddf0e132d77d326ddae50911e8bad1/testing/marionette/capabilities.js#438
writing everything back on matched

Flags: needinfo?(ato)

(In reply to Andreas Tolfsen 「:ato」 from comment #47)

I think perhaps there’s an argument that Marionette should only
care for the specific capabilities it needs for setting various
options, and ignore any additional stuff passed on by geckodriver.
This would let the original value for browserName from
Services.appinfo.name to be passed back:
https://searchfox.org/mozilla-central/rev/588814f2edddf0e132d77d326ddae50911e8bad1/testing/marionette/capabilities.js#438
writing everything back on matched

That sounds good. I filed this as bug 1580453, given that it is not directly related to the Android work.

For the correct browser name to use, it will be firefox for each and every Android related browser.

Fixed:

  • Given the feedback from James I moved back the hex encoding/decoding methods into the lib module, and made them inline.
  • I introduced a new DeviceError enum to better handle all the various kinds of errors, and an appropriate Result type.
  • I moved all the Android related code in geckodriver into its own module

It might have been good to also introduce a new GeckoDriverError type to make error handling easier, but as agreed with James we will move this to some other bug.

I will have updated patches soon to get the review process started.

Attachment #9090830 - Attachment description: Bug 1525126 - [mozbase] Add Rust `mozdevice` crate speaking ADB over TCP/IP. → Bug 1525126 - [mozbase] Add Rust `mozdevice` crate speaking ADB over TCP/IP. r=#webdriver
Attachment #9090831 - Attachment description: Bug 1525126 - [geckodriver] Wait for Marionette handshake rather than TCP connection. → Bug 1525126 - [geckodriver] Wait for Marionette handshake rather than TCP connection. r=#webdriver
Attachment #9091758 - Attachment description: Bug 1525126 - [marionette] On Android report "android" as platform name. → Bug 1525126 - [marionette] On Android report "android" as platform name. r=#webdriver
Attachment #9090832 - Attachment description: Bug 1525126 - [geckodriver] Add Android / GeckoView support. → Bug 1525126 - [geckodriver] Add Android / GeckoView support. r=#webdriver

I cannot push to try and see Android builds running yet, so at least I have included all Wdspec tests, and Rust tests to ensure no existing tests are broken:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ccf713355b0d18a9c5ce5124f9c7a7c40e1ae74

Attachment #9091758 - Attachment description: Bug 1525126 - [marionette] On Android report "android" as platform name. r=#webdriver → Bug 1525126 - [marionette] Always use "firefox" and "android" as capabilities for Android browsers. r=#webdriver

I currently see crashes of geckodriver when multiple devices are connected, or if the specified device cannot be found. They all crash in the webdriver dispatcher. I'm currently investigating this crash.

That was actually a problem in how I implemented the new failures for unknown and multiple devices. It's working now locally.

Fixes:

  • Updated the docs for more details of Android support
  • Implemented check if device serial is valid, and to raise UnknownDevice(serial) otherwise
  • Custom error for multiple devices attached

Todo:

  • As I noticed it seems to be not possible to forward ports if multiple devices are attached; even the adb command line doesn't work when specifying -s serial for the adb forward <local> <remote> command. Interestingly adb forward --list works with -s. I would propose that we abort if multiple devices are connected.
  • address more review comments (on Monday)

Sorry, but as it looks like I won't be able to finish this patch today. I totally f*^cked up my local hg history when trying to apply changes to a former commit while staying in histedit. Also mozdevice is somehow totally busted today for me in term of that the ADB forward command no longer works, and the error I get returned from adb is empty. I clearly need more investigation here, but that most likely would have to wait until I'm back on Sep 23rd. Sorry.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] (away 09/17 - 09/23) from comment #53)

Also mozdevice is somehow totally busted today for me in term of that the ADB forward command no longer works, and the error I get returned from adb is empty.

I also witnessed this! And I have now fixed it, but I cannot be certain what is the actual fix :(. So let me try to explain what I witnessed and what I did, so that you can try the same. Hopefully we can diagnose more definitively.

Fundamentally, the issue appears to be updating macOS. For me, that was to 10.14.6. Immediately after I could not ADB forward from geckodriver (either built in try or locally). (Aside: I theorize that there's an interaction with unsigned binaries, notarization, and System Integrity Protection.). The internet suggests that updating the Android SDK platforms tools may help; I ran

./mach bootstrap --no-interactive --application-choice mobile_android

and that did update my adb to

$ adb --version
Android Debug Bridge version 1.0.41
Version 29.0.4-5871666
Installed as /Users/nalexander/.mozbuild/android-sdk-macosx/platform-tools/adb

I still had the issue. I know that the x86 emulator can get into a bad state, so I re-installed with mach android-emulator --version=x86-7.0 --force-update. Still busted. I tried with an Android 28 x86 emulator. Still busted.

Then I remembered that there's a server component to adb. I ran adb kill-server && adb start-server, and this appears to have addressed the issue. I cannot explain what happened here.

So: update the Android SDK, restart the client and server components, see if that helps? Good luck!

Flags: needinfo?(nalexander)

Just FYI, the issue reproduced with pure-python-adb. I didn't try it against a live device, simply 'cuz I solved the problem quite rapidly.

Flags: needinfo?(nalexander)

I just came back from PTO and tried to run the tests again, and what a wonder all of them are passing, without forcing me to have to do any manual work. That's good because all the items Nick mentioned in comment 54 I already did before I went on PTO. The only thing I did was to start the emulator and that without the force option.

This is just mystic, and we shouldn't spent more time in trying to figure out what the underlying reason is. I will continue on the patches today, and hopefully have it updated later today. I want to see them landed as soon as possible.

I have updated all of the revisions with the requested changes included. I also removed the default value for the activity name, which would have bound geckodriver to geckoview_example only if no activity name gets specified in the capabilities. By resolving the main activity for a package, it will also work with Fennec now, when only the package name has been specified.

Summary: Make geckodriver handle Android-specific arguments just like chromedriver does → Add Android support to geckodriver
Attachment #9090832 - Attachment description: Bug 1525126 - [geckodriver] Add Android / GeckoView support. r=#webdriver → Bug 1525126 - [geckodriver] Add Android support. r=#webdriver

Had to push a follow-up try build because a unit test was failing. Here the new results:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=61c73b6c22ab64051eef4dabbd64ff3a482bdfd6&selectedJob=268425603

Hey folks! Built just fine, and testing against a Browsertime based on selenium-webdriver both v3.6.x and v4.0-alpha-something succeeded. I found one issue with androidIntentArguments not being accepted which I pointed out in the relevant source (I think).

Overall, just great work by Henrik. He and I talked about author attribution and I'm totally happy to have his name on the patches: it's better that people look to him (and the greater team) for future changes and support.

Bravo!

Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cee49076fdd8 [mozbase] Add Rust `mozdevice` crate speaking ADB over TCP/IP. r=jgraham,webdriver-reviewers,nalexander https://hg.mozilla.org/integration/autoland/rev/d1f3680d88b3 [geckodriver] Wait for Marionette handshake rather than TCP connection. r=jgraham,webdriver-reviewers,nalexander https://hg.mozilla.org/integration/autoland/rev/e20ffbef6a29 [marionette] Always use "firefox" and "android" as capabilities for Android browsers. r=jgraham,nalexander https://hg.mozilla.org/integration/autoland/rev/8ff6cdd2e04a [geckodriver] Add Android support. r=jgraham,nalexander
Blocks: 1587364

The new Android-specific capabilities are not documented on MDN yet:
https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities/firefoxOptions

Could you take a follow-up on this?

Ideally we would have examples for all of the following:

  • Firefox on Android (Fennec)
  • Fenix (GeckoView based)
  • Other GeckoView based apps
Flags: needinfo?(hskupin)

Also, androidIntentArguments is not documented anywhere I could find.

I took care of updating MDN myself because we decided to not duplicate
moz:firefoxOptions documentation between firefox-source-docs.m.o
and MDN.

I would still appreciate another set of eyes on the Android specific
additions to
https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities/firefoxOptions,
especially around androidIntentArguments and the example.

(In reply to Andreas Tolfsen 「:ato」 from comment #65)

I took care of updating MDN myself because we decided to not duplicate
moz:firefoxOptions documentation between firefox-source-docs.m.o
and MDN.

I would still appreciate another set of eyes on the Android specific
additions to
https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities/firefoxOptions,
especially around androidIntentArguments and the example.

Thanks for getting this started. I documented and added examples for androidIntentArguments. I did everything from memory, so hopefully I didn't slip!

Splendid, thanks!

Flags: needinfo?(hskupin)

Hi! my name is Bruno I contribute to the appium open source project, I'm working on preparing Appium to support geckodriver for Android(and desktop) and I notice the same issue as Nick Alexander was seeing when trying to delete the session.

dbug WD Proxy Proxying [DELETE /] to [DELETE http://127.0.0.1:29146/session/4396812b-225e-46ad-acbf-b1e6322c809e] with body: {}
dbug Geckodriver [STDOUT] 1578274046035 webdriver::server       DEBUG   -> DELETE /session/4396812b-225e-46ad-acbf-b1e6322c809e {}
dbug Geckodriver [STDOUT] 1578274046042 webdriver::server       DEBUG   <- 500 Internal Server Error {"value":{"error":"unsupported operation","message":"Only supported in Firefox","stacktrace":"WebDriverError@chrome://marionette/content/error.js:4:250\nUnsupportedOperationError@chrome://marionette/content/error.js:37:77\nassert.that/<@chrome://marionette/content/assert.js:5:513\nassert.firefox@chrome://marionette/content/assert.js:1:1325\nGeckoDriver.prototype.quit@chrome://marionette/content/driver.js:133:8\ndespatch@chrome://marionette/content/server.js:21:34\nexecute@chrome://marionette/content/server.js:18:170\nonPacket/<@chrome://marionette/content/server.js:17:50\nonPacket@chrome://marionette/content/server.js:17:65\n_onJSONObjectReady/<@chrome://marionette/content/transport.js:23:125\n"}}
info WD Proxy Got an unexpected response with status 500: {"value":{"error":"unsupported operation","message":"Only supported in Firefox","stacktrace":"WebDriverError@chrome://marionette/content/error.js:4:250\nUnsupportedOperationError@chrome://marionette/content/error.js:37:77\nassert.that/<@chrome://marionette/content/assert.js:5:513\nassert.firefox@chrome://marionette/content/assert.js:1:1325\nGeckoDriver.prototype.quit@chrome://marionette/content/driver.js:133:8\ndespatch@chrome://marionette/content/server.js:21:34\nexecute@chrome://marionette/content/server.js:18:170\nonPacket/<@chrome://marionette/content/server.js:17:50\nonPacket@chrome://marionette/content/server.js:17:65\n_onJSONObjectReady/<@chrome://marionette/content/transport.js:23:125\n"}}
dbug W3C Matched W3C error code 'unsupported operation' to UnsupportedOperationError

is this the right place to track the issue? Is there work for this on a branch I can clone and help to test?

Flags: needinfo?(hskupin)

Hi Bruno. Thanks bug is marked as fixed, and was also part of a geckodriver release. For remaining problems please file new bugs. We will use those for investigation and fixes if necessary. Thanks.

Flags: needinfo?(hskupin)

Thanks! I'll file a new bug. (In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #69)

Hi Bruno. Thanks bug is marked as fixed, and was also part of a geckodriver release. For remaining problems please file new bugs. We will use those for investigation and fixes if necessary. Thanks.

Sure. Its done. https://bugzilla.mozilla.org/show_bug.cgi?id=1607996

Blocks: 1607996
Blocks: 1617810
Regressions: 1677763
Regressions: 1723451

In case anybody else was curious about this...

(In reply to Nick Alexander :nalexander [he/him] from comment #38)

I looked at making Marionette uses abstract Unix sockets and it's a pain

I agree, it's a pain. However wrapping firefox in ip2unix (https://github.com/nixcloud/ip2unix) does in fact work. I modified the geckodriver Rust code (much easier!) to add support for filesystem sockets and can confirm it all works.

Caveat: you need to apply the one-line patch here: https://github.com/nixcloud/ip2unix/issues/16#issuecomment-637473455 to ip2unix if you want to be able to connect more than once to a given running instance of firefox.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: