Closed Bug 650890 Opened 13 years ago Closed 11 years ago

port remote talos to mozharness

Tracking

(Not tracked)

Status:

RESOLVED DUPLICATE of bug 829211

People

(Reporter: mozilla, Unassigned)

References

Details

(Whiteboard: [mozharness][talos][android][tegras][mozharness+talos])

Attachments

(3 files, 1 obsolete file)

move talos.py and testing.py under subdir testing/ 12 years ago Aki Sasaki (not active) (deleted), patch		Details \| Diff \| Splinter Review
same, but with override-friendly __init__() 12 years ago Aki Sasaki (not active) (deleted), patch	k0scist : review+ mozilla : checked-in+	Details \| Diff \| Splinter Review
v 0.5 12 years ago Justin Wood (:Callek) (deleted), patch	mozilla : review+ jlund : feedback+ jmaher : feedback+ mozilla : checked-in+	Details \| Diff \| Splinter Review
dealing with review comments 12 years ago Aki Sasaki (not active) (deleted), patch	Callek : review+ mozilla : checked-in+	Details \| Diff \| Splinter Review

Aki Sasaki (not active)

Reporter

Description

•

13 years ago

This should get complexity out of buildbotcustom and allow the a-team,
developers, and releng to all use the same scripts.

The dependency on bug 650887 isn't a hard dependency.

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

Assignee: nobody → aki

Aki Sasaki (not active)

Reporter

Comment 1

•

13 years ago

In progress: https://github.com/escapewindow/mozharness/tree/talosrunner .

Status: NEW → ASSIGNED

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

Blocks: 651974

Aki Sasaki (not active)

Reporter

Comment 2

•

13 years ago

Removing 'sut' from the summary.
Standalone users will most likely be using adb instead of sut.
We're currently considering altering our Tegra setup to use only one apk, which starts IP-based adbd on boot.

The benefits here include:

* adds more android-specific functionality
* removes the need for logcat in a separate process for stdout

The drawbacks include:

* if we add another device platform, SUT would be the lowest common denominator. All the work we do with adb will be Android-specific and cannot carry over.

I'm going to concentrate on adb for now, while keeping in mind that I may need to shoehorn sut functionality in later.

OS: Android → MeeGo

Summary: port remote sut talos to mozharness → port remote talos to mozharness

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

OS: MeeGo → Android

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

Depends on: 698957

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

Depends on: 698959

Aki Sasaki (not active)

Reporter

Updated

•

13 years ago

Depends on: 698961

Armen [:armenzg]

Updated

•

13 years ago

Blocks: 713003

Jeff Hammel

Updated

•

13 years ago

Blocks: 713055

Jeff Hammel

Updated

•

13 years ago

Priority: P4 → P1

Whiteboard: [mozharness][talos][android][tegras] → [mozharness][talos][android][tegras][mozharness+talos]

Aki Sasaki (not active)

Reporter

Updated

•

12 years ago

Depends on: 738824

Aki Sasaki (not active)

Reporter

Comment 3

•

12 years ago

Larger steps to do here:

* Merge latest mozharness changes into the old talosrunner branch (DONE)
* Get device_talosrunner.py working again locally with current talos.
** This was working on tegras-over-sut and tablets-over-adb previously

Once I get re-familiarized with the script I may find other things that need doing; it's been a while since I touched this code. I think I was waiting for --develop to work properly before allowing for using that local python webserver; I think that's been resolved.

I think I also found a bunch of tests that weren't running properly, but that may have been tablet-vs-tegra. Enough has changed in Talos and Fennec land that I'll have to get a new baseline.

I could then run these directly on the Foopies through buildbot, but that doesn't solve a lot of the existing problems. To solve those problems, I need to modify how we use the tegras:

* bug 738824 - tegra library webapp
* Add logic to check out + check in tegras to mozharness
* Take a small number of tegras out of production and put them into this pool.
* Run for a while in staging to see what kind of stability + numbers we're seeing
* Determine where we're running these: on test slaves? On Foopies but without sut_tools scripts? The former would be better but would require more optimization to prevent Android tests from taking over our desktop test slave pools.
** To optimize for test slaves: instead of running tests directly in the script body, create objects to run the tests on the tegras, and run multiple tests in parallel.
** Those objects would have separate log files and OutputParser's and would be independent of each other.
** If we do this, we need to rethink how we get jobs from buildbot to the script, and how we get status from the script to TBPL, since both buildbot and TBPL assume one test suite == one job.

The closest I have to this in mozharness are OutputParser and MercurialVCS currently. But splitting out discrete tasks to multiple parallel objects would help speed up a number of things, like downloads, repacks, or multiple parallel test suites.
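
As a rough illustration of the parallel-objects idea above, here is a minimal Python 3 sketch. `DeviceTestRunner`, `run_in_parallel`, and the device and suite names are hypothetical and are not existing mozharness classes. Each runner gets its own logger and log file, and the actual test run is only a placeholder.

```python
import logging
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


class DeviceTestRunner(object):
    """Hypothetical per-device runner with its own log file."""

    def __init__(self, device_name, suite, log_dir):
        self.device_name = device_name
        self.suite = suite
        self.log_path = os.path.join(log_dir, "%s_%s.log" % (device_name, suite))
        # A separate, non-propagating logger keeps each runner's output apart.
        self.logger = logging.getLogger("%s.%s" % (device_name, suite))
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False
        self._handler = logging.FileHandler(self.log_path)
        self.logger.addHandler(self._handler)

    def run(self):
        # Placeholder: a real runner would drive talos on the device here.
        self.logger.info("running %s on %s", self.suite, self.device_name)
        status = "success"
        self.logger.info("finished with status %s", status)
        self._handler.close()
        self.logger.removeHandler(self._handler)
        return self.device_name, self.suite, status


def run_in_parallel(jobs, log_dir):
    """Run every (device, suite) job in its own runner, one thread each."""
    runners = [DeviceTestRunner(device, suite, log_dir) for device, suite in jobs]
    with ThreadPoolExecutor(max_workers=max(1, len(runners))) as pool:
        # map() returns results in the same order as the submitted jobs.
        return list(pool.map(DeviceTestRunner.run, runners))


if __name__ == "__main__":
    jobs = [("tegra-001", "tsvg"), ("tegra-002", "tdhtml")]
    for result in run_in_parallel(jobs, tempfile.mkdtemp()):
        print(result)
```

This only covers the independent-objects part. How several results from one script would map onto a single buildbot job and a single TBPL status is left open here, as it is in the comment.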

Aki Sasaki (not active)

Reporter

Comment 4

•

12 years ago

Here's what I have in OmniFocus for device_talosrunner.
It's been a while, so I'm not entirely sure what some of them mean right now.

* talosrunner on test slaves no foopies blog post
* figure out how to determine own/device's IP addresses
* check in / check out tegra solution
  (get rid of flag files?)
* tp4.zip if --develop
* webserver pidfile
  (__main__ try/except?
  looks like wlach/jmaher might have it solved
  http://www.regexprn.com/2010/05/killing-multithreaded-python-programs.html )
* which pool to run on?
* add FQDN to tegras.json
* multiple device support in talos?
* device flags, or lack thereof
* get device_talosrunner.py working with new talos (just added this one)
* changing mozhttpd ports
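
For the "determine own IP address" item, one common approach is to let the OS pick the outgoing interface by connecting a UDP socket. The sketch below is Python 3 and is an assumption about how this could be done, not what the script does. `guess_local_ip` and the probe address are hypothetical choices.

```python
import socket


def guess_local_ip(probe_host="10.255.255.255", probe_port=1):
    """Return the local address the OS would use to reach probe_host.

    Connecting a UDP socket sends no packets; it only makes the kernel
    choose an outgoing interface. Falls back to 127.0.0.1 if no route exists.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    # Build a host:port string for a locally started webserver on port 8000.
    print("%s:8000" % guess_local_ip())
```

On a machine with several interfaces, passing the device's own address as `probe_host` should select the interface that actually faces the device.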

Aki Sasaki (not active)

Reporter

Comment 5

•

12 years ago

Attached patch move talos.py and testing.py under subdir testing/ (obsolete) (deleted) — Details — Splinter Review

In my github talosrunner branch, I've created a mozharness/test/device.py that I use to talk to devices over adb or sut.  I think this belongs in a directory with talos.py and testing.py.

I've moved mozharness.mozilla.talos to mozharness.mozilla.testing.talos and mozharness.mozilla.testing to mozharness.mozilla.testing.testbase (really not sold on this name; open to suggestions).

If this is cool, I'll then merge back into my talosrunner branch, then move mozharness.test.device to mozharness.mozilla.testing.device .

Attachment #614221 - Flags: review?(jhammel)

Aki Sasaki (not active)

Reporter

Comment 6

•

12 years ago

Attached patch same, but with override-friendly __init__() (deleted) — Details — Splinter Review

While trying to get device_talosrunner.py to inherit Talos, I found that I couldn't override the various items sent to BaseScript.__init__().

This patch includes the above move to mozharness.mozilla.testing.talos, and uses kwargs to allow for a more overrideable Talos object.

Attachment #614221 - Attachment is obsolete: true

Attachment #614221 - Flags: review?(jhammel)

Attachment #614238 - Flags: review?(jhammel)

Jeff Hammel

Comment 7

•

12 years ago

Comment on attachment 614238 [details] [diff] [review]
same, but with override-friendly __init__()

+    def __init__(self, **kwargs):
+        if 'config_options' not in kwargs:
+            kwargs['config_options'] = self.config_options
+        if 'all_actions' not in kwargs:
+            kwargs['all_actions'] = self.actions
+        if 'default_actions' not in kwargs:
+            kwargs['default_actions'] = self.actions
+        if 'config' not in kwargs:
+            kwargs['config'] = {}
+        if 'virtualenv_modules' not in kwargs['config']:
+            kwargs['config']['virtualenv_modules'] = ["talos", "mozinstall"]

dict.setdefault is probably much cleaner than all of these if checks. Other than that, looks fine.
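
A minimal sketch of what the `setdefault` version could look like, not necessarily what landed. `BaseScript` here is a stand-in that only records its keyword arguments, and the `config_options` and `actions` values are placeholders.

```python
class BaseScript(object):
    """Stand-in for mozharness's BaseScript; it only records its kwargs."""

    def __init__(self, **kwargs):
        self.init_kwargs = kwargs


class Talos(BaseScript):
    config_options = []                  # placeholder value
    actions = ["clobber", "run-tests"]   # placeholder value

    def __init__(self, **kwargs):
        # setdefault() only fills in a key the caller did not pass,
        # so subclasses and callers can still override each item.
        kwargs.setdefault("config_options", self.config_options)
        kwargs.setdefault("all_actions", self.actions)
        kwargs.setdefault("default_actions", self.actions)
        kwargs.setdefault("config", {})
        kwargs["config"].setdefault("virtualenv_modules", ["talos", "mozinstall"])
        super(Talos, self).__init__(**kwargs)
```

A subclass or caller can then pass, for example, `config={"virtualenv_modules": [...]}` and the defaults above will not overwrite it.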

Jeff Hammel

Comment 8

•

12 years ago

Comment on attachment 614238 [details] [diff] [review]
same, but with override-friendly __init__()

r+ if you change this to use setdefault

Attachment #614238 - Flags: review?(jhammel) → review+

Aki Sasaki (not active)

Reporter

Comment 9

•

12 years ago

Comment on attachment 614238 [details] [diff] [review]
same, but with override-friendly __init__()

Done. http://hg.mozilla.org/build/mozharness/rev/5e277c2be867

Attachment #614238 - Flags: checked-in+

Aki Sasaki (not active)

Reporter

Updated

•

12 years ago

Depends on: 748197

Aki Sasaki (not active)

Reporter

Updated

•

12 years ago

Depends on: 749042

Aki Sasaki (not active)

Reporter

Updated

•

12 years ago

Assignee: aki → bugspam.Callek

Aki Sasaki (not active)

Reporter

Updated

•

12 years ago

No longer blocks: 713055

cmtalbert

Comment 11

•

12 years ago

Can this be closed, and if not, what's left to do? (And can we help)

Justin Wood (:Callek)

Comment 12

•

12 years ago

No, this cannot be closed; we still need to get remote talos working on mozharness. I *hope* to dive back into this work by EOW, but it is an AT RISK Q2 goal at this point.

Justin Wood (:Callek)

Comment 13

•

12 years ago

Attached patch v 0.5 (deleted) — Details — Splinter Review

So, this has been tested from a foopy. Most of the mozharness-specific stuff works on Windows as well, but Talos itself is broken on Windows for remote.

So far, the items I tested on the foopy were tSVG and tdhtml.

This does not yet install robocop, nor does it do all the cleanup/verify stuff we wrote into sut_tools, because that is *specific* to the directory layout of the foopies so far. Getting all of it into here will need additional work; hopefully rxdroid makes that sane for us.

This will need some additional cleanup, of course. All in all it works, though it is not ready for production.

I have included all changes against base mozharness repo, including the changes aki made on the github-fork talosrunner branch.

Attachment #640119 - Flags: review?(aki)

Attachment #640119 - Flags: feedback?(jmaher)

Attachment #640119 - Flags: feedback?(jlund)

Joel Maher ( :jmaher ) (UTC -8)

Comment 14

•

12 years ago

Comment on attachment 640119 [details] [diff] [review]
v 0.5

Review of attachment 640119 [details] [diff] [review]:
-----------------------------------------------------------------

device.py looks like a good place for my work in rxdroid.py to help fill in.

Overall, this is looking pretty good.

::: configs/users/aki/peptest.py
@@ +17,5 @@
> +    "server_port": None,
> +    "tracer_threshold": 50,
> +    "tracer_interval": 10,
> +    "symbols_path": None,
> +}

why is this file included in here?

::: configs/users/aki/tablet1.py
@@ +16,5 @@
> +    "talos_config_file": "remote.config",
> +
> +    # this needs to be set to either your_IP:8000, or an existing webserver
> +    # that serves talos.
> +    "talos_webserver": "10.251.25.44:8000",

I really don't like this hardcoded ip

@@ +22,5 @@
> +    # Set this to start a webserver automatically
> +    "start_python_webserver": True,
> +
> +    # adb or sut
> +    "device_protocol": "adb",

we have all been leaning towards SUT, is there a reason for defaulting to adb?

::: configs/users/aki/tegra1.py
@@ +18,5 @@
> +    "talos_config_file": "remote.config",
> +
> +    # this needs to be set to either your_IP:8000, or an existing webserver
> +    # that serves talos.
> +#    "talos_webserver": "10.251.25.44:8000",

would be nice to remove this commented out line.

::: configs/users/callek/tegra1.py
@@ +10,5 @@
> +    "device_package_name": "org.mozilla.fennec",
> +    "talos_device_name": "tegra-224",
> +    "virtualenv_modules": ["pywin32", "talos"],
> +    "exes": { "easy_install": ['d:\\Sources\\mozharness\\build\\venv\\Scripts\\python.exe',
> +                               'd:\\Sources\\mozharness\\build\\venv\\scripts\\easy_install-2.6-script.py'], },

hard coded paths?

::: mozharness/base/script.py
@@ +618,5 @@
>              self.copy_to_upload_dir(os.path.join(dirs['abs_log_dir'], log_file),
>                                      dest=os.path.join('logs', log_file),
>                                      short_desc='%s log' % log_name,
> +                                    long_desc='%s log' % log_name,
> +                                    rotate=True)

what is rotate used for in copy to upload dir?

::: scripts/device_talosrunner.py
@@ +138,5 @@
> +            additional_options.extend(['--remotePort', '-1'])
> +        if c.get('start_python_webserver'):
> +            additional_options.append('--develop')
> +#        if c.get('repository'):
> +#            additional_options.append('repository', c['repository'])

nice!  put a comment in here that we could delete unpack()

Attachment #640119 - Flags: feedback?(jmaher) → feedback+

Armen [:armenzg]

Updated

•

12 years ago

Blocks: 772959

Aki Sasaki (not active)

Reporter

Comment 15

•

12 years ago

Comment on attachment 640119 [details] [diff] [review]
v 0.5

> new file mode 100644
> --- /dev/null
> +++ b/configs/users/callek/tegra1-foopy.py
> @@ -0,0 +1,53 @@
> +config = {^M
> +    "log_name": "talos",^M

Nit: I'd prefer that your tegra1-foopy.py were in unix format rather than dos, though it is explicitly noted as your test config.

Same for your tegra1.py.

(device_talosrunner.py):
>+                      'generate_config',

this should be generate-config.
s,_,-,

>+        self.info('copying %s to %s' % (inifile, remoteappini))
>+        self.run_command(['cp', inifile, remoteappini])

self.copyfile() ?

(device.py):
>         except ImportError, e:
-            self.log("Can't import DeviceManagerSUT! %s\nDid you check out talos?" % str(e), level=error_level)
-            raise
+            self.fatal("Can't import DeviceManagerSUT! %s\nDid you check out talos?" % str(e), level=error_level)

self.fatal() doesn't take a level argument.  This is a fine change, but we can get rid of error_level references in this method with this change.

>-            self.fatal("dev_root %s not correct!" % str(dev_root))
>+            self.fatal("dev_root %s not correct!" % str(dev_root))        

Nit: kill the trailing whitespace, please?

>+            dm.getInfo('process')
>+            dm.getInfo('memory')
>+            dm.getInfo('uptime')

Are these useful for us or developers?
Aiui, this causes more noise than we usually want.
If these are useful, let's keep them with a comment that we want to get these in the log proper at some point.  If not, let's remove them.

>+            try:
>+                self.info(repr(dm.getInfo('process')))
>+                self.info(repr(dm.getInfo('memory')))
>+                self.info(repr(dm.getInfo('uptime')))
>+                self.info(repr(dm.sendCMD(['exec su -c "logcat -d -v time *:W"'])))

Here too.  Does this work?  (I'm under the impression that dm prints these to screen, and doesn't return the strings, but I might be wrong.)

>+            self.fatal("Remote Device Error: updateApp() call failed - exiting")
>+        

Nit: Trailing whitespace.

>-        if c['device_type'] not in ("tegra250",):
>+        if c['device_type'] not in ("tegra250",) or True:

What's going on here?
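
For reference, a small sketch of why the added `or True` matters. The function names are hypothetical; the condition is copied from the hunk above.

```python
def takes_branch(device_type):
    # `not in` binds tighter than `or`, so this reads as
    # (device_type not in ("tegra250",)) or True, which is always True.
    return device_type not in ("tegra250",) or True


def takes_branch_without_override(device_type):
    # The original condition: False only for tegra250.
    return device_type not in ("tegra250",)
```

With the `or True` in place, the branch is taken for every device type, tegra250 included.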


I think the main point here is getting this working and merged in for future fixes, so for the most part I think this can land with the above fixed.

This carries some cruft from my talosrunner branch circa mid-late 2011.
I think I never really fleshed out device flags here and thought about ditching them.
I also added default log rotation that doesn't actually work past a single backup in base/script.py.  I think we can land these without issue, as long as we target these for cleanup later.


Jordan: if this causes conflicts with your code, I can help unbitrot.

Attachment #640119 - Flags: review?(aki) → review+

Aki Sasaki (not active)

Reporter

Comment 16

•

12 years ago

Attached patch dealing with review comments (deleted) — Details — Splinter Review

This goes on top of v 0.5.
I also snuck in a multilocale fix that pylint was complaining about.
This gets the unit.sh call down to 4 pyflakes warnings:

mozharness/mozilla/testing/device.py:226: local variable 'p' is assigned to but never used
mozharness/mozilla/testing/device.py:451: 'DMError' imported but unused
mozharness/mozilla/testing/device.py:540: undefined name 'DMError'
mozharness/mozilla/testing/device.py:611: undefined name 'DMError'

I'm not entirely sure how to deal with those, so I'm leaving them for the moment.
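
One possible reading of the `DMError` warnings, assuming the name is imported inside one method and then referenced in others: a function-local import is only visible in the function that performs it. The Python 3 sketch below uses `json.JSONDecodeError` purely as a stand-in for `DMError`, and all function names are hypothetical.

```python
import json


def query_device():
    # Mirrors "imported but unused": the name is bound only in this function.
    from json import JSONDecodeError  # stand-in for DMError
    return "ok"


def broken_parse(text):
    try:
        return json.loads(text)
    except JSONDecodeError:  # undefined in this scope: raises NameError
        return None


def fixed_parse(text):
    # One possible fix: import the exception in the scope that catches it.
    from json import JSONDecodeError
    try:
        return json.loads(text)
    except JSONDecodeError:
        return None
```

The `broken_parse` version only fails when an exception actually reaches the `except` clause, which is why this kind of bug tends to show up in pyflakes before it shows up in a run.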

Attachment #643642 - Flags: review?(bugspam.Callek)

Justin Wood (:Callek)

Comment 17

•

12 years ago

Comment on attachment 643642 [details] [diff] [review]
dealing with review comments

Review of attachment 643642 [details] [diff] [review]:
-----------------------------------------------------------------

interdiff looks good

Attachment #643642 - Flags: review?(bugspam.Callek) → review+

Aki Sasaki (not active)

Reporter

Comment 18

•

12 years ago

Comment on attachment 640119 [details] [diff] [review]
v 0.5

http://hg.mozilla.org/build/mozharness/rev/0a1e7f8a532f

Attachment #640119 - Flags: checked-in+

Aki Sasaki (not active)

Reporter

Comment 19

•

12 years ago

Comment on attachment 643642 [details] [diff] [review]
dealing with review comments

http://hg.mozilla.org/build/mozharness/rev/f3dcdae70a66

Attachment #643642 - Flags: checked-in+

Jordan Lund (:jlund)

Comment 20

•

12 years ago

Comment on attachment 640119 [details] [diff] [review]
v 0.5

Sorry, this will seem random. Callek, I emailed you about my lack of feedback a while ago but never heard back. Anyway, adding a plus now (even though it's not relevant or helpful) to stop email spam. :)

Attachment #640119 - Flags: feedback?(jlund) → feedback+

Justin Wood (:Callek)

Comment 21

•

11 years ago

I'm not actively working on this right now.

Assignee: bugspam.Callek → nobody

Chris Cooper [:coop] (he/him)

Updated

•

11 years ago

Blocks: 882528

Nobody; OK to take it and work on it

Assignee

Updated

•

11 years ago

Product: mozilla.org → Release Engineering

Aki Sasaki (not active)

Reporter

Comment 22

•

11 years ago

I'm going to call this a dup of bug 829211 for the panda portion, and a WONTFIX for the tegra portion.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → DUPLICATE
