Closed Bug 1094369 Opened 10 years ago Closed 10 years ago

Green up Linux Mulet reftests on TaskCluster prod

Categories

(Firefox OS Graveyard :: Gaia, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
2.2 S11 (1may)

People

(Reporter: armenzg, Assigned: gerard-majax)

References

(Depends on 1 open bug)

Details

(Whiteboard: [systemsfe])

Attachments

(3 files)

Attached image screenshot from reftest analyzer (deleted)
This push shows all Mulet reftests: https://tbpl.mozilla.org/?tree=Try&rev=be3fb02d83bd&show_all=1
However, a lot of the test jobs are failing. From a screenshot I took in the reftest analyzer, I think some reftests might expect a bigger screen (I'm not sure): https://tbpl.mozilla.org/php/getParsedLog.php?id=51906221&tree=Try&full=1
Depends on: 1128986
For future reference, we can run this locally like this:

python scripts/mulet_unittest.py --cfg b2g/generic_config.py --cfg b2g/mulet_config.py --test-suite reftest --test-manifest tests/layout/reftests/reftest.list --total-chunks 6 --cfg developer_config.py --gaia-repo ~/repos/branches/gaia-central --installer-url http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/alissy@mozilla.com-489ec085db6b/try-linux64-mulet/firefox-38.0a1.en-US.linux-x86_64.tar.bz2 --test-url http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/alissy@mozilla.com-489ec085db6b/try-linux64-mulet/firefox-38.0a1.en-US.linux-x86_64.tests.zip

If you want to run it a second time, you can do this:

python scripts/mulet_unittest.py --cfg b2g/generic_config.py --cfg b2g/mulet_config.py --test-suite reftest --test-manifest tests/layout/reftests/reftest.list --total-chunks 6 --cfg developer_config.py --gaia-repo ~/repos/branches/gaia-central --installer-url http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/alissy@mozilla.com-489ec085db6b/try-linux64-mulet/firefox-38.0a1.en-US.linux-x86_64.tar.bz2 --test-url http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/alissy@mozilla.com-489ec085db6b/try-linux64-mulet/firefox-38.0a1.en-US.linux-x86_64.tests.zip --binary-path `pwd`/build/application/firefox/firefox --run-tests

(This last option is the same as not running the other actions: --no-clobber --no-pull --no-download-and-extract --no-create-virtualenv --no-install.)

Using --gaia-repo only speeds up the process; you don't have to use it. Note that gaia-central is an Hg checkout rather than a Git checkout (mozharness currently only supports that).
14:14:00 INFO - 1422990840430 Marionette INFO sendToClient: {"from":"0","error":{"message":"TypeError: system is undefined","status":17,"stacktrace":"execute_async_script @b2g_desktop.py, line 182\ninline javascript, line 34\nsrc: \" return !system.locked;\""}}, {8eeb2db8-6788-4ffc-9bba-500cad6f2e13}, {8eeb2db8-6788-4ffc-9bba-500cad6f2e13}
14:14:00 INFO - Traceback (most recent call last):
14:14:00 INFO - File "runreftestb2g.py", line 640, in <module>
14:14:00 INFO - sys.exit(main())
14:14:00 INFO - File "runreftestb2g.py", line 635, in main
14:14:00 INFO - return run_desktop_reftests(parser, options, args)
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/tests/reftest/b2g_desktop.py", line 216, in run_desktop_reftests
14:14:00 INFO - sys.exit(reftest.run_tests(args[0], options))
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/tests/reftest/b2g_desktop.py", line 92, in run_tests
14:14:00 INFO - self.run_marionette_script()
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/tests/reftest/b2g_desktop.py", line 43, in run_marionette_script
14:14:00 INFO - self._unlockScreen()
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/tests/reftest/b2g_desktop.py", line 182, in _unlockScreen
14:14:00 INFO - self.marionette.execute_async_script('GaiaLockScreen.unlock()')
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 1354, in execute_async_script
14:14:00 INFO - filename=os.path.basename(frame[0]))
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/venv/local/lib/python2.7/site-packages/marionette/decorators.py", line 36, in _
14:14:00 INFO - return func(*args, **kwargs)
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 634, in _send_message
14:14:00 INFO - self._handle_error(response)
14:14:00 INFO - File "/home/armenzg/moz/tmp/scripts/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 689, in _handle_error
14:14:00 ERROR - raise errors.JavascriptException(message=message, status=status, stacktrace=stacktrace)
14:14:00 ERROR - marionette.errors.JavascriptException: JavascriptException: TypeError: system is undefined
14:14:00 INFO - stacktrace:
14:14:00 INFO - execute_async_script @b2g_desktop.py, line 182
14:14:00 INFO - inline javascript, line 34
14:14:00 INFO - src: " return !system.locked;"
We may have a couple of tests to disable for Mulet:
> $ git grep B2G layout/reftests/ | grep skip-if | wc -l
> 749
That's a couple.
(In reply to Alexandre LISSY :gerard-majax from comment #5)
> We may have a couple of tests to disable for mulet:
>
> > $ git grep B2G layout/reftests/ | grep skip-if | wc -l
> > 749
>
> That's a couple.
Got misled: we should look for B2GDT, as introduced by bug 958533.
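The reason the first count was misleading is that a plain substring search for B2G also matches B2GDT. A minimal sketch of a whole-word count, assuming manifest lines carry annotations shaped like `skip-if(...)`; the helper name `count_skip_if` and the sample lines are hypothetical and only illustrate the matching, they are not taken from the tree:

```python
import re


def count_skip_if(lines, token):
    """Count lines with a skip-if annotation that mention `token` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(token) + r"\b")
    return sum(1 for line in lines if "skip-if" in line and pattern.search(line))


# Hypothetical manifest lines, for illustration only.
sample = [
    "skip-if(B2G) == a.html a-ref.html",
    "skip-if(B2GDT) == b.html b-ref.html",
    "skip-if(B2G&&browserIsRemote) == c.html c-ref.html",
    "== d.html d-ref.html",
]

print("B2G:", count_skip_if(sample, "B2G"))
print("B2GDT:", count_skip_if(sample, "B2GDT"))
```

With whole-word matching, the B2G count no longer includes the B2GDT lines, so the two numbers can be compared directly.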
Depends on: 958533
Armen, I'm seeing a lot of errors like this one:
> marionette_driver.errors.TimeoutException: TimeoutException: Timed out after 15.1 seconds
This is not something I had back when I fixed bug 1128986, and this is making my life miserable. Can you help me?
Flags: needinfo?(armenzg)
Disabled many more tests (marked as failing B2G or skip B2G), and bumped the timeout to 300 secs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f34522430ac
Attached file scan_reftest_log.sh (deleted)
Little script to help.
(In reply to Alexandre LISSY :gerard-majax from comment #11) > Disabled much more tests (marked as failing B2G or skip B2G), and bumped > timeout to 300 secs: > https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f34522430ac syntax error :( https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce864e2a817a
Links for anyone looking:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f34522430ac&exclusion_state=all&exclusion_profile=false
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce864e2a817a&exclusion_state=all&exclusion_profile=false
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a89b5c9d33ec&exclusion_state=all&exclusion_profile=false
I don't see the Marionette issue in the last log: http://ftp.mozilla.org/pub/mozilla.org/b2g/try-builds/alissy@mozilla.com-ce864e2a817a/try-linux64-mulet/try_ubuntu64_vm-mulet_test-reftest-6-bm117-tests1-linux64-build5.txt.gz
On another note, I don't really know much about the harnesses; I'm more experienced with the CI. In the last log, I believe we might need to split the reftests a bit more. What do you think?
> 06:20:04 INFO - REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/svg/filters/feConvolveMatrix-bias-01.svg | 9264 / 11761 (78%)
>
> command timed out: 7200 seconds elapsed running ['/tools/buildbot/bin/python', 'scripts/scripts/mulet_unittest.py', '--cfg', 'b2g/generic_config.py', '--cfg', 'b2g/mulet_config.py', '--test-suite', 'reftest', '--test-manifest', 'tests/layout/reftests/reftest.list', '--total-chunks', '6', '--this-chunk', '6', '--blob-upload-branch', 'try', '--download-symbols', 'ondemand'], attempting to kill
> process killed by signal 9
> program finished with exit code -1
> elapsedTime=7200.024388
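For a rough sense of how far over the limit that job was, here is a back-of-the-envelope sketch using only the figures in the log excerpt (9264 of 11761 tests loaded when the 7200 second timeout hit). It assumes a constant per-test rate and ignores setup time, so it is an estimate at best; the function name `projected_runtime` is made up for this example:

```python
def projected_runtime(elapsed_s, tests_done, tests_total):
    """Linearly extrapolate the total runtime from the progress made so far."""
    if tests_done <= 0:
        raise ValueError("tests_done must be positive")
    return elapsed_s * tests_total / tests_done


# Figures taken from the log excerpt above.
elapsed = 7200.0
done, total = 9264, 11761

full = projected_runtime(elapsed, done, total)
print(f"progress: {done / total:.1%}")
print(f"projected full run: {full:.0f} s ({full / 3600:.2f} h)")
print(f"over the 7200 s limit by: {full - elapsed:.0f} s")
```

Under those assumptions the job would have needed roughly two and a half hours, which is consistent with the suggestion that the work in a single job has to shrink.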
Flags: needinfo?(armenzg)
Thanks. Another dumb question: why do R1 to R6 execute the same test instances?
Flags: needinfo?(armenzg)
There were some TEST-UNEXPECTED-PASS results; those should not be there anymore: https://treeherder.mozilla.org/#/jobs?repo=try&revision=44b01f52a2c8
I hope this try will improve things regarding the chunking stuff I mentioned in comment 16: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6c271f0dd55e
(In reply to Alexandre LISSY :gerard-majax from comment #18)
> I hope this try will improve regarding the chunking stuff I mentioned in comment 16:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=6c271f0dd55e
\o/ looks like we really have 6 chunks now
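To make the chunking point concrete: with --total-chunks 6 and --this-chunk N, each job is expected to run a different slice of the test list rather than the whole list. The following is only a sketch of one way to do a contiguous, near-even split; it is not the reftest harness's actual algorithm, and `select_chunk` is a hypothetical helper:

```python
def select_chunk(tests, total_chunks, this_chunk):
    """Return the contiguous slice of `tests` for a 1-based chunk number."""
    if total_chunks < 1 or not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be in 1..total_chunks")
    base, extra = divmod(len(tests), total_chunks)
    # The first `extra` chunks each take one additional test.
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    size = base + (1 if this_chunk <= extra else 0)
    return tests[start:start + size]


# Hypothetical test names, for illustration only.
tests = [f"test-{i:02d}.html" for i in range(14)]
for chunk in range(1, 7):
    print(chunk, select_chunk(tests, 6, chunk))
```

Every test lands in exactly one chunk and chunk sizes differ by at most one, which is the property that was missing when R1 to R6 all ran the same tests.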
Removed some TEST-UNEXPECTED-PASS, and disabled addon watching: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a91fb1dcc797
(In reply to Alexandre LISSY :gerard-majax from comment #20)
> Removed some TEST-UNEXPECTED-PASS, and disabled addon watching:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=a91fb1dcc797
R1 is green; triggered a couple of retries to see.
(In reply to Alexandre LISSY :gerard-majax from comment #20) > Removed some TEST-UNEXPECTED-PASS, and disabled addon watching: > https://treeherder.mozilla.org/#/jobs?repo=try&revision=a91fb1dcc797 Same try, but disabling the fix from bug 1039834: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f9f98b894fc4
Trying to fix the: > REFTEST TEST-UNEXPECTED-PASS | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/first-letter/font-text-styles-floater.html | image comparison (==) https://treeherder.mozilla.org/#/jobs?repo=try&revision=de8be84f6442
(In reply to Alexandre LISSY :gerard-majax from comment #23)
> This one to check intermittence level of R1:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e1791071c33&exclusion_profile=false
Around 25%. All are exposing the same behavior: "load failed: timed out waiting for pending paint count to reach zero (waiting for MozAfterPaint)", on a limited set of test cases.
(In reply to Alexandre LISSY :gerard-majax from comment #25)
> (In reply to Alexandre LISSY :gerard-majax from comment #23)
> > This one to check intermittence level of R1:
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e1791071c33&exclusion_profile=false
>
> Around 25%. All are exposing the same behavior: "load failed: timed out waiting for pending paint count to reach zero (waiting for MozAfterPaint)", on a limited set of test cases.
Looking at the code of the test, it relies on waiting 100ms before checking stuff. New try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c24efa5afa77 to experiment with bumping this to 1000ms and see the level of intermittence.
All R2 failures seem to be linked to the use of data URL CSS: "data:text/css".
(In reply to Alexandre LISSY :gerard-majax from comment #26)
> (In reply to Alexandre LISSY :gerard-majax from comment #25)
> > (In reply to Alexandre LISSY :gerard-majax from comment #23)
> > > This one to check intermittence level of R1:
> > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e1791071c33&exclusion_profile=false
> >
> > Around 25%. All are exposing the same behavior: "load failed: timed out waiting for pending paint count to reach zero (waiting for MozAfterPaint)", on a limited set of test cases.
>
> Looking at the code of the test, it relies on waiting 100ms for checking stuff. New try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c24efa5afa77 to experiment bumping this to 1000ms and see the level of intermittence.
No big change. Lowering to 10ms: https://treeherder.mozilla.org/#/jobs?repo=try&revision=20030fe6da0d
Attached image Capture du 2015-03-02 15:04:53.png (deleted)
Test failing due to CSP on Mulet and B2G Desktop.
I can see you figured it out.
Flags: needinfo?(armenzg)
(In reply to Alexandre LISSY :gerard-majax from comment #29) > Created attachment 8571344 [details] > Capture du 2015-03-02 15:04:53.png > > Test failing due to CSP on Mulet and B2G Desktop. Try forcing those CSP to work: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b57a83624dec
(In reply to Alexandre LISSY :gerard-majax from comment #28)
> (In reply to Alexandre LISSY :gerard-majax from comment #26)
> > (In reply to Alexandre LISSY :gerard-majax from comment #25)
> > > (In reply to Alexandre LISSY :gerard-majax from comment #23)
> > > > This one to check intermittence level of R1:
> > > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e1791071c33&exclusion_profile=false
> > >
> > > Around 25%. All are exposing the same behavior: "load failed: timed out waiting for pending paint count to reach zero (waiting for MozAfterPaint)", on a limited set of test cases.
> >
> > Looking at the code of the test, it relies on waiting 100ms for checking stuff. New try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c24efa5afa77 to experiment bumping this to 1000ms and see the level of intermittence.
>
> No big change. Lowering to 10ms: https://treeherder.mozilla.org/#/jobs?repo=try&revision=20030fe6da0d
No change either :(
Depends on: 1138441
Depends on: 1138442
Depends on: 1138444
Depends on: 1138454
Depends on: 1138447
Renamed b2gIsMulet to Mulet, added comments on the end of line of each disabled test, and moved png test suite pref to each directory where it was needed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ec2c29dd289
A lot of failures on R6 and R5 may be due to already-skipped B2G/B2G Desktop tests that were not skipped on Mulet. Augmenting parity: https://treeherder.mozilla.org/#/jobs?repo=try&revision=97495c53d4ea
Depends on: 1138895
(In reply to Alexandre LISSY :gerard-majax from comment #33)
> Renamed b2gIsMulet to Mulet, added comments on the end of line of each disabled test, and moved png test suite pref to each directory where it was needed:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ec2c29dd289
(In reply to Alexandre LISSY :gerard-majax from comment #34)
> A lot of failures on R6 and R5 may be due because of already-skipped B2G/B2G Desktop tests that were not skipped on Mulet. Augmenting parity:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=97495c53d4ea
Those somehow died because they took too much time.
No longer depends on: 1138895
(In reply to Alexandre LISSY :gerard-majax from comment #36) > With all the changes of the two above: > https://treeherder.mozilla.org/#/jobs?repo=try&revision=f70f9f0e85c5 Failure because I forgot one b2gIsMulet :(
Depends on: 1139891
This one should be the last one to get parity for bug 1138442. We also skip on Mulet the 3-4 failing tests in each suite, and the couple of already-known intermittents. The goal is to assert that once those are dealt with, we can get stable green on all suites. https://treeherder.mozilla.org/#/jobs?repo=try&revision=f44933b36f50
(In reply to Alexandre LISSY :gerard-majax from comment #40)
> This one should be the last one to get parity for bug 1138442. We also skip on Mulet the 3-4 failing tests in each suite, and the couple of already-known intermittents. The goal is to assert that once those are dealt with, we can get stable green on all suites.
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=f44933b36f50
For some reason, the skip-if(Mulet) for the R1 failures are still there. With a brutal skip and removing the random, we get: https://treeherder.mozilla.org/#/jobs?repo=try&revision=13e6c3ec48da&exclusion_profile=false
In both cases, one can notice small intermittence, with orange reported but no test failure. At the end of the log, one can see:
> 13:48:46 INFO - REFTEST INFO | Result summary:
> 13:48:46 INFO - REFTEST INFO | Successful: 1713 (1712 pass, 1 load only)
> 13:48:47 INFO - REFTEST INFO | Unexpected: 0 REFTEST INFO | b2g_desktop.py | Running tests: end.
> 13:48:47 INFO - (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)
> 13:48:47 INFO - REFTEST INFO | Known problems: 335 (55 known fail, 0 known asserts, 23 random, 257 skipped, 0 slow)
> 13:48:47 INFO - REFTEST INFO | Total canvas count = 19
> 13:48:47 INFO - REFTEST TEST-START | Shutdown
> 13:48:47 INFO - 1425592126583 addons.xpi WARN Add-on reftest@mozilla.org is missing bootstrap method shutdown
> 13:48:47 INFO - -*- SettingsManager: Received: inner-window-destroyed for valid innerWindowID=4, cleanup.
> 13:48:47 INFO - JavaScript error: chrome://reftest/content/reftest.jsm, line 1964: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIPropertyBag2.getPropertyAsAString]
> 13:48:47 INFO - Return code: 0
> 13:48:47 INFO - TinderboxPrint: reftest<br/><em class="testfail">T-FAIL</em>
> 13:48:47 WARNING - # TBPL WARNING #
> 13:48:47 WARNING - The reftest suite: reftest-4 ran with return status: WARNING
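For anyone triaging this kind of orange-without-failures job, a small sketch that pulls the summary totals out of a log and flags the TBPL warning. It assumes only the line shapes visible in the excerpt above; `parse_summary` and `has_tbpl_warning` are hypothetical helpers, not part of mozharness or the reftest harness:

```python
import re

SUMMARY_RE = re.compile(
    r"REFTEST INFO \| (Successful|Unexpected|Known problems): (\d+)"
)


def parse_summary(log_lines):
    """Map each summary label to its total, for the labels seen in the log."""
    totals = {}
    for line in log_lines:
        match = SUMMARY_RE.search(line)
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


def has_tbpl_warning(log_lines):
    """True if the log carries the TBPL warning marker."""
    return any("# TBPL WARNING #" in line for line in log_lines)


# Lines copied from the excerpt above.
log = [
    "13:48:46 INFO - REFTEST INFO | Successful: 1713 (1712 pass, 1 load only)",
    "13:48:47 INFO - REFTEST INFO | Unexpected: 0 REFTEST INFO | b2g_desktop.py | Running tests: end.",
    "13:48:47 INFO - REFTEST INFO | Known problems: 335 (55 known fail, 0 known asserts, 23 random, 257 skipped, 0 slow)",
    "13:48:47 WARNING - # TBPL WARNING #",
]

summary = parse_summary(log)
print(summary)
if summary.get("Unexpected") == 0 and has_tbpl_warning(log):
    print("orange without unexpected results: look outside the test results")
```

A run like this one, with zero unexpected results but a warning status, points at something outside the individual tests, such as the JavaScript error logged during shutdown.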
Depends on: 1140394
Depends on: 1140429
I see new failures that seem related to fonts on tc-R: https://treeherder.allizom.org/#/jobs?repo=try&revision=0e4d9c30fd89&exclusion_profile=false
For example:
> REFTEST TEST-UNEXPECTED-FAIL | file:///home/worker/build/tests/reftest/tests/layout/reftests/text-overflow/two-value-syntax.html | image comparison (==), max difference: 10, number of differing pixels: 32
This passes on try and it's not a B2G skipped test. I don't know how to check whether fonts are good on the taskcluster infrastructure.
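To compare failures like this one across try and taskcluster runs, it can help to extract the test path and the pixel figures from each line. A minimal sketch, assuming only the line format shown in the example above; `parse_unexpected` is a hypothetical helper, not an existing tool:

```python
import re

UNEXPECTED_RE = re.compile(
    r"REFTEST TEST-UNEXPECTED-(?P<kind>FAIL|PASS) \| (?P<test>\S+) \| (?P<detail>.*)"
)
DIFF_RE = re.compile(
    r"max difference: (\d+), number of differing pixels: (\d+)"
)


def parse_unexpected(line):
    """Return a dict describing an unexpected result line, or None if no match."""
    match = UNEXPECTED_RE.search(line)
    if match is None:
        return None
    result = {
        "kind": match.group("kind"),
        "test": match.group("test"),
        "max_difference": None,
        "differing_pixels": None,
    }
    diff = DIFF_RE.search(match.group("detail"))
    if diff:
        result["max_difference"] = int(diff.group(1))
        result["differing_pixels"] = int(diff.group(2))
    return result


# Line copied from the comment above.
line = (
    "REFTEST TEST-UNEXPECTED-FAIL | "
    "file:///home/worker/build/tests/reftest/tests/layout/reftests/"
    "text-overflow/two-value-syntax.html | "
    "image comparison (==), max difference: 10, number of differing pixels: 32"
)
print(parse_unexpected(line))
```

Small values such as a max difference of 10 over 32 pixels are the kind of figure one would want to tabulate per test before blaming fonts or color handling on a given infrastructure.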
Flags: needinfo?(jlal)
Depends on: 1141511
Depends on: 1142506
(In reply to Alexandre LISSY :gerard-majax from comment #47) > Hacking some scrollbar related failures: > https://treeherder.mozilla.org/#/jobs?repo=try&revision=3da7d942e289 No help, trying something else: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2f21c46cf20
Depends on: 1142581
Depends on: 1142928
Depends on: 1142990
Depends on: 1142565
Flags: needinfo?(jlal)
(In reply to Alexandre LISSY :gerard-majax from comment #53) > https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1fcbc811900 > https://treeherder.allizom.org/#/jobs?repo=try&revision=d1fcbc811900 James, I know you're doing a lot on this, but it looks like we still have *tons* of reftests failing only on taskcluster, and all from the same coloring issue.
Depends on: 1144080
No longer depends on: 1142506
No longer depends on: 1142581
No longer depends on: 1142990
No longer depends on: 1138447
Summary: Green up Linux Mulet reftests → Green up Linux Mulet reftests on TaskCluster prod
Depends on: 1150486
Depends on: 1150490
Depends on: 1150492
(In reply to Alexandre LISSY :gerard-majax from comment #56)
> Let's get a status and make sure it's already stable:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=794839d25b56
> https://treeherder.allizom.org/#/jobs?repo=try&revision=794839d25b56
Intermittent R1 failures: bug 1150492
Intermittent R6 failures: bug 1150490
Permanent R4 failures: bug 1150486
Depends on: 1150536
Depends on: 1150941
I have patches ready in bug 1153574 that will mass re-enable tests that we disabled to get tc up. This will make the test coverage better.
No longer blocks: 1150579
Depends on: 1153574, 1150579
Assignee: nobody → lissyx+mozillians
Whiteboard: [systemsfe]
Depends on: 1154358
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S11 (1may)