Closed Bug 689856 Opened 13 years ago Closed 9 years ago

Android tests intermittently just stop

Categories

(Testing :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(firefox16 affected, firefox17 affected, firefox18 affected, firefox19 affected)

RESOLVED INCOMPLETE
Tracking Status
firefox16 --- affected
firefox17 --- affected
firefox18 --- affected
firefox19 --- affected

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [android_tier_1][android_tier_∞])

A horrible summary, for horrible logs. Pretty sure I've ignored this same thing in reftests at least, and possibly other suites. Equally possibly, this is the reftest/jsreftest equivalent of bug 689839, and they just die quicker. https://tbpl.mozilla.org/php/getParsedLog.php?id=6586326&tree=Mozilla-Inbound Android Tegra 250 mozilla-inbound opt test jsreftest-2 on 2011-09-27 21:54:52 PDT for push b4f351db9863 REFTEST TEST-START | http://10.250.48.208:30039/jsreftest/tests/jsreftest.html?test=js1_8_1/extensions/new-parenthesization.js INFO | automation.py | Application ran for: 0:12:04.769234 INFO | automation.py | Reading PID log: /tmp/tmpaitUknpidlog WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected! REFTEST INFO | runreftest.py | Running tests: end. program finished with exit code 0 elapsedTime=753.734964 TinderboxPrint: jsreftest-2<br/><em class="testfail">T-FAIL</em>
https://tbpl.mozilla.org/php/getParsedLog.php?id=6586394&tree=Mozilla-Inbound Android Tegra 250 mozilla-inbound opt test crashtest on 2011-09-27 21:54:52 PDT for push b4f351db9863 REFTEST INFO | Loading a blank page reconnecting socket INFO | automation.py | Application ran for: 0:19:09.225178 INFO | automation.py | Reading PID log: /tmp/tmpCr9NaJpidlog WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected! REFTEST INFO | runreftest.py | Running tests: end. program finished with exit code 0 elapsedTime=1177.916850 TinderboxPrint: crashtest<br/><em class="testfail">T-FAIL</em>
https://tbpl.mozilla.org/php/getParsedLog.php?id=6590219&tree=Mozilla-Inbound Android Tegra 250 mozilla-inbound opt test jsreftest-2 on 2011-09-28 04:19:12 PDT for push 4d10127fd106 REFTEST TEST-START | http://10.250.48.208:30031/jsreftest/tests/jsreftest.html?test=js1_8_1/regress/regress-452498-053.js INFO | automation.py | Application ran for: 0:11:58.997476 INFO | automation.py | Reading PID log: /tmp/tmpp1hlz8pidlog WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected! REFTEST INFO | runreftest.py | Running tests: end. program finished with exit code 0 elapsedTime=750.557479 TinderboxPrint: jsreftest-2<br/><em class="testfail">T-FAIL</em>
https://tbpl.mozilla.org/php/getParsedLog.php?id=7653454&tree=Birch 3006 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_object02.html reconnecting socket 3007 INFO TEST-PASS | Asize 3008 INFO TEST-PASS | Asize 3009 INFO TEST-PASS | alignLink INFO | automation.py | Application ran for: 0:19:57.930441 INFO | automation.py | Reading PID log: /tmp/tmplmE5Fhpidlog getting files in '/mnt/sdcard/tests/profile/minidumps/' WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!
Depends on: 706614
https://tbpl.mozilla.org/php/getParsedLog.php?id=7734029&tree=Birch Android Tegra 250 birch opt test mochitest-6 on 2011-12-03 20:05:53 PST for push 1bc10492926e 2408 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLTableElement26.html reconnecting socket DeviceManager: error pulling file '/mnt/sdcard/tests/logs/mochitest.log': No such file or directory INFO | automation.py | Application ran for: 0:12:38.005948
https://tbpl.mozilla.org/php/getParsedLog.php?id=7773211&tree=Birch Android Tegra 250 birch opt test mochitest-6 on 2011-12-06 08:02:53 PST for push 884b705c99fb 3241 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_table39.html 3242 INFO TEST-PASS | Asize 3243 INFO TEST-PASS | Asize 3244 INFO TEST-PASS | summaryLink reconnecting socket 3245 INFO TEST-END | /tests/dom/tests/mochitest/dom-level2-html/test_table39.html | finished in 1372ms 3246 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_table40.html INFO | automation.py | Application ran for: 0:19:29.874304 https://tbpl.mozilla.org/php/getParsedLog.php?id=7773243&tree=Birch Android Tegra 250 birch opt test mochitest-4 on 2011-12-06 08:15:53 PST for push 5d205868839c 100 INFO TEST-PASS | throw_INVALID_CHARACTER_ERR 101 INFO TEST-END | /tests/dom/tests/mochitest/dom-level2-core/test_createDocument05.html | finished in 2142ms 102 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-core/test_createDocument06.html INFO | automation.py | Application ran for: 0:01:51.148015 https://tbpl.mozilla.org/php/getParsedLog.php?id=7756295&tree=Birch Android Tegra 250 birch opt test mochitest-6 on 2011-12-05 10:39:53 PST for push 7b72e55cfe69 2357 INFO TEST-END | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLTableElement16.html | finished in 1621ms 2358 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLTableElement17.html reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket INFO | automation.py | Application ran for: 0:12:11.603818
https://tbpl.mozilla.org/php/getParsedLog.php?id=7828025&tree=Mozilla-Inbound Android XUL Tegra 250 mozilla-inbound opt test mochitest-6 on 2011-12-08 08:18:35 PST for push af13b9729453 1726 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLInputElement09.html reconnecting socket INFO | automation.py | Application ran for: 0:19:34.003587
https://tbpl.mozilla.org/php/getParsedLog.php?id=7829565&tree=Mozilla-Inbound Android XUL Tegra 250 mozilla-inbound opt test jsreftest-3 on 2011-12-08 09:15:55 PST for push ffa6056893ed REFTEST TEST-START | http://10.250.48.208:30039/jsreftest/tests/jsreftest.html?test=js1_8_5/extensions/regress-691746.js | 1123 / 1127 (99%) reconnecting socket
https://tbpl.mozilla.org/php/getParsedLog.php?id=7829509&tree=Mozilla-Inbound Android XUL Tegra 250 mozilla-inbound opt test mochitest-4 on 2011-12-08 09:21:01 PST for push 952d14a9e508 974 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-core/test_importNode16.html reconnecting socket 975 INFO Error: Unable to restore focus, expect failures and timeouts. 976 INFO TEST-PASS | no_throw_NOT_SUPPORTED_ERR 977 INFO TEST-END | /tests/dom/tests/mochitest/dom-level2-core/test_importNode16.html | finished in 3766ms 978 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-core/test_importNode17.html INFO | automation.py | Application ran for: 0:11:03.073922
https://tbpl.mozilla.org/php/getParsedLog.php?id=7888567&tree=Firefox Android Tegra 250 mozilla-central opt test mochitest-3 on 2011-12-12 11:20:00 PST for push 9989f0fed131 98 INFO TEST-START | /tests/dom/tests/mochitest/dom-level1-core/test_characterdatadeletedataend.html https://tbpl.mozilla.org/php/getParsedLog.php?id=7888627&tree=Firefox Android Tegra 250 mozilla-central opt test mochitest-3 on 2011-12-12 11:04:04 PST for push e67d90275715 91 INFO TEST-START | /tests/dom/tests/mochitest/dom-level1-core/test_characterdataappenddatanomodificationallowederrEE.html https://tbpl.mozilla.org/php/getParsedLog.php?id=7888859&tree=Firefox Android XUL Tegra 250 mozilla-central opt test mochitest-6 on 2011-12-12 11:09:16 PST for push e67d90275715 1031 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLElement52.html
https://tbpl.mozilla.org/php/getParsedLog.php?id=7891316&tree=Firefox Android XUL Tegra 250 mozilla-central opt test mochitest-4 on 2011-12-12 13:08:52 PST for push 6057abf8f4e6 1106 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-core/test_namednodemapremovenameditemns06.html 1107 INFO Error: Unable to restore focus, expect failures and timeouts. reconnecting socket
https://tbpl.mozilla.org/php/getParsedLog.php?id=7893375&tree=Firefox Android XUL Tegra 250 mozilla-central opt test mochitest-6 on 2011-12-12 15:04:56 PST for push eb5d4c08a542 1714 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLInputElement07.html reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket reconnecting socket unable to connect socket INFO | automation.py | Application ran for: 0:19:39.506195
https://tbpl.mozilla.org/php/getParsedLog.php?id=7907273&tree=Firefox Android XUL Tegra 250 mozilla-central opt test mochitest-6 on 2011-12-13 09:26:18 PST for push 9b7c08d602b9 1600 INFO TEST-START | /tests/dom/tests/mochitest/dom-level2-html/test_HTMLIFrameElement11.html reconnecting socket 1601 INFO Error: Unable to restore focus, expect failures and timeouts. INFO | automation.py | Application ran for: 0:17:53.959437
Whiteboard: [orange][android_tier_1] → [orange][android_tier_1][android_tier_∞]
Thanks to the added socket verbosity, I think this might just be a flavor of bug 681681: https://tbpl.mozilla.org/php/getParsedLog.php?id=10653634&tree=Mozilla-Inbound Error receiving data from socket. cmd=ps ; err=[Errno 54] Connection reset by peer reconnecting socket (the single reconnecting socket being what I think was generally there before with this, but then single reconnecting sockets are all over the place)
https://tbpl.mozilla.org/php/getParsedLog.php?id=10654060&tree=Mozilla-Inbound Error receiving data from socket. cmd=ps ; err=[Errno 54] Connection reset by peer reconnecting socket REFTEST TEST-PASS | http://10.250.48.211:30086/jsreftest/tests/jsreftest.html?test=js1_8_5/extensions/regress-627984-6.js | ok item 1 REFTEST INFO | Loading a blank page REFTEST TEST-START | http://10.250.48.211:30086/jsreftest/tests/jsreftest.html?test=js1_8_5/extensions/regress-627984-7.js | 1107 / 1128 (98%) INFO | automation.py | Application ran for: 0:12:13.794167 INFO | automation.py | Reading PID log: /tmp/tmpAfPflHpidlog getting files in '/mnt/sdcard/tests/reftest/profile/minidumps/' WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected! REFTEST INFO | runreftest.py | Running tests: end. program finished with exit code 0 elapsedTime=749.432141 TinderboxPrint: jsreftest-3<br/><em class="testfail">T-FAIL</em> (which, uh, I, um)
Oops, just a flavor of bug 681861, repeated reconnecting/unable to connect, not some PPC thing.
Interesting, or maybe just a coincidence: The last 3 logs (comments 717 and 718) are all from tegra-349. Comments 714 and 715 are both tegra-182.
And as bug 782495 will tell you, 349 and 182 are both busted.
Whiteboard: [orange][android_tier_1][android_tier_∞] → [android_tier_1][android_tier_∞]
(In reply to Ryan VanderMeulen [:RyanVM] from comment #779) > https://tbpl.mozilla.org/php/getParsedLog.php?id=19577780&tree=Mozilla- > Inbound This looks like a test infra problem of some kind. The test appears to complete normally, but we don't retrieve the last part of the log: 02-08 10:00:00.453 I/GeckoDump( 1850): 52683 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-3.html | e2 should be in the scope of i7 02-08 10:00:03.149 I/SUTAgentAndroid( 1513): 10.250.48.201 : ps 02-08 10:00:03.223 I/SUTAgentAndroid( 1513): 10.250.48.201 : isdir /mnt/sdcard/tests/logs 02-08 10:00:03.243 I/SUTAgentAndroid( 1513): 10.250.48.201 : cd /mnt/sdcard/tests/logs 02-08 10:00:03.273 I/SUTAgentAndroid( 1513): 10.250.48.201 : ls 02-08 10:00:03.403 I/SUTAgentAndroid( 1513): 10.250.48.201 : pull /mnt/sdcard/tests/logs/mochitest.log 02-08 10:00:03.813 I/GeckoDump( 1850): 52684 INFO TEST-END | /tests/content/html/content/test/forms/test_form_attribute-3.html | finished in 3488ms 02-08 10:00:03.823 I/GeckoDump( 1850): 52685 INFO TEST-START | /tests/content/html/content/test/forms/test_form_attribute-4.html 02-08 10:00:04.263 I/GeckoDump( 1850): 52686 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-4.html | i1 form should be it's parent 02-08 10:00:04.263 I/GeckoDump( 1850): 52687 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-4.html | i1 form should be the form with the id in @form 02-08 10:00:04.263 I/GeckoDump( 1850): 52688 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-4.html | i1 should still be in the document 02-08 10:00:04.263 I/GeckoDump( 1850): 52689 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-4.html | i1 should not have any form owner 02-08 10:00:04.263 I/GeckoDump( 1850): 52690 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-4.html | i1 form should be the form with the id in @form 02-08 10:00:04.353 I/GeckoDump( 1850): 52691 INFO TEST-END | /tests/content/html/content/test/forms/test_form_attribute-4.html | finished in 533ms 02-08 10:00:04.383 I/GeckoDump( 1850): 52692 INFO TEST-START | Shutdown 02-08 10:00:04.383 I/GeckoDump( 1850): 52693 INFO Passed: 51150 02-08 10:00:04.383 I/GeckoDump( 1850): 52694 INFO Failed: 0 02-08 10:00:04.383 I/GeckoDump( 1850): 52695 INFO Todo: 340 02-08 10:00:04.383 I/GeckoDump( 1850): 52696 INFO SimpleTest FINISHED 02-08 10:00:04.383 I/GeckoDump( 1850): 52697 INFO TEST-INFO | Ran 0 Loops 02-08 10:00:04.383 I/GeckoDump( 1850): 52698 INFO SimpleTest FINISHED ... 02-08 10:00:07.723 D/GeckoAppShell( 1850): Killing via System.exit() vs. 52683 INFO TEST-PASS | /tests/content/html/content/test/forms/test_form_attribute-3.html | e2 should be in the scope of i7 DeviceManager: pull unsuccessful: could not get all file data
The most recent 6 logs all show death by SIGSEGV but no crash dump and nothing obvious in common between the failures.
Some of the recent (since 04-11) activity in this bug might be a side-effect of bug 861500.
One thing that might be good to give this bug for its comment 1000 birthday would be some logging of what was running when things come to a stop. "TEST UNEXPECTED-FAIL | http://10.250.49.162:30119/tests/dom/media/tests/crashtests/855796.html | Crap, why did we just stop running tests?" (which is at least the last three trunk ones I've done) has a chance of actually being fixed, unlike this dumping ground. https://tbpl.mozilla.org/php/getParsedLog.php?id=23121886&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=23323413&tree=Mozilla-Aurora Another thing that would be good would be to just stop running 855796.html, on both trunk and aurora.
And having skipped that one, we now get two tests further before silently stopping. https://tbpl.mozilla.org/php/getParsedLog.php?id=23379800&tree=Mozilla-Central
(In reply to Phil Ringnalda (:philor) from comment #831) > And having skipped that one, we now get two tests further before silently > stopping. > > https://tbpl.mozilla.org/php/getParsedLog.php?id=23379800&tree=Mozilla- > Central That log shows low memory: processes being killed off (including fennec and sutagent!) about 90 seconds after the last log message from Gecko. 05-24 14:55:43.497 E/GeckoConsole( 1699): [JavaScript Error: "NS_ERROR_UNEXPECTED: Unexpected error" {file: "http://10.250.49.157:30053/tests/dom/media/tests/crashtests/860143.html" line: 14}] ... 05-24 14:57:13.617 I/ActivityManager( 1017): Process android.process.acore (pid 1260) has died. 05-24 14:57:19.387 I/ActivityManager( 1017): Process com.android.email (pid 1346) has died. 05-24 14:57:19.387 I/ActivityManager( 1017): Process com.android.mms (pid 1376) has died. ... 05-24 14:57:19.627 I/ActivityManager( 1017): Process com.mozilla.SUTAgentAndroid (pid 1495) has died. ... 05-24 14:57:19.897 I/ActivityManager( 1017): Process com.android.defcontainer (pid 1564) has died. 05-24 14:57:19.897 I/ActivityManager( 1017): Low Memory: No more background processes.
Some recent reports here - Comment 845, Comment 852 - show low memory, as in bug 877939. 05-30 01:24:32.396 I/ActivityManager( 1017): Process com.svox.pico (pid 1590) has died. 05-30 01:24:32.396 I/ActivityManager( 1017): Process com.mozilla.SUTAgentAndroid (pid 1495) has died and restarted (pid 1881). 05-30 01:24:32.406 I/ActivityManager( 1017): Process com.mozilla.watcher (pid 1433) has died. 05-30 01:24:32.406 W/ActivityManager( 1017): Scheduling restart of crashed service com.mozilla.watcher/.WatcherService in 14971ms 05-30 01:24:32.406 I/ActivityManager( 1017): Process com.cooliris.media (pid 1462) has died. 05-30 01:24:32.526 I/ActivityManager( 1017): Low Memory: No more background processes.
The last 3 failures have been on tegra-142 -- bad device?
(In reply to Geoff Brown [:gbrown] from comment #917) > The last 3 failures have been on tegra-142 -- bad device? https://tbpl.mozilla.org/php/getParsedLog.php?id=25485161&tree=Mozilla-Central Sure looks that way.
3 of the last 4 are tegra-195.
https://tbpl.mozilla.org/php/getParsedLog.php?id=27116751&tree=Mozilla-Inbound The vast majority of these failures are in the dom-level* tests. I will probably skip these on Android soon.
So much has changed since this bug was filed, and I suspect multiple different causes are being lumped in here - so I think we should start afresh. If this occurs again, please file a new bug :-)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Resolution: WORKSFORME → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.