Closed Bug 1013883 Opened 10 years ago Closed 10 years ago

[JSMarionette] Random timeout issues among different tests in different apps

Categories

(Testing Graveyard :: JSMarionette, defect)

All
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: gweng, Unassigned)

I have recently had a very hard time with my LockScreen as-an-app patch (Bug 898348), because it comes with many timeout errors across several apps that, judging by the code structure and the changes, look unrelated to it. The tests all fail at the 'before each' stage, and all of the failures are timeout errors. The apps with the most failed tests are Settings, Contacts, Email, and Calendar; the Browser also fails in the test that adds a bookmark to the Homescreen. Sometimes the tests of the Search and Keyboard apps (the latter while dismissing the keyboard) fail, too.

Some of these errors have been identified, but that still gives no information about how they happen. For example, when 3 or 4 tests failed in the Settings app while all the other tests passed as usual, the failures turned out to be caused by the panel sliding animation not being performed. That is, the button was clicked but the panel did not slide, so the waiting loop kept looping until it timed out.

Although I suspect that all of these failures are related to animation (so they would share a common factor), it's impossible for me to figure out what exactly happens in these tests, at least not within the 2.0 scope. So I can only try to determine whether my patch is the root cause of the failures, and I ran many tests under different conditions. Below is what I've found:

1. My forked repo + my patch, on Travis:
failed with the random timeout issues

2. My forked repo + my patch REVERTED, on Travis:
failed with the random timeout issue

(In theory, commit + revert should be equal to the original master. However, maybe Travis doesn't think so.)

3. My forked repo - my patch RESET, on Travis:
passed as usual

4. My forked repo + my patch, disabled the System app parts, on Travis:
failed with the random timeout issue

(This is because I thought the only part of my patch that could affect other apps was the part in the System app.)

5. My forked repo + my patch, 5/16~5/17, on Travis:
passed as usual; I don't know why, as I did nothing but rebase

(This is the weirdest case: the tests all passed suddenly, and I've run them 5+ times successfully.)

6. My forked repo + my patch, 5/19, on Travis:
failed with the random timeout issue

(After the miraculous success, it failed as before 2 days later, and I still don't know what happened.)

7. Clean repo (cloned from mozilla-b2g) and apply my patch, PR to my own repo, on Travis, run #1:
passed for no apparent reason

(This is the second passing result, and I kept the job here: https://travis-ci.org/snowmantw/gaia/jobs/25656362)

8. Clean repo (cloned from mozilla-b2g) and apply my patch, PR to my own repo, on Travis, run #2:

failed with the random timeout issue

(I can't make it pass again...)

9. My forked repo + my patch, PR to mozilla-b2g repo, on Travis, after the successful result:

failed with the random timeout issue

(The same patch rebased on the same master, one passed while another one failed)

10. My forked repo + my patch, Ubuntu 13.04 virtualbox image run inside my Mac console: very slow and the timeout issues occur as before

11. Clean Gaia, master branch, Ubuntu 13.04 virtualbox image run inside my Mac console: very slow and the timeout issues occur as before

I also ran the Marionette tests on my local Linux console. They failed with similar timeout issues whether or not my patch had been committed.

On my Mac OS X machine, the tests passed until the B2G desktop crashed (tried several times).

I also noticed that when the tests pass, the average time to complete them seems shorter than in the failed sessions. So I suspect the root cause is that when the B2G desktop runs on a slow Linux machine, it hits random timeouts for some unknown reason and the animations are not performed (on my Mac console the tests pass smoothly, and it is always significantly faster than my Linux console, although the two machines are at the same level).

The strangest thing is that the integration tests belonging to my patch never failed during these tries. All the failed tests are owned by other apps. I once made a long blacklist to find out how many tests I would need to suspend to get the tests to pass. As a result, the blacklist file grew to twice its previous length, and it would block half of the integration tests.

I've discussed this with Evan, Rudy and Arthur (since the Settings app owns most of the failed tests), and Evan suggested that I open this bug to invite more people to the discussion. Steve and Alive told me they've seen similar errors as well, although I don't know whether the symptom is the same or not.
I have some logs for it, and I believe this can be reproduced easily. I'll post them all as soon as possible. Here are some failed jobs:

https://travis-ci.org/mozilla-b2g/gaia/jobs/25662768
https://travis-ci.org/snowmantw/gaia/jobs/25601341
https://travis-ci.org/mozilla-b2g/gaia/jobs/25657782
Sounds painful :) There are some common themes though.

1 - Intermittent homescreen. I disabled all of these today, they have been failing too much.

2 - Intermittent settings tests. We've been writing a lot of tests for settings, and unfortunately a lot of them are intermittent, or going intermittent after refactoring a panel. It's hard to tell at exactly which patch we start failing, so it's probably best to just disable them. It seems like there is either a bug in the test or the app if we get stuck on an animation. In this case we might try changing the wait to examine position and not wait for animationend.

3 - Polling socket recv() timeout! issue - this is an environmental issue, and is known, but I'm not sure if we have a bug filed. The main framework bug we need to fix right now is bug 1005707. I'm not sure if it would resolve this though.
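
The suggestion in item 2 (poll the panel's position rather than waiting for animationend) could look roughly like the sketch below. It is only an illustration: `waitForPosition`, `sleepSync`, and `makeFakePanel` are hypothetical names, not part of the Marionette JS client or the Settings test helpers, and the fake panel stands in for a real element whose x coordinate a test would read. It also assumes a Node.js environment where blocking the thread with `Atomics.wait` is acceptable.

```javascript
// Block the current thread for `ms` milliseconds (Node.js only).
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Poll `readX()` until it reports `targetX` (within `tolerance` pixels).
// Returns the number of polls used; throws once `timeout` ms have elapsed.
// Hypothetical helper, not an existing Marionette API.
function waitForPosition(readX, targetX, opts) {
  opts = opts || {};
  var timeout = opts.timeout === undefined ? 10000 : opts.timeout;
  var interval = opts.interval === undefined ? 50 : opts.interval;
  var tolerance = opts.tolerance === undefined ? 0.5 : opts.tolerance;
  var start = Date.now();
  var polls = 0;
  while (true) {
    polls++;
    if (Math.abs(readX() - targetX) <= tolerance) {
      return polls;
    }
    if (Date.now() - start >= timeout) {
      throw new Error('panel never reached x=' + targetX +
                      ' within ' + timeout + 'ms');
    }
    sleepSync(interval);
  }
}

// Hypothetical stand-in for a sliding panel: each read moves it 80px
// toward x=0, whether or not an animationend event would ever fire.
function makeFakePanel(startX) {
  var x = startX;
  return function readX() {
    var current = x;
    x = Math.max(0, x - 80);
    return current;
  };
}

var polls = waitForPosition(makeFakePanel(320), 0, { interval: 1 });
console.log('panel settled after ' + polls + ' polls');
```

If the panel never moves, the helper throws with the target position in the message, which at least makes a stuck animation visible in the log instead of a bare "before each" timeout.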

Although it's not ideal, I think it's pretty rare to have to restart more than once to get a passing build on travis. I'm sorry you're going through this pain, we need to educate developers about how to write better tests, and get people on these testing framework problems. Unfortunately there is no one team responsible for testing stuff, so it's up to us developers to pitch-in and make things better right now.
It seems to be getting worse: now it gets stuck at the Timer section, and the tests that tend to time out cannot pass at all:

https://travis-ci.org/mozilla-b2g/gaia/jobs/25765475

I'll do the revert and reset test again later.
Hi Greg,

Thanks for collecting the detailed info here.
However, among the jobs above I could not find one that shows a failure around keyboard dismissal.

I'll try to grab your patch and run the stability check again to see if I can reproduce it.
The stability check for the keyboard-related tests has passed. Here is the result:
https://travis-ci.org/mozilla-b2g/gaia/builds/26116758
This includes 17 runs * 6 times per run = 102 test runs for each test case.
Here is the stability check result for all the marionette-js tests of the Settings app:
 Setup: run 3 rounds of all tests in each job.

 - with "lockscreen as an app"
   https://travis-ci.org/mozilla-b2g/gaia/builds/26193269 
   - all jobs failed.

 - without "lockscreen as an app", i.e. the current master
   https://travis-ci.org/mozilla-b2g/gaia/builds/26193475
   - 3 jobs failed. Interestingly, the failure message is different from the above.
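
For reference, the idea behind these stability checks (run the same test for several rounds and tally the outcomes) can be sketched as below. This is not the actual Travis setup used in the builds above; `stabilityCheck` and `fakeTest` are hypothetical names, and the fake test fails on a fixed schedule purely so the example is deterministic.

```javascript
// Run `testFn` for `rounds` rounds and tally the outcomes.
// A thrown error counts as a failed round; nothing is retried.
function stabilityCheck(testFn, rounds) {
  var result = { passed: 0, failed: 0, errors: [] };
  for (var i = 0; i < rounds; i++) {
    try {
      testFn(i);
      result.passed++;
    } catch (err) {
      result.failed++;
      result.errors.push('round ' + (i + 1) + ': ' + err.message);
    }
  }
  return result;
}

// Hypothetical flaky test: it "times out" on every third round.
function fakeTest(round) {
  if (round % 3 === 2) {
    throw new Error('timeout exceeded!');
  }
}

var summary = stabilityCheck(fakeTest, 6);
console.log(summary.passed + ' passed, ' + summary.failed + ' failed');
summary.errors.forEach(function (line) {
  console.log(line);
});
```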
== Run MarionetteJS 30 Times and The Summary ==
https://travis-ci.org/snowmantw/gaia/builds/26184404

TL;DR: Keyboard still failed at the dismissing test; settings failed as usual. Nothing new and I'll continue to try to pin down the root cause.

# Success:

+ https://travis-ci.org/snowmantw/gaia/jobs/26184411
+ https://travis-ci.org/snowmantw/gaia/jobs/26184418
+ https://travis-ci.org/snowmantw/gaia/jobs/26184422
+ https://travis-ci.org/snowmantw/gaia/jobs/26184431
+ https://travis-ci.org/snowmantw/gaia/jobs/26184433
+ https://travis-ci.org/snowmantw/gaia/jobs/26184439
+ https://travis-ci.org/snowmantw/gaia/jobs/26184444
+ https://travis-ci.org/snowmantw/gaia/jobs/26184453
+ https://travis-ci.org/snowmantw/gaia/jobs/26184466
+ https://travis-ci.org/snowmantw/gaia/jobs/26184467
+ https://travis-ci.org/snowmantw/gaia/jobs/26184470

# Failed:

+ (0/1)
https://travis-ci.org/snowmantw/gaia/jobs/26184409

1) Use a different outgoing password should be able to manually set up email:
Error: timeout exceeded!

email/test/marionette/manual_setup_outgoing_credentials_test.js:28

+ (2/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184413

1) manipulate display settings "before each" hook:

settings/test/marionette/tests/display_settings_test.js:17

2) Timer "before each" hook:

clock/test/marionette/timer_test.js:9

+ (1/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184415

1) manipulate sound settings "before each" hook:

settings/test/marionette/tests/sound_settings_test.js:14

+ (2/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184420

1) email message list edit mode "before each" hook:

apps/email/test/marionette/message_list_test.js:22

2) Contacts > Form Click phone number Add a simple contact:

apps/communications/contacts/test/marionette/form_test.js:28

+ (2/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184425

1) manipulate sound settings "before each" hook:

settings/test/marionette/tests/sound_settings_test.js:14

2) improve b2g "before each" hook:

settings/test/marionette/tests/improve_settings_test.js:13

+ (5/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184427

1) manipulate screenLock settings "before each" hook:

settings/test/marionette/tests/screen_lock_settings_test.js:14

2) manipulate display settings "before each" hook:

apps/settings/test/marionette/tests/display_settings_test.js:17

3) manipulate media storage settings "before each" hook:

settings/test/marionette/tests/media_storage_settings_test.js:14

4) Contacts > Form Facebook contacts Add phone number from Dialer to existing Facebook contact:

communications/contacts/test/marionette/form_test.js:123

5) Dimiss the keyboard "before each" hook:

keyboard/test/marionette/dismiss_test.js:41

+ (3/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184435

1) show Keyboard APP "before each" hook:

keyboard/test/marionette/launch_test.js:31

2) Contacts > Delete  > Edit menu is not visible on search mode:

communications/contacts/test/marionette/delete_contacts_test.js:32

3) check root panel settings common tests language description on the root panel is translated:

settings/test/marionette/tests/root_settings_test.js:62

+ (3/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184437

1) Contacts > Delete  > Edit menu is not visible on search mode:

communications/contacts/test/marionette/delete_contacts_test.js:32

2) manipulate display settings "before each" hook:

settings/test/marionette/tests/display_settings_test.js:17

3) email message list edit mode "before each" hook:

email/test/marionette/message_list_test.js:22

+ (2/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184441

1) improve b2g "before each" hook:

settings/test/marionette/tests/improve_settings_test.js:13

2) Contacts > Activities webcontacts/contact activity a contact with duplicate number shows merge page:

communications/contacts/test/marionette/activities_test.js:61

+ (1/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184446

1) check root panel settings common tests usb storage enable usb storage:

settings/test/marionette/tests/root_settings_test.js:108

+ (2/1)
https://travis-ci.org/snowmantw/gaia/jobs/26184449

1) Vertical - Bookmark Bookmarking appends to last group:

AssertionError: 48 == 47

dev_apps/home2/test/marionette/bookmark_test.js:54

2) manipulate sound settings "before each" hook:

settings/test/marionette/app/app.js:121

3) Dimiss the keyboard "before each" hook:

keyboard/test/marionette/dismiss_test.js:41

+ (6/1)
https://travis-ci.org/snowmantw/gaia/jobs/26184456

1) Vertical - Bookmark Bookmarking appends to last group:

AssertionError: 1 == 48

dev_apps/home2/test/marionette/bookmark_test.js:54

2) manipulate sound settings "before each" hook:

settings/test/marionette/app/app.js:198

3) check root panel settings common tests usb storage enable usb storage:

settings/test/marionette/tests/root_settings_test.js:108

4) manipulate display settings "before each" hook:

settings/test/marionette/tests/display_settings_test.js:17

5) Contacts > Activities webcontacts/contact activity a contact with duplicate number shows merge page:

communications/contacts/test/marionette/lib/contacts.js:197

6) Stopwatch "before each" hook:

clock/test/marionette/stopwatch_test.js:9

+ (3/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184459

1) manipulate language settings "before each" hook:

settings/test/marionette/tests/language_settings_test.js:14

2) check root panel settings common tests geolocation disable geolocation:

settings/test/marionette/tests/root_settings_test.js:92

3) today "before each" hook:

calendar/test/marionette/today_test.js:18

+ (2/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184461

1) manipulate sound settings "before each" hook:

settings/test/marionette/tests/sound_settings_test.js:14

2) manipulate display settings "before each" hook:

settings/test/marionette/tests/display_settings_test.js:17

+ (1/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184463

1) manipulate sound settings "before each" hook:

settings/test/marionette/tests/sound_settings_test.js:14

+ (4/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184464

1) manipulate screenLock settings "before each" hook:

settings/test/marionette/tests/screen_lock_settings_test.js:14

2) manipulate battery settings "before each" hook:

settings/test/marionette/tests/battery_settings_test.js:14

3) Dimiss the keyboard "before each" hook:

keyboard/test/marionette/dismiss_test.js:41

4) week view "before each" hook:

calendar/test/marionette/lib/calendar.js:135

! (0/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184465

Timer
No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.

+ (2/1)
https://travis-ci.org/snowmantw/gaia/jobs/26184468

1) Vertical - Search "before each" hook:

Error: timeout of 60000ms exceeded

at null.<anonymous> (/home/travis/build/snowmantw/gaia/node_modules/mocha/lib/runnable.js:139:19)

2) Vertical - Search "after each" hook:

Error: Not connected. To write data you must call connect first.

at TcpSync.send (/home/travis/build/snowmantw/gaia/node_modules/marionette-client/lib/marionette/drivers/tcp-sync.js:99:15)

3) Contacts > Delete  > Edit menu is not visible on search mode:

communications/contacts/test/marionette/delete_contacts_test.js:32

+ (1/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184469

1) manipulate display settings "before each" hook:

settings/test/marionette/tests/display_settings_test.js:17
# Timeout:

calendar/test/marionette/lib/calendar.js:135
calendar/test/marionette/today_test.js:18
clock/test/marionette/stopwatch_test.js:9
clock/test/marionette/timer_test.js:9
communications/contacts/test/marionette/activities_test.js:61
communications/contacts/test/marionette/delete_contacts_test.js:32
communications/contacts/test/marionette/form_test.js:123
communications/contacts/test/marionette/form_test.js:28
communications/contacts/test/marionette/lib/contacts.js:197
email/test/marionette/message_list_test.js:22
keyboard/test/marionette/dismiss_test.js:41
keyboard/test/marionette/launch_test.js:31
settings/test/marionette/app/app.js:121
settings/test/marionette/app/app.js:198
settings/test/marionette/tests/battery_settings_test.js:14
settings/test/marionette/tests/display_settings_test.js:17
settings/test/marionette/tests/improve_settings_test.js:13
settings/test/marionette/tests/language_settings_test.js:14
settings/test/marionette/tests/media_storage_settings_test.js:14
settings/test/marionette/tests/root_settings_test.js:108
settings/test/marionette/tests/root_settings_test.js:62
settings/test/marionette/tests/root_settings_test.js:92
settings/test/marionette/tests/screen_lock_settings_test.js:14
settings/test/marionette/tests/sound_settings_test.js:14
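
To see how the failing locations in the list above are distributed across apps, they can be grouped by their first path component. The snippet below is only a sketch: the paths are copied from the list as posted, and `countByApp` is a hypothetical helper name.

```javascript
// Failing locations copied from the "# Timeout:" list above.
var locations = [
  'calendar/test/marionette/lib/calendar.js:135',
  'calendar/test/marionette/today_test.js:18',
  'clock/test/marionette/stopwatch_test.js:9',
  'clock/test/marionette/timer_test.js:9',
  'communications/contacts/test/marionette/activities_test.js:61',
  'communications/contacts/test/marionette/delete_contacts_test.js:32',
  'communications/contacts/test/marionette/form_test.js:123',
  'communications/contacts/test/marionette/form_test.js:28',
  'communications/contacts/test/marionette/lib/contacts.js:197',
  'email/test/marionette/message_list_test.js:22',
  'keyboard/test/marionette/dismiss_test.js:41',
  'keyboard/test/marionette/launch_test.js:31',
  'settings/test/marionette/app/app.js:121',
  'settings/test/marionette/app/app.js:198',
  'settings/test/marionette/tests/battery_settings_test.js:14',
  'settings/test/marionette/tests/display_settings_test.js:17',
  'settings/test/marionette/tests/improve_settings_test.js:13',
  'settings/test/marionette/tests/language_settings_test.js:14',
  'settings/test/marionette/tests/media_storage_settings_test.js:14',
  'settings/test/marionette/tests/root_settings_test.js:108',
  'settings/test/marionette/tests/root_settings_test.js:62',
  'settings/test/marionette/tests/root_settings_test.js:92',
  'settings/test/marionette/tests/screen_lock_settings_test.js:14',
  'settings/test/marionette/tests/sound_settings_test.js:14'
];

// Count locations per app, where the app is the first path component.
function countByApp(paths) {
  var counts = {};
  paths.forEach(function (path) {
    var app = path.split('/')[0];
    counts[app] = (counts[app] || 0) + 1;
  });
  return counts;
}

var counts = countByApp(locations);
Object.keys(counts)
  .sort(function (a, b) { return counts[b] - counts[a]; })
  .forEach(function (app) {
    console.log(app + ': ' + counts[app]);
  });
```

This counts distinct failing locations, not how often each one failed across the 30 runs.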

# Assertion Error/Not Before Each Timeout:

dev_apps/home2/test/marionette/bookmark_test.js:54

AssertionError: 1 == 48
https://travis-ci.org/snowmantw/gaia/jobs/26184456

AssertionError: 48 == 47
https://travis-ci.org/snowmantw/gaia/jobs/26184449

email/test/marionette/manual_setup_outgoing_credentials_test.js:28

1) Use a different outgoing password should be able to manually set up email:
Error: timeout exceeded!

# Platform Error:

! (0/0)
https://travis-ci.org/snowmantw/gaia/jobs/26184465

Timer
No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.

+ (2/1)
https://travis-ci.org/snowmantw/gaia/jobs/26184468

1) Vertical - Search "before each" hook:

Error: timeout of 60000ms exceeded

at null.<anonymous> (/home/travis/build/snowmantw/gaia/node_modules/mocha/lib/runnable.js:139:19)

2) Vertical - Search "after each" hook:

Error: Not connected. To write data you must call connect first.
Sorry, the entry

calendar/test/marionette/lib/calendar.js:135

should be:

calendar/test/marionette/week_view_test.js:14

which failed at:

https://travis-ci.org/snowmantw/gaia/jobs/26184464
Hi Greg,

Got it.
Thanks for the info at Comment 8.

We will land Bug 950673 and Bug 1003788 today. The patches might help with Bug 1013883.

I will let you know when they land. Then let's run the tests 30 times again after the patches have landed.
Depends on: 950673
Hi Greg,

Bug 950673 and Bug 1003788 have landed.
Could we retry your patch?

I tried to cherry-pick your patch, but I got a lot of conflicts.
So I need your help to run your patch on Travis.

Thanks.
Flags: needinfo?(gweng)
Closing the issue; please refer to https://bugzil.la/898348#c56.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Flags: needinfo?(gweng)
Product: Testing → Testing Graveyard