630258 - Huge spike in testrun failures.

Reporter

Description

14 years ago

Today there was a huge spike in test failures across all platforms and branches.

2011-01-30: ~35 failed tests
2011-01-31: ~370 failed tests

It appears as though many of the new failures are waitForPageLoad() timeouts. I don't think there was any scheduled downtime or outage, but who knows? We should investigate what happened here.
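
For scale, going from ~35 to ~370 failed tests is roughly a tenfold increase in one day. A minimal sketch of how such a day-over-day jump could be flagged, assuming the daily failure counts are already available. The function name and the threshold factor of 3 are hypothetical and not part of any Mozmill tooling:

```python
def is_failure_spike(previous: int, current: int, factor: float = 3.0) -> bool:
    """Return True when `current` failures exceed `factor` times `previous`.

    `factor` is an illustrative threshold, not a value taken from this bug.
    """
    if previous <= 0:
        # No baseline to compare against: treat any failure as a spike.
        return current > 0
    return current > previous * factor


# Approximate counts quoted in the description.
print(is_failure_spike(35, 370))  # True
```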

Aaron Train [:aaronmt]

Comment 1

14 years ago

1.5.2RC2 was installed yesterday -- something to not overlook

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 2

14 years ago

We have run 1.5.2 a couple of times on our systems, so I don't expect it to be a regression in Mozmill. In the past I have noticed, at least on the Linux machine, that even loading pages from localhost timed out. Could someone run the testrun_general script on the affected platforms? You can report to mozmill-archive. It would be good to know if that is reproducible. Sadly I don't have time for that right now.

Aaron Train [:aaronmt]

Comment 3

14 years ago

Example from today on Windows NT:

http://mozmill-release.brasstacks.mozilla.com/#/general/report/fc2eabbf52c98c01bb8d9938c6002bc1

controller.waitForPageLoad() timeout on testStopReloadButtons, which uses local data

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 4

14 years ago

Can you reproduce it when you run this single script via Mozmill too? Or does it only occur for test-runs triggered by the automation scripts?

Armen [:armenzg]

Updated

14 years ago

Blocks: 630551

Aaron Train [:aaronmt]

Comment 5

14 years ago

Unable to reproduce locally with the automation script and mozmill test-runs

Aaron Train [:aaronmt]

Comment 6

14 years ago

Remotely (qa-horus w/ 1.5.2rc2), multiple ports are in use during the test-run, perhaps by multiple httpd instances. This is visible during the tests that run on localhost. The port number is incremented by one with each test that is run.
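
One way to observe the incrementing ports described above is to probe a range of localhost ports while a test-run is in progress. This is only a sketch under assumptions: the helper name, the starting port, and the range size are hypothetical, and the bug does not say which ports were actually used.

```python
import socket
from typing import List


def listening_ports(host: str, start: int, count: int, timeout: float = 0.2) -> List[int]:
    """Return the ports in [start, start + count) that accept a TCP connection."""
    open_ports = []
    for port in range(start, start + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # 43336 is a placeholder starting port, not one reported in this bug.
    print(listening_ports("127.0.0.1", 43336, 20))
```

Running it repeatedly during a test-run would show whether the set of open ports grows by one per test, which would be consistent with a new httpd instance being started each time.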

Aaron Train [:aaronmt]

Comment 7

14 years ago

(In reply to comment #5)
> Unable to reproduce locally with the automation script and mozmill test-runs

Scratch that, was on 1.5.1 - with 1.5.2rc2 I see this.

Aaron Train [:aaronmt]

Comment 8

14 years ago

Steps to Reproduce

1. Install and set up mozmill 1.5.2rc2
2. Clone http://hg.mozilla.org/qa/mozmill-tests/
3. mozmill -t firefox/testAwesomebar -b <path to binary>
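
For repeated runs, step 3 could be wrapped in a small script. The sketch below uses only the -t and -b flags quoted in that step; the function names are hypothetical, and it assumes mozmill is installed and on PATH.

```python
import subprocess
from typing import List


def mozmill_command(test_path: str, binary: str) -> List[str]:
    """Build the command line from step 3, using the -t and -b flags quoted there."""
    return ["mozmill", "-t", test_path, "-b", binary]


def run_test(test_path: str, binary: str) -> int:
    """Run mozmill and return its exit code (requires mozmill on PATH)."""
    return subprocess.run(mozmill_command(test_path, binary)).returncode
```

Calling run_test("firefox/testAwesomebar", binary_path) from inside the mozmill-tests clone, with binary_path pointing at the Firefox build under test, corresponds to step 3.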

Aaron Train [:aaronmt]

Comment 9

14 years ago

Comments #6-#8 might be related to something else. Moving discussion to bug 630599

Henrik Skupin [:whimboo][⌚️UTC+2]

Updated

14 years ago

Depends on: 630599

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 10

14 years ago

Bug 630599 should have fixed that. I cannot reproduce these failures anymore. See the latest testruns:

http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed0513ad
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed051c08
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed052f23

Also the results are looking awesome! There is only one failure on the older branches for the testPasswordNotSaved.js test, but it looks different to bug 614973.

We should wait for the official test-run today.

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Comment 11

14 years ago

(In reply to comment #10)
> Bug 630599 should have fixed that. I cannot reproduce these failures anymore.
> See the latest testruns:
>
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed0513ad
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed051c08
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed052f23
>
> Also the results are looking awesome! There is only one failure on the older
> branches for the testPasswordNotSaved.js test, but it looks different to bug
> 614973.
>
> We should wait for the official test-run today.

whimboo: can you review these test results with ctalbert, and give us a better description of what exactly was being tested? I ask because we just recently had to disable mozmill in production because of problems caused when mozmill hung production machines.

Once these tests are passing in staging, without hanging, please file a bug in mozilla.org/RelEng to have us re-enable these tests in production at a time when everyone is around to carefully watch for hangs.

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 12

14 years ago

(In reply to comment #11)
> whimboo: can you review these test results with ctalbert, and give us a better
> description of what exactly was being tested? I ask because we just recently
> had to disable mozmill in production because of problems caused when mozmill
> hung production machines.

John, we discussed that in yesterday's Mozmill meeting, and we probably don't want to re-enable the Mozmill tests until we have released Mozmill 2.0. The issue you have seen is a different one from what we have logged here.

No longer blocks: 630551

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 13

14 years ago

This huge spike could also be related to Mozmill's new feature of reporting chrome JS errors. We don't report those correctly due to bug 631175, but it shows that we have some strange bugs in the browser itself:

http://mozmill-archive.brasstacks.mozilla.com/#/general/report/00edfccff35537bd5dcddb5129620f10

"[JavaScript Error: \"document is null\" {file: \"chrome://browser/content/browser.js\" line: 12728}]"

It seems to happen only on Linux, so I wonder if we have a regression there. Aaron, are you able to reproduce that and, if yes, does it also happen in older releases?

Aaron Train [:aaronmt]

Comment 14

14 years ago

I just did a run via testrun_general as well as mozmill on its own under Linux with the latest nightly and beta 10, w/ 1.5.2rc3, and did not see any spikes. In fact, I got quite the opposite:

INFO Passed: 216
INFO Failed: 0
INFO Skipped: 13
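
Summary lines like these could be pulled out of a log to compare runs over time. A minimal sketch, assuming the summary always takes the "INFO Passed/Failed/Skipped: N" form shown in this comment; the regex and function name are hypothetical and not part of Mozmill:

```python
import re
from typing import Dict

# Matches the three summary counters as they appear in the comment above.
SUMMARY_RE = re.compile(r"INFO\s+(Passed|Failed|Skipped):\s+(\d+)")


def parse_summary(log_text: str) -> Dict[str, int]:
    """Map each counter name (lowercased) to its integer value."""
    return {name.lower(): int(value) for name, value in SUMMARY_RE.findall(log_text)}


summary = parse_summary("INFO Passed: 216 INFO Failed: 0 INFO Skipped: 13")
print(summary)  # {'passed': 216, 'failed': 0, 'skipped': 13}
```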

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 15

14 years ago

Could this be a combination of updates and general tests again? Would be good to know what testrun_all.py says.

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 16

14 years ago

This should have had the same cause as bug 631175. Not sure what happened, but the Python package was completely broken on that machine.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → DUPLICATE

Nobody; OK to take it and work on it

Assignee

Updated

11 years ago

Product: Mozilla QA → Mozilla QA Graveyard

Categories

(Mozilla QA Graveyard :: Mozmill Automation, defect)

Tracking

(Not tracked)

People

(Reporter: u279076, Unassigned)

Details

(Whiteboard: [mozmill-test-failure])

Security

(public)
