Closed Bug 630258 Opened 14 years ago Closed 14 years ago

Huge spike in testrun failures.

Categories

(Mozilla QA Graveyard :: Mozmill Automation, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 631175

People

(Reporter: u279076, Unassigned)

References

()

Details

(Whiteboard: [mozmill-test-failure])

Today there was a huge spike in test failures across all platforms and branches.
2011-01-30: ~35 failed tests
2011-01-31: ~370 failed tests
It appears as though many of the new failures are waitForPageLoad() timeouts. I don't think there was any scheduled downtime or outage, but who knows? We should investigate what happened here.
1.5.2RC2 was installed yesterday -- something to not overlook
We have run 1.5.2 a couple of times on our systems, so I don't expect it to be a regression in Mozmill. In the past I have noticed, at least on the Linux machine, that even loading pages from localhost timed out. Could someone run the testrun_general script on the affected platforms? You can report to mozmill-archive. It would be good to know if that is reproducible. Sadly I don't have time for that right now.
Example from today on Windows NT:
http://mozmill-release.brasstacks.mozilla.com/#/general/report/fc2eabbf52c98c01bb8d9938c6002bc1
controller.waitForPageLoad() timeout on testStopReloadButtons, which uses local data.
Can you reproduce it when you run this single script via Mozmill too? Or does it only occur for test-runs triggered by the automation scripts?
Blocks: 630551
Unable to reproduce locally with the automation script and mozmill test-runs
Remotely (qa-horus w/ 1.5.2rc2), multiple ports are in use during the test-run, perhaps by multiple httpd instances. This is visible during the tests that run against localhost. The port number is incremented by one with each test that is run.
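One way to confirm the port observation above is to probe localhost while a test-run is in progress. The following is only a minimal sketch: the helper name `listening_ports` and the port range in the usage note are hypothetical, since this bug does not say which ports were seen.

```python
import socket


def listening_ports(host, ports, timeout=0.2):
    """Return the ports from `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

Calling something like `listening_ports("127.0.0.1", range(43336, 43346))` between tests (the range is an assumed example, not taken from the logs) would show whether a new listener appears for each test while the old ones stay open.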
(In reply to comment #5)
> Unable to reproduce locally with the automation script and mozmill test-runs

Scratch that, was on 1.5.1 - with 1.5.2rc2 I see this.
Steps to Reproduce
1. Install and set up mozmill 1.5.2rc2
2. Clone http://hg.mozilla.org/qa/mozmill-tests/
3. mozmill -t firefox/testAwesomebar -b <path to binary>
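Steps 2 and 3 can be scripted roughly as below. This is a sketch under assumptions: the function names and the `mozmill-tests` checkout directory are hypothetical, the test path is assumed to be relative to the clone, and installing mozmill 1.5.2rc2 (step 1) is not covered. The binary path is left as a parameter.

```python
import subprocess

REPO_URL = "http://hg.mozilla.org/qa/mozmill-tests/"


def repro_commands(binary_path, checkout_dir="mozmill-tests"):
    """Build the command lines for steps 2 and 3 without running them."""
    return [
        ["hg", "clone", REPO_URL, checkout_dir],
        ["mozmill", "-t", "firefox/testAwesomebar", "-b", binary_path],
    ]


def run_repro(binary_path, checkout_dir="mozmill-tests"):
    """Clone the tests, then run the single test from inside the clone."""
    clone_cmd, test_cmd = repro_commands(binary_path, checkout_dir)
    subprocess.check_call(clone_cmd)
    # Assumption: the relative test path resolves from the checkout directory.
    return subprocess.call(test_cmd, cwd=checkout_dir)
```

`run_repro()` returns the exit code of the mozmill process, so a wrapper can loop over several binaries and compare results.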
Comments #6-#8 might be related to something else. Moving discussion to bug 630599
Bug 630599 should have fixed that. I cannot reproduce these failures anymore. See the latest testruns:
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed0513ad
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed051c08
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed052f23
Also the results are looking awesome! There is only one failure on the older branches for the testPasswordNotSaved.js test, but it looks different from bug 614973. We should wait for the official test-run today.
(In reply to comment #10)
> Bug 630599 should have fixed that. I cannot reproduce these failures anymore.
> See the latest testruns:
>
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed0513ad
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed051c08
> http://mozmill-archive.brasstacks.mozilla.com/#/general/report/e2d85e9ae099daa31109068aed052f23
>
> Also the results are looking awesome! There is only one failure on the older
> branches for the testPasswordNotSaved.js test, but it looks different from bug
> 614973.
>
> We should wait for the official test-run today.

whimboo: can you review these test results with ctalbert, and give us a better description of what exactly was being tested? I ask because we just recently had to disable mozmill in production because of problems caused when mozmill hung production machines. Once these tests are passing in staging, without hanging, please file a bug in mozilla.org/RelEng to have us re-enable these tests in production at a time when everyone is around to carefully watch for hangs.
(In reply to comment #11)
> whimboo: can you review these test results with ctalbert, and give us a better
> description of what exactly was being tested? I ask because we just recently
> had to disable mozmill in production because of problems caused when mozmill
> hung production machines.

John, we discussed that in yesterday's Mozmill meeting, and we probably don't want to re-enable the Mozmill tests until we have released Mozmill 2.0. The issue you have seen is different from the one we have logged here.
No longer blocks: 630551
This huge spike could also be related to the new Mozmill feature that reports chrome JS errors. We don't report those correctly due to bug 631175, but it shows that we have some strange bugs in the browser itself:
http://mozmill-archive.brasstacks.mozilla.com/#/general/report/00edfccff35537bd5dcddb5129620f10
"[JavaScript Error: \"document is null\" {file: \"chrome://browser/content/browser.js\" line: 12728}]"
It seems to happen only on Linux, so I wonder if we have a regression there. Aaron, are you able to reproduce that, and if so, also in older releases?
I just did a run via testrun_general as well as mozmill on its own under Linux with the latest nightly and beta 10, w/ 1.5.2rc3, and did not see any spikes. In fact, I got quite the opposite:
INFO Passed: 216
INFO Failed: 0
INFO Skipped: 13
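To compare runs like the one above across days, the summary counts can be pulled out of saved console output. The following is a minimal sketch that assumes the summary looks exactly like the `INFO Passed/Failed/Skipped` lines quoted in this comment; the name `parse_summary` is hypothetical and real Mozmill output may differ.

```python
import re

# Matches e.g. "INFO Passed: 216", whether on one line or spread over several.
SUMMARY_RE = re.compile(r"INFO\s+(Passed|Failed|Skipped):\s+(\d+)")


def parse_summary(log_text):
    """Return the passed/failed/skipped counts found in `log_text`."""
    counts = {"Passed": 0, "Failed": 0, "Skipped": 0}
    for label, value in SUMMARY_RE.findall(log_text):
        counts[label] = int(value)
    return counts
```

A jump in the `Failed` count between two days, like the one that opened this bug, would then be easy to flag automatically.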
Could this be a combination of updates and general tests again? Would be good to know what testrun_all.py says.
This should have had the same cause as bug 631175. Not sure what happened, but the Python package was completely broken on that machine.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Product: Mozilla QA → Mozilla QA Graveyard