Closed Bug 523934 Opened 15 years ago Closed 15 years ago

debug crashtests: 11 tests, and then whole suite, time out reliably

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dbaron, Assigned: dbaron)

The debug unit test machines that are running crashtests have reliable test timeouts. I looked at 1 Linux run, 2 Mac runs, and 1 Windows run, and they all had exactly the same set of test failures, all due to "timed out waiting for reftest-wait to be removed (after onload fired)":

layout/base/crashtests/500467-1.html
layout/forms/crashtests/366537-1.xhtml
layout/forms/crashtests/367587-1.html
layout/forms/crashtests/370703-1.html
layout/forms/crashtests/370940-1.html
layout/forms/crashtests/373586-1.xhtml
layout/generic/crashtests/225868-1.html
layout/generic/crashtests/307979-1.html
layout/generic/crashtests/324318-1.html
layout/generic/crashtests/334105-1.xhtml
layout/generic/crashtests/337883-1.html

I *don't* see the problem locally (although my local run does hang on layout/generic/crashtests/438509-1.html, which we might want to make skip-if(isDebugBuild); it's a performance test).
Blocks: 523385
Now this seems to have stopped happening on Linux (two cycles in a row), though it was reliable on all platforms before.
It's intermittent on Linux, but still reliable on Mac and Windows.
Summary: debug crashtests: 10 tests, and then whole suite, time out reliably → debug crashtests: 11 tests, and then whole suite, time out reliably
The Linux debug everythingelse log for the above changeset is:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256313337.1256320715.28178.gz&fulltext=1
and it looks like none of the script in the testcases was executed at all.

The regular Linux everythingelse log is:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256309297.1256313655.9057.gz&fulltext=1
and it showed what I expected:

DEBUGGING BUG 523934: in doIt
DEBUGGING BUG 523934: timeout set
DEBUGGING BUG 523934: in AttrModifiedListener
DEBUGGING BUG 523934: AttrModifiedListener: set timeout
DEBUGGING BUG 523934: in AttrModifiedListenerContinuation
DEBUGGING BUG 523934: in FinishWaitingForTestEnd
DEBUGGING BUG 523934: AttrModifiedListenerContinuation: done waiting

and the Linux opt everythingelse log is:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256309037.1256312221.25095.gz&fulltext=1

So it seems like something is preventing the script from being executed in the first place. (Going to check the Mac and Windows logs shortly.)
The Mac debug everythingelse log is here:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256308882.1256313397.6210.gz&fulltext=1
(it failed because of the output size limit, but after some of the required data)

That cycle actually didn't have this problem:

DEBUGGING BUG 523934: in doIt
DEBUGGING BUG 523934: timeout set
DEBUGGING BUG 523934: in AttrModifiedListener
DEBUGGING BUG 523934: AttrModifiedListener: set timeout
DEBUGGING BUG 523934: in AttrModifiedListenerContinuation
DEBUGGING BUG 523934: in FinishWaitingForTestEnd
DEBUGGING BUG 523934: AttrModifiedListenerContinuation: done waiting
REFTEST TEST-PASS | file:///builds/slave/mozilla-central-macosx-debug-unittest-everythingelse/build/reftest/tests/layout/base/crashtests/500467-1.html | (LOAD ONLY)
...
DEBUGGING BUG 523934: in boom
DEBUGGING BUG 523934: in AttrModifiedListener
DEBUGGING BUG 523934: AttrModifiedListener: set timeout
DEBUGGING BUG 523934: removed class attribute
DEBUGGING BUG 523934: in AttrModifiedListenerContinuation
DEBUGGING BUG 523934: in FinishWaitingForTestEnd
DEBUGGING BUG 523934: AttrModifiedListenerContinuation: done waiting
REFTEST TEST-PASS | file:///builds/slave/mozilla-central-macosx-debug-unittest-everythingelse/build/reftest/tests/layout/forms/crashtests/366537-1.xhtml | (LOAD ONLY)

It looks like Windows debug everythingelse didn't test that changeset (or hasn't yet).
(In reply to comment #4)
> The Linux debug everythingelse log for the above changeset is:
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256313337.1256320715.28178.gz&fulltext=1
> and it looks like none of the script in the testcases was executed at all.

Actually, that's not true. It looks like what happened is:

DEBUGGING BUG 523934: in doIt
[... lots of printing from a cycle collection ...]
REFTEST TEST-UNEXPECTED-FAIL | file:///builds/moz2_slave/mozilla-central-linux-debug-unittest-everythingelse/build/reftest/tests/layout/base/crashtests/500467-1.html | timed out waiting for reftest-wait to be removed (after onload fired)

But there was actually nothing printed for the second test, nor was the second dump in the first test ever hit. It seems like maybe having a cycle collection mid-script is putting things in a bad state?
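For anyone comparing these logs by hand: a minimal Python sketch that groups the debugging markers by the test result that follows them, which makes the difference between a good and a bad cycle easier to see. The marker prefix and the "REFTEST TEST-... | url | ..." line shape are taken from the excerpts in this bug; the function name and the rule of attributing each marker to the next result line are assumptions for illustration, not part of the reftest harness.

```python
import re

# Prefix used by the temporary debugging output quoted in this bug.
MARKER = "DEBUGGING BUG 523934: "

# Matches result lines such as:
#   REFTEST TEST-PASS | file:///... | (LOAD ONLY)
RESULT = re.compile(r"^REFTEST (TEST-[A-Z-]+) \| (\S+) \|")


def markers_per_test(lines):
    """Return a list of (test_url, status, markers) tuples.

    Assumption: every marker printed since the previous result line
    belongs to the test reported by the next result line.
    """
    results = []
    pending = []
    for line in lines:
        line = line.strip()
        if line.startswith(MARKER):
            pending.append(line[len(MARKER):])
            continue
        match = RESULT.match(line)
        if match:
            results.append((match.group(2), match.group(1), pending))
            pending = []
    return results
```

Run over the failing Linux debug log, the first test would show only "in doIt" before its TEST-UNEXPECTED-FAIL line, while the passing Mac cycle shows the full sequence through "done waiting".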
I tried to reproduce the problem on Linux by downloading the packages and running the same commands that the debug unit test box runs, and I didn't see the problem. I tried the same on Windows, but I couldn't get the executables to run ("Bad file number").
[2009-10-27 11:53:07] <bc> dbaron: mochitest doesn't set dom.max_script_run_time or dom.max_chrome_script_run_time do they?
[2009-10-27 11:53:43] <ted> i think mochitest does, yes
[2009-10-27 11:54:24] <bc> i thought i ran into it during a valgrind run and when looking didn't see it.
[2009-10-27 11:54:37] <bc> automation.py does, but mochitest doesn't do the same thing.
[2009-10-27 11:54:38] <ted> i don't think reftest does (or i'm not sure)
[2009-10-27 11:54:47] <bhearsum> mochitest uses automation.py, doesn't it?
[2009-10-27 11:54:55] <ctalbert> yes
[2009-10-27 11:54:58] <ted> bc: i think you have your test suites backwards
[2009-10-27 11:54:58] <bc> only for crash checking iirc
[2009-10-27 11:55:10] <bc> beer?
[2009-10-27 11:55:24] <bhearsum> always
[2009-10-27 11:55:42] <ted> http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#237
[2009-10-27 11:55:50] <ted> mochitest uses that
[2009-10-27 11:55:58] <ted> http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftest.py#62
[2009-10-27 11:56:02] <ted> reftest doesn't set that pref
[2009-10-27 11:56:20] <ted> dbaron: we could try setting that pref in the reftest profile and see if that fixes crashtest
Landed a possible fix: http://hg.mozilla.org/mozilla-central/rev/1872ea3e540a , thanks to bc and ted.
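The changeset itself is not reproduced in this bug, so the following is only a sketch of the idea from the IRC discussion: write the two script-timeout prefs into a test profile's user.js before launching the browser. The pref names come from the log above; the value 0 (no slow-script timeout), the helper names, and the choice of appending to user.js are assumptions for illustration and may not match what runreftest.py actually does.

```python
import os

# Pref names are from the IRC log; the value 0 is an assumption
# (it is the conventional "no slow-script timeout" setting).
SCRIPT_TIMEOUT_PREFS = {
    "dom.max_script_run_time": 0,
    "dom.max_chrome_script_run_time": 0,
}


def format_pref(name, value):
    """Render one pref as a user.js line (hypothetical helper)."""
    if isinstance(value, bool):
        literal = "true" if value else "false"
    elif isinstance(value, int):
        literal = str(value)
    else:
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        literal = '"%s"' % escaped
    return 'user_pref("%s", %s);' % (name, literal)


def write_prefs(profile_dir, prefs):
    """Append the given prefs to user.js in profile_dir; return the file path."""
    path = os.path.join(profile_dir, "user.js")
    with open(path, "a") as prefs_file:
        for name in sorted(prefs):
            prefs_file.write(format_pref(name, prefs[name]) + "\n")
    return path
```

With the timeout disabled, a long pause in the middle of a test script (such as the debug-build cycle collection seen in the Linux log) should no longer cause the script to be interrupted, which would be consistent with the symptoms above.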
I think this fixed it. I've unhidden Linux debug everythingelse, but there are still other issues uncovered on Windows and Mac. (Mac seems to be getting less and less stable.)
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee: nobody → dbaron