Closed Bug 859065 Opened 12 years ago Closed 12 years ago

Avoid "command timed out: 1200 seconds without output, attempting to kill" by providing an inner xpcshell timeout of 5 minutes

Categories

(Testing :: XPCShell Harness, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla23

People

(Reporter: Paolo, Assigned: Paolo)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

When an asynchronous xpcshell test times out, the only message we get is "command timed out: 1200 seconds without output, attempting to kill", and we see only the name of the last test file (not the test function) that was executed.

We can easily get more useful logs by forcing a shorter inner timeout for each xpcshell file (5 minutes seems long enough). This way, we see the entire log of the test that times out and can figure out more precisely what is going on. The timeout applies to each xpcshell file individually.

Example: https://tbpl.mozilla.org/php/getParsedLog.php?id=21441817&tree=Try&full=1

In this log, I can see that the "test_download_cancel_midway_restart" function is where the test actually hangs.
Attached patch: The patch (deleted)
See comment 0 for the patch description.
Assignee: nobody → paolo.mozmail
Status: NEW → ASSIGNED
Attachment #734346 - Flags: review?(jmaher)
I don't really understand this patch. Are we setting a 5 minute timer and if the test case is still running we time out?
(In reply to Joel Maher (:jmaher) from comment #2)
> I don't really understand this patch. Are we setting a 5 minute timer and
> if the test case is still running we time out?

Yes, we call do_throw when the timeout occurs, forcing the main test function to quit before the external watchdog terminates the entire xpcshell suite. I can add this as a comment to the patch if you think it makes things clearer.
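A minimal sketch of the inner timeout idea described in this reply, not the code from the attached patch. The names startInnerTimeout, INNER_TIMEOUT_MS, onTimeout, and the injected timers object are hypothetical. In the real harness the failure would be raised through do_throw, and the timer would presumably come from the platform rather than from setTimeout.

```javascript
// Hypothetical helper illustrating a per-file inner timeout.
// This is NOT the patch from this bug; all names here are assumptions.
const INNER_TIMEOUT_MS = 5 * 60 * 1000; // 5 minutes, as proposed in this bug

function startInnerTimeout(
  testName,
  onTimeout,
  // Timer functions are injectable so the sketch can run outside xpcshell.
  timers = {
    setTimeout: (fn, ms) => setTimeout(fn, ms),
    clearTimeout: (handle) => clearTimeout(handle),
  }
) {
  const handle = timers.setTimeout(() => {
    // Stand-in for do_throw: fail this test file so that it ends and its
    // log is reported before any outer watchdog fires.
    onTimeout(
      new Error(`${testName}: test timed out after ${INNER_TIMEOUT_MS} ms`)
    );
  }, INNER_TIMEOUT_MS);

  // The caller invokes the returned function when the test finishes normally.
  return () => timers.clearTimeout(handle);
}
```

A test runner would call startInnerTimeout before starting a test file and call the returned cancel function once the file completes, so only tests that are still running after 5 minutes are failed.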
So you assert that all tests will finish in 300 seconds instead of 1200 seconds? If this just shortens the failure time, we could adjust the buildbot scripts instead.
When the outer 20-minute timeout is hit, the entire test suite is terminated and we don't even see the output of the failing test file. With this patch, the outer timeout is ideally never hit, so we continue with the other tests and get the output from the test file that times out, for example the part between >>>>>>> and <<<<<<< in the log from comment 0.
Comment on attachment 734346: The patch

Review of attachment 734346:

This is good. Please ensure this runs well on the try server for all platforms. I would suggest running only xpcshell (not the other tests) and then retriggering the X jobs a few times each, to ensure this works well and hopefully to catch any error.
Attachment #734346 - Flags: review?(jmaher) → review+
Thank you for doing this :-)
Blocks: 778688
Target Milestone: --- → mozilla23
For the record, this patch catches timeouts in asynchronous tests, but it does not affect main thread hangs, which may still result in the message "1200 seconds without output". If I understand correctly, bug 597064 will handle this case.
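The distinction can be illustrated with a small sketch. It assumes the inner timeout is delivered as a timer callback on the same thread as the test, and it uses standard JavaScript setTimeout as an analogy rather than the xpcshell timer API. The function name timerFiresDuringHang is hypothetical.

```javascript
// Illustration only: a timer callback cannot run while the thread that
// would run it is busy. setTimeout stands in for the harness timer here.
function timerFiresDuringHang(hangMs) {
  let timeoutFired = false;
  const timer = setTimeout(() => { timeoutFired = true; }, 0);

  // Simulated main thread hang: a synchronous busy loop.
  const start = Date.now();
  while (Date.now() - start < hangMs) {
    // Spin without returning to the event loop.
  }

  // The timer was due long ago, but its callback has not had a chance to run.
  const firedDuringHang = timeoutFired;
  clearTimeout(timer);
  return firedDuringHang;
}
```

Under that assumption, a test that blocks the main thread never gives the timeout callback a chance to run, so only an external watchdog can stop it.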
Blocks: 597064
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 869638
Blocks: 889317