Bug 489523
Opened 16 years ago
Closed 15 years ago
talos reboot (connection lost) error message should say that the real errors are above
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dbaron, Unassigned)
Details
This is a continuation of bug 474950.
The talos reboot error message at the end of every talos log is still very frequently confusing people into thinking that the problem is a network problem rather than a test failure.
I think the current message (I trimmed a few stars so it fits):
********************************************************************************
*** END OF RUN - NOW DOING SCHEDULED REBOOT; FOLLOWING ERROR MESSAGE EXPECTED **
********************************************************************************
should also say something like:
REAL ERROR MESSAGES, IF ANY, WILL BE BEFORE THE PREVIOUS "BuildStep started"
Especially on Windows builds, that previous "BuildStep started" line is quite a few lines up, so people often don't look that far up.
Comment 1•16 years ago
I dunno, I suspect that any warning message is just effectively hidden in the normal log output spewage, even with caps and stars. It's pretty common to scan up until seeing the first error, and suspecting that. Since we have (had?) other common sporadic failures that also manifest as "twisted.internet.error.ConnectionLost", it's really easy to see that and not look further.
Really the best fix would be to not report this in the log in the first place. Ideally, have code that explicitly watches for the connection to be dropped when it's expected to be (and reports an error if it's *not*).
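A rough sketch of that idea, in plain Python rather than real buildbot code: the names `Outcome` and `classify_disconnect` are hypothetical and exist only for illustration. The point is that a dropped connection is a success when a reboot was scheduled, and a connection that stays up in that case is the actual error.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    ERROR = "error"


def classify_disconnect(reboot_expected: bool, connection_dropped: bool) -> Outcome:
    """Decide whether a step's final connection state is an error.

    Hypothetical helper for illustration; not part of buildbot.
    """
    if reboot_expected:
        # A scheduled reboot must drop the connection. If the slave is
        # still connected, the reboot did not happen, so report that.
        return Outcome.OK if connection_dropped else Outcome.ERROR
    # For ordinary steps, a lost connection is a genuine failure.
    return Outcome.ERROR if connection_dropped else Outcome.OK
```

With something along these lines, the expected disconnect would never need to appear in the log at all; only the unexpected cases would.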
A second-best-that-i-hate-to-even-suggest would be to use some unexpected ascii art to really grab attention. Like:
This error is expected!
           |
           |
           |
           |
       \   |   /
        \  |  /
         \ | /
          \|/
Comment 2•16 years ago
> A second-best-that-i-hate-to-even-suggest would be to use some unexpected ascii
> art to really grab attention. Like:
>
> This error is expected!
That wouldn't have helped me. I understood that the error was expected, but I still assumed that the box was orange for that reason.
Comment 3•16 years ago
catlee: is it possible we could get something added to buildbot to facilitate rebooting the slaves as a post-build step without it winding up as a disconnection error like this? As useful as the auto-rebooting is, it clearly causes confusion that no amount of explanatory text is going to fix.
Comment 4•16 years ago
(In reply to comment #3)
> catlee: is it possible we could get something added to buildbot to facilitate
> rebooting the slaves as a post-build step without it winding up as a
> disconnection error like this? As useful as the auto-rebooting is, it clearly
> causes confusion that no amount of explanatory text is going to fix.
It might be... buildbot is pretty good about killing off processes after a build is done, which makes it hard to spawn tasks that are supposed to happen later. Two things I can think of that may work: have some kind of DisconnectStep that can run a given command and then disconnect the slave without complaining, or a disconnectOk flag on steps to indicate that it's OK if the slave disappears afterwards.
What might be easier is to modify the log output after we're done. So if the log ends with our reboot message followed by the twisted disconnection, then we strip one or both off. This feels a bit dirty to have to do though...
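As a minimal sketch of that log post-processing (an illustration only, not existing buildbot or tinderbox code): the function name `strip_expected_reboot_tail` is made up, and the sketch assumes that everything after the reboot banner is reboot noise. It only trims when the banner really is followed by the twisted disconnection, so a log that ends some other way is left untouched.

```python
REBOOT_BANNER = "END OF RUN - NOW DOING SCHEDULED REBOOT"
DISCONNECT_MARKER = "twisted.internet.error.ConnectionLost"


def strip_expected_reboot_tail(log_text: str) -> str:
    """Drop the reboot banner and the disconnection error that follows it.

    Hypothetical helper. Assumes nothing of interest is logged after
    the banner; returns the log unchanged if the pattern is not found.
    """
    lines = log_text.splitlines()

    # Find the last occurrence of the banner.
    banner_idx = None
    for i in range(len(lines) - 1, -1, -1):
        if REBOOT_BANNER in lines[i]:
            banner_idx = i
            break
    if banner_idx is None:
        return log_text

    # Only trim if the expected disconnection actually follows the banner.
    tail = lines[banner_idx + 1:]
    if not any(DISCONNECT_MARKER in line for line in tail):
        return log_text

    # Also drop the row of stars directly above the banner, if present.
    start = banner_idx
    if start > 0 and set(lines[start - 1].strip()) == {"*"}:
        start -= 1

    return "\n".join(lines[:start]) + ("\n" if start else "")
```

Stripping both pieces means the last thing in the log is the real test output, which is where people scanning upward for the first error will land.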
Comment 5•16 years ago
Maybe we can use the Graceful Shutdown feature once the 0.7.10p1 upgrade happens.
Updated•16 years ago
Component: Release Engineering: Talos → Release Engineering: Future
Comment 6•15 years ago
Oops.
I believe this is no longer an issue -- is that correct?
Reporter
Comment 7•15 years ago
Yeah, looks better to me. They seem to no longer be showing the error; the ones I checked were:
Linux tp4 mozilla-central:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1252514181.1252515848.9113.gz
Mac nochrome mozilla-central:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1252513075.1252515853.9130.gz
Windows dirty profile mozilla-central:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1252506830.1252515657.6694.gz
Comment 8•15 years ago
Cool. Resolving fixed.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 9•15 years ago
Moving closed Future bugs into Release Engineering in preparation for removing the Future component.
Component: Release Engineering: Future → Release Engineering
Assignee
Updated•11 years ago
Product: mozilla.org → Release Engineering