Closed
Bug 896718
Opened 11 years ago
Closed 11 years ago
Mac 10.6 & 10.7 test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1"
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dholbert, Assigned: emorley)
Attachments
(1 file)
patch (deleted), review+
Mac debug test logs (e.g. mochitest and JS reftest logs, at least) have multiple instances of...
> 14:21:35 INFO - unable to execute llvm-gcc-4.2: No such file or directory
> 14:21:35 INFO - error: command 'llvm-gcc-4.2' failed with exit status 1
...which don't seem to turn the run orange, but which do get highlighted by the log-highlighter (and hence confuse things when the log is orange for other reasons).
In any case, we probably shouldn't be invoking executables that don't exist.
I'm seeing this in opt & debug logs, for OS X 10.6 and 10.7, on at least mochitest-[1-5] runs. (It doesn't seem to affect OS X 10.8, though.)
Sample logs:
https://tbpl.mozilla.org/php/getParsedLog.php?id=25576506&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=25575411&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=25575637&tree=Mozilla-Central
Reporter
Updated•11 years ago
OS: Linux → Mac OS X
Hardware: x86_64 → All
Summary: Mac test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1" → Mac 10.6 & 10.7 test-runners have multiple instances of "unable to execute llvm-gcc-4.2: No such file or directory" / "error: command 'llvm-gcc-4.2' failed with exit status 1"
Comment 1•11 years ago
I don't think that we have any compiler installed in the test machines...
Comment 2•11 years ago
We don't, nor should we.
This is the issue of psutil failing to install/build... cc: aki and gps due to that
Comment 3•11 years ago
That's bug 893254.
Comment 4•11 years ago
Found in triage.
(In reply to Justin Wood (:Callek) from comment #2)
> We don't, nor should we.
>
> This is the issue of psutil failing to install/build... cc: aki and gps due
> to that
aki: thanks.
gps: how can we fix this so developers are not impacted? In case this is related to psutil, note that we explicitly and intentionally do not have build/compiler tools on our test machines.
Updated•11 years ago
Flags: needinfo?(gps)
Comment 5•11 years ago
AFAIK, short term we can make mozharness eat the errors (not log them or output them to the terminal), or turn off resource monitoring globally. Longer term we can either get specific configs per-platform per-jobtype that turn on/off resource monitoring as appropriate, and/or get psutil installing successfully everywhere.
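As a rough illustration of the longer-term "configs per-platform per-jobtype" option, a lookup table could decide whether resource monitoring runs for a given job. This is only a sketch: the table name, the platform and job-type strings, and the helper function are all hypothetical and do not correspond to real mozharness configuration.

```python
# Hypothetical switches keyed by (platform, job type). None of these
# names or values are real mozharness configuration keys; they only
# illustrate the per-platform, per-jobtype idea.
RESOURCE_MONITORING = {
    ("osx-10.6", "mochitest"): False,
    ("osx-10.7", "mochitest"): False,
}


def resource_monitoring_enabled(platform, job_type, default=True):
    """Return whether resource monitoring should run for this job.

    Combinations that are not listed fall back to `default`.
    """
    return RESOURCE_MONITORING.get((platform, job_type), default)


if __name__ == "__main__":
    for key in [("osx-10.6", "mochitest"), ("osx-10.8", "mochitest")]:
        print(key, resource_monitoring_enabled(*key))
```

A table like this keeps the decision in configuration rather than in the install code, at the cost of having to maintain it until psutil is available everywhere.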
Comment 6•11 years ago
When resource monitoring landed, I explicitly asked a bunch of people (RelEng + Sheriffs) if the errors in the logs, which don't turn the tbpl run orange, were acceptable until psutil is installed globally (bug 894950 and bug 893254 track that) and the consensus was "yes." So, resource monitoring landed despite it introducing errors in the logs.
I concede the errors are annoying. I would like to see them go away.
I would prefer they go away by installing psutil everywhere.
I don't want us to back out resource monitoring because we're actively using data it is providing. For example, bug 877054 is attempting to find the optimal parallel execution count of xpcshell tests for minimal wall execution time. Bug 895225 was filed to investigate why xpcshell tests are performing a lot more write I/O than most of us expected and appears to indicate xpcshell tests are I/O and not CPU bound, a complete surprise to many!
I don't want to create a fire drill for RelEng to install psutil everywhere. That being said, bug updates make it sound like work is being done on this front. If that's the case and it will be ready soon, then these errors will magically go away and this bug can be marked as a dupe. If the ETA isn't soon enough, then I suppose we can have mozharness swallow the error.
If we need to modify mozharness, I can do this since it was me who introduced the issue. But, I'd like to make sure I'm not implementing a workaround that will only be needed for a few days first.
Depends on: 859573
Flags: needinfo?(gps)
Comment 7•11 years ago
Ok, luckily it seems like tbpl is only picking up the
14:11:04 ERROR - Return code: 1
line as the errors we want to ignore.
We can use the run_command() success_codes to specify which exit codes we deem a success:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/script.py#l536
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/script.py#l645
Since we already have an 'optional' bool, we can pass a success_codes of [0, 1] if optional here:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/python.py#l240
That should turn the above log line into
14:11:04 INFO - Return code: 1
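To make the idea concrete, here is a minimal sketch of how a list of accepted exit codes can decide the level of the "Return code" log line. It is not the mozharness implementation: both helper functions and their signatures are hypothetical, and only the `success_codes` name and the "[0, 1] if optional" rule come from this comment.

```python
def return_code_log_line(return_code, success_codes=None):
    """Return (level, message) for a finished command.

    Hypothetical helper, not mozharness API: a return code listed in
    success_codes is reported at INFO, anything else at ERROR.
    """
    if success_codes is None:
        success_codes = [0]  # by default only 0 counts as success
    level = "INFO" if return_code in success_codes else "ERROR"
    return level, "Return code: %d" % return_code


def success_codes_for(optional):
    """Optional installs tolerate exit status 1; required ones do not."""
    return [0, 1] if optional else [0]


if __name__ == "__main__":
    for optional in (False, True):
        level, message = return_code_log_line(1, success_codes_for(optional))
        print("%s - %s" % (level, message))
```

With `optional` false, exit status 1 still produces an ERROR line; with `optional` true, the same status is logged at INFO, so the log parser has nothing to highlight.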
(If we had been picking up any other errors, we might have had to create a special error_list with a substr or regex to IGNORE, like this:
http://hg.mozilla.org/build/mozharness/file/a5daac81696b/mozharness/base/signing.py#l28
Luckily, we don't have to here.)
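For reference, the error_list alternative mentioned above could look roughly like the following sketch. The rule format (a `substr` or `regex` key plus a `level`) is modeled loosely on the linked file, but the level constants, the rules themselves and the `classify_line` helper are assumptions made for illustration, not the real mozharness definitions.

```python
import re

# Hypothetical level constants for this sketch.
IGNORE, INFO, ERROR = "ignore", "info", "error"

# Hypothetical rules, checked in order; the first match wins.
ERROR_LIST = [
    {"substr": "unable to execute llvm-gcc-4.2", "level": IGNORE},
    {"regex": re.compile(r"command '.*' failed with exit status \d+"),
     "level": IGNORE},
    {"regex": re.compile(r"^error:", re.IGNORECASE), "level": ERROR},
]


def classify_line(line, error_list=ERROR_LIST, default=INFO):
    """Return the level of the first rule matching `line`, else `default`."""
    for rule in error_list:
        if "substr" in rule and rule["substr"] in line:
            return rule["level"]
        if "regex" in rule and rule["regex"].search(line):
            return rule["level"]
    return default


if __name__ == "__main__":
    print(classify_line("unable to execute llvm-gcc-4.2: No such file or directory"))
    print(classify_line("error: command 'llvm-gcc-4.2' failed with exit status 1"))
```

Rule order matters in a design like this: the IGNORE rules must come before the generic `^error:` rule, otherwise the compiler failure would still be flagged.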
Assignee
Comment 8•11 years ago
Allow a return code of 1 when installing optional packages to prevent false positives in the log.
Attachment #780324 -
Flags: review?(aki)
Assignee
Updated•11 years ago
Assignee: nobody → emorley
Assignee
Updated•11 years ago
Status: NEW → ASSIGNED
Updated•11 years ago
Attachment #780324 -
Flags: review?(aki) → review+
Assignee
Comment 9•11 years ago
Assignee
Comment 10•11 years ago
Merged to production :-)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering