Closed
Bug 795893
(b-linux64-hp-0028)
Opened 12 years ago
Closed 10 years ago
b-linux64-hp-0028 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: rail, Unassigned)
References
Details
(Whiteboard: [buildduty][buildslaves][capacity])
Per slavealloc it's "firmware patched, out of rotation for bug 779487". Neither the host itself nor its PDU (bld-centos6-hp-009-mgmt.build.mozilla.org) is responding.
Comment 1•12 years ago
Loaned to dgherman in bug 708381 for crazy testing.
Comment 2•12 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #1)
> Loaned to dgherman in bug 708381 for crazy testing.
Oops, meant bug 807381.
Comment 3•12 years ago
Now it's back, taking android builds and failing them with:
INFO: copying /home/cltbld/.android to /builds/mock_mozilla/mozilla-centos6-i386/root/builds/.android
ERROR: [Errno 2] No such file or directory: '/home/cltbld/.android'
Traceback (most recent call last):
  File "/usr/sbin/mock_mozilla", line 862, in <module>
    main(retParams)
  File "/usr/sbin/mock_mozilla", line 823, in main
    shutil.copy(src, dest)
  File "/usr/lib64/python2.6/shutil.py", line 84, in copy
    copyfile(src, dst)
  File "/usr/lib64/python2.6/shutil.py", line 50, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: '/home/cltbld/.android'
Comment 4•12 years ago
disabled in slavealloc
Comment 5•12 years ago
It seems to have reenabled itself, and is happily burning builds that it can't upload.
Comment 6•12 years ago
Comment 7•12 years ago
Comment 8•12 years ago
Callek says he redisabled it.
Comment 9•12 years ago
arr, do we have a reference image/machine for this class of machines?
Comment 10•12 years ago
New machines don't have reference images. They are completely managed via puppet once they do a basic kickstart.
Comment 11•12 years ago
So, I confirmed it is puppetizing correctly. I also noticed that new-puppet has no reference to .android right now, while the old centos5 puppet does: http://mxr.mozilla.org/build/source/puppet-manifests/os/centos.pp#73
I suspect this was intentional (the same way new puppet doesn't contain ssh keys), but I can't find any docs that say to add these files. rail, any insight (being someone involved with setting up ec2 slaves, which are based on this same puppet image)?
Flags: needinfo?(rail)
Comment 13•12 years ago
We don't manage the slave secrets yet; that's bug 792836. You need to copy ssh keys and android stuff (.android and .mozpass.cfg) from another slave.
Flags: needinfo?(rail)
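The manual step described in comment 13 could be backed by a preflight check before a slave is put back in rotation. The following is only a sketch under stated assumptions: the function name `missing_secrets` and the constant `REQUIRED_SECRETS` are hypothetical, the list contains only the two paths named in this bug, and ssh keys are left out because the thread does not give their locations.

```python
import os
import tempfile

# Entries named in comment 13, relative to the build user's home directory.
# (Hypothetical constant; ssh keys are omitted because their paths are not given.)
REQUIRED_SECRETS = (".android", ".mozpass.cfg")


def missing_secrets(home, required=REQUIRED_SECRETS):
    """Return the required entries that do not exist under `home`."""
    return [name for name in required
            if not os.path.exists(os.path.join(home, name))]


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real slave.
    with tempfile.TemporaryDirectory() as home:
        # Create only one of the two entries, then report what is still missing.
        open(os.path.join(home, ".mozpass.cfg"), "w").close()
        print(missing_secrets(home))
```

A non-empty result would mean the slave is not ready to take android builds, which is the condition that produced the IOError in comment 3.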
Comment 14•12 years ago
Enabled in slavealloc after I verified that it did, indeed not have ~cltbld/.android/ and ~cltbld/.mozpass.cfg
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 15•12 years ago
Buildbot is not running on this host, hasn't been since this morning...
2013-02-04 07:50:29-0800 [Broker,client] argv: ['python', 'tools/buildfarm/maintenance/count_and_reboot.py', '-f', '../reboot_count.txt', '-n', '1', '-z']
2013-02-04 07:50:29-0800 [Broker,client] environment: {'LANG': 'en_US.UTF-8', 'CCACHE_HASHDIR': '', 'TERM': 'linux', 'SHELL': '/bin/bash', 'SHLVL': '1', 'HOSTNAME': 'bld-centos6-hp-009.build.scl1.mozilla.com', 'G_BROKEN_FILENAMES': '1', 'HISTSIZE': '1000', 'HISTCONTROL': 'ignoredups', 'PWD': '/builds/slave/m-cen-lnx-l10n-ntly', 'LOGNAME': 'cltbld', 'USER': 'cltbld', 'MAIL': '/var/spool/mail/cltbld', 'PATH': '/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin', 'LESSOPEN': '|/usr/bin/lesspipe.sh %s', 'HOME': '/home/cltbld', '_': '/tools/buildbot/bin/python'}
2013-02-04 07:50:29-0800 [Broker,client] using PTY: False
2013-02-04 07:50:37-0800 [-] Received SIGTERM, shutting down.
2013-02-04 07:50:37-0800 [-] stopCommand: halting current command <buildslave.commands.shell.SlaveShellCommand instance at 0x29a53b0>
2013-02-04 07:50:37-0800 [-] command interrupted, attempting to kill
2013-02-04 07:50:37-0800 [-] trying to kill process group 53952
2013-02-04 07:50:37-0800 [-] signal 9 sent successfully
2013-02-04 07:50:37-0800 [Broker,client] lost remote
...
2013-02-04 07:50:37-0800 [Broker,client] lost remote step
2013-02-04 07:50:37-0800 [Broker,client] Lost connection to buildbot-master13.build.scl1.mozilla.com:9001
2013-02-04 07:50:37-0800 [Broker,client] Stopping factory <buildslave.bot.BotFactory instance at 0x2a24c68>
2013-02-04 07:50:37-0800 [-] Main loop terminated.
2013-02-04 07:50:37-0800 [-] Server Shut Down.
[cltbld@bld-centos6-hp-009 ~]$ date
Mon Feb 4 10:39:12 PST 2013
[cltbld@bld-centos6-hp-009 ~]$ uptime
10:39:15 up 2:47, 1 user, load average: 0.00, 0.00, 0.00
Which means buildbot didn't come up right after the reboot somehow (the host went down for a normal reboot).
[cltbld@bld-centos6-hp-009 ~]$ facter fqdn
bld-centos6-hp-009.build.scl1.mozilla.com
Puppet Dashboard says its last run was successful (2013-02-04 07:34 PST).
A fresh manual reboot fixed it.
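The reasoning in comment 15 can be checked with a little arithmetic: subtracting the reported uptime from the time of the `uptime` call gives an estimated boot time, which can then be compared with the shutdown timestamp in the slave log. This is a sketch using only the values quoted above; the variable names are illustrative, and since `uptime` reports whole minutes the result is only accurate to about a minute.

```python
from datetime import datetime, timedelta

# Values copied from the transcript above (log timestamps are -0800, i.e. PST).
shutdown = datetime(2013, 2, 4, 7, 50, 37)   # "Server Shut Down." in the slave log
checked = datetime(2013, 2, 4, 10, 39, 15)   # wall-clock time printed by `uptime`
up_for = timedelta(hours=2, minutes=47)      # "up 2:47"

# Estimated boot time, and how long after the buildbot shutdown that was.
boot = checked - up_for
print("estimated boot time:", boot.time())
print("gap after shutdown:", boot - shutdown)
```

The estimated boot time falls roughly a minute and a half after the logged shutdown, which is consistent with a normal reboot after which buildbot did not start again.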
Updated•11 years ago
Product: mozilla.org → Release Engineering
Comment 16•11 years ago
No space left on device. Disabled in slavealloc.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 17•11 years ago
This got re-enabled by someone on April 25.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
Alias: bld-centos6-hp-009 → b-linux64-hp-0028
Summary: bld-centos6-hp-009 problem tracking → b-linux64-hp-0028 problem tracking
Comment 18•10 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=47948225&tree=Mozilla-Aurora
Error: unable to free 20.00 GB of space. Free space only 18.66 GB
Disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
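The free-space failure in comment 18 could be caught before a build starts with a check along these lines. This is a sketch, not the purge tooling that produced the error: `free_gb` and `shortfall_gb` are hypothetical helpers, and treating "GB" as 1024**3 bytes is an assumption, since the log does not say which unit the tool uses.

```python
import shutil

# Assumption: "GB" in the log means 1024**3 bytes; the log itself does not say.
BYTES_PER_GB = 1024 ** 3


def free_gb(path):
    """Return the free space on the filesystem containing `path`, in GB."""
    return shutil.disk_usage(path).free / BYTES_PER_GB


def shortfall_gb(required_gb, available_gb):
    """Return how many GB are still missing (0.0 if there is enough space)."""
    return max(0.0, required_gb - available_gb)


if __name__ == "__main__":
    # Figures from the error message in comment 18.
    print("short by %.2f GB" % shortfall_gb(20.00, 18.66))
    # Live check against the current directory's filesystem.
    print("free here: %.2f GB" % free_gb("."))
```

With the figures from the log, the slave was 1.34 GB short of the 20 GB the build required.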
Comment 19•10 years ago
Cleaned up for chemspills, re-enabled.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Comment 20•10 years ago
Please do not re-enable this slave. We are retiring linux hardware build slaves in bug 1106922.
Blocks: 1106922
Resolution: FIXED → WONTFIX
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard