Closed Bug 801607 Opened 12 years ago Closed 10 years ago

Make EC2 instances less susceptible to "abort: No space left on device"

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Unassigned)

References

Details

(Keywords: sheriffing-untriaged)

Latest instance:
slave: bld-linux64-ec2-020
https://tbpl.mozilla.org/php/getParsedLog.php?id=16115372&tree=Mozilla-Inbound
{
========= Started clone build tools failed (results: 2, elapsed: 19 secs) (at 2012-10-15 05:45:08.363782) =========
hg clone http://hg.mozilla.org/build/tools tools
 in dir /builds/slave/m-in-lnx/. (timeout 1320 secs)
 watching logfiles {}
 argv: ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'tools']
 environment:
  CCACHE_HASHDIR=
  G_BROKEN_FILENAMES=1
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-linux64-ec2-020.build.aws-us-west-1.mozilla.com
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  MAIL=/var/spool/mail/cltbld
  PATH=/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/slave/m-in-lnx
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  USER=cltbld
  _=/tools/buildbot/bin/python
 using PTY: False
requesting all changes
adding changesets
adding manifests
adding file changes
added 3062 changesets with 6319 changes to 1063 files
updating to branch default
abort: No space left on device
program finished with exit code 255
elapsedTime=19.700516
========= Finished clone build tools failed (results: 2, elapsed: 19 secs) (at 2012-10-15 05:45:28.080959) =========
}
and:
s: bld-linux64-ec2-033
https://tbpl.mozilla.org/php/getParsedLog.php?id=16113771&tree=Mozilla-Inbound
{
========= Started clone build tools failed (results: 2, elapsed: 20 secs) (at 2012-10-15 04:16:27.915769) =========
hg clone http://hg.mozilla.org/build/tools tools
 in dir /builds/slave/m-in-lnx-dbg/. (timeout 1320 secs)
 watching logfiles {}
 argv: ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'tools']
 environment:
  CCACHE_HASHDIR=
  G_BROKEN_FILENAMES=1
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-linux64-ec2-033.build.aws-us-west-1.mozilla.com
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  MAIL=/var/spool/mail/cltbld
  PATH=/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/slave/m-in-lnx-dbg
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  USER=cltbld
  _=/tools/buildbot/bin/python
 using PTY: False
requesting all changes
adding changesets
adding manifests
adding file changes
added 3061 changesets with 6318 changes to 1063 files
updating to branch default
abort: No space left on device: /builds/slave/m-in-lnx-dbg/tools/lib/python
program finished with exit code 255
elapsedTime=20.961564
========= Finished clone build tools failed (results: 2, elapsed: 20 secs) (at 2012-10-15 04:16:48.912714) =========
}
Sounds like we need to bump buildSpace requirements due to mock overhead -- downloaded RPMs and mock chroot.
Having some initial cleanup on boot would be a good thing to do too.
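For illustration, a boot-time cleanup of the kind suggested here could look roughly like the sketch below. It is not the tooling actually used on these slaves: the function names, the oldest-first deletion policy, and the `skip` parameter are all assumptions made for the example. The idea is only to check free space under the build directory and remove the least recently modified build directories until a required amount is free.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3


def free_gib(path):
    """Return the free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / GIB


def dirs_oldest_first(base):
    """List the immediate subdirectories of `base`, oldest mtime first."""
    dirs = [p for p in Path(base).iterdir() if p.is_dir() and not p.is_symlink()]
    return sorted(dirs, key=lambda p: p.stat().st_mtime)


def purge_until_free(base, required_gib, get_free=free_gib, skip=()):
    """Delete old build directories under `base` until enough space is free.

    Hypothetical helper: `get_free` is injectable so the policy can be
    exercised without filling a disk, and `skip` names directories that
    must never be removed. Returns the names of the removed directories.
    """
    removed = []
    for d in dirs_oldest_first(base):
        if get_free(base) >= required_gib:
            break
        if d.name in skip:
            continue
        shutil.rmtree(d, ignore_errors=True)
        removed.append(d.name)
    return removed
```

Run at boot as something like `purge_until_free("/builds/slave", 7)`, this would give the first job after startup a better chance of meeting its space requirement. The path and the value 7 are taken from the log and discussion in this bug, but wiring them together this way is only a suggestion.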
Depends on: 712206
Severity: critical → major
Priority: -- → P2
Blocks: 807624, 807294, 798820
The EC2 instances have ~100G build space, and the ix hardware they're replacing has close to twice that. This could also be summarised as "update build space requirements to reality, since we've been getting away with them being wrong by having a lot of free space".
linux64 needs another gig - http://mxr.mozilla.org/build/source/buildbot-configs/mozilla/config.py#190 should be a 7 (inbound overrides the 6, and I don't think it's been starting these).
I bumped the build space for linux64 to 7 - http://hg.mozilla.org/build/buildbot-configs/rev/eebcc5ed460e
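As a rough sketch of why a per-branch override matters when bumping a default: if a branch carries its own value, changing the platform default has no effect on that branch. The dictionary layout, the names, the `example-branch` key, and its value of 8 are invented for illustration and are not the real buildbot-configs structure. Only the linux64 value of 7 comes from this bug.

```python
# Hypothetical layout, not the real buildbot-configs structure.
# Platform-wide default in GB; this bug bumps linux64 from 6 to 7.
PLATFORM_BUILD_SPACE_GB = {"linux64": 7}

# Illustrative override only: the real per-branch values are not stated here.
BRANCH_BUILD_SPACE_GB = {"example-branch": {"linux64": 8}}


def effective_build_space(platform, branch):
    """Return the build space (GB) a job on `branch` would actually require."""
    override = BRANCH_BUILD_SPACE_GB.get(branch, {})
    return override.get(platform, PLATFORM_BUILD_SPACE_GB[platform])
```

Under these assumptions, `effective_build_space("linux64", "example-branch")` returns the branch's own value, while any branch without an override picks up the bumped default.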
This is now in production.
Whiteboard: [sheriff-want]
Product: mozilla.org → Release Engineering
It doesn't look like there's an overarching problem to fix here...just the normal individual buildSpace bumps.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General