Closed
Bug 1057549
Opened 10 years ago
Closed 10 years ago
series of local disk operations timeouts on win32 builders during release builds
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: hwine, Unassigned)
References
Details
(Whiteboard: [release-impacting])
At least 4 instances following the pattern:
- hg pull into shared from hg.m.o succeeds
- hg clone shared into builder's space succeeds
- hg update -C times out (40 min)
Occurred on:
- b-2008-ix-0127 http://buildbot-master82.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-beta-win32_repack_4%2F10/builds/21
- b-2008-ix-0117 http://buildbot-master85.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-beta-win32_repack_2%2F10/builds/17
- b-2008-ix-0104 http://buildbot-master85.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-beta-win32_repack_5%2F10/builds/18
- b-2008-ix-0172 http://buildbot-master82.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-beta-win32_repack_5%2F10/builds/19
Comment 1•10 years ago
One thing I noted to Hal was that I disabled all the b-2008-sm slaves today, so we might be using b-2008-ix machines of questionable pedigree. At the very least, these slaves may never have cloned mozilla-beta.
I pulled b-2008-ix-0172, and ran an |hg update -C| by hand on c:\builds\moz2_slave\rel-m-beta-w32_rpk_5-000000000\mozilla-beta. It took 13m to complete, but I don't know whether that would have been affected by the attempt in the failed job:
http://buildbot-master82.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-beta-win32_repack_5%2F10/builds/19/steps/run_script/logs/stdio
A subsequent |hg update -C| on the same dir completed in just a few seconds.
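For anyone repeating the by-hand measurement above, a minimal sketch of a timing wrapper follows. It assumes Python 3 is available on the machine and that `hg` is on the PATH; the function name `timed_run` is hypothetical, and the 40 minute default only mirrors the timeout seen in the failed jobs.

```python
import subprocess
import time


def timed_run(cmd, cwd=None, timeout_s=40 * 60):
    """Run cmd and return (status, elapsed_seconds).

    status is "ok" (exit code 0), "failed" (non-zero exit code),
    or "timeout" (the command was killed after timeout_s seconds).
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            cwd=cwd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        status = "ok" if proc.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so no cleanup here.
        status = "timeout"
    return status, time.monotonic() - start


# Hypothetical usage: time a cold update and then a warm one in the same dir.
# status, elapsed = timed_run(["hg", "update", "-C"], cwd=r"c:\path\to\mozilla-beta")
```

Running it twice against the same working directory gives the cold versus warm comparison described in this comment, without relying on a wall clock.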
Reporter | ||
Comment 2•10 years ago
Note that those boxes are using hg client version 1.9.1. While that is old, there are no mentions of share-related bugs being fixed in subsequent releases (we don't use the unshare feature, which did receive bug fixes). The client will be updated in bug 1056981.
Comment 3•10 years ago
FTR, the same happened for 31.1.0esr:
b-2008-ix-0126: http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr31-win32_build/builds/1
b-2008-ix-0164: http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr31-win32_build/builds/2
b-2008-ix-0164: http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr31-win32_build/builds/3
b-2008-ix-0103: http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr31-win32_build/builds/4
Updated•10 years ago
Summary: series of local hg update -C timeouts on win32 builders during ff32.0b9 build1 → series of local hg update -C timeouts on win32 builders during release builds
Reporter | ||
Comment 4•10 years ago
Does anyone have thoughts on how to "prime" these builders for all the various builds we have coming up over the next 2 weeks? Or do we just accept this as a possible issue on the first build from an idle branch?
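One possible shape for priming, offered only as a sketch: replay the same three checkout steps the jobs perform (pull into shared, clone into the builder's space, `hg update -C`) ahead of the release, so the slow first update happens outside a timed job. The function name `prime_commands`, the directory layout, and the URL in the usage note are all hypothetical; whether a warm-up actually avoids the timeouts is an assumption based on the second update in comment 1 finishing in seconds.

```python
import os


def prime_commands(hg_url, shared_repo, work_repo):
    """Return the hg command lines, in order, that mirror a job's checkout.

    Nothing is executed here; the caller decides how to run the commands
    (for example, under a timeout) and on which builders.
    """
    cmds = []
    if os.path.isdir(os.path.join(shared_repo, ".hg")):
        # Shared repo already exists: just bring it up to date.
        cmds.append(["hg", "-R", shared_repo, "pull", hg_url])
    else:
        # First use on this builder: clone without a working copy (-U).
        cmds.append(["hg", "clone", "-U", hg_url, shared_repo])
    if not os.path.isdir(os.path.join(work_repo, ".hg")):
        cmds.append(["hg", "clone", "-U", shared_repo, work_repo])
    # The step that timed out in the jobs listed in this bug.
    cmds.append(["hg", "-R", work_repo, "update", "-C"])
    return cmds
```

For example, `prime_commands("https://hg.example.invalid/releases/mozilla-beta", shared, work)` on a builder that has never cloned the branch yields a clone into shared, a clone into the work dir, and the final update.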
Reporter | ||
Comment 5•10 years ago
FTR, the same also happened for 24.8.0esr:
- b-2008-ix-0071 http://buildbot-master86.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr24-win32_repack_9%2F10/builds/2
- b-2008-ix-0109 http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr24-win32_repack_4%2F10/builds/3
NOTE: this was a timeout on the operation that followed a failed hg clone from hg.m.o into the shared area
- b-2008-ix-0004 http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-mozilla-esr24-win32_repack_4%2F10/builds/4
NOTE: this was a timeout on clobber
Reporter | ||
Comment 6•10 years ago
Per the postmortem meeting, extending this bug to include anything that looks like hung or slow local disk I/O, and changing the summary to reflect that. This also seems to be part of the general unhappiness of win32 builds, so it blocks bug 1026870.
On b-2008-ix-0168 for the TB 31.1.0 build, this occurred during a purge operation:
http://buildbot-master86.srv.releng.scl3.mozilla.com:8001/builders/release-comm-esr31-win32_repack_10%2F10/builds/0
Blocks: 1026870
Summary: series of local hg update -C timeouts on win32 builders during release builds → series of local disk operations timeouts on win32 builders during release builds
Whiteboard: [release-impacting]
Reporter | ||
Comment 7•10 years ago
On b-2008-ix-0109 for the TB 31.1.0 build, the repack (a local-disk-intensive operation) failed at the 40m timeout:
http://buildbot-master84.srv.releng.scl3.mozilla.com:8001/builders/release-comm-esr31-win32_repack_8%2F10/builds/1
Comment 8•10 years ago
The root cause was believed to be machines that were not upgraded to VS2013 correctly (a half-finished upgrade), which then proceeded to finish the upgrade during a build. See bug 1062877.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•7 years ago
Component: General Automation → General