Closed Bug 660514
Opened 13 years ago, closed 13 years ago
cloning hg.m.o/users/prepr-ffxbld/mozilla-2.0 taking over 4 hours, still not finished
Categories: mozilla.org Graveyard :: Server Operations (task)
Tracking: Not tracked
Resolution: RESOLVED DUPLICATE of bug 661828
People: (Reporter: mozilla, Assigned: nmeyerhans)

Description
I'm trying to run a preproduction release to make sure bug 557260 doesn't break Firefox releases when it lands Tuesday morning.
The first part of this is cloning a bunch of user repos in prepr-ffxbld.
This has timed out (>3600 seconds) several times since Friday afternoon.
Today I decided to stop using the buildbot automation and do it manually:
[cltbld@moz2-linux-slave51 build]$ ssh -l prepr-ffxbld -oIdentityFile=~cltbld/.ssh/ffxbld_dsa hg.mozilla.org clone mozilla-2.0 releases/mozilla-2.0
Please wait. Cloning /releases/mozilla-2.0 to /users/prepr-ffxbld/mozilla-2.0
This has been running for 4-5 hours and still hasn't completed.
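(Editor's note: for reference, a minimal way to reproduce and measure the same server-side clone is to wrap the exact command above in `time`; this sketch assumes the same credentials and build slave environment described in this comment.)
# hypothetical timing run; `time` reports how long hg.mozilla.org takes to finish the server-side clone
[cltbld@moz2-linux-slave51 build]$ time ssh -l prepr-ffxbld -oIdentityFile=~cltbld/.ssh/ffxbld_dsa hg.mozilla.org clone mozilla-2.0 releases/mozilla-2.0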
Reporter
Comment 1•13 years ago
Determining and fixing the root cause would be ideal.
A short term workaround would be cloning that user repo for me, at which time we can lower the priority on this bug.
Reporter
Comment 2•13 years ago
The clone finished overnight.
I went afk around 3:30am-ish PDT and I believe it wasn't done by that point.
I'd love to know why the clone took over 9 hours.
Severity: major → normal
Updated•13 years ago
Assignee: server-ops → nmeyerhans
Reporter
Comment 3•13 years ago
Raising priority, as this (and bug 661828, which is probably a dup) is killing our ability to quickly port+test mobile releases, which we're trying to do by Friday.
Severity: normal → critical
Reporter
Comment 4•13 years ago
Rail says this first started 3-4 weeks ago, and it doesn't seem to have resolved itself over that time. Hoping that regression window is helpful.
Assignee
Comment 5•13 years ago
We've found a likely culprit: disk contention due to filesystem backups. It seems that backups weren't being made roughly 4-5 weeks ago due to a hardware failure affecting the backup host. The hardware was repaired roughly 3 to 4 weeks ago. Backups are apparently scheduled to start at 1 AM Pacific time and have recently been taking >7 hours to complete.
We've cancelled the currently running backup job, which should free up IO capacity to let any current hg operations complete in a reasonable amount of time. We've got to revisit how we back up these filesystems, though. I suspect that if we can find a time window when releng isn't making heavy use of hg, we can at least reschedule the backups to run during that time.
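(Editor's note: for illustration only, one way to move the backup job out of the busy window and deprioritize its disk IO is a cron entry wrapped in ionice/nice; the path, script name, user, and schedule below are hypothetical, since the actual backup setup on the host isn't described in this bug.)
# /etc/cron.d/hg-backup (hypothetical) -- run the filesystem backup outside the
# releng-heavy window, at idle IO priority and lowest CPU priority, so interactive
# hg operations aren't starved by the backup's disk reads.
# m  h  dom mon dow user command
  0  10 *   *   6   root ionice -c3 nice -n19 /usr/local/sbin/backup-hg-filesystems.sh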
Reporter
Comment 6•13 years ago
Duping forward, since all the action is on bug 661828.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard