Closed Bug 660514 Opened 13 years ago Closed 13 years ago

cloning hg.m.o/users/prepr-ffxbld/mozilla-2.0 taking over 4 hours, still not finished

Categories

Product/Component: mozilla.org Graveyard :: Server Operations
Type: task
Priority: Not set
Severity: critical
Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 661828

People

(Reporter: mozilla, Assigned: nmeyerhans)

I'm trying to run a preproduction release to make sure bug 557260 doesn't break Firefox releases when it lands Tuesday morning. The first part of this is cloning a bunch of user repos in prepr-ffxbld. The clone has been timing out (>3600 seconds) several times since Friday afternoon, so today I decided to stop using the buildbot automation and do it manually:

[cltbld@moz2-linux-slave51 build]$ ssh -l prepr-ffxbld -oIdentityFile=~cltbld/.ssh/ffxbld_dsa hg.mozilla.org clone mozilla-2.0 releases/mozilla-2.0
Please wait. Cloning /releases/mozilla-2.0 to /users/prepr-ffxbld/mozilla-2.0

The manual clone has now been running for 4 to 5 hours and still hasn't completed.
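For reference, a rough way to reproduce this by hand with the same 3600-second limit the buildbot uses; this is just a sketch, assuming coreutils timeout is available on the slave (the key path and repo names are copied from the command above):

# Reproduction sketch; 3600 s mirrors the buildbot timeout mentioned above.
time timeout 3600 \
  ssh -l prepr-ffxbld -oIdentityFile=~cltbld/.ssh/ffxbld_dsa \
  hg.mozilla.org clone mozilla-2.0 releases/mozilla-2.0 \
  || echo "clone did not finish within 3600 seconds"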
Determining and fixing the root cause would be ideal. A short-term workaround would be cloning that user repo for me; at that point we can lower the priority on this bug.
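If someone with shell access on hg.m.o wants to do that, a server-side local clone should be quick, since hg hardlinks local clones made on the same filesystem. This is only a sketch; the /repo/hg/mozilla prefix is an assumption about the on-disk layout there:

# Server-side workaround sketch, run on the hg.m.o host itself.
# -U skips the working-copy checkout, which server repos don't need.
hg clone -U /repo/hg/mozilla/releases/mozilla-2.0 \
            /repo/hg/mozilla/users/prepr-ffxbld/mozilla-2.0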
Blocks: 557260
The clone finished overnight. I went AFK around 3:30 am PDT and I believe it wasn't done by that point. I'd love to know why the clone took over 9 hours.
Severity: major → normal
Assignee: server-ops → nmeyerhans
Raising priority, as this (and bug 661828, which is probably a duplicate) is killing our ability to quickly port and test mobile releases, which we're trying to do by Friday.
Severity: normal → critical
Rail says this first started 3 to 4 weeks ago, and it doesn't seem to have resolved itself over that time. Hopefully that regression window is helpful.
We've found a likely culprit: disk contention due to filesystem backups. It seems that backups weren't being made roughly 4 to 5 weeks ago because of a hardware failure affecting the backup host; the hardware was repaired roughly 3 to 4 weeks ago. Backups are scheduled to start at 1 AM Pacific time and have recently been taking more than 7 hours to complete.

We've cancelled the currently running backup job, which should free up I/O capacity and let any in-flight hg operations complete in a reasonable amount of time. We still need to revisit how we back up these filesystems, though. If we can find a time window when releng isn't making heavy use of hg, we can at least reschedule the backups to run then, as sketched below.
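For example, a crontab entry along these lines on the backup host; this is purely illustrative, since the real backup job isn't named in this bug and the window would need to be chosen from actual hg load data:

# Hypothetical crontab sketch; /usr/local/sbin/backup-hg-fs is a placeholder
# for the real backup job, and 10:00 is an example start time only.
# m  h   dom mon dow  command
0    10  *   *   *    /usr/local/sbin/backup-hg-fs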
Duping forward, since all the action is on bug 661828.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Product: mozilla.org → mozilla.org Graveyard