Closed Bug 1035660 Opened 10 years ago Closed 7 years ago

hgtool should *never* clone try

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: glandium, Unassigned)

References

(Blocks 1 open bug)

Details

Following is what I got on try today:

Reporting hg version in use
command: START
command: hg -q version
command: cwd: .
command: output:
Mercurial Distributed SCM (version 1.9.1)
command: END (0.06s elapsed)

Checking if share extension works
command: START
command: hg help share
command: cwd: c:\builds\moz2_slave\try-w32-0000000000000000000000
command: END (0.07 elapsed)

Updating shared repo
Reporting hg version in use
command: START
command: hg -q version
command: cwd: .
command: output:
Mercurial Distributed SCM (version 1.9.1)
command: END (0.06s elapsed)

Attempting to initialize clone with bundles
command: START
command: hg init c:\builds\hg-shared\try
command: cwd: c:\builds\moz2_slave\try-w32-0000000000000000000000
command: output:
command: END (0.07s elapsed)

Trying to use bundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/try.hg
command: START
command: hg unbundle https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles/try.hg
command: cwd: c:\builds\hg-shared\try
command: END (0.41 elapsed)

Using _rmtree_windows ...
Using bundles failed; falling back to clone
command: START
command: hg clone -U -r 1bd3b527a0d2b260fafdb69d34f600caef3d719e https://hg.mozilla.org/try c:\builds\hg-shared\try
command: cwd: c:\builds\moz2_slave\try-w32-0000000000000000000000
command: output:
adding changesets
adding manifests
adding file changes
added 192416 changesets with 1078104 changes to 161554 files
command: END (2545.02s elapsed)
(snip)

As can be seen above, cloning try directly is *extremely* slow, and I suspect it also doesn't help with the mercurial server load. The last resort for try should be to clone something like mozilla-central, from a bundle if necessary, and then pull the changeset from try.
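A minimal sketch of the fallback order proposed above, written as a function that only builds the command lines. The function name `plan_try_checkout` and its parameters are hypothetical and are not part of hgtool. The sketch assumes the base repository is mozilla-central and that the bundle, when given, is a mozilla-central bundle.

```python
def plan_try_checkout(revision, dest, bundle=None,
                      base_repo="https://hg.mozilla.org/mozilla-central",
                      try_repo="https://hg.mozilla.org/try"):
    """Return the hg commands to run, in order, as argv lists.

    Hypothetical helper: try itself is only ever pulled from, never cloned.
    """
    if bundle:
        # Seed an empty repository from a mozilla-central bundle.
        steps = [
            ["hg", "init", dest],
            ["hg", "-R", dest, "unbundle", bundle],
        ]
    else:
        # No bundle available: clone the (much smaller) base repo instead.
        steps = [["hg", "clone", "-U", base_repo, dest]]
    # In both cases, fetch only the requested changeset from try.
    steps.append(["hg", "-R", dest, "pull", "-r", revision, try_repo])
    return steps
```

With this ordering, the only traffic that reaches try is an incremental pull of the one requested changeset.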
OS: Linux → All
Hardware: x86_64 → All
I don't know why the unbundle failed, but I just want to clarify that the try.hg bundle is a symlink to mozilla-central.hg.
I don't know why the unbundle failed, but please note that this is quite a bad feedback loop:
- try is slow to respond, so updating the webheads takes time.
- Because the webheads take longer to update, it becomes more likely that buildbot (which polls regularly) picks up a change before that change has reached the webhead the slave pulls from. This happened a lot today. pulsebot had the problem too, or maybe a worse one, like some webheads not being updated at all because of timeouts.
- The unquoted part of the hgtool log *does* contain a failed pull with "unknown revision". What makes me suspect that some webheads were not updated at all is that a subsequent retry of the same changeset had its pull fail too. So while the first attempt had some weird failures and apparently had no previous try tree at all, the second did have one, and only started afresh because the pull failed.
- Since a pull failure caused (presumably) by try server load makes us start over with a new clone, we increase the try server load even more, happily feeding the feedback loop.

The try build in question:
https://tbpl.mozilla.org/?tree=Try&rev=1bd3b527a0d2
Component: Other → Tools
QA Contact: hwine
Ben, can we just get rid of try using your bundle trick while fixing this?
(In reply to Mike Hommey [:glandium] from comment #0)
 
> As can be seen above, cloning try directly is *extremely* slow, and I
> suspect it also doesn't help with the mercurial server load. The last resort
> for try should be to clone something like mozilla-central, from a bundle if
> necessary, and then pull the changeset from try.

This is exactly what is currently implemented. We never bundle try directly. Rather the try bundle is a symlink to the latest m-c bundle.

All repo tooling is currently shared and branch-agnostic.
(In reply to Hal Wine [:hwine] (use needinfo) from comment #4)
> This is exactly what is currently implemented. We never bundle try directly.
> Rather the try bundle is a symlink to the latest m-c bundle.

Except that, for some reason, that fails and we then clone try directly, which we shouldn't do.
IMO we should remove the "clone" code path and only use bundles + incremental pull.

We could accomplish this via extension magic. See http://article.gmane.org/gmane.comp.version-control.mercurial.devel/54872. This would keep the client-side logic (which needs to be implemented in buildbotcustom, mozharness, etc) dumb and simple: "just clone" or "just pull." Mercurial would do the right thing. The Mercurial people have been asking me if I want to finish that extension. I keep telling them I have no time :/
BTW, fun fact: 'hg unbundle http-url' is usually (much) slower than 'wget http-url; hg unbundle file'.
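One way to check that claim is to time both variants against the same bundle. The sketch below is only an illustration: `unbundle_variants` and `time_commands` are hypothetical names, and it assumes `hg` and `wget` are on the PATH.

```python
import subprocess
import time


def unbundle_variants(bundle_url, repo_dir, local_file):
    """Return the two command sequences to compare, as argv lists."""
    return {
        # Let Mercurial stream the bundle over HTTP itself.
        "direct": [["hg", "-R", repo_dir, "unbundle", bundle_url]],
        # Download to disk first, then unbundle the local file.
        "download_first": [
            ["wget", "-q", "-O", local_file, bundle_url],
            ["hg", "-R", repo_dir, "unbundle", local_file],
        ],
    }


def time_commands(commands):
    """Run the commands in order and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    for argv in commands:
        subprocess.check_call(argv)
    return time.monotonic() - start
```

For a fair comparison, each variant should run against a freshly created repository ('hg init'), since unbundling into a repository that already has the changesets does much less work.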
So a few things we can do here:

- Benchmark 'hg unbundle http://bundle' vs 'wget http://bundle; hg unbundle file' and switch client logic if it makes sense. This won't help the hg web heads at all, but if it's faster, then yay!

- Add some retry logic around failing to pull in a revision. hgtool clobbers local srcdir and shared repos way too easily, basically on any kind of failure. If possible we should distinguish between local issues that require a clobber and intermittent remote issues.

- Add a --never-clone option to hgtool and enable this on try, or perhaps enable this by default when --clone-by-revision is enabled.

But first, let's figure out where the major sources of load are and focus our efforts there.
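As a rough illustration of the retry and never-clone ideas in the list above, here is a sketch that retries a pull when the failure looks remote, and refuses to fall back to a clone when retries run out. Everything here is hypothetical and not existing hgtool code. In particular, the marker strings used to recognize a remote failure are an assumption, based only on the "unknown revision" pull failure mentioned earlier in this bug.

```python
import time

# Assumption: substrings that suggest an intermittent remote problem
# (e.g. a webhead that has not received the changeset yet).
TRANSIENT_MARKERS = ("unknown revision", "timed out")


class PullFailed(Exception):
    """Raised when retries are exhausted; the caller must not clone try."""


def is_transient(output):
    text = output.lower()
    return any(marker in text for marker in TRANSIENT_MARKERS)


def pull_with_retry(run_pull, attempts=3, delay=30.0, sleep=time.sleep):
    """run_pull() must return (returncode, output) for one 'hg pull'.

    Returns "pulled" on success, or "clobber" when the failure looks
    local and a clobber is justified. Raises PullFailed when every
    attempt failed with what looks like a remote issue.
    """
    for attempt in range(1, attempts + 1):
        code, output = run_pull()
        if code == 0:
            return "pulled"
        if not is_transient(output):
            return "clobber"
        if attempt < attempts:
            sleep(delay)
    raise PullFailed("pull still failing after %d attempts" % attempts)
```

The point of raising instead of returning "clobber" is that a loaded try server then causes a failed job rather than a fresh full clone, which is what feeds the loop described earlier.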
Component: Tools → General
hgtool is dead
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX