Closed Bug 301969 Opened 19 years ago Closed 19 years ago

fix triton's continuing cvs checkout/resource issues

Categories

(Webtools Graveyard :: Tinderbox, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: chase, Assigned: coop)

triton is having problems lately with cvs checkout and general resource issues. triton is a Mac OS X 10.3.9 XServe dual G5 system. It builds Thunderbird Aviary1.0.1 and Trunk. The latest problem was seen on 7/24 at 17:15:
/usr/bin/make -f client.mk checkout
mozilla/build/autoconf/mozconfig2client-mk: fork: Resource temporarily unavailable
mozilla/build/autoconf/mozconfig2client-mk: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
checkout start: cvs -q -z 3 co -r AVIARY_1_0_1_20050124_BRANCH -P -D "07/24/2005 23:41 +0000" mozilla/client.mk mozilla/build/unix/modules.mk mozilla/build/unix/uniq.pl
/bin/sh: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
cvs [checkout aborted]: cannot fork: Resource temporarily unavailable
/bin/sh: line 1: mozilla/.mozconfig.out: No such file or directory
mozilla/client.mk:159: /.mozconfig.mk: No such file or directory
mozilla/client.mk:160: /build/unix/modules.mk: No such file or directory
make[1]: *** No rule to make target `/build/unix/modules.mk'. Stop.
make: *** [checkout] Error 2
End: Sun Jul 24 16:42:01 2005
Error: CVS checkout failed.
To reset the system, I stopped its build scripts, moved its checkout directories aside, killed all build-related processes, and rebooted it. System uptime before reboot was 6 days, 4:25 hours.
Status: NEW → ASSIGNED
I have 3 of these systems at work (Xserves) and they're pretty good about not using up resources like this; mine are all running Tiger, though. Do you see any hints in the console logs? system.log or console.log?
Probably worth noting triton is our second XServe. atlantia, our first, has not experienced this problem. Thanks Chris. Your suggestion of checking out the system logs is worth looking into. coop, can you check triton tomorrow (whether it's red or green on Aviary-1.0.1 or Thunderbird) to see if there are old CVS/SSH processes lying around and if there are any hints of problems in the system logs?
Assignee: chase → ccooper
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Looks like the memory modules on triton are experiencing some problem:
Jul 25 01:44:30 localhost kernel: WARNING: 2 parity errors corrected in DIMM5/J42
Jul 25 01:43:29 localhost kernel: WARNING: 2 parity errors corrected in DIMM5/J42
There are errors like this going back as far as there are system logs (1 week).
CVS checkout processes for the trunk build (Tb-Trunk) are failing but not exiting on the mozilla/extensions/spellcheck/myspell/dictionaries dir. To use a recent failed build as an example (http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1122406380.27282.gz&fulltext=1), the tinderbox scripts check out a series of tags/directories, the final one being:
cvs -q -z 3 co -P -A -D 07/26/2005 19:33 +0000 SeaMonkeyAll
? mozilla/extensions/spellcheck/myspell/dictionaries
? mozilla/extensions/xmlextras/base/src/nsRect.cpp
? mozilla/build/unix/thunderbird-config
? mozilla/build/unix/thunderbird-gtkmozembed.pc
? mozilla/build/unix/thunderbird-js.pc
? mozilla/build/unix/thunderbird-nspr.pc
? mozilla/build/unix/thunderbird-nss.pc
? mozilla/build/unix/thunderbird-plugin.pc
? mozilla/build/unix/thunderbird-xpcom.pc
? mozilla/embedding/components/printingui/src/mac/printpde/build
? mozilla/modules/plugin/samples/default/mac/_NullPlugin.rsrc
? mozilla/modules/plugin/samples/default/mac/build
? mozilla/gfx/gfx-config.h
? mozilla/xpfe/bootstrap/appleevents/mozillaSuite.rsrc
? mozilla/xpfe/global/buildconfig.html
cvs checkout: in directory mozilla/extensions/spellcheck/myspell/dictionaries:
cvs checkout: cannot open CVS/Entries for reading: No such file or directory
cvs [checkout aborted]: cannot open CVS/Tag: No such file or directory
Because the CVS processes hang instead of exiting, after several such build cycles, there will be lots of cvs processes left hanging around that look like this:
cltbld 821 0.0 -0.0 27912 644 p1 S 2:31PM 0:00.07 /bin/sh -c set -e; cvs_co() { set -e; echo "$@" ; "$@" 2>&1 | tee -a /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log; }; cvs_co cvs -q -z 3 co -P -r NSPRPUB_PRE_4_2_CLIENT_BRANCH -D "07/26/2005 21:31 +0000" mozilla/nsprpub; cvs_co cvs -q -z 3 co -P -r NSS_CLIENT_TAG mozilla/security/nss mozilla/security/coreconf ; cvs_co cvs -q -z 3 co -P -r ldapcsdk_50_client_branch -D "07/26/2005 21:31 +0000" mozilla/directory/c-sdk; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/accessible; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/chrome; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/db/sqlite3; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/ipc/ipcd; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/mail; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/modules/libbz2; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/modules/libmar; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/modules/libpr0n; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/other-licenses/7zstub/thunderbird; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/other-licenses/branding/thunderbird; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/security/manager; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/storage; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/toolkit; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" mozilla/tools/update-packaging; cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" SeaMonkeyAll; true; true;
cltbld 901 0.0 -0.0 27324 328 p1 S 2:32PM 0:00.01 tee -a /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log
cltbld 902 0.0 -0.1 28264 1508 p1 S 2:32PM 0:00.16 ssh cvs.mozilla.org -l cltbld cvs server
I've tried running these same commands by hand, and have confirmed that it is the final SeaMonkeyAll cvs command that is hanging. In particular, I have tried reworking the cvs_co {} bash function that is used by the tinderbox scripts. It seems that the STDERR redirection in the function (2>&1) is responsible for the hang.
i.e.
current command set that hangs:
cvs_co() { set -e; echo "$@" ; "$@" 2>&1 | tee -a /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log; };
cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" SeaMonkeyAll
working command set (no hang):
cvs_co() { set -e; echo "$@" ; "$@" | tee -a /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log; };
cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" SeaMonkeyAll 2>&1
I'm not sure whether updating the underlying software (bash, cvs, etc.) on this Mac might help this or not.
Note: the daily clobber release build will continue to work (at least until too many CVS processes get backed up), because it destroys the mozilla/ dir at the outset. If we wanted a quick and dirty solution, we could switch triton to producing clobber builds exclusively.
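For anyone wanting to check a build machine for the leftover processes described above, a rough sketch of a filter follows. count_lingering is a hypothetical helper, not part of the tinderbox scripts, and the patterns are assumptions based only on the process listing in this comment. The third sample line is made up to show a non-match.

```shell
#!/bin/sh
# Hypothetical helper: counts ps-style lines on stdin that look like leftover
# checkout helpers. The [c], [t], [s] brackets keep this awk command's own
# command line from matching when it is fed real ps output.
count_lingering() {
    awk '/[c]vs_co|[t]ee -a .*cvsco\.log|[s]sh cvs\.mozilla\.org/ { n++ } END { print n + 0 }'
}

# Two sample lines copied from the listing above, plus one invented
# unrelated line (a login shell) that should not be counted.
sample='cltbld 901 0.0 -0.0 27324 328 p1 S 2:32PM 0:00.01 tee -a /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log
cltbld 902 0.0 -0.1 28264 1508 p1 S 2:32PM 0:00.16 ssh cvs.mozilla.org -l cltbld cvs server
cltbld 950 0.0 0.1 27000 900 p1 S 2:40PM 0:00.02 -bash'

printf '%s\n' "$sample" | count_lingering
```

On a live machine the input would come from something like `ps auxww | count_lingering`; a count that keeps growing between build cycles would be consistent with the hang described here.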
(In reply to comment #5)
> I've tried running these same commands by hand, and have confirmed that it is
> the final SeaMonkeyAll cvs command that is hanging.
Thanks for narrowing this down, coop.
> In particular, I have tried reworking the cvs_co {} bash function that is used
> by the tinderbox scripts. It seems that the STDERR redirection in the function
> (2>&1) is responsible for the hang.
>
> i.e.
> current command set that hangs:
> cvs_co() { set -e; echo "$@" ; "$@" 2>&1 | tee -a
> /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log; };
> cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" SeaMonkeyAll
>
> working command set (no hang):
> cvs_co() { set -e; echo "$@" ; "$@" | tee -a
> /builds/tinderbox/Tb-Trunk/Darwin_7.9.0_Depend/cvsco.log; };
> cvs_co cvs -q -z 3 co -P -A -D "07/26/2005 21:31 +0000" SeaMonkeyAll 2>&1
I'm guessing in this second case the 2>&1 shell redirection gets interpreted by the parent command-line shell and not passed as an argument that ends up in "$@". Wrapping 2>&1 in double quotes probably will get the shell redirection passed as an input to cvs_co, but I suspect the hang would be reintroduced, as it's the interaction with either tee or the subshell that probably leads to the problem.
> I'm not sure whether updating the underlying software (bash,cvs,etc) on this Mac
> might help this or not.
Maybe, but maybe not. atlantia isn't having this problem and it should be running the same version of OS, utils, and Xcode, but OTOH it builds Firefox, which I don't think checks out the extensions/spellcheck/myspell/ directory that causes this problem in the first place.
> Note: the daily clobber release build will continue to work (at least until too
> many CVS processes get backed up), because it destroys the mozilla/ dir at the
> outset. If we wanted a quick and dirty solution, we could switch triton to
> producing clobber builds exclusively.
I agree. Let's use this as our fallback position when we stop making headway.
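The point about where the redirection is interpreted can be checked with a small sketch. This is an illustration under assumptions, not the tinderbox code: show_args, emit, log_inside and log_outside are hypothetical stand-ins, and the log files live in a temporary directory. It shows that an unquoted 2>&1 on the call never reaches "$@", and that with the redirection outside the function, stderr bypasses tee and so does not land in the log.

```shell
#!/bin/sh
# Illustrative sketch only; none of these helpers are tinderbox code.
dir=$(mktemp -d "${TMPDIR:-/tmp}/redir.XXXXXX")

# Hypothetical helper: prints how many arguments it received.
show_args() { echo "argc=$#"; }

# Unquoted 2>&1 is consumed by the calling shell; the function sees 2 args.
unquoted=$(show_args echo hello 2>&1)
# Quoted "2>&1" is just a string argument; the function sees 3 args.
quoted=$(show_args echo hello "2>&1")

# Hypothetical command that writes one line to each stream.
emit() { echo out; echo err >&2; }

# Redirection inside the function: stderr joins the pipe and reaches tee.
log_inside() { "$@" 2>&1 | tee -a "$dir/inside.log"; }
# No redirection inside the function: only stdout reaches tee.
log_outside() { "$@" | tee -a "$dir/outside.log"; }

log_inside emit >/dev/null
log_outside emit >/dev/null 2>&1

echo "unquoted: $unquoted, quoted: $quoted"
echo "inside.log lines: $(wc -l < "$dir/inside.log")"
echo "outside.log lines: $(wc -l < "$dir/outside.log")"
```

If this holds for the real cvs_co, the "working" variant changes which streams pass through tee, which fits the suspicion that the hang comes from the interaction with tee rather than from cvs alone.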
From mozilla/extensions/spellcheck/myspell/dictionaries/Attic/Makefile.in,v:
1.3
date 2005.07.16.18.45.11; author gandalf%firefox.pl; state dead;
branches;
next 1.2;
Two other files have dead dates as of this timestamp, as well. triton's Thunderbird trunk build first failed with the following issue on 7/17 11:48 (for a release build):
cvs checkout: in directory mozilla/extensions/spellcheck/myspell/dictionaries:
cvs checkout: cannot open CVS/Entries for reading: No such file or directory
cvs [checkout aborted]: cannot open CVS/Tag: No such file or directory
Process killed. Took 2 seconds to die.
End: Sun Jul 17 12:48:30 2005
Error: CVS checkout timed out.
Process killed. Took 0 seconds to die.
The "Process killed" text is from the Tinderbox scripts, which kill the make -f client.mk checkout process but most likely fail to kill the cvs_co shell function process. coop and I think extensions/spellcheck/myspell/dictionaries/ probably needs a .cvsignore file now that all of its files have been removed from HEAD. Still investigating, though.
We should not put a .cvsignore file there, otherwise the directory will remain in -P checkouts forever. And firefox does check out this directory. I think this is probably just an old version of "cvs"... is that machine using cvs 1.11? If not, you can emerge cvs 1.11 in fink.
Files in extensions/spellcheck/myspell/dictionaries/ were removed in a patch that landed in bug 295465.
(In reply to comment #8)
> We should not put a .cvsignore file there, otherwise the directory will remain
> in -P checkouts forever. And firefox does check out this directory. I think this
> is probably just an old version of "cvs"... is that machine using cvs 1.11? If
> not, you can emerge cvs 1.11 in fink.
triton:/builds/tinderbox cltbld$ which cvs
/sw/bin/cvs
triton:/builds/tinderbox cltbld$ cvs --version
Concurrent Versions System (CVS) 1.11.17 (client/server)
FWIW, there is no reason why removing all files from a directory should cause cvs hangs... we have done it in the past (not frequently, but I've done it more than once).
coop updated cvs before I tested what version was installed. It was 1.10 before. atlantia's cvs is 1.10, too.
I'm a bit confused. I don't see ./dictionaries dir in my local tree - I thought I removed it with .cvsignore. Lxr confirms that. Did I miss something?
gandalf, CVS never "deletes" directories from the repository, but it will remove them from the local copy if there are no files left in them. Apparently the cvs process on this machine is confused.
It removes them from the checkout if the "-P" command option (to checkout and update) is used.
just fyi,
10.3.9 cvs = 1.10
10.4.* cvs = 1.11, ocvs = 1.10 = 10.3.9 version
This is partially an internal Apple reason, the cvs wrappers functionality showed up for 1.10 and went away for 1.11 and well, that is why cvs lagged for a bit in Panther (10.3.9). You could try building your own cvs to see if that fixes the problem, I would be a little suspicious of the 1.10 version if I was having problems.
I removed the old prebuilt cvs package, and have rebuilt the cvs package using fink. We'll see whether a locally-built cvs package does us any favors. I've also installed ccache on triton to improve the turnaround time on these testing cycles.
You might also want to do a ktrace/kdump on the cvs command to track file i/o; that might give you a clue where this is hanging. If you don't track this down soon, I can try this on panther/tiger here at Apple to see if it's a bug in the OS.
(In reply to comment #19)
> If you don't track this down soon, I can try this on panther/tiger here at Apple
> to see if it's a bug in the OS.
This tickles me into thinking we could try to reproduce this on atlantia, too, by configuring it for a Thunderbird build. Just a thought for another vector.
(In reply to comment #20)
> (In reply to comment #19)
> > If you don't track this down soon, I can try this on panther/tiger here at Apple
> > to see if it's a bug in the OS.
>
> This tickles me into thinking we could try to reproduce this on atlantia, too,
> by configuring it for a Thunderbird build. Just a thought for another vector.
Sadly, it was nothing so exotic. I noticed in the logs that the tar command was failing to accept the -j flag (bzip2), although the command worked fine by hand. Further investigation uncovered that build-seamonkey-util.pl was adding "/bin:/usr/bin" to the beginning of $PATH on Darwin, so all the updated software in /sw/bin was being ignored. This is probably a holdover from when Darwin tools either didn't exist, or were less mature. Adding "/sw/bin" back to the front of the path seems to have fixed things; both of triton's builds are humming along now:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Thunderbird
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Aviary-1.0.1
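The $PATH ordering effect described above can be reproduced in isolation. A minimal sketch, assuming temporary directories as stand-ins for /sw/bin and /usr/bin and a hypothetical tool named mytool that exists in both:

```shell
#!/bin/sh
# Stand-in directories; on triton these would be /sw/bin and /usr/bin.
work=$(mktemp -d "${TMPDIR:-/tmp}/pathdemo.XXXXXX")
mkdir -p "$work/sw/bin" "$work/usr/bin"

# Hypothetical tool present in both places, reporting which copy ran.
printf '#!/bin/sh\necho new\n' > "$work/sw/bin/mytool"
printf '#!/bin/sh\necho old\n' > "$work/usr/bin/mytool"
chmod +x "$work/sw/bin/mytool" "$work/usr/bin/mytool"

# System directory prepended: the old copy shadows the updated one.
shadowed=$(PATH="$work/usr/bin:$work/sw/bin:$PATH"; mytool)

# Updated directory first: the new copy is found.
fixed=$(PATH="$work/sw/bin:$work/usr/bin:$PATH"; mytool)

echo "system dir first: $shadowed"
echo "updated dir first: $fixed"
```

This would also explain why commands behaved differently by hand than under the build scripts: an interactive shell and a script that rewrites $PATH can resolve the same command name to different binaries.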
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
The phantom dictionaries directory is bug 305665 - it started causing Camino bustage. The directory wasn't being checked out, it was being recreated at configure time due to a bad allmakefiles.sh.
Depends on: 305665
Component: Tinderbox Platforms → Tinderbox
Product: mozilla.org → Webtools
Product: Webtools → Webtools Graveyard