Closed Bug 106009 Opened 23 years ago Closed 23 years ago

PAC instantiation hangs Regxpcom Solaris nightly build packaging process

Categories

(Core :: XPCOM, defect)

Sun
SunOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla1.0

People

(Reporter: nbidwell, Assigned: dougt)

References

Details

(Keywords: helpwanted, Whiteboard: Needs to land on branch)

Attachments

(7 files)

As I write this, the latest Solaris nightly build on ftp.mozilla.org is from 10/15/2001. That was a week ago. (Not to mention that that build has a rather broken mail client for me...) Is this intentional?
Confirming and changing product to mozilla.org. CCing leaf@mozilla.org in hopes that more info might be out there. Related to bug 105981 or 105988?
Status: UNCONFIRMED → NEW
Component: Build Config → FTP - Staging
Ever confirmed: true
Product: Browser → mozilla.org
Nope, not related to those bugs. There hasn't been a sol26 nightly build log since Oct 15, which is weird. It's still in the crontabs on granite & aesir is up. Running the nightly script by hand to see what it turns up.
the 8am builds fail because IC is messing with the network and cvs hangs. the 8pm builds fail because the cvs process from 8am is still running. linux finishes because it starts at 4am before they start breaking things.
Well, I'm posting this with Solaris build 2001102222, so a nightly build was made last night.
Marking fixed (for lack of IC_screwed_us resolution).
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
And now there's an even newer nightly build up, so it seems everything is working correctly. Thank you. Now back to my regularly scheduled testing. BTW, what/who is IC?
Status: RESOLVED → VERIFIED
Reopening because Solaris nightlies aren't showing up again. The last available build seems to be 2001110210.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
John, aesir needs to be resurrected after the network massacre so that solaris nightlies can live again.
Assignee: seawood → antitux
Status: REOPENED → NEW
In addition to Solaris builds, source code isn't getting put onto the ftp server.
source balls are on branch, sol26 builds are on aesir, both down right now. antitux is supposed to get all our unix systems up by COB Friday so source tarballs and sol26 builds should start showing up Saturday morning at the latest.
Status: NEW → ASSIGNED
Closing since Solaris nightlies and source are both back on ftp. Thanks!
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Re-opening because the Solaris build on ftp is 2001122110. Happy holidays!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
someone broke the solaris build in nsPluginModule.cpp a while back. I don't know why the tinderboxen are green... could be someone hacked configure, it's an official-only problem, or a parallel build problem, or ???
Component: FTP - Staging → Build Config
Priority: -- → P3
Product: mozilla.org → Browser
Target Milestone: --- → mozilla0.9.8
btw - you shouldn't keep reopening the same bug. this is a completely different problem from the original reported bug so it makes things confusing. In the future, should the Solaris builds fail to appear on ftp.mozilla.org, it should be a new bug since it may or may not be the same problem.
Status: REOPENED → ASSIGNED
Tinderboxes are green because they use a) a more recent version of gcc (speedracer's 2.95.3 vs aesir's 2.7.2.1) or b) Forte (nebiros). Since we've dropped support for gcc 2.7.2.x, we should upgrade the compiler on aesir.
John, what is the plugin problem on solaris? There is currently an open bug that is tracking issues on a variety of platforms: http://bugzilla.mozilla.org/show_bug.cgi?id=106806. If this is also a solaris problem, then this bug should be updated, and we should probably also change the platform to ALL unix, since linux suffers from this as well (the .so.1 issue).
*** Bug 117712 has been marked as a duplicate of this bug. ***
reassigning to asasaki since antitux is busy with other work right now. Aki - Can you upgrade gcc on aesir to 2.95.3? You'll need to coordinate with lpham to make sure the build automation is looking in the right place to find the new compiler once it's in place. Thanks.
actually reassigning this time...
Assignee: antitux → asasaki
Status: ASSIGNED → NEW
Installed in /opt/gcc-2.95.3 which is softlinked to /opt/gcc (so you shouldn't have to change anything in the env). Old /opt/gcc moved to /opt/gcc-2.95.2 which can be removed once everything is working. There are a lot of old cltbld processes on aesir... should I kill those?
Status: NEW → ASSIGNED
Yes, please kill them. Looks like they are the old builds. Thanks. Loan
There is now a nightly build for Jan 7 - so thanks! But, the tar file is only 3.5MB :( I've raised bug#118701 for this.
Hm, looks like it's *building* fine (able to run the package from aesir), but regxpcom is hanging, and it doesn't get past the packaging phase unless I kill that... Want to rerun it but I may hit the 8pm build...? Meanwhile, there's a new build up...
Again this is the case... build finishes, regxpcom hangs indefinitely after the "registering smime account manager extension" line. Once I kill regxpcom, the rest of the packaging and the push to the ftp site happens. Anyone have any idea why regxpcom might be hanging on aesir?
IIRC regxpcom is hanging in smime, right? Paste the regxpcom output into the bug, then check lxr or ask on #mozilla for who's been doing smime and regxpcom work and cc them here to get their input. This will probably need to be reassigned to an engineer to fix one or the other.
truss output showed that regxpcom was hanging in poll(). It appeared to occur after all of the components have been registered and the components.reg file had already been created.
./run-mozilla.sh ./regxpcom
*** Registering -venkman handler.
*** Registering -chat handler.
*** Registering x-application-irc handler.
*** Registering irc protocol handler.
*** Registering smime account manager extension.
Terminated
dougt, kaie -- what are your thoughts on regxpcom hanging after smime registration? TIA.
could you attach some stacktraces of the threads (probably just one thread) involved?
output from truss, sleeping in poll().
I think that output in comment 27 does not give a hint that it could have something to do with smime. From what I have seen on Unix optimized builds, that's always the exact output of the first run of a new build.
Probably true, but it doesn't get to the "terminated" bit until I kill the regxpcom process, which could be many hours after the smime line appears in the log...
Aki, to find out whether it is indeed a problem with the smime extension, or something else, could you please do the following?

As a first test, when the build has finished, just remove mailnews/extensions/smime/build/libmsgsmime.so

As another test, you could add another debugging output line to that module. In mailnews/extensions/smime/src/smime-service.js locate

SMIMEModule.registerSelf =
function (compMgr, fileSpec, location, type)
{
  dump("*** Registering smime account manager extension.\n");
  ....
}

and add some more dump output lines. For example, replace that complete function with:

SMIMEModule.registerSelf =
function (compMgr, fileSpec, location, type)
{
  dump("*** Registering smime account manager extension.\n");
  compMgr = compMgr.QueryInterface(Components.interfaces.nsIComponentManagerObsolete);
  dump("*** smime 2.\n");
  compMgr.registerComponentWithType(SMIME_EXTENSION_SERVICE_CID,
                                    "SMIME Account Manager Extension Service",
                                    SMIME_EXTENSION_SERVICE_CONTRACTID,
                                    fileSpec, location, true, true, type);
  dump("*** smime 3.\n");
  catman = Components.classes["@mozilla.org/categorymanager;1"].getService(nsICategoryManager);
  dump("*** smime 4.\n");
  catman.addCategoryEntry("mailnews-accountmanager-extensions",
                          "smime account manager extension",
                          SMIME_EXTENSION_SERVICE_CONTRACTID, true, true);
  dump("*** smime 5.\n");
}

If you build again and start, and we see the line with "*** smime 5", I think it can't be the smime extension.
Ok, done. Looks like it's regxpcom =) dougt -- do you want ownership of this bug? Or do you have recommendations as to who would be best able to fix it? thanks.
reassigning.
Assignee: asasaki → dougt
Status: ASSIGNED → NEW
does someone with a sun build want to look at this? pavlov, do you have a build that I could peek at?
Keywords: helpwanted
Target Milestone: mozilla0.9.8 → ---
over to pavlov.
Assignee: dougt → pavlov
Since Solaris builds are back and working, the summary should involve the problem that is left.
Summary: Where did the Sun Solaris nightly builds go? → Regxpcom hanging Solaris nightly build packaging process
There have been no new Solaris nightlies for 5 days now.... so I think they aren't working again (or whatever workaround was in place has stopped working).
I'm assuming that this is still the bug holding up nightly builds. If so, could someone please massage things by hand for a build newer than 1-15-2002? Thank you.
*** Bug 122813 has been marked as a duplicate of this bug. ***
My work firewall stops me pulling from CVS and I don't have the time to pull regular tarballs, so building it myself is out. I'm sure I'm not alone in being dependent on nightlies for my mozilla testing, so I'm guessing there are probably quite a few solaris issues going unnoticed until the nightlies come back, simply by virtue of there being fewer users out there running the latest codebase. The longer this goes on the bigger a deal it gets. Any feedback at all on progress to a fix would be welcome.
I just found out that there is a new nightly build available in http://ftp.mozilla.org/pub/mozilla/nightly/latest/, with build date 2002013122. At first it complained about not being able to find run-mozilla.sh, so I just copied the file from the previous nightly build (2002011510) and now it runs beautifully :)
The missing run-mozilla.sh is bug#122942, now fixed. Thanks to whoever did the manual update of the Solaris nightly (or fixed the problem) - now I can play with the new Page Info stuff :)
Before anyone gets too happy, it was the missing run-mozilla.sh that caused the nightly to get delivered. Since run-mozilla.sh didn't exist, regxpcom could not run and therefore could not hang.
Target Milestone: --- → Future
With all due respect, why is this bug being "futured"? Do the drivers not care about testing on Solaris? Yes, it's a minority platform, but so are SGI IRIX and HPUX, both of which have up-to-date nightlies.
Even if the root problem with Regxpcom isn't worth the effort to fix at this point, it seems like it should be possible to make a workaround that would allow nightlies to be distributed. It seems like an ugly hack to the build process, like automatically killing the Regxpcom process (or removing run-mozilla.sh, which allowed 2002013122 to be built), would work.
Or just turn off --enable-crypto on the solaris nightly's to see if that fixes it... --disable-crypto turns off MOZ_PSM which turns off BUILD_SMIME in mozilla/mailnews/extensions/Makefile.in All you have to do is set BUILD_PSM="FALSE" in ns/build/unix/verification/seamonkey-build in the sol26 stanza... line: if this fixes your problem, then someone needs to debug --enable-crypto and smime in the sol26 stanza. But in the meantime there will be nightly builds.
> Or just turn off --enable-crypto on the solaris nightly's to see if > that fixes it... I thought Aki confirmed in his comment 34 that it is NOT the crypto component?
I am not saying it is a crypto issue... I am saying if we want to get rid of smime from the build... the quickest and easiest way to do that is to turn off PSM in the nightlies... Since no PSM, no smime... and then regxpcom CAN'T have an issue trying to load it. btw that was line 869 for doing so. Then whoever is the champion of sol26 should figure out what is going on... just like I do when the darn hpux nightlies mess up.
Whatever the issue is, whether it's crypto or something else, this still should NOT be futured. Sure, it doesn't have the visibility of the wintel or linux platforms, but one of the main drivers for adopting a solution like Mozilla is that it is truly cross-platform. Inhibiting testing on a major unix platform does not bode well for this continuing. I wonder how many solaris bugs will go unreported between now and the 0.9.9 release if the nightlies remain hosed? Do we really want to suddenly see them all show up at that point rather than have them reported and fixed along the way from nightly trunk builds?
I am not saying FUTURED... I am trying to get you a nightly build. Granted I am suggesting turning off crypto to get you that (so you can test everything else). If I am reading this correctly you guys (who care about solaris) haven't had a nightly build in like forever. I will shut up, I don't care... I don't care if solaris nightly builds work or not. I don't care if crypto is on or not. I was just trying to suggest a way to get nightlies going again and to narrow down the problem and not leave it to AKI who hasn't touched a solaris build in "like forever". un-ccing myself, do whatever you want.
Don't know if this is relevant, but nebiros SunOS/sparc 5.7 Clobber seems to have been orange for ages. Is this the same problem? Also, i386 Solaris 2.6 nightlies seem to be being built fine, it's just the sparc ones (if that helps at all).
Pav - if there is too much on your plate right now to work on this bug, is there anyone else that could take a stab at it in the meantime?
We are going to try my suggestion for turning off BUILD_PSM in the sol26 builds. We are only going to do this for this weekend only. Hopefully we will get nightlies (remember they won't have PSM or smime) and then on Mon we will turn it back on. This will help us narrow down the issue
Umm, I think you're barking up the wrong tree with the PSM issue (but feel free to prove me wrong). regxpcom is hanging at the end of its run. components.reg has already been written out correctly. Removing smime from the components dir does not fix the hanging problem (tested manually on the day I ran truss). Does anyone know what is being poll'ed?
er, i'm sorry. i'm not sure why this bug is assigned to me. -> cls
Assignee: pavlov → seawood
Target Milestone: Future → ---
So, this is weird. Regxpcom works fine in a standalone xpcom build on sheep. If I build all of Mozilla (except crypto), I see the hang but according to the truss log, it's not hanging in poll any longer. It appeared to be hanging while processing some of the uconv libs. I used the following build options: --enable-extensions=default,irc --without-system-nspr --without-system-zlib --without-system-jpeg --without-system-png --without-system-mng --disable-debug --enable-optimize --disable-tests
With a debug build, I'm seeing the same hang when building all of Mozilla minus PSM. The trace shows that the poll() is coming from necko.

(gdb) bt
#0  0xfee9990c in _poll () from /usr/lib/libc.so.1
#1  0xfef1b22c in poll () from /usr/lib/libthread.so.1
#2  0xff08847c in PR_Poll (pds=0xcd138, npds=1, timeout=3500000) at ../../../../../mozilla/nsprpub/pr/src/pthreads/ptio.c:3963
#3  0xfdabd924 in nsSocketTransportService::Run (this=0xad068) at ../../../../mozilla/netwerk/base/src/nsSocketTransportService.cpp:469
#4  0xff224ae8 in nsThread::Main (arg=0x81500) at ../../../mozilla/xpcom/threads/nsThread.cpp:120
#5  0xff08a600 in _pt_root (arg=0xa6308) at ../../../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:214
(gdb)

Stepping thru gdb shows that the AutoRegister() call returned without any errors (ret = 0). The hang occurs during XPCOM shutdown. Or more specifically, after stepping thru NS_ShutdownXPCOM, it's hanging in nsTimerImpl::Shutdown(). It appears to spawn 2 more LWP threads when this occurs. One of those threads is the one shown in the stacktrace above. The extra threads are spawned when mThread->Join() is called from TimerThread::Shutdown().

Pavlov, back to you.
Assignee: seawood → pavlov
Severity: normal → critical
Keywords: helpwanted
Priority: P3 → --
Could someone clarify which version of Solaris has the problem? I am able to reproduce the regxpcom hang on one of our Solaris 8 boxes (and after removal of components.reg the problem can be repeated), but it is not reproducible on several other Solaris 7/8/9 boxes (note: I am testing the *same* build shared over NFS). This leads me to the idea that the problem may be solved by installation of appropriate Solaris patches. Did anyone try to investigate this? I am not sure which patches are necessary to fix the problem, but I would recommend trying patch 106541 for Solaris 7 http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=fpatches%2F106541&zone_32=libc.so.1 (it contains a fix for bug 4207080: hang in poll, application does not get notified of data on stream head). For Solaris 8, patch 108991 may be useful. It also has libc.so fixes; it is one of the patches installed on the system that does not have the problem, and too old a version of this patch is installed on the system that has the problem.
I wouldn't be surprised if our build system was in need of some patching. Aki - can you check the patch status on the system, and make sure it's up the latest and greatest patch cluster, as well as the patches mentioned above? cls - do you know of any reason we shouldn't upgrade, or any particular patches we should avoid?
I've heard that the very latest patch cluster from Sun introduces some instabilities. Pavlov and/or Roland would know specifically which one.
comment 59 makes no sense to me -- why would Join spawn threads? cc'ing wtc. /be
No clue what's going on. Both Solaris 2.7+latest patches and Solaris 2.8+latest patches (except Xsun patch 108652-47, we are still using rev -46) are working here...
this is solaris 2.6. downloading the latest recommended patches... which are from 2/5/02. i've had decent luck with the recommended patch bundles, so i'll install these and just keep an eye out for news on any bad patches.
Brendan asked: > comment 59 makes no sense to me -- why would Join spawn threads? > cc'ing wtc I have no idea either. Sorry.
patch cluster installed, aesir rebooted. we'll see if that fixes the 8pm build.
regxpcom is still hanging and the recommended patch cluster had a libc fix, can't find anything else in the 2.6 patchreport about it.
16 nights and new nightly build... could someone at least manually push one?
Grr. I meant: 16 nights and *no* new nightly build. Too many chocolate cookies for me today.
Can't we set up another machine for creating Solaris 2.7 or 2.8 nightly tarballs built with Sun Workshop?
Comment on attachment 64581 [details] output of `rm component.reg; truss ./regxpcom 2>&1 | tee > regxpcom.log` Can someone provide a log from a hang with `rm component.reg; truss -u :: ./regxpcom 2>&1 | tee > regxpcom.log`, please?
killed regxpcom, should be another package available on the site. there is no -u option available in our version of truss... did you want the same output, but more recent, or different output? also, I believe I accidentally added the ">" to the tee comment, which shouldn't be there... ignore it.
Thanks for the new nightly build. Any chance of putting "kill regxpcom" in a cron job?
no. this bug has to be fixed.
roland: even on a non-hanging node, truss -u results in a 500M+ log file. On the node with problems it never stops growing. Back to the idea about solaris patches: I tried one more solaris 2.8 system that also did not have the regxpcom hang. However, it has a strict subset of the patches installed on the node with problems :( Therefore either the problem is introduced by one of the additional patches or it is somewhere else in the environment.
There's a new Solaris build: 2002-02-25-21-trunk/mozilla-sparc-sun-solaris2.6.tar.gz but it doesn't start because of bug 127817. If 127817 is related to security code as suggested in 127817 comment #3, does it mean that it's security stuff that's stopping Solaris builds normally?
May be a red herring, but take a look at the gzipped truss output file I attached to bug 129567 - Is this related? If it looks similar, then maybe we can compare patch revs or something... see if we can find a patch that if applied causes the problem and can be backed out to make it go away?
Any possibility of manually getting another nightly Solaris build uploaded (or at least deleting the current broken one)? The most recent Solaris nightly is still the build from 20020225, which is broken due to bug 129749. We've had a couple of duplicates of that long since fixed bug because that build is the only Solaris nightly available.
I have compressed the file because it is rather large.
not sure if my attempt earlier in the week to get a new build up worked well or not (killed all the regxpcom procs and i think the various build procs interfered with each other), but there's one from today.
We are building RPMs on FreeBSD 4.3, RedHat 7.1 & 7.2 and Solaris 2.6, 7 & 8, and I have seen the problem with the hanging regxpcom many times on Solaris. I have been starting regxpcom through truss and strace and got huge logfiles, so if someone is interested, let me know.

Workaround: Our SPEC-file (RPM) places this script in the '.../mozilla/dist/bin'-directory. I tested it today when building 0.9.9: regxpcom ran from 7 to 42s on FreeBSD, RedHat and Solaris 2.6, and ended normally. On Solaris 7 & 8 it was killed after the timeout and ran about 3s the second time.

#!/app/cueshell/bin/cueshell
# This is a bash-alias
dist_bin=`dirname $0`
MOZILLA_FIVE_HOME=$dist_bin
LD_LIBRARY_PATH=$dist_bin:$LD_LIBRARY_PATH
export MOZILLA_FIVE_HOME LD_LIBRARY_PATH
case `uname -s` in
SunOS)
  echo "`date`: Starting regxpcom"
  ( $dist_bin/regxpcom; echo "`date`: regxpcom done.") &
  waiting=0
  while [ $waiting -lt 1800 ]; do
    if ps -p $! >/dev/null ; then
      waiting=`expr $waiting + 30`
      sleep 30
      echo "`date`: Waited $waiting seconds for regxpcom"
    else
      echo "`date`: Waiting done."
      waiting=1800
    fi
  done
  if ps -p $! >/dev/null ; then
    echo "`date`: Kills regxpcom "
    /usr/sbin/fuser -k $dist_bin/regxpcom
    echo "`date`: Restarting regxpcom"
    $dist_bin/regxpcom; echo "`date`: regxpcom done."
  fi
  ;;
*)
  echo "`date`: Starting regxpcom"
  $dist_bin/regxpcom; echo "`date`: regxpcom done."
  ;;
esac
$dist_bin/regchrome
touch $dist_bin/chrome/user-skins.rdf $dist_bin/chrome/user-locales.rdf
I've been troubleshooting this using the 20020315xx nightly, and I think I have some useful information.

First of all, if components/nsProxyAutoConfig.js is removed from an installed copy of mozilla, then regxpcom will run to completion and exit as it should. However, regxpcom isn't hanging while registering this component; it's hanging in the call to NS_ShutdownXPCOM() just before regxpcom exits. It appears that nsProxyAutoConfig.js causes an nsDNSService thread to be created, which in turn creates a TimerThread. Later at shutdown time, xpcom tries to kill the timer thread, but it isn't dying.

I'm going to attach a copy of the /usr/proc/bin/pstack output for a well-hung regxpcom instance. You'll note the following:

1) thread #1 is performing NS_ShutdownXPCOM() and is waiting for a _thrp_join() call to complete. This is actually a pthread_join() call in the source. I think the '6' in the _thrp_join() argument list means thread #6.
2) lwp #1/thread #6 is within a TimerThread::Run call, blissfully waiting for a call to pthread_cond_wait() to complete.
3) thread #5 is within a nsDNSService::Run call. According to truss, thread 5 was spawned by thread 4, which is inside an nsSocketTransportService::Run call.

I have trusses from running regxpcom with and without the proxy autoconfig component present. When it's not present, regxpcom never gets beyond four threads; #5 and #6 are never created. The trusses are quite large so I won't attach them.
my bet is that the problem is: 235 var PacMan = new nsProxyAutoConfig() ;
Assignee: pavlov → gagan
Component: Build Config → Networking
QA Contact: granrose → benc
Summary: Regxpcom hanging Solaris nightly build packaging process → PAC instantiation hangs Regxpcom Solaris nightly build packaging process
Keywords: helpwanted, qawanted
I doubt it, since that will just call this nothing function: 55 function nsProxyAutoConfig() {}; Is it possible that some other component is causing network activity, which is in turn causing the proxyautoconfig stuff to get kicked off? If that's the case, we probably need regxpcom to do more mozilla-like things in its shutdown process. Cc:ing Jud, because embedders on Solaris might well run into this problem as well, if they don't do the shutdown perfectly.
FWIW: PAC download is triggered whenever the PAC preference is modified. see nsProtocolProxyService::PrefsChanged.
(wonders if the bug on nsDNSshutdown leaking, which he can't find, is related)
regxpcom and InitXPCOM does not create any event queue for the main thread. /me wonders if the timer or DNS threads require one present? attaching hack to test this theory.
A detailed truss suggests that there may be a race condition within TimerThread (xpcom/threads/TimerThread.cpp). TimerThread::Shutdown() is running before TimerThread::Run(). This is breaking the method that Shutdown() uses to tell Run() to exit. Shutdown() checks a condition variable and a flag:

// notify the cond var so that Run() can return
if (mCondVar && mWaiting)
  PR_NotifyCondVar(mCondVar);

but Run() hasn't been called yet so the test fails. Shutdown() falls through and eventually calls nsThread::Join() to harvest the Run() thread. Some time later, the Run() thread starts executing, and eventually goes to sleep on PR_WaitCondVar(mCondVar). Deadlock.

I'm attaching a truss clip which illustrates the problem; the truss includes calls to libxpcom and libnspr4. I apologize for not including more data; these truss runs take a long time to complete and produce huge amounts of output. The one I'm excerpting is 41MB, for example.
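The ordering described above can be replayed in a small C++ sketch. This is a minimal model under stated assumptions, not the Mozilla source: RacyTimer, its members and the 100 ms deadline are hypothetical names introduced for illustration, standard C++ primitives stand in for the NSPR lock and condition variable, and a bounded wait stands in for the indefinite sleep so that the example terminates. Shutdown() is called before Run() on a single thread to force the bad interleaving deterministically.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the pre-patch logic; not the real TimerThread.
struct RacyTimer {
    std::mutex lock;
    std::condition_variable cond;
    bool processing = false;  // loop flag that Run() itself sets on entry
    bool waiting = false;     // plays the role of mWaiting

    // Clears the loop flag and notifies only if Run() is already parked.
    void Shutdown() {
        std::lock_guard<std::mutex> guard(lock);
        processing = false;
        if (waiting) cond.notify_one();
    }

    // Returns true if the loop ended because of Shutdown(), false if it was
    // still parked at the deadline (a real thread would sleep forever).
    bool Run(std::chrono::milliseconds giveUpAfter) {
        const auto deadline = std::chrono::steady_clock::now() + giveUpAfter;
        std::unique_lock<std::mutex> guard(lock);
        processing = true;  // overwrites whatever Shutdown() wrote earlier
        while (processing) {
            if (std::chrono::steady_clock::now() >= deadline) return false;
            waiting = true;
            cond.wait_until(guard, deadline);
            waiting = false;
        }
        return true;
    }
};

// Replays the bad ordering: Shutdown() first, Run() second.
bool RacyShutdownThenRunExits() {
    RacyTimer timer;
    timer.Shutdown();
    assert(!timer.waiting);  // nobody was parked, so nothing was notified
    return timer.Run(std::chrono::milliseconds(100));
}
```

In this model RacyShutdownThenRunExits() returns false: the notification was skipped because nobody was waiting yet, and Run() then parks with nothing left to wake it, which is the state a joiner would block on.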
If this works, we should fix the problem much cleaner by having InitXPCOM startup the event queue directory and Shutdown clean it up. See 135531.
I tried adding "sleep(1)" to TimerThread::Shutdown() just before the timerthread lock is acquired. This has the desired effect; the Shutdown() thread gives up its timeslice, giving the OS time to schedule the Run() thread. By the time Shutdown() wakes up, the Run() thread is in the state that Shutdown() expects. But of course this is just a hack, not a proper solution.

TimerThread uses a flag "mProcessing" to indicate whether TimerThread::Run() should keep going or not, but the logic isn't quite right. The flag is initialized false. Run() sets it true on entry, then keeps looping until it sees the flag become false. Shutdown() sets the flag back to false when it wants Run() to return. But if Shutdown() runs before Run(), then Run() can't tell that Shutdown() has already been called and already written to the flag.

The attached patch replaces the mProcessing flag with an mShutdown flag. This flag is initialized to false. It's set to true in Shutdown(). Run() never writes to this flag, but it keeps looping as long as the flag is false.

With either the added sleep() call or the mShutdown patch, regxpcom no longer hangs shutting down xpcom. Instead, the last few lines that it prints are as follows:

*** Registering irc protocol handler.
nNCL: registering deferred (0)
nNCL: registering deferred (0)
Getting service on shutdown. Denied.
ContractID: @mozilla.org/js/xpc/ContextStack;1
IID: {a1339ae0-05c1-11d4-8f92-0010a4e73d9a}
###!!! ASSERTION: Component Manager being held past XPCOM shutdown.: 'cnt == 0', file nsXPComInit.cpp, line 582
###!!! Break: at file nsXPComInit.cpp, line 582

As far as I can tell, this is an unrelated problem. It may be bug 135330 rearing its head; the source distribution I'm using is from 4/3/2002.
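The mShutdown idea can be sketched in standalone C++ as follows. This is only an illustration of the pattern described in the comment, not the attached patch: PatchedTimer, its members and the 100 ms deadline are hypothetical, standard C++ primitives replace the NSPR ones, and the bounded wait exists only so the example cannot hang. The key property is that the flag is written by Shutdown() alone, so a shutdown request made before Run() starts is still visible when Run() first checks it.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model of the patched logic; not the real TimerThread.
struct PatchedTimer {
    std::mutex lock;
    std::condition_variable cond;
    bool shutdown = false;  // plays the role of mShutdown; only Shutdown() writes it
    bool waiting = false;   // plays the role of mWaiting

    // Records the request permanently, then wakes Run() if it is parked.
    void Shutdown() {
        std::lock_guard<std::mutex> guard(lock);
        shutdown = true;
        if (waiting) cond.notify_one();
    }

    // Returns true if the loop ended because of Shutdown(), false if it was
    // still parked at the deadline.
    bool Run(std::chrono::milliseconds giveUpAfter) {
        const auto deadline = std::chrono::steady_clock::now() + giveUpAfter;
        std::unique_lock<std::mutex> guard(lock);
        while (!shutdown) {  // Run() only reads the flag, never resets it
            if (std::chrono::steady_clock::now() >= deadline) return false;
            waiting = true;
            cond.wait_until(guard, deadline);
            waiting = false;
        }
        return true;
    }
};

// Replays the ordering that used to hang: Shutdown() first, Run() second.
bool PatchedShutdownThenRunExits() {
    PatchedTimer timer;
    timer.Shutdown();
    assert(timer.shutdown);  // the request survives until Run() looks at it
    return timer.Run(std::chrono::milliseconds(100));
}
```

In this model PatchedShutdownThenRunExits() returns true without ever waiting, because Run() sees the flag already set on its first check. Since the flag and the wait are both handled under the same lock, a Shutdown() that arrives after Run() has parked finds waiting set and issues the notification.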
the getting @ shutdown is bug 134728 in general, shutdown problems have *many* bugs, although searching for bugs filed by me is a good start.
->Doug
Assignee: gagan → dougt
Comment on attachment 78085 [details] [diff] [review] Proposed TimerThread.cpp, TimerThread.h patch r=dougt. Thanks for fixing this.
Attachment #78085 - Flags: review+
brendan, can you super review? You are on cvs blame for a lot of this code.
Target Milestone: --- → mozilla1.0
I've applied this patch to my mozilla 0.9.9 tree, and regxpcom no longer hangs on package creation.
Was it hanging without this patch? We've had solaris nightlies for the past few days (since the 12th apparently). So either the problem resolved itself or someone added a workaround to the build automation, which I don't see.
Yes, it would consistently hang on building 0.9.8 and 0.9.9 without this patch on solaris 7
Comment on attachment 78085 [details] [diff] [review] Proposed TimerThread.cpp, TimerThread.h patch sr=brendan@mozilla.org dougt: I took cvsblame in making fixes to pavlov's busted threading code, but I won't take all blame here. I do feel pretty foolish for taking this stuff so close to 1.0 (0.9.8, IIRC -- at least I made pav wait till then, instead of checking in on the last day of 0.9.7 as he wanted to). /be
Attachment #78085 - Flags: superreview+
Checked into the trunk:

Checking in TimerThread.cpp;
/cvsroot/mozilla/xpcom/threads/TimerThread.cpp,v  <--  TimerThread.cpp
new revision: 1.12; previous revision: 1.11
done
Checking in TimerThread.h;
/cvsroot/mozilla/xpcom/threads/TimerThread.h,v  <--  TimerThread.h
new revision: 1.4; previous revision: 1.3
done
Status: NEW → ASSIGNED
Whiteboard: Needs to land on branch
Comment on attachment 78085 [details] [diff] [review] Proposed TimerThread.cpp, TimerThread.h patch a=rjesup@wgate.com for branch checkin
Attachment #78085 - Flags: approval+
Checked into branch.

Checking in TimerThread.cpp;
/cvsroot/mozilla/xpcom/threads/TimerThread.cpp,v  <--  TimerThread.cpp
new revision: 1.6.4.4; previous revision: 1.6.4.3
done
Checking in TimerThread.h;
/cvsroot/mozilla/xpcom/threads/TimerThread.h,v  <--  TimerThread.h
new revision: 1.3.4.2; previous revision: 1.3.4.1
done

Kenneth, thank you for the patch.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
adding fixed1.0.0 keyword (branch resolution). This bug has comments saying it was fixed on the 1.0 branch and a bonsai checkin comment that agrees. To verify the bug has been fixed on the 1.0 branch please replace the fixed1.0.0 keyword with verified1.0.0.
Keywords: fixed1.0.0
updating component and qa... From reading carefully, it seems like this goes to XPCOM Registry. Also, the summary seems out of date, is PAC really the root cause of this?
Component: Networking → XPCOM Registry
QA Contact: benc → dougt
Component: XPCOM Registry → XPCOM
QA Contact: doug.turner → xpcom
Keywords: qawanted