Closed Bug 196487 Opened 22 years ago Closed 22 years ago

Doesn't run if home directory is on NFS

Categories

(Core Graveyard :: Profile: BackEnd, defect)

PowerPC
macOS
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.4beta

People

(Reporter: krister, Assigned: ccarlen)

References

Details

(Keywords: fixed1.4, regression)

Attachments

(2 files)

User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3b) Gecko/20030212
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3b) Gecko/20030212
After deleting ~/Library/Mozilla, the first time Mozilla starts it exits almost immediately. It has, however, created the directory ~/Library/Mozilla with a profile called "default". The next time Mozilla starts, it reports that I have to choose a user profile. The dialog lists "default" as one of the available profiles. If I choose "default" (or create a new profile and choose that), Mozilla pops up an error:
Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProfileInternal.currentProfile]
Reproducible: Always
Steps to Reproduce:
1. Home directory on NFS
2. Start Mozilla (twice if ~/Library/Mozilla doesn't exist)
Actual Results: Mozilla fails to get further than the Profile Manager.
Expected Results: Normal startup.
I can confirm this and have reported the exact same behaviour under bug 152287:
In version 1.2.1, Mozilla would launch and load a user profile if that profile resided on an NFS-mounted home directory. It would not, however, save downloaded files to any NFS mount (including the home directory), making it useless for Mail or all but the most basic browsing.
In 1.3a, 1.3b, and the current (9 March) 1.4a nightly, Mozilla will NOT load a user profile from an NFS-mounted home directory. Since this was an issue that was fixed previously, this appears to be a regression. I WAS able to save a downloaded PDF file to an NFS-mounted share when running from a local (HFS+) home directory with the 1.4a build.
The error message on failing to load the profile is:
"Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProfileInternal.currentProfile]"
Please elevate the priority of this bug if at all possible. Thank you.
Confirming. This is a result of using fcntl profile locking instead of symlink profile locking. fcntl is more failsafe on local volumes but doesn't work on some NFS volumes. On the Camino (Chimera) branch, this specific failure was allowed. Now that Camino is on the trunk, and for Mozilla, this needs to be revisited on the trunk.
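For readers unfamiliar with the mechanism, here is a minimal sketch of fcntl-style profile locking and the way it can fail on NFS. It is written in Python for brevity and is not Mozilla's C++ code; the function name `try_fcntl_lock` and the three status strings are assumptions made for illustration. The point it shows is that "the lock is held by someone else" and "this filesystem cannot lock at all" are different failures, and only the second one is the NFS problem described in this bug.

```python
import errno
import fcntl
import os


def try_fcntl_lock(lock_path):
    """Attempt an exclusive, non-blocking fcntl lock on lock_path.

    Returns (status, fd), where status is one of:
      "locked"      - we hold the lock; keep fd open while it is needed
      "busy"        - another process already holds the lock
      "unsupported" - the filesystem could not lock at all (for example
                      ENOLCK from an NFS mount without a working lockd)
    The name and the status strings are hypothetical, for illustration.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() is implemented on top of fcntl() record locks.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as err:
        os.close(fd)
        if err.errno in (errno.EACCES, errno.EAGAIN):
            return "busy", None
        return "unsupported", None
    return "locked", fd
```

Because the kernel drops an fcntl lock when the owning process exits, a crash or reboot cannot leave a stuck lock behind, which is the property that makes this approach attractive on local volumes.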
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 199557 has been marked as a duplicate of this bug. ***
*** Bug 199870 has been marked as a duplicate of this bug. ***
It doesn't appear that this bug is targeted to a milestone. If this worked in 1.2, shouldn't it block release of 1.4? I can see not fixing it in the 1.3 tree because that's not supposed to be stable, but 1.4 is an even numbered release. It appears that this bug may also be affecting the ability to run recent versions of evolution (so far all that is clear is that evolution has a similar problem with NFS home directories and that there are Mozilla dependencies in evolution).
Flags: blocking1.4?
*** Bug 204643 has been marked as a duplicate of this bug. ***
This is a pretty big one for adoption of software in a corporate environment. It could definitely block deployment.
Flags: blocking1.4b?
As promised, I've done some investigation. From the patch in bug 76431:
>+ // First, try the 4.x-compatible symlink technique, which works with NFS
>+ // without depending on (broken or missing, too often) lockd.
So, Mac OS/X NFS is _not_ supporting symlink(). If it doesn't support symlink(), does it run lockd by default? If so, does lockd work? If Camino switched from the "symlink then fcntl" algorithm, that's a problem.
Looks like this checkin is the root cause:
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsProfileAccess.cpp&branch=&root=/cvsroot&subdir=mozilla/profile/src&command=DIFF_FRAMESET&rev1=1.71&rev2=1.72
The comment was: Use only fcntl-based profile locking on Mac OS X (disable symlink-based locking). Fixes bug 176608. r=ccarlen, sr=jag, a=roc.
Here are two relevant comments from that bug:
---------------------
------- Additional Comment #5 From Conrad Carlen 2002-10-24 19:27 -------
The #ifdef I'd like to see is one which causes us to do locking via fcntl instead of the symlink method which requires signal handlers to be set in the first place. If we use fcntl:
1. We're absolutely guaranteed not to have a stuck lock if the machine is rebooted.
2. We're not going to suffer from PID rollover after reboot or changing IP addresses after the machine goes to sleep and is woken up in a new net environment.
3. No need for signal handlers so we don't have this problem.
Brendan, I know that locking via fcntl is not supported by all NFS servers but which servers are those exactly? Considering that we're getting more than a few bugs on stuck locks with Chimera, I think it may be better to relnote the problem of not all NFS servers supporting locking and use fcntl. I think that far fewer people would run into the problem of having their profile on such a server than are running into stuck locks. I'm in favor of this #ifdef only for XP_MACOSX - not for XP_UNIX in general.
------- Additional Comment #6 From Brendan Eich 2002-10-24 19:44 -------
I think I said in another bug that the broken NFS servers don't care whether your client Mozilla was compiled XP_UNIX or XP_MACOSX, but sure, you can use fcntl and see whether there are fewer bugs filed against the locking code. I'm not stopping that from happening. Try it and if it works over a few milestones ("works" meaning you get 0 or fewer bugs about stuck/broken locks due to NFS-mounted profile dirs), maybe we should consider using fcntl for XP_UNIX.
/be
---------------------
Looks to me like the number of servers without working fcntl locking is definitely non-0, unless it's simply a bug in the non-symlink code. I would assume there are other solutions to the signal() issue.
The "we had our IP address change" issue is real and was not really considered in the design for the locking. It's not as large as you'd think since it only bites you if you crash mozilla (leaves symlink w/ wrong IP). To some extent that could be avoided by checking that our IP hasn't changed periodically and re-writing the link. Ugly, but would work in most cases. If we could be notified of IP changes the hole would be very small (you'd think there'd be a way; it is rather important to a number of apps such as things that do H.323/SIP, etc).
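To make the stale-lock and IP-change discussion concrete, here is a rough Python sketch of a symlink lock. It is an illustration only, not the Mozilla implementation: the "ip:pid" link target format, the function names, and the injectable `is_alive` check are all assumptions. It shows why a lock left behind by a crash can be reclaimed only when the recorded address still matches the current one.

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid exists on the local machine."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_symlink_lock(lock_path, my_ip, my_pid, is_alive=pid_alive):
    """Take a symlink lock whose target records the owner as "ip:pid".

    The target format is an assumption made for this sketch.
    """
    target = f"{my_ip}:{my_pid}"
    try:
        os.symlink(target, lock_path)  # atomic: fails if the link exists
        return True
    except FileExistsError:
        pass

    try:
        owner_ip, _, owner_pid = os.readlink(lock_path).rpartition(":")
        owner_pid = int(owner_pid)
    except (OSError, ValueError):
        return False  # unreadable or malformed lock: leave it alone

    # Only a lock left by a dead process on this same address can be
    # reclaimed. If our IP changed after a crash, owner_ip no longer
    # matches and the stale lock stays stuck.
    if owner_ip != my_ip or is_alive(owner_pid):
        return False
    try:
        os.unlink(lock_path)
        os.symlink(target, lock_path)
    except OSError:
        return False  # lost a race with another process
    return True
```

Rewriting the link whenever the local address changes, as suggested above, would amount to replacing the link target with the new "ip:pid" string while the process is still running.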
> If Camino switched from the "symlink then fcntl" algorithm, that's a problem.
That depends on for whom (the minority vs. the majority). The number of stuck lock bugs on OS X has been zero since we went to fcntl locking. It's more reliable on the local machine, and that's not going to change. Clearly, though, we need a solution that works for profiles on NFS home dirs as well as local home dirs. I am working on a patch now for OS X which should do that. Somebody suffering from the NFS home dir problem has offered to help me test.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.4beta
If you need more testers, then feel free to contact me at t.bubeck@reinform.de. I'm using Mac OS 10.2.5 together with RHAT 7.1 running 2.4.19 as NFS Server.
Ok. If possible, I'd prefer a solution that (on NFS from OS/X) uses the symlink. That would allow an NFS-mounted profile to be used from Linux/BSD/etc as well as OS/X. Hmm. Question: what if the profile is on a local OS/X volume - but the volume is exported via NFS and may be used from elsewhere as well? Does/will the code you're working on handle that case?
Flags: blocking1.4b?
Apparently this bug is different from bug 90682? A regression?
Keywords: regression
*** Bug 90682 has been marked as a duplicate of this bug. ***
I too am happy to test fixes for this bug.
I am also willing to test (can't test anything else until it is fixed!): (viv <at> ic.ac.uk)
I have a related problem that I would love to see fixed, one that earlier versions exhibit (the latest Netscape, for example). When it starts up, it takes ages to load, and it causes the NFS file system containing my home directory to be mounted a second time under /Volumes/mathew (the server name is mathew, the original mount point is /Users/viv). If I close and reopen Netscape, the volume is mounted a third time, and so on. It isn't simple to unmount these unwanted mounts; unpredictable things happen. Usually the system reports they are "busy", but once, when it didn't, I found it thought my Trash folder had become my ~/Library folder. Nothing else I run causes anything like this except the Mozilla engine.
Blocks: 101953
*** Bug 206632 has been marked as a duplicate of this bug. ***
*** Bug 207259 has been marked as a duplicate of this bug. ***
*** Bug 207265 has been marked as a duplicate of this bug. ***
*** Bug 207412 has been marked as a duplicate of this bug. ***
Attached patch patch (deleted) — Splinter Review
The patch allows OS X to work over NFS by trying fcntl first and, if that fails, falling back to the symlink approach. There's the caveat that running the same profile locally on the NFS server and remotely from an NFS client won't work - the lock will go undetected. I think we'll have to live with that.
diff came up with the wackiest interpretation of the changes, so I'll describe them a bit and then attach the whole new file. Both the fcntl locking code and the symlink locking code were factored out of nsProfileLock::Lock and into separate routines. Pretty much straight copy and paste. Since we're no longer making Mac CFM builds, all the glop I put in to allow CFM code to call through to Mach-O code (fcntl) could be removed.
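The ordering described in this comment can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the attached patch: the lock file names `fcntl.lock` and `symlink.lock`, the `ProfileInUse` exception, the choice to treat EACCES/EAGAIN as "profile in use" rather than falling back, and the injectable `fcntl_impl` parameter are all hypothetical.

```python
import errno
import fcntl
import os


class ProfileInUse(Exception):
    """Raised when another process already holds the profile lock."""


def fcntl_lock(path):
    """Take a non-blocking exclusive fcntl lock; raises OSError on failure."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


def lock_profile(profile_dir, owner, fcntl_impl=fcntl_lock):
    """Try fcntl first; if the filesystem cannot lock, fall back to a symlink.

    Returns ("fcntl", fd) or ("symlink", None). File names are hypothetical.
    """
    try:
        return "fcntl", fcntl_impl(os.path.join(profile_dir, "fcntl.lock"))
    except OSError as err:
        if err.errno in (errno.EACCES, errno.EAGAIN):
            # fcntl works here and someone else holds the lock.
            raise ProfileInUse(profile_dir) from err
        # Any other failure (for example ENOLCK on NFS): use the fallback.

    try:
        # Symlink creation is atomic and needs no lock daemon.
        os.symlink(owner, os.path.join(profile_dir, "symlink.lock"))
    except FileExistsError as err:
        raise ProfileInUse(profile_dir) from err
    return "symlink", None
```

Note that the fcntl path never looks at the symlink, so a process that locked the profile through one mechanism is invisible to a process using the other; that is the server-local versus NFS-client caveat mentioned above.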
Attached file whole new file - easier to read (deleted) —
I rebuilt Mozilla with sources from May 31 and the patch applied. I can confirm the patch works for me and Mozilla starts without a problem. The NFS server is running FreeBSD 4.8.
Comment on attachment 124460 (patch): reviewers - see comment 20
Attachment #124460 - Flags: superreview?(brendan)
Attachment #124460 - Flags: review?(bryner)
Comment on attachment 124460 (patch): Looks good to me. r=bryner.
Attachment #124460 - Flags: review?(bryner) → review+
Attachment #124460 - Flags: superreview?(brendan) → superreview+
Checked into trunk, will ask for approval for 1.4.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment on attachment 124460 (patch): Seeking approval for 1.4. Code is only different after fcntl lock fails on Mach-O. At that point, without the patch, we're doomed anyway (forced to quit).
Attachment #124460 - Flags: approval1.4?
reporter, commenters, will you try this now and verify fix?
Everything OKAY on this nightly build, Mozilla 1.5a, Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.5a) Gecko/20030604, on Mac OS X 10.2.5 using an NFS home mounted from Linux 2.4.18. Thanks!
thanks! verified
Status: RESOLVED → VERIFIED
Comment on attachment 124460 (patch): a=asa (on behalf of drivers) for checkin to the 1.4 branch.
Attachment #124460 - Flags: approval1.4? → approval1.4+
a=adt Please land this fix on the 1.4 Branch and add the keyword fixed1.4
Checked into branch.
Keywords: fixed1.4
*** Bug 192737 has been marked as a duplicate of this bug. ***
This may have caused regression bug 209048 ... ;-((
mozilla1.4 shipped. unsetting blocking1.4 request.
Flags: blocking1.4?
There is a report suggesting that this bug has recurred. Could someone check? See bug 234395.
(In reply to comment #37)
> There is a report suggesting that this bug has recurred.
> Could someone check?
> See bug 234395.
That bug, while the end result is similar, is a different problem.
Product: Core → Core Graveyard