Closed
Bug 196487
Opened 22 years ago
Closed 22 years ago
Doesn't run if home directory is on NFS
Categories
(Core Graveyard :: Profile: BackEnd, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla1.4beta
People
(Reporter: krister, Assigned: ccarlen)
References
Details
(Keywords: fixed1.4, regression)
Attachments
(2 files)
(deleted), patch | bryner: review+ | brendan: superreview+ | asa: approval1.4+
(deleted), text/plain
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3b) Gecko/20030212
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3b) Gecko/20030212
After deleting ~/Library/Mozilla, the first time Mozilla starts it exits almost
immediately. It has, however, created the directory ~/Library/Mozilla with a
profile called "default". Next time Mozilla starts it reports that I have to
choose a user profile. The dialog lists "default" as one of the available
profiles. If I choose "default" (or create a new profile and choose that),
Mozilla pops up an error:
Component returned failure code: 0x80004005 (NS_ERROR_FAILURE)
[nsIProfileInternal.currentProfile]
Reproducible: Always
Steps to Reproduce:
1. Home directory on NFS
2. Start Mozilla (twice if ~/Library/Mozilla doesn't exist)
Actual Results:
Mozilla fails to get further than the Profile Manager.
Expected Results:
Normal startup.
Comment 1•22 years ago
|
||
I can confirm and have reported the exact same behaviour under bug 152287:
In version 1.2.1, Mozilla would launch and load a user profile if that profile
resided on an NFS-mounted home directory. It would not, however, save
downloaded files to any NFS-mount (including the home directory), making it
useless for Mail or all but the most basic browsing.
In 1.3a, 1.3b, and the current (9 March) 1.4a nightly, Mozilla will NOT load a
user profile from an NFS-mounted home directory. Since this was an issue that
was fixed previously, this appears to be a regression. I WAS able to save a
downloaded PDF file to an NFS-mounted share when running from a local (HFS+)
home directory with the 1.4a build.
The error message on failing to load the profile is:
"Component returned failure code: 0x80004005 (NS_ERROR_FAILURE)
[nsIProfileInternal.currentProfile]"
Please elevate the priority of this bug if at all possible. Thank you.
Assignee | ||
Comment 2•22 years ago
|
||
Confirming. This is a result of using fcntl profile locking instead of symlink
profile locking. fcntl is more failsafe on local volumes but doesn't work on
some NFS volumes. On the Camino (Chimera) branch, this specific failure was
allowed. Now that Camino is on the trunk, and for Mozilla, this needs to be
revisited on the trunk.
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 199557 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 4•22 years ago
|
||
*** Bug 199870 has been marked as a duplicate of this bug. ***
It doesn't appear that this bug is targeted to a milestone. If this worked in
1.2, shouldn't it block release of 1.4? I can see not fixing it in the 1.3 tree
because that's not supposed to be stable, but 1.4 is an even numbered release.
It appears that this bug may also be affecting the ability to run recent
versions of evolution (so far all that is clear is that evolution has a similar
problem with NFS home directories and that there are Mozilla dependencies in
evolution).
Flags: blocking1.4?
Comment 6•22 years ago
|
||
*** Bug 204643 has been marked as a duplicate of this bug. ***
Comment 7•22 years ago
|
||
This is a pretty big one for adoption of software in a corporate environment. It
could definitely block deployment.
Flags: blocking1.4b?
Updated•22 years ago
|
Flags: blocking1.4b?
Updated•22 years ago
|
Flags: blocking1.4b?
Comment 8•22 years ago
|
||
As promised, I've done some investigation.
From the patch in bug 76431:
>+ // First, try the 4.x-compatible symlink technique, which works with NFS
>+ // without depending on (broken or missing, too often) lockd.
So, Mac OS/X NFS is _not_ supporting symlink(). If it doesn't support
symlink(), does it run lockd by default? If so does lockd work?
If Camino switched from the "symlink then fcntl" algorithm, that's a problem.
Looks like this checkin is the root cause:
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsProfileAccess.cpp&branch=&root=/cvsroot&subdir=mozilla/profile/src&command=DIFF_FRAMESET&rev1=1.71&rev2=1.72
The comment was:
Use only fcntl-based profile locking on Mac OS X (disable symlink-based
locking). Fixes bug 176608. r=ccarlen, sr=jag, a=roc.
Here are two relevant comments from that bug:
---------------------
------- Additional Comment #5 From Conrad Carlen 2002-10-24 19:27 -------
The #ifdef I'd like to see is one which causes us to do locking via fcntl
instead of the symlink method which requires signal handlers to be set in the
first place. If we use fcntl:
1. We're absolutely guaranteed not to have a stuck lock if the machine is rebooted.
2. We're not going to suffer from PID rollover after reboot or changing IP
addresses after the machine goes to sleep and is woken up in a new net environment.
3. No need for signal handlers so we don't have this problem.
Brendan, I know that locking via fcntl is not supported by all NFS servers but
which servers are those exactly? Considering that we're getting more than a few
bugs on stuck locks with Chimera, I think it may be better to relnote the
problem of not all NFS servers supporting locking and use fcntl. I think that
far fewer people would run into the problem of having their profile on such a
server than are running into stuck locks. I'm in favor of this #ifdef only for
XP_MACOSX - not for XP_UNIX in general.
------- Additional Comment #6 From Brendan Eich 2002-10-24 19:44 -------
I think I said in another bug that the broken NFS servers don't care whether
your client Mozilla was compiled XP_UNIX or XP_MACOSX, but sure, you can use
fcntl and see whether there are fewer bugs filed against the locking code. I'm
not stopping that from happening. Try it and if it works over a few milestones
("works" meaning you get 0 or fewer bugs about stuck/broken locks due to
NFS-mounted profile dirs), maybe we should consider using fcntl for XP_UNIX.
/be
---------------------
Looks to me like the number of servers without working fcntl locking is
definitely non-0, unless it's simply a bug in the non-symlink code.
I would assume there are other solutions to the signal() issue. The "we had our
IP address change" issue is real and was not really considered in the design for
the locking. It's not as large as you'd think since it only bites you if you
crash mozilla (leaves symlink w/ wrong IP). To some extent that could be
avoided by checking that our IP hasn't changed periodically and re-writing the
link. Ugly, but would work in most cases. If we could be notified of IP
changes the hole would be very small (you'd think there'd be a way; it is rather
important to a number of apps such as things that do H.323/SIP, etc).
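To make the stale-lock and IP-change problems discussed above concrete, here is a rough sketch of a symlink-style lock whose link target records the owner as "host:pid". The function name, the enum, and the exact target format are assumptions for illustration and do not reproduce the real nsProfileAccess.cpp code.

```cpp
#include <cerrno>
#include <cstdlib>
#include <signal.h>
#include <string>
#include <unistd.h>

enum class LockResult { Acquired, HeldByLiveProcess, HeldByOtherHost, Error };

// Hypothetical sketch. `host` is whatever string identifies this machine,
// for example its current IP address as text.
LockResult try_symlink_lock(const std::string& lock_path, const std::string& host) {
    const std::string target = host + ":" + std::to_string(getpid());

    for (int attempt = 0; attempt < 2; ++attempt) {
        // symlink() either creates the link or fails with EEXIST.
        if (symlink(target.c_str(), lock_path.c_str()) == 0)
            return LockResult::Acquired;
        if (errno != EEXIST)
            return LockResult::Error;

        // Someone already holds the lock: read who it claims to be.
        char buf[256];
        ssize_t n = readlink(lock_path.c_str(), buf, sizeof(buf) - 1);
        if (n < 0) {
            if (errno == ENOENT) continue;  // lock vanished meanwhile; retry
            return LockResult::Error;
        }
        buf[n] = '\0';
        const std::string owner(buf);
        const std::string::size_type colon = owner.rfind(':');
        if (colon == std::string::npos) return LockResult::Error;

        // A different host string means we cannot probe the owner's PID.
        // This is also what happens if our own IP changed after a crash.
        if (owner.compare(0, colon, host) != 0)
            return LockResult::HeldByOtherHost;

        const pid_t pid =
            static_cast<pid_t>(std::strtol(owner.c_str() + colon + 1, nullptr, 10));
        if (pid <= 0) return LockResult::Error;

        // kill(pid, 0) sends no signal; ESRCH means no such process exists.
        if (kill(pid, 0) == -1 && errno == ESRCH) {
            unlink(lock_path.c_str());      // stale lock: remove and retry
            continue;
        }
        return LockResult::HeldByLiveProcess;
    }
    return LockResult::Error;
}
```

The sketch shows both weaknesses raised in the quoted comments: a crashed process leaves the link behind, and stale detection only works while the host string still matches and the recorded PID has not been reused by another process.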
Assignee | ||
Comment 9•22 years ago
|
||
> If Camino switched from the "symlink then fcntl" algorithm, that's a problem.
That depends on for whom (the minority vs. the majority). The number of stuck
lock bugs on OS X has been zero since we went to fcntl locking. It's more
reliable on the local machine, and that's not going to change.
Clearly, though, we need a solution that works for profiles on NFS home dirs as
well as local home dirs. I am working on a patch now for OS X which should do
that. Somebody suffering from the NFS home dir problem has offered to help me test.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.4beta
Comment 10•22 years ago
|
||
If you need more testers, then feel free to contact me at t.bubeck@reinform.de.
I'm using Mac OS 10.2.5 together with RHAT 7.1 running 2.4.19 as NFS Server.
Comment 11•22 years ago
|
||
Ok. If possible, I'd prefer a solution that (on NFS from OS/X) uses the
symlink. That would allow an NFS-mounted profile to be used from Linux/BSD/etc
as well as OS/X.
Hmm. Question: what if the profile is on a local OS/X volume - but the volume
is exported via NFS and may be used from elsewhere as well? Does/will the code
you're working on handle that case?
Updated•22 years ago
|
Flags: blocking1.4b?
Comment 12•22 years ago
|
||
Apparently this bug is different from bug 90682? A regression?
Keywords: regression
Assignee | ||
Comment 13•22 years ago
|
||
*** Bug 90682 has been marked as a duplicate of this bug. ***
Comment 14•22 years ago
|
||
I too am happy to test fixes for this bug.
Comment 15•22 years ago
|
||
I am also willing to test (can't test anything else until it is fixed!) : (viv <at> ic.ac.uk)
I have a related problem that I would love to see fixed that earlier versions exhibit (latest Netscape
for example): When it starts up, it takes ages to load, and it causes the nfs file system containing
my home directory to be mounted a second time under /Volumes/mathew (the server name is
mathew, the original mount point is /Users/viv). If I close and reopen Netscape, the volume is
mounted a third time, and so on. It isn't simple to unmount these unwanted mounts,
unpredictable things happen. Usually the system reports they are "busy" but once when it didn't I
then found it thought my Trash folder had become my ~/Library folder. Nothing else I run causes
anything like this except the mozilla engine.
Comment 16•22 years ago
|
||
*** Bug 206632 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 17•22 years ago
|
||
*** Bug 207259 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 18•22 years ago
|
||
*** Bug 207265 has been marked as a duplicate of this bug. ***
Comment 19•22 years ago
|
||
*** Bug 207412 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 20•22 years ago
|
||
Patch allows OS X to work over NFS by using fcntl first and, if that fails, it
uses the symlink approach. There's the caveat that running the same profile
locally on the NFS server and remotely from an NFS client won't work - the lock
will go undetected. I think we'll have to live with that.
diff came up with the wackiest interpretation of the changes, so I'll describe
them a bit and then attach the whole new file. Both the fcntl locking code and
the symlink locking code were factored out of nsProfileLock::Lock and into
separate routines. Pretty much straight copy and paste. Since we're no longer
making Mac CFM builds, all the glop I put in to allow CFM code to call through
to Mach-O code (fcntl) could be removed.
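The "fcntl first, symlink if that fails" order described in this comment could look roughly like the sketch below. This is not the attached patch: the type and function names and the lock file names are invented, and the choice to skip the fallback when fcntl reports a real conflict (EAGAIN or EACCES) is an assumption about sensible behaviour, not something stated in this bug.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>

enum class LockKind { None, Fcntl, Symlink };

struct LockAttempt {
    LockKind kind = LockKind::None;  // which mechanism ended up holding the lock
    int fd = -1;                     // open descriptor when kind == Fcntl
};

// Hypothetical sketch: try fcntl, use the symlink only if fcntl is unusable.
LockAttempt lock_profile_dir(const std::string& dir) {
    LockAttempt result;
    const std::string lock_file = dir + "/.fcntl_lock";    // hypothetical name
    const std::string lock_link = dir + "/.symlink_lock";  // hypothetical name

    int fd = open(lock_file.c_str(), O_WRONLY | O_CREAT, 0666);
    if (fd >= 0) {
        struct flock fl = {};
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;      // l_start = 0, l_len = 0: whole file
        if (fcntl(fd, F_SETLK, &fl) == 0) {
            result.kind = LockKind::Fcntl;
            result.fd = fd;
            return result;
        }
        const int err = errno;
        close(fd);
        // Assumption: a real conflict means the profile is in use, so we
        // report failure instead of falling back.
        if (err == EAGAIN || err == EACCES)
            return result;
    }

    // fcntl locking did not work here (as on some NFS volumes):
    // fall back to an atomically created symlink recording our PID.
    const std::string target = std::to_string(getpid());
    if (symlink(target.c_str(), lock_link.c_str()) == 0)
        result.kind = LockKind::Symlink;
    return result;
}
```

The sketch also makes the caveat above visible: a process on the NFS server that succeeds with fcntl never creates the symlink, while a remote client that falls back to the symlink never sees the fcntl lock, so neither detects the other.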
Assignee | ||
Comment 21•22 years ago
|
||
Reporter | ||
Comment 22•22 years ago
|
||
I rebuilt Mozilla with sources from May 31 and the patch applied.
I can confirm the patch works for me and Mozilla starts without a problem.
The NFS server is running FreeBSD 4.8.
Assignee | ||
Comment 23•22 years ago
|
||
Attachment #124460 -
Flags: superreview?(brendan)
Attachment #124460 -
Flags: review?(bryner)
Comment 24•22 years ago
|
||
Comment on attachment 124460 [details] [diff] [review]
patch
Looks good to me. r=bryner.
Attachment #124460 -
Flags: review?(bryner) → review+
Comment 25•22 years ago
|
||
Attachment #124460 -
Flags: superreview?(brendan) → superreview+
Assignee | ||
Comment 26•22 years ago
|
||
Checked into trunk, will ask for approval for 1.4.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 27•22 years ago
|
||
Comment on attachment 124460 [details] [diff] [review]
patch
Seeking approval for 1.4.
Code is only different after fcntl lock fails on Mach-O. At that point, without
the patch, we're doomed anyway (forced to quit).
Attachment #124460 -
Flags: approval1.4?
Comment 28•22 years ago
|
||
reporter, commenters,
will you try this now and verify fix?
Comment 29•22 years ago
|
||
Everything OKAY on this nightly build
Mozilla 1.5a
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.5a) Gecko/20030604
on Mac OS X 10.2.5 using an NFS home mounted from Linux 2.4.18.
Thanks!
Comment 31•22 years ago
|
||
Comment on attachment 124460 [details] [diff] [review]
patch
a=asa (on behalf of drivers) for checkin to the 1.4 branch.
Attachment #124460 -
Flags: approval1.4? → approval1.4+
Comment 32•22 years ago
|
||
a=adt Please land this fix on the 1.4 Branch and add the keyword fixed1.4
Assignee | ||
Comment 34•22 years ago
|
||
*** Bug 192737 has been marked as a duplicate of this bug. ***
Comment 35•22 years ago
|
||
This may have caused regression bug 209048 ... ;-((
Comment 37•21 years ago
|
||
There is a report suggesting that this bug has recurred.
Can someone check?
bug 234395
Assignee | ||
Comment 38•21 years ago
|
||
(In reply to comment #37)
> There is a report suggesting that this bug has recurred.
> Can someone check?
> bug 234395
>
That bug, while the end result is similar, is a different problem.
Updated•9 years ago
|
Product: Core → Core Graveyard