Closed Bug 95499 Opened 23 years ago Closed 23 years ago

Page doesn't finish loading (as soon as it should)

Categories

(Core :: Networking, defect)

x86
Linux
defect
Not set
major

RESOLVED FIXED

People

(Reporter: sharding, Assigned: darin.moz)

I apologize in advance for the vague nature of this bug report. I've been waiting for some time to try to get more concrete information, but I haven't been able to nail this one down. After running Mozilla for some time, it seems to just stop talking to the network. All subsequent HTTP requests fail (it sits there trying indefinitely, but nothing is ever loaded). The only way I've found to get it working again is to quit the browser and restart. No matter how long I've waited, the networking never comes back. All other network activity on the machine, including using telnet or another browser to connect to one of the web servers I'm trying in Mozilla, works fine throughout. This has happened in 0.9.3 on Linux and FreeBSD (and I have second-hand reports of it being seen on MacOS X). Initially I thought it might have been triggered by a timed-out request (i.e. one request legitimately times out because the server is down and then no other requests will work). But it seems like I've seen it happen without any legitimate timed-out requests as well. I'll keep trying to get better information on this. I'm hoping that someone else has seen it and will have a better idea of what's going on.
Reporter: Are you using a proxy/Junkbuster?
No. No proxies of any sort are involved in any of the locations where I've seen this.
CC bbaetz: Can you help us with this bug?
reporter: can you please describe your network configuration in more detail? it might help to know how fast a connection you have, etc. also, does the problem seem to occur more frequently while visiting only certain sites?
This has happened in multiple locations on multiple machines. In most cases, the machine was connected to the network via 10 or 100 megabit ethernet and the pipe to the Internet ranged from 640k DSL with an OpenBSD firewall to a corporate network with several extremely large pipes. It's also happened at least once on 56K dialup. I'd guess that this has happened a total of about 8 times to me. I haven't detected any sort of correlation with specific sites (and that's exactly the kind of info I've been looking for while waiting to file this report). It's intermittent enough for me that I can't just sit down and trigger it. But it happens often enough that I noticed it as a recurring problem. And, again, it happens on completely different networks with completely different machines, so I don't think that it's related to my Internet connection.
This is more standard troubleshooting than a direct solution, but can you try clearing the caches via prefs if this happens again?
I can confirm this bug. It just happened to me with the 2001082121 build. However, I am behind a squid proxy, so maybe it doesn't depend on proxy/no proxy. Also, it seems to me that the networking broke after some image connection timed out. Then I hit reload and the throbber started animating. Nothing changed on the status line. After restarting Mozilla everything works fine. I haven't been testing Mozilla for a month (I've been on vacation), so I can't say whether it appeared before.
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Tom, take a look at bug 95010. Try to disable the proxy in the prefs, and then try again (without restarting). If that helps, you are probably seeing bug 95010.
*** Bug 94526 has been marked as a duplicate of this bug. ***
I just had this happen again. This time in 0.9.3 on FreeBSD. I clicked a link (which happened to open a new window). The page never loaded, the clock cursor showed and the logo animation kept moving for about three minutes. During that time, I clicked another link on a different server, and it similarly hung. I finally gave up and hit stop on both. However, no pages at all would open after that. They all just hung with "Document Done" in the status bar, the clock cursor and the logo animation. While this was happening, I successfully reached every server I tried from Communicator 4.78 (the same URLs that were hanging in Mozilla). This is the Linux Mozilla binary on FreeBSD. As said before, I've seen this on Linux too. A truss of the process shows a seemingly infinite loop of:

linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_newselect(0x3b,0xbfbff690,0xbfbff710,0xbfbff790,0xbfbff83c) = 5 (0x5)
poll(0x9402e80,0x3,0x0) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_newselect(0x3b,0xbfbff690,0xbfbff710,0xbfbff790,0xbfbff83c) = 5 (0x5)
gettimeofday(0xbfbff824,0x0) = 2 (0x2)
gettimeofday(0xbfbff7d8,0x0) = 2 (0x2)
linux_ioctl(0xb,0x541b,0xbfbfef20) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_newselect(0x3b,0xbfbff5bc,0xbfbff63c,0xbfbff6bc,0xbfbff768) = 5 (0x5)
poll(0x9402e80,0x7,0x0) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_newselect(0x3b,0xbfbff5bc,0xbfbff63c,0xbfbff6bc,0xbfbff768) = 5 (0x5)
gettimeofday(0xbfbff828,0x0) = 2 (0x2)
gettimeofday(0xbfbff8ac,0x0) = 2 (0x2)
linux_ioctl(0xb,0x541b,0xbfbfeff4) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_newselect(0x3b,0xbfbff690,0xbfbff710,0xbfbff790,0xbfbff83c) = 5 (0x5)
poll(0x9402e80,0x3,0x0) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfefc8) = 3 (0x3)
linux_newselect(0x3b,0xbfbff690,0xbfbff710,0xbfbff790,0xbfbff83c) = 5 (0x5)
gettimeofday(0xbfbff824,0x0) = 2 (0x2)
gettimeofday(0xbfbff7d8,0x0) = 2 (0x2)
linux_ioctl(0xb,0x541b,0xbfbfef20) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_ioctl(0x3a,0x541b,0xbfbfeef4) = 3 (0x3)
linux_newselect(0x3b,0xbfbff5bc,0xbfbff63c,0xbfbff6bc,0xbfbff768) = 5 (0x5)
poll(0x9402e80,0x7,0x0) = 3 (0x3)
I noticed this on Win2000Pro. This bug has been here for a long time. Now I'm testing with build 2001082909, and it is still here. I can only try some deduction from what I saw:
1) I enter a site to which the connection is very slow (or which has images/counters/banners linked to very slow sites).
2) Sometimes after that the browser seems to ignore a new URL, whether I enter it into the URL bar or choose a link in a page or in bookmarks. It seems that it keeps trying the old URL, which it can't connect to, and cannot switch to the new URL.
3) A few weeks ago there was a bug in the status bar where old URLs were shown instead of the new ones (that are being loaded); maybe this problem has similar roots?
strace/truss doesn't help, really - you're tracing the UI thread, I think. Do you have a debug build available, or could you build one?
Yeah, I was afraid that's what I was trussing. But none of the other threads had any activity at all. truss just sat there with no output.
Reporter, please try it with new trunk builds. This seems to be fixed now.
*** Bug 100826 has been marked as a duplicate of this bug. ***
It seems to me that sometimes, when some page (or the advertisements on that page) stops loading, Mozilla's networking gets into some deadlocked state and doesn't open any new connections. This bug was very visible in the late August builds, then it disappeared, and now it seems to have appeared again. The throbber on the window with the page which triggered the deadlock keeps spinning. When you click the stop button, the deadlock disappears and the content in other windows continues loading. Possible duplicates or similar bugs: bug 95010, bug 94526, bug 93712, bug 92611. Note: I am behind a squid proxy.
maybe bug 83526? (See also my bug 101535.) The checkin for bug 83526 is very bad for pages with a couple of objects from unresponsive servers.
darin
Assignee: neeti → darin
reverting bug 83526 is not the solution... IE also limits the number of connections per server to 2, as recommended by RFC2616.
can anyone post a testcase where this bug is often reproducible? perhaps a page with the slow-to-respond ads.
http://www.tweakers.net/ Situation: Many layers. The bottom part loads first. Two icq flowers (Online / Offline state) block loading of the upper part. Note: Other windows don't start loading until the first page is finished. (This should also be visible on other bulletin boards using icq flowers. The servers are down or something like that.) IE loads the full page, and says ready very fast. It looks like the icq flower connection ( :-) ) times out much faster.
thanks for the testcase... i'm able to repro the page not loading, but i think that "Page doesn't finish loading" would be a more accurate Summary, since mozilla is still able to access other pages.
Status: NEW → ASSIGNED
Summary: Mozilla occasionally stops talking on the network → Page doesn't finish loading
-> mozilla 0.9.6
Priority: -- → P2
Target Milestone: --- → mozilla0.9.6
actually, the page does finish loading eventually. i'm not sure why it seems to take mozilla a while to figure out that the page has been completely loaded.
Summary: Page doesn't finish loading → Page doesn't finish loading (as soon as it should)
That may be a more accurate summary for the given test case, but it's not at all accurate for the problem I opened this bug for. The problem I described is that once it stops being able to access the network, it continues to be unable. Forever. Until the app is restarted. Even if I let it sit there for two hours (yes, I actually tried that). I haven't seen it for a while, so maybe some other change fixed it. But it was so intermittent to begin with that it's hard to say if it's just luck that it hasn't happened again.
the problem seems to be that our connection attempt to online.mirabilis.com is failing, and we end up waiting for a connection timeout. the http request is never even written out. here's the URL that is failing: http://online.mirabilis.com/scripts/online.dll?icq=36012466&img=5 the last status message from the socket transport before it times out is CONNECTING_TO.
when i try "telnet online.mirabilis.com 80" i never get a connection, so i think we can safely rule out this as a mozilla bug. so, can anyone provide a testcase for this bug? or can we mark it FIXED/INVALID?
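The telnet check above can also be scripted. This is a minimal sketch, not anything from Mozilla's code: the helper name can_connect and the default timeout are made up for illustration. It only tells you whether a TCP connection to a host and port can be opened within a given time, which is the same thing the telnet test shows by hand.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; it sends no HTTP request at all.
    """
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False
```

Calling can_connect("online.mirabilis.com", 80, timeout=10) would correspond to the telnet attempt described in the comment; a False result means the failure happens before any HTTP request could be written.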
Priority: P2 → --
Target Milestone: mozilla0.9.6 → ---
sean: well, a lot of bugs have been fixed since 8/15 ... especially bugs pertaining to HTTP connection management. i'm willing to bet that this bug is fixed since you've not seen it recently. would you say that it has not shown up since august?
I've seen it more recently than that, but it's possible that the machine it happened on was running an older build at the time. I'd be satisfied with closing this ticket and opening a new one with more info if it does happen again.
OK.. on that note, i'm going to mark this as FIXED. please reopen this bug if you find that it is occurring with a recent nightly build. thanks!
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
I've entered a new bug 103070 to describe the problems that the fix for bug 83526 causes when using a proxy.
Why did you mark this bug "FIXED" without any evidence that it was addressed? If anything, mark it "WORKSFORME" if you can't reproduce it and it hasn't been seen for a while, but "FIXED" implies that some positive action was taken to resolve the problem after identifying the cause. Saying that "lots of bugs have been fixed" and thereby assuming that this must be one of them is a very bad habit to get into, and forces people to keep resubmitting or reopening bugs that were never truly fixed because they were too quickly dismissed. And in the case of this bug, it is definitely NOT fixed. Or, to be more exact, my bug 100826, which was marked as a duplicate of this bug, is not fixed. My reading of this bug is that both bugs seem to be talking about the same problem, where the throbbers spin, but network traffic isn't happening. Either this bug needs to be reopened, if mine is properly a duplicate, or I'll have to reopen my bug if it's not properly a duplicate. One way or the other, it's still broken. I just reproduced this with the latest nightly, 2001100321. I have a test CGI that normally returns very consistently in about 0.25 seconds. I went to the test page http://www.uclick.com/client/cap/gm/2001/09/10/index.html which was mentioned under bug 100826, and kept selecting different days (and clicking on the "click" button) until one got stuck. As soon as that happened, my other window talking to the test CGI froze also. It took 74 seconds for the test window to unfreeze itself, and then the test CGI also unfroze. Only at the very END of that 74 seconds did it open the connection to fetch the test CGI -- I was running a tcpdump and saw the SYN packets go out only at the very end. The test CGI was at http://escher.ties.org/cgi-bin/date for testing, but that's close to me on the network (it's my home machine), and may not work for others to test. I've already uploaded that test CGI under bug 40867 on 7/20/01, if anyone wants to use it locally for testing. 
It returns a date, tries to suppress caching, and has a GET link and a POST button to retrigger the CGI. Is there some sort of maximum connection limit causing this? I am NOT using a proxy, but it seems to refuse to open new connections when (enough?) old ones are hung due to network problems at other sites...
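The connection-limit question above can be pictured with a toy model. This is only a sketch of the hypothesis, not Mozilla's actual connection management: the function name start_times, the idea that exactly two connections are hung, and the cap values are all assumptions made for illustration. It computes when each queued request would be allowed to open its connection if only a fixed number of connections may be open at once.

```python
import heapq


def start_times(durations, max_connections):
    """Return the time at which each request may start.

    durations: how long each request holds its connection, in seconds,
               in arrival order (all requests are assumed to arrive at t=0).
    max_connections: hypothetical cap on simultaneously open connections.
    """
    # Min-heap of the times at which each connection slot becomes free.
    free_at = [0.0] * max_connections
    heapq.heapify(free_at)
    starts = []
    for duration in durations:
        slot_free = heapq.heappop(free_at)   # earliest available slot
        starts.append(slot_free)             # the request starts when the slot frees up
        heapq.heappush(free_at, slot_free + duration)
    return starts
```

With two hung requests that each hold a slot for 74 seconds and a cap of 2, start_times([74, 74, 0.25], 2) gives [0.0, 0.0, 74.0]: the fast 0.25-second request cannot even begin until a hung one gives up, which would match SYN packets going out only at the very end. With a cap of 3, the same call returns [0.0, 0.0, 0.0].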
deven: the reporter has confirmed that _this_ bug no longer appears to exist in recent builds. it appears as though it has been fixed, therefore it has been resolved FIXED. if you believe that this bug still exists, you may reopen it with some evidence that it is the same bug as originally reported. otherwise, please reopen bug 100826.
This is a vague and intermittent bug. As far as I can tell from the original description on this bug, it's the same problem I reported. Mine was "resolved" as a duplicate of this bug. The main difference in my version of the bug report is that I've pointed out that sometimes the delay is extremely long but not infinite (over a minute even to a responsive site, often many minutes) while Netscape 4 can browse fine at the same time to the same sites. I also reported what appeared to be an infinite hang where all networking failed for days on end. Both reports sound like the same bug to me, as well as to the person who marked my bug as a duplicate of this one. What more evidence do you need that it's the same bug? Given that it doesn't seem to be possible to reproduce the infinite hang on demand, what evidence do you have that the bug is actually fixed? Just because the original reporter hasn't noticed it recently is hardly proof that it's been fixed. Maybe he's just been lucky recently. How do you know he won't see it again? Bah. There's no point in debating this. I'm reopening bug 100826. I reported that one, and I know THAT bug remains. It sure sounds like the same bug as this one to me, but if you're so convinced this one is fixed, then so be it.
Deven: for duplicates, verification is generally considered to mean "is the same problem". This is not the greatest process, the major limitation being the situation you describe, where the dupe bug still occurs after the original is fixed. This is rare, and currently we live with the problem by assuming reporters will follow the "original" bug of their "duped" bug. When the original is fixed but your case is not, the reporter is free to re-open their bug; just put in a brief comment explaining why you did so ("still happens on X build", "looks like this now...", etc...). Since you already did that, I think we should move back to your bug and see what is going on...