Closed Bug 614950 Opened 14 years ago Closed 14 years ago

Connections stall occasionally after 592284 landed

Categories

(Core :: Networking: HTTP, defect)

x86
Windows 7
defect
Not set
major

Tracking

RESOLVED DUPLICATE of bug 614677
Tracking Status
blocking2.0 --- final+

People

(Reporter: khronos, Unassigned)

Details

(Whiteboard: [http-conn])

User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:2.0b8pre) Gecko/20101125 Firefox/4.0b8pre
Build Identifier: Mozilla/5.0 (Windows NT 6.1; rv:2.0b8pre) Gecko/20101125 Firefox/4.0b8pre

Copied over from [592284 - Accelerate HTTP Connection Timeout], comment 31 from 2010-11-25:

It seems this causes some problems. It is hard to reproduce (it takes a while to manifest), but since this landed I have ended up in a sort of hung connection state a few times. When this happens, all loading tabs stay at "Connecting...". The only way to get things moving is to close the tab that caused the stall, but there is no way to find out which one it was, so it's a matter of luck (i.e. closing tabs one by one until it starts working again). Today it hung because of a download: the download was running fine (and I tried IE, which was loading just fine too), but nothing else could be loaded in Minefield until the download finished (20 minutes).

Reproducible: Sometimes

Steps to Reproduce:
No way to reproduce this reliably; it just happens from time to time.
This might be a dupe of bug 614677
Similar, but probably not a dupe. I don't get any connection reset messages. My tabs stay in the "Connecting..." state and don't time out even after a very long time.
I suspect this is the same problem I've been experiencing for the last few days on Mac OS.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I suspect the issue is with the "reuse" of the unused connection after some period of time. We do test the socket for a FIN before using it, but maybe there is still a server-side bias against a connection without a quick request appearing on it.

On or before Monday I will:
1] make a tryserver build with the reclaim logic stubbed out, so 613977/614677/614950 users can confirm or refute whether that is the source of the problem, and
2] write a patch that a] doesn't do this reuse for TLS (613977, hopefully) and b] restricts reuse of these 'unprimed' connections to just a couple of seconds after being created (hopefully covers both this and 614677).

I think the reclaim bits are still worth doing even if they are only valid for a short period of time: HTTP connections to the same host obviously have a strong sense of temporal locality.
I can't contribute to a solution, but this build: firefox-4.0b8pre.en-US.win32.installer.exe tinderboxbuild: 1290676445 changeset: 7662cf45a33b has been the fastest and has had the fewest "connecting..." problems I've seen in weeks.
Would the reporter try this build: http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds/mcmanus@ducksong.com-2be20026e175/ That build removes just the reclaim logic associated with surplus extra connections. If that does successfully isolate the problem we can build something a little more nuanced for merge.
(In reply to comment #5)
> I cant contribute to a solution, but this build:
>
> firefox-4.0b8pre.en-US.win32.installer.exe
>
> tinderboxbuild: 1290676445
> changeset: 7662cf45a33b
>
> has been the fastest and fewest "connecting..." problems Ive seen in weeks.

Thanks Bruce, that probably means you are benefiting from the aggressive restart logic. I suspect the problems being seen are from the "reclaim" logic, which is an optimization that tries to reuse some of the excess connections created by the restarts. The two things are largely separable, and the build mentioned in comment 6 makes that separation.
I haven't had time to test the tryserver build, and it will be pretty hard to test, as I didn't have any problems for a day and a half on the normal nightly until last night. When it happened again last night, I fired up TCPView and, surprisingly, there were no connections to the servers it was trying to connect to. Once it unfroze by itself (this time without any action from me) after about 1.5 minutes, the connections popped up.
The tryserver build made a major difference for me, especially on the Mozillazine forums, where I was getting "Connection reset" messages every few minutes. During the time I used the tryserver build, I didn't receive the message once; as soon as the auto-update kicked in, the messages returned. I still have occasional problems where opening a link in a new tab simply opens the tab and then waits for a second or two before showing any activity; it just shows 'new tab'.
Blocks: 592284
Severity: normal → major
blocking2.0: --- → ?
Seems to be running fine, no noted hangs here in over 12 hrs.
No longer blocks: 592284
Depends on: 613977
Even though the symptoms are a bit different, the root cause of this is almost certainly tied up in the same issue as 614677, so I'm going to dup it to avoid the confusion of looking in the wrong place.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
No longer depends on: 613977
blocking2.0: ? → final+
Whiteboard: [http-conn]