Closed Bug 189843 Opened 22 years ago Closed 22 years ago

Cannot make any connections to "localhost" (127.0.0.1)

Categories

(Core :: Networking, defect, P1)

Sun
SunOS
defect

Tracking

()

RESOLVED FIXED
mozilla1.3beta

People

(Reporter: gonufer, Assigned: darin.moz)

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

Between 2003011700 and 2003011810 mozilla lost the ability to open any
connections using localhost (127.0.0.1).  I cannot load my default home page
from the Apache server running on localhost nor can I access the rest of the
'net via privoxy running on localhost.  Using a proxy on a remote host or
directly connecting to any remote hosts works fine.

That timeframe corresponds to when the changes for 176919 (support for async
input/output streams) were committed.

truss shows a connect to localhost followed shortly by a poll invoked by the
same thread in the working case:

36429/2:        connect(19, 0x001B3A50, 32, 1)                  = 0
36429/2:                AF_INET  name = 127.0.0.1  port = 8118
36429/2:        poll(0xFD49FC80, 2, 35000)                      = 1
36429/2:                fd=7  ev=POLLIN rev=0
36429/2:                fd=19 ev=POLLIN|POLLPRI|POLLOUT rev=POLLOUT
36429/2:        write(19, " G E T   h t t p : / / x".., 463)    = 463

In the failing case the connect succeeds, but the thread that called connect
never executes any subsequent system calls, nor is the file descriptor used by
connect ever referenced again directly (e.g., by a write system call).
to darin
Assignee: dougt → darin
Flags: blocking1.3b?
Blocks: 176919
Keywords: regression
hmm.. WFM linux 2003012108.  i can connect to my localhost apache server without
any problem.  i also successfully used a squid proxy on localhost.  i referred
to it using both localhost and 127.0.0.1 in the mozilla proxy prefs panel.
Greg: if you have a debug mozilla build, can you set the following environment
variables to collect a mozilla socket log?

bash$ export NSPR_LOG_MODULES=nsSocketTransport:5
bash$ export NSPR_LOG_FILE=/tmp/sock.log
bash$ mozilla

repro problem, exit mozilla, and upload /tmp/sock.log to this bug report.  thx!

can anyone else confirm this bug?
WFM on Linux, too.  Will build a debug build and do as suggested.
Severity: major → minor
Attached file debug sock.log (deleted)
This is a log of the browser starting, the attempt to load the default home
page from a web proxy running on localhost, and then File->Quit.
ah!  ic what the problem is now.  there is some code lacking when an initial
call to PR_Connect results in a successful connection, immediately without
blocking.  this is usually not the case on linux, windows, and most other
platforms.  this should be trivial to fix.  i think this probably affects all
platforms, but obviously far less frequently on the platforms i mentioned.
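
As a rough illustration of the case described above (this is not the Mozilla patch, and it uses Python's socket module rather than NSPR's PR_Connect): a non-blocking connect has two successful outcomes that the caller has to handle. It can complete immediately, or it can report "in progress" and finish later when the socket becomes writable. The function name connect_nonblocking and its timeout parameter are made up for this sketch.

```python
import errno
import os
import select
import socket


def connect_nonblocking(host, port, timeout=5.0):
    """Connect a non-blocking TCP socket, handling both successful outcomes.

    Returns (sock, immediate). `immediate` is True when the initial connect
    call itself reported success, with no need to wait for writability.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    rc = sock.connect_ex((host, port))
    if rc == 0:
        # Case 1: connected right away. The caller must treat the socket as
        # connected now instead of waiting for a later "connected" event.
        return sock, True

    if rc not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(rc, os.strerror(rc))

    # Case 2: connection in progress. Wait until the socket is writable,
    # then read SO_ERROR to learn whether the connect actually succeeded.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect timed out")

    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock, False
```

Either branch hands back a connected socket; the only point of the sketch is that the first branch needs its own code path and cannot rely on the wait in the second one.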
Severity: minor → critical
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.3beta
Attached patch v1 patch (obsolete) (deleted)
this patch should do the trick.  i found a couple errors in the connection
establishment code :(
gonufer: can you please test out this patch if you get a chance?  thx!!
Thanks!  Initial results look good: I can fetch pages served from web server
running on localhost, can get a few pages via privoxy proxy accessed via
localhost.  I'm running into bug#190079 very quickly when I load my default tabs
so I'm rebuilding right now to get that fix.

 
I've used it for a short while now, things look okay.
I spoke too soon.  With proxies it works okay, with "direct connection to the
internet" all networking stops dead once it hits the maximum number of
simultaneous connections.
greg, are you talking about:

pref("network.http.max-connections", 24);

or are you talking about the hard-coded limit of 50 sockets in the socket
transport service?  we shouldn't be hitting that limit :(

can you provide another sock.log for case where you hit the "maximum number of
simultaneous connections" limit.  thx!!
Attached file pstack output from hung mozilla (deleted)
This is pstack output of the mozilla process after it hung on a google images
query and then I chose File->Quit.
Attached file sock.log of hang after patch applied (deleted)
sock.log as requested.  I started mozilla, changed connection to direct, went
to google, searched for "eiffel tower" on images search.  It loaded the first
few pictures then got stuck.  I then chose File->Quit and it never exited.  I
used pstack (see previous attachment) and then killed the process.  This log
covers the entire timeframe.
greg: looks like you have HTTP pipelining enabled.  can you try disabling that?
No instant hang with pipelining disabled.  But having it enabled worked with
trunk builds from early/mid last week...
greg: i have another bug filed about pipelining causing problems.  since it is
not enabled by default, i'd like to make this bug just about fixing the bugs in
the socket transport.  please let me know if this patch doesn't fix the problem
100% w/ pipelining disabled.  thx!
Attached patch v1.1 patch - revised/simplified slightly (deleted)
this is basically the same patch with a few more cleanups.
Attachment #112376 - Attachment is obsolete: true
Attachment #112436 - Flags: superreview?(bzbarsky)
Attachment #112436 - Flags: review?(dougt)
I've been experiencing this problem too, both with the latest solaris nightly
and with a copy I compiled myself. I applied the v1.1 patch and recompiled.
Mozilla seems to be working properly now, using the privoxy proxy.

As sort of a torture test I opened about fifteen links in separate tabs. All of
the pages loaded successfully, and all of the sockets between mozilla and
privoxy were closed afterwards.

I have both keep-alive and pipelining turned on. Pipelining doesn't seem to be a
problem, but I don't know what privoxy actually supports.

The patch hunk with a starting line of 890 is a bit broken. The variable
"status" has been replaced by "dUmMy" in a few lines that aren't tagged as
changed. I reverted all of these to "status".
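
One way to catch this kind of damage before applying a patch is to scan the unified diff for a suspect token on context lines, i.e. lines that are neither added nor removed. A minimal Python sketch, assuming a plain unified diff; the function name and the sample hunk are invented for illustration and are not taken from the actual attachment.

```python
def suspicious_context_lines(patch_text, token):
    """Return (line_number, text) for unchanged diff lines containing token."""
    hits = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Skip added/removed lines and file headers (all start with + or -),
        # plus hunk headers; what remains are context lines.
        if line.startswith(("+", "-", "@@")):
            continue
        if token in line:
            hits.append((number, line))
    return hits


# Invented sample hunk, for illustration only.
SAMPLE = """\
--- a/example.cpp
+++ b/example.cpp
@@ -890,4 +890,4 @@
     nsresult dUmMy = NS_OK;
-    DoOldThing();
+    DoNewThing();
     return dUmMy;
"""

for number, line in suspicious_context_lines(SAMPLE, "dUmMy"):
    print(number, line.strip())
```

Any hit on a context line means the patch no longer matches the file it was made against, so those lines need to be restored by hand before the hunk will apply.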
Blocks: 189624
Comment on attachment 112436
v1.1 patch - revised/simplified slightly

let's get this tested quickly.  this needs to land for 1.3b.  ask benc to run
the browser buster and any other test cases you think are required.
Attachment #112436 - Flags: review?(dougt) → review+
this bug is possibly causing problems for mailnews too.
I'm going to put this patch in my tree and start testing to see if my mail/news
problems get any better.
Comment on attachment 112436
v1.1 patch - revised/simplified slightly

sr=sspitzer, but you might want bz's sr, too.
Attachment #112436 - Flags: superreview?(bzbarsky) → superreview+
Comment on attachment 112436
v1.1 patch - revised/simplified slightly

seeking approval on behalf of darin
Attachment #112436 - Flags: approval1.3b?
OK, I haven't seen any hangs and I probably would have by this point.  I think
this patch helps.
Comment on attachment 112436
v1.1 patch - revised/simplified slightly

a=blizzard for 1.3b
Attachment #112436 - Flags: approval1.3b? → approval1.3b+
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Flags: blocking1.3b?
*** Bug 190435 has been marked as a duplicate of this bug. ***
I'm intermittently seeing what blizzard was seeing with IMAP mail servers.  Was
there a separate bug for that?  IMAP just gets stuck.  Trunk 2003012412.
*** Bug 189624 has been marked as a duplicate of this bug. ***