Closed Bug 555537 Opened 15 years ago Closed 14 years ago

Thunderbird runs into working threads number limit or socket number limit? ('hangs' on SMTP connect)

Categories

(MailNews Core :: Networking: SMTP, defect)

x86
Windows Vista
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: andrey, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.2.2pre) Gecko/20100302 Lightning/1.0b2pre Lanikai/3.1b1

2010-03-28 10:56:37.456000 UTC - 0[f27140]: SMTP Connecting to: mail
[The 'sending mail' message appeared]
2010-03-28 10:58:51.445000 UTC - 0[f27140]: queuing url:imap://all@i
2010-03-28 10:58:51.445000 UTC - 0[f27140]: considering playing queu
2010-03-28 10:58:51.445000 UTC - 0[f27140]: creating protocol instan
2010-03-28 10:58:51.445000 UTC - 0[f27140]: failed creating protocol
2010-03-28 10:58:51.457000 UTC - 0[f27140]: considering playing queu
2010-03-28 10:58:51.457000 UTC - 0[f27140]: creating protocol instan
2010-03-28 10:58:51.457000 UTC - 0[f27140]: playing queued url:imap:
2010-03-28 10:59:14.679000 UTC - 0[f27140]: queuing url:imap://spam@
2010-03-28 10:59:14.679000 UTC - 0[f27140]: considering playing queu
2010-03-28 10:59:14.679000 UTC - 0[f27140]: creating protocol instan
2010-03-28 10:59:14.679000 UTC - 0[f27140]: failed creating protocol
2010-03-28 10:59:14.688000 UTC - 0[f27140]: considering playing queu
2010-03-28 10:59:14.688000 UTC - 0[f27140]: creating protocol instan
2010-03-28 10:59:14.688000 UTC - 0[f27140]: playing queued url:imap:
2010-03-28 11:10:35.484000 UTC - 0[f27140]: queuing url:imap://spam@
2010-03-28 11:10:35.484000 UTC - 0[f27140]: considering playing queu
2010-03-28 11:10:35.484000 UTC - 0[f27140]: creating protocol instan
2010-03-28 11:10:35.484000 UTC - 0[f27140]: failed creating protocol
2010-03-28 11:10:35.493000 UTC - 0[f27140]: considering playing queu
2010-03-28 11:10:35.493000 UTC - 0[f27140]: creating protocol instan
2010-03-28 11:10:35.493000 UTC - 0[f27140]: playing queued url:imap:
[after 14 minutes it connects finally!]:
2010-03-28 11:10:49.551000 UTC - 0[f27140]: SMTP entering state: 0
2010-03-28 11:10:49.552000 UTC - 0[f27140]: SMTP Response: 220 myserver

Reproducible: Sometimes

Steps to Reproduce:
1. Compose and send email


Actual Results:  
World wide wait

Expected Results:  
Sending email.

Using about 20 email accounts in Thunderbird. Mostly on one mail server in different domains.

Hangs on SMTP sending happens very often. The only way to send mail w/o waiting in this case is to restart Thunderbird. After restart I can send several messages fast, but than unable again. 

This is not a firewall related problem (no firewalls), and not a SMTP-server problem (always can connect to it using telnet). Also tried local Socks5-proxy to see, how Thunderbird connects, and, when it hangs - it just waiting something, not even attempt to establish a socket connection.

Searching for appropriate counter to move aside limits in the 'about:config', but can't find something related to SMTP or total number of sockets/buffers/threads.
(In reply to comment #0)
> Hangs on SMTP sending happens very often.

Log you attached.
> 2010-03-28 10:56:37.456000 UTC - 0[f27140]: SMTP Connecting to: mail
> [The 'sending mail' message appeared]
>(snip)
> 2010-03-28 11:10:49.551000 UTC - 0[f27140]: SMTP entering state: 0
> 2010-03-28 11:10:49.552000 UTC - 0[f27140]: SMTP Response: 220 myserver
And, log says "connectiong to IMAP server is being executed".
> Next is repeatedly logged.
> 2010-03-28 10:59:14.688000 UTC - 0[f27140]: playing queued url:imap:
> 2010-03-28 11:10:35.484000 UTC - 0[f27140]: queuing url:imap://spam@

Hangs when tries to connect to SMTP server? Or hangs while sending data to SMTP server? Or hangs when tries to save copy of sent mail to IMAP sent folder?

When problem occurs, how many TCP connections of Tb is displayed by "netstat" command or by some other commands?

When you experience problem again, try next, please.
- If hangs while sending data to SMTP, cancel it and "Send Later",
  if possible. Mail is saved in "Outbox" of "Local Folders".
- "Work Offline"(reply cancel to dialog for "download for offline use"),
  then go back to "Work Online"(reply no to dialog for "send unsent messages").
- "Send Unsent Messages".
What happens?

> Gecko/20100302 Lightning/1.0b2pre Lanikai/3.1b1

Can you re-produce problem with -safe-mode and/or with new profile(an IMAP account and an SMTP server only)?

Do you define RSS accounts and watch many RSS feeds? If yes, disable periodical check of RSS feed update. Will frequency of problem be reduced?

Can you get log for other protocol also and check accesses to external sites?
> https://wiki.mozilla.org/MailNews:Logging
> http://www.mozilla.org/projects/nspr/reference/html/prlog.html#25328
> https://developer.mozilla.org/en/HTTP_Logging  
> MS Win example : SET NSPR_LOG_MODULES=
> timestamp,smtp:5,imap:5,pop3:5,nntp:5,nsHttp:5
Something wrong is seen?
If network/socket level error is suspected, get socket level log also and check log, please.
> MS Win example : SET NSPR_LOG_MODULES=
> timestamp,smtp:5,imap:5,pop3:5,nntp:5,nsHttp:5,nsSocketTransport:5,nsHostResolver:5
Because log file size becomes huge due to nsSocketTransport:5 and due to big mail data with imap:5, please don't attach raw log file to this bug. Check locally by yourself first, please.
FYI.
Reason why I asked above questions - there are some reports for similar phenomenon to yours or phenomenon of slowness/UI hang with high CPU while mail sending. 
  Bug 535070 : interfere by RSS feeds is suspected(Bug 535070 comment #34)
  Bug 538283 : connection via OpenVPN is suspected(Bug 538283 comment #15)
  Bug 538397 : interfere by mail scanner is suspected(Bug 538397 comment #6)
> Hangs when tries to connect to SMTP server? Or hangs while sending data to SMTP
> server? Or hangs when tries to save copy of sent mail to IMAP sent folder?

As you can see from the log, hangs on BEFORE actual connect to SMTP-server. SMTP-server wasn't connected in these 14 minutes, Thunderbird was waiting for something (some of its internal mutexes, I think).

> And, log says "connectiong to IMAP server is being executed".

Right. Thunderbird was connected to all of my (many) IMAP accounts. And SMTP problem occurs more often after I added 2 new IMAP accounts recently.

> When problem occurs, how many TCP connections of Tb is displayed by "netstat"
command or by some other commands?

Will check, when it happens next time. Now (normal state) there are 20 established connections from my Thunderbird to IMAP servers.

> When you experience problem again, try next, please.
> - If hangs while sending data to SMTP, cancel it and "Send Later",
>  if possible. Mail is saved in "Outbox" of "Local Folders".
> - "Work Offline"(reply cancel to dialog for "download for offline use"),
>   then go back to "Work Online"(reply no to dialog for "send unsent messages").
> - "Send Unsent Messages".

OK, will try. But on my previous attempts to cancel sending, it not prompts me to save messages to "Outbox" of "Local Folders" (and I don't know how to save new composed messages to this folder, so I just saved messages to 'Draft' folders on the IMAP server, restart Thunderbird and re-send messages from Draft), and newer seen "Work offline" prompts (should I use "Work offline" menu command instead of restarting Thunderbird in case of problem?)

> Do you define RSS accounts and watch many RSS feeds?

No. I have two NNTP accounts, but these are inactive always.

And I have Lightning extension installed, it periodically checks changes on my WebDAV calendar account (tasks), and not interfere with IMAP/SMTP, I hope.

> Something wrong is seen?

"failed creating protocol instance":

2010-03-29 11:38:34.491000 UTC - 0[2727140]: queuing url:imap://myemail:143/fetch>UID>/spam>24040
2010-03-29 11:38:34.491000 UTC - 0[2727140]: considering playing queued url:imap://myemail:143/fetch>UID>/spam>24040
2010-03-29 11:38:34.491000 UTC - 0[2727140]: creating protocol instance to play queued url:imap://myemail:143/fetch>UID>/spam>24040
2010-03-29 11:38:34.491000 UTC - 0[2727140]: failed creating protocol instance to play queued url:imap://myemail:143/fetch>UID>/spam>24040

> Reason why I asked above questions - there are some reports for similar
> phenomenon to yours or phenomenon of slowness/UI hang with high CPU while 
> mail sending.

The CPU load and memory used was normal, not high.

And I have no local antivirus/antispam. Mail filtered on the server.
> When problem occurs, how many TCP connections of Tb is displayed by "netstat"
> command or by some other commands?

54 conections (4 local 127.0.0.1<->127.0.0.1, and others to extern IMAP servers).

Log:
2010-03-29 22:34:41.165000 UTC - 0[2627140]: nsHttpHandler::Observe [topic="timer-callback"]
2010-03-29 22:34:41.165000 UTC - 0[2627140]: STS dispatch [b3e1140]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: SMTP Connecting to: myserver
2010-03-29 22:34:44.660000 UTC - 0[2627140]: creating nsSocketTransport @bded4a0
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::Init [this=bded4a0 host=myserver:25 proxy=:0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::PostEvent [this=bded4a0 type=3 status=0 param=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [11c73040]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::PostEvent [this=bded4a0 type=3 status=0 param=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [11c73060]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::OpenOutputStream [this=bded4a0 flags=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [d8a88c8]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::PostEvent [this=bded4a0 type=0 status=0 param=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [11c73080]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::OpenInputStream [this=bded4a0 flags=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [d8a8948]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: nsSocketTransport::PostEvent [this=bded4a0 type=0 status=0 param=0]
2010-03-29 22:34:44.660000 UTC - 0[2627140]: STS dispatch [11c73040]
2010-03-29 22:34:56.173000 UTC - 0[2627140]: nsHttpHandler::Observe [topic="timer-callback"]
2010-03-29 22:34:56.173000 UTC - 0[2627140]: STS dispatch [4717100]
2010-03-29 22:35:11.180000 UTC - 0[2627140]: nsHttpHandler::Observe [topic="timer-callback"]
2010-03-29 22:35:11.180000 UTC - 0[2627140]: STS dispatch [4d07300]

[waiting for 7 minutes, then clicked 'Cancel')

> When you experience problem again, try next, please.
> - If hangs while sending data to SMTP, cancel it and "Send Later",
>  if possible. Mail is saved in "Outbox" of "Local Folders".

When I clicked 'cancel' I got 'an error occured, check credentials and try again'. It not prompts to 'Send later'.

Entered to offline mode (these local 127.0.0.1 connections remains, but others disappears in netstat listing). In the 'compose' window button 'Send' renamed to 'Send later'. Clicked it. Message was saved to 'Outbox'. Switch back to online mode, it prompted to send unsent messages, I agreed, and it successfully and fast sent it:

2010-03-29 22:56:07.584000 UTC - 0[2627140]: SMTP Connecting to: myserver
2010-03-29 22:56:07.599000 UTC - 0[2627140]: SMTP entering state: 0
2010-03-29 22:56:07.599000 UTC - 0[2627140]: SMTP Response: 220 ...
2010-03-29 22:56:07.599000 UTC - 0[2627140]: SMTP entering state: 14
2010-03-29 22:56:07.615000 UTC - 0[2627140]: SMTP Send: EHLO [myIP]
2010-03-29 22:56:07.615000 UTC - 0[2627140]: SMTP entering state: 0
[...]
2010-03-29 22:56:08.410000 UTC - 0[2627140]: SMTP Send: .
2010-03-29 22:56:08.520000 UTC - 0[2627140]: SMTP entering state: 0
2010-03-29 22:56:08.520000 UTC - 0[2627140]: SMTP Response: 250 OK message accepted for delivery
2010-03-29 22:56:08.520000 UTC - 0[2627140]: SMTP entering state: 9
2010-03-29 22:56:08.520000 UTC - 0[2627140]: SMTP Send: QUIT
2010-03-29 22:56:08.520000 UTC - 0[2627140]: SMTP entering state: 0
2010-03-29 22:56:09.471000 UTC - 0[2627140]: SMTP entering state: 0
2010-03-29 22:56:09.471000 UTC - 0[2627140]: SMTP Response: 221 Goodbye.
2010-03-29 22:56:09.471000 UTC - 0[2627140]: SMTP entering state: 10
2010-03-29 22:56:10.579000 UTC - 0[2627140]: SMTP entering state: 12

OK, this offline/online switch is useful workaround (instead of restarting Thunderbird :)

50 connections - how to move aside this limit?
(In reply to comment #4)

Good analysis.

> 54 conections (4 local 127.0.0.1<->127.0.0.1, and others to extern IMAP servers).

"4 local 127.0.0.1<->127.0.0.1" is 2 sets of loopback connection for Tb<->Tb communication. It's probably for ease of implementation(queueing and scheduling by myself is not needed.)

> 50 connections - how to move aside this limit?

Followig is a web page found by Google search for "tcp connection number limit windows". It looks client side setting. 
> http://smallvoid.com/article/winnt-tcpip-max-limit.html

As it seems that there is no error like "too many connections" from IMAP server, limitation by MAXPERIP(parameter name of Courier) looks irrelevant to your case. However, some kinds of limitation such as maximum tcp connection count per server application, maxmum tcp socket count per server application/per server, may be relevant. It can be a trigger of temporary connection error or temporary communication failure.   

How many IMAP accounts do you define? If max chached connections(per account)=5 and 10 IMAP accouns, 50 connections is normal phenomenon.
Is a server used for all of IMAP, POP3, SMTP, ,,,?
Is error(in socket log) seen around IDLE/DONE per 30 seconds?
(AFAIR, connection error during IDLE is not handled well.)
If issue around IDLE is suspected, can you check with "IDLE command use by Tb disabled"? (If you can afford to "IDLE command use disabeled" in dayly use, please use Tb for several days with "IDLE command use disabeled".)
For SMTP connection.
> 2010-03-28 10:56:37.456000 UTC - 0[f27140]: SMTP Connecting to: mail
> [The 'sending mail' message appeared]
>(snip)
> 2010-03-28 11:10:49.551000 UTC - 0[f27140]: SMTP entering state: 0
> 2010-03-28 11:10:49.552000 UTC - 0[f27140]: SMTP Response: 220 myserver

During Tb's SMTP code is waiting for completion of connection(till just before SMTP entering state: 0 after 4 minutes), is soecket level retries of connection to the SMTP server seen in log? Or waiting for response from server or waiting for free tcp socket?
> Followig is a web page found by Google search for "tcp connection number limit
> windows". It looks client side setting.

> However, some kinds of limitation such as maximum tcp connection
> count per server application, maxmum tcp socket count per server
> application/per server, may be relevant.

I've not changed TCP settings of my Vista. And sometimes there are thousands of connections (total). And, when SMTP in TB hangs, I can connect to the same SMTP/IMAP server using telnet or Outlook from the same PC. So I've not run
into OS or server limits.

From my first message: "Using about 20 email accounts in Thunderbird. Mostly on one mail server in different domains."

mail.server.server*.max_cached_connections was 2..5, I've set it to 15 (yesterday), but it still hangs sometimes - the log in the 'Comment 4'.

> is soecket level retries of connection to the SMTP server seen in log? Or waiting for response from server or waiting for free tcp socket?

At 2010-03-28 socket-level logging wasn't turned on. In yesterday log (2010-03-29) with all logging turned on I've not found any socket error in these 7 minutes before I clicked 'cancel'. What exact string should I grep?
Is there enough information to do anything other than close this incomplete?

On last contact, Andrey indicates he is no longer using thunderbird
Component: General → Networking: SMTP
Product: Thunderbird → MailNews Core
QA Contact: general → networking.smtp
wada, if you disagree please reopen. On last contact, Andrey indicates he is no longer using thunderbird
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.