Thunderbird times-out waiting for final IMAP APPEND response for large messages.
Categories
(MailNews Core :: Networking: IMAP, defect)
Tracking
(thunderbird_esr78+ fixed, thunderbird86 affected)
People
(Reporter: gds, Assigned: gds)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(2 files, 3 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
mkmelin
:
review+
wsmwk
:
approval-comm-esr78+
|
Details | Diff | Splinter Review |
This bug is a branched from Bug 538375 Comment 127. What's happening is if a message is appended to an imap mailbox as the result of a copy from another imap account or Local Folders or when saving a draft or copy to Sent and the message is quite large and/or the server is slow, because we only wait 20 seconds for the server to respond with OK after the final segment is transferred, we timeout the transfer and retry. The retry typically also fails. I've noticed this behavior specifically with the outlook.com imap server.
In the distant past, the timeout for this wait was the same as the mail.tcptimeout value, 100 seconds. Bug 396874 change the timeout to 20 seconds due to other issues. So Bug 396874 regresses this bug.
A simple fix for this is to introduce a new pref mailnews.tcptimeout_append with a default of 20 that can be adjusted up/down if errors are noticed. Another possibility is to have a boolean pref mailnews.append_is_tcptimeout that defaults to false but when set true the append timeout is changed from 20 to whatever mailnews.tcptimeout is set to.
The attachment shows the behavior when appending to outlook.com first with the default 20 sec timeout and then with 100 sec timeout. In the first case, you see that tb issues a tcp FIN (frame 505) at about 20 seconds after the final append segment is sent to the server. Since we disconnect from the server, the full append is tried again, which also times out and fails.
In the seconds case, I have increased the append timeout to equal the tcptimeout (100s). The time for the server to respond (time between frames 335 and 341) is 28 seconds which is well within the 100s timeout limit. (Note: frame 341 actually contains the OK response but it doesn't show in the summary display.) The append succeeds and the next message is appended at frame 347.
Note also that the newly added tcp keepalive doesn't affect this since it just keeps the server from disconnecting. The problem here is that TB does the disconnect. Possibly if outlook.com sent keep-alives while processing the final append response, TB would not disconnect; but outlook.com remains completely silent on the network for more than 20 seconds.
Edit: Actually tb sending keep-alive packets could help. However the time base for the keep alive packet interval is in minutes. So for keep alives to be effective for this bug, they would have to be sent on an interval of less than then 20 sec. This would create lots of useless network activity and would be way overkill for the intended purpose of our keep alives: to keep the server from disconnecting during imap IDLE and to speed up reconnects (re: Bug 1535969).
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
Could this impact Bug 760859 - thunderbird blocks UI ... while saving draft message with big files attached ?
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
For updated "try" builds containing a possible fix for this bug please see bug 538375 comment 158
Comment 3•4 years ago
|
||
(In reply to gene smith from comment #2)
For updated "try" builds containing a possible fix for this bug please see bug 538375 comment 158
I tried the new build and can confirm it works. Thanks! I thought I was going crazy wondering why my timeout settings were being ignored - I set tcptimeout to 25000 and it still was timing out on the TB side.
Comment 4•4 years ago
|
||
(In reply to gene smith from comment #2)
For updated "try" builds containing a possible fix for this bug please see bug 538375 comment 158
Following up:
it looks like the change got backed out of or try-comm-central
https://hg.mozilla.org/try-comm-central/file/tip/mailnews/imap/src/nsImapProtocol.cpp#l2180
with this commit:
https://hg.mozilla.org/try-comm-central/rev/45293a726fc3b519b1081c393c95908c361c077b
The artifacts are also gone from the build at https://bugzilla.mozilla.org/show_bug.cgi?id=538375#c158, so right now there is no why to get this fix easily to users. Could the change please be put back in and/or rebuilt so binaries are available again either through try-comm-central nightlies or treeherder? Thanks.
Assignee | ||
Comment 5•4 years ago
|
||
Yes, the "try" builds only hang around for a limited time I guess to avoid the need for infinite disk storage space somewhere.
I've gone ahead and produced a patch for a simple fix. This will allow you to adjust the new pref called mailnews.appendtimeout in the config editor from its default 20 seconds to whatever is needed. Probably setting this to 100 seconds will cover the append timeout issues during bulk copies. It worked for me. (I would recommend setting it back to default when done doing bulk copies to avoid the potential issues discussed in bug 396874.)
The pref gets read into a static variable so it will be the same for all imap accounts and connections that are instantiated. So for a change to this pref to take effect a tb restart is needed.
If and when this is approved, it will first appear in a daily build.
Comment 6•4 years ago
|
||
Updated•4 years ago
|
Assignee | ||
Comment 7•4 years ago
|
||
While this "works" I'm not sure it has any impact per se.
The try builds I did for Bug 538375 with this fix did have impact according to a few reports there. Granted, this is not be the only reason Bug 538375 exists but it does have some effect on it.
Very few users would know to adjust this.
Bug 538375 comes up a lot when you google search for problems involving copy/move of lots of tb message (e.g., google: "thunderbird bug move/copy a lot of message") so a suggestion to adjust this could be placed in the "user story" at the top of that bug and this bug. Then the "very few" users that have the problem can find a possible solution.
Could we make it dynamic to adapt as needed, and work out of the box for all? I see there's something similar at https://searchfox.org/comm-central/rev/f892204fdc17e07b7c2e5e4817a1d27d9fb7a328/mailnews/imap/src/nsImapProtocol.cpp#2187
I don't see a way to auto-adjust this. The adjustment you refer to depends only on the number of messages to be copied. The imap APPEND timeout depends on the size of the problem message, the imap server type and maybe even the speed of the connection. So there is no simple criteria to trigger a change in the append timeout.
should mention unit (seconds)
Fixed in v2 patch.
Comment 8•4 years ago
|
||
Not sure we should add another pref of questionable usage. Maybe we need to increase the default, if it's a problem? Or just make it a fraction of the tcptimeout? If the append is slow, I'd guess many times they correlate.
Comment 9•4 years ago
|
||
Comment on attachment 9177478 [details]
1618455-fix-for-append-timeout-for-bulk-copies-v2.patch
See comment 8, what do you think?
Assignee | ||
Comment 10•4 years ago
|
||
Sorry for the delay in answering. Missed your comment in the mail list from 6 days ago (canceled review got my attention).
First, not sure it's more questionable than a lot of other prefs. Anyhow, not sure what you mean by a fraction of the tcptimeout. Right now it is 20% of the tcptimeouts, 20 sec. Are you suggesting a new pref to set the append timeout as a fraction of tcptimeout? You probably mean set it to a fixed 20% of tcptimeout and if you want/need a longer append timeout, just increase the tcptimeout. That sounds good to me.
With the attached diff, if I need to cause the append timeout to be 40 seconds instead of default 20 I would set the tcptimeout to 40*5 = 200 seconds.
P/S: I don't know why gResponseTimeout is initialized to 60 when the mailnew.js defaults it to 100 (the tcptimeout default).
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
(In reply to gene smith from comment #10)
P/S: I don't know why gResponseTimeout is initialized to 60 when the mailnew.js defaults it to 100 (the tcptimeout default).
Maybe it changed over time? We could start using 100 in the initialization too.
Comment 14•4 years ago
|
||
Is it possible to test this?
I have some old mails with large attachments I am trying to copy from local folders and is getting this timeout for many mail
Assignee | ||
Comment 15•4 years ago
|
||
Sorry for the long delay in getting back to this but here the patch.
Assignee | ||
Comment 16•4 years ago
|
||
(In reply to Thorvald Aagaard from comment #14)
Is it possible to test this?
I have some old mails with large attachments I am trying to copy from local folders and is getting this timeout for many mail
Sorry for the delay in getting back to you. If you still need this, when this patch is approved and "lands" it will be then become available in the daily build. If you need it sooner, I can run a "try" build and you can download and install it from the "try" server. Please let me know if you need the quicker "try" build.
One technical note: This doesn't allow you to adjust the imap append timeout independently. You must adjust the mailnews.tcptimeout preference (default 100 second) to change the append timeout. Append timeout will always be 20% of mailnews.tcptimeout. (See comment 10 above for an example of setting the imap append timeout to 40 seconds. Of course, you can set it to whatever you need.)
Comment 17•4 years ago
|
||
Updated•4 years ago
|
Comment 18•4 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/a3f643dfaf67
Change the imap append timeout from hardcoded 20s to 1/5th of mailnews.tcptimeout. r=mkmelin
Comment 19•4 years ago
|
||
Great seeing this go live. I ended up making sure any sync was below 5MB, and just stored larger files on my disk.
But with this change it will again be possible to use my mail to store larger attachments
Comment 20•4 years ago
|
||
If this rides the train for a while, will it be arriving on the release channel? The new cottage industry is trying to upload entire gmail accounts to new gmail accounts and issues are quite common in Support as a result. This would have an impact I am sure.
Assignee | ||
Comment 21•4 years ago
|
||
Well, this patch isn't a complete panacea for migration between servers, not sure if anything is. There's an app call imapSync that is designed for that and does only that (not sure if it's free). There's also my unrelease, uncertified and not guaranteed tb addon (that I ported from someone else's work). This has the link to it: bug 538375 comment 169 (definitely a work in progress).
Comment 22•4 years ago
|
||
Comment on attachment 9200604 [details] [diff] [review]
Bug1618455-imap-append-timeout-not-hardcoded.patch
[Approval Request Comment]
User impact if declined: this can help with timeouts for laggy servers/networks
Testing completed (on c-c, etc.): on beta
Risk to taking this patch (and alternatives if risky): very save
Comment 23•4 years ago
|
||
Comment on attachment 9200604 [details] [diff] [review]
Bug1618455-imap-append-timeout-not-hardcoded.patch
[Triage Comment]
Approved for esr78
Comment 24•4 years ago
|
||
bugherder uplift |
Thunderbird 78.8.1:
https://hg.mozilla.org/releases/comm-esr78/rev/33f12cb03f48
Description
•