Open Bug 634526 Opened 14 years ago Updated 5 years ago

shutdown hang with imap threads blocked. timing out a blocking read?

Categories

(MailNews Core :: Backend, defect)

x86
Windows Vista
defect
Not set
critical

Tracking

(Not tracked)

People

(Reporter: wsmwk, Unassigned)

References

(Regression)

Details

(Keywords: hang, regression, Whiteboard: [has stacktrace])

Attachments

(3 files)

Attached file stacktrace of shutdown hang (deleted) —
shutdown hang with imap threads blocked
(hopefully this is a reproduction of a stack I got earlier with 2011-02-08 build)

using 2011-02-15 tryserver build protz created using TM branch.

sending imap log via email
The first stack trace doesn't have useful symbols, so I can't tell if it's the same kind of hang.
(In reply to comment #2)
> The first stack trace doesn't have useful symbols, so I can't tell if it's the
> same kind of hang.

yeah. that's the tracemonkey build. can you use the other trace?
I'll revert to using comm-central trunk.

I'm going to label this a regression, but I don't have proof yet. perhaps from bug 613184?
Keywords: regression
(In reply to comment #3)

> yeah. that's the tracemonkey build. can you use the other trace?
> I'll revert to using comm-central trunk.

I don't know that the hangs are the same, so I'd just be guessing. Also, it's possible that I need a more complete log, i.e., not just the last part of the log, if the problem has to do with stalled connections that stalled long before shutdown.
> 
> I'm going to label this a regression, but I don't have proof yet. perhaps from
> bug 613184?

Possibly.

I'll try to think if there's a way to instrument this better.
I suspect this is an issue with the server sending an unsolicited response to the DONE command:

2011-02-17 16:00:05.613000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:SendData: DONE

2011-02-17 16:00:05.613000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:CreateNewLineFromSocket: * OK Timeout in 30 minutes

2011-02-17 16:00:05.629000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:CreateNewLineFromSocket: 60 OK IDLE completed

Generally, your server gives that response to the IDLE command, not the DONE command (DONE gets us out of IDLE). Looking at our handling of parsing the DONE response, I don't think we deal with unsolicited responses, and that may be confusing our state. I'll have to hack our test server to be similarly rude.
er, after first trying to make it support IDLE, which might be hard.
I added IDLE support to the imap fake server, and made it generate the same unsolicited response to DONE, but it doesn't seem to cause any problems with the parser. The unit test I wrote to simulate the behavior I saw in Wayne's log had no problems. So I'm at a bit of a loss. I'll attach the changes I made to this bug, but I'm at a bit of a dead end.

What I can tell from Wayne's log is that for a couple minutes, we think the moz-lxmac/moz-FF folder is busy, and status urls fail to run. Then, after a couple minutes, we think the folder isn't busy, but we use a different connection to run the url. Wayne's connection timeout looks like it's 29 minutes, so we're not timing out the connection. Maybe we think the connection has gone bad, but the thread is still around...I'll have to think more about this.
Assignee: nobody → bienvenu
We might be timing out a blocking read, which has a roughly 100 second timeout. I'm not sure why the read timeout would leave the imap thread blocked the way it was, or mark the folder as not busy.

Looking at the timestamps on the log, we're reading the * OK timeout response with the exact same timestamp as sending the DONE command, which makes me think it was in the input stream from previous IDLE command the whole time. But I still don't see why that would cause a problem.

2011-02-17 16:00:05.613000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:SendData: DONE

2011-02-17 16:00:05.613000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:CreateNewLineFromSocket: * OK Timeout in 30 minutes

2011-02-17 16:00:05.629000 UTC - 3084[4de17c0]: a81b800:mail.lehigh.edu:S-/home/vseerror/mail/moz-lxmac/moz-FF:CreateNewLineFromSocket: 60 OK IDLE completed
Summary: shutdown hang with imap threads blocked → shutdown hang with imap threads blocked. timing out a blocking read?
Whiteboard: [has stacktrace]
Blocks: 702152
Assignee: mozilla → nobody
Blocks: 1149287

Gene, do we have similar bug reports?

Flags: needinfo?(gds)
Regressed by: 613184

Doesn't ring any bells...

Flags: needinfo?(gds)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: