Closed Bug 391259 Opened 17 years ago Closed 17 years ago

crash [@ nsProxyReleaseEvent::Run()]

Categories

(Thunderbird :: General, defect)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wsmwk, Assigned: Bienvenu)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 2 obsolete files)

top crasher on trunk going back to 2007-07-26 and as recent as yesterday's build http://crash-stats.mozilla.com/report/index/6e3e0c47-4466-11dc-97ff-001a4bd43ef6 UUID 6e3e0c47-4466-11dc-97ff-001a4bd43ef6 Time 2007-08-06 14:46:04.130000-07:00 Build ID 2007080605 OS Windows NT OS Version 5.1.2600 Service Pack 2 CPU x86 CPU Info GenuineIntel family 15 model 2 stepping 9 Crash Reason EXCEPTION_ACCESS_VIOLATION Crash Address 0x0 Stack Traces Stack of Crashing Thread frame signature source 0 nsProxyReleaseEvent::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\obj-tb-trunk\xpcom\build\nsproxyrelease.cpp:52 1 nsThread::ProcessNextEvent(int, int*) d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\xpcom\threads\nsthread.cpp:490 2 NS_ProcessNextEvent_P(nsIThread*, int) d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\obj-tb-trunk\xpcom\build\nsthreadutils.cpp:227 3 nsBaseAppShell::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\widget\src\xpwidgets\nsbaseappshell.cpp:154 4 nsAppStartup::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\toolkit\components\startup\src\nsappstartup.cpp:170 5 XRE_main d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\toolkit\xre\nsapprunner.cpp:3057 6 main d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\mail\app\nsmailapp.cpp:87 7 WinMain d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\mail\app\nsmailapp.cpp:98 8 __tmainCRTStartup f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c:578 9 BaseProcessStart
crashed for me, slightly different stack. in process of deleting imap message. likely to have been downloading new messages at same time http://crash-stats.mozilla.com/report/index/49a76b9e-47f8-11dc-aaeb-001a4bd46e84 UUID 49a76b9e-47f8-11dc-aaeb-001a4bd46e84 Time 2007-08-11 03:47:42.750000-07:00 Build ID 2007081004 OS Windows NT OS Version 5.1.2600 Service Pack 2 CPU x86 CPU Info GenuineIntel family 15 model 2 stepping 9 Crash Reason EXCEPTION_ACCESS_VIOLATION Crash Address 0x0 frame signature source 0 nsProxyReleaseEvent::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\obj-tb-trunk\xpcom\build\nsproxyrelease.cpp:52 1 nsThread::ProcessNextEvent(int, int*) d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\xpcom\threads\nsthread.cpp:490 2 NS_ProcessNextEvent_P(nsIThread*, int) d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\obj-tb-trunk\xpcom\build\nsthreadutils.cpp:227 3 nsBaseAppShell::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\widget\src\xpwidgets\nsbaseappshell.cpp:154 4 nsAppStartup::Run() d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\toolkit\components\startup\src\nsappstartup.cpp:170 5 XRE_main d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\toolkit\xre\nsapprunner.cpp:3059 6 main d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\mail\app\nsmailapp.cpp:87 7 WinMain d:\builds\tinderbox\tb-trunk\winnt_5.2_depend\mozilla\mail\app\nsmailapp.cpp:98 8 __tmainCRTStartup f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c:578 9 BaseProcessStart
#2 crasher now that "nsFrame::BoxReflow" the #1 crasher seems to be gone today, sending mail, same crash bp-d73b00c5-5fae-11dc-b57b-001a4bd43e5c with build 2007090904. Plain text message. After clicking send I was prompted for password, entered wrong pwd, prompted again, entered good pwd, click OK, crash also, a (rare) linux example bp-e2ee3957-5690-11dc-b7fb-001a4bd46e84
I've been hitting this a good bit on trunk, I think. My best guess is that the problem is a race between nsImapProtocol::ReleaseUrlState (which is NOT called on the UI thread) and nsImapProtocol::CloseStreams (which is). The latter protects all its member access by a monitor, while the former does not. So it's possible for m_mockChannel and m_channelListener to become null between the null-check and the corresponding proxy release in ReleaseUrlState(), if CloseStreams() happens to run in there. Then we try to proxy-release null, which can give us the crash with exactly this stack. I would guess that revision 1.663 of nsImapProtocol.cpp is responsible, but that diff has nothing to do with the patch in the bug it references, so I'm not sure what bug this bug should be blocking.
Flags: blocking-thunderbird3?
Even though jag left you a note in bug 381992 comment 28?
Blocks: 388353
Attached patch Possible fix (obsolete) (deleted) — Splinter Review
This should fix the obvious races in ReleaseUrlState(), but there's still unhappiness in CloseStreams. See the XXX comments I added.
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Attachment #281144 - Flags: superreview?
Attachment #281144 - Flags: review?
> Even though jag left you a note in bug 381992 comment 28? I didn't see it in the other 29 comments.
Attachment #281144 - Flags: superreview?(bienvenu)
Attachment #281144 - Flags: superreview?
Attachment #281144 - Flags: review?(bienvenu)
Attachment #281144 - Flags: review?
Keywords: regression
I will try running with this. One worry is that it will introduce hangs because of contention over the protocol object monitor, since we've increased the scope of the time we hold onto that monitor. In fact, I had a hang right after running with this patch, but it may have been unrelated, and I haven't reproduced it since. >+ // XXXbz in other words we do the proxy release thing, no? Why the >+ // complications here? This code was written long before proxy release existed, afaik. So we could get rid of this "texas two step", I believe. >+ // XXXbz why exactly do we have this CloseMockChannel thing if all >+ // it does is call Close() on the channel? > if (m_imapMailFolderSink) > m_imapMailFolderSink->CloseMockChannel(m_mockChannel); I believe the point is to close the channel on the UI thread, not the IMAP thread.
I've had a couple more hangs running with this patch.
> I believe the point is to close the channel on the UI thread So is m_imapMailFolderSink a proxy? I assumed that anything CloseStreams() locked around we can lock around here. I could be wrong. In that case, we need finer-grained locking, but the old code is certainly wrong no matter what. David, you want to take this, since you have a much better chance of knowing what parts of this code need to happen outside the monitor than I do?
all the sinks are proxies to the UI thread, yes. I'm happy to take this bug/patch, and to make it stop hanging for me, but I'll probably need your help to verify afterwards that the problem you're seeing is still fixed, since I'm not able to reproduce your problem. Unlesss...I was seeing an immediate exit (no crash, just app exit) in some cases, and it's possible that was caused by the same underlying problem.
Assignee: bzbarsky → bienvenu
Status: ASSIGNED → NEW
I can't reproduce my problem either except insofar as my mail client (which is a Mozilla.org nightly) crashes a lot with a stack signature that indicates that some proxy release is being done on either a bogus or null pointer. This is the only place in IMAP code that does proxy release that I can see, and there are definite races here, so I thought I'd try fixing those and see whether it helped.
Comment on attachment 281144 [details] [diff] [review] Possible fix IMAP is unusable on windows with this patch for me - I hang all the time. I'll try to get rid of the deadlock
Attachment #281144 - Flags: superreview?(bienvenu)
Attachment #281144 - Flags: superreview-
Attachment #281144 - Flags: review?(bienvenu)
Attachment #281144 - Flags: review-
Attached patch proposed fix (obsolete) (deleted) — Splinter Review
bz, can you try this? It fixes the deadlock for me. I'll keep running with it. I also removed the PrepareToReleaseObject method, in favor of proxy release...
Attachment #281144 - Attachment is obsolete: true
Attachment #281317 - Flags: superreview?(mscott)
With that patch you still have a race on m_transport: it can go null after check but before monitor acquisition. But other than that, it looks like it'll solve the problem I was trying to solve, yes. I'll try running with it, but the real test is checking it in and seeing whether this stack disappears from breakpad reports.
Attached patch fix race for m_transport (deleted) — Splinter Review
this should fix the m_transport issue, thx.
Attachment #281317 - Attachment is obsolete: true
Attachment #281325 - Flags: superreview?(mscott)
Attachment #281317 - Flags: superreview?(mscott)
Attachment #281325 - Flags: superreview?(mscott) → superreview+
patch checked in - we'll see if this stack disappears from breakpad
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
stack gone from breakpad as of 2007091900 build
Status: RESOLVED → VERIFIED
Flags: blocking-thunderbird3?
Crash Signature: [@ nsProxyReleaseEvent::Run()]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: