Closed Bug 428614 Opened 17 years ago Closed 16 years ago

Crash every time I try to read news [@SearchTable]

Categories

(MailNews Core :: Database, defect)

defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9

People

(Reporter: nelson, Assigned: jcranmer)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files, 1 obsolete file)

SM trunk 20080410 nightly build
Start browser, ctrl-2 to bring up mail/news window.
Click on news group in folder pane.
Boom.  
Every time. 
Should be lots of those talkback-like things, 'cuz I report them all.
Was OK in 20080403 (nightly from one week ago)
Flags: blocking-seamonkey2.0a1?
SM trunk nightly build Gecko/2008040702 doesn't crash when I try to read news.
adding a crash id from about:crashes would be nice
Having about:crashes be discoverable would be nice, too.

9bab4814-082a-11dd-94bf-001cc45a2ce4	2008-04-11	17:49
fab1db7a-0828-11dd-a7f1-001cc45a2ce4	2008-04-11	17:37
28e450b5-0828-11dd-9fb4-001cc45a2ce4	2008-04-11	17:32
Correction:  
SM Gecko/2008040702 DOES crash if I read news after visiting a web page.   

d3bf59fa-0839-11dd-995c-001cc4e2bf68	2008-04-11	19:38
All the stacks show the same stack overflow:

0  	xpcom_core.dll  	SearchTable  	 pldhash.c:402
1 	xpcom_core.dll 	PL_DHashTableOperate 	pldhash.c:598
2 	mail.dll 	nsMsgDatabase::GetHdrFromUseCache 	mozilla/mailnews/db/msgdb/src/nsMsgDatabase.cpp:421
3 	mail.dll 	nsMsgDatabase::GetMsgHdrForKey 	mozilla/mailnews/db/msgdb/src/nsMsgDatabase.cpp:1717
4 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:873
5 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876
6 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876
7 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876
8 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876
9 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876
10 	mail.dll 	nsMsgHdr::GetIsKilled 	mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp:876

I had no crash reading news today with : Gecko/2008041102 SeaMonkey/2.0a1pre
Assignee: mail → bienvenu
Component: MailNews: Main Mail Window → MailNews: Database
Flags: blocking-seamonkey2.0a1?
Product: Mozilla Application Suite → Core
QA Contact: database
Taking a look at the crash information, it appears that this would be triggered by exceptionally recursive threads. Nelson, could you tell me the newsgroup you are using that has such a thread?

In the meantime, viewing ignored threads should allow you to read news w/o crashing.
Joshua, How does one view ignored threads?

What do you consider "exceptionally recursive threads"?
Threads with depths of 50 or more are not uncommon on usenet.
Does the implementation of bug 11054 have some thread depth limit 
beyond which it crashes or doesn't work? 
Nelson, which news servers / group(s) are you trying to read?

Joshua, is there a possibility with

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp&rev=1.123&mark=872-876#858

that we are getting "parentHdr == this" ?
Flags: blocking-thunderbird3.0a1?
Flags: blocking-thunderbird3.0a1? → blocking-thunderbird3.0a1+
I have 4 different news server accounts.  I think I experienced the crash 
with more than one of them.  

If you want a group with deep threads, try sci.crypt on any usenet server 
with a relatively complete feed.  I read that group on a giganews server.

I have the impression that maybe no-one in the mailnews groups uses giganews.  
Since they're one of the top 5 news server companies, I'd think it would make 
sense for MailCo (or whatever the new name is) to spring a few bucks to get 
you test accounts there.  Seems like it might be money well spent.

Another group with occasional deep threads is grc.techtalk on news.grc.com
(In reply to comment #9)
> Joshua, is there a possibility with
> 
> http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp&rev=1.123&mark=872-876#858
> 
> that we are getting "parentHdr == this" ?

I also experience this crash when I try to read a newsgroup on news.mozilla.org. In the debugger, "this" is not parentHdr, but the "isKilled" pointer points to the same address as parentHdr.
Hit this 3 times in rapid succession tonight.  
Looked at stack in MSVC8 debugger.
Debugger will only trace the stack 1000 levels.  
The top of the stack looks like this:
 	xpcom_core.dll!60ed1380() 	
 	xpcom_core.dll!60ed1773() 	
 	mail.dll!60d76d99() 	
 	mail.dll!60d775c7() 	
 	mail.dll!60d82726() 	
 	mail.dll!60d82734() 	
 	mail.dll!60d82734() 	
There are 993 more lines just like the last one.
Summary: Crash every time I try to read news → Crash every time I try to read news [@SearchTable]
That stack was over 42000 levels deep.  :) gotta love infinite recursion.
Attached patch Patch (deleted) — Splinter Review
Judging from above comments and others on IRC, it looks like the crash is happening when m_messageKey == parentKey; this patch will cut off this infinite recursion and should fix the problem.
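
To illustrate the guard described here, a minimal sketch under stated assumptions follows. It is not the attached patch and not the real nsMsgHdr code: `Hdr`, `HdrTable`, `kNoParent`, `IsKilled`, and `CheckSelfParent` are hypothetical names for a simplified model in which a message counts as killed if it, or any ancestor, is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified stand-in for a message header (not the real nsMsgHdr).
struct Hdr {
    uint32_t key;
    uint32_t parentKey;  // kNoParent when the message starts a thread
    bool ignored;
};

// Assumed sentinel meaning "this message has no parent".
constexpr uint32_t kNoParent = 0xffffffffu;

using HdrTable = std::unordered_map<uint32_t, Hdr>;

// A message is treated as killed if it, or any ancestor, is ignored.
bool IsKilled(const HdrTable& db, const Hdr& hdr) {
    if (hdr.ignored) return true;
    if (hdr.parentKey == kNoParent) return false;
    // Guard: a header that names itself as its parent would recurse forever.
    if (hdr.parentKey == hdr.key) return false;
    auto it = db.find(hdr.parentKey);
    if (it == db.end()) return false;  // parent not in the table: stop here
    return IsKilled(db, it->second);
}

// Usage check: a self-parented header terminates instead of overflowing the stack.
void CheckSelfParent() {
    HdrTable db;
    db[7] = Hdr{7, 7, false};
    assert(!IsKilled(db, db.at(7)));
}
```

Note that this guard only catches a header that is its own parent. A loop that runs through two or more headers would still recurse without end in this model.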
Attachment #316750 - Flags: superreview?(bienvenu)
Attachment #316750 - Flags: review?(bienvenu)
Status: NEW → ASSIGNED
Comment on attachment 316750 [details] [diff] [review]
Patch

ok, thx, let's give it a try.
Attachment #316750 - Flags: superreview?(bienvenu)
Attachment #316750 - Flags: superreview+
Attachment #316750 - Flags: review?(bienvenu)
Attachment #316750 - Flags: review+
Keywords: checkin-needed
mailnews/db/msgdb/src/nsMsgHdr.cpp 1.124
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
crash stats shows no crashes with 2008042100 build
Yeah, the good news is that the crashes have stopped.  
The bad news is that, when I started up with today's nightly, the message 
list pane for the newsgroup was really screwed up.  Threads with a single
message that was read continued to appear in the message list pane, even 
after repeated selecting to show only threads with unread messages.
So I rebuilt the MSF files, but now the newsgroup appears to be all 
read.  So I lost the partially read threads.  Only time will tell 
if partially-read threads have a new problem.  
Although the patch appeared to have cleared up some problems, there have been (so far) 3 occurrences in Shredder Alpha 1 topcrasher's list (there are also 8 occurrences on the 3.0a1pre top crasher's list).

Therefore reopening, as the original fix obviously hasn't quite got all cases.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Moving alpha 1 nomination to alpha 2.
Flags: blocking-thunderbird3.0a1+ → blocking-thunderbird3.0a2?
This has also been seen on an imap account: http://crash-stats.mozilla.com/report/index/0a6b8e07-3c7b-11dd-9b9e-001cc45a2c28 (MarcoZ).

This prevented access to the inbox of the relevant account; manually removing the msf file and redownloading all the headers worked.
Thanks to a copy of a crashing msf file, I've found what is probably the source of the crasher.

One message key (the one that's recursing indefinitely) has its threadParent set to the suspicious value 0xffffffec, which happens to exist, with its threadParent set to the first one. I'm leery of just checking the parent for loops, since another case might crop up with larger loops.

As far as I can tell, there are two identifying characteristics that we could use to distinguish these cases:
1. "negative" values (0xffff*), so the test would be some form of
|if (!(threadParent & 0xff000000))|
2. A case where the threadParent isn't in the thread. Though probably more correct, this is probably slower.

The question still remains what is causing these problematic message keys to crop up. It looks like it's some form of deletion, but something ends up holding on that shouldn't. 
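
The two candidate checks above can be sketched as follows. This is an illustration under stated assumptions, not the code that was checked in: `PlausibleParentKey`, `ParentInThread`, `CheckCandidates`, and the thread keys 100 and 101 are hypothetical, and thread membership is modelled as a plain set of keys.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Option 1: accept a threadParent only when its top byte is zero,
// mirroring the |if (!(threadParent & 0xff000000))| test.
bool PlausibleParentKey(uint32_t threadParent) {
    return !(threadParent & 0xff000000u);
}

// Option 2: accept a threadParent only when it is a member of the
// thread the message belongs to (modelled here as a set of keys).
bool ParentInThread(const std::unordered_set<uint32_t>& threadKeys,
                    uint32_t threadParent) {
    return threadKeys.count(threadParent) != 0;
}

// Usage check: both options reject the suspicious value from the msf file.
void CheckCandidates() {
    const uint32_t badParent = 0xffffffecu;  // value reported above
    const std::unordered_set<uint32_t> threadKeys{100u, 101u};  // hypothetical thread
    assert(!PlausibleParentKey(badParent));
    assert(!ParentInThread(threadKeys, badParent));
    assert(PlausibleParentKey(100u) && ParentInThread(threadKeys, 100u));
}
```

In this model, option 1 is a single mask operation but would also reject any legitimate key of 0x01000000 or above, while option 2 needs a membership lookup at each step.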
Thought I was already assigned...
Status: REOPENED → ASSIGNED
given the crashingness, and the imap crash report, this looks to me like something that could affect a lot of users.  approving blocking-tb3a2
Flags: blocking-thunderbird3.0a2? → blocking-thunderbird3.0a2+
Attached patch Fix for the newer problem (obsolete) (deleted) — Splinter Review
I chose option #2. Seems uglier, but less hackish...
Attachment #325862 - Flags: superreview?(bienvenu)
Attachment #325862 - Flags: review?(bienvenu)
Comment on attachment 325862 [details] [diff] [review]
Fix for the newer problem

should we handle the case where thread is null, i.e., parentHdr doesn't belong to a thread? Or does that not happen?
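
One way to cover the null-thread case raised here is to treat a missing thread the same as "parent not in thread" and stop walking upward. The sketch below only illustrates that idea; `Thread`, `Contains`, `ShouldFollowParent`, and `CheckNullThread` are hypothetical names, and this is not taken from the reviewed patch.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-in for a thread object: just the set of member keys.
struct Thread {
    std::unordered_set<uint32_t> memberKeys;
    bool Contains(uint32_t key) const { return memberKeys.count(key) != 0; }
};

// Returns true only when it is safe to follow threadParent upward.
// A null thread is handled like "parent not in thread", so the walk stops.
bool ShouldFollowParent(const Thread* thread, uint32_t threadParent) {
    if (thread == nullptr) return false;
    return thread->Contains(threadParent);
}

// Usage check: a null thread never allows the walk to continue.
void CheckNullThread() {
    assert(!ShouldFollowParent(nullptr, 1u));
}
```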
Attachment #325862 - Flags: superreview?(bienvenu)
Attachment #325862 - Flags: superreview+
Attachment #325862 - Flags: review?(bienvenu)
Attachment #325862 - Flags: review+
/cvsroot/mozilla/mailnews/db/msgdb/src/nsMsgHdr.cpp,v  <--  nsMsgHdr.cpp
new revision: 1.126; previous revision: 1.125
done
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9
Attached patch Patch [checked-in] (deleted) — Splinter Review
This is the slight modification of the last patch that was actually checked in.
Attachment #325862 - Attachment is obsolete: true
Attachment #325907 - Flags: superreview+
Attachment #325907 - Flags: review+
Product: Core → MailNews Core
Crash Signature: [@SearchTable]