Closed
Bug 68365
Opened 24 years ago
Closed 24 years ago
Crash, when trying to delete lots of messages
Categories
(SeaMonkey :: MailNews: Message Display, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: sheelar, Assigned: hyatt)
References
Details
(Keywords: crash)
Attachments
(3 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
buildid:2001020808
os:win98
I was deleting about 3300 messages from my bugzilla folder.
Selected all the messages in the thread pane.
click on the tool bar button to delete
Results: Messages were all selected and the progress bar showed the activity
for about 10 seconds. The message count did not change in the trash folder or
the folder from where I was deleting messages from because I had a lot of them
which were unread.
Tried to stop since it looked like it got stuck. Could not stop.
I clicked on the trash folder and saw the thread pane was just scrolling and
looked like it was getting all the messages
Finally moved all the 3300 messages to the trash and crashed
I have the talk back stack trace which I will attach to this bug.
Reporter | ||
Comment 1•24 years ago
|
||
change qa contact to myself, cc esther
waiting for the stack trace should be available soon.
QA Contact: esther → sheelar
Reporter | ||
Comment 2•24 years ago
|
||
Here is the stack trace:
nsScrollSmoother::Notify[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsXULTreeOuterGroupFrame.cpp,
line 208]
nsTimer::Fire [d:\builds\seamonkey\mozilla\widget\timer\src\windows\nsTimer.cpp,
line 200]
nsTimerManager::FireNextReadyTimer
[d:\builds\seamonkey\mozilla\widget\timer\src\windows\nsTimerManager.cpp, line 117]
nsAppShell::Run [d:\builds\seamonkey\mozilla\widget\src\windows\nsAppShell.cpp,
line 118]
nsAppShellService::Run
[d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsAppShellService.cpp, line 408]
NETSCP6.EXE + 0x1670 (0x00401670)
NETSCP6.EXE + 0x11b8 (0x004011b8)
NETSCP6.EXE + 0x2ae2 (0x00402ae2)
KERNEL32.DLL + 0x1b560 (0xbff8b560)
KERNEL32.DLL + 0x1b412 (0xbff8b412)
KERNEL32.DLL + 0x19dd5 (0xbff89dd5)
Comment 3•24 years ago
|
||
I am investigating this, and I might know the problem. Please don't fix this,
this is a good exercise for me!
Thanks. :)
Comment 4•24 years ago
|
||
sheela, are you able to reproduce this problem?
mOuter is probably null in nsScrollSmoother::Notify()
we could add a null check and assertion to prevent the crash.
Comment 5•24 years ago
|
||
[sigh] Yes it is null, that's what I had figured out.
Seth, can I make the patch for this bug?
Comment 6•24 years ago
|
||
As I read through that file, I see that there are quite a number of places where
we could nullcheck to prevent crashes. Some of that error checking is poorly
written.
Seth, do you think I should make a patch to errorcheck on more places, or could
that somehow slow down the code?
Comment 7•24 years ago
|
||
Håkan: null ptr checks aren't always the way to fix null ptr crashers. If the
pointer shouldn't be null at that point, the real cause (for why it's null)
should be found and fixed.
Comment 8•24 years ago
|
||
A nullcheck can find a null pointer and display a dialog stating the problem
instead of silently crashing.
Comment 9•24 years ago
|
||
blake is right, adding null checks every where is not the solution. (and
brendan will go crazy on you.)
what you should do, is provide a null check & assertion patch for this crash,
get it checked in (crashing is bad).
Then start a new bug on the assertion to figure out why we have null in this
case. At least while you (or the author of the code) figure it out, the user
doesn't crash.
Comment 10•24 years ago
|
||
amen.
Comment 11•24 years ago
|
||
Comment 12•24 years ago
|
||
This patch shouldn't probably get checked in. But it points out where in the
source I think this is happening, and does some assertions.
1. This code can be the crash, because if 'this' is nil, and we first try to
create a new nsScrollSmoother and then do NS_ADDRED() on that null object we crash.
if (!mScrollSmoother) {
mScrollSmoother = new nsScrollSmoother(this);
NS_ADDREF(mScrollSmoother);
2. The constructor, nsScrollSmoother::nsScrollSmoother(nsXULTreeOuterGroupFrame*
aOuter) can also be the problem. I don't know how many times this class is
created, but it might be an out of memory here which makes something go null,
because the reporter tries to move >3000 msgs to the trash.
3. And finally the mOuter *is* null in nsScrollSmoother::Notify, I am quite sure
of this, because those lines appear in the stack trace before we call the timer
stuff. But #1 and #2 is called before this, so that's one of them are the
originator here.
Thanks, and please comment on the patch so I can improve it.
Cc'ing evaughan who checked in this code (according to cvsblame).
Comment 13•24 years ago
|
||
bug 52270 may be related, if you look at what code they both are talking about.
Comment 14•24 years ago
|
||
here comes the patch, based on hwaara's work.
In the patch, I assert (so a coder can work on this if they reproduce it)
and return if mOuter is null. It is also possible that mOuter is corrupt, in
which case this fix will not work.
after we check in, we'll have to see if sheela can still reproduce the problem.
wild, uneducated guess:
the timer fires after mOuter is gone. perhaps the code that removes mOuter
should remove the timer, too.
here comes the patch, for the crasher.
the next step would be to reproduce it in a debug build and figure out what is
going on. I don't have the cycles to do that yet. (we'll might need some help
from the owner of the code, too.)
Comment 15•24 years ago
|
||
Comment 16•24 years ago
|
||
But Seth, what about #1 and #2 in my previous posting?
Updated•24 years ago
|
Assignee: sspitzer → hyatt
Comment 17•24 years ago
|
||
#1 makes sense, we are not checking if the allocation fails.
attached is a revised patch. we will still crash (if the caller of
GetSmoother() doesn't null check the result), but at least we'll assert.
I'm not convinced mOuter was null, you could get this crash if it was garbage.
you need to take this up with hyatt, evaughan and pinkerton. doesn't look like
a mailnews problem.
Comment 18•24 years ago
|
||
Comment 19•24 years ago
|
||
+ if (!mOuter) return;
How about returning something here? We should try to notify the following code
that we return because mOuter is either null or garbage.
Comment 20•24 years ago
|
||
you can't return anything.
the return type is void.
NS_IMETHODIMP_(void) nsScrollSmoother::Notify(nsITimer *timer)
Comment 21•24 years ago
|
||
Oh yes, I forgot that.
r=hwaara on the latest patch.
Comment 22•24 years ago
|
||
BTW, is it possible to make that function non-void to return error codes?
Comment 23•24 years ago
|
||
"If anything is possible, then nothing is interesting." H.G. Wells
just thought I'd make everyone's bugzilla more enjoyable.
no, Notify is part of the nsITimerCallback interface. Unless you want to change
that, which you don't.
Reporter | ||
Comment 24•24 years ago
|
||
About reproducing this bug, I don't have that many messages yet. But as soon as
I accumulate that many messages in the bugzilla folder I will try this again and
see if I crash without the fix.
Comment 25•24 years ago
|
||
a trick to building up a large folder:
do a multiple copy from a newsgroup, like news://news.mozilla.org/netscape.test
or news://news/mcom.test
Reporter | ||
Comment 26•24 years ago
|
||
seth,
thanks for the tip.
I did delete about 2700 messages from a local folder as before. Using today's
build, I did not crash. I saw the same behavior again but it did not crash.
The only difference is that I have today's build(2001021406)instead of
2001020808 build which crashed.
Should I try this on the same build again and see if I crash?
Comment 27•24 years ago
|
||
this bug appears to be timer related, so it is going to be hard to reproduce.
to be honest, if you have other bugs to work on, I'd suggest working on them first.
this will be tricky to reproduce, and probably not worth it right now.
after the fix lands, we'll probably have to wait and see if people still see
this crasher.
talkback would be good for that.
Comment 28•24 years ago
|
||
So was this checked in?
According to LXR, no it wasn't :
http://bonsai.mozilla.org/cvslog.cgi?file=mozilla/layout/xul/base/src/nsXULTreeOuterGroupFrame.cpp
Comment 30•24 years ago
|
||
Ok, the latest patch already has got my r= on it...Seth, please get an sr= this
bug and check it in!
Comment 31•24 years ago
|
||
hwaara, I'm waiting for hyatt to review / approve the patch.
Assignee | ||
Comment 32•24 years ago
|
||
sr=hyatt
Comment 33•24 years ago
|
||
*** Bug 69781 has been marked as a duplicate of this bug. ***
Comment 34•24 years ago
|
||
fix checked in.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 35•24 years ago
|
||
Sheela wrote:
Seth,
I remember trying to recreate this crash and was not able to. This is the
crash bug when trying to delete lots of messages. Can you tell me how I can
verify this fix? You can reply to this email or add as comments in bugzilla.
Reply from Seth,
That is going to be a tough bug to properly verify, since it is timer related.
All I can suggest is looking for crashes in talkback that look like that bug.
I'd mark the bug verified, and then reopen if you see any crashes in talkback
(with the exact same stack trace) from trunk builds after the date I checked in
the fix.
-Seth
From the above comments from Seth marking this bug as verified.
Thanks again seth
Status: RESOLVED → VERIFIED
Comment 36•24 years ago
|
||
Sheela (or any other QA), out of curiousity, have you seen any talkbacks
crashing when deleting messages lately?
Updated•20 years ago
|
Product: Browser → Seamonkey
You need to log in
before you can comment on or make changes to this bug.
Description
•