Closed Bug 53704 Opened 24 years ago Closed 24 years ago

Crash in nsWindow::UpdateIdle() if still-rendering window closed

Categories

(Core :: XUL, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 80345
Future

People

(Reporter: matt, Assigned: danm.moz)

Details

(Keywords: crash, Whiteboard: [rtm-])

Linux build 20000921, Linux 2.2.14 i686, RedHat 6.1

To reproduce:
1) Get a Bugzilla bug list page.
2) Middle click on a bug to get a new window.
3) Make some change/comment to the bug.
4) Commit the change.
5) Click on the window-manager "close" button of the window after the commit has gone through, but while the response to the commit is still loading/rendering.

This causes Mozilla to crash. I was unable to reproduce it in my debug build. However, I did manage to get these assertions while trying to reproduce:

###!!! ASSERTION: Channel list is not empty.: 'count == 0', file nsLoadGroup.cpp, line 260
###!!! Break: at file nsLoadGroup.cpp, line 260
###!!! ASSERTION: Foreground URLs are active.: 'mForegroundCount == 0', file nsLoadGroup.cpp, line 261
###!!! Break: at file nsLoadGroup.cpp, line 26
guessing at networking...
Assignee: asa → gagan
Component: Browser-General → Networking
QA Contact: doronr → tever
doron: pls. avoid guessing. It just adds to the delay. Leave it as it is if you are not sure. Someone else would find more details. matt: could you attach the stack trace so that we'd know where this is crashing? thx. assigning to nobody to get reassessed for the right owner/component.
Assignee: gagan → nobody
Sorry, I repeatedly tried to duplicate this with a debug build, but couldn't, so no stack trace; I was hoping that someone else could get the stack trace :-(
OK, here's the stack trace from the official 20000922 build; have still been unable to reproduce with a debug build. :-(

#0 0x91 in ?? ()
#1 0x2b34ef49 in g_idle_dispatch () from /usr/lib/libglib-1.2.so.0
#2 0x2b34df96 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#3 0x2b34e561 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#4 0x2b34e701 in g_main_run () from /usr/lib/libglib-1.2.so.0
#5 0x2b26ffdc in gtk_main () from /usr/lib/libgtk-1.2.so.0
#6 0x2b1a870c in NSGetModule () from /usr/home/matt/graze/browsers/mz/components/libwidget_gtk.so
#7 0x2af128da in inflate_mask () from /usr/home/matt/graze/browsers/mz/components/libnsappshell.so
#8 0x804e265 in JS_PushArguments ()
#9 0x804e686 in JS_PushArguments ()
#10 0x2ad16fb3 in __libc_start_main (main=0x804e57c <JS_PushArguments+11960>, argc=1, argv=0x7ffff404, init=0x804b234 <_init>, fini=0x8054a6c <_fini>, rtld_fini=0x2aab55d0 <_dl_fini>, stack_end=0x7ffff3fc) at ../sysdeps/generic/libc-start.c:78
OK, I got the stack trace. I'll have to retract my previous statement, this bug can be really hard to reproduce. I had to do it over and over to get this stack trace:

Program received signal SIGSEGV, Segmentation fault.
nsPrintOptions::nsPrintOptions (this=0x0) at ../../../dist/include/nsIPageSequenceFrame.h:108
108 nsPrintOptions() {
(gdb) print this
$1 = (nsPrintOptions *) 0x8bc6b88
(gdb) where
#0 nsPrintOptions::nsPrintOptions (this=0x0) at ../../../dist/include/nsIPageSequenceFrame.h:108
#1 0x2b7b10a2 in nsWindow::UpdateIdle (data=0x0) at nsWindow.cpp:613
#2 0x2b0ebf49 in g_idle_dispatch () at ../../../dist/include/nsIPageSequenceFrame.h:112
#3 0x2b0eaf96 in g_main_dispatch () at ../../../dist/include/nsIPageSequenceFrame.h:112
#4 0x2b0eb561 in g_main_iterate () at ../../../dist/include/nsIPageSequenceFrame.h:112
#5 0x2b0eb701 in g_main_run () at ../../../dist/include/nsIPageSequenceFrame.h:112
#6 0x2b00cfdc in gtk_main () at ../../../dist/include/nsIPageSequenceFrame.h:112
#7 0x2b79882a in nsAppShell::Run (this=0x815dc58) at nsAppShell.cpp:335
#8 0x2c3549e2 in nsAppShellService::Run (this=0x818f070) at nsAppShellService.cpp:407
#9 0x80537de in main1 (argc=1, argv=0x7ffff424, nativeApp=0x0) at nsAppRunner.cpp:1004
#10 0x8053e85 in main (argc=1, argv=0x7ffff424) at nsAppRunner.cpp:1185
(gdb) up
#1 0x2b7b10a2 in nsWindow::UpdateIdle (data=0x0) at nsWindow.cpp:613
613 window->Update();
(gdb) print data
$2 = 0x0
(gdb) print window
$3 = (nsWindow *) 0x8bc6b88

Man, that looks totally and completely wrong. nsIPageSequenceFrame.h is being referenced for the GTK stuff, and nsWindow::UpdateIdle():613 calls nsWindow::Update(), not nsPrintOptions::nsPrintOptions.

I recompiled Mozilla from scratch, with --disable-debug, and then manually added the -g option to compiling and linking (CFLAGS, CXXFLAGS, and LDFLAGS), since I couldn't seem to reproduce it with the full-fledged debug build. I guess I must have missed something.
I'm changing the component to XP Toolkit/Widgets, since nsWindow and nsIPageSequenceFrame both seem to be a part of the widget subtree. Also adding crash to the keywords, and changing the summary.
Component: Networking → XP Toolkit/Widgets
Keywords: crash
Summary: Crash on window close if window still loading → Crash in nsPrintOptions()/GDK if still-rendering window closed
Bug 44222, bug 52694, and bug 53455 look similar to this bug.
Assignee: nobody → trudelle
QA Contact: tever → jrgm
Summary: Crash in nsPrintOptions()/GDK if still-rendering window closed → Crash in nsWindow::UpdateIdle() if still-rendering window closed
OK, I've narrowed this down some, using printf(). Despite what gdb thinks, nsPrintOptions() is never called. The crash happens in nsWindow::UpdateIdle().

gboolean
nsWindow::UpdateIdle (gpointer data)
{
  GSList *old_queue = update_queue;
  GSList *tmp_list = old_queue;

  update_idle = 0;
  update_queue = nsnull;

  while (tmp_list)
  {
    nsWindow *window = (nsWindow *)tmp_list->data;

    window->mIsUpdating = PR_FALSE;
    window->Update(); // <---- ERROR HAPPENS HERE
    tmp_list = tmp_list->next;
  }

The crash happens after window->mIsUpdating is set, but before nsWindow::Update() gets called. Also, in one of the crashes, I got this message:

pure virtual method called

I've found a method to reproduce that seems a lot more consistent:

1) Go to this URL: http://bugzilla.mozilla.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=2&votes=&chfield=%5BBug+creation%5D&chfieldfrom=09%2F22%2F2000&chfieldto=09%2F23%2F2000&chfieldvalue=&product=Browser&product=Browser+Localizations&product=MailNews&short_desc=&short_desc_type=substring&long_desc=&long_desc_type=substring&bug_file_loc=&bug_file_loc_type=substring&status_whiteboard=&status_whiteboard_type=substring&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&newqueryname=&order=Reuse+same+sort+as+last+time
2) Middle click on 53880 to pop up a new window.
3) Watch as the blank window begins to be populated with gray rectangles which are the form components being rendered. First there'll be just one set of them in the upper-lefthand corner, and then there'll be a few more scattered around the window; at that moment, close the window.

Since this seems to be happening in mozilla/widget/src/gtk/...., I'm reassigning this to the XP Toolkit/Widgets owner.
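For readers unfamiliar with the idle-drain pattern quoted above, here is a rough standalone sketch of it in plain C++. It is not the Mozilla code: Window, gUpdateQueue, QueueDraw and DrainUpdateQueue are hypothetical stand-ins, and the duplicate guard in QueueDraw is an assumption about how such a queue would normally be protected, not something this report confirms.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for nsWindow; not the real class.
struct Window {
    bool isUpdating = false;  // plays the role of mIsUpdating
    int updates = 0;
    void Update() { ++updates; }
};

// Hypothetical stand-in for the static update_queue.
std::vector<Window*> gUpdateQueue;

// Queue a window for a deferred update. The flag check is an assumed
// guard that keeps a window from being queued twice.
void QueueDraw(Window& w) {
    if (w.isUpdating) return;
    w.isUpdating = true;
    gUpdateQueue.push_back(&w);
}

// Detach the pending list first, then update each window, mirroring
// the shape of the UpdateIdle() loop quoted above.
void DrainUpdateQueue() {
    std::vector<Window*> pending = std::exchange(gUpdateQueue, {});
    for (Window* w : pending) {
        assert(w != nullptr);
        w->isUpdating = false;
        w->Update();  // would crash here if w had already been deleted
    }
}
```

The sketch only works if every pointer in the detached list is still alive when the drain runs, which is exactly the condition that seems to be violated in this bug.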
->danm, cc pavlov, nominate for rtm
Assignee: trudelle → danm
Keywords: rtm
Target Milestone: --- → M19
I've narrowed down the problem some more. The "pure virtual method called" was happening because window->Update() was being invoked on an instance which had been deleted (this is the usual cause of this error for gcc/g++ generated code). The reason this is happening is because an nsWindow is being added twice to the update_queue list, but only being deleted once by nsWindow::~nsWindow(); if I force the destructor to try to delete each instance from the list twice, then the problem goes away. This double-add problem never seems to happen under ordinary circumstances, and for some reason it only seems to happen to nsWindow instances on which the Destroy() method gets called 3 times (although multiple calls to Destroy() for a single instance are normal and harmless). Following is the stack trace from when a window is being added for a second time to update_queue:

#0 nsWindow::QueueDraw (this=0x88a5ee8) at nsWindow.cpp:697
#1 0x2b413e79 in nsWindow::Invalidate (this=0x88a5ee8, aRect=@0x7fffed8c, aIsSynchronous=0) at nsWindow.cpp:978
#2 0x2b405904 in handle_superwin_paint (aX=2, aY=2, aWidth=166, aHeight=85, aData=0x88a5ee8) at nsGtkEventHandler.cpp:1057
#3 0x2aced72e in gdk_superwin_handle_expose (superwin=0x89c9720, xevent=0x7fffef80, region=0x7fffeea0, dont_recurse=0) at gdksuperwin.c:683
#4 0x2aced204 in gdk_superwin_bin_filter (gdk_xevent=0x7fffef80, event=0x813fdb8, data=0x89c9720) at gdksuperwin.c:497
#5 0x2b589335 in gdk_event_apply_filters () at ../../../dist/include/nsIPageSequenceFrame.h:113
#6 0x2b589467 in gdk_event_translate () at ../../../dist/include/nsIPageSequenceFrame.h:113
#7 0x2b58a33d in gdk_events_queue () at ../../../dist/include/nsIPageSequenceFrame.h:113
#8 0x2b58a582 in gdk_event_dispatch () at ../../../dist/include/nsIPageSequenceFrame.h:113
#9 0x2b5b8f96 in g_main_dispatch () at ../../../dist/include/nsIPageSequenceFrame.h:113
#10 0x2b5b9561 in g_main_iterate () at ../../../dist/include/nsIPageSequenceFrame.h:113
#11 0x2b5b9701 in g_main_run () at ../../../dist/include/nsIPageSequenceFrame.h:113
#12 0x2b4d7fdc in gtk_main () at ../../../dist/include/nsIPageSequenceFrame.h:113
#13 0x2b3fa905 in nsAppShell::Run (this=0x80f96a8) at nsAppShell.cpp:339
#14 0x2b0339e2 in nsAppShellService::Run (this=0x8101300) at nsAppShellService.cpp:407
#15 0x80537de in main1 (argc=1, argv=0x7ffff374, nativeApp=0x0) at nsAppRunner.cpp:1004
#16 0x8053e85 in main (argc=1, argv=0x7ffff374) at nsAppRunner.cpp:1185
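To make the double-add failure mode described above concrete, here is a minimal standalone sketch. It uses a std::vector in place of the GSList, and every name in it (Window, UpdateQueue, RemoveOnce, RemoveAll, CountEntries, StaleEntriesAfterRemoveOnce) is hypothetical rather than taken from nsWindow.cpp. It shows only that removing a single entry leaves a stale pointer behind when the same window was queued twice; it is not the fix that eventually landed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for nsWindow; not the real class.
struct Window { int id; };

// Hypothetical stand-in for the update_queue list.
using UpdateQueue = std::vector<Window*>;

// Removes only the first matching entry, as a destructor would if it
// assumed each window is queued at most once.
void RemoveOnce(UpdateQueue& queue, Window* w) {
    auto it = std::find(queue.begin(), queue.end(), w);
    if (it != queue.end()) queue.erase(it);
}

// Removes every matching entry, so no stale pointer can survive.
void RemoveAll(UpdateQueue& queue, Window* w) {
    queue.erase(std::remove(queue.begin(), queue.end(), w), queue.end());
    assert(std::find(queue.begin(), queue.end(), w) == queue.end());
}

// Counts how many queue entries still point at w.
std::size_t CountEntries(const UpdateQueue& queue, const Window* w) {
    return static_cast<std::size_t>(std::count(queue.begin(), queue.end(), w));
}

// Queues one window twice, removes it once, and reports how many
// entries are left pointing at it.
std::size_t StaleEntriesAfterRemoveOnce() {
    Window w{1};
    UpdateQueue queue{&w, &w};  // queued twice, as in the report
    RemoveOnce(queue, &w);
    return CountEntries(queue, &w);
}
```

In the real crash, the leftover entry would point at a deleted window, and the next idle pass would call Update() through it. Removing all entries in the destructor, or preventing the second add in the first place, both avoid that.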
rtm-/future, obscure, timing related, not a topcrash.
Whiteboard: [rtm-]
Target Milestone: M19 → Future
this appears to work okay on my debug branch build from today.
FWIW, I see Segmentation Faults on Linux (2.2.17-mdk) often when I close a Moz window while it or another window is still rendering! This is with build ID 2001010211.
I see the random crashes when closing windows while it or another window is still rendering too. I'm now using Linux nightly 2001011621, but I've been seeing this for a long time. I think the priority on this one should be upped because I see it really often.
Is this related or a dup of bug 62643?
This _needs_ to be fixed. I'm getting segfault crashes every 5 minutes when searching for obscure information because I'm always closing windows while they are still loading/rendering. Mr Cline has done some good work tracking this down and for good reason.
With the 20010126 Linux build, this problem seems to have become much less severe. For a while Mozilla was crashing any time that you closed a still rendering/loading window (as noted by Tom Mraz and Stuart Robinson), but this is no longer the case. When I first found this bug, it was hard to reproduce and only happened under specific circumstances; then it began to happen in ordinary, frequently occurring circumstances; and now it's gone again. Might we be dealing with multiple bugs? Maybe both of them have been fixed.
i think this is a dupe of bug 80345 which contains a simple fix.

*** This bug has been marked as a duplicate of 80345 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE