Closed Bug 468393 Opened 16 years ago Closed 15 years ago

Crash if page puts up an alert while trying to save it [@ nsObjCExceptionLogAbort - nsCocoaUtils::PrepareForNativeAppModalDialog]

Categories

(Core :: Widget: Cocoa, defect, P1)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: bzbarsky, Assigned: smichaud)

References

Details

(Keywords: crash, fixed1.9.1)

Crash Data

Attachments

(6 files, 3 obsolete files)

BUILD: Current trunk build

STEPS TO REPRODUCE:
1)  Load https://bugzilla.mozilla.org/attachment.cgi?id=351645
2)  Hit Command-S before the alert comes up and start typing a filename
3)  When the alert comes up dismiss it
4)  Finish typing the filename
5)  Hit "OK" in the filepicker

ACTUAL RESULTS: Crash with this stack: http://crash-stats.mozilla.com/report/index/ae98fc08-603c-4d92-9357-dd0012081207 (and focus is very screwed up after step 3)

EXPECTED RESULTS: probably prevent the alert from coming up while the filepicker is active, like we do for XUL modal dialogs in general.  Certainly _don't_ crash.
Using a GranParadiso nightly, I seem to hang at step 2. The alert appears, the OK buttons disappears from the Save As dialog and I'm completely locked out of pressing OK in the alert.

From the Console log when running a Shiretoko nightly:

12/7/08 10:03:25 PM firefox-bin[12495] *** Assertion failure in -[NSMenu insertItem:atIndex:], /SourceCache/AppKit/AppKit-949.35/Menus.subproj/NSMenu.m:617 
12/7/08 10:03:25 PM firefox-bin[12495] Mozilla has caught an Obj-C exception [NSInternalInconsistencyException: Item to be inserted into menu already is in another menu] 
12/7/08 10:03:25 PM [0x0-0x3f43f4].org.mozilla.firefox 2008-12-07 22:03:25.490 firefox-bin[12495:10b] *** Assertion failure in -[NSMenu insertItem:atIndex:], /SourceCache/AppKit/AppKit-949.35/Menus.subproj/NSMenu.m:617 
12/7/08 10:03:25 PM [0x0-0x3f43f4].org.mozilla.firefox 2008-12-07 22:03:25.492 firefox-bin[12495:10b] Mozilla has caught an Obj-C exception [NSInternalInconsistencyException: Item to be inserted into menu already is in another menu]
Also, because this happens at nsObjCExceptionLogAbort, it's one of the many, many crashes that make up the topcrash for Firefox on Mac. I think this should block 1.9.1 and that we should probably fix the underlying problem in 1.9.0 as well.
Flags: wanted1.9.0.x?
Flags: blocking1.9.1?
Flags: blocking1.9.0.6?
Severity: normal → critical
Keywords: crash, hang
nsObjCExceptionLogAbort
nsCocoaUtils::PrepareForNativeAppModalDialog
nsFilePicker::PutLocalFile
nsFilePicker::Show

i thought we were supposed to get the objc stack w/ crash reporter these days.
Summary: Crash if page puts up an alert while trying to save it → Crash if page puts up an alert while trying to save it [@ nsObjCExceptionLogAbort - nsCocoaUtils::PrepareForNativeAppModalDialog]
No. We are sending (but not yet collecting) the ObjC Exception info, which doesn't include the stack, but does include the ObjC Exception name and message.

Collecting and displaying this info is bug 444106.
Not blocking, but we'd definitely take a patch for 1.9.0.x after it baked on trunk and 1.9.1.
Flags: wanted1.9.0.x?
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.6?
I don't crash with Boris' STR from comment #0, but I *do* crash soon
afterwards.  I'll work on getting a more precise STR.

Here's a gdb stack trace made with an opt trunk build that has debug
symbols.  I pulled the code this morning.
Assignee: joshmoz → smichaud
Priority: -- → P1
Here's a gdb trace of the exception, then the crash.  (The exception
stack is what fixing bug 444106 *won't* allow Breakpad to see
... though it will hopefully be available to Breakpad at some point in
the future.)
The hang you get in GranParadiso is bug 436473, whose fix hasn't yet been landed on the 1.9.0 branch.
Revised/reduced STR for crash on trunk (and 1.9.1 branch):

1) Load https://bugzilla.mozilla.org/attachment.cgi?id=351645.

2) Hit Command-S before the alert comes up.

   An OS-modal Save As dialog will appear.

3) Wait for the alert to come up, then dismiss it by pressing the ESC
   key.  (Any other way of dismissing it will cause a hang, for which
   see the next comment.)

4) After the alert has been dismissed, dismiss the Save As dialog (by
   clicking its Cancel button, or by clicking on it (to focus it) and
   pressing ESC).

5) After the Save As dialog has been dismissed, press Command-S again.

   Now you'll crash.
It's also possible to get a sort-of hang on the trunk (and the 1.9.1
branch) -- though the main thread isn't blocked (and though you don't
see the beachball), you can't dismiss the Save As dialog, and end up
having to force-quit the browser.

Here's a gdb stack trace of the main thread after this "hang" has
happened.

STR for the "hang":

1) Load https://bugzilla.mozilla.org/attachment.cgi?id=351645.

2) Hit Command-S before the alert comes up.

   An OS-modal Save As dialog will appear.

3) Wait for the alert to come up, then click once on it.

   You'll hear a beep (since the Save As dialog is app-modal), and (as
   best I can tell) no event (OS event or Gecko event) gets sent to
   the alert dialog.  But from this point you're stuck.

   You can click once again on the Save As dialog to give it the
   focus.  But nothing happens when you press the ESC key or click on
   the Cancel or Save buttons (you don't even see the menus flashing).
Aside from the 1.9.0-branch hang (which as I've said is really bug 436473 (and
bug 442442)), we've got two bugs here -- the crash (STR in comment #9) and the
"hang" (STR in comment #10).

Both of these problems arise from things I didn't foresee writing my patch
for bug 436473.  Or to put it another way, my patch for bug 436473 works just
fine with the Print and Page Setup app-modal dialogs (provided by the OS), but
not so well with the Save As and Open File app-modal dialogs (also provided by
the OS).

(The underlying problem for both bugs (this bug and bug 437473) is, of course,
that Cocoa app-modal dialogs and Gecko-modal dialogs are very difficult to
make get along.  This is largely due to the fact that Gecko doesn't know about
app-modal dialogs, and doesn't know how to deal with them.  But that's another
story -- one whose ending probably won't be written before the 2.0 branch
comes out.)

Neither this bug's crash nor its hang occur if you substitute Print and Page
Setup in the STRs from comment #9 and #10.  Both happen with the Save As
dialog (as we've seen) and with the Open File dialog.

The crash is easiest to explain:

Notice that when the alert comes up in comment #9's step 3, the base menu
changes -- the menu that nsCocoaUtils::PrepareForNativeAppModalDialog() put up
just before showing the Save As dialog gets replaced by the menu that Gecko
puts up before displaying any Gecko-modal dialog.  But the OS doesn't expect
this to happen.  So the original base menu (the one put up by
PrepareForNativeAppModalDialog()) doesn't get destroyed immediately.  And the
"standard edit menu" that PrepareForNativeAppModalDialog() added to it doesn't
get released immediately.

nsMenuUtilsX::GetStandardEditMenuItem() (currently) always returns the same
object (so it can be re-used).  But this means that, in step 5,
PrepareForNativeAppModalDialog() tries to add the same object to another menu
before it's been released from the previous menu.  This triggers an
Objective-C exception, which triggers a crash.

The "hang" is a bit more difficult to explain:

My patch for bug 436473 assumed that, when a Gecko-modal dialog pops up above
a Cocoa app-modal dialog, we should make the user deal with the Cocoa
app-modal dialog first.  This works fine (as I've said) with the Print and
Page Setup dialogs, but not with the Save As and Open File dialogs.  My patch
squared the circle, and allowed the Print and Page Setup dialogs to function
properly even when the code that normally processes their modal event loops
isn't running (instead what's running is the code that processes the
Gecko-modal event loop).

But (for some reason) this doesn't work for the Save As and Open File dialogs.
Possibly the code that responds to the various buttons in the app-modal
dialogs is in-loop for the Print and Page Setup dialogs, but out-of-loop for
the Save As and Open File dialogs.  In any case, some of the Save As and Open
File dialogs' buttons won't "work" until their own event-processing loops have
ended.  Since this includes the buttons that normally end the dialog (e.g the
Save As dialog's Save and Cancel buttons), and since you can't (using the
mouse or keyboard) get back to the browser once one of these dialogs has been
made active, you're stuck if you click on it after the Gecko-modal dialog has
appeared.  The Save As and Open File dialogs must be managed by their own
event loops, not the Gecko-modal event loop.  So it's not possible to deal
with either of these dialogs first if it's been interrupted by a Gecko-modal
dialog -- in those cases we need to make the user deal with the Gecko-modal
dialog first.

Because of this, and because the Print and Page Setup dialogs also "work" if
we give the Gecko-modal dialog priority, I need to change my approach from bug
436473.  My new patch (which I'll post in my next comment) makes the user deal
first with a Gecko-modal dialog if it pops up above a Cocoa app-modal dialog.

There are problems with this -- for example, it's not feasible to make a sheet
(which most Gecko-modal dialogs are) appear above a Cocoa app-modal dialog.
And since the Gecko-modal dialog will usually pop "up" under the Cocoa
app-modal dialog, the user will usually need to move the app-modal dialog
aside before dealing with the Gecko-modal dialog.  (These considerations were
what made me originally choose the app-modal-first strategy.)  But there's
simply no way around this.  A better, more comprehensive, more rational
solution isn't possible without changing Gecko.  And (as I said above) this
(probably) isn't something that can be done in a minor release.
Attached patch Fix (obsolete) (deleted) — Splinter Review
Here's my patch for this bug.  It fixes both the crash and the "hang".

Neither fix is perfect, but I think they're the best we can do for
now.

I've already mentioned (in comment #11) the major problem with my fix
for the "hang" (with changing from an app-modal-first strategy to a
Gecko-modal-first strategy) -- the user will usually have to move the
app-modal dialog (Save As or Open File) out of the way before dealing
with the Gecko-modal dialog.  But (like I also said) there's no way
around this using our current modal-window implementation.

The problem with my fix for the crash is that it doesn't fix the
underlying bug -- it doesn't attempt to stop the Gecko-modal dialog
from replacing the app-modal dialog's menu.  But we have other
problems with Gecko-modal dialogs, and I think it's best to put off
dealing with most of them until we can do so comprehensively.

A tryserver build will (I hope) follow in several hours.  Or, if I
have trouble, it may have to wait for next week.
Attachment #353908 - Flags: review?(joshmoz)
> This is largely due to the fact that Gecko doesn't know about app-modal
> dialogs, and doesn't know how to deal with them

Well, hold on.  I could have sworn that Gecko modal dialogs know about each other and don't come up if one is already up.  Couldn't we have whoever opens non-gecko modals just notify the same place, so the gecko ones won't come up?
(In reply to comment #13)

I believe Gecko-modal windows can be nested ... though I don't know of
examples off-hand.

And Cocoa widgets can deal with nested Gecko-modal dialogs if that's
all that's involved (because Cocoa widgets can deal with nested
sheets, and sheets are window-modal).

But what happens when a dialog that Gecko wants to be window-modal
comes up over an OS-provided app-modal dialog?  You've got two
problems, one platform-specific and one not.

1) The platform-specific problem is that Gecko wants to run its own
   modal event loop.  But, while sheets (which don't have their own
   event loops) can deal with that, Cocoa app-modal dialogs (which do
   have their own event loops) can't.

   The correct way to deal with this would be to make provisions for
   Gecko to delegate the managing of the event loop.

2) The more generic problem is that Gecko expects all modal dialogs to
   be window-modal, and window-modal dialogs shouldn't alter the base
   menu.

   But app-modal dialogs do need to alter the base menu (to disable
   certain items for all windows).  And (furthermore) if an app-modal
   dialog is already up, the dialog that pops up above it needs also
   to be app-modal (and to restrict the base menu in the same way).

Part of the reason we got into this trouble is that (unlike OS X)
Windows and Linux don't have "base menus" -- only per-window menus.
> Part of the reason we got into this trouble is that (unlike OS X)
> Windows and Linux don't have "base menus" -- only per-window menus.

A more basic problem is that Gecko's implementation of modality wasn't designed with OS X in mind, or (apparently) tested on OS X, at least at the design stage.
> I believe Gecko-modal windows can be nested

It certainly doesn't seem to happen with alerts:

  javascript: setTimeout(function() { alert('second'); }, 0); alert('first');

Those don't nest on either Mac or Linux.
You're right (I've also spent the last few minutes testing this bug's URL with alerts already open).

So there's now a third feature we need to add to Gecko -- a way to tell it that a modal dialog (window-modal or app-modal) is already open :-)

But I need to dig further to see what else I can find.
(Following up comment #17)

> But I need to dig further to see what else I can find.

I've now dug a little further.  I've looked at modal dialogs spawned
from nsPromptService::DoDialog() (which Cocoa widgets displays as
sheets, e.g. alerts) and nsGlobalWindow::ShowModalDialog() (which
Cocoa widgets displays as ersatz modal windows -- ersatz because the
"normal" Cocoa implementation of modal dialogs doesn't work for them,
because "normal" Cocoa modal dialogs manage their own event loops (see
bug 395465)).

I've discovered that what you (Boris) say about alerts (or more
generally about nsPromptService::DoDialog()) is also true of
nsGlobalWindow::ShowModalDialog(), and of combinations of these two
different kinds of modal "window":  You can't display two of them
simultaneously.  If you try to do this, first one will display, and
then the second will display only after the first has been closed.

Here are some simple testcases:

nsPromptService::DoDialog and nsGlobalWindow::ShowModalDialog():

(For some reason both of these testcases make
nsPromptService::DoDialog run first.)

<html>
<script>
function openModalWindow() {
  showModalDialog('data:text/html,this is a modal dialog');
}
function openAlert() {
  alert('this is an alert');
}
</script>
<body onload="setTimeout(openModalWindow, 0); openAlert();" />
</html>

<html>
<script>
function openModalWindow() {
  showModalDialog('data:text/html,this is a modal dialog');
}
function openAlert() {
  alert('this is an alert');
}
</script>
<body onload="setTimeout(openAlert, 0); openModalWindow()" />
</html>

nsGlobalWindow::ShowModalDialog() twice:

<html>
<script>
function openModalWindow() {
  showModalDialog('data:text/html,this is a modal dialog');
}
function openAlert() {
  alert('this is an alert');
}
</script>
<body onload="setTimeout(openModalWindow, 0); openModalWindow();" />
</html>
(Following up comment #18)

Note that modal windows displayed by nsGlobalWindow::ShowModalDialog()
are hacks (I called them "ersatz"), because proper Cocoa modal dialogs
need to manage their own event loops.

And these modal windows really need to be app-modal, because the only
alternative to a (window-modal) sheet is an app-modal window.

I forgot to mention either of these factors earlier.  But dealing with
them requires the Gecko changes outlined in comment #14 -- even if
Gecko modal dialogs are never nested.
Here's a revised version of my patch.  It deals better with the
possibility that Cocoa app-modal windows can be nested, and that a
Gecko-modal window might be sandwiched between two (nested) Cocoa
app-modal windows.

And here's a tryserver build made with my rev1 patch:

https://build.mozilla.org/tryserver-builds/2008-12-22_13:31-smichaud@pobox.com-bugzilla468393rev1/smichaud@pobox.com-bugzilla468393rev1-firefox-try-mac.dmg

(By the way, I'm now heading off for the holidays.  I won't be able to
get back to this until early next year.)
Attachment #353908 - Attachment is obsolete: true
Attachment #354232 - Flags: review?(joshmoz)
Attachment #353908 - Flags: review?(joshmoz)
Here's an alternative to the testcase from comment #0.  Instead of
(after a 5-second delay) displaying an alert (a window-modal sheet),
it uses showModalDialog() to display an "ersatz" app-modal window (as
I described it in comment #18 and comment #19).

It can be used to reproduce the crash (from comment #9) and the "hang"
(from comment #10) on the trunk and 1.9.1 branch, plus the (true) hang
on the 1.9.0 branch (which as I've said is a dup of bug 436473).

Unfortunately, my rev1 patch doesn't work properly with this testcase,
so I'll need to revise my patch yet again.
Attachment #354232 - Attachment is obsolete: true
Attachment #354232 - Flags: review?(joshmoz)
> It can be used to reproduce the crash (from comment #9) and the
> "hang" (from comment #10) on the trunk and 1.9.1 branch

Oops, no.  It only reproduces the "hang" from comment #10.
(Following up comment #21)

> Unfortunately, my rev1 patch doesn't work properly with this
> testcase, so I'll need to revise my patch yet again.

The problem was that when this testcase's showModalDialog() window
pops up above a Cocoa app-modal dialog (like the Open File dialog),
its close button gets disabled (it just beeps, though it still looks
enabled).  And since the close button is the only way to close that
dialog, you get stuck (and have to force-quit).

I've now got this problem fixed, and will post a new patch in my next
comment.
Here's a new revision of my patch.  It fixes the problem described in
comment #23, along with several other problems I discovered in
extensive testing.

I tested with the testcases from comment #0 (attachment 351645 [details]) and
comment #21 (attachment 356070 [details]), on OS X 10.4.11, 10.5.6 and 10.6
(build 10A222 of SnowLeopard).  There's still some wierdness in
particularly contrived cases (for example when you reload one or both
of these testcases in multiple windows at the same time and then open
a Cocoa app-modal dialog).  But I didn't ever crash, and I never got
"stuck" (I never got to the point where I had to force quit).

There's also wierdness using just Gecko-modal dialogs alone.  And
we'll not be able to get rid of all of it until we rewrite Gecko's
implementation of modal windows along the lines I described in comment
#14.  But that isn't something for the 1.9.1 branch (or possibly even
for the 1.9.X branch).  And in the meantime I think this patch leaves
us in reasonably good shape dealing with combinations of Gecko-modal
and Cocoa app-modal dialogs.

For the record, here are the Cocoa app-modal dialogs I tested with
(which are the only ones used by Firefox, as far as I'm currently
aware):

Print
Page Setup
Open File
Save As

For some reason, the Print dialog isn't app-modal on OS X 10.4.11.  My
patch takes account of this, and leaves the Print dialog alone on OS X
10.4.11.

Here's a tryserver build made with my rev2 patch:

https://build.mozilla.org/tryserver-builds/2009-01-14_08:56-smichaud@pobox.com-bugzilla468393-rev2/smichaud@pobox.com-bugzilla468393-rev2-firefox-try-mac.dmg
Attachment #356977 - Flags: review?(joshmoz)
(Following up comment #17 and comment #18)

I've now found how to get Gecko to nest two alerts (or two
showModalDialogs, or one of each).  This "works" on OS X, Linux and
Windows ... though it isn't readily apparent with two alerts in the
same window on OS X (I'll explain why below):

1) Open this bug's alert testcase (attachment 351645 [details] from comment #0)
   in one or more tabs, and this bug's showModalDialog testcase
   (attachment 356070 [details] from comment #21) in one or more tabs.

   On each load, wait for the alert or showModalDialog to appear
   before proceeding, then dismiss it.

2) Reload one tab (containing either kind of testcase), then quickly
   (before 5 seconds are up) reload another tab (containing either
   kind of testcase).

   Wait about 5 seconds.  In most cases the modal dialogs from both
   testcases will appear, one on top of the other.

   On Windows and OS X they can be dismissed in any order.  On Linux
   the last one to appear must be dismissed first.

   On OS X, if both tabs contain the alert testcase (attachment 351645 [details]
   from comment #0), the alerts won't be nested -- instead they'll
   appear in sequence.  But this *isn't* Gecko's doing -- rather
   there's logic in nsCocoaWindow::Show() to handle the special case
   of nested sheets.  You can see this by running the browser (a debug
   build, or one with debug symbols) in gdb, and breaking on
   nsCocoaWindow::SetModal() -- Gecko calls this method a second time
   before the first alert is dismissed.

I agree with Boris (comment #13) that we should find a way to tell
Gecko that a Cocoa app-modal dialog is open.  But (on current
behavior) Gecko might ignore that advice, and put up another (nested)
modal dialog anyway.

For this and other reasons, we also (eventually) need to change Gecko
as outlined in comment #14.
After reading through this bug and the patch, my feeling is that we should take only the crash fix portion of fix rev2 and file a new bug on the rest. The new bug should clearly outline the general changes to Geckos dialog system that would be necessary in order to fix our multiple-dialog bugs "the right way."

While I'm sure the code in rev2 does fix some important existing bugs, I'm not convinced that it will ultimately be an improvement. We might knock out some bugs with it but we'd be upping our exposure with more hackish code and special casing. It is pretty clear to me that we need a new strategy for dealing with dialogs instead of an improved implementation of our existing strategy. It is getting out of control.

Marking blocking1.9.1+ for the crash component of this bug only.
Flags: blocking1.9.1? → blocking1.9.1+
That said, I certainly appreciate your valiant debugging efforts Steven!
Attached patch Fix rev3 (just fix crash) (deleted) — Splinter Review
Here's the patch Josh requested.

I'd really prefer to also fix the hang (and I think my rev2 patch does
this safely) ... but I don't deny that our implementation of modal
dialogs/windows is a pile of hacks.

A tryserver build will follow in a few hours.
Attachment #356977 - Attachment is obsolete: true
Attachment #358277 - Flags: review?(joshmoz)
Attachment #356977 - Flags: review?(joshmoz)
Attachment #358277 - Flags: superreview?(roc)
Attachment #358277 - Flags: review?(joshmoz)
Attachment #358277 - Flags: review+
Attachment #358277 - Flags: superreview?(roc) → superreview+
Landed my rev3 patch on the trunk and the 1.9.1 branch:
http://hg.mozilla.org/mozilla-central/rev/5d1b3f37a76b
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b87ca096cca8
Status: NEW → RESOLVED
Closed: 15 years ago
Keywords: fixed1.9.1
Resolution: --- → FIXED
Steven, have you already filed the hang as a new bug? I'm still be able to get into such a situation when following comment 0. And I can't find a reference here.
Target Milestone: --- → mozilla1.9.2a1
> Steven, have you already filed the hang as a new bug?

No, not yet.

You can do so if you'd like :-)  See comment #10.

If you don't, I'll get around to it eventually.
Filed as bug 476541. Moving hang keyword over there.
Keywords: hang
Crash Signature: [@ nsObjCExceptionLogAbort - nsCocoaUtils::PrepareForNativeAppModalDialog]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: