529834 - How are we getting the poison frame?

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Description

•

15 years ago

Spin off from bug 529371. See example crash stack: http://crash-stats.mozilla.com/report/index/bp-6e723c79-9a9d-4b3d-93e5-728442091119 It is possible to request to attempt creation of an nsWeakFrame initialized with a poisoned frame. This currently crashes us. Do we want to guard against this?

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 1

•

15 years ago

Also it appears possible that aSeeminglyLivingFrame->GetNextSibling() can return a poisoned frame.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 2

•

15 years ago

Well, things are already very wrong if someone is trying to use a poisoned frame. nsWeakFrame was originally designed to be initialized with a frame, which is known to be alive. Then while the nsWeakFrame is alive (normally as stack variable, but it doesn't have to be), weakFrame.IsAlive() check can be done after each possibly evil method call. Why is a11y code keeping raw references to nsIFrame objects? Although it could slow things down, replacing nsIFrame* with nsWeakFrame could help. Then later in the code use nsIFrame* f = nsWeakFrame.GetFrame() (need to null check f of course).

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 3

•

15 years ago

We are not holding raw nsIFrame objects very long anymore. In the crash stack above I don't see how the curFrame could go bad. Is frame poisoning happening off the main thread?

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 4

•

15 years ago

(In reply to comment #3) > Is frame poisoning happening off the main thread? No.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 5

•

15 years ago

OK. Note we use nsWeakFrame quite a bit now; is it ok to hold onto them for a little while? In other words, can nsWeakFrame::GetFrame return a poisoned frame?

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 6

•

15 years ago

If nsWeakFrame is initialized with a valid nsIFrame, GetFrame returns either that frame (if it is alive), or nsnull. We should try to avoid using nsWeakFrames if possible. But in some cases we do need to hold to an nsIFrame object for a bit longer. For example nsEventStateManager has few nsWeakFrame member variables.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 7

•

15 years ago

OK, so this bug is probably really about comment #1. I suspect we need one of: 1. GetNextSibling to return a living frame or nsnull. 2. Some API like IsFramePoisoned(nsIFrame*). Or longer term: 3. A way accessibility can get native anonymous frame content more safely.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 8

•

15 years ago

(In reply to comment #7) > 1. GetNextSibling to return a living frame or nsnull. So in which case can GetNextSibling return a deleted frame? Is the 'this' in that case really alive? > 2. Some API like IsFramePoisoned(nsIFrame*). This shouldn't be needed, and I think wouldn't even be really possible, because the memory may have been reallocated, so the frame doesn't necessarily point to poisoned memory, but to a newly created frame object. We have nsWeakFrame to detect if something gets deleted.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 9

•

15 years ago

(In reply to comment #8) > (In reply to comment #7) > > 1. GetNextSibling to return a living frame or nsnull. > So in which case can GetNextSibling return a deleted frame? > Is the 'this' in that case really alive? > Looking at the crash state here's my take on it (I could be wrong): Note mState.frame is type nsWeakFrame. In UpdateFrame(PR_FALSE), we get by this: nsIFrame *curFrame = mState.frame.GetFrame(); if (!curFrame) { return; } We then do: mState.frame = curFrame->GetNextSibling(); This invokes the operator= of nsWeakFrame, and we end up at: nsWeakFrame::InitInternal And fail on: nsIPresShell* shell = mFrame->PresContext()->GetPresShell(); With a poisoned signature.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 10

•

15 years ago

s/crash state/crash stack (In reply to comment #8) > (In reply to comment #7) > > 2. Some API like IsFramePoisoned(nsIFrame*). > This shouldn't be needed, and I think wouldn't even be really possible, because > the memory may have been reallocated, so the frame doesn't necessarily > point to poisoned memory, but to a newly created frame object. > We have nsWeakFrame to detect if something gets deleted. OK.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 11

•

15 years ago

(In reply to comment #7) > OK, so this bug is probably really about comment #1. I suspect we need one of: > > 1. GetNextSibling to return a living frame or nsnull. Or maybe some nsWeakFrame::GetNextSibling API? Can anyone confirm my analysis in comment 9?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 12

•

15 years ago

David, it seems like you don't really understand what's going on here. We probably didn't explain it very well. 0) Lots of operations can destroy frames, including anything that directly or indirectly calls nsIPresShell::FlushPendingNotifications or removes content from the DOM. 1) An nsIFrame* pointer should always refer to a live frame (unless it's a variable that you're never going to use again). If it's ever a dangling pointer that you might use again, that is a critical bug and we're doomed. 2) Assuming that 1) is OK, calling frame->GetNextSibling() will always return a pointer to a live frame, unless the frame tree is corrupt, which is also a critical bug that means we're doomed. Those bugs are not common so it's unlikely that's what's happening in these crashes. 3) Given point 1, IsFramePoisoned is not a useful API (and as Olli pointed out, it's not really implementable either). If you have an nsIFrame* pointer that may or may not be dangling, you just can't use it. 4) All nsWeakFrame does is clear itself to null when the frame it's holding is destroyed. That means if you passed in a good live frame pointer to begin with, calling GetFrame() later will either return null or a live frame. 5) nsWeakFrame effectively hooks itself up to the frame, so if you pass in a bogus frame pointer, it will crash, as we see in these stacks. Since it's so easy for frames to go away without warning, code outside content/layout really shouldn't ever touch frames. They are dangerous.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 13

•

15 years ago

OK, so Marco can reproduce the bug. That's good. What are the exact steps to reproduce? What's the easiest way to trigger accessibility on Linux?

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 14

•

15 years ago

I probably misspoke if it appears I don't understand what is going on here. Everything in comment 12 makes sense to me. I'm just not seeing where we are holding onto an nsIFrame* pointer during a frame flush in the crash stack above.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 15

•

15 years ago

(In reply to comment #13) > OK, so Marco can reproduce the bug. That's good. > > What are the exact steps to reproduce? What's the easiest way to trigger > accessibility on Linux? On GNOME in the system prefs, under Assistive Technologies there should be a checkbox like "Enable Assistive Technologies". I use Accerciser to confirm FF a11y is going: http://live.gnome.org/Accerciser#head-2b5a8f89c9b01a5a820ffe224f7ebe5e3c8de0d6

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 16

•

15 years ago

For Linux steps to reproduce this stack (from aja I think) suggests displaying addons extensions tab is enough. http://crash-stats.mozilla.com/report/index/a96616e6-43ae-4c73-a997-51f1e2091119 I'll be updating my linux vm tonight.

Marco Zehe (:MarcoZ)

Comment 17

•

15 years ago

I'm on Windows while I reproduce the crash. All I do is: 1. Have NVDA running (http://www.nvda-project.org). 2. Start the nightly of either Minefield or Namoroka. 3. Go to Tools/Add-Ons. By default it opens to the "Extensions" page for me (in builds that don't crash). This dialog never comes up, or at least NVDA doesn't tell me. Instead, Firefox crashes.

alexander :surkov (:asurkov)

Comment 18

•

15 years ago

(In reply to comment #17) > I'm on Windows while I reproduce the crash. All I do is: > 1. Have NVDA running (http://www.nvda-project.org). > 2. Start the nightly of either Minefield or Namoroka. > 3. Go to Tools/Add-Ons. By default it opens to the "Extensions" page for me (in > builds that don't crash). This dialog never comes up, or at least NVDA doesn't > tell me. Instead, Firefox crashes. I don't get crash on trunk with 0.6p3.2 NVDA. "Extensions" page is appeared.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 19

•

15 years ago

I don't get a crash with that version of NVDA on my own build, either.

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Updated

•

15 years ago

Summary: nsWeakFrame should detect nsIFrame poisoning → How are we getting the poison frame?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 20

•

15 years ago

Marco, can you do the following? 1) attach Visual Studio to your nightly build before Firefox crashes 2) set a breakpoint at nsFrame::Destroy 3) set a breakpoint on nsAccessibleTreeWalker::nsAccessibleTreeWalker and set its action to enable the breakpoint created in step #2 4) set a breakpoint on nsAccessibleTreeWalker::~nsAccessibleTreeWalker and set its action to disable the breakpoint created in step #2 5) try to trigger the crash... I have no idea if Visual Studio is accessible enough to do steps 2 to 4... David, if you can reproduce on Linux then you could do this using gdb.

Marco Zehe (:MarcoZ)

Comment 21

•

15 years ago

Roc, here's the big problem: I don't crash if I build locally or if I use a try-server build. I only crash with regular nightlies, for whatever frickin' reason. Another problem is that Visual Studio is, in principle, accessible to do the steps you ask me for. But because screen readers (any of them) hook deeply into the Firefox process, the moment the debugger halts at a breakpoint I'm dead in the water. My screen reader is halted along with Firefox. While mouse and keyboard actions still work, I no longer have control over what I'm doing, and I have no sighted assistance currently available to help me in such instances. And as for the screen reader version: I am using NVDA 2009.1RC1, and I am on Windows 7, 64 bit edition. I never saw this crash (or the original one in bug 525579) when I was still on Windows XP. That might have something to do with it, too.

Marco Zehe (:MarcoZ)

Comment 22

•

15 years ago

FYI: I am able to reproduce the still-happening crash of bug 529442 in Windows XP with the regular November 19 nightly build, see bp-780647b3-a6c4-4f27-8729-b01492091119

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 23

•

15 years ago

Aha, I can reproduce with the nightly build!

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 24

•

15 years ago

Oh thank you! Silly popup frame tree was biting us as per comment 1. (Readers can follow on bug 529371 comment 25)

David Bolter [:davidb] (NeedInfo me for attention)

Reporter

Comment 25

•

15 years ago

Pretty sure Roc nailed it. Closing as duplicate of fixed bug 529371; thanks all!

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → DUPLICATE

Daniel Veditz [:dveditz]

Updated

•

15 years ago

Whiteboard: [sg:dupe 529371]

Daniel Veditz [:dveditz]

Updated

•

13 years ago

Group: core-security