Closed Bug 466826 Opened 16 years ago Closed 16 years ago

Running mochitest asserts: ###!!! ASSERTION: should be executing script: 'mContext->fp', file /mozilla/dom/src/base/nsJSEnvironment.cpp, line 3318

Categories

(Core :: DOM: Core & HTML, defect, P2)

x86
All
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.1

People

(Reporter: smaug, Assigned: bent.mozilla)

References

Details

(Keywords: regression, verified1.9.1)

Attachments

(2 files, 1 obsolete file)

No description provided.
And note, mochitest gives the assertion quite a few times. Would be great to reduce the assertion noise.
Here's the relevant stack: 0 nsJSContext::SetTerminationFunction (this=0xf907510, aFunc=0x2520af8 <nsGlobalWindow::ClearWindowScope(nsISupports*)>, aRef=0xb08f790) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsJSEnvironment.cpp:3318 #1 0x0253aae8 in nsGlobalWindow::SetNewDocument (this=0x10269f50, aDocument=0xf56aa00, aState=0x0, aClearScopeHint=0, aIsInternalCall=1) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsGlobalWindow.cpp:1816 #2 0x02539c49 in nsGlobalWindow::SetNewDocument (this=0xb08f790, aDocument=0xf56aa00, aState=0x0, aClearScopeHint=0, aIsInternalCall=0) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsGlobalWindow.cpp:1548 #3 0x0253b932 in nsGlobalWindow::SetNewDocument (this=0xb08f790, aDocument=0xf56aa00, aState=0x0, aClearScopeHint=0) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsGlobalWindow.cpp:1506 #4 0x02400d8e in nsHTMLDocument::OpenCommon (this=0xf56aa00, aContentType=@0xbfffa960, aReplace=0) at /Users/bzbarsky/mozilla/debug/mozilla/content/html/document/src/nsHTMLDocument.cpp:1912 (gdb) frame 1 #1 0x0253aae8 in nsGlobalWindow::SetNewDocument (this=0x10269f50, aDocument=0xf56aa00, aState=0x0, aClearScopeHint=0, aIsInternalCall=1) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsGlobalWindow.cpp:1816 1816 (currentInner)); (gdb) p cx $7 = (JSContext *) 0xeed1400 (gdb) p callerScx $8 = (nsJSContext *) 0xf907510 (gdb) frame 6 #6 0x0256316d in nsHTMLDocumentSH::DocumentOpen (cx=0x2132a000, obj=0x7e2ef20, argc=0, argv=0x2142dad0, rval=0xbfffaaec) at /Users/bzbarsky/mozilla/debug/mozilla/dom/src/base/nsDOMClassInfo.cpp:8385 8385 rv = doc->Open(contentType, replace, getter_AddRefs(retval)); (gdb) p cx $9 = (JSContext *) 0x2132a000 But cx->fp is null for both of those JSContexts, for what it's worth. The |cx| in frame 6 does have a dormantFrameChain that actually points to a script (http://localhost:8888/tests/content/base/test/test_bug372964-2.html). The |cx| in frame 0 doesn't even have that. So as I see it, there are two issue here: 1) We've set aside the frame chain by the time we get into this code, for some reason. 2) We're getting what looks like the wrong cx in SetNewDocument. I can verify that running just http://localhost:8888/tests/content/base/test/test_bug372964-2.html triggers this assertion.
Fun exciting times. So we get into nsGlobalWindow::SetNewDocument. This pushes the window's own JSContext on the stack (added in bug 460811). Then later on in the same function we try to call nsContentUtils::GetCurrentJSContext, but that of course returns the JSContext we just pushed, not what it should return. It's not clear to me why this context push got added, but if it's needed we need to fetch the current JSContext before we push our own on. Not doing that leads to us not clearing scope correctly here (because our termination function should never be called, since it's not on a context with script running). It should be possible to write a test for this (and I wish we'd had one). Even if we fix the above, the assert in question will still trigger, since cx->fp will be null (though cx->dormantFrameChain won't be). Not sure whether asserting cx->fp || cx->dormantFrameChain is the right thing to do there. I'd still like to know why that cx push is needed, exactly.
Blocks: 460811
No longer blocks: 353090
Flags: blocking1.9.1?
Keywords: regression
Attached file testcase (deleted) —
Writing a test to exercise the termination function stuff might be kind of fun, because SetNewDocument() calls SetStatus() which in Firefox (but NOT Gecko apps in general) runs script on the JSContext that's currently on the JSContext stack (so the one for the document that open() is happening for) by calling out to chrome script to set the status. So we are in fact calling the termination function in this case, more or less by accident. And only in XUL-based Gecko apps, so I'd think that a trunk-based camino would have security issues here.
Oh, just allowing silent setting of a termination func when not cx->fp but cx->dormantFrameChain is no good, because of the SetStatus behavior. See bug 466836.
Severity: normal → critical
So Ben says the reason we push the cx here is that we enter a request, and apparently we need to maintain the invariant that the request we're in is on the cx that's on the top of the stack. It's not clear to me how that's supposed to work given that we can be in requests on multiple JSContexts at once as long as they're all on the main thread, no? Or am I missing something? ccing brendan for his wisdom. In any case, what we might want here are more fine-grained requests instead of the one whole-method request, so that we're not in one of those requests when we set the termination func. We might also want to have our auto-request helpers assert if the context passed in is not at the top of the stack, if that's an invariant that needs to hold.
The trick is that we have to be able to suspend any active requests when switching contexts on the same stack (otherwise we could deadlock), and the best way (that I know) to do that currently is to use the context stack.
Oh, and the switching is tough to spot. Anyone who needs a context can grab the safeJSContext, for instance, and if a context with active requests isn't in the stack then we won't suspend requests properly.
So do we just suspend requests on all contexts on the stack? As in, the context we start the request on just needs to be on the stack, not necessarily the topmost one?
Pushing onto the context stack only suspends the topmost context's requests.
Assignee: nobody → bent.mozilla
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Target Milestone: --- → mozilla1.9.1
Then what happens if requests are live on contexts below it on the stack? Deadlock?
I split the wrong context problem off to bug 469905.
No longer blocks: 460811, 469905
Blocks: 460811
Attached patch Patch, v1 (obsolete) (deleted) — Splinter Review
Ok, pulling this bug back together, this seems to work. Thanks bz!
Attachment #353493 - Flags: superreview?(mrbkap)
Attachment #353493 - Flags: review?(bzbarsky)
Attachment #353493 - Flags: superreview?(mrbkap) → superreview+
Attached patch Patch, v1.1 (deleted) — Splinter Review
Ugh, and let's not add another deadlock hazard. Have to suspend cx's requests before pop/push.
Attachment #353493 - Attachment is obsolete: true
Attachment #353526 - Flags: superreview+
Attachment #353526 - Flags: review?(bzbarsky)
Attachment #353493 - Flags: review?(bzbarsky)
Comment on attachment 353526 [details] [diff] [review] Patch, v1.1 Looks good. Thanks!
Attachment #353526 - Flags: review?(bzbarsky) → review+
Pushed changeset 6f760def9fcd to mozilla-central.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Pushed changeset 1ff65c7790b1 to mozilla-1.9.1.
Keywords: fixed1.9.1
verified fixed 1.9.1 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090119 Shiretoko/3.1b3pre - i have seen this assertion a lot of times durign the Topsite Run (with a build before this patch was checked in), with the new Build, i can not reproduce this assertion anymore !
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: