Closed
Bug 43466
Opened 25 years ago
Closed 25 years ago
root_points_to_gcArenaPool in mail account wizard
Categories
(SeaMonkey :: General, defect, P3)
Tracking
(Not tracked)
VERIFIED
FIXED
M18
People
(Reporter: alecf, Assigned: danm.moz)
References
Details
(Whiteboard: [nsbeta2-][nsbeta3+])
Attachments
(2 files)
So I'm getting a root_points_to_gcArenaPool assertion using the account wizard.
it's easy to reproduce:
- Open messenger
- File->New->Account
- Hit Next, then hit Cancel to cancel the dialog
- File->New->Account again
- Hit Next/Back once or twice - it crashes pretty quickly..
I've discussed this with shaver before, and some wacky things are going on in
the JS engine:
Assertion failure: root_points_to_gcArenaPool, at jsgc.c:785
Program received signal SIGABRT, Aborted.
0x4045d4e1 in __kill ()
(gdb) sha mozjs
Reading symbols from /home1/alecf/seamonkey/mozilla/dist/bin/libmozjs.so...
done.
(gdb) #0 0x4045d4e1 in __kill ()
#1 0x402201eb in raise (sig=6) at signals.c:64
#2 0x4045e868 in abort () at ../sysdeps/generic/abort.c:88
#3 0x4019d246 in JS_Assert () at jsutil.c:174
#4 0x4016cd27 in gc_root_marker (he=0x8485680, i=81, arg=0x80c1cb0)
at jsgc.c:785
#5 0x4016de3f in JS_HashTableEnumerateEntries (ht=0x80c3ea8,
f=0x4016cca4 <gc_root_marker>, arg=0x80c1cb0) at jshash.c:364
#6 0x4016d06c in js_GC (cx=0x8b5a318) at jsgc.c:952
#7 0x4016cdad in js_ForceGC (cx=0x8b5a318) at jsgc.c:810
#8 0x40151cb6 in JS_GC (cx=0x8b5a318) at jsapi.c:1154
#9 0x402be5e3 in nsJSContext::GC (this=0x8a098e0) at nsJSEnvironment.cpp:1066
(More stack frames follow...)
(gdb) frame 4
#4 0x4016cd27 in gc_root_marker (he=0x8485680, i=81, arg=0x80c1cb0)
at jsgc.c:785
Current language: auto; currently c
(gdb) print he
$1 = (JSHashEntry *) 0x0
tell me that's not wierd... the hash key is 0x0...
now check this out:
(gdb) frame 5
#5 0x4016de3f in JS_HashTableEnumerateEntries (ht=0x80c3ea8,
f=0x4016cca4 <gc_root_marker>, arg=0x80c1cb0) at jshash.c:364
(gdb) print he
$2 = (JSHashEntry *) 0x8485680
I have seen this consistently with gdb 4.17 and gdb 5.0... now the one trick is
that I build with optimized debug (-O2 -g -DDEBUG etc) so that might explain why
he is 0x0.
Reporter | ||
Comment 1•25 years ago
|
||
I thought brendan was the default JS engine owner, adding him to CC..
brendan, got a WIERD gc bug for ya.
Comment 2•25 years ago
|
||
he printing as 0 in frame 4 sounds like gdb lossage.
Usually this assertion botching means someone has freed memory containing a JS
GC root, without calling JS_RemoveRoot{,RT}. If the caller used JS_AddNamedRoot
as all callers should, then (char*) he->value is the root name, a static string
literal telling the source file or class::member name of the rooted pointer.
Another way to botch this assertion is to call JS_Add{Named,}Root(cx, mRooted)
rather than JS_Add{Named,}Root(cx, &mRooted). Roots are registered by reference
so you can make your own member pointers be JS GC roots.
/be
Comment 3•25 years ago
|
||
I doubt this is rogerl's or alecf's bug. But since alecf can reproduce it,
I'm bouncing to him. Alecf, what's (char*)he->value?
/be
Assignee: rogerl → alecf
Reporter | ||
Comment 4•25 years ago
|
||
I think there's gotta be something wierd going on with the debug vs. optimized
JavaScript - because I'm crashing (no assertion) on windows as well, during an
optimized build with debugging on (i.e. debugging symbols and -DDEBUG)
I'll try to get a stack trace, but if I remember right, it looked something
like:
NTDLL
JS_GC
JS_ForceGC
Reporter | ||
Comment 5•25 years ago
|
||
ok, this was driving me crazy, so I decided to sick purify on it.
I'm getting a whole bunch of IPR errors in the GC:
[E] IPR: Invalid pointer read in gc_root_marker {4 occurrences}
Reading 4 bytes from 0x0e95d394 (4 bytes at 0x0e95d394 illegal)
Address 0x0e95d394 points into a committed VirtualAlloc'd block
Thread ID: 0x750
Error location
gc_root_marker [jsgc.c:769]
gc_root_marker(JSHashEntry *he, intN i, void *arg)
{
jsval *rp = (jsval *)he->key;
=> jsval v = *rp;
/* Ignore null object and scalar values. */
if (!JSVAL_IS_NULL(v) && JSVAL_IS_GCTHING(v)) {
js_GC [jsgc.c:952]
js_ForceGC [jsgc.c:810]
JS_DestroyContext [jsapi.c:788]
nsJSContext::`vector deleting destructor'(UINT) [jsdom.dll]
nsJSContext::Release(void) [nsJSEnvironment.cpp:316]
nsDocShell::Destroy(void) [nsDocShell.cpp:1253]
nsXULWindow::Destroy(void) [nsXULWindow.cpp:316]
nsWebShellWindow::Destroy(void) [nsWebShellWindow.cpp:1712]
nsWebShellWindow::Close(void) [nsWebShellWindow.cpp:357]
I get 10 of these after running mozilla with -mail, doing File->New->Account,
creating an account, and quitting.
Comment 6•25 years ago
|
||
What's "a committed VirtualAlloc'd block", anyway? Freed memory, or someone
else's memory?
In any case, it sounds like data structures containing GC-rooted pointers are
being freed without JS_RemoveRoot being called. Again, if you can find what
(char*)he->value is when these IPRs fire, we may have the culprits red-handed.
/be
Reporter | ||
Comment 7•25 years ago
|
||
I'll try for this tomorrow, wish me luck :)
Reporter | ||
Comment 8•25 years ago
|
||
I finally got he->value for you:
(on windows)
(char *)he->value 0x0037ed50 "window_object"
This is the same thing shaver and I saw a while back.
Reporter | ||
Comment 9•25 years ago
|
||
reassign to danm, keeping self on CC.
I'll attach a small patch that brendan and I came up with yesterday that SEEMED
like it should fix the problem but didn't.
Also, after some experimentation, I discovered that part of the problem here is
that in CleanUp, we're destroying mScriptObject, which is bad because in
WindowOpen we need the mScriptObject in order to generate a return value from
the window.openDialog call, but it has already been destroyed by the time we
call nsJSUtils::nsConvertObjectToJSVal(nativeRet, cx, obj, rval)
http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsJSWindow.cpp#2240
Assignee: alecf → danm
Reporter | ||
Comment 10•25 years ago
|
||
nominate for nsbeta2 - this can cause lots of random crashes or possibly leak
large root objects like the window object itself.
Keywords: nsbeta2
Comment 11•25 years ago
|
||
So closing a window must not destroy its script object, or the properties in
that object such as closed (with value true) and other user-defined ones. This
should be straightforward to fix, but I'm sure it won't. Let me know if I can
help (and so help me, I'm gonna reform the bad indentation in nsGlobalWindow.cpp
if you let me near it!).
/be
Comment 13•25 years ago
|
||
Changing component to Browser-General. Not sure of the exact component,
but it's not JavaScript Engine -
Component: Javascript Engine → Browser-General
Updated•25 years ago
|
QA Contact: pschwartau → asa
Updated•25 years ago
|
QA Contact: asa → doronr
Assignee | ||
Comment 14•25 years ago
|
||
Been playing with this for a while now, and I can't find any problem. I've been
stepping through the instructions to reproduce, and trying a small variety of
other windows. I've never noticed the assertion.
Curious fact #2: I haven't noticed any problems with premature deletion of the
script object in nsGlobalWindowImpl::CleanUp(). For starters, the window's
mContext member has always been released and cleared out before CleanUp is called
(by something in nsWebShellWindow::Destroy or Close, depending on the code path)
so CleanUp itself does nothing.
Could be that I haven't understood the intricacies of script object destruction
-- scratch that; I know I haven't -- but I don't see a problem here. Has this
been fixed by some other unknown checkin? Holding the bug open for comments
before WFMing.
Reporter | ||
Comment 15•25 years ago
|
||
I'm still getting this on linux.... are you saying you don't see the crash?
Assignee | ||
Comment 16•25 years ago
|
||
No crash, no worries. Brendan told me you had some unusual assert configuration
(enable-crash-on-assert?) active. Regardless, I see neither the alert nor the
crash. However, I'm told this bug actually describes the premature javascript
object deletion that generally expresses itself as an inability to access
properties of closed windows. Like window.closed, for instance. That's still a
problem (has it ever worked?). I'll work on this from that angle.
Status: NEW → ASSIGNED
Reporter | ||
Comment 17•25 years ago
|
||
well, what I was seeing was this (a better description of my 06-28 12:56
comments - this is my understanding of what's going on, but I could be wrong)
- window is opened via WindowOpen in the JS glue. Dialog is modal, so
nativeThis->Open doesn't return until the window closes
- window is closed, so GlobalWindowImpl::Close() is called, which calls
GlobalWindowImpl::CleanUp()
- CleanUp destroys mScriptObject
- back in nsJSWindow.cpp, nativeThis->Open returns. At this point nativeThis's
mScriptObject is destroyed, and nativeRet is the "native" return value (an
nsIDOMWindow)
- in order to return a JS-friendly return value from WindowOpen, WindowOpen uses
nsJSUtils::nsConvertObjectToJSVal to convert nativeRet into rval, using obj
(which is the window that just closed).. this causes obj->GetScriptObject, which
recreates another script object.
- According to brendan, mScriptObject contains some kind of reference back to
the original window... I'm not sure, but I think the window is getting released
and freed, and leaving this stale mScriptObject as a named root in the JS GC.
I hope that helps.
Reporter | ||
Comment 18•25 years ago
|
||
oh, I'm not running with crash-on-assert turned on or anything - the JS engine
uses PR_Assert which is fatal on all platforms... some people on IRC
yesterday were seeing this too - make sure you're using a debug build, and make
sure that you're opening the account wizard TWICE:
- File->New->Account
- Hit Cancel
- File->New->Account
- Hit Next/Back a few times, until you crash
Assignee | ||
Comment 19•25 years ago
|
||
Ah. Thanks -- the above descriptions help; I've a clearer picture of the problem.
I'm still not quite sure I'm seeing the problem on my build, and I can dance back
and forth well after boredom sets in on my second run at the New Account dialog.
But I know the JS object is being killed off because you can't access it after
the window is closed. Still looking...
Assignee | ||
Comment 20•25 years ago
|
||
Oooo. I figured there was already a bug on the problem I described above. Seems
like this is probably a dupe of 38951. Just marking this one as dependent for
now.
Depends on: 38951
Comment 21•25 years ago
|
||
Doesn't fit latest 'would ya pull it off the wire' criteria, marking nsbeta2-,
nominating for nsbeta3
Keywords: nsbeta3
Whiteboard: [nsbeta2+] → [nsbeta2-]
Reporter | ||
Comment 22•25 years ago
|
||
adding bienvenu to the CC because he was seeing this too.
see also bug 43470
Comment 23•25 years ago
|
||
nsbeta3+
Whiteboard: [nsbeta2-] → [nsbeta2-][nsbeta3+]
Target Milestone: --- → M18
Comment 24•25 years ago
|
||
Comment 25•25 years ago
|
||
I just created an attachment that has the call stack for the
root_points_to_gcArenaPool assertion failure I'm getting with some js heavy
Beatnik content (which you won't be able to repro because the content won't
work without the xpcom plugin that is under development).
Looks related. Any opinions on whether I should open a separate bug?
Comment 26•25 years ago
|
||
sean -- that's the same backtrace: closing a window should not nuke that
window's script object.
/be
Comment 27•25 years ago
|
||
thanks Brendan.
adding beatnik keyword as this bug causes beatnik emixes to crash mozilla.
An example emix can be found at http://www.beatnik.com/emix/ - but those
particular emixes have not been updated to run in mozilla and you need a new
plugin anyways.
Keywords: beatnik
Comment 28•25 years ago
|
||
really adding it...
Assignee | ||
Comment 29•25 years ago
|
||
Ah! I've never been able to reproduce this bug, but these steps from shannond do
it for me:
"...crash on HP-UX and Linux after saving 3 or more files/links from a webpage
(for example those found on: http://slip/projects/seamonkey/im/whitebox_tests/
smoketests.html)."
The crucial thing seems to be to repeat as necessary.
Comment 30•25 years ago
|
||
*** Bug 46430 has been marked as a duplicate of this bug. ***
Whiteboard: [nsbeta2-][nsbeta3+] → [nsbeta2-][nsbeta3+] tentative fix in hand.
Care to attach the tentative fix, for the curious and the eager-to-test?
Assignee | ||
Comment 32•25 years ago
|
||
For you, Mike, anything. Except I don't want to type all the stuff about why I
think it works, because that takes about five minutes. I'm set up to do the
talking thing with Brendan some time today. Basically, modal windows throw away
their script object and then regenerate a fresh, faulty one to hand back to the
script. The new one is never properly hooked up, and ends up being deleted
without being unrooted from the JS GC. Things crash when the GC eventually kicks
in, generally after three or four windows.
There's a simpler fix for that. But while I'm at it, I'm trying to address bug
38951 (and apparently several others that have scary patches everywhere the DOM
window realizes it needs to let go of its JS context), preventing the premature
release and regeneration of the script object.
My patch isn't quite ready for prime time, but I believe it fixes this crash.
It puts the onus of remembering what script objects have been given GC root-hood
in the JS context itself, allowing the DOM Window to relax. Obviously I'm asking
for trouble. Take a look!
--- ./dom/src/base/nsJSEnvironment.h.1.41 Thu Aug 24 10:54:22 2000
+++ ./dom/src/base/nsJSEnvironment.h Thu Aug 24 10:54:08 2000
@@ -36,6 +36,7 @@
#include "nsIScriptNameSpaceManager.h"
#endif
class nsIPrincipal;
+class nsVoidArray;
class nsJSContext : public nsIScriptContext {
private:
@@ -49,6 +50,7 @@
nsCOMPtr<nsISupports> mRef;
PRBool mScriptsEnabled;
PRUint32 mBranchCallbackCount;
+ nsVoidArray *mReferences;
static JSBool PR_CALLBACK DOMBranchCallback(JSContext *cx, JSScript *script);
--- ./dom/src/base/nsJSEnvironment.cpp.1.104 Thu Aug 24 10:54:24 2000
+++ ./dom/src/base/nsJSEnvironment.cpp Thu Aug 24 10:54:08 2000
@@ -52,6 +52,7 @@
#include "nsIInterfaceRequestor.h"
#include "nsIPrompt.h"
#include "nsIObserverService.h"
+#include "nsVoidArray.h"
// Force PR_LOGGING so we can get JS strict warnings even in release builds
#define FORCE_PR_LOG 1
@@ -256,6 +257,7 @@
nsJSContext::nsJSContext(JSRuntime *aRuntime)
{
NS_INIT_REFCNT();
+ mReferences = 0;
mContext = ::JS_NewContext(aRuntime, gStackSize);
if (mContext) {
::JS_SetContextPrivate(mContext, (void *)this);
@@ -294,6 +296,7 @@
mTerminationFunc = nsnull;
mScriptsEnabled = PR_TRUE;
mBranchCallbackCount = 0;
+ mReferences = new nsVoidArray(4);
}
const char kScriptSecurityManagerProgID[] = NS_SCRIPTSECURITYMANAGER_PROGID;
@@ -310,6 +313,14 @@
if (!mContext)
return;
+ if (mReferences) {
+ PRInt32 ctr;
+ for (ctr = mReferences->Count() - 1; ctr >= 0; ctr--)
+ ::JS_RemoveRoot(mContext, mReferences->ElementAt(ctr));
+ delete mReferences;
+ mReferences = 0;
+ }
+
/* Remove global object reference to window object, so it can be collected. */
::JS_SetGlobalObject(mContext, nsnull);
::JS_DestroyContext(mContext);
@@ -1146,12 +1157,22 @@
NS_IMETHODIMP
nsJSContext::AddNamedReference(void *aSlot, void *aScriptObject, const char *
aName)
{
- return (::JS_AddNamedRoot(mContext, aSlot, aName)) ? NS_OK : NS_ERROR_FAILURE;
+ if (::JS_AddNamedRoot(mContext, aSlot, aName)) {
+ if (mReferences)
+ mReferences->AppendElement(aSlot);
+ return NS_OK;
+ }
+ return NS_ERROR_FAILURE;
}
NS_IMETHODIMP
nsJSContext::RemoveReference(void *aSlot, void *aScriptObject)
{
+ if (mReferences) {
+ PRInt32 idx = mReferences->IndexOf(aSlot);
+ if (idx >= 0)
+ mReferences->RemoveElementAt(idx);
+ }
return (::JS_RemoveRoot(mContext, aSlot)) ? NS_OK : NS_ERROR_FAILURE;
}
--- ./dom/src/base/nsGlobalWindow.cpp.1.323 Thu Aug 24 10:54:25 2000
+++ ./dom/src/base/nsGlobalWindow.cpp Thu Aug 24 10:54:10 2000
@@ -142,8 +142,6 @@
void GlobalWindowImpl::CleanUp()
{
- if (mContext)
- mContext->RemoveReference(&mScriptObject, mScriptObject);
mContext = nsnull; // Forces Release
mDocument = nsnull; // Forces Release
NS_IF_RELEASE(mNavigator);
@@ -224,13 +222,6 @@
NS_IMETHODIMP GlobalWindowImpl::SetContext(nsIScriptContext* aContext)
{
- // if setting the context to null, then we won't get to clean up the
- // named reference, so do it now
- if (!aContext) {
- NS_WARNING("Possibly early removal of script object, see bug #41608");
- mContext->RemoveReference(&mScriptObject, mScriptObject);
- }
-
mContext = aContext;
return NS_OK;
}
@@ -336,8 +327,6 @@
jsval val = BOOLEAN_TO_JSVAL(JS_TRUE);
::JS_SetProperty((JSContext *) mContext->GetNativeContext(),
(JSObject *) mScriptObject, "closed", &val);
- mContext->RemoveReference(&mScriptObject, mScriptObject);
- mScriptObject = nsnull;
}
mContext = nsnull; // force release now
}
Comment 33•25 years ago
|
||
*** Bug 48934 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 34•25 years ago
|
||
Ended up doing something similar to the above patch, except special-cased the
window script object, rather than keeping track of every JS object referenced.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Whiteboard: [nsbeta2-][nsbeta3+] tentative fix in hand. → [nsbeta2-][nsbeta3+]
Comment 35•25 years ago
|
||
*** Bug 44777 has been marked as a duplicate of this bug. ***
Comment 36•25 years ago
|
||
More stack backtraces in 44777. NOTE: this was not fixed as of a fresh
pull/clobber/depend/build on 08/25/00, however I see the fix notice is dated
8/26/00, so we're probably ok. I've been getting these crashes randomly every
week or so.
Comment 37•25 years ago
|
||
I did a pull this morning (8/28) and I still get the following assertion:
Error: The address passed to JS_AddNamedRoot currently holds an invalid jsval.
This is usually caused by a missing call to JS_RemoveRoot.
Root name is "timeout.expr".
Assertion failure: root_points_to_gcArenaPool, at h:\moz\mozilla\js\src\jsgc.c:668
I'm pretty sure this is related to the window script object even though
he->value is timeout.expr rather than window_object.
I'll attach a call stack.
Let me know if I should open a new bug altogether.
Comment 38•25 years ago
|
||
Assignee | ||
Comment 39•25 years ago
|
||
Sean: I'd call the problem you're describing (just above) a different one.
Barring something really nefarious, it looks like a flaw in the window shutdown
logic of SetNewDocument, but a different flaw from the one involving the
window_object object. Please file a new bug for this one, and mention how to make
it happen. (Personally, I've never even seen the crash mentioned at the top of
this bug report on a Windows build.)
Comment 40•25 years ago
|
||
Dan: thanks. opened bug 50705.
Comment 41•24 years ago
|
||
dan, can you suggest/update the QA Contact for this bug. Thanks.
Assignee | ||
Comment 42•24 years ago
|
||
Updating QA contact. Philip: this bug is garbage collector timing dependent. See
my comment dated 2000-08-07 19:40 for a reliable way to reproduce it (stolen from
bug 46430).
QA Contact: doronr → pschwartau
Comment 43•24 years ago
|
||
Marking VERIFIED FIXED. Using the steps to reproduce above, from bug 46430.
No matter how many times I save the links there, I do not crash.
Using Mozilla trunk binaries 20010706xx on WinNT, Linux, and Mac -
Status: RESOLVED → VERIFIED
Updated•20 years ago
|
Product: Browser → Seamonkey
You need to log in
before you can comment on or make changes to this bug.
Description
•