Closed Bug 58551 Opened 24 years ago Closed 24 years ago

crash on startup -- 10/30 branch build [@ js_GetSlotWhileLocked]

Categories

(Core :: DOM: Core & HTML, defect, P3)

VERIFIED DUPLICATE of bug 31847

People

(Reporter: jrgmorrison, Assigned: joki)

References

Details

(Keywords: smoketest, topcrash, Whiteboard: [rtm++])

Crash Data

Attachments

(3 files)

Posted this to the hook and n.p.m.builds. (This may not be JS per se but I don't have any other clues right now). I'm assuming that the builds in ftp://sweetlou/products/client/seamonkey/windows/32bit/x86/2000-10-30-20-MN6/ should be good to go, right? On win2k and win98, I am crashing on startup with this talkback provided trace (the top few lines vary somewhat, but always in jslock.c). Sometimes, with a new profile, I can start OK, but the second run will crash, either just as the browser window starts to come up, or on one occasion just after it had finished coming up. I'm filing a bug now.
John
js_GetSlotWhileLocked [d:\builds\seamonkey\mozilla\js\src\jslock.c, line 272]
JS_GetPrivate [d:\builds\seamonkey\mozilla\js\src\jsapi.c, line 1799]
nsScriptSecurityManager::GetFunctionObjectPrincipal [d:\builds\seamonkey\mozilla\caps\src\nsScriptSecurityManager.cpp, line 842]
nsScriptSecurityManager::CheckFunctionAccess [d:\builds\seamonkey\mozilla\caps\src\nsScriptSecurityManager.cpp, line 610]
nsJSContext::CallEventHandler [d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 911]
nsJSDOMEventListener::HandleEvent [d:\builds\seamonkey\mozilla\dom\src\events\nsJSDOMEventListener.cpp, line 89]
nsEventListenerManager::HandleEventSubType [d:\builds\seamonkey\mozilla\layout\events\src\nsEventListenerManager.cpp, line 789]
nsEventListenerManager::HandleEvent [d:\builds\seamonkey\mozilla\layout\events\src\nsEventListenerManager.cpp, line 1372]
GlobalWindowImpl::HandleDOMEvent [d:\builds\seamonkey\mozilla\dom\src\base\nsGlobalWindow.cpp, line 512]
DocumentViewerImpl::LoadComplete [d:\builds\seamonkey\mozilla\layout\base\src\nsDocumentViewer.cpp, line 676]
nsWebShell::OnEndDocumentLoad [d:\builds\seamonkey\mozilla\docshell\base\nsWebShell.cpp, line 954]
nsDocLoaderImpl::FireOnEndDocumentLoad [d:\builds\seamonkey\mozilla\uriloader\base\nsDocLoader.cpp, line 818]
nsDocLoaderImpl::DocLoaderIsEmpty [d:\builds\seamonkey\mozilla\uriloader\base\nsDocLoader.cpp, line 620]
nsDocLoaderImpl::OnStopRequest [d:\builds\seamonkey\mozilla\uriloader\base\nsDocLoader.cpp, line 555]
nsLoadGroup::RemoveChannel [d:\builds\seamonkey\mozilla\netwerk\base\src\nsLoadGroup.cpp, line 583]
nsJARChannel::OnStopRequest [d:\builds\seamonkey\mozilla\netwerk\protocol\jar\src\nsJARChannel.cpp, line 709]
nsOnStopRequestEvent::HandleEvent [d:\builds\seamonkey\mozilla\netwerk\base\src\nsAsyncStreamListener.cpp, line 302]
nsStreamListenerEvent::HandlePLEvent [d:\builds\seamonkey\mozilla\netwerk\base\src\nsAsyncStreamListener.cpp, line 106]
PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 581]
PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 517]
_md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 1051]
KERNEL32.DLL + 0x24407 (0xbff94407)
0x00658b52
adding putterman. he's seeing js crashes too.
There are 25 reports already for this from Netscape6 Builds, including 3 from me.
(about 8 were from me, on three machines. [Hey, I wanted to be sure]). But note: I'm not crashing in the mozilla build on the branch from the same build time. (Is it a build thing, or from the commercial tree ...)
*** Bug 58550 has been marked as a duplicate of this bug. ***
The other top of stack that talkback provides for this crash is
ntdll.dll + 0xed61 (0x77f8ed61)
ntdll.dll + 0xecf1 (0x77f8ecf1)
js_LockScope1 [d:\builds\seamonkey\mozilla\js\src\jslock.c, line 668]
js_LockObj [d:\builds\seamonkey\mozilla\js\src\jslock.c, line 739]
js_GetSlotWhileLocked [d:\builds\seamonkey\mozilla\js\src\jslock.c, line 295]
JS_GetPrivate [d:\builds\seamonkey\mozilla\js\src\jsapi.c, line 1799]
nsScriptSecurityManager::GetFunctionObjectPrincipal [d:\builds\seamonkey\mozilla\caps\src\nsScriptSecurityManager.cpp, line 842]
nsScriptSecurityManager::CheckFunctionAccess [d:\builds\seamonkey\mozilla\caps\src\nsScriptSecurityManager.cpp, line 610]
... snip ...
I'm getting the same crash on Linux.
Yeah, I got one of the JS_LockScope1 crashes as well.
Just one more data point: I just pulled fresh commercial NS6 bits from sweetlou and things work just fine on Win2k here...
This 'xxx-20' build would be a build made in the middle of the day today after most of the limbo fixes went in, right? The 'xxx-08' build doesn't have this problem, right? I'm not seeing problems with my branch debug build from this afternoon on NT. I'll update again and build the commercial part too.
I'm upgrading severity. This is the build (from afternoon, evening) which the release team made after all the Limbo bugs were checked in.
Severity: normal → critical
My crash window from the OS said kernel32.dll. Here are some talkback IDs: TB20130288H, TB20130227E. Am I having this bug or a different one? (I'm also on the limbo 1 build.)
Raise severity to blocker, add a couple random keywords
Severity: critical → blocker
Keywords: rtm, smoketest
Steve, yes, both of yours were js_LockScope1 crashes.
*** Bug 58606 has been marked as a duplicate of this bug. ***
I crash on startup with debug build. The one I downloaded from sweetlou seems to work fine except I crash 100% on Help > About Netscape. The crash happens in some js DLL, so I am assuming it could be the same as this. This is on NT.
seen on branch builds:
windows 2000-10-31-09-MN6
linux 2000-10-31-09-MN6
mac 2000-10-31-08-MN6
note: trunk builds from this morning worked fine
Here's my first guess: Bug #53849 looks very suspicious to me. I haven't talked to a developer who's seen this crash in a debug build yet. Only in release builds. Our debug builds don't have JAVA so that would make sense. One problem with my theory is that OJI shouldn't be invoked on startup, right? Unless the plugin manager is doing some funky pre-loading of some things....
I also crash when doing Ctrl+n to open a new window.
mscott and I are about to test out his theory. As soon as I moved the java plugins into my build directory I crashed at this spot.
OK. I can get this in the debugger on my branch commercial build. I have not seen mozilla.exe crash in this way. This is a garbage JSObject that the code assumes is a function object. It is non-null but not a valid JSObject. It looks like pretty random memory contents. I'll dig further.
I see this crash every time with an existing profile. However, not that frequently with a *new* profile!!!
Here are the files we are backing out right now if others want to try too:
cvs update -j1.19.4.1 -j1.19 mozilla/js/src/liveconnect/jsj_JSObject.c
cvs update -j1.14.30.1 -j1.14 mozilla/js/src/liveconnect/jsjava.h
cvs update -j1.28.2.1 -j1.28 mozilla/modules/oji/src/lcglue.cpp
cvs update -j1.3.54.1 -j1.3 mozilla/modules/oji/src/lcglue.h
cvs update -j1.8.2.1 -j1.8 mozilla/modules/oji/src/nsCSecurityContext.cpp
cvs update -j1.4.4.1 -j1.4 mozilla/modules/oji/src/nsCSecurityContext.h
You may need to copy JAVA over to your plugins directory in your debug build...
From my point of view I want to know why that particular nsJSDOMEventListener object has an mHandler that is non-null and not a valid JSObject. Was there a rooting problem? Or what?
mscott, I'm seeing this on a build w/o Java. I'm not clear on your rationale.
when I backed out the files mscott mentions, the crash went away. My crash didn't get into js_GetSlotWhileLocked, but it did get into here.
JS_GetPrivate(JSContext * 0x030081d0, JSObject * 0x00d88ca0) line 1797 + 10 bytes
nsScriptSecurityManager::GetFunctionObjectPrincipal(nsScriptSecurityManager * const 0x0213b050, JSContext * 0x030081d0, JSObject * 0x00d88ca0, nsIPrincipal * * 0x0012ee68) line 841 + 14 bytes
nsScriptSecurityManager::CheckFunctionAccess(nsScriptSecurityManager * const 0x0213b050, JSContext * 0x030081d0, void * 0x00d88ca0, void * 0x00d17150) line 609 + 44 bytes
jband: mHandler is not rooted. That's reported in bug 31847, which was futured (!). /be
jband, can you revert the files I listed above and see if you still crash? Your stack trace is slightly different from the one we are fixing in this bug so maybe there's another problem somewhere?
That is true. We have not fixed the rooting problem. But in general we have not been running into it in normal usage. The cases where we have run into it have generally been tracked back to someone doing something they shouldn't have been doing. That was the rationale for futuring.
I'm crashing 75% of the time on mac, in PR_Lock called from js_InitContextForLocking. Setting platform to All. Existing profile, 10/31/00-08 branch build.
OS: Windows 2000 → All
Hardware: PC → All
mscott: you can get crashes in several places in the JS engine by feeding its API a bogus JSObject pointer. That's what will happen in some cases (likelier now with Jeff Dyer's fix to bug 53849, although I don't have a complete theory for why this isn't showing up in trunk builds) due to bug 31847. I don't think any content has to misbehave to result in a crash due to 31847; I think we futured that bug by making a wish based on the low frequency observed at the time. We have high frequency now! I'm betting there is no "other bug" here, just Jeff's valid security fix to 53849 in conjunction with 31847. /be
mscott, mine is not so different. js_CompareAndSwap is inlined and has asm. I wouldn't expect to see it in talkback. If we have a bad handler, then various events could hit it. I fear voodoo in backing out changes that *may* have little to do with the real problem. I'm also not certain how predictably I can reproduce this.
If bug 31847 is biting, then the new problem could be in xul or js files. We should look for multiple event handlers for the same event.
Anyone with strict JS warnings enabled see warnings about "redeclaration of function foo"? That could tickle 31847. /be
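To illustrate why a redeclared function could matter here, a rough sketch follows. The `listener` object is a hypothetical stand-in for whatever stores the handler, and nothing below reproduces the engine behavior tracked in bug 31847. The assumption is that once a second declaration replaces the global binding, the first function object is referenced only by the listener that captured it. In plain JavaScript that reference keeps the object alive; the concern raised in this thread is a native-side pointer that is not a GC root.

```javascript
// Hypothetical holder standing in for an event listener's stored handler.
const listener = { handler: null };

// First script evaluation: defines foo and hands it out as a handler.
globalThis.foo = function foo() { return "first"; };
listener.handler = globalThis.foo;

// Second evaluation "redeclares" foo: the global name now points at a
// different function object, and only the listener still refers to the old one.
globalThis.foo = function foo() { return "second"; };

const sameObject = listener.handler === globalThis.foo;
const heldResult = listener.handler();

console.log(sameObject); // false: the listener holds the original object
console.log(heldResult); // "first"
```

If the holder were the only reference and it were invisible to the garbage collector, the first function object could be reclaimed while the listener still pointed at it, which would match the "garbage JSObject" seen in the debugger.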
Attached file: My console output before crash (deleted)
Per instructions from PDT, I've backed these files out on the branch for a respin so QA can get a build that doesn't crash. I'm going to leave this open because it sounds like jeff's code isn't at fault, it's just opened us up to seeing 31847 more often than we were. (Tell me if I'm misunderstanding that part)....
FWIW, The jar entry that I see loading last before the event is for "skin/classic/global/print.gif" I can't figure out how to tell which file the errant event handler is in.
I'm still building my commercial branch build but I notice ben's checkin into ns/xpfe/browser/resources/content/keywords.js has addEventListener("load", toggleKeywordMenu, true); in it. Could someone try that?
Ugh! I didn't dig through the ns tree changes! The case I'm seeing *is* an NS_PAGE_LOAD event
Ya joki! With that line commented out it starts up fine 3 times in a row. With the line in it crashes three times. We have a winner!
I say you guys that saw the Java change do something should try putting that back in and commenting out this setting of the event handler and see what you get.
<jgrm via email re: jband@netscape.com 2000-10-31 01:26> But those crashes were with the 2000102709 trunk build (all from timeless@bemail.org -- perhaps he could fill us in on how he was doing this (something about a panel from scc "nsEngineer"?)). Interesting bug. per doron: Changing component, reassigning. Per /All fixing summary.
Assignee: rogerl → dveditz
Component: Javascript Engine → Installer: XPInstall Engine
QA Contact: pschwartau → jimmylee
Summary: crash on startup -- 10/30 branch build on win32. → crash on startup -- 10/30 branch build
erm sorry, half of my changes were for the wrong bug. *undo*
Assignee: dveditz → rogerl
Component: Installer: XPInstall Engine → Javascript Engine
QA Contact: jimmylee → pschwartau
I backed out the changes to js and modules/oji, but that did not help in my case - still crash on startup. When I also commented out: ns/xpfe/browser/resources/content/keywords.js addEventListener("load", toggleKeywordMenu, true); I can start ok. I get two assertions (one I had before). I can also open Help > About Netscape and I can open a new window doing Ctrl+n. Next I will try putting back the changes in js and modules/oji.
ok, I think the oji code might have been a red herring though I swear that backing it out made it work for me. Backing out the keywords line also made it work for me. We probably backed out the wrong code. Sorry about that.
linux respin 2000-10-31-12-MN6 still crashes
Okay, I guess we jumped on a red herring. Jonathan, you might as well stop the respins since they still have the problem. Looks like jband and joki have the right fix. Do you want them to check it in right now and restart the process?
I have now tested this on NT & Linux. The only thing that has effect on my builds is the one line in keywords.js.
I'm noting the dependency, even if a workaround is to back out ben's xpfe change (or modify it somehow to dodge bug 31847). /be
Depends on: 31847
It's really probably more than a dependency. Other than the location where it happened, it's really a dupe of the exact same issue. I agree we should be able to modify ben's fix to not cause this but I'm not familiar enough with the xul hierarchy to know which functions may override.
joki: agreed, but I didn't want to resolve it as a dup and make it drop off the radar. Cc'ing ben, who has missed out on all the fun. /be
Sorry I'm late to the scene of the crime. Now that the mystery is solved, what is the plan for putting the fix for 53849 back into the branch? I'm happy to do it. -jd
What bug number is for ben's one-line xpfe change that is referred to in previous comments? Thanks.
Lisa: the bug # for that is 54782. Jeff, I'll check your changes back into the branch again once PDT starts taking in limbo bugs again.
I'm confused: shouldn't Jeff's changes go back in for any re-spin that also "fixes" Ben's xpfe code in order to work around 31847? Or do Jeff's changes have to wait for a later respin, and risk not getting into rtm, just because they seemed to be associated with this crash, at first? /be
adding topcrash keyword and [@ js_GetSlotWhileLocked] for tracking. sorry for the spam.
Keywords: topcrash
Summary: crash on startup -- 10/30 branch build → crash on startup -- 10/30 branch build [@ js_GetSlotWhileLocked]
For accuracy's sake, what is the proper component for this bug? It isn't JS Engine - should it be DOM 0, for instance, or one of the XP components?
Making this a virtual dup of 31847. /be
Assignee: rogerl → joki
Component: Javascript Engine → DOM Level 0
jrgm and I looked at this and we found that the crash was caused by keywords.js being loaded twice (once in navigator.xul, and once in navExtraOverlay.xul on the commercial side) and hence the load listener added twice. This doesn't happen on all machines. I've reproduced it on my HP Kayak as has John, but I could not reproduce this on either my dell P210 or my Inspiron notebook. (W2K) keywords.js shouldn't ever be loaded by any mozilla chrome, so the fix for this particular crash symptom for all source trees is to remove reference to keywords.js from navigator.xul. I am preparing a patch (it's a one-liner ;) right now.
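A minimal sketch of the double registration described above, for readers following along. `FakeWindow` and `loadKeywordsScript` are hypothetical stand-ins, not Mozilla code; only the `addEventListener("load", toggleKeywordMenu, true)` call is taken from this bug. It assumes that each evaluation of keywords.js creates a fresh `toggleKeywordMenu` function object, so de-duplicating listeners by identity does not help.

```javascript
// Hypothetical stand-in for a window; not the Mozilla implementation.
class FakeWindow {
  constructor() {
    this.listeners = [];
  }

  // Ignore a listener already registered with the same type,
  // function object and capture flag (identity-based de-duplication).
  addEventListener(type, fn, capture) {
    const dup = this.listeners.some(
      (l) => l.type === type && l.fn === fn && l.capture === capture
    );
    if (!dup) this.listeners.push({ type, fn, capture });
  }

  // Call every listener for the event type and report how many ran.
  fire(type) {
    let calls = 0;
    for (const l of this.listeners) {
      if (l.type === type) {
        l.fn();
        calls++;
      }
    }
    return calls;
  }
}

// Simulates evaluating keywords.js once: every evaluation defines a
// new toggleKeywordMenu function object and registers it.
function loadKeywordsScript(win) {
  function toggleKeywordMenu() {}
  win.addEventListener("load", toggleKeywordMenu, true);
}

const win = new FakeWindow();
loadKeywordsScript(win); // sourced from navigator.xul
loadKeywordsScript(win); // sourced again from navExtraOverlay.xul

const handlerCalls = win.fire("load");
console.log(handlerCalls); // 2
```

Sourcing the file from only one place, as the patch proposes, leaves a single listener in this model. That avoids this symptom without addressing the underlying rooting issue in bug 31847.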
Ben, any idea on how keywords.js was being loaded twice? Sounds like a race in XUL content/document code to fetch a <script src=> from a main XUL file and one of its overlays. Cc'ing waterson. /be
it probably doesn't mean anything, but r=blake
I wrote: >Ben, any idea on how keywords.js was being loaded twice? which is a stupid question -- what I meant was: any idea on how that file was sourced once in some cases, twice in others? Were the machines where it seemed to load once all faster processors? Dual processors? Etc. jrgm, feel free to chime in. Thanks, /be
Dell system upon which this works:
Precision 210 Workstation, Pentium III-500MHz, 256MB RAM
Dell Inspiron 5000 Workstation, Pentium III-500MHz, 128Mb RAM
fails on:
Hewlett Packard Kayak, Pentium III-500MHz, 128Mb RAM.
Macintosh G3-450, 128Mb RAM, OS8.6
going to do some more investigation to see exactly what happens and when...
It would be sourced twice in all cases for a commercial build (or should be), just the timing would differ. My Kayak is the same MHz/MB as ben's, but my Precision 210 for linux is the same MHz, but 128MB memory (if that is somehow a factor). At any rate, r=jrgm for the effectiveness of the patch; works for me, and that line was redundant in the commercial build. (But why is 'keywords.js' in the mozilla tree if it's not supposed to be part of the mozilla build?).
Uh, to be clear: I crashed on both my Linux Precision 210, and Win2k HP Kayak (both single processor, 500MHz, 128MB).
for the record, it crashed on startup for me on my Dell Dimension 4100 PIII- 933MHz, 256MB RAM without this patch.
Some more notes: I'm seeing this now in optimised builds on my /P210/ (which still doesn't show it in *debug* builds)... still not seeing it in either on my notebook. Adding dump() statements to my code shows this:
*** added load listener (this is me)
creating new nsJSAimChatRendezvous
*** added load listener (this is me)
Net2Phone.js has been interpreted...
(talkback window appears)
*** firing load handler (this is me)
*** firing load handler (this is me)
(Application error dialog appears)
the sequencing of the talkback window appearing there seemed weird to me, but that could just be the way it works...
58693 filed on the keywords.js in the mozilla trunk issue. I'll cvs remove the file from the trunk when I get a chance.
I don't see this anymore, since using 10-31-14 MN6 candidate build on Win98.
this is because my change to allow the keyword menu to be removed was backed out. The patch here will allow my keyword menu stuff to be landed again.
a=hyatt
and r=ben (obviously)
rtm++, please checkin ASAP so we can build today.
Whiteboard: [rtm++]
the startup bug has been worked around on the branch (checked in my patch to remove the duplicate file loading). The real bug still exists however.
The real bug is bug 31847, so can we now DUP this against that bug? Will that mess with anyone's verification procedures? /be
the symptom described in this bug is actually fixed by my checkin. See dependency for the real issue. duping, as you wish. *** This bug has been marked as a duplicate of 31847 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Verifying as duplicate -
Status: RESOLVED → VERIFIED
Crash Signature: [@ js_GetSlotWhileLocked]