Closed Bug 391280 Opened 17 years ago Closed 17 years ago

crash [@js_GetGCThingTraceKind, @TraceJSObject] after landing of bug 385393

Categories

(Core :: JavaScript Engine, defect, P1)

RESOLVED WORKSFORME

People

(Reporter: ispiked, Assigned: crowderbt)

References

Details

(Keywords: crash, regression, Whiteboard: [firebug-want])

Crash Data

This crash started appearing right after the patch for bug 385393 landed. See http://crash-stats.mozilla.com/report/list?range_unit=months&query_search=signature&query_type=contains&product=Firefox&branch=1.9&signature=js_LinkFunctionObject&query=js_LinkFunctionObject&range_value=1. Snipped from a Mac stacktrace:
    0 js_LinkFunctionObject mozilla/js/src/jsfun.c:2251
    1 js_Interpret mozilla/js/src/jsinterp.c:4886
    2 js_Execute mozilla/js/src/jsinterp.c:1645
    3 JS_ExecuteScript mozilla/js/src/jsapi.c:4751
    4 mozJSComponentLoader::GlobalForLocation(nsILocalFile*, JSObject**, char**) mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:1261
    5 mozJSComponentLoader::LoadModule(nsILocalFile*, nsIModule**) mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:598
    6 nsFactoryEntry::GetFactory(nsIFactory**) mozilla/xpcom/components/nsComponentManager.cpp:3578
    7 nsComponentManagerImpl::CreateInstance(nsID const&, nsISupports*, nsID const&, void**) mozilla/xpcom/components/nsComponentManager.cpp:1710
    8 nsComponentManagerImpl::GetService(nsID const&, nsID const&, void**) mozilla/xpcom/components/nsComponentManager.cpp:1926
    9 nsJSCID::GetService(nsISupports**) mozilla/js/src/xpconnect/src/xpcjsid.cpp:899
    10 XUL@0x83d470
    ...
Crash address is 0, line is
    ==> if (!fun->object)
            fun->object = funobj;
in js_LinkFunctionObject called from js_CloneFunctionObject, here:
    fun = (JSFunction *) JS_GetPrivate(cx, funobj);
    if (!js_LinkFunctionObject(cx, fun, newfunobj)) {
How could a function object in a script's object table have a null private slot? /be
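The failure mode described above can be modelled in a few lines of C. This is only a toy sketch: ToyObject, ToyFunction, toy_link_function_object and toy_clone_function_object are hypothetical stand-ins, not the real SpiderMonkey types or functions. It assumes that the private slot of the function object came back null, so the fun pointer was null when fun->object was read. The null check is added here purely so the sketch reports the condition instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for JSObject and JSFunction (not the real layouts). */
typedef struct ToyObject {
    void *private_data;          /* models the object's private slot */
} ToyObject;

typedef struct ToyFunction {
    ToyObject *object;           /* models fun->object */
} ToyFunction;

/* Same shape as the crashing statement, with an explicit null check
 * added for illustration only. Returns 0 where the real code would
 * have dereferenced a null pointer. */
static int toy_link_function_object(ToyFunction *fun, ToyObject *funobj)
{
    if (fun == NULL)
        return 0;                /* null private slot: report, do not crash */
    if (!fun->object)
        fun->object = funobj;
    return 1;
}

/* Models the caller: fetch the private slot, then link the clone. */
static int toy_clone_function_object(ToyObject *funobj, ToyObject *newfunobj)
{
    assert(funobj != NULL);
    ToyFunction *fun = (ToyFunction *) funobj->private_data;
    return toy_link_function_object(fun, newfunobj);
}
```

Calling toy_clone_function_object on an object whose private_data is NULL returns 0, which corresponds to the state the crash reports show; with a valid ToyFunction in the slot it links the clone and returns 1.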
Really would love a reproducible testcase... Duh! /be
I know this is not exactly what you have in mind for a test case, but I have an .xpi file that, when installed in 3.0a8, crashes just after the FF window is up. I tried turning on all tracing and the overlay JS runs partly before the crash. A few days ago I was getting this crash every time I changed my extension and relaunched FF. But next time I launched, no crash. Today I switched branches on my extension and built, then I crash every time. So I built the unpacked xpi (no jars). No crash. Ok, maybe that does not help, but let me know if you want the .xpi.
Forget it. I get this js_LinkFunctionObject crash every time I launch FF3 from eclipse with my xpi (firebug 1.1). It does not happen if I launch from command line or from a link in an email. So I can reproduce it, but I doubt it will happen for anyone else. This, plus the dependence on how I build the XPI, makes me think this is related to timing in the initial browser set up. (Curious that this bug has a mystery zero and timing just like the FF2 js_LookupPropertyWithFlags report).
can you set a counting breakpoint on js_LinkObject and find out how many calls there are between mozJSComponentLoader::GlobalForLocation and your crash? i.e. try to figure out how practical it is to recognize the specific object that's unhappy. you can also look at cx->fp->script->main and cx->fp->pc or something like that to try to figure out where in the js file you are (search through the many crash reports i've filed in jsengine; i'm fairly sure there are enough debugging transcripts to give you help, hopefully including an explanation of how to build jsshell.exe/js.exe and use dissrc to understand what's up). ideally you'd be able to recognize which object in the jsscript is the unhappy one, use a breakpoint that waits for fp->pc to reach a useful point, then set a data breakpoint on the object that's about to die (private => 0), and then continue and have the debugger stop at the bad point.
(I assume your suggestion is based on using MS debugger which I don't have.) In my current Firebug build FF3.0a8 crashes on the first run after my extension is changed. The stack trace is similar to the one posted here (and that I had earlier), except for the last couple of frames. Here is one example: http://crash-stats.mozilla.com/report/index/5b163863-7a76-11dc-93ae-001a4bd43ef6?date=2007-10-14-16
conceptually any non sucky debugger can do what i described, but yes, they are mostly written based on using windbg/devenv/msdev. building js shell is done by:
    cvs co mozilla/js/src
    cd mozilla/js/src
    make -f Makefile.ref
and you should be able to run it using:
    */js
This crash started happening again for me. This time I had tracing on in Firebug and the last JS function compiled and printed before the crash was:
    onScriptCreated: 463@(2899-2901)file:/C:/Program%20Files/Gran%20Paradiso/components/nsSafebrowsingApplication.js
    onScriptCreated name: 'NSGetModule' function NSGetModule(compMgr, fileSpec) { return ApplicationModInst; }
    onScriptCreated: nested function named: NSGetModule
If the print buffer was flushed, the crash was after this file was compiled but before any other compilation. I tried to force the buffer to flush every line but then I do not crash. Also now I do not see nsSafebrowsingApplication load after my debugger turns on. Another time the NSGetModule was for a different component (maybe microsummary ? book something?). I remember seeing that the JS stack had no callers. Sorry this is so much speculation, but are these components loaded on timers? Are there any options for controlling or preventing the delay? If so I could test this theory: something is corrupting the context for time-delayed component loading. Well, I could test it when it starts happening again....
Ok, I took out the flush-each-line and got a crash. First nsMicrosummaryService.js loads with no crash; here is the bottom of the firebug-services.js trace:
    onScriptCreated: 267@(2284-2286)file:/C:/Program%20Files/Gran%20Paradiso/components/nsMicrosummaryService.js
    onScriptCreated name: 'NSGetModule' function NSGetModule(compMgr, fileSpec) { return XPCOMUtils.generateModule([MicrosummaryService]); }
    onScriptCreated: nested function named: NSGetModule
    traceHook: 267
    traceHook: 99
    ...
These traceHook prints are jsd.interruptHook calls I added for this bug, printing any changed frame.script.tag. So it says that NSGetModule is run directly after the file compiles. Next the trace file ends:
    onScriptCreated: 336@(899-901)file:/C:/Program%20Files/Gran%20Paradiso/components/nsContentPrefService.js
    onScriptCreated name: 'NSGetModule' function NSGetModule(compMgr, fileSpec) { return XPCOMUtils.generateModule(components); }
    onScriptCreated: nested function named: NSGetModule
Based on the identical result from multiple runs, I think the buffer is flushed before the crash and that the crash occurs after one of these system components is compiled but before its NSGetModule is entered. And it's not one particular component, since different ones crash at different times. Does it help?
Any ideas for a workaround? I can't make any more progress on Firebug for FF3b1. If I delete compreg.dat then FF3b1 does not crash but I also cannot get console output (FF3b1 disconnects from the process I guess). If I don't delete compreg.dat, I crash.
Flags: blocking1.9?
ime js_LinkFunctionObject means a GC hazard. note, i haven't really done spidermonkey work for a few years. repeat your steps, but set a breakpoint in js_GC, especially watch newborn roots. when you see gc about to zap them, record what they are, and check them against the objects that crash. if they match, that's the problem :)
http://mxr-test.landfill.bugzilla.org/js/search?string=newborn for lack of some other starting point. you can of course browse through bugs I've reported w/ this signature (or referenced by such bugs). bug 301491 has an approximation of this stuff (clearly by the time i wrote that bug, i was used to such hazards).
further reading:
    bug 180182
    bug 289949
    bug 318969
generally there are no useful workarounds for GC hazards, just fix them.
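For readers unfamiliar with the term, a GC hazard of the kind suspected above can be sketched with a toy collector. Everything here is an assumption for illustration: ToyHeap, ToyThing, toy_alloc and toy_gc are hypothetical names, and the single newborn slot is a simplified model of a weak root that only protects the most recent allocation. It is not the SpiderMonkey implementation. The sketch shows how a pointer that is still held by the caller, but no longer rooted, ends up with a zeroed private field after a collection.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HEAP_SIZE 4

/* Hypothetical GC thing with a private field, loosely modelling an object. */
typedef struct ToyThing {
    int   in_use;
    int   marked;
    void *private_data;
} ToyThing;

/* Toy heap whose only root is the most recently allocated thing. */
typedef struct ToyHeap {
    ToyThing  things[TOY_HEAP_SIZE];
    ToyThing *newborn;
} ToyHeap;

/* Allocate a thing; the new allocation replaces the previous newborn root. */
static ToyThing *toy_alloc(ToyHeap *heap, void *private_data)
{
    assert(heap != NULL);
    for (size_t i = 0; i < TOY_HEAP_SIZE; i++) {
        ToyThing *t = &heap->things[i];
        if (!t->in_use) {
            t->in_use = 1;
            t->marked = 0;
            t->private_data = private_data;
            heap->newborn = t;   /* the earlier newborn is now unrooted */
            return t;
        }
    }
    return NULL;                 /* toy heap is full */
}

/* Mark the newborn root, then sweep everything left unmarked. */
static void toy_gc(ToyHeap *heap)
{
    assert(heap != NULL);
    if (heap->newborn)
        heap->newborn->marked = 1;
    for (size_t i = 0; i < TOY_HEAP_SIZE; i++) {
        ToyThing *t = &heap->things[i];
        if (t->in_use && !t->marked) {
            t->in_use = 0;
            t->private_data = NULL;   /* finalized: private slot zapped */
        }
        t->marked = 0;
    }
}
```

If a caller allocates two things and keeps the first pointer in a local variable, a toy_gc call sweeps the first thing while the caller still holds it. Reading private_data through that stale pointer then yields NULL, which is the pattern the breakpoint-in-js_GC advice is meant to catch.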
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
Does anyone know if "fastLoading" can be turned off by any config setting? This would allow me to track a larger set of Firefox scripts in Chromebug and it would -- I believe -- work around this crash. It might also provide another probe on the issue here. (I am trying to build a version without fastLoading to test the idea, but that's not working out.) cc Shaver since he unwisely left his name in the component loading code...
nglayout.debug.disable_xul_fastload stands ready. This bug needs an owner. /be
(In reply to comment #13)
> nglayout.debug.disable_xul_fastload stands ready.
Thanks, I'm not crashing.
(In reply to comment #14)
> (In reply to comment #13)
> > nglayout.debug.disable_xul_fastload stands ready.
>
> Thanks, I'm not crashing.
Spoke too soon. As soon as I turned off my tracing I crash again. I guess my only workaround is my own tracing code ;-).
nom'ing this for a P1 since it blocks Firebug development for Fx3, one of our most important and most widely distributed developer tools.
Priority: P2 → P1
Whiteboard: [firebug-want]
--> crowder
Assignee: general → crowder
I wasn't able to reproduce this on OS X... From earlier email to John Barton (probably should have just asked here):
-----
Justin Dolske wrote:
> Did this problem fix itself on FF3, or is there Firebug code to avoid the problem?
>
> I can't seem to reproduce it with a current FF3 nightly and Firebug 1.2 tip.
>
> Maybe it's Windows only?
That one varies a lot, but recently it will happen 100% when I use my normal dev env, eclipse+chromebug+firebug1.2. If I turn on tracing it goes away. If I try to reproduce it on the command line (no eclipse) it does not crash. That's why I was trying to build FF3 with symbols, but that experiment failed. It's related to the timing of delay-loaded scripts. I was hoping someone would find a way to reproduce it, but that isn't happening either. I can try other things, but what, I don't know.
-----
So, I'm unclear how to trigger this. Does Eclipse do something strange to the environment FB/FF is running in?
Today I can reproduce it on the command line (no eclipse). When I run with my dev profile it crashes, then if I hit "send and restart" on Crash Reporter, it crashes again. And if I click on a link in Thunderbird it crashes. So the eclipse bit in my previous email was just the right timing for that set of (FF3bX + my ext). I sent crowder an email with my profile in a zip file to see if he can crash. I'm willing to invest some time on a day this week to get this fixed. I have a machine with MS VS and a mozilla tree, but I can't get it to build with symbols and without tracing (tracing prevents the bug).
crowder? Any news on this?
Yeah, investigated this today and yesterday; I cannot reproduce the bug on my Mac, nor on Win32 either in debug or release, with the profile John Barton emailed me. Even with gczeal set to 2 on my debug build, no crash.
John: Can you link some of the crash-reporter reports here, for your recent crashes?
Based on your new traces, John, this has either moved, or the old bug has gone away. I'm cc:ing peterv who was most recently spotted in the xpconnect code involved here.
Summary: crash [@ js_LinkFunctionObject] after landing of bug 385393 → crash [@js_GetGCThingTraceKind, @TraceJSObject] after landing of bug 385393
The crashes in comment 23 look like bug 415028, which should be fixed in trunk.
John: How recent a build are you using?
Crash Reporter thinks I'm using 3.0b3.
Using the nightly 3.0b4pre I don't crash yet...
Marking WFM for now. peterv, if you know specific other bugs that fix this one, or to which this one should instead be duped, please speak up.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WORKSFORME
Looks like this has stayed gone, but if anyone else here was able to reproduce it, it would be fantastic to have an idea of when and why this problem went away. Binary search, anyone?
At least based on my Crash Reports, tracking down what fixed this won't be possible. The crashes from Nov '07 are no longer available. The crashes from mid Jan are https and my firefox does not like them. The crashes from late Jan and Feb are not the same stack as I saw all Fall '07. The only thing I can say is that b4pre on Tuesday fixed the crashes whose top was js_GetGCThingTraceKind, but you knew that already. On the plus side I'll run through the problem area many dozens of times a day at different speeds depending on tracing, so hopefully my confidence that this is gone will go up.
No longer blocks: 411814
Crash Signature: [@js_GetGCThingTraceKind, @TraceJSObject]