Closed Bug 718541 Opened 13 years ago Closed 13 years ago

Can't build a working optimized Windows from trunk since bug 715821 landed; resulting build crashes on startup and attempting to build installer crashes during build

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 7
defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla12
Tracking Status
firefox12 - ---

People

(Reporter: wgianopoulos, Assigned: rain1)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files, 3 obsolete files)

The last time I was able to successfully build Firefox under windows from the trunk was a pull of the changeset used for the 3AM January 10th Nightly build. Since I suspect it will be late tonight when I actually get my hg bisect to complete (since my attempt to do this using depend builds was completely unsuccessful, so I have to do clobbers on each test, it is taking awhile) but I wanted to get as much detail in the bug as possible before it got too late tonight to be thinking straight. Details on my build environment to follow.
Attached file Output from sed | grep SDK (deleted) —
This issue is occurring using Microsoft Visual C++ 2008 Express. The attachment sdocuments the SDKs being used.
Keywords: regression
One issue is that it fails in the installer build step with the following: Problem signature: Problem Event Name: APPCRASH Application Name: xpcshell.exe Application Version: 12.0.0.4395 Application Timestamp: 4f118138 Fault Module Name: mozjs.dll Fault Module Version: 0.0.0.0 Fault Module Timestamp: 4f116fe9 Exception Code: c0000005 Exception Offset: 00060745 OS Version: 6.1.7601.2.1.0.256.48 Locale ID: 1033 Additional Information 1: 0a9e Additional Information 2: 0a9e372d3b4ad19135b953a78882e789 Additional Information 3: 0a9e Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
Summary: Can't build from trunk since sometime on January 10th → Can't build Windows from trunk since sometime on January 10th
I get a very similar startup crash from just running the binaries built without trying to install. I will be posting an example soon, but was not necessarily from the exact same build so it might look less similar than you would expect.
Have you clobbered recently?
My entire effort to bisect this is based on clobber builds as I was entirely unable to figure anything out form dep builds that is why this is being filed so far after the issue.
Startup crash looks like this: Problem signature: Problem Event Name: APPCRASH Application Name: firefox.exe Application Version: 12.0.0.4398 Application Timestamp: 4f14d50a Fault Module Name: mozjs.dll Fault Module Version: 0.0.0.0 Fault Module Timestamp: 4f14ae95 Exception Code: c0000005 Exception Offset: 0005fff5 OS Version: 6.1.7601.2.1.0.768.3 Locale ID: 1033 Additional Information 1: 0a9e Additional Information 2: 0a9e372d3b4ad19135b953a78882e789 Additional Information 3: 0a9e Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
Assignee: nobody → general
Blocks: 715821
Severity: major → blocker
Component: Build Config → JavaScript Engine
QA Contact: build-config → general
So hg bisect resulted in: The first bad revision is: changeset: 83761:0b102e74b3b8 user: Jeff Walden <jwalden@mit.edu> date: Fri Jan 06 00:13:20 2012 -0600 summary: Bug 715821 - Make Object.prototype.__defineGetter__ and Object.prototype.__defineSetter__ perform their work by forwarding to Object.defineProperty. This eliminates two calls to CheckRedeclaration, which is impeding property-storage-splitting work. r=bhackett
Summary: Can't build Windows from trunk since sometime on January 10th → Can't build Windows from trunk since bug 715821 landed
Could this be a dupe of bug #718187?
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Status: RESOLVED → UNCONFIRMED
Ever confirmed: false
Resolution: DUPLICATE → ---
I decided to dupe this the other way so that the open bug would be on the supported product.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I should also mention that although this is a 32-bit build of Firefox and it fails when run on 32-bit systems, it is being built on a Windows 7 64-bit system with an AMD processor. Built using standard 32-bit Mozilla Build tools.
Severity: blocker → critical
Crash Signature: [@ JS_ForwardGetPropertyTo]
Keywords: crash
There seems to be multiple developers hitting this, it feels like it needs tracking or some sort of investigation.
(In reply to Mark Banner (:standard8) from comment #13) > There seems to be multiple developers hitting this, it feels like it needs > tracking or some sort of investigation. That was why I had the severity set to blocker. Someone else lowered it. Although we have a workaround by backing out the offending fix, it certainly sounds from bug 715821 comment 0 that there are follow-on fixes coming down the pike that are dependent on that bug. So, having people only being able to build by backing it out is not really a good plan going forward.
So, um, you're saying a change to __define[GS]etter__'s implementation is causing the build to fail? That doesn't make much sense to me. Maybe you can catch this in a debugger, or something? It's really not obvious to me how a change like that would cause the build to break.
(In reply to Jeff Walden (remove +bmo to email) from comment #15) > So, um, you're saying a change to __define[GS]etter__'s implementation is > causing the build to fail? Just to clarify: For the people on bug 718187 (which was duped to this one), including me, the build completed successfully, but the application (SeaMonkey) always crashes on startup, before anything is displayed (even before the Profile Manager comes up if requested via -P). We're all building 32-bit on Win 7 x64, but with different versions of MSVC. Note that I didn't personally track down the culprit so I can't tell whether bug 715821 caused this (I'm just watching the game).
I have a crash on startup w/ the same build settings (vc express 9, release build) that goes away when I back out the patch for bug 715821. The issue happens on three different machines running 3 different versions of windows, so I'm pretty sure it's vc express c++ optimizer that's causing the issue. I crash here: http://mxr.mozilla.org/comm-central/source/mozilla/js/src/jsobj.cpp#5450 because obj2 is null, and apparently LookupPropertyWithFlagsInline returns true but null prop and obj2 (though debugging release builds with symbols can make debugging a bit tricky) I'm happy to debug this, if you have any hints for me.
So LPWFI returned true, and it set prop to NULL? That indicates that the lookup didn't hit any errors (OOM, thrown exception, etc.), and the property wasn't found. But if prop is NULL, then you should have entered the if-block just above. And there's clearly no way out of that if-block except by returning. I just hopped off a cross-country flight on too little sleep, so I'm hesitant to have particular certainty in my judgment for the rest of the day. But even still, I don't see how you could have |prop == NULL| and have hit that |if (obj2->isNative())|.
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #16) > (In reply to Jeff Walden (remove +bmo to email) from comment #15) > > So, um, you're saying a change to __define[GS]etter__'s implementation is > > causing the build to fail? > > Just to clarify: For the people on bug 718187 (which was duped to this one), > including me, the build completed successfully, but the application > (SeaMonkey) always crashes on startup, before anything is displayed (even > before the Profile Manager comes up if requested via -P). We're all building > 32-bit on Win 7 x64, but with different versions of MSVC. Note that I didn't > personally track down the culprit so I can't tell whether bug 715821 caused > this (I'm just watching the game). It is the installer step that crashes during the build, so if you do not create the installer it will not crash. Also, that is for firefox. I have no idea if trying to build the seamonkey installer works or not.
(In reply to Jeff Walden (remove +bmo to email) from comment #15) > So, um, you're saying a change to __define[GS]etter__'s implementation is > causing the build to fail? That doesn't make much sense to me. Maybe you > can catch this in a debugger, or something? It's really not obvious to me > how a change like that would cause the build to break. It is the step to build the Firefox installer that crashes.
Summary: Can't build Windows from trunk since bug 715821 landed → Can't build a working Windows from trunk since bug 715821 landed; Building installer crashes and resulting build crashes on startup
We are now getting reports of similar failures using MSVC 10. Setting this back form Critical to blocker.
Severity: critical → blocker
If there is a legitimate reason to downgrade the severity, please explain it in a comment otherwise please leave it as it is.
Summary: Can't build a working Windows from trunk since bug 715821 landed; Building installer crashes and resulting build crashes on startup → Can't build a working Windows from trunk since bug 715821 landed; resulting build crashes on startup and attempting to build installer crashes during build
(In reply to Jeff Walden (remove +bmo to email) from comment #18) > So LPWFI returned true, and it set prop to NULL? I could be wrong about prop being null, but I'm pretty sure obj2 was null.
> Someone else lowered it. Sorry I thought at that time only very few people were affected by it - at that time mostly SeaMonkey developers.
I seem to crash here: http://mxr.mozilla.org/comm-central/source/mozilla/js/src/jsobj.cpp#5478 5478 return js_GetPropertyHelperInline(cx, obj, receiver, id, JSGET_METHOD_BARRIER, vp); > mozjs.dll!js_GetProperty(JSContext * cx=0x03d1b040, JSObject * obj=0x03d1b040, JSObject * receiver=0x03d1b040, int id=64037120, JS::Value * vp=0x0036f708) Line 5478 + 0x4 bytes C++ mozjs.dll!JSObject::getGeneric(JSContext * cx=0x03197220, JSObject * receiver=0x03d1b040, int id=64037216, JS::Value * vp=0x0036f708) Line 209 + 0x15 bytes C++ mozjs.dll!JS_GetPropertyById(JSContext * cx=0x03197220, JSObject * obj=0x03d1b040, int id=64037216, JS::Value * vp=0x0036f708) Line 4060 + 0x2b bytes C++ xul.dll!XPCWrappedNativeScope::SetGlobal(XPCCallContext & ccx={...}, JSObject * aGlobal=0x03d125e0) Line 269 + 0x15 bytes C++ xul.dll!nsXPConnect::InitClassesWithNewWrappedGlobal(JSContext * aJSContext=0x03197220, nsISupports * aCOMObj=0x031d3244, const nsID & aIID={...}, nsIPrincipal * aPrincipal=0x031b0600, nsISupports * aExtraPtr=0x00000000, unsigned int aFlags=2, nsIXPConnectJSObjectHolder * * _retval=0x0036f91c) Line 1351 C++ xpcshell.exe!main(int argc=5, char * * argv=0x002520f0, char * * envp=0x00252d20) Line 1951 C++ xpcshell.exe!__tmainCRTStartup() Line 586 + 0x17 bytes C kernel32.dll!774e3677() [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll] ntdll.dll!77ba9f02() ntdll.dll!77ba9ed5() Disassembly at the crash point: 726607B1 mov esi,dword ptr [esp+14h] 726607B5 mov edi,dword ptr [esi] 726607B7 mov cl,byte ptr [edi+0Dh] 726607BA not cl 726607BC test cl,1 726607BF jne js_GetProperty+3B0h (72660810h) 726607C1 mov edx,dword ptr [edi] 726607C3 mov eax,dword ptr [edx] 726607C5 cmp eax,offset js::ObjectProxyClass (727C8490h) 726607CA je js_GetProperty+37Ah (726607DAh) 726607CC cmp eax,offset js::OuterWindowProxyClass (727C8578h) 726607D1 je js_GetProperty+37Ah (726607DAh) 726607D3 cmp eax,offset js::FunctionProxyClass (727C8838h) 726607D8 jne js_GetProperty+39Ah (726607FAh) 726607DA mov eax,dword ptr [esp+48h] 726607DE mov ecx,dword ptr [esp+40h] 726607E2 push eax 726607E3 push ebx 726607E4 push ecx 726607E5 push esi 726607E6 push ebp 726607E7 call js::Proxy::get (72678300h) 726607EC add esp,14h 726607EF movzx eax,al
I believe I am reproducing this with this in my .mozconfig: ac_add_options --disable-debug ac_add_options --enable-optimize If I reverse them, I get a working build. Windows 7, 64bit, VS2010 Building one final build to confirm, but I believe this info can be helpful.
I spoke too soon... I am definitely on to something though. I reversed them, and still got a broken build. Be back once I have this confirmed.
Adding: ac_add_options --disable-optimize to .mozconfig allows me to get a working build.
I'm crashing in the same place running xpcshell tests as comment #25 compiling 32 bit on a 64 bit system with VS 2008 (ver. 9.0.30729.1). Backing out Bug 715821 fixes the crash. mozjs.dll!js_GetProperty(JSContext * cx=0x0341b040, JSObject * obj=0x0341b040, JSObject * receiver=0x0341b040, int id=54599936, JS::Value * vp=0x0040f5f4) Line 5478 + 0x4 bytes C++ mozjs.dll!JSObject::getGeneric(JSContext * cx=0x03040220, JSObject * receiver=0x0341b040, int id=54600032, JS::Value * vp=0x0040f5f4) Line 209 + 0x15 bytes C++ mozjs.dll!JS_GetPropertyById(JSContext * cx=0x03040220, JSObject * obj=0x0341b040, int id=54600032, JS::Value * vp=0x0040f5f4) Line 4060 + 0x2b bytes C++ xul.dll!XPCWrappedNativeScope::SetGlobal(XPCCallContext & ccx={...}, JSObject * aGlobal=0x034125e0) Line 269 + 0x15 bytes C++ xul.dll!nsXPConnect::InitClassesWithNewWrappedGlobal(JSContext * aJSContext=0x03040220, nsISupports * aCOMObj=0x0307e524, const nsID & aIID={...}, nsIPrincipal * aPrincipal=0x0305c180, nsISupports * aExtraPtr=0x00000000, unsigned int aFlags=2, nsIXPConnectJSObjectHolder * * _retval=0x0040f804) Line 1351 C++ xpcshell.exe!main(int argc=20, char * * argv=0x002ca000, char * * envp=0x002c2f98) Line 1950 + 0x2e bytes C++ xpcshell.exe!__tmainCRTStartup() Line 586 + 0x17 bytes C kernel32.dll!770d339a() [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll] ntdll.dll!77c29ef2() ntdll.dll!77c29ec5()
I'll give building with msvc2008 a try and see what I can find.
Attached patch Workaround (obsolete) (deleted) — Splinter Review
For those trying to build who encounter this issue this is probably a superior solution to backing out bug 715821.
(In reply to Philip Chee from comment #24) > > Someone else lowered it. > Sorry I thought at that time only very few people were affected by it - at > that time mostly SeaMonkey developers. The issue has more to do with what you are testing. It does not impact non-optimized debug builds. But if you are trying to do performance testing with an optimized build, then you run into this issue.
Summary: Can't build a working Windows from trunk since bug 715821 landed; resulting build crashes on startup and attempting to build installer crashes during build → Can't build a working optimized Windows from trunk since bug 715821 landed; resulting build crashes on startup and attempting to build installer crashes during build
This is a much better workaround because: 1. It only disables optimization in the area change by bug 715821 instead of in the entire module. 2. It removes the MSVC version checks because we have not really validated which versions are affected.
Attachment #589733 - Attachment is obsolete: true
Attachment #590051 - Attachment description: Workarond-Disspabl;e Optimization in code changed by bug 715821 → Workaday-Disable Optimization in code changed by bug 715821
Attachment #590051 - Attachment description: Workaday-Disable Optimization in code changed by bug 715821 → Workaround-Disable Optimization in code changed by bug 715821
Attachment #589733 - Attachment is obsolete: false
Attachment #590051 - Attachment is obsolete: true
(In reply to Bill Gianopoulos [:WG9s] from comment #33) > Created attachment 590051 [details] [diff] [review] > Workaround-Disable Optimization in code changed by bug 715821 > > This is a much better workaround because: > > 1. It only disables optimization in the area change by bug 715821 instead > of in the entire module. > > 2. It removes the MSVC version checks because we have not really validated > which versions are affected. Reverted to previous patch. I must have screwed up somehow when I tested this last night. My automated build today failed with the supposedly better patch.
Looks to me like since the landing of this patch we have functions that sometimes return true/false, sometimes return JS_TRUE/JS_FALSE . This seems wrong to me.
Actually these issues seem to be outside of the area modified by bug 715821. Nevermind! ;-)
I'm hitting this same crash (also building with VC2008). It's certainly possible it's just a compiler bug, but it's probably worth working around if possible since it's affecting so many people.
FYI all, I see a lot of attention focused on VC2008... I had/have the same problem with VC2010 ... comment #28 got me working around ...
I'm hitting the same problem with VC10. At js_GetProperty+34Ah (seemingly jumped to when https://mxr.mozilla.org/mozilla-central/source/js/src/jsobj.cpp#5371 returns false) I see mov ebp,dword ptr [esp+14h] At this point esp+14h is 0. mov ecx,dword ptr [ebp] and so it crashes.
Ignore my last comment. I did some uninlining, and the problem happens at https://mxr.mozilla.org/mozilla-central/source/js/src/jsobj.cpp#4991. The (!obj->nativeEmpty()) check succeeds because *(obj.ptr) is non-null. However in the subsequent nativeLookup call, at https://mxr.mozilla.org/mozilla-central/source/js/src/jsobjinlines.h#1161 a new Shape is allocated with the exact same address as (obj.ptr), which causes things to go awry at https://mxr.mozilla.org/mozilla-central/source/js/src/jsscope.h#1075. I think this might be a dangling pointer somewhere.
Attached patch "fix" (obsolete) (deleted) — Splinter Review
might be a compiler bug -- forcing nativeLookup to not be inlined fixes the crash.
(In reply to Siddharth Agarwal [:sid0] from comment #41) > Created attachment 590788 [details] [diff] [review] > "fix" > > might be a compiler bug -- forcing nativeLookup to not be inlined fixes the > crash. Might it make sense to do this on MSVC only then?
Yeah, probably. But you want MOZ_NEVER_INLINE from mozilla/Attributes.h, not Windows-specific declspec-ing. And I wouldn't try to make it Windows-specific by implementing it once in the header for non-Windows, then again in the cpp for Windows -- that'd be a recipe for unreadability.
The patch wasn't meant to be in any sort of shape to check in, hence the scare quotes.
Attached patch workaround v1 (deleted) — Splinter Review
Disables inlining this function on MSVC9 and above.
Attachment #590788 - Attachment is obsolete: true
Attachment #590871 - Flags: review?
Attachment #590871 - Flags: review? → review?(jwalden+bmo)
Attachment #589733 - Attachment is obsolete: true
(In reply to Siddharth Agarwal [:sid0] from comment #45) > Created attachment 590871 [details] [diff] [review] > workaround v1 > > Disables inlining this function on MSVC9 and above. This seems to fix the issue that I originally reported for me.
FWIW, my VC10 w/ JS PGO enabled builds don't have this problem. We may want to re-test this once PGO gets turned on again for JS.
(In reply to Ryan VanderMeulen from comment #47) > FWIW, my VC10 w/ JS PGO enabled builds don't have this problem. We may want > to re-test this once PGO gets turned on again for JS. That does not help people doing their own non-pgo builds. The official builds don't see this problem at all. This is only an issue for people doing their own optimized builds.
I've been running into a similar issue and think this bug is relevant to it, with --enable-optimize set my PGO builds are failing at the step where profiling is meant to occur. TEST-UNEXPECTED-FAIL | automation.py | Exited with code -1073741819 during test run INFO | automation.py | Application ran for: 0:00:00.132000 INFO | automation.py | Reading PID log: c:\users\shadow~1\appdata\local\temp\tmpvzvjpcpidlog f:\mozilla-central\client.mk:222:0: command 'MOZ_PGO_INSTRUMENTED=1 OBJDIR=. JARLOG_DIR=./jarlog/en-US python ./_profile/pgo/profileserver.py' failed, return code -1073741819 if I attempt to manually execute firefox.exe via the shell I receive a crash. Problem signature: Problem Event Name: APPCRASH Application Name: firefox.exe Application Version: 12.0.0.4406 Application Timestamp: 4f1e9bf7 Fault Module Name: xul.dll Fault Module Version: 12.0.0.4406 Fault Module Timestamp: 4f1e9bc4 If I edit js/src/makefile.in to allow PGO on JS, enable-optimize and everything else works perfectly fine, hopefully this information helps somebody figure it out.
start-msvc10.bat "Mozilla tools directory: C:\mozilla-build\" Visual C++ 10 Express directory: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\ Windows SDK directory: C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\ Windows SDK version: 7.0A Setting environment for using Microsoft Visual Studio 2010 x86 tools. Mozilla build environment: MSVC version 10. Master@MARKSLAPTOP ~ $ hg version Mercurial Distributed SCM (version 1.9.1) (see http://mercurial.selenic.com for more information) Copyright (C) 2005-2011 Matt Mackall and others This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Master@MARKSLAPTOP ~ $ python -OO build/pymake/make.py -f client.mk This is compiled with no .MOZCONFIG file, and is of clean source compile, no custom changes, (baseline) attempt last 01/16. The build seems to complete okay, but the execution runs in the background, then terminates without displaying any windows. It stops according to my MSDEV debug sessions with error message pointing to a problem in mozjs.dll. Adding: ac_add_options --disable-optimize to .mozconfig allowed me to get a working build. One week later now, working on my local changes and doing daily pull -U's I removed the ac_add_options --disable-optimize to try to recreate the problem and re-join the groups degugging efforts and TWA / Trouble Went Away .... :-(
Have you tried a clobber (clean) build, Mark?
Oh, sorry... I meant building now works for me with no problems as originally encountered ... no further need for ac_add_options --disable-optimize ... and nothing for me to debug as all is well
(In reply to Mark Capella from comment #52) > Oh, sorry... I meant building now works for me with no problems as > originally encountered ... no further need for ac_add_options > --disable-optimize ... and nothing for me to debug as all is well The previous comment was trying to say that if you clobber your obdir in order to ensure everything is rebuilt with optimization enabled, the issue will likely return for you.
Oh, new developer missed a nuance ... my recent rebuild picked up previously cached portions of compiled code and masked the original problem ... lemme figure out clobber / clean build and re-try ... write me eMail or IRQ in #developers if more appropriate
Attachment #590871 - Flags: review?(jwalden+bmo) → review+
(In reply to Ryan VanderMeulen from comment #47) > FWIW, my VC10 w/ JS PGO enabled builds don't have this problem. We may want > to re-test this once PGO gets turned on again for JS. Not sure this will make a difference, since default builds aren't ever going to be PGO. PGO probably changes the codegen here so much that it avoids the problem.
Assignee: general → sagarwal
Status: NEW → ASSIGNED
Whiteboard: [inbound]
Status: ASSIGNED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]
Target Milestone: --- → mozilla12
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: