Closed
Bug 1143966
Opened 10 years ago
Closed 10 years ago
jsshell startup crash with Ubuntu 12.04 LTS (gcc 4.7)
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox39 | --- | fixed |
People
(Reporter: decoder, Assigned: ehoogeveen)
Details
(Keywords: crash, Whiteboard: [fuzzblocker])
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
terrence
:
review+
|
Details | Diff | Splinter Review |
I'm seeing a startup crash on mozilla-central since rev 2b9f5019abf1:
Program received signal SIGSEGV, Segmentation fault.
js::gc::GetBackgroundAllocKind (kind=js::gc::OBJECT16) at js/src/jsgc.h:219
219 MOZ_ASSERT(!IsBackgroundFinalized(kind));
(gdb) bt
#0 js::gc::GetBackgroundAllocKind (kind=js::gc::OBJECT16) at js/src/jsgc.h:219
#1 0x086f6b2d in js::NewObjectWithGivenTaggedProto (cxArg=cxArg@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, proto=proto@entry=...,
parentArg=..., allocKind=js::gc::OBJECT16, newKind=newKind@entry=js::SingletonObject) at js/src/jsobj.cpp:1180
#2 0x081537f8 in js::NewObjectWithGivenTaggedProto (cx=cx@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, proto=..., parent=...,
newKind=newKind@entry=js::SingletonObject) at js/src/jsobjinlines.h:587
#3 0x08209a88 in NewObjectWithGivenProto (newKind=js::SingletonObject, parent=..., proto=..., clasp=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, cx=0x96b7560)
at js/src/jsobjinlines.h:611
#4 js::GlobalObject::createInternal (cx=cx@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>)
at js/src/vm/GlobalObject.cpp:241
#5 0x0829e35b in JSRuntime::createSelfHostingGlobal (cx=cx@entry=0x96b7560) at js/src/vm/SelfHosting.cpp:1129
#6 0x082a3022 in JSRuntime::initSelfHosting (this=this@entry=0x968aba0, cx=cx@entry=0x96b7560) at js/src/vm/SelfHosting.cpp:1161
#7 0x086a4b6c in js::NewContext (rt=0x968aba0, stackChunkSize=8192) at js/src/jscntxt.cpp:127
#8 0x086a4bd9 in JS_NewContext (rt=0x968aba0, stackChunkSize=stackChunkSize@entry=8192) at js/src/jsapi.cpp:674
#9 0x0804d9ba in NewContext (rt=<optimized out>) at js/src/shell/js.cpp:5590
#10 0x080684f5 in main (argc=1, argv=0xffffd4c4, envp=0xffffd4cc) at js/src/shell/js.cpp:6337
Bisection says:
The first bad revision is:
changeset: 233529:2b9f5019abf1
user: Emanuel Hoogeveen
date: Fri Mar 13 02:13:00 2015 +0100
summary: Bug 1139552 - Convert js::gc::AllocKind to an enum class and eliminate non-AllocKind indexing. r=terrence
The crash only reproduces with the following compiler/build flags combination:
Compiler: gcc (Ubuntu/Linaro 4.7.3-2ubuntu1~12.04) 4.7.3
Build Flags: --enable-debug --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --target=i686-pc-linux-gnu
Assignee | ||
Comment 1•10 years ago
|
||
I can reproduce this with GCC 4.7 without any changes. I can also reproduce it with GCC 4.8 if I remove the 'inline' keyword from GetBackgroundAllocKind. So it seems the underlying bug is still there, but the inlining saves it. Here's the disassembly (Intel style, because AT&T makes my eyes glaze over):
Dump of assembler code for function js::gc::GetBackgroundAllocKind(js::gc::AllocKind):
0x0868c5a0 <+0>: push ebp
0x0868c5a1 <+1>: mov ebp,esp
0x0868c5a3 <+3>: push ebx
0x0868c5a4 <+4>: sub esp,0x14
0x0868c5a7 <+7>: call 0x806902b <__x86.get_pc_thunk.bx>
0x0868c5ac <+12>: add ebx,0xf70a48
0x0868c5b2 <+18>: cmp al,0x16
0x0868c5b4 <+20>: ja 0x868c5cd <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+45>
=> 0x0868c5b6 <+22>: cmp BYTE PTR [ebx+eax*1-0xbef1fb],0x0
0x0868c5be <+30>: jne 0x868c5f9 <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+89>
0x0868c5c0 <+32>: cmp al,0xb
0x0868c5c2 <+34>: ja 0x868c5d2 <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+50>
0x0868c5c4 <+36>: add esp,0x14
0x0868c5c7 <+39>: add eax,0x1
0x0868c5ca <+42>: pop ebx
0x0868c5cb <+43>: pop ebp
0x0868c5cc <+44>: ret
0x0868c5cd <+45>: call 0x868c4b0 <js::gc::IsBackgroundFinalized(js::gc::AllocKind)>
0x0868c5d2 <+50>: mov DWORD PTR [esp],0xde
0x0868c5d9 <+57>: lea edx,[ebx-0xbf7348]
0x0868c5df <+63>: lea eax,[ebx-0xbf720c]
0x0868c5e5 <+69>: call 0x871bf42 <MOZ_ReportAssertionFailure(char const*, char const*, int)>
0x0868c5ea <+74>: mov DWORD PTR ds:0x0,0xde
0x0868c5f4 <+84>: call 0x804aa20 <abort@plt>
0x0868c5f9 <+89>: mov DWORD PTR [esp],0xdd
0x0868c600 <+96>: lea edx,[ebx-0xbf7348]
0x0868c606 <+102>: lea eax,[ebx-0xc932e1]
0x0868c60c <+108>: call 0x871bf42 <MOZ_ReportAssertionFailure(char const*, char const*, int)>
0x0868c611 <+113>: mov DWORD PTR ds:0x0,0xdd
0x0868c61b <+123>: call 0x804aa20 <abort@plt>
From this, I think the problem is that it's using eax instead of al in the offending cmp. *($ebx+$al-0xbef1fb) == 0x100, which would indeed truncate down to 0 and pass the assertion - but *($ebx+$eax-0xbef1fb) cannot be accessed.
Unfortunately I have no idea how to work around this except by locally disabling optimizations (I haven't tried this yet, but I imagine it would work). I also couldn't say why GCC is failing in this way, or if it might be doing the same thing in other places.
Assignee | ||
Comment 2•10 years ago
|
||
Oh, interesting. If I wrap GetBackgroundAllocKind in a #pragma GCC optimize ("O0") block, I get a different assertion failure, with kind == 127 (!). So maybe the problematic situation is created higher up the stack.
Assignee | ||
Comment 3•10 years ago
|
||
Disregard that. I think the calling conventions used by -O3 and -O0 just aren't compatible - the AllocKind was reading random data.
Using -O1 does work, and |-O2 -fno-expensive-optimizations|. -O3 fails, but none of the exposed options are responsible - I can add all of them to |-O2 -fno-expensive-optimizations|, and it won't change the generated code at all.
Assignee | ||
Comment 4•10 years ago
|
||
This patch fixes the problem for me locally. It might need more work or it might not be the best solution, and I'd like to see if I can get a reduced testcase, but could you confirm that this fixes the crashes for you as well?
Assignee: nobody → emanuel.hoogeveen
Attachment #8579042 -
Flags: feedback?(choller)
Assignee | ||
Comment 5•10 years ago
|
||
So what happens here is that it stores the AllocKind on the stack as a uint8_t:
0x08199dc3 <+67>: mov BYTE PTR [esp+0xc],cl
Then it reads it back off the stack as a uint32_t, so the high bits are no longer zero:
0x08199ddc <+92>: call 0x86fb710
0x086fb710 <+0>: push ebp
0x086fb711 <+1>: mov ebp,esp
0x086fb719 <+9>: mov eax,DWORD PTR [ebp+0x14]
It inlines CanBeFinalizedInBackground(), and does the right thing for the IsBackgroundFinalized() check in there:
0x086fb735 <+37>: movzx edx,al
0x086fb738 <+40>: cmp BYTE PTR [ebx+edx*1-0xc43c1b],0x0
But then it calls into GetBackgroundAllocKind() (not inlined in GCC 4.7), and does the wrong thing:
0x086d9eb6 <+22>: cmp BYTE PTR [ebx+eax*1-0xc43c1b],0x0
I'm not sure what's at fault here: is the second check 'forgetting' that the high bits might not be zero, or is the optimization transforming the load and first (inline) check but forgetting to update the second?
Assignee | ||
Comment 6•10 years ago
|
||
This still reproduces in GCC 4.8.3, but doesn't appear to in GCC 4.9.1. It might just be dodging the bug in a different way though. It'd be really nice to have a reduced testcase, but I'm not sure how to approach reduction (my first naive attempt failed).
Comment 7•10 years ago
|
||
Could it be https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64037 ?
Assignee | ||
Comment 8•10 years ago
|
||
Well spotted! Yes, it seems to be fixed in 4.8 trunk. It just missed 4.8.4, so I'm trying that now to make sure.
As for where that leaves us.. Terrence suggested just changing the underlying type to be word-sized, so this problem can't happen. That will make AllocKind arrays bigger again, but otherwise I don't think there's much of a disadvantage.
I don't know whether x64 is sensitive to this, either. That is to say, would we need to make the underlying type intptr_t/uintptr_t/size_t or would int/unsigned suffice? Terrence, what would you prefer?
I'm wondering if a signed type might be slightly nicer so we can do things like MOZ_ASSERT(kind >= AllocKind::OBJECT0) again*.
* For context, with an unsigned type this is a tautological compare, which makes warn-as-err builds fail, so I removed these assertions in bug 1139552 and added a static_assert that size_t(AllocKind::OBJECT0) == 0.
Flags: needinfo?(terrence)
Comment 9•10 years ago
|
||
I'd just remove the type specifier for now and add a fixme referencing this bug. We can revisit in a few months when gcc4.9 is standard. Maybe by then the MSVC issues with range iteration will also be fixed.
Flags: needinfo?(terrence)
Assignee | ||
Comment 10•10 years ago
|
||
I confirmed that the problem still exists on GCC 4.8.4. From the bug jandem linked, I believe it also exists on GCC 4.9.2, though just removing the inline keyword from GetBackgroundAllocKind() wasn't enough to trigger it there. It should be fixed in 4.8.5 and 4.9.3, whenever those are released.
Attachment #8579042 -
Attachment is obsolete: true
Attachment #8579042 -
Flags: feedback?(choller)
Attachment #8580271 -
Flags: review?(terrence)
Assignee | ||
Updated•10 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 11•10 years ago
|
||
AllocKind is far from the only enum class with a less-than-word-sized underlying type in the tree by the way:
https://dxr.mozilla.org/mozilla-central/search?q=%22enum+class%22&case=true&redirect=true
I guess those aren't causing known problems though :\
Comment 12•10 years ago
|
||
O_O
Good point! Some days I wonder how anything works at all, ever.
Comment 13•10 years ago
|
||
Comment on attachment 8580271 [details] [diff] [review]
Remove type specifier from AllocKind to avoid miscompilations on GCC.
Review of attachment 8580271 [details] [diff] [review]:
-----------------------------------------------------------------
Great!
Attachment #8580271 -
Flags: review?(terrence) → review+
Assignee | ||
Comment 14•10 years ago
|
||
Keywords: checkin-needed
Comment 15•10 years ago
|
||
Keywords: checkin-needed
Comment 16•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in
before you can comment on or make changes to this bug.
Description
•