Assertion failure: regionStart && OffsetFromAligned(regionStart, alignment) != 0, at js/src/gc/Memory.cpp:662
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox65 | --- | unaffected |
firefox66 | --- | fixed |
firefox67 | --- | fixed |
People
(Reporter: decoder, Assigned: ehoogeveen)
References
Details
(4 keywords, Whiteboard: [jsbugmon:update])
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
sfink
:
review+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
The following testcase crashes on mozilla-central revision 666abafd77b1 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --enable-debug --without-intl-api --enable-optimize --target=i686-pc-linux-gnu, run with --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off):
evaluate(`
function testCALLELEM() {
function f() {}
evalInWorker("newGlobal().offThreadCompileScript('{}');");
testCALLELEM();
}
testCALLELEM()
`);
Backtrace:
received signal SIGSEGV, Segmentation fault.
#0 js::gc::TryToAlignChunk<>(void **, void **, size_t, size_t) (aRegion=aRegion@entry=0xa2bcef8, aRetainedRegion=aRetainedRegion@entry=0xa2bcf04, length=length@entry=1048576, alignment=1048576) at js/src/gc/Memory.cpp:662
#1 0x56e82aa8 in js::gc::MapAlignedPagesLastDitch (length=length@entry=1048576, alignment=alignment@entry=1048576) at js/src/gc/Memory.cpp:592
#2 0x56e832c4 in js::gc::MapAlignedPages (length=1048576, alignment=1048576) at js/src/gc/Memory.cpp:447
#3 0x56e03f91 in js::gc::Chunk::allocate (rt=0xf44d8000) at js/src/gc/Allocator.cpp:721
#4 0x56e0e8ca in js::gc::GCRuntime::getOrAllocChunk (this=0xf44d83c8, lock=...) at js/src/gc/Allocator.cpp:653
#5 0x56e0e98f in js::gc::GCRuntime::pickChunk (this=0xf44d83c8, lock=...) at js/src/gc/Allocator.cpp:678
#6 0x56e17015 in js::gc::ArenaLists::refillFreeListAndAllocate (this=0xcce72060, freeLists=..., thingKind=js::gc::AllocKind::JITCODE, checkThresholds=js::gc::ShouldCheckThresholds::CheckThresholds) at js/src/gc/Allocator.cpp:477
#7 0x56e17273 in js::gc::GCRuntime::refillFreeListFromMainThread (cx=0xcce71800, thingKind=js::gc::AllocKind::JITCODE) at js/src/gc/Allocator.cpp:422
#8 0x56e6597b in js::gc::GCRuntime::tryNewTenuredThing<js::jit::JitCode, (js::AllowGC)0> (thingSize=32, kind=js::gc::AllocKind::JITCODE, cx=0xcce71800) at js/src/gc/Allocator.cpp:271
#9 js::Allocate<js::jit::JitCode, (js::AllowGC)0> (cx=0xcce71800) at js/src/gc/Allocator.cpp:252
#10 0x570e6482 in js::jit::JitCode::New<(js::AllowGC)0> (cx=0xcce71800, code=0x38268010 "", bufferSize=60600, headerSize=16, pool=0xec4d3070, kind=js::jit::CodeKind::Other) at js/src/jit/Ion.cpp:653
#11 0x5712876c in js::jit::Linker::newCode (this=0xa2bd2bc, cx=0xcce71800, kind=js::jit::CodeKind::Other) at js/src/jit/Linker.cpp:54
#12 0x570dcd44 in js::jit::JitRuntime::initialize (this=<optimized out>, cx=<optimized out>) at js/src/jit/Ion.cpp:306
#13 0x56a7eed7 in JSRuntime::createJitRuntime (this=0xf44d8000, cx=0xcce71800) at js/src/vm/Realm.cpp:146
#14 0x56c761aa in JS::InitSelfHostedCode (cx=0xcce71800) at js/src/jsapi.cpp:443
#15 0x567594d0 in WorkerMain (input=<optimized out>) at js/src/shell/js.cpp:3960
[...]
#19 0xf7cc441e in clone () from /lib32/libc.so.6
eax 0x57bffa54 1472199252
ebx 0xa2bcf04 170643204
ecx 0xf7d90864 -136771484
edx 0x57781f64 1467490148
esi 0x100000 1048576
edi 0x57bfeff4 1472196596
ebp 0xa2bcec8 170643144
esp 0xa2bce90 170643088
eip 0x56e819d4 <js::gc::TryToAlignChunk<>(void **, void **, size_t, size_t)+388>
=> 0x56e819d4 <js::gc::TryToAlignChunk<>(void **, void **, size_t, size_t)+388>: movl $0x0,0x0
0x56e819de <js::gc::TryToAlignChunk<>(void **, void **, size_t, size_t)+398>: ud2
This could be shell-only as it seems to be related to over-recursion with the worker implementation in the shell.
Comment 1•6 years ago
|
||
Steve, is this regression bug something you could quickly look at some point soon and investigate. Or if you don't have time could you help suggest someone else on our team I could have look at this.
Updated•6 years ago
|
Assignee | ||
Comment 2•6 years ago
|
||
Hmm, I'm not sure what the deal is with this one but I wonder if it's related to bug 1520496. I may have introduced a bug in TryToAlignChunk() that's either rare or being masked by using the new allocator by default (or both). I'll look into this tomorrow along with bug 1520496.
Comment 3•6 years ago
|
||
I haven't caught this in gdb or rr. Whatever it's doing mangles things enough that it crashes both, in different ways. I can get a core file, but it doesn't have symbols (or a valid stack).
It's something with stack exhaustion or something. Running it starts and stops a couple thousand threads. I added a printout just before the assert, which gave regionStart=0xdeb00000, alignment=1048576, offset=0. So an already-aligned chunk is slipping in?
The only way I see of that happening is if a previous call to TryToAlignChunk set region to an aligned value but returned false.
Oh! ehoogeveen, would this do it? TryToAlignChunk runs, and maps another unaligned region. It sets result to false. AlwaysGetNew is the default, true, so it does another MapMemory. Say this returns an aligned region. Then wouldn't you end up passing back an aligned pointer in ®ion, but returning false and trying again, hitting this assert?
Simply re-setting 'result' based on the alignment doesn't work; it hits MOZ_ASSERT(!tempMaps[attempt]) (it doesn't think we should have filled in tempMaps[attempt] if we succeeded.) If I "fix" that (with if (tempMaps[attempt]) attempt++;
and commenting out the assert), it "works". But I'll let you figure out how to do it the proper way.
Updated•6 years ago
|
Comment hidden (obsolete) |
I don't think comment 4 is right, the testcase may have been intermittent or something.
Updated•6 years ago
|
Assignee | ||
Comment 6•6 years ago
|
||
Thanks for the analysis, Steve - you're right, we weren't handling naturally aligned chunks properly. While testing this I also noticed that TestMapAlignedPagesLastDitch() was still disabled on 64-bit platforms. That probably caused the failures in bug 1520496, though I still want to look into why we're falling back to the old allocator at all.
PS: Sorry for sitting on this for so long, I've had some health issues recently so I haven't had much energy for things outside of my job.
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 7•6 years ago
|
||
I realized I could extend the testGCAllocator jsapi-test to catch this bug, so I did. MapAlignedPages() was dodging this bug by ignoring the return value of TryToAlignChunk(), but that was suboptimal so I changed it (now two tests in testGCAllocator fail without the change in TryToAlignChunk()).
I also added a few more (debug only) assertions to make sure TryToAlignChunk() works correctly, and added some much needed comments to MapAlignedPages() while I was there.
Comment 8•6 years ago
|
||
Assignee | ||
Comment 9•6 years ago
|
||
Thanks for the review!
Here's a try run with a change on top to disable the new allocator (to force testing the old one): https://treeherder.mozilla.org/#/jobs?repo=try&revision=cdc39223bfc2529861e090486cd6448be06f5e26
And a regular try run with just this change: https://treeherder.mozilla.org/#/jobs?repo=try&revision=28d36a68c4a4c2de1af2a447631ccc556a7913d8
Both are green, so I think this is ready to go :)
Comment 10•6 years ago
|
||
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee8c2687d7f5
Do the right thing if we happen to get an aligned region in TryToAlignChunk. r=sfink
Comment 11•6 years ago
|
||
bugherder |
Comment 12•6 years ago
|
||
Is this something we should consider backporting to Beta or can it ride the trains?
Assignee | ||
Comment 13•6 years ago
|
||
Comment on attachment 9044774 [details] [diff] [review]
v2 - Do the right thing if we happen to get an aligned region in TryToAlignChunk.
Beta/Release Uplift Approval Request
- Feature/Bug causing the regression: Bug 1502733
- User impact if declined: This bug partially breaks our 'last ditch' allocator (except on Windows, which is unaffected) which is used in low memory situations. As a result 32-bit platforms and devices with low physical memory could run out of memory more quickly, though it's hard to judge the real world effect of this change. This patch also fixes the intermittent test failure tracked by bug 1520496, and might reduce noise for fuzzers.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): This is not a one-line fix, but the changes are contained to two functions that are heavily tested by the JS engine. The patch includes a regression test for the broken behavior identified in this bug.
- String changes made/needed:
Assignee | ||
Comment 14•6 years ago
|
||
Hi Ryan, thanks for the reminder. I wasn't sure whether to nominate this patch as it's getting pretty late in the cycle, but I think the included fix for bug 1520496 makes it worth taking.
Comment 16•6 years ago
|
||
bugherder uplift |
Description
•