AddressSanitizer: use-after-poison [@ __asan_memcpy] with WRITE of size 48 through [@ js::NewObjectCache::copyCachedToObject] and various other signatures
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox-esr68 | --- | unaffected |
firefox68 | --- | unaffected |
firefox69 | --- | unaffected |
firefox70 | --- | fixed |
People
(Reporter: decoder, Assigned: pbone)
References
(Blocks 1 open bug, Regression)
Details
(4 keywords, Whiteboard: [jsbugmon:update][fuzzblocker])
Attachments
(3 files, 1 obsolete file)
The following testcase crashes on mozilla-central revision ad05396bfeed (build with --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --disable-debug --enable-address-sanitizer --disable-jemalloc --enable-optimize=-O2 --enable-fuzzing, run with --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off --wasm-compiler=ion):
See attachment.
Backtrace:
==29939==ERROR: AddressSanitizer: use-after-poison on address 0x230d24d40000 at pc 0x55b2fa897da5 bp 0x7ffc1fc4ba30 sp 0x7ffc1fc4b1e0
WRITE of size 48 at 0x230d24d40000 thread T0
#0 0x55b2fa897da4 in __asan_memcpy
#1 0x55b2faba7d2a in js_memcpy(void*, void const*, unsigned long) js/src/jsutil.h:54:10
#2 0x55b2faba7d2a in js::NewObjectCache::copyCachedToObject(js::NativeObject*, js::NativeObject*, js::gc::AllocKind) js/src/vm/Caches.h:216
#3 0x55b2faba7d2a in js::NewObjectCache::newObjectFromHit(JSContext*, int, js::gc::InitialHeap) js/src/vm/Caches-inl.h:81
#4 0x55b2fb06d9f3 in js::NewObjectWithClassProtoCommon(JSContext*, js::Class const*, JS::Handle<JSObject*>, js::gc::AllocKind, js::NewObjectKind) js/src/vm/JSObject.cpp:907:29
#5 0x55b2faa98ed9 in js::NewObjectWithClassProto(JSContext*, js::Class const*, JS::Handle<JSObject*>, js::gc::AllocKind, js::NewObjectKind) js/src/vm/JSObject-inl.h:493:10
#6 0x55b2faa98ed9 in js::NewBuiltinClassInstance(JSContext*, js::Class const*, js::gc::AllocKind, js::NewObjectKind) js/src/vm/JSObject-inl.h:526
#7 0x55b2faa98ed9 in js::PlainObject* js::NewBuiltinClassInstance<js::PlainObject>(JSContext*, js::gc::AllocKind, js::NewObjectKind) js/src/vm/JSObject-inl.h:545
#8 0x55b2faa98ed9 in js::CopyInitializerObject(JSContext*, JS::Handle<js::PlainObject*>, js::NewObjectKind) js/src/vm/NativeObject-inl.h:624
#9 0x55b2faa99a8c in js::NewObjectOperationWithTemplate(JSContext*, JS::Handle<JSObject*>) js/src/vm/Interpreter.cpp:5212:7
#10 0x55b2fbf26d3d in js::jit::DoNewObjectFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICNewObject_Fallback*, JS::MutableHandle<JS::Value>) js/src/jit/BaselineIC.cpp:4074:11
#11 0x7fd2a1d6dacf (<unknown module>)
Address 0x230d24d40000 is a wild pointer.
SUMMARY: AddressSanitizer: use-after-poison in __asan_memcpy
Shadow bytes around the buggy address:
0x04622499ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x04622499ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x04622499fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0462249a0000:[f7]f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
0x0462249a0010: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
0x0462249a0020: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Poisoned by user: f7
==29939==ABORTING
This is causing quite a few distinct and hard-to-reproduce use-after-poison issues, so I am marking this as fuzzblocker. Also marking s-s until we know the source of the problem. Unfortunately, the testcase does not reduce further than what is attached, it reproduced reliably for me though. It seems very sensitive to memory allocations and branches.
Reporter | ||
Comment 1•5 years ago
|
||
Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 2•5 years ago
|
||
This look like this could be a rooting issue in the IC of Baseline, while I doubt about it this is my simplest hypothesis so far.
Matthew, can you (in/)validate this hypothesis, and maybe find more info and who might feel concerned by this issue.
Updated•5 years ago
|
Comment 3•5 years ago
|
||
This memory is "poisoned by user", so it's the ASan/Valgrind instrumentation we added to Poison. Requesting NI from paul because he made changes in that area recently.
Comment 4•5 years ago
|
||
So, rr
is not being terribly cooperative, but here's what I've got.
-
I can reproduce, with the same stack as :decoder. However, trying to set a breakpoint at the precise point where the read occurs seems to cause
rr
to crash. I was able however to step through from the start of the function. -
The byte read is also the first first byte of the allocated object here. I don't know about ASAN + our GC, but it's perhaps notable that it thinks
Address 0x24faf0440000 is a wild pointer
? -
The crash happens during the
js_memcpy
reached from the next line, the call tocopyCachedToObject
. -
Specifically, it's during the
__asan_memcpy
that gets injected that the error fires; but, I don't really understand why: the supposed uninitialized value read is inside thedst
of the memcpy! Is this an ASAN bug?
Some other notes that came up during my investigation, perhaps helpful to the next person who looks at this:
-
The value was poisoned here, by watching the value itself.
#0 __memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:151 #1 0x0000562c56fa2f1a in __asan_memset () at /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:27 #2 0x0000562c5851b903 in js::AlwaysPoison (ptr=<optimized out>, value=47 '/', num=1048576, kind=MemCheckKind::MakeUndefined) at /home/mgaudet/unified/js/src/jsutil.h:352 #3 js::Poison (ptr=0x7f73ad800000, value=47 '/', num=1048576, kind=MemCheckKind::MakeUndefined) at /home/mgaudet/unified/js/src/jsutil.h:368 #4 js::NurseryChunk::poisonAndInit (this=0x7f73ad800000, rt=0x629000000200, extent=1048576) at /home/mgaudet/unified/js/src/gc/Nursery.cpp:77 #5 js::Nursery::poisonAndInitCurrentChunk (this=<optimized out>, fullPoison=<optimized out>) at /home/mgaudet/unified/js/src/gc/Nursery.cpp:1253 #6 js::Nursery::enable (this=0x6290000029e0) at /home/mgaudet/unified/js/src/gc/Nursery.cpp:301 #7 0x0000562c5795b334 in JSRuntime::initSelfHosting (this=<optimized out>, cx=0x7ffd63cec340) at /home/mgaudet/unified/js/src/vm/SelfHosting.cpp:3158 #8 0x0000562c57e78dee in JS::InitSelfHostedCode (cx=0x61d000000a80) at /home/mgaudet/unified/js/src/jsapi.cpp:433 #9 0x0000562c56fe6eed in main (argc=-1957535376, argv=0x61d000000a80, envp=<optimized out>) at /home/mgaudet/unified/js/src/shell/js.cpp:11345
however, it's peculiar, because I see a different place the poisoning occurs if watch the shadow byte that corresponds to that address directly for changes:
#0 __memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:151 #1 0x0000559e0b065ed2 in __asan_poison_memory_region () at /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_poisoning.cc:141 #2 0x0000559e0c5d8243 in js::gc::MarkPagesUnused (region=0x24faf0440000, length=782336) at /home/mgaudet/unified/js/src/gc/Memory.cpp:753 #3 0x0000559e0c5d94a5 in js::NurseryDecommitTask::decommitRange (this=<optimized out>, lock=...) at /home/mgaudet/unified/js/src/gc/Nursery.cpp:191 #4 js::NurseryDecommitTask::run (this=0x629000002d30) at /home/mgaudet/unified/js/src/gc/Nursery.cpp:156 #5 0x0000559e0b763cc5 in js::GCParallelTask::runFromHelperThread (this=0x629000002d30, lock=...) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:1716 #6 js::HelperThread::handleGCParallelWorkload (this=<optimized out>, lock=...) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:1737 #7 0x0000559e0b7671a2 in js::HelperThread::threadLoop (this=0x611000000400) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:2515 #8 0x0000559e0b7f5163 in js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::callMain<0ul> (this=0x602000000370) at /home/mgaudet/unified/js/src/threading/Thread.h:239 #9 js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start (aPack=0x602000000370) at /home/mgaudet/unified/js/src/threading/Thread.h:232 #10 0x00007f694c494182 in start_thread (arg=<optimized out>) at pthread_create.c:486 #11 0x00007f694c044b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
-
I think it's notable that the read is the first byte of a poisoned area, where the byte before is addressable.
-
Watch points seem to be a little peculiar; if I watch the address for maniplulation, and reverse-continue, all I ever see are poison writes. On the flip side however, if I reverse to the first poisoning, and step forward, I see a non-poisioning write flagged here
#0 js::gc::ChunkBitmap::isMarkedAny (this=0x28f67cafc0a0, cell=0x28f67caad128) at /home/mgaudet/unified/js/src/gc/Heap.h:644 #1 js::gc::TenuredCell::isMarkedAny (this=0x28f67caad128) at /home/mgaudet/unified/js/src/gc/Cell.h:312 #2 js::gc::Arena::finalize<js::Shape> (this=0x28f67caad000, fop=0x7f694875c8c0, thingKind=<optimized out>, thingSize=40) at /home/mgaudet/unified/js/src/gc/GC.cpp:585 #3 0x0000559e0c501839 in FinalizeTypedArenas<js::Shape> (fop=0x7f694875c8c0, src=0x7f694875b6a0, dest=..., budget=..., thingKind=<optimized out>) at /home/mgaudet/unified/js/src/gc/GC.cpp:653 #4 FinalizeArenas (fop=0x7f694875c8c0, src=0x7f694875b6a0, dest=..., thingKind=<optimized out>, budget=...) at /home/mgaudet/unified/js/src/gc/GC.cpp:681 #5 0x0000559e0c500984 in js::gc::ArenaLists::backgroundFinalize (fop=<optimized out>, listHead=0x28f67cab2000, empty=<optimized out>) at /home/mgaudet/unified/js/src/gc/GC.cpp:3255 #6 0x0000559e0c50b232 in js::gc::GCRuntime::sweepBackgroundThings (this=0x6290000006e8, zones=..., freeBlocks=...) at /home/mgaudet/unified/js/src/gc/GC.cpp:3724 #7 0x0000559e0c50c356 in js::gc::GCRuntime::sweepFromBackgroundThread (this=0x6290000006e8, lock=...) at /home/mgaudet/unified/js/src/gc/GC.cpp:3805 #8 0x0000559e0c50c041 in js::gc::BackgroundSweepTask::run (this=0x629000001950) at /home/mgaudet/unified/js/src/gc/GC.cpp:3790 #9 0x0000559e0b763cc5 in js::GCParallelTask::runFromHelperThread (this=0x629000001950, lock=...) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:1716 #10 js::HelperThread::handleGCParallelWorkload (this=<optimized out>, lock=...) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:1737 #11 0x0000559e0b7671a2 in js::HelperThread::threadLoop (this=0x611000000400) at /home/mgaudet/unified/js/src/vm/HelperThreads.cpp:2515 #12 0x0000559e0b7f5163 in js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::callMain<0ul> (this=0x602000000370) at /home/mgaudet/unified/js/src/threading/Thread.h:239 #13 js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start (aPack=0x602000000370) at /home/mgaudet/unified/js/src/threading/Thread.h:232 #14 0x00007f694c494182 in start_thread (arg=<optimized out>) at pthread_create.c:486 #15 0x00007f694c044b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
which seems peculiar, as
isMarkedAny
shouldn't be doing writes? Right? Don't know how much I trustrr
here.
In conclusion: I think this is either a GC thing, or, potentially just an ASAN bug exacerbated by the new poisoning?
Reporter | ||
Comment 5•5 years ago
|
||
(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #4)
In conclusion: I think this is either a GC thing, or, potentially just an ASAN bug exacerbated by the new poisoning?
ASan bugs are of course not impossible, but highly unlikely. We have not seen notable bugs in ASan itself for about 2 years now. Also, this issue only started occurring on Tue 9 Jul 2019 (at a very high rate) and not before. I think the poisoning patch is older than that. It might help if we had an exact regression range. Gary, can you bisect this, even though it might be a little flaky?
autobisectjs shows this is probably related to the following changeset:
The first bad revision is:
changeset: https://hg.mozilla.org/mozilla-central/rev/fbfcb9fbf6c5
user: Paul Bone
date: Fri Jul 05 13:18:17 2019 +0000
summary: Bug 1506733 - (part 4) Decommit unused portion of Nursery chunks r=jonco
Paul, is bug 1506733 a likely regressor?
(I fixed a bunch of bisection and/or asan compilation-related issues for this, I think this potential regressor makes sense)
Updated•5 years ago
|
Reporter | ||
Comment 7•5 years ago
|
||
Assignee | ||
Comment 8•5 years ago
|
||
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #6)
autobisectjs shows this is probably related to the following changeset:
The first bad revision is:
changeset: https://hg.mozilla.org/mozilla-central/rev/fbfcb9fbf6c5
user: Paul Bone
date: Fri Jul 05 13:18:17 2019 +0000
summary: Bug 1506733 - (part 4) Decommit unused portion of Nursery chunks r=joncoPaul, is bug 1506733 a likely regressor?
(I fixed a bunch of bisection and/or asan compilation-related issues for this, I think this potential regressor makes sense)
Yes. From your bisect and especially Matthew's back traces it it certainly either caused by that patch if not it was pre-existing and the patch helped to tease it out.
Assignee | ||
Comment 9•5 years ago
|
||
I think that Bug 1562550 fixes the first crash in comment 0, at least it does for me. decoder can you check this?
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 10•5 years ago
|
||
it looks like both those test cases/crashes are the same (they have different stacks where they crash so I checked) and that applying Bug 1562550 part 2 fixes it.
The problem was that entering zeal mode wouldn't clear the ASAN information for part of the chunk and so when we touched it later it would cause the failure.
I want to apply another patch though, to so there's no race between the background decommiter and either disabling the nursery or entering zeal mode. This was something else I investigated.
Assignee | ||
Comment 11•5 years ago
|
||
I think this is not a security bug because it's caused by improper asan calls. Not something that asan actually caught.
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 12•5 years ago
|
||
- Spelling fixes.
- Make the meaning of a comment clearer.
Assignee | ||
Comment 13•5 years ago
|
||
Depends on D38042
Assignee | ||
Comment 14•5 years ago
|
||
Depends on D38043
Reporter | ||
Comment 15•5 years ago
|
||
(In reply to Paul Bone [:pbone] from comment #10)
it looks like both those test cases/crashes are the same (they have different stacks where they crash so I checked) and that applying Bug 1562550 part 2 fixes it.
Hm, I can still reproduce with the test in comment 7 on today's build (mozilla-central revision eae7203ff8a5), even though bug 1562550 has landed several days ago. Are you sure this is fixed by the patch?
Assignee | ||
Comment 16•5 years ago
|
||
I'm sorry decoder. I must have missed something. I tested again today and Bug 1562550 fixes the first problem but not the second. they're different bugs.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 17•5 years ago
|
||
The 2nd problem was a race. The decommit task would attempt to decommit part of a chunk that was also being decommitted as a whole chunk while disabling the nursery. The solution is to (as in the patch I already wrote when I noticed this by reading code) to join() the decommit task before freeing the chunks while disabling the nursery.
Comment 18•5 years ago
|
||
Comment 19•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e49dd8bce963
https://hg.mozilla.org/mozilla-central/rev/227a100b5eb0
Updated•5 years ago
|
Updated•5 years ago
|
Updated•3 years ago
|
Description
•