Closed Bug 1507322 Opened 6 years ago Closed 6 years ago

Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeakMap

Categories

(Core :: JavaScript: GC, defect)

x86_64
Linux
defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- wontfix
firefox64 --- wontfix
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: decoder, Assigned: sfink)

References

Details

(5 keywords, Whiteboard: [jsbugmon:update][sg:dos])

Crash Data

Attachments

(3 files, 1 obsolete file)

The following testcase crashes on mozilla-central revision 073045259e75 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --enable-debug --enable-optimize, run with --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off): function TestGC2(m) { var head = new Object; for (key = head, i = 0; i < 48614; i++, key = m.get(key)) { m.set(key, new Object); } gc(); for (key = head; key != undefined; key = m.get(key)) {} } TestGC2(new WeakMap); Backtrace: received signal SIGSEGV, Segmentation fault. 0x0000555555b2eb74 in js::CurrentThreadCanAccessRuntime (rt=0x7ffff5f1c000) at js/src/vm/Runtime.cpp:833 #0 0x0000555555b2eb74 in js::CurrentThreadCanAccessRuntime (rt=0x7ffff5f1c000) at js/src/vm/Runtime.cpp:833 #1 0x0000555555d81393 in js::CheckThreadLocal::check (this=this@entry=0x7ffff5f182f0) at js/src/threading/ProtectedData.cpp:45 #2 0x0000555555d82ad5 in js::ProtectedData<js::CheckThreadLocal, bool>::ref (this=0x7ffff5f182e8) at js/src/threading/ProtectedData.h:116 #3 js::ProtectedData<js::CheckThreadLocal, bool>::operator bool const& (this=0x7ffff5f182e8) at js/src/threading/ProtectedData.h:84 #4 js::OnHelperThread<(js::AllowedHelperThread)1> () at js/src/threading/ProtectedData.cpp:32 #5 js::CheckZone<(js::AllowedHelperThread)1>::check (this=this@entry=0x7ffff5f58080) at js/src/threading/ProtectedData.cpp:70 #6 0x0000555555f00b22 in js::ProtectedData<js::CheckZone<(js::AllowedHelperThread)1>, JS::GCHashMap<js::gc::Cell*, unsigned long, mozilla::PointerHasher<js::gc::Cell*>, js::SystemAllocPolicy, js::gc::UniqueIdGCPolicy> >::ref (this=0x7ffff5f58058) at js/src/threading/ProtectedData.h:107 #7 JS::Zone::uniqueIds (this=0x7ffff5f58000) at js/src/gc/Zone.h:308 #8 JS::Zone::hasUniqueId (this=0x7ffff5f58000, cell=<optimized out>) at js/src/gc/Zone-inl.h:132 #9 0x0000555555f00bcb in js::MovableCellHasher<JSObject*>::hasHash (l=@0x7fffff7ff0d8: 0x7ffff4a31e40) at js/src/gc/Barrier.cpp:146 #10 0x0000555555a40f5d in js::MovableCellHasher<js::HeapPtr<JSObject*> >::hasHash (l=<optimized out>) at js/src/gc/Barrier.h:844 #11 mozilla::FallibleHashMethods<js::MovableCellHasher<js::HeapPtr<JSObject*> > >::hasHash<JSObject* const&> (l=<optimized out>) at dist/include/js/RootingAPI.h:788 #12 mozilla::HasHash<mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, JSObject* const&> (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40) at dist/include/mozilla/HashTable.h:960 #13 mozilla::detail::HashTable<mozilla::HashMapEntry<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >, mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, js::ZoneAllocPolicy>::readonlyThreadsafeLookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:2115 #14 mozilla::detail::HashTable<mozilla::HashMapEntry<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >, mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::MapHashPolicy, js::ZoneAllocPolicy>::lookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:2125 #15 mozilla::HashMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value>, js::MovableCellHasher<js::HeapPtr<JSObject*> >, js::ZoneAllocPolicy>::lookup (aLookup=@0x7fffff7ff0d8: 0x7ffff4a31e40, this=0x7ffff55053f8) at dist/include/mozilla/HashTable.h:254 #16 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a31e40, origKey=...) at js/src/gc/WeakMap-inl.h:51 #17 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a31e40, values=...) at js/src/gc/Marking.cpp:629 #18 0x0000555555f0f2d2 in js::GCMarker::markImplicitEdgesHelper<JSObject*> (this=this@entry=0x7ffff5f1d690, markedThing=markedThing@entry=0x7ffff4a31e40) at js/src/gc/Marking.cpp:656 #19 0x0000555555f10386 in js::GCMarker::markImplicitEdges<JSObject> (thing=0x7ffff4a31e40, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:670 #20 js::GCMarker::markAndPush<JSObject> (this=this@entry=0x7ffff5f1d690, thing=thing@entry=0x7ffff4a31e40) at js/src/gc/Marking.cpp:856 #21 0x0000555555f10424 in js::GCMarker::traverse<JSObject*> (thing=0x7ffff4a31e40, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:859 #22 DoMarking<JSObject> (gcmarker=0x7ffff5f1d690, thing=0x7ffff4a31e40) at js/src/gc/Marking.cpp:735 #23 0x0000555555f1e02b in DoMarking<JS::Value> (thing=..., gcmarker=<optimized out>) at js/src/gc/Marking.cpp:750 #24 js::gc::TraceEdgeInternal<JS::Value> (trc=trc@entry=0x7ffff5f1d690, thingp=0x7ffff4b0b5c0, name=name@entry=0x5555568f1bf5 "ephemeron value") at js/src/gc/Marking.cpp:562 #25 0x0000555555a411d3 in js::TraceEdge<JS::Value> (name=0x5555568f1bf5 "ephemeron value", thingp=<optimized out>, trc=0x7ffff5f1d690) at js/src/gc/Tracer.h:109 #26 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a26f00, origKey=...) at js/src/gc/WeakMap-inl.h:57 #27 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a26f00, values=...) at js/src/gc/Marking.cpp:629 #28 0x0000555555f0f2d2 in js::GCMarker::markImplicitEdgesHelper<JSObject*> (this=this@entry=0x7ffff5f1d690, markedThing=markedThing@entry=0x7ffff4a26f00) at js/src/gc/Marking.cpp:656 #29 0x0000555555f10386 in js::GCMarker::markImplicitEdges<JSObject> (thing=0x7ffff4a26f00, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:670 #30 js::GCMarker::markAndPush<JSObject> (this=this@entry=0x7ffff5f1d690, thing=thing@entry=0x7ffff4a26f00) at js/src/gc/Marking.cpp:856 #31 0x0000555555f10424 in js::GCMarker::traverse<JSObject*> (thing=0x7ffff4a26f00, this=0x7ffff5f1d690) at js/src/gc/Marking.cpp:859 #32 DoMarking<JSObject> (gcmarker=0x7ffff5f1d690, thing=0x7ffff4a26f00) at js/src/gc/Marking.cpp:735 #33 0x0000555555f1e02b in DoMarking<JS::Value> (thing=..., gcmarker=<optimized out>) at js/src/gc/Marking.cpp:750 #34 js::gc::TraceEdgeInternal<JS::Value> (trc=trc@entry=0x7ffff5f1d690, thingp=0x7ffff4b2a508, name=name@entry=0x5555568f1bf5 "ephemeron value") at js/src/gc/Marking.cpp:562 #35 0x0000555555a411d3 in js::TraceEdge<JS::Value> (name=0x5555568f1bf5 "ephemeron value", thingp=<optimized out>, trc=0x7ffff5f1d690) at js/src/gc/Tracer.h:109 #36 js::WeakMap<js::HeapPtr<JSObject*>, js::HeapPtr<JS::Value> >::markEntry (this=0x7ffff55053c0, marker=0x7ffff5f1d690, markedCell=0x7ffff4a512c0, origKey=...) at js/src/gc/WeakMap-inl.h:57 #37 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a512c0, values=...) at js/src/gc/Marking.cpp:629 [...] #127 0x0000555555ebe8a1 in js::GCMarker::markEphemeronValues (this=0x7ffff5f1d690, markedCell=0x7ffff4a1c380, values=...) at js/src/gc/Marking.cpp:629 rax 0x7ffff5f18000 140737319632896 rbx 0x7ffff5f182f0 140737319633648 rcx 0x1 1 rdx 0x0 0 rsi 0x7ffff4a31e40 140737297718848 rdi 0x7ffff5f1c000 140737319649280 rbp 0x7fffff7ff000 140737479962624 rsp 0x7fffff7ff000 140737479962624 r8 0x7fffff7ff150 140737479962960 r9 0x0 0 r10 0xa0eaf452 2699752530 r11 0x0 0 r12 0x7ffff5f58080 140737319895168 r13 0x7fffff7ff0e0 140737479962848 r14 0x7fffff7ff0f0 140737479962864 r15 0x7ffff5f1d690 140737319655056 rip 0x555555b2eb74 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+4> => 0x555555b2eb74 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+4>: push %rbx 0x555555b2eb75 <js::CurrentThreadCanAccessRuntime(JSRuntime const*)+5>: mov %rdi,%rbx This looks like a stack space exhaustion (over recursion), but I'm marking it s-s as a start because it involves GC with WeapMaps and I want to be sure there is no memory corruption or anything present causing this behavior.
Summary: Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeapMap → Crash [@ js::CurrentThreadCanAccessRuntime] during GC with WeakMap
Flags: needinfo?(jcoppeard)
Christian: any idea of the regression range here?
(In reply to Daniel Veditz [:dveditz] from comment #1) > Christian: any idea of the regression range here? No. JSBugMon will soon run on the bug, but if that doesn't find the regression range, it is up to QA/developers to determine that.
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
I take that back. Comment 4 is also unlikely to be the one.
It seems like GCMarker::markEphemeronValues ends up being called recursively in this case. Steve, could we push things onto the mark stack rather than marking them directly there?
Flags: needinfo?(jcoppeard) → needinfo?(sphink)
Group: javascript-core-security
Keywords: csectype-dos
Whiteboard: [jsbugmon:update] → [jsbugmon:update][sg:dos]
Testing this out with https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=217512636&revision=1ea013f7413050c0b9617a0af4fe0090989a49e9 Seems good so far. And I *think* it may have removed a redundant call to markImplicitEdges that I'd been meaning to look at. (processMarkStackTop's scan_obj does markImplicitEdges, and so does JSObject's markAndPush, which seems wrong.)
Flags: needinfo?(sphink)
Hopefully I got this right. It fixes the test case, at least.
Attachment #9032504 - Flags: review?(jcoppeard)
Assignee: nobody → sphink
Status: NEW → ASSIGNED
Comment on attachment 9032504 [details] [diff] [review] mark implicit edges via mark stack instead of eagerly Review of attachment 9032504 [details] [diff] [review]: ----------------------------------------------------------------- Well, I want to r+ this but I'm not actually I understand how it works. This makes JSScript marking not use the mark stack, and moves implicit edge marking to the traceChildren() method... for some set of GC things? Is that correct? The stack trace seems to be a problem with JSObject marking taking stack space proportional to the length of the path through all the weak map entries. Oh, so is it that implicit edge marking no longer happens in markAndPush()? Ah, that's what your comment says. r+ if so. Note that if you're not pushing JSScripts on the mark stack any more then there's some stuff in MarkStack related to them that can be removed, i.e. anything related to ScriptTag.
Attachment #9032504 - Flags: review?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #9) > This makes JSScript marking not use the mark stack, and moves implicit edge > marking to the traceChildren() method... for some set of GC things? Is that > correct? Yes. And perhaps it's a mistake to change JSScript marking in this patch; I only did it because the comment implied that the only reason why it needed to use the mark stack was because of the weakmap marking, so if we stop using the mark stack for that, then that reason goes away. Basically, I sort of assumed that it was *only* using the mark stack because of weakmaps, and had been changed to accommodate them. But I didn't do the research, so it's possible that there are other reasons for it. In which case, it may be too risky for this bug, and I should do it as a separate bug (just to reduce unnecessary depth on the stack). As for the set of GC things, there's a bunch of stuff that unfortunately depends on the set of GC things that can be used as weakmap keys: JSObject, JSScript, and LazyScript. All of them need to do the weakmap marking when they get marked, and I like to think of that as "implicit" edges from a key to its value. Thought of that way, then they should naturally be marking their implicit children along with their other children in places like ::traceChildren(). (JSObject is a little weird; we inline its marking-specific traceChildren into processMarkStackTop.) ...and for no apparent reason, I seem to have done it for Shape as well?? Uh, that makes no sense, I'll remove it. I must've been confused about which function I was in? > The stack trace seems to be a problem with JSObject marking taking stack > space proportional to the length of the path through all the weak map > entries. Oh, so is it that implicit edge marking no longer happens in > markAndPush()? Ah, that's what your comment says. r+ if so. Right, that's the real point here. When encountering a key of type JSScript or LazyScript, we no longer immediately trace the implicit edges because that takes C stack and can recurse. Instead, we delay that to when we're marking their children anyway. And none of that comes into play for this test case, because it's only using JSObject keys. For JSObject keys, we were already tracing implicit child edges along with the explicit children (eg group_), there was just an unnecessary eager additional call to markImplicitEdges when first pushing an object onto the mark stack. Uh... but as a result of writing this up, I'm also not seeing where LazyScript implicit children were getting handled. They go through the markAndScan, which doesn't call markImplicitEdges. I think this may be fixing a second bug here. I wonder if we have good enough test facilities to write a test case for those. > Note that if you're not pushing JSScripts on the mark stack any more then > there's some stuff in MarkStack related to them that can be removed, i.e. > anything related to ScriptTag. Good point. I think I'd better split that change out. Especially since there's a chance it could break things for unrelated reasons.

Steve, can you prioritize this bug using the (Importance) field as per your judgement?

Component: JavaScript Engine → JavaScript: GC

Set to major, which might be a bit too high -- crashes are bad, but this seems like it shouldn't happen that much in practice.

Fortunately, I just need to get back to this and split the patch up a bit; the existing patch seems to work and fix the problem.

Severity: critical → major
Finally got back around to splitting out the eager JSScript marking. I'll have to wait for this to land before I can do that.
Attachment #9036075 - Flags: review?(jcoppeard)
Attachment #9032504 - Attachment is obsolete: true
Attachment #9036075 - Flags: review?(jcoppeard) → review+
Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/5ecfcb615bd6 mark implicit edges via mark stack instead of eagerly, r=jonco
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66

Can we land the testcase for this? Also, please nominate this for Beta approval when you get a chance.

Flags: needinfo?(sphink)

Comment on attachment 9036075 [details] [diff] [review]
mark implicit edges via mark stack instead of eagerly

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1164294

User impact if declined: Denial-of-service type of crash, which is possible to stumble across with real code. Especially if that code is buggy.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: I will be landing a test here soon, and would probably want to uplift it along with this.

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): It doesn't change what happens, just when. It's the sort of thing that is pretty likely to show up in automated tests if it was busted.

String changes made/needed: none

Flags: needinfo?(sphink)
Attachment #9036075 - Flags: approval-mozilla-beta?
Attached patch test deeply recursive weakmaps (deleted) — Splinter Review
Attachment #9036736 - Flags: review?(jcoppeard)

(In reply to Steve Fink [:sfink] [:s:] from comment #17)

Is this code covered by automated tests?: No

Or rather, "sorta". This code is executed by many different automated tests. The bug in question is not exercised by any of them, but will be once the test I just attached lands.

Attachment #9036736 - Flags: review?(jcoppeard) → review+

I should run the test once before landing. Was missing the jstests-required reportCompare call.

Flags: needinfo?(sphink)

Comment on attachment 9036075 [details] [diff] [review]
mark implicit edges via mark stack instead of eagerly

[Triage Comment]
Fixes an OOM crash seen in the wild. Thanks for landing a test too. Approved for 65.0b12.

Attachment #9036075 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Depends on: 1520778
Blocks: 1529775
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: