Closed Bug 1390856 Opened 7 years ago Closed 7 years ago

Assertion failure: bool(resultScript) == (decoder.resultCode() == JS::TranscodeResult_Ok), at js/src/vm/HelperThreads.cpp:477 with OOM

Categories

(Core :: JavaScript Engine, defect, P2)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox-esr52 --- unaffected
firefox57 --- wontfix
firefox58 --- fixed
firefox59 --- fixed

People

(Reporter: decoder, Assigned: tcampbell)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, bugmon, testcase, Whiteboard: [jsbugmon:update])

Attachments

(1 file)

The following testcase crashes on mozilla-central revision 564e82f0f289 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-stdcxx-compat --disable-profiling --enable-debug --enable-optimize, run with --fuzzing-safe --thread-count=2 --ion-eager): var lfLogBuffer = ` function evalWithCacheLoadOffThread(code, ctx) { code = code instanceof Object ? code : cacheEntry(code); var incremental = ctx.incremental || false; ctx.global = newGlobal({ cloneSingletons: !incremental }); var ctx_save; if (incremental) oomAfterAllocations(50); else ctx_save = Object.create(ctx, {saveBytecode: { value: true } }); evaluate(code, ctx_save); offThreadDecodeScript(code, ctx); ctx.global.eval(\`runOffThreadDecodedScript()\`); } test = \` \`; evalWithCacheLoadOffThread(test, {}); evalWithCacheLoadOffThread(test, { incremental: true }); `; loadFile(lfLogBuffer); function loadFile(lfVarx) { try { oomTest(function() { eval(lfVarx); }); } catch (lfVare) {} } Backtrace: received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffff5efd700 (LWP 601)] 0x0000000000b872d8 in js::ScriptDecodeTask::parse (this=0x7ffff3a45400, cx=<optimized out>) at js/src/vm/HelperThreads.cpp:477 #0 0x0000000000b872d8 in js::ScriptDecodeTask::parse (this=0x7ffff3a45400, cx=<optimized out>) at js/src/vm/HelperThreads.cpp:477 #1 0x0000000000b9107d in js::HelperThread::handleParseWorkload (this=this@entry=0x7ffff69661c0, locked=...) at js/src/vm/HelperThreads.cpp:1961 #2 0x0000000000b93e2e in js::HelperThread::threadLoop (this=this@entry=0x7ffff69661c0) at js/src/vm/HelperThreads.cpp:2243 #3 0x0000000000b93eb0 in js::HelperThread::ThreadMain (arg=0x7ffff69661c0) at js/src/vm/HelperThreads.cpp:1723 #4 0x0000000000b93f12 in js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::callMain<0ul> (this=0x7ffff691e070) at js/src/threading/Thread.h:234 #5 js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start (aPack=0x7ffff691e070) at js/src/threading/Thread.h:227 #6 0x00007ffff7bc16fa in start_thread (arg=0x7ffff5efd700) at pthread_create.c:333 #7 0x00007ffff6c38b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 rax 0x0 0 rbx 0x7ffff3a45400 140737281020928 rcx 0x7ffff6c28a2d 140737333332525 rdx 0x0 0 rsi 0x7ffff6ef7770 140737336276848 rdi 0x7ffff6ef6540 140737336272192 rbp 0x7ffff5efbf80 140737319518080 rsp 0x7ffff5efbeb0 140737319517872 r8 0x7ffff6ef7770 140737336276848 r9 0x7ffff5efd700 140737319524096 r10 0x58 88 r11 0x7ffff6b9f750 140737332770640 r12 0x7ffff5efbed0 140737319517904 r13 0x7ffff5efbeb0 140737319517872 r14 0x7ffff5efd6f8 140737319524088 r15 0x7ffff5efbfb8 140737319518136 rip 0xb872d8 <js::ScriptDecodeTask::parse(JSContext*)+616> => 0xb872d8 <js::ScriptDecodeTask::parse(JSContext*)+616>: movl $0x0,0x0 0xb872e3 <js::ScriptDecodeTask::parse(JSContext*)+627>: ud2
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result: autoBisect shows this is probably related to the following changeset: The first bad revision is: changeset: https://hg.mozilla.org/mozilla-central/rev/48078fb0fcc2 user: André Bargull date: Fri Jun 02 12:04:31 2017 +0200 summary: Bug 1368963 - Avoid extra calls to GetPropertyKeys() in Object.freeze/seal/preventExtensions. r=jandem This iteration took 277.511 seconds to run.
Andre, is bug 1368963 a likely regressor?
Blocks: 1368963
Flags: needinfo?(andrebargull)
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #2) > Andre, is bug 1368963 a likely regressor? No, I don't think so. It's more likely that due to changed GC allocation patterns bug 1368963 was picked as the regressor (cf. bug 1394523, comment #4). Maybe allocation failures in js::XDRScript [1] need to be checked instead? I've only read, but didn't debug the code, so the following explanation could be erroneous! If js::XDRScript() returns |false| because of allocation failures, XDRState<mode>::codeScript() calls postProcessContextErrors() [2], but postProcessContextErrors() doesn't update XDRState::resultCode_ when called on helper threads [3], |scriptp| is still set to nullptr in [4]. So when we return to ScriptDecodeTask::parse() [5], |resultScript| is nullptr, but |decoder.resultCode()| is still at its initial value JS::TranscodeResult_Ok [6]. That could explain why the assertion in [5] fails. [1] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/jsscript.cpp#311 [2] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/vm/Xdr.cpp#178 [3] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/vm/Xdr.cpp#34 [4] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/vm/Xdr.cpp#179 [5] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/vm/HelperThreads.cpp#549 [6] http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/js/src/vm/Xdr.h#187
Flags: needinfo?(andrebargull)
Priority: -- → P2
I should look at this in relation to XDR crash investigation.
Flags: needinfo?(tcampbell)
When we OOM during XDR in a helper thread, we still return TranscodeResult_Ok. This will eventually end up in a nullptr crash or debug assert. Putting together a patch now for this and potential related problems.
Assignee: nobody → tcampbell
Status: NEW → ASSIGNED
Flags: needinfo?(tcampbell)
There are a number (35 code locations) of places where when we fail, we bubble up 'return false' without setting codeResult_. When decoding happens on a helper thread, we end up returning TranscodeResult_Ok with data not fully initialized. I think under these cases we would expect memory corruption. There are also related reasons why this would be a problem on main thread encoding (particularly incremental mode). Based on telemetry, we see these errors in the wild https://mzl.la/2yXuN6N . I don't think this explains all of the XDR crashing we see, but I think it does address real problems, so we should get it uplifted to Beta and see how the crash rates look.
(In reply to Ted Campbell [:tcampbell] from comment #7) > There are also related reasons why this would be a problem on > main thread encoding (particularly incremental mode). Based on telemetry, we > see these errors in the wild https://mzl.la/2yXuN6N . Currently we do not distinguish between asm.js failures and other failures.
Comment on attachment 8929895 [details] Bug 1390856 - Fix XDR helper thread error handling. https://reviewboard.mozilla.org/r/201066/#review206392 Nice catch!
Attachment #8929895 - Flags: review?(nicolas.b.pierron) → review+
Fixed test case to check for oomTest + Helper Threads.
Pushed by tcampbell@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3e75d4b94737 Fix XDR helper thread error handling. r=nbp
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision 0bbed2e6a4cc).
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:update]
Comment on attachment 8929895 [details] Bug 1390856 - Fix XDR helper thread error handling. Approval Request Comment [Feature/Bug causing the regression]: Unknown [User impact if declined]: This bug fixes problems that could lead to serious stability bugs and crashes. It is unclear how often we hit this state, but it may help reduce a number of difficult crash signatures. [Is this code covered by automated tests?]: Yes [Has the fix been verified in Nightly?]: Automated tests pass in nightly. [Needs manual test from QE? If yes, steps to reproduce]: No [List of other uplifts needed for the feature/fix]: This patch applies cleanly to beta tree. [Is the change risky?]: Low [Why is the change risky/not risky?]: This catches a number of cases where an XDR operation fails but we still think it succeeded. These operations already handle an error code so I expect risk is low. [String changes made/needed]: None
Attachment #8929895 - Flags: approval-mozilla-beta?
Comment on attachment 8929895 [details] Bug 1390856 - Fix XDR helper thread error handling. Fix an assertion error in javascript. Beta58+.
Attachment #8929895 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Just to clarify, in release without assertions Bad Things will happen.
(In reply to Ted Campbell [:tcampbell] from comment #15) > [Is this code covered by automated tests?]: > Yes > [Needs manual test from QE? If yes, steps to reproduce]: > No Given that this issue is already cover by automation and manual testing is not required, marking as qe-verify-.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: