Closed Bug 977805 Opened 11 years ago Closed 9 years ago

Add an option to mark JIT pages as non-writable

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: cpeterson, Assigned: jandem)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [js:p2])

Attachments

(2 files, 3 obsolete files)

billm says: "I remember that in JaegerMonkey we had code to mark JIT pages non-writable. I don't think we ever enabled it in release builds because it was pretty expensive. Should we consider write protection for nightly and aurora just to see if we can track down this sort of memory corruption? I don't know how hard this would be. I remember that the main difficulty in JM was with ICs. However, don't we already do some memory protection of JIT pages for the operation callback? Could we reuse some of that code? " jandem says: "Yes that may be a good idea. We could use ExecutableAllocator::toggleAllCodeAsAccessible for this. JIT code is patched intentionally when we add new IC stubs, toggle incremental GC barriers, invalidate Ion code etc, we'd have to make sure all that still works. Maybe with some kind of AutoUnlockJitCode class. I don't know how much this would hurt performance though; ideally we'd only unlock a single page when we patch IC stubs."
Even if this is slow, we should at least add this behind a preference such as fuzzers default to this mode, such as we can safely guarantee that no code is patched in a scope where it is not supposed to be.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Initial results show a 3-5% regression on Sunspider on OS X. SS is kind of worst-case of course because it runs for such a short time. We could probably optimize that a bit, but I'll try this on Windows first. If we get similar results on Windows we should consider enabling it by default, with some more work. If Windows is slower than OS X that's not going to happen.
Attached patch WIP (obsolete) (deleted) — Splinter Review
On Windows I see about a 5-7% regression on SS. In the shell it's less for some reason, no idea why. With this patch, JIT code is either read-write or read-execute, no longer RWX. That's nice for security but unfortunately it hurts perf too much. Maybe we could add a separate ExecutableAllocator for IC stubs that still has RWX permissions, and do this toggle-RW/RX thing for script code.
What we had in JM was dual mappings: RW for patching RX for running The point being that SELinux prevents you from mapping a page both W and X and by having the writing be separate from the executable you add a significant extra barrier for an attacker. It was eventually WONTFIXed because of patch obsolescence, see bug 506693.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4) > What we had in JM was dual mappings: > > RW for patching > RX for running > > The point being that SELinux prevents you from mapping a page both W and X > and by having the writing be separate from the executable you add a > significant extra barrier for an attacker. Yeah, that's what my WIP patch does too. Unfortunately switching from RW to RX and back is slow, I'll see if we can optimize this a bit. It's possible mprotect is faster on Linux, in that case we could make this Linux-only for now. Would still be nice for mobile and fuzzing, but we really want this on Windows too...
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4) > What we had in JM was dual mappings: Maybe I misunderstood, I'll look at the patch there. It seems a lot more complicated and I wonder if that will work on Windows...
Remapping memory can be expensive, yes. The point in that patch was that we don't regularly remap memory: just once when we allocate it we map it twice at separate addresses: once for writing and once for executing. On Windows that requires CreateFileMapping with an anonymous mapping and it works reasonably well.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #7) > Remapping memory can be expensive, yes. The point in that patch was that we > don't regularly remap memory: just once when we allocate it we map it twice > at separate addresses: once for writing and once for executing. I'm a bit scared by the extra complexity it will add everywhere (265K patch in bug 506693). Furthermore it's not clear it really improves security (bug 506693 comment 12 to 16). I'll think about it a bit more. Flipping RW/RX for everything other than Ion IC stubs could still be very effective and avoid most of the perf cost, we should measure.
I have no useful insight into the actual security benefits involved; I just wanted to point out the prior discussion here. I naively assumed that IP-relative addressing makes it a lot easier to construct an exploit by writing the executable page than if you had to construct an exploit on multiple address ranges.
I would be concerned about address space fragmentation from additional mappings, especially if they are at randomized addresses like the current mappings. Out-of-address-space is a frequent problem for Windows users. For some time I've been meaning to investigate the fragmentation impact from the existing randomization, but I haven't gotten around to it.
Assignee: jdemooij → nobody
Status: ASSIGNED → NEW
Attached patch Rebased WIP (obsolete) (deleted) — Splinter Review
I rebased this patch on top of my iOS patch queue which is on top of m-c rev b2e71f32548f. It builds but I'm still crashing under CodeGenerator::link despite that function having an AutoWritableJitCode in it. jandem tells me that AWJC is not nestable, so my best guess is that something else in ::link is using it and resetting the jit code to non-writable and then something else tries to write to it and breaks. The crashing stack looks like: Thread 1Queue : com.apple.main-thread (serial) #0 0x03e95fa2 in js::jit::Instruction::Instruction(unsigned int, js::jit::Assembler::Condition) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.h:1834 #1 0x03e96038 in js::jit::InstBranchImm::InstBranchImm(js::jit::InstBranchImm::BranchTag, js::jit::BOffImm, js::jit::Assembler::Condition) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.h:1960 #2 0x0438ecb6 in js::jit::InstBImm::InstBImm(js::jit::BOffImm, js::jit::Assembler::Condition) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.h:1991 #3 0x04364826 in js::jit::InstBImm::InstBImm(js::jit::BOffImm, js::jit::Assembler::Condition) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.h:1991 #4 0x04333a20 in js::jit::Assembler::RetargetNearBranch(js::jit::Instruction*, int, js::jit::Assembler::Condition, bool) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.cpp:2650 #5 0x04333908 in js::jit::PatchJump(js::jit::CodeLocationJump&, js::jit::CodeLocationLabel) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.cpp:546 #6 0x041b2a78 in js::jit::PatchBackedge(js::jit::CodeLocationJump&, js::jit::CodeLocationLabel, js::jit::JitRuntime::BackedgeTarget) at /Users/luser/build/mozilla-central/js/src/jit/arm/Assembler-arm.h:1054 #7 0x041b46b4 in js::jit::IonScript::copyPatchableBackedges(JSContext*, js::jit::JitCode*, js::jit::PatchableBackedgeInfo*, js::jit::MacroAssembler&) at /Users/luser/build/mozilla-central/js/src/jit/Ion.cpp:952 #8 0x041484b0 in js::jit::CodeGenerator::link(JSContext*, js::CompilerConstraintList*) at /Users/luser/build/mozilla-central/js/src/jit/CodeGenerator.cpp:7675 #9 0x041ba28a in js::jit::AttachFinishedCompilations(JSContext*) at /Users/luser/build/mozilla-central/js/src/jit/Ion.cpp:1704 #10 0x03f98660 in InvokeInterruptCallback(JSContext*) at /Users/luser/build/mozilla-central/js/src/vm/Runtime.cpp:536 #11 0x03f98572 in JSRuntime::handleInterrupt(JSContext*) at /Users/luser/build/mozilla-central/js/src/vm/Runtime.cpp:635 #12 0x04329f98 in js::jit::CheckOverRecursedWithExtra(JSContext*, js::jit::BaselineFrame*, unsigned int, unsigned int) at /Users/luser/build/mozilla-central/js/src/jit/VMFunctions.cpp:154
Attached patch Rebased WIP v2 (obsolete) (deleted) — Splinter Review
This patch applies to inbound rev 4dae2d1f59ff and passes jit-tests with various JIT flags on x86 and with the ARM simulator.
Attachment #8392245 - Attachment is obsolete: true
Attachment #8581817 - Attachment is obsolete: true
I need this for my iOS port, jandem, we talked on IRC about making this a runtime option that's on-by-default for iOS and off elsewhere (but toggleable for fuzz testing etc). Do you have time to make that change so we can get this landed?
Flags: needinfo?(jdemooij)
Summary: Investigate performance impact of marking JIT pages as non-writable in release builds → Add an option to mark JIT pages as non-writable
Here's my most recent version of your patch: http://hg.mozilla.org/users/tmielczarek_mozilla.com/gecko-ios/rev/a2087920417d I think the only significant change from what you posted in comment 12 is that I fixed the asm.js allocator as well.
Hm I'm working on this but running into some asm.js issues. I think we do a single allocation for the code + globals. So when we store to a global from asm.js code, we'll crash because the global is also rx. Luke, I see two options: (1) we add padding after the code, so that we can toggle the page permissions for code and data separately or (2) we use separate allocations for the code and the other data. Not sure how hard that is. Benjamin also mentioned the FFI and pointer tables we store there; do these need rw or rx? For now maybe we should disable asm.js when this option is used (it's necessary on iOS). A bit unfortunate though...
Flags: needinfo?(luke)
(In reply to Jan de Mooij [:jandem] from comment #15) > I see two options: (1) we add padding after the code, so that we can > toggle the page permissions for code and data separately Hm initially I didn't like this, but maybe it's the way to go for now. It may waste up to a page of memory per AsmJSModule, but on real-world, huge asm.js code that shouldn't be noticeable probably. Will see if it works...
Good news, allocating everything as rw, rounding codeBytes up to AsmJSPageSize and then toggling only the code to rx seems to work.
Flags: needinfo?(luke)
Makes sense. Just to make sure: this is just changing the existing AlignBytes() call in AsmJSModule::finish (to pass AsmJSPageSize instead of SimdMemoryAlignment) and adding an extra protection call?
(In reply to Luke Wagner [:luke] from comment #19) > Makes sense. Just to make sure: this is just changing the existing > AlignBytes() call in AsmJSModule::finish (to pass AsmJSPageSize instead of > SimdMemoryAlignment) and adding an extra protection call? Exactly. There are also a few places that use an AutoWritableJitCode to temporarily toggle a module to RW, but those are slow paths AFAIK (detachHeap, changeHeap, setProfilingEnabled). We only reprotect on iOS or when the shell flag is used, so there should be no overhead on other platforms anyway. However, once we have this mechanism, we might want to enable it on other platforms because it has some nice benefits (random memory corruption bugs can no longer corrupt JIT code without crashing etc), but we probably don't want to do that on Windows due to the overhead.
Attached patch Patch (deleted) — Splinter Review
Adds a --non-writable-jitcode shell flag. If set, JIT code is either RW or RX but never RWX. On iOS, this option is enabled by default. It's a pretty big patch, but most of it is trivial: adding AutoWritableJitCode instances or moving code around so it's in the scope of one. This patch should also be more efficient than the last one. There are a few cases that aren't very fast, the Ion backedge patching will reprotect for each backedge. As this will only be used on iOS for now, that seems fine; we can optimize this later.
Assignee: nobody → jdemooij
Attachment #8581952 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Flags: needinfo?(jdemooij)
Attachment #8617429 - Flags: review?(luke)
Attachment #8617429 - Flags: feedback?(ted)
Comment on attachment 8617429 [details] [diff] [review] Patch Review of attachment 8617429 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/asmjs/AsmJSModule.cpp @@ +300,5 @@ > + // SIMD-aligned. If we can't use RWX permissions for generated code, > + // we align the code to AsmJSPageSize, so that we can use different > + // protection flags for the code (RX) and data (RW) sections. > + size_t alignment = > + ExecutableAllocator::nonWritableJitCode ? AsmJSPageSize : SimdMemoryAlignment; Honestly, I don't really think it's worth the branch and comment; just always use AsmJSPageSize and replace the entire comment with "Start global data on a new page so JIT code may be given independent protection flags." @@ +973,5 @@ > > AutoFlushICache afc("AsmJSModule::detachHeap"); > setAutoFlushICacheRange(); > > + AutoWritableJitCode awjc(&cx->mainThread(), codeBase(), codeBytes()); Since, iiuc, the AutoFlushICache and AutoWritableJitCode usually (always?) belong in pairs, can you make a new AsmJSModule.cpp-local AutoMutateCode guard that has both as members (and calls setAutoFlushICacheRange()) and use that instead? ::: js/src/jit/ExecutableAllocatorWin.cpp @@ +267,5 @@ > + size += (startPtr - pageStartPtr); > + > + // Round size up > + size += (pageSize - 1); > + size &= ~(pageSize - 1); We have these calculations in a couple places. There's already a RoundUpPow2, I wonder if it'd be worth having a RoundUpToMultipleOfPow2. ::: js/src/jit/Ion.cpp @@ +354,5 @@ > { > PatchableBackedge* patchableBackedge = *iter; > + if (target == BackedgeLoopHeader) { > + PatchBackedge(patchableBackedge->backedge, patchableBackedge->loopHeader, target, > + /* reprotect = */ true); Perhaps have an enum instead of bool option so you don't need the comments? ::: js/src/jit/JitCompartment.h @@ +556,5 @@ > #ifdef XP_WIN > const unsigned WINDOWS_BIG_FRAME_TOUCH_INCREMENT = 4096 - 1; > #endif > > +// If ExecutableAllocator::nonWrableJitCode is |true|, this class will ensure nonWritableJitCode @@ +576,5 @@ > + AutoWritableJitCode(void* addr, size_t size) > + : AutoWritableJitCode(TlsPerThreadData.get(), addr, size) > + {} > + explicit AutoWritableJitCode(JitCode* c) > + : AutoWritableJitCode(&c->runtimeFromMainThread()->mainThread, c->raw(), c->bufferSize()) Oh boy, we can use forwarding constructors! ::: js/src/vm/Runtime.h @@ +548,5 @@ > inline void removeActiveCompilation(); > > + void toggleAutoWritableJitCodeActive(bool b) { > + MOZ_ASSERT(autoWritableJitCodeActive != b, "AutoWritableJitCode should not be nested."); > + autoWritableJitCodeActive = b; Can you also assert we're on the runtime's main thread?
Attachment #8617429 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #22) > Can you also assert we're on the runtime's main thread? Ah interesting. Yesterday I ran into a situation where we used AutoWritableJitCode off-thread during asm.js compilation, so I moved the flag to PerThreadData. I removed a lot of AutoWritableJitCode from asm.js code though, so maybe we can assert that now and move the flag back to JSRuntime. I'll try it. Thanks for the quick review!
For posterity: I ran into some issues on Try, in particular, the signal handler trick we use to interrupt Ion code by patching backedges: (1) It can trigger at any moment so if we have an AutoWritableJitCode on the stack and the signal handler runs and also uses AutoWritableJitCode, we fail the "nested AutoWritableJitCode" assert. This case is pretty tricky to handle correctly. (2) On Windows, the backedge patching runs on the watchdog thread, so our TLS lookup fails. (3) The naive implementation I had has terrible performance because it reprotects twice for each backedge. After discussing with Luke, I've disabled the implicit interrupt checks in Ion if we're using non-writable JIT code. The performance loss from that should be very small, and it avoids a ton of complexity and the (potentially much bigger) perf issue I mentioned.
I ran some benchmarks on OS X, 64 bit, to compare the overhead of non-writable JIT code (this includes the explicit interrupt checks we have to use when that's enabled, see comment 25): Sunspider 1.0.2: js : 138.6 ms js --non-writable-jitcode: 145.8 ms (5.2% slower) js (explicit interrupt checks): 140.7 ms. Kraken 1.1: js : 850.7 ms js --non-writable-jitcode: 890.9 ms (4.7% slower) js (explicit interrupt checks): 884.2 ms. Octane: js : 33796 points js --non-writable-jitcode: 32518 points (3.8% slower) js (explicit interrupt checks): 33398 points A 4-5% overhead on iOS on raw JS benchmarks seems pretty acceptable compared to no JIT at all. On longer-running tight loops, like Kraken has, the overhead is mostly from the explicit interrupt checks. Sunspider is short running, so compilation time and reprotecting are more of an issue compared to the interrupt checks. On Octane it's a combination of these two. I'm sure with some careful profiling we could improve this, but for now it seems fine.
gkw, decoder: if this sticks and lands on mozilla-central, there will be a new --non-writable-jitcode shell flag. With this flag, JIT memory pages never have both write and execute permissions, so we have to toggle when we modify code. This is necessary on iOS because it doesn't allow allocating RWX memory, but in the future we may want to use this on other platforms as well. (It *might* help fuzzing in some cases: random memory corruption bugs can no longer corrupt JIT code without crashing immediately.)
Flags: needinfo?(gary)
Thanks for finishing this up, I owe you one!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
(In reply to Jan de Mooij [:jandem] from comment #20) > We only reprotect on iOS or when the shell flag is used, so there should be > no overhead on other platforms anyway. However, once we have this mechanism, > we might want to enable it on other platforms because it has some nice > benefits (random memory corruption bugs can no longer corrupt JIT code > without crashing etc), but we probably don't want to do that on Windows due > to the overhead. Is address space fragmentation a concern on 64-bit browsers? Can we enable --non-writable-jitcode on Win64 and 64-bit OS X and Linux?
(In reply to Jan de Mooij [:jandem] from comment #27) > gkw, decoder: if this sticks and lands on mozilla-central, there will be a > new --non-writable-jitcode shell flag. Thanks for the headsup. Added support in fuzzing rev cb371545f793.
Flags: needinfo?(gary)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #32) > https://hg.mozilla.org/integration/mozilla-inbound/rev/9fa312e065e1 This was a fix to add a missing #include <TargetConditionals.h> (which is where TARGET_OS_IPHONE comes from). Unfortunately it made some OS X debug tests go orange, because TARGET_OS_IPHONE is always *defined*, it's just set to 0 or 1, so #ifdef is the wrong check for it. (In reply to Ted Mielczarek [:ted.mielczarek] from comment #35) > https://treeherder.mozilla.org/#/jobs?repo=try&revision=9f296d9adfe1 This is that same patch but with the #ifdef changed to an #if. The bustage is concerning though, it'd indicate that there are bugs with this codepath.
There looked to be at least two different failure modes: https://treeherder.mozilla.org/logviewer.html#?job_id=10759674&repo=mozilla-inbound 14:03:12 WARNING - PROCESS-CRASH | dom/smil/test/test_smilCSSFromBy.xhtml | application crashed [@ js::jit::GetPropertyIC::reset()] stack: https://pastebin.mozilla.org/8836779 https://treeherder.mozilla.org/logviewer.html#?job_id=10759075&repo=mozilla-inbound 13:29:55 ERROR - PROCESS-CRASH | test_desktop_all.py TestDesktopUnits.test_units | application crashed [@ js::jit::PatchJump] stack: https://pastebin.mozilla.org/8836780
Although the latter seems to just be a frame under the former, so maybe they're the same failure.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #36) > The bustage is concerning though, it'd indicate that there are bugs with > this codepath. This is a browser-only code path (IC for DOM proxies) that doesn't unprotect. Unfortunately these proxies are not available in the shell. I'll fix and see if that makes the browser happy.
Flags: needinfo?(jdemooij)
That last try push was green, I'm re-landing with r=jandem.
Attached patch Fix IonCache::reset (deleted) — Splinter Review
Here's a small followup patch to fix the browser-only IC issue. This patch changes IonCache::reset to take a ReprotectCode argument. I should have done that initially: explicit seems better than implicit. Try with non-writable JIT code is green now: https://treeherder.mozilla.org/#/jobs?repo=try&revision=acf0911c3bde
Flags: needinfo?(jdemooij)
Attachment #8622429 - Flags: review?(luke)
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Attachment #8622429 - Flags: review?(luke) → review+
Keywords: leave-open
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Comment on attachment 8617429 [details] [diff] [review] Patch Review of attachment 8617429 [details] [diff] [review]: ----------------------------------------------------------------- :)
Attachment #8617429 - Flags: feedback?(ted) → feedback+
Depends on: 1194072
Depends on: 1215479
Depends on: 1260949
No longer depends on: 1260949
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: