Open Bug 995649 Opened 11 years ago Updated 2 years ago

mprotect the GC heap when we are not running in the JS engine

Categories

(Core :: JavaScript: GC, defect, P3)

defect

Tracking

()

flash10

People

(Reporter: terrence, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

We suspect that some of our consistent low volume crashes may be coming from some buggy code elsewhere in firefox accidentally scribbling over the GC heap. We could find these quite quickly by setting the GC heap PROT_NONE when we are not in a request. This should be pretty easy to implement: loop over the chunk-set and chunk-pool and mprot the chunks in JSEnter/LeaveRequest. It would be rather slow, but should very quickly tell us if this is actually the problem. Moreover, it would point us right at the offending code.
Attached patch mprotect_the_gcheap-v0.diff (deleted) — Splinter Review
This patch works; a little too well, even. It's identified places where we read from the normal heap from off-main-thread. These are all safe accesses to the script -- it is protected from both collection and from execution by the background compilation -- or to constant data on the runtime. Still, we need to fix these before we can implement the technique here.
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Depends on: 998129
Note that I'm not actually working on this. There are way too many places where we have ridiculous special cases that allow access outside a request. I think it would be a nice boon to have this: it would probably have caught several cases where Ion's OMT compile thread was reading and writing from the main heap concurrent with GC. On the other hand, TSan also caught this case immediately and without us having to do 6 months of difficult, performance sensitive rewriting and debugging. So I guess we could do this, but it's a super, super low priority.
Status: ASSIGNED → NEW
Assignee: terrence → nobody
Summary: mprotect the GC heap when we are out of a JSRequest → mprotect the GC heap when we are not running in the JS engine
This could be useful to us to track down some of our crashes. Continually protecting/unprotecting the whole heap will be slow, so we could unprotect pages on demand if we detect an access to them in the fault handler. Also, we might want to only write-protect pages. Note that this can only detect access to GC things themselves, not associated malloc memory.
If we do this, we should register each protected region with MemoryProtectionExceptionHandler so attempts to access them are annotated in crash stats. Unprotecting on demand sounds nice, but how would the fault handler differentiate between legitimate accesses and corruption?
With shared memory there can be concurrent faults from multiple threads, so probably a special case?
Attached patch bug995649-mprotect-gc-heap (deleted) — Splinter Review
Initial attempt at rebasing the patch. The code to protect empty (unused) chunks is commented out. 10% of test fail.
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5) > Unprotecting on demand sounds nice, but how would the fault handler > differentiate between legitimate accesses and corruption? We'd store a flag somewhere to say if access was allowed, i.e. we're running inside the engine.
Hmm, I guess it is mostly outside influences we're worried about. We could store a list of pages that were unprotected while the flag was set, and reprotect them when we clear it.
It seems there is a new feature for Intel CPUs called "memory protection keys" that could be very useful here: https://lwn.net/Articles/643797/ AIUI with this we wouldn't have to mprotect every time we enter or leave the JS engine, only change the permissions for the current thread.
Whoa, that's pretty cool. Seems useful for Spectre mitigations too. Looks like it's still described as only available in "future Skylake server CPUs."

(In reply to Jon Coppeard (:jonco) from comment #4)

This could be useful to us to track down some of our crashes.

Continually protecting/unprotecting the whole heap will be slow, so we could
unprotect pages on demand if we detect an access to them in the fault
handler. Also, we might want to only write-protect pages.

It could be done on some builds only to catch bugs, and enabled for short periods if we like. But that won't help protect against exploits.

Point is, it's useful to have this capability even if we don't use it all that much.

Priority: -- → P3
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: