Closed
Bug 1211409
Opened 9 years ago
Closed 9 years ago
Make ARM simulator handle load/store exclusive and fences
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla44
Tracking | Status | |
---|---|---|
firefox44 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
jolesen
:
review+
|
Details | Diff | Splinter Review |
Bug 1071024 implemented LDREX, STREX, and DMB in the simulator but in a very low-fidelity way: they do not actually synchronize. But now that we have the C++ abstractions for synchronization (in jit/Atomic-operations.h) we may be able to do something about that, or we may be able to add more C++ abstractions for it. This is actually important: Once we land test cases for multiprocessing in the DOM test suite (bug 1135026) the simulator will need to do something sane for these operations, or the tests will fail for sure. Probably we are only going to run the simulator on x86, so it is good enough to come up with a scheme that works on that architecture. For example, it is probably good enough for DMB to be an MFENCE and for LDREX and STREX to use a LOCK prefix, and then we need to remember the value loaded by LDREX and implement STREX as a CMPXCHG as outlined in bug 1071024. There are some issues with that but for the kind of code we are going to be generating it will be good enough. (Not sure what the ARM64 simulator does yet - I have a memory it does something useful. Should look to that for ideas; or open a separate bug for it if it does not implement LDREX/STREX in an interesting way.)
Assignee | ||
Comment 1•9 years ago
|
||
The ARM64 simulator simulates the local and global monitors (with simulated failures) and does some limited barriering for the acquire/release atomics and in one other case, but does not appear to actually synchronize properly for the purposes of multithreaded computation. It may be sufficient to replace the store code with an atomic compareExchange. But for multithreaded computation we really don't want to use the simulated "failed" monitor access in the ARM64 simulator, we want something closer to what I propose for the 32-bit simulator. The simulated failures may still be desirable; I'm not sure. See VisitLoadStoreExclusive() in jit/arm64/vixl/Simulator-vixl.cpp.
Assignee | ||
Comment 2•9 years ago
|
||
Basically a complete implementation, good enough for x86. This needs a little further work as follows: - Test cases that test mutual exclusion by using evalInWorker and atomics in actual shared memory. - The atomic operations used here (load, compareExchange) should be Relaxed but are currently SeqCst because we have no Relaxed operations in the AtomicOperations suite. I'm not sure if I'm going to worry too much about this; the implementation burden for AtomicOperations is already considerable and we don't need the Relaxed atomics anywhere else ATM. - Clang on 32-bit x86 does not implement 8-byte atomic load and compareExchange. Currently I've commented that code out since we don't generate LDREXD or STREXD yet. I could just make the failure permanent, until we need them. Jakob also points out that the store-exclusive only will fail on x86 if the value in the cell has changed, not if it has been updated to the same value - this in contrast to ARM, where the latter type of store also causes a conflict. That could possibly be fixed by using a sequence number - this is just an ABA problem - but that means we'll need a two-nonadjacent-word CAS operation, which is messy. Also, it is probably overkill for the simulator.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → lhansen
Assignee | ||
Comment 3•9 years ago
|
||
Jakob further points out that this simulation re-introduces the ABA problem to ARM, which they have worked so hard to avoid on that architecture :) Presumably that will break some programs, though it will not break any JS programs and should not be a problem for the browser...
Assignee | ||
Comment 4•9 years ago
|
||
Cleaner code + test case. (I have an idea for how to solve the ABA problem but I'll file that as a followup bug.)
Attachment #8671878 -
Attachment is obsolete: true
Attachment #8671992 -
Flags: review?(jolesen)
Comment 5•9 years ago
|
||
Comment on attachment 8671992 [details] [diff] [review] bug1211409-ldrex-strex.patch Looks good. Is there a corresponding bug for arm64?
Attachment #8671992 -
Flags: review?(jolesen) → review+
Comment 6•9 years ago
|
||
Comment on attachment 8671992 [details] [diff] [review] bug1211409-ldrex-strex.patch Review of attachment 8671992 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit-test/tests/atomics/mutual-exclusion.js @@ +1,2 @@ > +// Let a few threads hammer on memory with atomics to make sure mutual > +// exclusion works properly. Should we add a comment here to the effect that this test /may/ sometimes pass even when there is a bug in the atomics implementation? @@ +84,5 @@ > + } > + return val; > +})(); > + > +console.log("This test takes a little while to run"); How long does it take? ::: js/src/jit/arm/Simulator-arm.cpp @@ +1489,5 @@ > + exclusiveMonitorHeld_ = true; > +} > + > +uint64_t > +Simulator::exclusiveMonitorGetAndClear(bool* held) The name suggests that this function clears the exclusive monitor, but the function body does not seem to do that. @@ +1532,5 @@ > +// operations anywhere else in the system, and the distinction is not > +// important to the simulation at the level where we're operating. > + > +#define loadRelaxed loadSeqCst > +#define compareExchangeRelaxed compareExchangeSeqCst If we do implement the relaxed versions one day, these #defines would silently disable them in this file. Perhaps use inline functions instead? That would cause compiler errors when these function are added to AtomicOperations.h. @@ +1626,5 @@ > + uint16_t value = AtomicOperations::loadRelaxed(ptr); > + exclusiveMonitorSet(value); > + return value; > + } > + printf("Unaligned unsigned halfword read at 0x%08x, pc=%p\n", addr, instr); "Unaligned atomic unsigned halfword... @@ +4319,2 @@ > case 4: // DSB > + // This implementation of DSB is incorrect, but we do not use DSB. Should we assert that the instruction never appears, then? @@ +4322,2 @@ > case 6: // ISB > + // This implementation of ISB is correct only for coherent-I/D-cache systems such as x86. We use ISB after generating jitted code, but I wonder if it would ever appear in code running in the simulator?
Assignee | ||
Comment 7•9 years ago
|
||
The ARM64 case is bug 1214199. In addition to the weaknesses pointed out in comment 6 (the failure to clear the monitor was embarrassing) we should double check either that the simulation is platform-independent (and I think it is) or that we assert on hardware where it is incorrect, if the atomic opcodes are used.
Assignee | ||
Comment 10•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/315e05e75b350f1aa3b54dd75f8c95678b78b445 Bug 1211409 - fix include order ON CLOSED TREE.
Comment 12•9 years ago
|
||
sorry had to back this out for test failures like https://treeherder.mozilla.org/logviewer.html#?job_id=15609395&repo=mozilla-inbound
Flags: needinfo?(lhansen)
Comment 13•9 years ago
|
||
Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/183557d89bec https://hg.mozilla.org/integration/mozilla-inbound/rev/384e33937922
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(lhansen)
Assignee | ||
Comment 14•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7852315f95c8
Assignee | ||
Comment 15•9 years ago
|
||
I think this can land; the only failure in that run is SM-tc (arm) on Linux64 Opt, it looks like an infra issue and is not even built by default, as far as I can tell from treeherder.
Comment 17•9 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #15) > I think this can land; the only failure in that run is SM-tc (arm) on > Linux64 Opt, it looks like an infra issue and is not even built by default, > as far as I can tell from treeherder. Bug 1208118 - SpiderMonkey SM-tc try jobs in TaskCluster failing to find proper compiler
https://hg.mozilla.org/mozilla-central/rev/c58f0903944c
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox44:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
You need to log in
before you can comment on or make changes to this bug.
Description
•