Closed
Bug 1430743
Opened 7 years ago
Closed 7 years ago
Arm64 / aarch64 JS baseline and wasm baseline jit-test debug/ failures
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox60 | --- | fix-optional |
People
(Reporter: lth, Assigned: lth)
References
Details
On SoftIron Overdrive 1000 (4 x Cortex-57A) compiled with gcc6, run with --baseline-eager:
--enable-debug --disable-optimize:
FAILURES:
debug/Frame-onPop-10.js
debug/Frame-onStep-03.js
debug/Frame-onStep-06.js
debug/Frame-onStep-07.js
debug/Frame-onStep-11.js
debug/Frame-onStep-12.js
debug/Frame-onStep-14.js
debug/Frame-onStep-15.js
debug/Frame-onStep-17.js
debug/Frame-onStep-18.js
debug/Frame-onStep-lines-01.js
debug/Frame-onStep-resumption-05.js
debug/Frame-this-10.js
debug/Frame-this-12.js
--baseline-eager --ion-eager debug/RematerializedFrame-retval.js
debug/Script-getLineOffsets-03.js
debug/Script-getLineOffsets-05.js
debug/breakpoint-08.js
debug/breakpoint-11.js
debug/class-05.js
debug/noExecute-04.js
debug/noExecute-05.js
debug/noExecute-06.js
debug/noExecute-07.js
TIMEOUTS:
basic/bug688939.js
basic/bug832203.js
coverage/bug1214548.js
debug/bug-1248162.js
debug/bug-1260725.js
debug/bug1251919.js
debug/bug1254123.js
debug/bug1370905.js
gc/bug-1143706.js
gc/bug-1215678.js
--ion-pgo=on gc/bug-1226896.js
gc/bug-1292564.js
gc/bug-1303015.js
gc/bug-1384047.js
gc/oomInNewGlobal.js
--disable-debug --enable-optimize (note one GC failure here that was not in the debug run):
FAILURES:
debug/Frame-onPop-10.js
debug/Frame-onStep-06.js
debug/Frame-onStep-12.js
debug/Frame-onStep-14.js
debug/Frame-onStep-16.js
debug/Frame-onStep-17.js
debug/Frame-onStep-18.js
debug/Frame-onStep-lines-01.js
debug/Frame-onStep-resumption-01.js
debug/Frame-onStep-resumption-02.js
debug/Frame-this-10.js
debug/Frame-this-12.js
debug/Script-getLineOffsets-03.js
debug/Script-getLineOffsets-05.js
debug/breakpoint-06.js
debug/breakpoint-08.js
debug/breakpoint-resume-01.js
debug/Script-getLineOffsets-06.js
debug/bug1368736.js
debug/class-05.js
debug/noExecute-01.js
debug/noExecute-02.js
debug/noExecute-04.js
debug/noExecute-07.js
debug/noExecute-06.js
debug/noExecute-05.js
gc/bug-1155455.js
Assignee | ||
Comment 1•7 years ago
|
||
These failures only occur on hardware, btw.
Assignee | ||
Updated•7 years ago
|
OS: Unspecified → Linux
Hardware: Other → ARM64
Comment 2•7 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #1)
> These failures only occur on hardware, btw.
I guess this lead to 2 questions:
1. Why is it different?
2. Can we fix the simulator?
(P2, as these failures seems to be restricted to debug test cases)
status-firefox60:
--- → fix-optional
Priority: -- → P2
Assignee | ||
Comment 3•7 years ago
|
||
I'm seeing similar failures now for the wasm debug tests, on hardware, both with and without the baseline jit, so it's not clear the baseline jit is at fault here. I should re-run the tests above with --no-baseline to see if any of them reproduce.
In the case of the wasm tests I'm seeing an onStep handler that's being called a variable number of times; the tests always fail when the number of calls is low (2) and always succeeds when it is higher, but still variable. Since the wasm tests use a different harness I'm not sure how much to read into that, but it's a lead, maybe. Will follow up more next week.
Assignee | ||
Comment 4•7 years ago
|
||
The failures above are definitely a JS baseline jit problem, there are no failures in a debug build with --no-baseline, but many failures with --baseline-eager.
Assignee | ||
Comment 5•7 years ago
|
||
We're not flushing the icache after patching for debugging.
ExecutableAllocator::makeExecutable() must flush the icache, or its caller must - in this case, ~AutoWritableJitCode(). Not sure what's best yet.
This is possibly OS-dependent. On Linux, which I'm running, this problem appears to have been promoted to a feature. A StackOverflow thread [1] observes that some libc variants perform a flush after an mprotect on ARM, and a thread [2] on the linux-arm-kernel list suggests very strongly that icache invalidation should not be done by mprotect except to ensure cache coherence across CPUs; this thread discusses ARM64 specifically and the conclusion appears to be that icache flushing is the application's responsibility.
[1] https://stackoverflow.com/questions/2777725/does-mprotect-flush-the-instruction-cache-on-arm-linux/4991147
[2] Rooted roughly at https://www.spinics.net/lists/arm-kernel/msg506068.html
This will affect us on all platforms other than x86, so we should look for a general solution here, and try to figure out what different OSs need. Presumably Android might differ from stock Linux; iOS and Windows-on-ARM64 will likely do their own thing...
Until now we've probably not had issues with debugging failing on ARM so it would have gone unnoticed for good reason. But the times they are a-changing.
Assignee | ||
Updated•7 years ago
|
Blocks: Rabaldr-ARM64
Summary: Arm64 / aarch64 JS baseline jit-test failures → AutoWritableJitCode must flush icache when reprotecting (was: Arm64 / aarch64 JS baseline jit-test failures)
Assignee | ||
Comment 6•7 years ago
|
||
Not just debug related anymore.
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Priority: P2 → P1
Comment 7•7 years ago
|
||
If we do this in AutoWritableJitCode, we should check if we can remove some explicit icache flushes elsewhere.
Assignee | ||
Comment 8•7 years ago
|
||
More worms under this rock:
Turns out that the JS debug tests fail for a slightly different reason than the Wasm debug tests. In the case of the JS baseline, it looks like missing cache flushing (it uses AutoWritableJitCode which does not flush). In WasmDebug.cpp, however, there's an additional AutoFlushICache object along with each AutoWritableJitCode. But in two cases, that AutoFlushICache object is not given a range to flush, so it flushes nothing.
That's kind of a footgun, is it not? I understand why it is that way, but even so it seems brittle.
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Comment 9•7 years ago
|
||
Final update, now that I've read more code:
The debug-test failures are a consequence of a missing implementation of AutoFlushICache::flush() on ARM64 (evil ifdef, see bug 1439333) and failure to use that function in the ARM64 MacroAssembler and Assembler. When those problems are fixed, debug tests pass for both JS and Wasm. The patches that fix this will appear on bug 1439333, bug 1436953 (updated patch to be reviewed), and bug 1313336 (amended already-reviewed patch). From that point of view, the present bug can be closed.
I'm a little concerned that the name AutoWritableJitCode implies icache flushing but since I don't have anything actionable I don't think I'll follow up further on that now.
Leaving this open until the dependencies have landed.
Assignee | ||
Comment 10•7 years ago
|
||
Fixed because all dependencies are fixed.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•