ARM64 wasm prologue/epilogue should use STP/LDP
Categories
(Core :: JavaScript: WebAssembly, enhancement, P3)
Tracking
()
People
(Reporter: lth, Unassigned)
References
(Blocks 3 open bugs)
Details
The ARM64 wasm prologue is currently "sub sp, 16; store lr; store fp; mov fp, sp". We should be able to replace the two stores with a single stp {lr,fp} and get the same behavior, and we may even be able to use stp's auto-decrement (pre-indexed) addressing to avoid the sub.
Likewise, the epilogue should be able to use ldp.
This will save six instructions per function: each function has two copies of the prologue and we save two instructions in each, plus two more in the epilogue. WasmCheckedTailEntryOffset should be updated accordingly to avoid pointless NOP fill.
Using stp is pretty easy, probably; using the auto-decrement is going to be harder because it means StartUnwinding has to be updated in more radical ways.
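To make the claimed equivalence concrete, here is a toy model of the two prologue shapes and the ldp epilogue. It is only a sketch, not SpiderMonkey code: the register dictionary, the slot layout (fp at [sp], lr at [sp+8]), and the function names are assumptions made for illustration, and the epilogue assumes sp equals fp on entry.

```python
# Toy model of the prologue/epilogue variants discussed above.
# Assumptions (not taken from the bug): fp is stored at [sp], lr at [sp + 8].

def prologue_separate(regs, mem):
    """Models: sub sp, sp, #16; str lr, [sp, #8]; str fp, [sp]; mov fp, sp"""
    regs, mem = dict(regs), dict(mem)
    regs["sp"] -= 16
    mem[regs["sp"] + 8] = regs["lr"]
    mem[regs["sp"]] = regs["fp"]
    regs["fp"] = regs["sp"]
    return regs, mem

def prologue_stp_preindexed(regs, mem):
    """Models: stp fp, lr, [sp, #-16]!; mov fp, sp"""
    regs, mem = dict(regs), dict(mem)
    regs["sp"] -= 16                  # pre-indexed: sp is updated, then used as the base
    mem[regs["sp"]] = regs["fp"]      # first register of the pair goes to the low slot
    mem[regs["sp"] + 8] = regs["lr"]  # second register goes to the high slot
    regs["fp"] = regs["sp"]
    return regs, mem

def epilogue_ldp_postindexed(regs, mem):
    """Models: ldp fp, lr, [sp], #16"""
    regs = dict(regs)
    regs["fp"] = mem[regs["sp"]]
    regs["lr"] = mem[regs["sp"] + 8]
    regs["sp"] += 16                  # post-indexed: base is used, then sp is updated
    return regs

# Savings as stated in the description: two prologue copies saving two
# instructions each, plus two saved in the epilogue.
SAVED_PER_FUNCTION = 2 * 2 + 2

start = {"sp": 0x1000, "fp": 0x2000, "lr": 0x4000}
old_state = prologue_separate(start, {})
new_state = prologue_stp_preindexed(start, {})
restored = epilogue_ldp_postindexed(*new_state)
```

The model compares final states only. The intermediate states differ between the two variants (the pre-indexed stp adjusts sp and stores both registers in one step), which is presumably why StartUnwinding would need larger changes for that form.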
Reporter
Comment 1 • 3 years ago
Even if traps are placed OOL (bug 1680243), the savings won't be enough to get rid of many of the NOPs on ARM64 that are there to align the unchecked call entry on a 16-byte boundary. The callable prologue will be reduced to two instructions, leaving only two for the signature check, and that always has to be enough. It will be enough if the signature is a smallish immediate or a pointer easily loaded from tls (i.e., the offset is small enough). But that's not good enough in general, because sometimes the immediate will be large or the pointer will have to be loaded from a large offset.
So I think the right combination of fixes here is to introduce STP (pre-decrement) and LDP (post-increment) to generally reduce code size without worrying about the checked function entry size; this will reduce function size in general by four words. Then bug 1756792 can reduce the bloat from the checked call entry, maybe.
This is not urgent.
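The instruction budget argument in this comment can be written out as a small calculation. This is a simplified sketch under stated assumptions: the checked entry is taken to occupy exactly one 16-byte unit ahead of the unchecked entry, and the function names are hypothetical, not SpiderMonkey identifiers.

```python
INSN_BYTES = 4     # every ARM64 instruction is 4 bytes wide
ENTRY_BYTES = 16   # assumed size of the checked entry, from the 16-byte alignment above

def signature_check_budget(prologue_insns, entry_bytes=ENTRY_BYTES):
    """Instructions left for the signature check after the callable prologue."""
    return entry_bytes // INSN_BYTES - prologue_insns

def nop_fill(prologue_insns, check_insns, entry_bytes=ENTRY_BYTES):
    """NOPs needed to pad the checked entry out to entry_bytes."""
    budget = signature_check_budget(prologue_insns, entry_bytes)
    if check_insns > budget:
        # The case the comment worries about: a large immediate or a large
        # offset makes the check longer than the space that is left.
        raise ValueError("signature check does not fit in the checked entry")
    return budget - check_insns

budget_with_stp = signature_check_budget(2)  # two-instruction prologue
padding_small_check = nop_fill(2, 1)         # a one-instruction check leaves one NOP
```

With a two-instruction prologue the budget is two instructions, so a check that needs three would not fit in this model; that is the reason the comment treats the checked entry size as a separate problem.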