Speed up call_indirect with a dual-path strategy
Categories
(Core :: JavaScript: WebAssembly, enhancement, P2)
Tracking
Release | Tracking | Status |
---|---|---|
firefox99 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
(Blocks 1 open bug)
Attachments
(2 files)
(deleted), text/x-phabricator-request
(deleted), patch
call_indirect can be sped up by testing at run time whether the callee's tls is the same as the caller's tls; if so, no context switch is needed. The slow-path code can be placed out of line (OOL) or left inline.
Comment 1 (Assignee) • 3 years ago
This changes MacroAssembler::wasmCallIndirect to implement dual-path
call code for call_indirect: if the caller's tls equals the callee's
tls, no context switch will be needed and a fast call can be used,
otherwise a slow call with a context switch must be used. This speeds
up call_indirect significantly in the vast majority of cases at a
small cost in code size.
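
The dual-path idea can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the MacroAssembler code: `Instance`, `Machine`, `TableEntry`, and `callIndirect` are hypothetical names, and a "context switch" is modelled as nothing more than swapping the current tls pointer and bumping a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-instance state (the "tls" in this bug).
struct Instance {
    int id;
};

// Simulated machine state: the current instance, plus a counter that makes
// the cost of the slow path observable.
struct Machine {
    Instance* currentTls = nullptr;
    int contextSwitches = 0;
};

using Code = int (*)(Machine&, int);

// One function-table entry: the callee's code and the instance it belongs to.
struct TableEntry {
    Instance* tls;
    Code code;
};

int callIndirect(Machine& m, const std::vector<TableEntry>& table,
                 std::size_t index, int arg) {
    assert(index < table.size());  // a real engine would trap instead
    const TableEntry& callee = table[index];

    if (callee.tls == m.currentTls) {
        // Fast path: same instance, so call directly with no state change.
        return callee.code(m, arg);
    }

    // Slow path: switch to the callee's instance, call, then switch back.
    Instance* saved = m.currentTls;
    m.currentTls = callee.tls;
    ++m.contextSwitches;
    int result = callee.code(m, arg);
    m.currentTls = saved;
    ++m.contextSwitches;
    return result;
}

// Example callee: its result depends on the instance it runs in.
int addInstanceId(Machine& m, int arg) { return arg + m.currentTls->id; }
```

In this model a call through an entry owned by the caller's own instance leaves `contextSwitches` untouched, while a call into another instance costs two switches (one in, one back out).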
As a result of this, wasmCallIndirect has two call instructions and
therefore two safepoints, and this complication bubbles up to the
baseline compiler, the code generator, and lowering. The main issue is
that a LIR node only has one safepoint, so we must generate a second,
synthetic LIR node for the second safepoint.
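
The one-safepoint-per-node constraint can be sketched as follows. All names here (`Safepoint`, `LirNode`, `lowerCallIndirect`, `recordCallSafepoints`, and the op strings) are hypothetical and are not the types used in the patch. Which call's safepoint lands on which node is also an arbitrary choice made for the illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical safepoint record: the code offset just after a call.
struct Safepoint {
    bool assigned = false;
    std::size_t returnOffset = 0;
};

// Hypothetical LIR node. As in the constraint described above, a node
// carries exactly one safepoint.
struct LirNode {
    std::string op;
    bool synthetic;
    Safepoint safepoint;
};

// Lowering: emit the real call node plus a synthetic companion whose only
// job is to carry the safepoint for the second call instruction.
std::vector<LirNode> lowerCallIndirect() {
    return {LirNode{"call_indirect", false, Safepoint{}},
            LirNode{"call_indirect_second_safepoint", true, Safepoint{}}};
}

// Code generation: after emitting both calls, record each return offset in
// its own node's safepoint.
void recordCallSafepoints(std::vector<LirNode>& nodes,
                          std::size_t fastReturn, std::size_t slowReturn) {
    nodes.at(0).safepoint = Safepoint{true, fastReturn};
    nodes.at(1).safepoint = Safepoint{true, slowReturn};
}
```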
Drive-by fix: the InterModule attribute in the baseline compiler is
not really about whether a call is inter-module, but about whether the
register state and realm must be restored after a call. The change to
call_indirect exposes this incorrectness: such calls may be
inter-module, but the compiler never needs to restore the register
state or the realm - the macroassembler does this, as needed, on the
slow path.
Drive-by fix: minor cleanup of the emitted code, notably, better
pointer scaling on ARM64.
Drive-by fix: remove some redundant parameters in lowering to reduce
confusion about whether a MIR node is updated for some LIR operations.
Comment 2 (Assignee) • 3 years ago
An older patch that moved the slow path for call_indirect out of line and let the fast path fall through. It will not apply to the code in its current form, but we may want it later. I'm not going to attempt this now, though: I prefer to work on tail calls first, and exception handling will be a little tricky here, because the exception region around the call_indirect will be split into two code ranges if one of the calls is moved out of line.
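
To make the exception-handling concern concrete, here is a minimal sketch of a handler lookup where one try region must cover two disjoint code ranges (the inline code and an out-of-line slow path). `CodeRange`, `TryRegion`, `findHandler`, and `kNoHandler` are hypothetical names for illustration and do not reflect how the engine actually stores try notes.

```cpp
#include <cstddef>
#include <vector>

// Half-open range of code offsets: [begin, end).
struct CodeRange {
    std::size_t begin;
    std::size_t end;
    bool contains(std::size_t pc) const { return begin <= pc && pc < end; }
};

// Hypothetical try-region record: one handler, possibly several ranges.
struct TryRegion {
    std::vector<CodeRange> ranges;
    std::size_t handlerOffset;
};

// Sentinel returned when no region covers the given pc.
constexpr std::size_t kNoHandler = static_cast<std::size_t>(-1);

// Returns the handler offset for `pc`, or kNoHandler if nothing covers it.
std::size_t findHandler(const std::vector<TryRegion>& regions, std::size_t pc) {
    for (const TryRegion& region : regions) {
        for (const CodeRange& range : region.ranges) {
            if (range.contains(pc)) {
                return region.handlerOffset;
            }
        }
    }
    return kNoHandler;
}
```

With a single contiguous range, a return address inside the out-of-line slow call would not be found; the region has to list both ranges for the lookup to succeed.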
Updated • 3 years ago
Comment 4 (bugherder) • 3 years ago