Open Bug 1727043 Opened 3 years ago Updated 3 years ago

Add profiling/debug no-op markers to generated JS and/or wasm code

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3)

People

(Reporter: jseward, Unassigned)

References

(Blocks 2 open bugs)

Details

Currently, debugging and low-level profiling of code generated from wasm and JS
suffer from the following problems:

  1. If the generated code crashes, it can be very difficult to map from the
    crash point disassembly back to IONFLAGS=codegen output. If that were
    easier, then in conjunction with the machinery in bug 1725587, it would be
    relatively easy to identify which part of SM's compilation suite created
    the failing code.

  2. When hotblock profiling wasm and/or JS apps running in the browser:

     2a. It is difficult to know which blocks originate from JS, which from
         wasm, and which from other JITs that we have no control over, for
         example libLLVM jitting shader code etc. into the process.

     2b. For blocks that do originate from (e.g.) wasm, it would be very
         helpful to know both the module and function index that the block
         originates from.

It is easy enough to insert no-op markers in the generated code, holding at
least 16 bits. For arm64 we can use

 add xN, xN, #imm12 ; sub xN, xN, #imm12

The imm12 carries 12 bits, and N ranges over 0 .. 15 (and, really, a
bit further) to carry the other 4 bits. For Intel (x86-64) it's easier:

  leaq uimm31(%rax,,), %rax ; leaq -uimm31(%rax,,), %rax
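
To make the two marker shapes above concrete, here is a minimal sketch that emits their raw bytes. The function names and the 16-bit payload split (top 4 bits select the register, low 12 bits fill imm12) are assumptions for illustration only, not SpiderMonkey code. Neither add/sub (without the S suffix) nor lea touches the condition flags, which is why each pair behaves as a no-op.

```python
import struct

ADD_X_IMM = 0x91000000  # A64: add xD, xN, #imm12 (64-bit, unshifted)
SUB_X_IMM = 0xD1000000  # A64: sub xD, xN, #imm12 (64-bit, unshifted)
LEA_RAX_DISP32 = b"\x48\x8d\x80"  # x86-64: REX.W, LEA, ModRM disp32(%rax) -> %rax


def arm64_marker(payload: int) -> bytes:
    """Emit add xN, xN, #imm12 ; sub xN, xN, #imm12 for a 16-bit payload."""
    if not 0 <= payload <= 0xFFFF:
        raise ValueError("payload must fit in 16 bits")
    reg = payload >> 12      # top 4 bits select x0..x15 (assumed split)
    imm12 = payload & 0xFFF  # low 12 bits go into the immediate field
    operands = (imm12 << 10) | (reg << 5) | reg  # imm12, Rn, Rd
    return struct.pack("<II", ADD_X_IMM | operands, SUB_X_IMM | operands)


def arm64_marker_payload(code: bytes):
    """Return the payload if code starts with a marker pair, else None."""
    if len(code) < 8:
        return None
    add, sub = struct.unpack_from("<II", code)
    operands = add & 0x3FFFFF  # imm12, Rn, Rd occupy bits 0..21
    rd, rn = operands & 0x1F, (operands >> 5) & 0x1F
    if add != ADD_X_IMM | operands or sub != SUB_X_IMM | operands:
        return None
    if rd != rn or rd > 15:
        return None
    return (rd << 12) | (operands >> 10)


def x64_marker(payload: int) -> bytes:
    """Emit leaq payload(%rax), %rax ; leaq -payload(%rax), %rax."""
    if not 0 <= payload < (1 << 31):
        raise ValueError("payload must fit in 31 bits")
    return (LEA_RAX_DISP32 + struct.pack("<i", payload)
            + LEA_RAX_DISP32 + struct.pack("<i", -payload))
```

A scanner built this way can still see false positives if ordinary generated code happens to contain a matching add/sub or lea pair, so a real design would need to consider how likely that is.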

The more open questions are:

  1. How to divide up the very limited encoding space (16 bits) to give adequate
    resolution on JS/wasm?

  2. How to enable it. For shell builds, an env var would be adequate. For
    in-browser profiling use-cases, could we enable it with a pref? Or would
    that constitute a security loophole of some kind?
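
As one purely illustrative answer to question 1, the sketch below packs an origin tag, a module number and a function index into 16 bits. The field widths (2 + 4 + 10 bits) and the names `Origin`, `pack_marker` and `unpack_marker` are hypothetical. Nothing in this bug settles on a layout, and truncating the indices this way is lossy for large modules.

```python
from enum import IntEnum


class Origin(IntEnum):
    """Hypothetical origin tags for generated code."""
    JS = 0
    WASM = 1


MODULE_BITS = 4   # assumed: low 4 bits of a module number
FUNC_BITS = 10    # assumed: low 10 bits of a function index


def pack_marker(origin: Origin, module: int, func: int) -> int:
    """Pack (origin, module, func) into 16 bits, truncating the indices."""
    module &= (1 << MODULE_BITS) - 1
    func &= (1 << FUNC_BITS) - 1
    return (int(origin) << (MODULE_BITS + FUNC_BITS)) | (module << FUNC_BITS) | func


def unpack_marker(value: int):
    """Split a 16-bit marker back into (origin, module, func)."""
    func = value & ((1 << FUNC_BITS) - 1)
    module = (value >> FUNC_BITS) & ((1 << MODULE_BITS) - 1)
    origin = Origin(value >> (MODULE_BITS + FUNC_BITS))
    return origin, module, func
```

With these widths, two functions whose indices differ by a multiple of 1024 map to the same marker, which shows how tight the 16-bit budget is.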

I guess it becomes a matter of cost and whether it perturbs the hot-block analysis, but unconditionally branching four bytes forward across a four-byte identifier would also work.
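
A minimal sketch of the branch-over-identifier shape, with hypothetical function names: on arm64 a `b .+8` skips the following 4-byte word, and on x86 a short `jmp` with displacement 4 skips the four bytes after it.

```python
import struct

ARM64_B_PLUS_8 = 0x14000002   # A64: b .+8 (imm26 = 2 instructions forward)
X86_JMP_SHORT_4 = b"\xeb\x04"  # x86: jmp short, 4 bytes past the next instruction


def arm64_skip_marker(ident: int) -> bytes:
    """Emit b .+8 followed by a raw 32-bit identifier word."""
    return struct.pack("<II", ARM64_B_PLUS_8, ident & 0xFFFFFFFF)


def x86_skip_marker(ident: int) -> bytes:
    """Emit jmp short +4 followed by a raw 32-bit identifier."""
    return X86_JMP_SHORT_4 + struct.pack("<I", ident & 0xFFFFFFFF)
```

The identifier bytes are data sitting in the instruction stream, so a disassembler will show them as whatever instruction they happen to decode to.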

On ARM64, adrp can hold a 20-bit immediate, and you can target a scratch register, so there's no need to undo the op if you believe we can recognize the adrp.
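
The adrp variant could be sketched as follows. The choice of x16 as the scratch register and the function names are assumptions. The payload is limited to 20 bits, as in the comment above, which leaves the top (sign) bit of the instruction's immediate field clear.

```python
import struct

ADRP = 0x90000000      # A64: adrp xD, <page>; immlo in bits 30:29, immhi in bits 23:5
SCRATCH_REG = 16       # assumed scratch register (x16)
OPCODE_MASK = 0x9F00001F  # fixed opcode bits plus the Rd field


def adrp_marker(payload: int, rd: int = SCRATCH_REG) -> bytes:
    """Emit adrp x<rd> with a 20-bit payload in the immediate field."""
    if not 0 <= payload < (1 << 20):
        raise ValueError("payload must fit in 20 bits")
    immlo = payload & 0x3
    immhi = payload >> 2
    return struct.pack("<I", ADRP | (immlo << 29) | (immhi << 5) | rd)


def adrp_marker_payload(code: bytes, rd: int = SCRATCH_REG):
    """Return the payload if code starts with adrp x<rd>, else None."""
    if len(code) < 4:
        return None
    (word,) = struct.unpack_from("<I", code)
    if word & OPCODE_MASK != ADRP | rd:
        return None
    payload = (((word >> 5) & 0x7FFFF) << 2) | ((word >> 29) & 0x3)
    return payload if payload < (1 << 20) else None
```

Recognising the marker then depends on the scratch register never being the target of a genuine adrp in generated code, which is the "if you believe we can recognize the adrp" condition.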

On x86, a multi-byte NOP can carry a payload of up to four bytes, and likewise needs no undo.
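
For the x86 case, a minimal sketch using the 7-byte form of the multi-byte NOP (`0F 1F /0` with a 32-bit displacement) might look like this. The function names are hypothetical, and the payload goes in the displacement bytes, which the CPU ignores.

```python
import struct

NOP_DISP32 = b"\x0f\x1f\x80"  # x86: nopl disp32(%rax), 7 bytes with the displacement


def nop_marker(payload: int) -> bytes:
    """Emit a 7-byte NOP whose displacement holds a 32-bit payload."""
    if not 0 <= payload <= 0xFFFFFFFF:
        raise ValueError("payload must fit in 32 bits")
    return NOP_DISP32 + struct.pack("<I", payload)


def nop_marker_payload(code: bytes):
    """Return the payload if code starts with the marker NOP, else None."""
    if len(code) < 7 or code[:3] != NOP_DISP32:
        return None
    return struct.unpack_from("<I", code, 3)[0]
```

Assemblers emit this NOP form with a zero displacement for alignment padding, so a scanner would probably want to treat a zero payload as "not a marker".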

Env vars work for the browser too and are less fiddly; we could restrict to Nightly-only. This stuff has to have no security impact in any case.

Generally, in the past, we have been reluctant to land debugging functionality of this kind, with the justification that it tends to bitrot and is not useful enough to maintain - cheaper to recreate when it is needed. We might have a discussion around the principles of that.

Severity: -- → N/A
Priority: -- → P3