Closed Bug 1259295 Opened 9 years ago Closed 9 years ago

wasm: postorder

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: sunfish, Assigned: sunfish)

References

Details

Attachments

(2 files, 19 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
luke
: review+
Details | Diff | Splinter Review
The following patch begins an experiment with postorder decoding for wasm.
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
It's still quite early; lots of stuff is missing, and in need of more tidying, but it's enough to show some basic ideas for an iterator interface, such as unifying validating and non-validating decoding code, and extracting the Decoder out of FunctionCompiler (and eventually what is currently called FunctionDecoder). Comments welcome!
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This updated patch implements most expressions, and sets up an outline for what remains. Comments on the design most welcome. The FunctionCompiler class is great, and the post-order iteration logic can use it with almost no changes! Still remaining to do: - the remaining expression opcodes - implement the encoding side - testing - iterate (no pun intended) on the API
Attachment #8734206 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This version adds a rewrite of the Wasm.cpp validation logic in terms of the postorder iterator, and contains a bunch of API changes.
Attachment #8734378 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This version rewrites significant portions of AsmJS.cpp encoding to use postorder format, as well as a bunch of other misc. changes. This patch can now encode many (though not yet all) asm.js expressions and statements using postorder, validate, and translate them to Ion.
Attachment #8735616 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This version iterates on the API in several areas, takes more code off of the validate-only code path, and adds more of the validation logic.
Attachment #8735720 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This version converts the WasmTextToBinary encoder, implements many more expressions, and fixes several bugs. Many tests pass now. Note that this patch still incorporates several somewhat experimental ideas that are still in flux, particularly surrounding arities and type checking.
Attachment #8736049 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Rebased on top of control flow yielding values and other patches, more bugs fixed and expressions implemented. 75% of the wasm and asm.js jit-test test files now pass. The main pieces of functionality missing are now just if/else, SIMD, and Atomics. Arities, some aspects of type checking, and some opcode table bits still contain experimental ideas.
Attachment #8736996 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
The significant change here is the removal of almost all the arity-related code, removal of the unreachable flag, and a return to pushing Void and Any values on the stack. I believe the unreachable+arities approach is still theoretically viable (blocks and block-like constructs would need to declare their result type explicitly, in order to produce a concrete (non-"any") type in the case that they have no exit paths), and while it does simplify some things, it also adds complexity in other areas, and isn't obviously enough of a win to justify itself, so I'm suspending it for now. Also, this implements if-else. Known remaining work includes: - port binary-to-text to postorder (using the iterator interface!) - SIMD and Atomic operators (not complicated) - handle if-else in dead code (tricky because postorder means part of it could be live while part of it dead). All wasm tests except the binary-to-text test pass, though a few if-else-in-dead-code cases are temporarily commented out. All asm.js tests pass except those using SIMD and Atomics.
Attachment #8737339 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Rebased on trunk, and folded in an if-else patch which was a separate patch that I had previously omitted.
Attachment #8737993 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Implement the atomics, plus other misc. cleanups.
Attachment #8738203 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Re-add WasmIterator.h.
Attachment #8738334 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
This update adds SIMD support and fixes several bugs.
Attachment #8738520 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
And this version fixes a bug with partially-dead if-else, and with a missing interrupt check in loops. All the asm.js and wasm jit-tests now pass, except wasm/totext1.js, which is the binary-to-text feature.
Attachment #8739188 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Now attaching the correct version of the patch.
Attachment #8739202 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Switch to the proposed 0xb opcode values. The most visible change here is unifying the various "end" opcodes for various constructs into a single "end" opcode that looks at the control stack and does the right thing. Also, make if, else, and function bodies implicit blocks.
Attachment #8739206 - Attachment is obsolete: true
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Temporarily set the version to 0x10000b, to allow iterating on what will be 0x0b. Also, added more comments, and misc. code cleanups.
Attachment #8739296 - Attachment is obsolete: true
Blocks: 1263202
In this patch the iterator's internal value stack grows without bound within functions, loops, and blocks even if it could be constant size. Consider eg the fac function from the test suite: (module (func $fac-opt (param i32) (result i32) (local i32) (set_local 1 (i32.const 1)) (block (br_if 0 (i32.lt_s (get_local 0) (i32.const 2))) (loop (set_local 1 (i32.mul (get_local 1) (get_local 0))) (set_local 0 (i32.add (get_local 0) (i32.const -1))) (br_if 0 (i32.gt_s (get_local 0) (i32.const 1))))) (get_local 1)) ...) The for-effect set_locals in the loop body are never consumed and hence not popped, so if the block body has many such expressions the stack will be deep; even with fairly short bodies the program will consume more stack space than the recursive decoder (just a different stack). This is probably just a bug but it's not obvious to me whose bug it is. Possibly the iterator could have state so that it discards unused results, but that would slow down iteration generally - any expression might be unused. It could push the chore onto the consumer, at the cost of exposing a "consume unused result" operation and complexity in the consumer. Or it could ignore the problem, but I think that is probably wrong.
It's a problem with the spec itself, as of the 0xb version. This is the first version with postorder, and the wrinkles are still being worked out. It's not currently possible (in general) for a single-pass decoder to know whether a set_local's result is used until either the use is seen or the end of the block is reached. A block with a long sequence of set_locals could either just end, discarding all but the last, or have a call that consumes up to all of them as arguments. We should also note that the growth is at most linear with the size of the program, and is not the only required linear-growth data structure needed for wasm. However, I do agree that it would be nice to fix this. One approach would be to popularize a convention wherein wasm producers break up long sequences of instructions into multiple blocks, since the end of a block pops the block's contents. Another would be to add something to the spec to allow producers to declare unused values explicitly.
(In reply to Dan Gohman [:sunfish] from comment #18) > It's a problem with the spec itself, as of the 0xb version. This is the > first version with postorder, and the wrinkles are still being worked out. > > It's not currently possible (in general) for a single-pass decoder to know > whether a set_local's result is used until either the use is seen or the end > of the block is reached. A block with a long sequence of set_locals could > either just end, discarding all but the last, or have a call that consumes > up to all of them as arguments. The general problem troubles me quite a bit at present, postorder is information-poor and that makes life painful for what I'm trying to build (a single-pass on-the-fly-code-generating compiler). > One approach would be to popularize a convention wherein wasm producers > break up long sequences of instructions into multiple blocks, since the end > of a block pops the block's contents. Another would be to add something to > the spec to allow producers to declare unused values explicitly. The latter would at least add information to the program represenation. I've not had a chance to read all the background material, so I will tread very lightly here, but why do people want a postorder representation? I see that it probably makes a direct, dumb interpreter fairly simple, but is that a strong use case? Also, about the landing schedule: I asked Luke last week when postorder might land and he said "landing is nigh", which I concede can mean a number of things :) But from your comments I'm inclined to infer that the format isn't even baked yet. What's your take on this?
Attached patch binarytotext.patch (deleted) — Splinter Review
For posterity, a small attempt at the binary-to-text transformation atop of your latest patch. This can render the following really simplistic function: `(func (result i32) (param i32) (i32.add (i32.const 42) (get_local 0)))`. Some notes that came to mind about the implementation: - ControlItem/Value are JSString* in both cases. I thought there would be a nice way to avoid flattening strings all the time (and using JSRope almost everywhere), but we don't only want to do simple concatenations: we also want to add stuff in the middle. For instance, when we're rendering the i32.add above, we want to 1. render "(i32.add ", 2. lhs value, 3. " ", 4. rhs value, 5. ")", so it's not just a simple concatenation of the two children. Maybe we can use JSRope still, with a bit more effort, but I worry that this would spawn many GC things quickly and provoke memory churn (albeit binary-to-text won't be perf critical anyways, it'd be nice that it wouldn't be too slow). Data might be nice to have here. - re: the API, it's nice to use. It's weird that you have to pass function context information at the lowest level: for instance, when rendering (get_local 0), the iterator wants the full ValType vector of local's types. It would sound better to pass them directly at the highest level, once and for all, to the iterator, and then the iterator wouldn't need them when going through the nodes. (This function context info isn't stored at all in the binary-to-text transformation, so this dummy patch puts some raw value there). Nice work there!
(In reply to Lars T Hansen [:lth] from comment #19) > I've not had a chance to read all the background material, so I will tread > very lightly here, but why do people want a postorder representation? I see > that it probably makes a direct, dumb interpreter fairly simple, but is that > a strong use case? Additional motivations include: - faster decoding: roughly speaking, a postorder implementation should be faster than a non-recursive non-continuation-passing-style preorder implementation, because the latter needs an opcode switch for every operand. Experiments have so far confirmed this. - simpler debugging: one can step through it like a dumb interpreter, and have a natural correspondence between "byte offset" and "program counter" concepts. It's unfortunately not unambiguously compelling, but the general sentiment in the group right now is that the advantages outweigh the disadvantages. > Also, about the landing schedule: I asked Luke last week when postorder > might land and he said "landing is nigh", which I concede can mean a number > of things :) But from your comments I'm inclined to infer that the format > isn't even baked yet. What's your take on this? "0xb", like "0xa" before it, is a snapshot in time that we're coordinating with other implementations. The patch in this bug may land soon, as part of "0xb". After that, there will be a "0xc" and so on, as the wasm design evolves, until we're done.
There are some programs that can be expressed in the unstructured postorder syntax that cannot be expressed in the structured s-expression syntax. We can imagine a simple RPN notation to express postorder directly, and assuming that, I can write things like this: (i32.const 1) (if) (i32.const 1) (i32.const 2) (else) (i32.const 3) (end) Note the two expressions in the body of the "then" arm. Presumably this is a well-formedness error? But I'm not sure where the validator flags that error if it does it at all (and I haven't yet sat down to write the assembler so that I can test this). This question came to mind as follows. Suppose the above program is invalid. Then unless the validator always catches an error before it feeds the phrase to the consumer, the validator must always run to completion (in a function at least but maybe in the program) before it feeds anything to the consumer, or an invalid program could in principle trick the consumer into going wrong along the way because the consumer must assume the program is valid. I think the validator always runs to completion at present before the program is compiled, but is it the intent to continue to do that?
(In reply to Lars T Hansen [:lth] from comment #22) > There are some programs that can be expressed in the unstructured postorder > syntax that cannot be expressed in the structured s-expression syntax. We > can imagine a simple RPN notation to express postorder directly, and > assuming that, I can write things like this: > > (i32.const 1) > (if) > (i32.const 1) > (i32.const 2) > (else) > (i32.const 3) > (end) > > Note the two expressions in the body of the "then" arm. Presumably this is > a well-formedness error? The 'then' arm is a block with two top level expressions. All the expressions remaining on the stack at the end of a block are the top level expressions for the block. It does seem to be a requirement that the code be validated before it could be interpreted literally. Perhaps the discussion in https://github.com/WebAssembly/design/pull/611 will answer some questions.
:dougc, thanks. Also related materials here: https://github.com/WebAssembly/design/pull/641. Perhaps the s-expression syntax needs a (begin ...) form to express this: (if E (begin E1 E2 ... En) (begin Ek Ej ... El))
(In reply to Lars T Hansen [:lth] from comment #22) > There are some programs that can be expressed in the unstructured postorder > syntax that cannot be expressed in the structured s-expression syntax. We > can imagine a simple RPN notation to express postorder directly, and > assuming that, I can write things like this: > > (i32.const 1) > (if) > (i32.const 1) > (i32.const 2) > (else) > (i32.const 3) > (end) > > Note the two expressions in the body of the "then" arm. Presumably this is > a well-formedness error? But I'm not sure where the validator flags that > error if it does it at all (and I haven't yet sat down to write the > assembler so that I can test this). Yes. This is a well-formedness error in the current format, and the current patch here diagnoses it. The way it works is that the validator keeps track of the depth of the evaluation stack at the entry to each block, and rejects attempts to pop values below the depth of the innermost block. That way, it can catch the problem when it happens, rather than waiting until the end of decoding. > This question came to mind as follows. Suppose the above program is > invalid. Then unless the validator always catches an error before it feeds > the phrase to the consumer, the validator must always run to completion (in > a function at least but maybe in the program) before it feeds anything to > the consumer, or an invalid program could in principle trick the consumer > into going wrong along the way because the consumer must assume the program > is valid. > > I think the validator always runs to completion at present before the > program is compiled, but is it the intent to continue to do that? No, the interator interface in the current patch is designed to be usable in a single-pass manner. It performs validation as it goes, one opcode at a time. For example, when a binary opcode is decoded, the iterator checks that the operands are both valid to be popped and it performs type checking before proceeding.
Several offline discussions have surfaced the idea of adding to the spec versions of `set_local`, `store`, `call`, and `call_indirect` which return void. This would allow a single-pass decoder to avoid linear stack growth, and would permit a single-pass backend to avoid ending up allocating registers or memory for dead values. Further discussion is here: https://github.com/WebAssembly/design/issues/655
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Updated postorder patch: - Reset the binary version back to plain 0xb. - Temporarily disable binary-to-text. - Minor cleanups.
Attachment #8739461 - Attachment is obsolete: true
Attachment #8743003 - Flags: review?(luke)
Attached patch wasm-postorder.patch (obsolete) (deleted) — Splinter Review
Updating with fix for ARM (and platforms that require MAsmJSHeapAccess::offset() == 0) by adding back some logic that was inadvertently removed (see SetHeapAccessOffset).
Attachment #8743003 - Attachment is obsolete: true
Attachment #8743003 - Flags: review?(luke)
Attachment #8744450 - Flags: review?(luke)
Blocks: 1232205
Attached patch postorder (obsolete) (deleted) — Splinter Review
With review nits applied.
Attachment #8744450 - Attachment is obsolete: true
Attachment #8744450 - Flags: review?(luke)
Attachment #8745001 - Flags: review?(luke)
Attached patch postorder (deleted) — Splinter Review
This patch fixes a warning and a bug where the control stack wasn't checked for underflow.
Attachment #8745070 - Flags: review?(luke)
Attachment #8745001 - Attachment is obsolete: true
Attachment #8745001 - Flags: review?(luke)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Depends on: 1269223
Depends on: 1271375
Attachment #8745070 - Flags: review?(luke) → review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: