Closed
Bug 1135039
Opened 10 years ago
Closed 10 years ago
SIMD: Generalize shuffles/swizzles for Ion
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox39 | --- | fixed |
People
(Reporter: bbouvier, Assigned: bbouvier)
References
Details
Attachments
(7 files, 3 obsolete files)
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
Currently, our SIMD shuffles / swizzles MIR nodes expect constant patterns (masks?), so they need to generalized before being inlinable within Ion.
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → benj
Assignee | ||
Comment 1•10 years ago
|
||
This will make the interpreter behave more like the polyfill: if the lane index is out of bounds ([0..3] for float32x4/int32x4, [0..1] for float64x2), just load the coerced undefined value in it. Might be not intended in the polyfill (see also [0]), but it's actually pretty convenient to implement in the JITs...
[0] https://github.com/johnmccutchan/ecmascript_simd/issues/111#issuecomment-75968636
Attachment #8569307 -
Flags: review?(sunfish)
Assignee | ||
Comment 2•10 years ago
|
||
This patch implements a generalized swizzle for int32x4 and float32x4, from inlining to codegen on x86. Implementation does the following:
- reserve stack space for 2 vectors
- spill the input vector onto the stack
- load each value addressed by the lane index into a temp register
- spill this temp register at its final position in the output vector (onto the stack)
Not really efficient (with sse4.1, we could probably insert the value directly into the lane), but It Works (tm).
Attachment #8569310 -
Flags: feedback?(sunfish)
Comment 3•10 years ago
|
||
Comment on attachment 8569310 [details] [diff] [review]
swizzle.patch
Review of attachment 8569310 [details] [diff] [review]:
-----------------------------------------------------------------
Nice and straight-forward.
::: js/src/jit-test/tests/SIMD/swizzle.js
@@ +37,5 @@
> + var r = SIMD.int32x4.swizzle(i4, i % 4, (i + 1) % 4, (i + 2) % 4, (i + 3) % 4);
> + assertEqX4(r, compI[i % 4]);
> +
> + // Out of bounds variables lanes
> + assertEqX4(SIMD.int32x4.swizzle(i4, i + 4, 3, 2, 1), [0, 4, 3, 2]);
It'd be good to test negative lane indices too.
::: js/src/jit/MIR.h
@@ +1821,5 @@
>
> ALLOW_CLONE(MSimdSwizzle)
> };
>
> +class MSimdGeneralSwizzle :
Please add a brief comment here describing what a "general" swizzle is.
::: js/src/jit/TypePolicy.h
@@ +331,5 @@
> };
>
> +class SimdSwizzlePolicy MOZ_FINAL : public TypePolicy
> +{
> + public:
Style nit: this should be at 2 spaces according to SpiderMonkey style
::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +2363,5 @@
> +{
> + FloatRegister input = ToFloatRegister(ins->base());
> + Register temp = ToRegister(ins->temp());
> +
> + masm.reserveStack(Simd128DataSize * 2);
It might be nice to have a comment somewhere around here mentioning that this isn't going to generate amazingly fast code, but that it's ok because it's expected that users will use constant swizzle indices.
@@ +2374,5 @@
> + Label loadZero, go, join;
> + masm.cmp32(lane, Imm32(0));
> + masm.j(Assembler::LessThan, &loadZero);
> + masm.cmp32(lane, Imm32(4));
> + masm.j(Assembler::LessThan, &go);
An unsigned comparison would allow this to be done in a single branch. Not that performance is a big concern here, but it would make the code a little simpler.
@@ +2416,5 @@
> + Label loadNaN, go, join;
> + masm.cmp32(lane, Imm32(0));
> + masm.j(Assembler::LessThan, &loadNaN);
> + masm.cmp32(lane, Imm32(4));
> + masm.j(Assembler::LessThan, &go);
Ditto about unsigned comparison.
@@ +2419,5 @@
> + masm.cmp32(lane, Imm32(4));
> + masm.j(Assembler::LessThan, &go);
> +
> + masm.bind(&loadNaN);
> + masm.loadConstantFloat32(0.f, ScratchFloat32Reg);
This is loading zero, rather than NaN.
Attachment #8569310 -
Flags: feedback?(sunfish) → feedback+
Comment 4•10 years ago
|
||
Comment on attachment 8569307 [details] [diff] [review]
1.polyfill.patch
Review of attachment 8569307 [details] [diff] [review]:
-----------------------------------------------------------------
The current polyfill behavior isn't really intentional. I'm inclined to wait for resolution on the above-mentioned #111 before making changes here, but I could be convinced otherwise.
Attachment #8569307 -
Flags: review?(sunfish)
Assignee | ||
Comment 5•10 years ago
|
||
Thanks for the feedback! Switching to review, even if it will introduce new differences between the JIT and the interpreter. My hope is to get this landed by next week, so that SIMD in Ion can be demo'ed at GDC (otherwise, it's fine if it's not, because we can just have a try-build and demo it at GDC, but well, having it on nightlies would make more impact!).
Attachment #8569310 -
Attachment is obsolete: true
Attachment #8569813 -
Flags: review?(sunfish)
Assignee | ||
Comment 6•10 years ago
|
||
Note to self, I'll also implement shuffle later, as it's not used in any of the SIMD demos...
Keywords: leave-open
Comment 7•10 years ago
|
||
Comment on attachment 8569813 [details] [diff] [review]
swizzle.patch
Review of attachment 8569813 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/MIR.cpp
@@ +933,5 @@
> +MSimdGeneralSwizzle::foldsTo(TempAllocator &alloc)
> +{
> + int32_t lanes[4];
> + for (size_t i = 0; i < 4; i++) {
> + if (!lane(i)->isConstant() || !lane(i)->type() == MIRType_Int32)
The right side there should be lane(i)->type() != MIRType_Int32.
@@ +936,5 @@
> + for (size_t i = 0; i < 4; i++) {
> + if (!lane(i)->isConstant() || !lane(i)->type() == MIRType_Int32)
> + return this;
> + lanes[i] = lane(i)->toConstant()->value().toInt32();
> + if (lanes[i] < 0 || lanes[i] > 4)
The right side here should be >= 4.
Attachment #8569813 -
Flags: review?(sunfish) → review+
Assignee | ||
Comment 8•10 years ago
|
||
Thanks for the review! Unfortunately, I realized I can't land it without the interpreter patch, because on ARM where we don't have SIMD in Ion, the code stays in the interpreter and thus the test doesn't pass (difference of behavior between engines... it's pretty bad).
Do you prefer me to rework the patch so as to throw when a lane index is out of bounds, or should we check in the first patch (at least, temporarily, until the spec polyfill gets updated)?
Flags: needinfo?(sunfish)
Comment 9•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #8)
> Do you prefer me to rework the patch so as to throw when a lane index is out
> of bounds, or should we check in the first patch (at least, temporarily,
> until the spec polyfill gets updated)?
Throwing when a lane out of bounds sounds to me like desirable semantics. I just commented on issue #111 to this effect. I'd be ok implementing that here for now, and may even hope that we can keep it that way.
Flags: needinfo?(sunfish)
Assignee | ||
Comment 10•10 years ago
|
||
Assignee | ||
Comment 12•10 years ago
|
||
Following the behavior described in https://github.com/johnmccutchan/ecmascript_simd/issues/120#issuecomment-77085936
Attachment #8573218 -
Flags: review?(sunfish)
Assignee | ||
Comment 13•10 years ago
|
||
And the patch for swizzle in the JITs (note that i somehow forgot to add inlining of SIMD.float32x4.swizzle in the previous patch, shame on me!).
Attachment #8569307 -
Attachment is obsolete: true
Attachment #8573222 -
Flags: review?(sunfish)
Assignee | ||
Comment 14•10 years ago
|
||
So while looking at the codegen for swizzle, i realized that the same codegen could be used for shuffles, and for other vector types as well, so I've tried to generalize the MIR node as much as possible. Codegen is ready to support int8x16 and int16x8 and float64x2.
This means that LSimdShuffleBase becomes variadic. If the approach looks fine, I could create a LVariadicInstruction and have LSimdShuffleBase inherit from it, so that LVariadicInstruction can be reused for other operations...
Another solution was to just have a fixed amount of operands, which would be an upper bounds, but that would mean this upper bound would need to be 18 (for SIMD.int8x16.swizzle). Thoughts?
Attachment #8573350 -
Flags: feedback?(sunfish)
Attachment #8573350 -
Flags: feedback?(jdemooij)
Comment 15•10 years ago
|
||
Comment on attachment 8573350 [details] [diff] [review]
3.shuffle.patch
Review of attachment 8573350 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry, I'm overloaded with reviews and needinfos these weeks, so I'll leave this to sunfish. He's also more familiar with the SIMD instructions here.
Attachment #8573350 -
Flags: feedback?(jdemooij)
Updated•10 years ago
|
Attachment #8573222 -
Flags: review?(sunfish) → review+
Comment 16•10 years ago
|
||
Comment on attachment 8573218 [details] [diff] [review]
1.throw-oob.patch
Review of attachment 8573218 [details] [diff] [review]:
-----------------------------------------------------------------
yay!
Attachment #8573218 -
Flags: review?(sunfish) → review+
Assignee | ||
Comment 17•10 years ago
|
||
Comment 18•10 years ago
|
||
Backed out for Linux32 jit-test asserts.
https://hg.mozilla.org/integration/mozilla-inbound/rev/b79d608ca174
https://treeherder.mozilla.org/logviewer.html#?job_id=7487175&repo=mozilla-inbound
Comment 19•10 years ago
|
||
Also WinXP jit-test timeouts. Looking at your last Try push, all of these issues were present there as well. Please verify that you're green on Try before pushing in the future.
Flags: needinfo?(benj)
Comment 20•10 years ago
|
||
Comment on attachment 8573350 [details] [diff] [review]
3.shuffle.patch
Review of attachment 8573350 [details] [diff] [review]:
-----------------------------------------------------------------
(In reply to Benjamin Bouvier [:bbouvier] from comment #14)
> Created attachment 8573350 [details] [diff] [review]
> 3.shuffle.patch
>
> So while looking at the codegen for swizzle, i realized that the same
> codegen could be used for shuffles, and for other vector types as well, so
> I've tried to generalize the MIR node as much as possible. Codegen is ready
> to support int8x16 and int16x8 and float64x2.
Great!
> This means that LSimdShuffleBase becomes variadic. If the approach looks
> fine, I could create a LVariadicInstruction and have LSimdShuffleBase
> inherit from it, so that LVariadicInstruction can be reused for other
> operations...
Creating an LVariadicInstruction sounds nice, to keep the mundane operand-accessor kind of code somewhat separate from the interesting operation-specific code, even if we only have one subclass of LVariadicInstruction.
> Another solution was to just have a fixed amount of operands, which would be
> an upper bounds, but that would mean this upper bound would need to be 18
> (for SIMD.int8x16.swizzle). Thoughts?
I think variadic things are fine here. We mainly want to keep the code simple, since this is code that we hope won't be used much anyway.
::: js/src/jit/LIR-Common.h
@@ +376,5 @@
>
> +class LSimdGeneralSwizzleBase : public LInstruction
> +{
> + LDefinition def_;
> + mozilla::Vector<LAllocation> operands_;
You can use the FixedList class here instead of Vector.
Attachment #8573350 -
Flags: feedback?(sunfish) → feedback+
Assignee | ||
Comment 21•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #19)
> Also WinXP jit-test timeouts. Looking at your last Try push, all of these
> issues were present there as well. Please verify that you're green on Try
> before pushing in the future.
Sorry for being an idiot... I've somehow lost track of this try push, so there are a few things I need to enhance in my bugzilla workflow...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9dcc46796f42
Assignee | ||
Comment 22•10 years ago
|
||
Here we go!
Attachment #8573350 -
Attachment is obsolete: true
Attachment #8576657 -
Flags: review?(sunfish)
Assignee | ||
Comment 23•10 years ago
|
||
Attachment #8576658 -
Flags: review?(sunfish)
Assignee | ||
Comment 24•10 years ago
|
||
Totally not mandatory
Attachment #8576659 -
Flags: review?(sunfish)
Assignee | ||
Comment 25•10 years ago
|
||
Totally not mandatory as well
Attachment #8576661 -
Flags: review?(sunfish)
Assignee | ||
Comment 26•10 years ago
|
||
And a try build for this new set of patches as well...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9161b0d10d3f
Assignee | ||
Comment 27•10 years ago
|
||
Pushing the two changesets, as try-run in comment 21.
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/36be55e9a4ff
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/6a42e09d9a26
Flags: needinfo?(benj)
Assignee | ||
Comment 28•10 years ago
|
||
And a new try run for the other patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3352d70d9bc4
Comment 29•10 years ago
|
||
Assignee | ||
Comment 30•10 years ago
|
||
Comment on attachment 8576658 [details] [diff] [review]
4. Shuffle
Review of attachment 8576658 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/LIR-Common.h
@@ +376,5 @@
>
> +class LSimdGeneralShuffleBase : public LVariadicInstruction<1, 1>
> +{
> + public:
> + LSimdGeneralShuffleBase(const LDefinition &temp) {
(fixed locally: made this one and the two other ones below explicit)
Comment 31•10 years ago
|
||
Comment on attachment 8576657 [details] [diff] [review]
3. LVariadicInstruction
Review of attachment 8576657 [details] [diff] [review]:
-----------------------------------------------------------------
Cool!
Attachment #8576657 -
Flags: review?(sunfish) → review+
Comment 32•10 years ago
|
||
Comment on attachment 8576658 [details] [diff] [review]
4. Shuffle
Review of attachment 8576658 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good.
::: js/src/jit/MIR.cpp
@@ +950,5 @@
>
> MDefinition *
> +MSimdGeneralShuffle::foldsTo(TempAllocator &alloc)
> +{
> + mozilla::Vector<uint32_t> lanes;
It'd be slightly nicer to use FixedList here, since we know the size up front.
@@ +963,2 @@
> return this;
> + lanes.infallibleAppend(uint32_t(temp));
so this becomes lanes[i] = uint32_t(temp);
Attachment #8576658 -
Flags: review?(sunfish) → review+
Comment 33•10 years ago
|
||
Comment on attachment 8576659 [details] [diff] [review]
5. bonus: Factor LInstructionHelper/LVariadicInstruction code
Review of attachment 8576659 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good. The details namespace pattern is probably something we should use more :).
Attachment #8576659 -
Flags: review?(sunfish) → review+
Updated•10 years ago
|
Attachment #8576661 -
Flags: review?(sunfish) → review+
Assignee | ||
Comment 34•10 years ago
|
||
Status: NEW → ASSIGNED
Keywords: leave-open
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(benj)
Assignee | ||
Comment 35•10 years ago
|
||
Try link in comment 34.
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/12c2f0b35afe
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/bcad11e292db
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/71a8d6e735ef
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/6abbe0f83479
Flags: needinfo?(benj)
Comment 36•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/12c2f0b35afe
https://hg.mozilla.org/mozilla-central/rev/bcad11e292db
https://hg.mozilla.org/mozilla-central/rev/71a8d6e735ef
https://hg.mozilla.org/mozilla-central/rev/6abbe0f83479
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in
before you can comment on or make changes to this bug.
Description
•