764477 - IonMonkey: Inline Math.max and Math.min

Jan de Mooij [:jandem]

Reporter

Description

•

12 years ago

For parity with JM+TI, we should inline Math.max/Math.min with two int32/double arguments.

Tom S [:evilpie]

Assignee

Updated

•

12 years ago

Assignee: general → evilpies

Marty Rosenberg [:mjrosenb]

Comment 1

•

12 years ago

I say we should just inline an arbitrary number of arguments, since I've seen bug reports that using more than 2 arguments slows it down noticeably, and it shouldn't be that much extra work.

Tom S [:evilpie]

Assignee

Comment 2

•

12 years ago

Attached patch v1 (obsolete) (deleted) — Details — Splinter Review

I looked at inlining more than two args, but it doesn't look like it's so easy to do that with an arbitrary number of args, at least we don't seem to implement that anywhere. Here is a patch that handles two args for int and double. In the double case we eagerly de-optimize for zero and NaN, but that could probably be partly fixed. This is also missing arm support for doubles ...

Attachment #637184 - Flags: review?(jdemooij)

Jan de Mooij [:jandem]

Reporter

Comment 3

•

12 years ago

Comment on attachment 637184 [details] [diff] [review] v1 Review of attachment 637184 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. I'd like an OOL path though to handle 0/NaN. Especially 0 is probably common and we have to be careful to avoid repeated bailouts. ::: js/src/ion/CodeGenerator.cpp @@ +1303,5 @@ > + > + if (ins->second()->isConstant()) > + masm.move32(Imm32(ToInt32(ins->second())), output); > + else > + masm.movePtr(ToRegister(ins->second()), output); Nit: masm.mov(..) ::: js/src/ion/IonBuilder.h @@ +394,5 @@ > InliningStatus inlineMathAbs(uint32 argc, bool constructing); > InliningStatus inlineMathFloor(uint32 argc, bool constructing); > InliningStatus inlineMathRound(uint32 argc, bool constructing); > InliningStatus inlineMathSqrt(uint32 argc, bool constructing); > + InliningStatus inlineMathMinMax(uint32 argc, bool constructing, bool max); Nit: all other functions pass |constructing| as the last argument, so I'd do that here too. ::: js/src/ion/Lowering.cpp @@ +626,5 @@ > return defineReuseInput(lir, ins, 0); > } else { > JS_ASSERT(num->type() == MIRType_Double); > LAbsD *lir = new LAbsD(useRegister(num)); > + return defineReuseInput(lir, ins, 0); You don't need this change (and AbsD requires input != output). ::: js/src/ion/MCallOptimize.cpp @@ +404,5 @@ > > IonBuilder::InliningStatus > +IonBuilder::inlineMathMinMax(uint32 argc, bool constructing, bool max) > +{ > + if (argc != 2 || constructing) The easiest way to handle 3 or 4 arguments here is to add multiple MathMin/MathMax instructions. Maybe worth doing in a follow-up bug or patch, but with 2 arguments we at least match JM+TI now. @@ +421,5 @@ > + > + if (returnType == MIRType_Int32 && > + (arg1Type != MIRType_Int32 || arg2Type != MIRType_Int32)) > + { > + returnType = MIRType_Double; I think we should "return InliningStatus_NotInlined;" here instead. If the call has an int32 and double argument, and always returns the int32 argument, we still have to tell TI if we finally get a double. ::: js/src/ion/MIR.h @@ +2076,5 @@ > + isMax_(isMax) > + { > + JS_ASSERT(type == MIRType_Double || type == MIRType_Int32); > + setResultType(type); > + specialization_ = type; Call setMovable() here so that we can hoist/GVN MMinMax. @@ +2096,5 @@ > + TypePolicy *typePolicy() { > + return this; > + } > + bool congruentTo(MDefinition *const &ins) const { > + return congruentIfOperandsEqual(ins); We should also compare isMax_ here, to avoid optimizing Math.max(x, y) and Math.min(x, y) to the same instruction. Something like this: if (!ins->isMinMax()) return false; if (ins->toMinMax()->isMax() != isMax()) return false; return congruentIfOperandsEqual(ins); ::: js/src/ion/shared/CodeGenerator-x86-shared.cpp @@ +433,5 @@ > return true; > } > > bool > +CodeGeneratorX86Shared::visitMinMaxD(LMinMaxD *ins) Can we move this to CodeGenerator (and use moveDouble etc) so that we don't need an ARM port? @@ +447,5 @@ > + masm.xorpd(scratch, scratch); > + Assembler::DoubleCondition cond = Assembler::DoubleEqualOrUnordered; > + masm.compareDouble(cond, first, scratch); > + if (!bailoutIf(Assembler::ConditionFromDoubleCondition(cond), ins->snapshot())) > + return false; Especially 0 may be common so we have to be careful not to invalidate/bailout all the time. Can you add an out-of-line path to call a max/min helper function which takes two doubles, like you did for LTruncateDToInt32?

Attachment #637184 - Flags: review?(jdemooij)

Tom S [:evilpie]

Assignee

Comment 4

•

12 years ago

Attached patch v2 WIP (obsolete) (deleted) — Details — Splinter Review

So checking for -0 is already complicated and sucks on x86/x64, but it's even harder on arm, so I am (or somebody? :) going to implement a pretty lightweight version there. As it turned out it's not so easy to tell apart -0 and 0 without using a GPR register, but I had this funny idea that seems to work great. When we already know that both parameter are zero, we shift the first one, one bit to the right. If it was -0, it's now going to be 2.0 and thus not anymore equal to 0.

Attachment #637184 - Attachment is obsolete: true

Marty Rosenberg [:mjrosenb]

Comment 5

•

12 years ago

ok, I just tried this out on my ARM board, and it seems to work. The shifting right idea also requires moving data between a floating point register, and a gpr (most likely back as well), which would be rather slow. As it turns out, distinguishing 0 from -0 is not necessary. If we only consider the cases where both operands are zero or negative zero (call it zeroey from truthy and falsey), the only bit that we care about manipulating is the sign bit. so making a truth table min(T, T) = T min(T, F) = T min(F, T) = T min(F, F) = F in other words, we want to or together the sign bits. the common operations available are multiply, negate, subtract and add These can be thought of as: addition: 0+0 = 0, 0+(-0)=0, (-0)+0=0, (-0)+(-0)=-0 aka and multiplication: 0*0=0, (-0)*0=-0, 0*(-0)=-0, (-0)*(-0)=0 (xor) negation: -(0) = -0, -(-0)=0 (not) subtraction: 0-0=0, 0-(-0)=0, (-0)-0=-0, (-0)-(-0)=0 (implication) none of these happen to be or, but using "and", "not" and demorgan's theorem, we can generate OR, namely -(-a+-b), which is also equivalent to -(-a-b). On ARM the following sequence of instructions implements this on ARM: fnegd d2, d0 fsubd d2, d2, d1 fnegd d2, d2 I have not yet tried this instruction sequence on x86 nor x64.

Tom S [:evilpie]

Assignee

Comment 6

•

12 years ago

I think we should just take my current approach for x86/x64 and yours for ARM and be done with it.

Sean Stangl [:sstangl]

Updated

•

12 years ago

Blocks: 768572

Jan de Mooij [:jandem]

Reporter

Updated

•

12 years ago

Blocks: 768745

Tom S [:evilpie]

Assignee

Comment 7

•

12 years ago

If you need this in the next to weeks, please take it. Same for Bug 725966.

David Anderson [:dvander] - inactive, e-mail if emergency

Updated

•

12 years ago

Whiteboard: [ion:t]

Tom S [:evilpie]

Assignee

Comment 8

•

12 years ago

Attached patch v1 (obsolete) (deleted) — Details — Splinter Review

So I hope this even compiles and maybe works on ARM :). This now handles -0 and NaN properly inline. Marty would you mind checking that out? When we have two zero values, we always look at the first one. For Max, when the first one isn't -0, we use that as the result. (aka we don't move the second argument to the result register) For Min, when the first one is -0 we use that.

Attachment #639141 - Attachment is obsolete: true

Attachment #647740 - Flags: review?(jdemooij)

Attachment #647740 - Flags: feedback?(mrosenberg)

Jan de Mooij [:jandem]

Reporter

Comment 9

•

12 years ago

Comment on attachment 647740 [details] [diff] [review] v1 Review of attachment 647740 [details] [diff] [review]: ----------------------------------------------------------------- Looks good! r=me on the non-ARM changes with nits addressed. The only thing that worries me is use of the Parity flag to test for NaN outside x86-only code, if that's a problem for ARM we can always move the code to x86-shared and give ARM its own implementation. You may want to enable the hg color extension, it will show trailing whitespace and is very nice for viewing diffs anyway :) ::: js/src/ion/CodeGenerator.cpp @@ +1544,5 @@ > + > +bool > +CodeGenerator::visitMinMaxD(LMinMaxD *ins) > +{ > + Nit: can remove this new line. @@ +1556,5 @@ > + > + masm.zeroDouble(ScratchFloatReg); > + masm.compareDouble(Assembler::DoubleEqualOrUnordered, first, ScratchFloatReg); > + masm.j(Assembler::Parity, &nan); // If one argument is NaN the result is NaN. > + masm.j(Assembler::NotEqual, &secondCheck); I'm not sure about using the Parity flag here, it seems very x86-specific, please ask mjrosenb what he thinks. @@ +1579,5 @@ > + > + masm.bind(&secondCheck); > + // first is not NaN and not zero so we only need to check if second is NaN. > + masm.compareDouble(Assembler::DoubleEqualOrUnordered, second, ScratchFloatReg); > + masm.j(Assembler::Parity, &nan); This can be masm.branchDouble(Assembler::DoubleUnordered, second, second, &nan); ::: js/src/ion/LIR-Common.h @@ +1232,5 @@ > +class LMinMaxD : public LInstructionHelper<1, 2, 0> > +{ > + public: > + LIR_HEADER(MinMaxD); > + LMinMaxD(const LAllocation &first, const LAllocation &second) Nit: trailing whitespace. ::: js/src/ion/Lowering.cpp @@ +647,5 @@ > return false; > return define(lir, ins); > } > > +bool Nit: trailing whitespace ::: js/src/ion/MCallOptimize.cpp @@ +39,5 @@ > return inlineMathSqrt(argc, constructing); > + if (native == js_math_max) > + return inlineMathMinMax(true /* max */, argc, constructing); > + if (native == js_math_min) > + return inlineMathMinMax(false /* max */, argc, constructing); Nit: trailing whitespace. @@ +408,5 @@ > + MIRType arg2Type = getInlineArgType(argc, 2); > + if (arg2Type != MIRType_Double && arg2Type != MIRType_Int32) > + return InliningStatus_NotInlined; > + > + if (returnType == MIRType_Int32 && Nit: trailing whitespace. ::: js/src/jit-test/tests/ion/mathMinMax.js @@ +36,5 @@ > + > + } > +} > + > +for (var i = 0; i < 40; i++) Nit: can you remove this loop and change the other one to 100 iterations or so? That should be enough for --ion-eager and --no-jm. There's also some trailing whitespace in this file.

Attachment #647740 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Reporter

Comment 10

•

12 years ago

(In reply to Jan de Mooij (:jandem) from comment #9) > The only thing that worries me is use of the Parity flag to test for NaN outside x86-only > code, if that's a problem for ARM we can always move the code to x86-shared > and give ARM its own implementation. I just discussed this with mjrosenb, there may be some way to abstract it but it seems cleaner to move visitMinMaxD to CodeGeneratorX86Shared and CodeGeneratorARM. Sorry for suggesting otherwise in an earlier comment, I didn't know these double condition flags would complicate things. And thanks again for taking this bug, would be good to get this landed soon for parity with JM+TI and V8.

Status: NEW → ASSIGNED

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 11

•

12 years ago

Comment on attachment 647740 [details] [diff] [review] v1 Review of attachment 647740 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/LIR-Common.h @@ +1208,5 @@ > +class LMinMaxI : public LInstructionHelper<1, 2, 0> > +{ > + public: > + LIR_HEADER(MinMaxI); > + LMinMaxI(const LAllocation &first, const LAllocation &second) Trailing whitespace ::: js/src/ion/arm/MacroAssembler-arm.cpp @@ +1924,5 @@ > ma_b(label, ConditionFromDoubleCondition(cond)); > } > > +void > +MacroAssemblerARMCompat::branchTestNegativeZero(Condition cond, const FloatRegister &input, Id. ::: js/src/jit-test/tests/ion/mathMinMax.js @@ +15,5 @@ > + assertEq(min(0, negative_zero), negative_zero); > + assertEq(min(negative_zero, 0), negative_zero); > + > + assertEq(min(negative_zero, negative_zero), negative_zero); > + assertEq(max(negative_zero, negative_zero), negative_zero); More @@ +32,5 @@ > + assertEq(max(Infinity, nan), nan); > + > + assertEq(max(negative_zero, -5), negative_zero); > + assertEq(min(negative_zero, -5), -5); > + More

Tom S [:evilpie]

Assignee

Comment 12

•

12 years ago

Attached patch v2 (deleted) — Details — Splinter Review

Totally new version based on v8 code. Also should handle ARM properly this time, but try is still going on https://tbpl.mozilla.org/?tree=Try&rev=20fe4cb4cbf2.

Attachment #647740 - Attachment is obsolete: true

Attachment #647740 - Flags: feedback?(mrosenberg)

Attachment #650625 - Flags: review?(jdemooij)

Jan de Mooij [:jandem]

Reporter

Comment 13

•

12 years ago

Comment on attachment 650625 [details] [diff] [review] v2 Review of attachment 650625 [details] [diff] [review]: ----------------------------------------------------------------- Very nice! r=me on everything but the ARM changes, ask Marty to review these.

Attachment #650625 - Flags: review?(jdemooij) → review+

Tom S [:evilpie]

Assignee

Updated

•

12 years ago

Attachment #650625 - Flags: review?(mrosenberg)

Marty Rosenberg [:mjrosenb]

Comment 14

•

12 years ago

Comment on attachment 650625 [details] [diff] [review] v2 Review of attachment 650625 [details] [diff] [review]: ----------------------------------------------------------------- Any plans for special casing this when one of the arguments is 0? I suspect anything that makes the register allocator's job easier will be helpful. ::: js/src/ion/arm/CodeGenerator-arm.cpp @@ +288,5 @@ > + ? Assembler::VFP_GreaterThanOrEqual > + : Assembler::VFP_LessThanOrEqual; > + Label nan, equal, returnSecond, done; > + > + masm.compareDouble(second, first); I think you want compareDouble(first, second)

Attachment #650625 - Flags: review?(mrosenberg) → review+

Tom S [:evilpie]

Assignee

Comment 15

•

12 years ago

http://hg.mozilla.org/projects/ionmonkey/rev/5be6b8747c74 Yaay thank you guys.

Tom S [:evilpie]

Assignee

Updated

•

12 years ago

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Boris Zbarsky [:bzbarsky]

Comment 16

•

12 years ago

Hmm. I'm running a script that does this: data[targetpixel] = Math.max(0, Math.min(255, sum * factor + bias)); and I see js_math_min showing up in my profiles... any idea why? Is it a matter of the args being an int and a double?

Sean Stangl [:sstangl]

Comment 17

•

12 years ago

(In reply to Boris Zbarsky (:bz) from comment #16) > Hmm. I'm running a script that does this: > > data[targetpixel] = Math.max(0, Math.min(255, sum * factor + bias)); > > and I see js_math_min showing up in my profiles... any idea why? Is it a > matter of the args being an int and a double? Looking at IonBuilder::inlineMathMinMax() in js/src/ion/MCallOptimize.cpp, the only situation with known int+double arguments we wouldn't inline is when TI is forcing the return type to int32 and there is a double argument.

v1 12 years ago Tom S [:evilpie] (deleted), patch		Details \| Diff \| Splinter Review
v2 WIP 12 years ago Tom S [:evilpie] (deleted), patch		Details \| Diff \| Splinter Review
v1 12 years ago Tom S [:evilpie] (deleted), patch	jandem : review+	Details \| Diff \| Splinter Review
v2 12 years ago Tom S [:evilpie] (deleted), patch	jandem : review+ mjrosenb : review+	Details \| Diff \| Splinter Review