Closed Bug 707844 Opened 13 years ago Closed 13 years ago

IonMonkey: try switching bailouts from three instructions to one or two

Categories

(Core :: JavaScript Engine, defect)

ARM
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: mjrosenb, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

currently for nearly all bailouts, we generate: movwcc r12, bottom movtcc r12, top bxcc r12 which means we have 3 instructions that don't do anything but take up space in the instruction stream and pipeline It would be much smaller (and probably faster) to either ldrcc r12, [pc, #pool_offset] bxcc r12 or even ldrc pc, [pc, #pool_offset]
This code has a bunch of extra changes in it, since I found a whole slew of bugs related to the multi-pool code.
Attachment #592058 - Flags: review?(Jacob.Bramley)
Comment on attachment 592058 [details] [diff] [review] shrink bailouts to 1 instruction Review of attachment 592058 [details] [diff] [review]: ----------------------------------------------------------------- Some tests have been modified. Are these left-over debug changes? Other than that, there are only a few minor glitches. r+ with those addressed. ::: js/src/ion/arm/Assembler-arm.cpp @@ +486,5 @@ > JS_ASSERT(branch->checkDest(temp)); > uint32 *dest = (uint32*) (targ_bot.decode() | (targ_top.decode() << 16)); > return dest; > + } else if (jump->is<InstLDR>()) { > + Should there be a TODO in here? @@ +1160,5 @@ > Assembler::as_genmul(Register dhi, Register dlo, Register rm, Register rn, > MULOp op, SetCond_ sc, Condition c) > { > > + writeInst(RN(dhi) | maybeRD(dlo) | RM(rm) | rn.code() | op | sc | c | mull_tag); What's mull_tag for? Doesn't it belong in the MULOp values? @@ +1238,5 @@ > poolVDTR = 3 > }; > + private: > + uint32 index : 17; > + uint32 cond : 4; Inconsistent spacing. @@ +1276,5 @@ > + int32 getIndex() { > + return index; > + } > + void setIndex(uint32 index_) { > + JS_ASSERT(ONES == 0xf && loadType != poolBOGUS); You should perhaps assert the ONES value on init, in case an index is provided there and not set again. @@ +1346,3 @@ > { > PoolHintPun php; > + php.phd.init(0, c, PoolHintData::poolDTR, dest); Ah, much cleaner :-) @@ +1872,5 @@ > +void > +Assembler::flushBuffer() > +{ > + m_buffer.flushPool(); > +} I think there should be spaces around functions. ::: js/src/ion/arm/MacroAssembler-arm.cpp @@ +851,5 @@ > > RelocBranchStyle > b_type() > { > + return LDR; Is this left in for experimentation? It produces lots of dead code in ma_b (and that is the only place that b_type is referenced.) ::: js/src/ion/arm/Trampoline-arm.cpp @@ +410,5 @@ > + // callee token > + // frame descriptor > + // padding <- sp now > + // return address > + masm.ma_dtr(IsLoad, sp, Imm32(4), r4, PreIndex); // r3 <- sizeDescriptor with FrameType. I think the comment should say r4. Also, I think 'frame descriptor' should be replaced with 'sizeDescriptor' in the layout comment. @@ +413,5 @@ > + // return address > + masm.ma_dtr(IsLoad, sp, Imm32(4), r4, PreIndex); // r3 <- sizeDescriptor with FrameType. > + // Remove the rectifier frame. > + // callee token > + // frame descriptor <- sp now; r3 <- frame descriptor r4 ::: js/src/ion/shared/Assembler-x86-shared.h @@ +835,5 @@ > return (ptrdiff_t)x; > } > + void flushBuffer() {} > + void finish() { > + } There are two different empty-function conventions in use there. Also, it might be a good idea to have a comment in there, to explain that they are deliberately empty. ::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h @@ +335,2 @@ > // align the pool. > + poolDest = (uint8*) (curPool->align(poolDest)); The (uint8*) cast shouldn't be necessary here. @@ +343,2 @@ > // align the pool. > + poolDest = (uint8*) (curPool->align(poolDest)); Ditto. @@ +617,5 @@ > > // We have a perforation. Time to cut the instruction stream, patch in the pool > // and possibly re-arrange the pool to accomodate its new location. > int poolOffset = perforation.getOffset(); > + int magicAlign = numDumps == 0 ? 0 : poolInfo[numDumps-1].finalPos - poolInfo[numDumps-1].offset; Perhaps magicAlign should be calculated by a helper function of some kind.
Attachment #592058 - Flags: review?(Jacob.Bramley) → review+
Effect of changing bailouts from MOVW/MOVT/BLX to LDR pc [pc, IMM] TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: 1.182x as fast 12383.2ms +/- 0.7% 10475.5ms +/- 0.3% significant ============================================================================= 3d: 1.078x as fast 2034.9ms +/- 4.2% 1886.9ms +/- 1.3% significant cube: 1.087x as fast 659.8ms +/- 5.9% 607.3ms +/- 1.8% significant morph: - 744.7ms +/- 8.3% 692.3ms +/- 0.3% raytrace: 1.073x as fast 630.4ms +/- 3.3% 587.3ms +/- 3.2% significant access: 1.086x as fast 2653.0ms +/- 0.1% 2442.1ms +/- 0.2% significant binary-trees: 1.024x as fast 412.4ms +/- 0.3% 402.7ms +/- 0.8% significant fannkuch: 1.126x as fast 1146.5ms +/- 0.1% 1018.1ms +/- 0.2% significant nbody: 1.038x as fast 747.9ms +/- 0.2% 720.2ms +/- 0.1% significant nsieve: 1.150x as fast 346.2ms +/- 0.2% 301.0ms +/- 0.1% significant bitops: 3.77x as fast 1661.8ms +/- 0.0% 441.3ms +/- 0.1% significant 3bit-bits-in-byte: 10.3x as fast 139.6ms +/- 0.1% 13.5ms +/- 0.1% significant bits-in-byte: 19.2x as fast 383.9ms +/- 0.0% 20.0ms +/- 1.5% significant bitwise-and: 47.2x as fast 686.2ms +/- 0.0% 14.5ms +/- 0.2% significant nsieve-bits: 1.150x as fast 452.1ms +/- 0.1% 393.3ms +/- 0.1% significant controlflow: 12.8x as fast 275.2ms +/- 0.1% 21.5ms +/- 0.2% significant recursive: 12.8x as fast 275.2ms +/- 0.1% 21.5ms +/- 0.2% significant crypto: 1.095x as fast 731.8ms +/- 0.2% 668.2ms +/- 0.1% significant aes: 1.176x as fast 358.5ms +/- 0.1% 304.8ms +/- 0.1% significant md5: 1.015x as fast 118.2ms +/- 0.4% 116.5ms +/- 0.4% significant sha1: 1.033x as fast 255.1ms +/- 0.6% 246.9ms +/- 0.4% significant date: 1.037x as fast 717.2ms +/- 0.8% 691.9ms +/- 0.6% significant format-tofte: 1.045x as fast 402.9ms +/- 0.5% 385.5ms +/- 0.3% significant format-xparb: 1.026x as fast 314.3ms +/- 1.7% 306.4ms +/- 1.3% significant math: 1.023x as fast 2688.7ms +/- 0.4% 2627.3ms +/- 0.5% significant cordic: 1.024x as fast 1026.6ms +/- 0.3% 1002.5ms +/- 0.4% significant partial-sums: 1.003x as fast 728.3ms +/- 0.2% 725.8ms +/- 0.3% significant spectral-norm: 1.039x as fast 933.7ms +/- 0.9% 899.0ms +/- 0.8% significant regexp: - 86.7ms +/- 0.2% 86.4ms +/- 0.2% dna: - 86.7ms +/- 0.2% 86.4ms +/- 0.2% string: *1.050x as slow* 1533.8ms +/- 0.2% 1609.9ms +/- 0.3% significant base64: 1.032x as fast 234.2ms +/- 0.3% 226.8ms +/- 0.3% significant fasta: 1.051x as fast 512.0ms +/- 0.6% 487.3ms +/- 0.6% significant tagcloud: *1.37x as slow* 308.7ms +/- 0.5% 423.4ms +/- 0.9% significant unpack-code: 1.050x as fast 215.8ms +/- 0.4% 205.5ms +/- 0.4% significant validate-input: *1.014x as slow* 263.1ms +/- 0.8% 266.9ms +/- 0.2% significant
(In reply to Marty Rosenberg [:Marty] from comment #3) > Effect of changing bailouts from MOVW/MOVT/BLX to LDR pc [pc, IMM] Well, I'd say that's worthwhile :-) Which processor did you try?
FWIW, this improvement doesn't show up on AWFY, yet.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: