Closed Bug 584223 Opened 14 years ago Closed 14 years ago

Performance optimizations for sqrts via Math.pow

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

RESOLVED DUPLICATE of bug 564548

People

(Reporter: billm, Assigned: billm)

Attachments

(1 file, 1 obsolete file)

This makes a small change to the Math.pow runtime code to check whether the exponent is 0.5 or -0.5. In those cases, it calls sqrt (or computes 1.0/sqrt) instead of calling pow. This speeds up SunSpider's partial-sums on my laptop by about 10%.
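A minimal sketch of the fast path described above, for illustration only. It is not the attached patch: the function name `pow_with_sqrt_fast_path` is hypothetical, and the guard on finite, non-zero bases is an assumption added here because sqrt and pow disagree on a few IEEE 754 edge cases (for example, pow(-0.0, 0.5) is +0 while sqrt(-0.0) is -0, and pow(-Infinity, 0.5) is +Infinity while sqrt(-Infinity) is NaN).

```cpp
#include <cmath>

// Hypothetical helper illustrating the idea; not the code from the patch.
// Routes exponents of exactly 0.5 and -0.5 to sqrt, and everything else
// to the general pow routine.
double pow_with_sqrt_fast_path(double x, double y) {
    // Assumption: only take the fast path for finite, non-zero bases,
    // since sqrt and pow differ for -0.0 and -Infinity.
    if (std::isfinite(x) && x != 0.0) {
        if (y == 0.5) {
            return std::sqrt(x);
        }
        if (y == -0.5) {
            return 1.0 / std::sqrt(x);
        }
    }
    // General case, including NaN, infinities, and zero bases.
    return std::pow(x, y);
}
```

Note that 1.0/sqrt(x) involves two rounded operations, so its last bit may differ from what a given pow implementation returns for an exponent of -0.5.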
Attached patch: The patch (obsolete) (deleted)
Here's the actual patch.
Could you please also add this to the traceable native? (same file, search for pow)
Did as Andreas suggested. Speedup with the tracer running is ~20% on partial-sums.
Attachment #462552 - Attachment is obsolete: true
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Oops, different bug.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
SQRTPD xmm1, xmm2/m128: we should use that for sqrt and also for this pow special case.
Rockin! /be
SQRTPD is 2x faster than FSQRT, which libc seems to use. Also, we can special case on trace for the -0.5 there.
(In reply to comment #8)
> SQRTPD is 2x faster than FSQRT, which libc seems to use. Also, we can special
> case on trace for the -0.5 there.
I did a few benchmarks and I don't think it's worth the trouble.
1. I compared the performance of SQRTSD (the scalar version of SQRTPD) to FSQRT in the following loop: for (i=1..100000) { x += 1.0/sqrt(i); }. The SQRTSD version was 20% faster.
2. I translated the partial-sums benchmark to C and compiled it with gcc, comparing an SSE2 version to an x87 version. The x87 version was actually a little faster, although I don't know why.
Take this all with a grain of salt since my laptop has a pretty lame FPU. But then, lots of people have laptops.
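For reference, the loop from item 1 can be written as a small runnable sketch. The function names and the timing helper are hypothetical and are not the benchmark that was actually run. Whether the compiler emits SQRTSD or FSQRT depends on the target and flags (with gcc on x86, for instance, `-mfpmath=sse` versus `-mfpmath=387`), which is an assumption about how one might reproduce the comparison.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical reconstruction of the loop from the comment:
// for (i=1..n) { x += 1.0/sqrt(i); }
double inverse_sqrt_sum(int n) {
    double x = 0.0;
    for (int i = 1; i <= n; ++i) {
        x += 1.0 / std::sqrt(static_cast<double>(i));
    }
    return x;
}

// Runs the loop once, stores the sum in *result so the work is not
// optimized away, and returns the elapsed wall-clock time in milliseconds.
double time_inverse_sqrt_sum_ms(int n, double* result) {
    auto start = std::chrono::steady_clock::now();
    *result = inverse_sqrt_sum(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A single pass over 100000 iterations is very short, so a real measurement would likely repeat the call many times and compare the two builds on the same machine.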
Comment on attachment 462559: Patch for traceable native as well
This is a small but clear win. Let's not get bogged down in the asm; I suggest filing a follow-up bug for that. Any objections to my r+?
Attachment #462559 - Flags: review+
r=me
Assignee: general → wmccloskey
Does this just need to get landed at this point? Did it fall through the cracks?
Sorry, this change got folded in with bug 564548, which optimized pow in a different way.
Status: REOPENED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Resolution: FIXED → DUPLICATE