Open Bug 837334 Opened 12 years ago Updated 2 years ago

BaselineCompiler: Inline cache for x % y when y is a power of 2

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86
macOS
defect

People

(Reporter: bhackett1024, Unassigned)

References

(Blocks 1 open bug)

Details

The critical loop in kraken-audio-oscillator looks like this:

  for ( var i = 0; i < this.bufferSize; i++ ) {
    offset = Math.round((frameOffset + i) * step);
    this.signal[i] = this.waveTable[offset % this.waveTableLength] * this.amplitude;
  }

The ModI in there is really painful. this.waveTableLength is always 2048, and if I change the array index to 'offset & 0x7ff' our time improves from 207ms to 151ms. v8 improves from 130ms to 120ms, so they seem to be doing something with this already (their raw ModI perf seems much worse than ours, so I don't think this is just better code for the %).
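For reference, the equivalence this relies on can be checked with a quick sketch. The helper names below are made up for illustration, and the sketch assumes offset is a non-negative integer in int32 range, which is an assumption about the values this loop produces rather than something the benchmark guarantees.

```javascript
// Hypothetical helpers for illustration only; not engine or benchmark code.

// Generic index computation, as written in the benchmark loop.
function modIndex(offset, length) {
  return offset % length;
}

// Masked index computation. Only equivalent to modIndex when length is a
// power of 2 and offset is a non-negative integer that fits in an int32.
function maskIndex(offset, length) {
  return offset & (length - 1);
}

// A positive integer is a power of 2 when it has exactly one bit set.
function isPowerOfTwo(n) {
  return n > 0 && (n & (n - 1)) === 0;
}
```

The two forms differ for negative offsets (the % result keeps the sign of the left operand, the mask does not), so any specialization would have to establish that the lhs is non-negative as well.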

Knowing during Ion compilation that this.waveTableLength is 2048 needs more precise information than we can get from TI, and determining this with a baseline IC attached to the mod seems the way to go. If we can determine during Ion compilation that there is a single stub attached to the mod which specializes for rhs == 2048, then Ion can generate a bitand for the mod and loop-hoist a guard that this.waveTableLength == 2048.
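Written out at the source level, the effect of that transform would be roughly the following. This is only a sketch of the idea, not what Ion would actually emit: fillSignal and its parameters are hypothetical names, the guard is shown as a plain branch with a generic fallback, and it assumes offset is a non-negative int32 and that waveTableLength does not change inside the loop.

```javascript
// Hypothetical source-level picture of the proposed optimization.
function fillSignal(signal, waveTable, waveTableLength, amplitude, frameOffset, step) {
  var i, offset;
  if (waveTableLength === 2048) {
    // Guard hoisted out of the loop: the rhs value the stub specialized on.
    for (i = 0; i < signal.length; i++) {
      offset = Math.round((frameOffset + i) * step);
      signal[i] = waveTable[offset & 0x7ff] * amplitude; // bitand instead of mod
    }
  } else {
    // Guard failed: keep the generic mod.
    for (i = 0; i < signal.length; i++) {
      offset = Math.round((frameOffset + i) * step);
      signal[i] = waveTable[offset % waveTableLength] * amplitude;
    }
  }
  return signal;
}
```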

FWIW, the rest of the gap between us and v8 seems to be related to the this.signal buffer above.  This is an array with 8192 elements, and we allocate 500 of these objects (one per oscillator, the benchmark just does the same thing 500 times).  If I reuse the same array instead of allocating a new one each time (on top of replacing the mod with a bitand), our score improves to 99 and v8 improves to 109.  Allocating the 8192 elements all at once rather than resizing each array 10 times (ouch) gives us 7ms or so, but doing anything more seems to be blocked on ggc.
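The buffer change I tried amounts to something like the sketch below. Oscillator here is a stand-in rather than the benchmark's actual class, and makeSignalBuffer and BUFFER_SIZE are hypothetical names. Sharing one buffer is only valid if each oscillator's output is consumed before the next one writes, and whether requesting the full length up front avoids the resizes depends on the engine.

```javascript
// Hypothetical sketch of reusing one signal buffer across oscillators.
var BUFFER_SIZE = 8192;

function makeSignalBuffer(size) {
  // Ask for the full length at once instead of growing element by element.
  var buf = new Array(size);
  for (var i = 0; i < size; i++) {
    buf[i] = 0;
  }
  return buf;
}

function Oscillator(sharedSignal) {
  // Reuse the caller's buffer when given one; otherwise allocate a fresh one.
  this.signal = sharedSignal || makeSignalBuffer(BUFFER_SIZE);
}

// One allocation shared by all 500 oscillators instead of 500 allocations.
var shared = makeSignalBuffer(BUFFER_SIZE);
var oscillators = [];
for (var n = 0; n < 500; n++) {
  oscillators.push(new Oscillator(shared));
}
```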
Assignee: general → nobody
Severity: normal → S3