Closed
Bug 502778
Opened 15 years ago
Closed 15 years ago
nanojit: speed up CseFilter
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-nanojit fixed-in-tracemonkey)
Attachments
(2 files, 1 obsolete file)
(deleted),
patch
|
edwsmith
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details |
This patch splits the CseFilter's LInsHashSet into seven, one for each
instruction group. This save somewhere between 3--6ms in SS. It also
increases the starting size of these LInsHashSets from 64--256, which seems
to help another couple of ms, but the difference is small enough that it's difficult to claim it's decisive.
The patch also does a little bit of cleaning up:
- renames some of the hash*() and find*() functions so that the suffixes
match the ins*() functions.
- avoids an unnecessary argument to hash() by computing hashcode() inside it
- reorders add() so that the call to find() isn't necessary (it adds the
item then checks for growth, so the find() within grow() covers the newly
added element)
- Removes CseReader, which is dead in TM and Tamarin
- removes LInsHashSet::replace(); also dead
I tried some further changes -- splitting add(), find() and grow() into more
specific kinds (addImm(), addImmq(), etc) but it didn't help speed at all.
Attachment #387130 -
Flags: review?(edwsmith)
Updated•15 years ago
|
Attachment #387130 -
Flags: review?(edwsmith) → review+
Comment 1•15 years ago
|
||
Comment on attachment 387130 [details] [diff] [review]
patch
A comment in CseFilter for the rationale behind N tables vs 1 would be nice, otherwise looks great.
Assignee | ||
Comment 2•15 years ago
|
||
The previous patch's effect somehow disappeared in the time since I first wrote
it. Here is a new patch. It's more extensive. (Nb: If you read LIR.h changes
before LIR.cpp changes it'll make more sense.)
- The separation of one hash table into nine is now done within LInsHashSet
rather than outside it. This was necessary for some of the subsequent
changes. The initial size of each one can now be controlled independently
from the others, which is good as some of them get more use than others.
- Previously, we had a mixture of specialised and generic operations:
findXYZ() was specialised for the instruction kind (Imm, Immq, etc) when
inserting, but add()/find()/grow()/equals()/hashcode() were generic, working
with any instruction kind. This was a bit sucky because certain operations
were effectively coded twice, most notably find(), and if the
implementations didn't match then problems would occur. Also, we had to
do switches in various places, most notably hashcode() and equals().
The patch gets rid of find(), equals() and hashcode() by making add() and
grow() specialised for the instruction kind (simply by taking a LInsKind
argument, which then causes the appropriate findXYZ() function to be used).
This required introducing additional findXYZ() wrapper functions that take
a LInsp argument, and putting them the m_find[] array.
- Now handling insImmf() separately from insImmq().
- Made variable naming more consistent: 'ins' for instructions, 'k' for
hashtable indices. And used the LInsKind names (eg. "Imm",
"Immq", "Call") more consistently (eg. avoided "imm", "call").
As well as improving the code's readability, it's a performance win because
the switches are avoided in hashcode() and equals(), and because the hash
tables are split and sized appropriately for the instruction kinds which
means that grow() is called about 1/2 as often. I'm currently getting about
a 4ms SS speedup. This might be optimistic -- some of the tests (eg.
3d-morph) have sped up surprisingly -- but the tests that should speed up
have (crypto-md5, 3d-raytrace, date-format-xparb) and I've confirmed with
Cachegrind that fewer instructions are being run so it's a clear win.
Attachment #387130 -
Attachment is obsolete: true
Attachment #399129 -
Flags: review?(edwsmith)
Assignee | ||
Comment 3•15 years ago
|
||
Comment 4•15 years ago
|
||
Comment on attachment 399129 [details] [diff] [review]
completely overhauled patch
it would be nice to allow kInitialCaps[kind] to be zero. a growth formula like this might work:
newsize = oldsize + oldsize/2 + K
where K is a largish initial bump, and the overall 1.5x growth factor has less waste when the tables get big
(and the math works nice for an initial size of 0)
Attachment #399129 -
Flags: review?(edwsmith) → review+
Assignee | ||
Comment 5•15 years ago
|
||
Ed, the current hashing algorithm requires that the table sizes are multiples of two, because they use a bitmask. Keeping multiples of two seems simpler.
Assignee | ||
Comment 6•15 years ago
|
||
Whiteboard: fixed-in-nanojit
Whiteboard: fixed-in-nanojit → fixed-in-nanojit fixed-in-tracemonkey
Comment 8•15 years ago
|
||
This code is failing on tamarin when SOFTFLOAT is defined (e.g. under winmo).
In LInsHashSet.grow() the loop over oldList there is a LIR_qjoin entry. I would imagine that this is incorrect and we should only have call's in the list.
It appears that the SoftFloatFilter is interacting poorly with this new cse filter (i.e. moving SoftFloatFilter downstream of the cse filter cures these symptoms).
Will require further investigation as moving the filter is not a solution, it will effectively disable optimization of any SoftFloatFilter changes.
Assignee | ||
Comment 9•15 years ago
|
||
Rick, the patch wasn't supposed to change functional behaviour of the CseFilter at all. But it did. Can you try the patch in bug 527288?
Comment 10•15 years ago
|
||
still see the issue post-patch.
Assignee | ||
Comment 11•15 years ago
|
||
(In reply to comment #10)
> still see the issue post-patch.
Bug 527754 is the follow-up bug for the SoftFloat issue.
Comment 12•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•