Closed
Bug 577883
Opened 14 years ago
Closed 14 years ago
TM: inline unit string comparison
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: gal, Assigned: gal)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•14 years ago
|
||
sayrer and I both found this issue separately over the last 2 years, time to finally fix it. I will try to find the original bug(s) and dup it/them. Tired and want to upload this.
Assignee | ||
Comment 2•14 years ago
|
||
Measured it again. This is closer to a 30ms/7% speedup with the other patch actually (on sunspider).
Assignee: general → gal
Assignee | ||
Updated•14 years ago
|
Attachment #456712 -
Flags: review?(nnethercote)
Comment 3•14 years ago
|
||
Comment on attachment 456712 [details] [diff] [review]
patch
>+template <typename T>
>+static inline bool CompareChars(const jschar **s1p, const jschar **s2p, size_t *np)
>+{
>+ const size_t N = sizeof(T) / sizeof(jschar);
>+ while (*np >= N) {
>+ if (*reinterpret_cast<const T *>(*s1p) != *reinterpret_cast<const T *>(*s2p))
>+ return false;
>+ (*np) -= N;
>+ (*s1p) += N;
>+ (*s2p) += N;
>+ }
>+ return true;
>+}
CompareChars is an awful name for a function returning a bool -- what does a true result mean?
"CharsAreEqual" or something similar would be better.
>+ inline bool equals(JSString *other) {
>+ /* Fast case: pointer equality could be a quick win. */
>+ if (this == other)
>+ return true;
>+
>+ size_t n = length();
>+ if (n != other->length())
>+ return false;
>+
>+ if (n == 0)
>+ return true;
>+
>+ const jschar *s1 = chars();
>+ const jschar *s2 = other->chars();
>+ if (n == 1)
>+ return *s1 == *s2;
>+
>+ if (sizeof(jsuword) == 8 && !js::CompareChars<jsuword>(&s1, &s2, &n))
>+ return false;
>+ if (!js::CompareChars<uint32>(&s1, &s2, &n))
>+ return false;
>+ JS_ASSERT(n == 0 || n == 1);
>+ return n == 0 || *s1 == *s2;
>+ }
Barf. So CompareChars is effectively unrolling the loop? On 64-bits you
first try comparing 8 bytes at a time, and then fall back to 4 bytes? I
guess you're assuming the compiler will eliminate the "sizeof(jsuword)..."
test on 32-bit machines as dead code? What happens if the size of the
string is not a multiple of 8 or 4 -- won't you be comparing values past the
end of the string?
In general, there's way too much going on here. Comments would help. I also think the template class confuses things. Just making the two instances explicit would probably be no longer but much easier to understand.
I only skimmed the rest of the patch, but I think the same problems occur as
in the patch for bug 577882 -- LIR_uc2ui used instead of LIR_us2ui and you're
assuming the string is flat.
Attachment #456712 -
Flags: review?(nnethercote) → review-
Comment 4•14 years ago
|
||
Is the unit string bit worth doing in JM?
Comment 5•14 years ago
|
||
Back to the patch: you added inlining, plus you special-cased 1-length strings in equals(), plus you did the loop unrolling in equals(). Any idea the perf benefits of these individually?
Also, would using memcmp() instead of unrolling the loop yourself be smarter? Presumably the system memcmp() is optimised already.
Comment 6•14 years ago
|
||
> Presumably the system memcmp() is optimised already.
We discovered this to not necessarily be the case on some operating systems when experimenting with indexOf. See 551539 for details.
Comment 7•14 years ago
|
||
(In reply to comment #6)
> > Presumably the system memcmp() is optimised already.
>
> We discovered this to not necessarily be the case on some operating systems
> when experimenting with indexOf. See 551539 for details.
That's the kind of thing that I would *love* to see in a comment!
Comment 8•14 years ago
|
||
The comments added with the code in bug 551539 mention the issue, yes. Though note that memcmp was faster than manual loop on Windows/Mac for long enough strings, too.
Assignee | ||
Updated•14 years ago
|
Summary: TM: inline unit string comparison and speed up js_StringEquals → TM: inline unit string comparison
Assignee | ||
Comment 9•14 years ago
|
||
Attachment #456712 -
Attachment is obsolete: true
Comment 10•14 years ago
|
||
Cachegrind results of this patch plus the one in bug 584499:
---------------------------------------------------------------
| millions of instructions executed |
| total | on-trace (may overestimate) |
---------------------------------------------------------------
| 111.202 111.487 (0.997x) | 20.797 20.809 (0.999x) | 3d-cube
| 56.990 56.948 (1.001x) | 22.313 22.313 (------) | 3d-morph
| 147.820 146.788 (1.007x) | 21.982 21.971 (1.001x) | 3d-raytrace
| 77.548 77.419 (1.002x) | 17.401 17.400 (------) | access-binary-
| 183.728 183.440 (1.002x) | 103.707 103.704 (------) | access-fannkuc
| 41.565 41.403 (1.004x) | 17.553 17.551 (------) | access-nbody
| 60.014 59.983 (1.001x) | 27.013 27.012 (------) | access-nsieve
| 15.688 15.681 (------) | 2.862 2.862 (------) | bitops-3bit-bi
| 43.303 43.285 (------) | 30.280 30.280 (------) | bitops-bits-in
| 22.773 22.770 (------) | 10.219 10.219 (------) | bitops-bitwise
| 71.624 71.610 (------) | 39.453 39.453 (------) | bitops-nsieve-
| 42.397 42.309 (1.002x) | 26.655 26.654 (------) | controlflow-re
| 47.115 47.002 (1.002x) | 5.541 5.660 (0.979x) | crypto-md5
| 31.336 31.286 (1.002x) | 7.030 7.109 (0.989x) | crypto-sha1
| 127.956 121.973 (1.049x) | 10.461 12.076 (0.866x) | date-format-to
| 99.144 98.905 (1.002x) | 11.118 11.116 (------) | date-format-xp
| 52.617 52.594 (------) | 27.614 27.614 (------) | math-cordic
| 38.416 38.406 (------) | 6.252 6.252 (------) | math-partial-s
| 28.590 28.474 (1.004x) | 10.286 10.285 (------) | math-spectral-
| 100.938 100.914 (------) | 75.045 75.045 (------) | regexp-dna
| 44.077 40.705 (1.083x) | 8.098 10.397 (0.779x) | string-base64
| 123.212 123.302 (0.999x) | 24.544 24.543 (------) | string-fasta
| 272.611 269.834 (1.010x) | 6.840 6.671 (1.025x) | string-tagclou
| 251.422 251.321 (------) | 5.637 5.635 (------) | string-unpack-
| 64.493 63.695 (1.013x) | 7.401 7.521 (0.984x) | string-validat
-------
| 2156.590 2141.546 (1.007x) | 546.114 550.163 (0.993x) | all
Basically it's a good improvement for date-format-tofte, little difference for anything else.
Comment 11•14 years ago
|
||
Comment on attachment 463431 [details] [diff] [review]
patch
>+ if (!l_str->isRope() && !r_str->isRope() && l_str->length() == 1 && r_str->length() == 1) {
>+ VMSideExit *exit = snapshot(BRANCH_EXIT);
>+ guard(true, lir->ins2(LIR_eqp, getStringLength(l_ins), INS_CONSTWORD(1)), exit);
>+ guard(true, lir->ins2(LIR_eqp, getStringLength(r_ins), INS_CONSTWORD(1)), exit);
Factor out INS_CONSTWORD(1) so CseFilter doesn't have to?
>+JS_REQUIRES_STACK LIns*
>+TraceRecorder::getStringLength(LIns* str_ins)
>+{
>+ return lir->ins2ImmI(LIR_rshup,
>+ lir->insLoad(LIR_ldp, str_ins, offsetof(JSString, mLengthAndFlags),
>+ ACCSET_OTHER, LOAD_CONST),
>+ JSString::FLAGS_LENGTH_SHIFT);
>+}
>+
>+JS_REQUIRES_STACK LIns*
>+TraceRecorder::getStringChars(LIns* str_ins)
>+{
>+ return lir->insLoad(LIR_ldp, str_ins, offsetof(JSString, mChars), ACCSET_OTHER, LOAD_CONST);
>+}
I like these. I'll like them even more if you use addName() to give the
values names like "strLen" and "strChars". You could do likewise in
getStringLength() with mLengthAndFlags.
Nice patch!
Attachment #463431 -
Flags: review+
Assignee | ||
Comment 12•14 years ago
|
||
Assignee | ||
Updated•14 years ago
|
Whiteboard: fixed-in-tracemonkey
Comment 13•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•