Closed Bug 584595 Opened 14 years ago Closed 14 years ago

TM: speed up scanning three ways

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

patch 1 (against TM 48641:8cae6d5cd69c) 14 years ago Nicholas Nethercote [inactive] (deleted), patch		Details \| Diff \| Splinter Review
patch 2 (against TM 48641:8cae6d5cd69c) 14 years ago Nicholas Nethercote [inactive] (deleted), patch		Details \| Diff \| Splinter Review
Parsemark patch comparison. 14 years ago Chris Leary [:cdleary] (not checking bugmail) (deleted), text/plain		Details
TM patch (against TM 49115:f516ec208e64) 14 years ago Nicholas Nethercote [inactive] (deleted), patch	cdleary : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

14 years ago

Attached patch patch 1 (against TM 48641:8cae6d5cd69c) (obsolete) (deleted) — Details — Splinter Review

getCharFillLinebuf() gets a lot of mispredicted branches when looking for EOL sequences. Based on my rule of thumb, which is that mispredicted branches always cost more than you think, I've tried to eliminate them. Attached are two patches. The first one is simpler and reduces the mispredictions by about 2/3, and also uses fewer instructions overall. The second one is more complex; it reduces the mispredictions almost completely, but incurs more instructions due to some shifting and masking. I'm possibly getting up to 5ms speedup on SS with either patch, but it's hard to tell with the usual SS noise. cdleary, can you try both patches with parsemark and see if any clear results emerge? Thanks.

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

14 years ago

Attached patch patch 2 (against TM 48641:8cae6d5cd69c) (obsolete) (deleted) — Details — Splinter Review

Attachment #463042 - Flags: feedback?(cdleary)

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Attachment #463041 - Flags: feedback?(cdleary)

Chris Leary [:cdleary] (not checking bugmail)

Comment 2

•

14 years ago

I see 2.30% for p1, 2.77% for p2. Note that you can also run parsemark on your local machine -- js/src/tests/parsemark.py and compare_bench.py, with the suite of tests located in bug 548621.

Chris Leary [:cdleary] (not checking bugmail)

Comment 3

•

14 years ago

Attached file Parsemark patch comparison. (deleted) — Details

Whoops, those aren't weighted results, d'oh. Looks like p1 does a little better in general. Attached the output.

Brendan Eich [:brendan]

Comment 4

•

14 years ago

Awesome -- keep this bug going. I wrote the original scanner in the days of 386 and MIPS (my fave!). /be

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Summary: TM: reduce branch misprediction rate in getCharFillLinebuf() → TM: speed up scanning three ways

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

14 years ago

Attached patch TM patch (against TM 49115:f516ec208e64) (deleted) — Details — Splinter Review

This patch has three improvements to the scanner. - First, it adds the maybeEOL table (as described above) to avoid mispredictions when checking for newlines in getCharFillLinebuf(). I went with the first of the above two patches, it was simpler and gave better results when parsing a really big file (see below for more). - Second, it adds the maybeStrSpecial table. When scanning inside a string, we currently have to do 4 tests in the common case, against: \n, \' or ", EOF, \\. This table allows these to be combined into a single test. - Third, it rearranges getChar(). Currently, in the common case, for every char we do an ungetbuf test, an end-of-linebuf test, and an is-it-newline test. With my change we just do an end-of-currbuf test. This makes getChar() small enough to inline. I did this by (a) changing ungetbuf to be a TokenBuf like linebuf and introducing 'currbuf' to point to the TokenBuf we're currently reading from; and (b) taking advantage of the fact that \n can only occur as the last char in linebuf/ungetbuf. Results on SS and parsemark are inconclusive, due to noise. But I also tried a simpler benchmark, which consisted of 22627 lines that looked like this: x = "<string>"; where <string> was 2348 chars long. This is just over 50MB. On this test I run 1.20--1.25x faster with the patch applied, so I figure it's probably worth landing, as it may help big scripts (eg. string-tagcloud, string-unpack-code, regexp-dna, v8-earley-boyer, v8-regexp).

Attachment #463041 - Attachment is obsolete: true

Attachment #463042 - Attachment is obsolete: true

Attachment #463985 - Flags: review?(cdleary)

Attachment #463041 - Flags: feedback?(cdleary)

Attachment #463042 - Flags: feedback?(cdleary)

Chris Leary [:cdleary] (not checking bugmail)

Comment 6

•

14 years ago

Comment on attachment 463985 [details] [diff] [review] TM patch (against TM 49115:f516ec208e64) Good stuff! I like the maybeEOL check better than the (d & 0xdfd0 == 0) one we were doing before. I don't think it's easy to create language-level tests for our scanner trickery... making the 1024th char a '\r' and 1025th a '\n' outside of a string in a way that would demonstrate bad behavior at a language level is hard to pull off the top of my head. I guess ideally we would have C++ tests, but that's not common SM practice. + ungetbuf.limit = ungetbuf.ptr = buf + UNGET_MAX; + linebuf.base = linebuf.limit = linebuf.ptr = buf + UNGET_MAX; Methinks UNGET_MAX should be called UNGET_LIMIT by convention -- max suffix is supposed to indicate the <= for-loop termination condition, whereas limit indicates <; i.e. the first char of linebuf is *(buf + UNGET_LIMIT). + maybeStrSpecial[uint32_t(EOF) & 0xff] = true; Why the uint32 cast? I think SM style also prefers uint32(EOF) -- people seem to dislike the _t's. + JS_ASSERT(currbuf = &linebuf); Double equals. + if (c == '\n' || c == EOF) { + ungetChar(c); + ReportCompileErrorNumber(cx, this, NULL, JSREPORT_ERROR, + JSMSG_UNTERMINATED_STRING); + goto error; + } Personally I would shove this under the more likely '\\' case, but it's not going to make one lick of difference perf wise. :-) Might be nice in a followup patch for us to make a TokenBuf class with some helper methods to indicate common patterns (like full() and pop()) and a ReverseTokenBuf to do similar for ungetbuf. It's a very isolated bit of code, though, so clearly low priority.

Attachment #463985 - Flags: review?(cdleary) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

14 years ago

Pushed with nits fixed: http://hg.mozilla.org/tracemonkey/rev/c2c09af14293

Whiteboard: fixed-in-tracemonkey

Robert Sayre

Comment 8

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/c2c09af14293

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

TM: speed up scanning three ways

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(2 files, 2 obsolete files)

Description

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Updated

Comment 5

Comment 6

Comment 7

Comment 8

Attachment

General

Description

File Name

Content Type