Bug 637549 - Speed up expression parsing
Status: RESOLVED FIXED (opened 14 years ago, closed 14 years ago)
Component: Core :: JavaScript Engine (defect)
Reporter: n.nethercote
Assignee: n.nethercote
Whiteboard: fixed-in-tracemonkey
Attachments: 2 files, 1 obsolete file
  patch (deleted) - jimb: review+, brendan: review+
  patch (deleted) - jimb: review+, brendan: review+

Description
I have two patches that speed up expression parsing. Coming shortly.
Comment 1 (Assignee) • 14 years ago
If you're parsing an expression that doesn't involve any operators, e.g. |3| or |foo|, a whole heap of operator checks fail in the parser functions condExpr, orExpr, ..., addExpr, and mulExpr. More specifically, every one of those calls matchToken() once or twice to see if a particular operator is present. Most of the time no matching operator is present, and a failing matchToken() involves a getToken() followed by an ungetToken(). So we end up ungetting and regetting the same token about 15 times in a row. Even though ungetting and regetting tokens is cheap, it adds up.
This patch tweaks these parsers so that when they return, TokenStream's state is always one token past the end of the expression, instead of zero tokens past the end. This means the next token is always at hand, so we can avoid all the ungetting and regetting. (cdleary, I think you tried doing something like this at one point? I couldn't find the bug, though I think it had a title like "pump prime the parser"?)
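To illustrate the shape of the change, here is a minimal, self-contained sketch of the "one token past the end" convention using a toy grammar and invented names; it is not the actual Parser/TokenStream code, just the pattern the patch moves to.

// Toy recursive-descent parser illustrating the convention: each sub-parser
// returns with the token stream positioned one token *past* its expression,
// so the caller can inspect the current token directly instead of probing
// with a get-then-unget pair.  All names and the grammar are invented for
// this sketch.
#include <cstdio>
#include <vector>

enum TokKind { TOK_NUM, TOK_STAR, TOK_PLUS, TOK_END };
struct Tok { TokKind kind; int value; };

struct ToyTokenStream {
    std::vector<Tok> toks;
    size_t pos;
    Tok getToken() { return toks[pos++]; }               // advance and return
    Tok currentToken() const { return toks[pos - 1]; }   // last token fetched
};

// mulExpr: NUM ('*' NUM)*.  Leaves the stream one token past the expression.
static int mulExpr(ToyTokenStream &ts) {
    int v = ts.getToken().value;       // primary: a number
    ts.getToken();                     // read one token past the end
    while (ts.currentToken().kind == TOK_STAR) {
        v *= ts.getToken().value;
        ts.getToken();                 // again leave one token past the end
    }
    return v;
}

// addExpr: mulExpr ('+' mulExpr)*.  No ungetToken()/getToken() pair needed:
// whatever follows the sub-expression is already the current token.
static int addExpr(ToyTokenStream &ts) {
    int v = mulExpr(ts);
    while (ts.currentToken().kind == TOK_PLUS)
        v += mulExpr(ts);
    return v;
}

int main() {
    // Tokens for "2*3+4"; expect 10.
    ToyTokenStream ts;
    ts.toks = { {TOK_NUM, 2}, {TOK_STAR, 0}, {TOK_NUM, 3},
                {TOK_PLUS, 0}, {TOK_NUM, 4}, {TOK_END, 0} };
    ts.pos = 0;
    std::printf("%d\n", addExpr(ts));
    return 0;
}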
For benchmarking, I used parsemark with an additional test (kraken-imaging) that contains the huge array-of-numbers from Kraken's imaging tests. Instruction count results with the patch:
---------------------------------------------------------------
| millions of instructions executed |
| total | compiled (may overestimate) |
---------------------------------------------------------------
| 15.111 14.244 (1.061x) | 163875 164163 (0.998x) | 280slides-comb
| 61.342 56.730 (1.081x) | 835251 836595 (0.998x) | ball-pool-comb
| 31.240 28.649 (1.090x) | 395039 395979 (0.998x) | dojo
| 57.514 53.144 (1.082x) | 764475 765647 (0.998x) | effectgames-ga
| 14.342 13.174 (1.089x) | 177572 178058 (0.997x) | ga
| 324.741 291.769 (1.113x) | 5.069 5.080 (0.998x) | gmail-combined
| 94.907 86.478 (1.097x) | 1.313 1.315 (0.998x) | gravity-combin
| 21.909 20.206 (1.084x) | 273781 274451 (0.998x) | jquery-min
| 39.960 36.004 (1.110x) | 520153 521695 (0.997x) | jsgb-combined
| 779.284 645.875 (1.207x) | 10.870 10.870 (------) | kraken-imaging
| 58.048 53.602 (1.083x) | 788597 790169 (0.998x) | mochikit
| 273.672 248.183 (1.103x) | 3.818 3.829 (0.997x) | pipio-combined
| 12.362 11.688 (1.058x) | 127551 127865 (0.998x) | processing-twi
| 41.228 39.026 (1.056x) | 336458 336918 (0.999x) | sunspider-comb
| 18.487 16.969 (1.089x) | 234882 235466 (0.998x) | tetris-combine
| 70.310 64.348 (1.093x) | 968758 971506 (0.997x) | twitter-combin
| 74.674 68.090 (1.097x) | 945588 947736 (0.998x) | v8-combined
| 9.567 8.987 (1.065x) | 102786 103026 (0.998x) | yui-min
| 434.166 397.597 (1.092x) | 6.017 6.026 (0.998x) | zimbra-combine
-------
| 2432.874 2154.772 (1.129x) | 33.724 33.772 (0.999x) | all
I imported the parsemark+kraken-imaging files into the SunSpider harness and saw a 1.10x speedup, which matches the instruction counts closely. I also tried measuring with test/parsemark.py, but I've never managed to see anything close to deterministic results with it on previous occasions and the same was true this time.
I also measured some function counts. The patch reduces the number of unget/reget operations by roughly 3x.
Attachment #515815 - Flags: review?(jimb)
Comment 2 (Assignee) • 14 years ago
This patch is less elegant than the previous one. It just forces inlining of all the expression parsers in the hot cases (i.e. when the operator matching fails). This requires having two versions of each expression parser, one that's always inlined and one that's never inlined. I also reordered the functions so that each one is defined before its call sites, to assist the compiler with the inlining.
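As a rough illustration of the two-version pattern, here is a sketch using GCC/Clang attributes directly; the actual patch generates the pairs with the BEGIN_EXPR_PARSER/END_EXPR_PARSER macros quoted in comment 9 (via JS_ALWAYS_INLINE and JS_NEVER_INLINE), and the bodies are the real expression parsers rather than the stubs here.

// Sketch of the always-inline / never-inline pairing.  Each precedence level
// gets an inline version (suffix 'i') that the next-higher level calls on the
// hot path, plus an out-of-line wrapper (suffix 'n') for the rarer call
// sites, so the common "no operator matched" fall-through pays no call
// overhead.
#include <cstdio>

struct ToyParser {
    __attribute__((always_inline)) inline int mulExpri() {
        return 6;                       // stands in for the real parsing body
    }
    __attribute__((noinline)) int mulExprn() {
        return mulExpri();              // cold-path, out-of-line entry point
    }
    __attribute__((always_inline)) inline int addExpri() {
        return mulExpri() + 4;          // inlines straight through mulExpri()
    }
    __attribute__((noinline)) int addExprn() {
        return addExpri();
    }
};

int main() {
    ToyParser p;
    std::printf("%d\n", p.addExprn());  // prints 10
    return 0;
}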
I see a 1.063x speedup and the following instruction count improvements (both of those are relative to the first patch; i.e. with both patches applied I see a 1.17x speedup).
---------------------------------------------------------------
| millions of instructions executed |
| total | compiled (may overestimate) |
---------------------------------------------------------------
| 14.244 13.883 (1.026x) | 164163 160059 (1.026x) | 280slides-comb
| 56.729 54.753 (1.036x) | 836595 808549 (1.035x) | ball-pool-comb
| 28.649 27.575 (1.039x) | 395979 384547 (1.030x) | dojo
| 53.144 51.386 (1.034x) | 765647 747007 (1.025x) | effectgames-ga
| 13.174 12.686 (1.038x) | 178058 171950 (1.036x) | ga
| 291.769 277.652 (1.051x) | 5.080 4.897 (1.037x) | gmail-combined
| 86.478 82.920 (1.043x) | 1.315 1.271 (1.035x) | gravity-combin
| 20.206 19.494 (1.037x) | 274451 265883 (1.032x) | jquery-min
| 36.004 34.462 (1.045x) | 521695 509671 (1.024x) | jsgb-combined
| 645.874 595.846 (1.084x) | 10.870 10.870 (------) | kraken-imaging
| 53.603 51.736 (1.036x) | 790169 767213 (1.030x) | mochikit
| 248.183 237.449 (1.045x) | 3.829 3.690 (1.038x) | pipio-combined
| 11.688 11.407 (1.025x) | 127865 124451 (1.027x) | processing-twi
| 39.026 38.162 (1.023x) | 336918 331574 (1.016x) | sunspider-comb
| 16.969 16.328 (1.039x) | 235466 227870 (1.033x) | tetris-combine
| 64.348 61.822 (1.041x) | 971506 935974 (1.038x) | twitter-combin
| 68.090 65.435 (1.041x) | 947736 925136 (1.024x) | v8-combined
| 8.987 8.747 (1.027x) | 103026 100566 (1.024x) | yui-min
| 397.597 382.151 (1.040x) | 6.026 5.838 (1.032x) | zimbra-combine
-------
| 2154.772 2043.902 (1.054x) | 33.772 33.029 (1.023x) | all
I see a ~12KB increase in the binary size on a 32-bit Linux build and a ~20KB increase on a 32-bit Mac build.
Attachment #515818 - Flags: review?(jimb)
Comment 3 • 14 years ago
Wow, those are great results. I will try to look at this tomorrow.
Comment 4 (Assignee) • 14 years ago
Wait until you see what I have in store for the scanner! :)
No rush on the review, this obviously is post-4.0.
Comment 5 • 14 years ago
(In reply to comment #1)
> This patch tweaks these parsers so that when they return, TokenStream's state
> is always one token past the end of the expression, instead of zero tokens past
> the end. This means the next token is always at hand, so we can avoid all the
> ungetting and regetting. (cdleary, I think you tried doing something like this
> at one point? I couldn't find the bug, though I think it had a title like
> "pump prime the parser"?)
Bug 556486, my first patch. I saw no speedup on parsemark.
Comment 6 (Assignee) • 14 years ago
Review ping!
With bug 639420 having landed, the combined instruction count reduction from these two patches has improved from 1.17x to 1.248x.
Comment 7 (Assignee) • 14 years ago
Somehow I managed to accidentally combine patch 1 into patch 2 v1. Here is the proper patch.
Attachment #515818 - Attachment is obsolete: true
Attachment #515818 - Flags: review?(jimb)
Attachment #520122 - Flags: review?(jimb)
Comment 8 • 14 years ago
Comment on attachment 515815 [details] [diff] [review]
patch 1 (against TM 62858:f85299200bd3)
>@@ -6598,137 +6597,142 @@ Parser::condExpr()
> JSParseNode *pn3 = assignExpr();
> if (!pn3)
> return NULL;
> pn->pn_pos.begin = pn1->pn_pos.begin;
> pn->pn_pos.end = pn3->pn_pos.end;
> pn->pn_kid1 = pn1;
> pn->pn_kid2 = pn2;
> pn->pn_kid3 = pn3;
>+ tokenStream.getToken(); /* need to read one token past the end */
This can fail due to invalid char/token, right?
r=me FWIW, :jimb r+ awaited -- good thinking!
/be
Attachment #515815 - Flags: review+
Comment 9 • 14 years ago
Comment on attachment 520122 [details] [diff] [review]
patch 2, v2
..12345678901234567890123456789012345678901234567890123456789012345678901234567890
>+#define BEGIN_EXPR_PARSER(name) \
>+ JS_ALWAYS_INLINE JSParseNode * \
>+ Parser::name##i() \
>+ {
>+
>+#define END_EXPR_PARSER(name) \
>+ } \
>+ JS_NEVER_INLINE JSParseNode * \
>+ Parser::name##n() { \
>+ return name##i(); \
>+ }
Please put \ in column 79 per prevailing style. If you can squish macro bodies to fit without too much unreadability, do it.
>+BEGIN_EXPR_PARSER(mulExpr1)
>+ TokenKind tt;
>+ JSParseNode *pn = unaryExpr();
Nit: blank line here.
>+ /*
>+ * Note: unlike addExpr1() et al, we use getToken() here instead of
>+ * isCurrentTokenType() because unaryExpr() doesn't leave the TokenStream
>+ * state one past the end of the unary expression.
>+ */
>+ while (pn && (tt = tokenStream.getToken(), (tt == TOK_STAR || tt == TOK_DIVOP))) {
Nit: we often nest assignment in the first conjunct or disjunct, so long as that assignment dominates all later uses in the condition. So:
.. while (pn && ((tt = tokenStream.getToken()) == TOK_STAR || tt == TOK_DIVOP)) {
Avoids a repeated tt and the comma operator.
Crazy-good thinking here. What is the rationale for inlining only below andExpr, though? Sorry if I missed it. r=me FWIW.
/be
Attachment #520122 - Flags: review+
Comment 10 • 14 years ago
I'll take a look at this on Monday. Sorry I didn't get to it this week.
Comment 11 (Assignee) • 14 years ago
(In reply to comment #9)
>
> Crazy-good thinking here. What is the rationale for inlining only below andExpr,
> though? Sorry if I missed it.
I don't understand the question... all the functions between condExpr1 and mulExpr1 are always inlined (excluding the off-the-hot-path versions with a name that ends with 'n').
Comment 12 (Assignee) • 14 years ago
(In reply to comment #8)
> Comment on attachment 515815 [details] [diff] [review]
> patch 1 (against TM 62858:f85299200bd3)
>
> >@@ -6598,137 +6597,142 @@ Parser::condExpr()
> > JSParseNode *pn3 = assignExpr();
> > if (!pn3)
> > return NULL;
> > pn->pn_pos.begin = pn1->pn_pos.begin;
> > pn->pn_pos.end = pn3->pn_pos.end;
> > pn->pn_kid1 = pn1;
> > pn->pn_kid2 = pn2;
> > pn->pn_kid3 = pn3;
> >+ tokenStream.getToken(); /* need to read one token past the end */
>
> This can fail due to invalid char/token, right?
It can "fail" only in the sense that the current token can be set to a TOK_ERROR token. AFAICT that's not a problem.
Updated • 14 years ago
Attachment #515815 - Flags: review?(jimb) → review+
Comment 13 • 14 years ago
Comment on attachment 520122 [details] [diff] [review]
patch 2, v2
We should have a follow-up patch that makes this more friendly to editors that try to understand indentation; Emacs is going to make this code pretty hard to work with.
Attachment #520122 - Flags: review?(jimb) → review+
Comment 14 (Assignee) • 14 years ago
After discussion with jimb on IRC I added some curly braces to the 2nd patch to make Emacs happy.
http://hg.mozilla.org/tracemonkey/rev/36a1aba3cd60
http://hg.mozilla.org/tracemonkey/rev/0ffe53f7606a
Whiteboard: fixed-in-tracemonkey
Comment 15 • 14 years ago
http://hg.mozilla.org/mozilla-central/rev/36a1aba3cd60
http://hg.mozilla.org/mozilla-central/rev/0ffe53f7606a
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED