Closed
Bug 564369
Opened 15 years ago
Closed 14 years ago
streamline TokenStream::getChar()
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(10 files, 1 obsolete file)
(deleted),
patch
|
cdleary
:
review+
brendan
:
feedback+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
getChar() gets the next character, normalizing all EOL sequences to '\n' as
it goes. It's pretty horrid, largely because a \r\n sequence can get split,
which makes everything that follows more difficult.
This patch:
- Replaces the use of js_fgets() with a new method
TokenStream::fillUserbuf() which has slightly different behaviour.
(The reason for the name fillUserbuf() will become clear shortly.)
(I kept js_fgets() in place because it's a JS_FRIEND_API function, I'd be
happy to get rid of it though as there are no other users within
SpiderMonkey.)
fillUserbuf() ensures that a \r\n sequence is never split. It does
this by always keeping a spare char in the buffer free so a final \n can
be grabbed if necessary.
- This allows getChar() to be simplified quite a bit. The EOL normalizing
code is currently near the end. All the code in the case marked with
"Does the line segment end in \r? We must check for a \n..." is no longer
necessary. (I also added some extra assertions in the related cases.)
- Removing that code removes the only place that sets TSF_CRFLAG. So
earlier in the function I removed all the code that depends on TSF_CRFLAG
being set. And then I removed all occurrences of TSF_CRFLAG.
- Having done that, there's no longer any need to first get the data into
cbuf[] and then copy it into userbuf, so I removed cbuf and changed
getLineFromFile() to write directly into userbuf.base. Hence the name
'fillUserbuf()'.
Assignee | ||
Comment 1•15 years ago
|
||
Cachegrind says this reduces instruction counts for string-tagcloud by 1.4%. I haven't done any SS timings. I tried parsemark but got inconclusive results, cdleary said he would try the patch on his machine.
Assignee | ||
Comment 2•15 years ago
|
||
(In reply to comment #1)
> Cachegrind says this reduces instruction counts for string-tagcloud by 1.4%.
getChar() uses about 25% fewer instructions, and fillUserbuf() uses about 15% fewer than js_fgets() did.
Assignee | ||
Comment 3•15 years ago
|
||
I have some follow-up improvements, BTW, but this is enough changes for one patch.
Assignee | ||
Updated•15 years ago
|
Attachment #444037 -
Flags: review?(cdleary)
Attachment #444037 -
Flags: feedback?(brendan)
Comment 4•15 years ago
|
||
Both my x86 Pentium Dual Core Linux box and x64 macbook show marginal improvements, no regressions. http://hg.mozilla.org/users/cleary_mozilla.com/parsemark_results/file/4829c0da5441/escobar/tm_parsemark_eolfix_comparison_2010_05_07.txt
Assignee | ||
Comment 5•15 years ago
|
||
I think this change won't help much when the main activity is proper parsing, as I suspect it is for parsemark.
But for benchmarks that contain whopping great strings -- like string-tagcloud, regexp-dna, string-unpack-code -- more of the work is just in the scanning, and this change has a more noticeable effect.
Assignee | ||
Comment 6•15 years ago
|
||
And even if there was no perf benefit, this patch is worth it for making things clearer! :)
Comment 7•15 years ago
|
||
Comment on attachment 444037 [details] [diff] [review]
patch
Nice work -- it pays to read old code.
You could reduce TSF_ renumbering churn by swapping the last flag for the one being removed. We do that elsewhere to minimize diff size, and sometimes it wins anyway since the last flag is "more important" than the earlier ones, esp. the one being removed (whose removal signifies its marginality!).
/be
Attachment #444037 -
Flags: feedback?(brendan) → feedback+
Comment 8•15 years ago
|
||
Cool! Any effects on the aforementioned SS benchmarks?
Assignee | ||
Comment 9•15 years ago
|
||
(In reply to comment #8)
> Cool! Any effects on the aforementioned SS benchmarks?
Cachegrind says instruction counts are down by 1--1.5% but timing differences are within noise.
To give you an idea, the number of instructions in getChar() on string-tagcloud dropped from 15.4M to 11.9M, and the number of instructions in js_fgets() dropped from 3.3M to 2.8M. And I'm hoping to get it down substantially further with a couple of follow-up changes. If I'm lucky I may even be able to split the slow code off into a separate function which might allow the fast path to be inlined, although ungetbuf[] might make that tricky.
(I should probably do a stack of mq patches to see the effect all at once but I still haven't got into that style of working, it makes me nervous.)
Assignee | ||
Comment 10•15 years ago
|
||
(In reply to comment #7)
>
> You could reduce TSF_ renumbering churn by swapping the last flag for the one
> being removed. We do that elsewhere to minimize diff size, and sometimes it
> wins anyway since the last flag is "more important" than the earlier ones, esp.
> the one being removed (whose removal signifies its marginality!).
Which is considered the greater good -- minimize churn or keeping the flags in a sensible order? I can go either way but I lean towards the former.
Comment 11•15 years ago
|
||
(In reply to comment #10)
> (In reply to comment #7)
> >
> > You could reduce TSF_ renumbering churn by swapping the last flag for the one
> > being removed. We do that elsewhere to minimize diff size, and sometimes it
> > wins anyway since the last flag is "more important" than the earlier ones, esp.
> > the one being removed (whose removal signifies its marginality!).
>
> Which is considered the greater good -- minimize churn or keeping the flags in
> a sensible order? I can go either way but I lean towards the former.
It depends -- too much of the former and you end up needing the latter in a big "recover sensible order" patch. This particular case looks like win-win for swap-and-delete.
/be
Assignee | ||
Comment 12•15 years ago
|
||
Comment on attachment 444037 [details] [diff] [review]
patch
I'm going to redo this change and some follow-up ones as a stack of patches, a la bug 553671.
Attachment #444037 -
Attachment is obsolete: true
Attachment #444037 -
Flags: review?(cdleary)
Assignee | ||
Updated•15 years ago
|
Summary: don't allow \r\n EOL sequence to be split in getChar() → streamline TokenStream::getChar()
Assignee | ||
Comment 13•15 years ago
|
||
This patch:
- Adds TokenStream::getLineFromFile() and replaced getChar()'s call to
js_fgets() with a call to it. It's similar to js_fgets() but it
guarantees that a \r\n sequence won't be split, which will make things
easier for getChar(). It's also slightly faster as it unrolls the loop
following a \r, thus avoiding the 'crflag' check every time around the
loop. It also doesn't bother appending a '\0' because getChar() doesn't
need it to.
- Doesn't remove js_fgets(), even though it now has no callers, because it's
a JS_FRIEND_API function.
string-tagcloud is the benchmark most affected by getChar() changes. This
patch reduced the instruction count (as measured by Cachegrind) for
string-tagcloud from 283.4M to 282.5M.
Attachment #444578 -
Flags: review?(cdleary)
Assignee | ||
Comment 14•15 years ago
|
||
This patch:
- Removes the handling of the split \r\n case in getChar(), because it is no
longer possible.
- Adds some comments and extra assertions.
string-tagcloud: 282.5M (unchanged)
Attachment #444579 -
Flags: review?(cdleary)
Assignee | ||
Comment 15•15 years ago
|
||
This patch:
- Removes all code involving TSF_CRFLAG and crflag, because they are now
never set.
- Declares some variables closer to their first use point rather than at the
top of getChar().
string-tagcloud: 282.5M --> 281.9M
Attachment #444580 -
Flags: review?(cdleary)
Assignee | ||
Comment 16•15 years ago
|
||
This patch:
- Removes cbuf, which is no longer needed, instead reading directly from
file into userbuf.
- Renames getLineFromFile() as fillUserbuf() and removes its arguments
accordingly.
- Tweaks the checking of the fillUserbuf()'s return value.
string-tagcloud: 281.9M --> 279.4M
Attachment #444581 -
Flags: review?(cdleary)
Assignee | ||
Comment 17•15 years ago
|
||
This patch:
- Changes 'len' and 'olen' in getChar() to 'llen' and 'ulen', which makes it
clear which buffer they refer to, and avoids updating them in confusing
ways.
- Adds some assertions about things that can only happen when scanning from
memory.
- Adds some comments.
string-tagcloud: 279.4M (unchanged)
Attachment #444582 -
Flags: review?(cdleary)
Assignee | ||
Comment 18•15 years ago
|
||
This patch changes the userbuf-to-linebuf copying in getChar(). The old
version did it in a strange way:
- look in userbuf for an EOL
- copy from userbuf to linebuf, up to the end of userbuf or linebuf or the
first EOL, whichever comes first
- if we copied up to an EOL, normalize it if necessary
The new version does this:
- for each char in userbuf up
- if we reached the end of userbuf or linebuf, stop
- copy char from userbuf to linebuf
- if char was an EOL, normalize then stop
This is shorter, simpler, avoids the need for the 'saveEOL' member, and
faster because it only iterates over userbuf once.
Also, it now copies up to LINE_LIMIT chars, not just LINE_LIMIT-1. (I think
the -1 might have been necessary when \r\n could be split.) This required
changing an assertion in reportCompileErrorNumberVA().
string-tagcloud: 279.4M --> 277.0M
Attachment #444583 -
Flags: review?(cdleary)
Assignee | ||
Comment 19•15 years ago
|
||
This patch changes fillUserbuf() so it no longer stops when it reaches an
EOL; this behaviour isn't needed any more. The change does preserve the
"don't split \r\n behaviour", however.
This change makes fillUserbuf() faster because it doesn't need to check for
\n and \r every time it gets a char. It also means fillUserbuf() is called
fewer times.
(Hmm, I wonder if it would be better now to replace the repeated
fast_fgetc() calls with some variant of fgets()? I don't know if that would
be faster.)
string-tagcloud: 277.0M --> 276.3M
Attachment #444584 -
Flags: review?(cdleary)
Assignee | ||
Comment 20•15 years ago
|
||
This patch increases LINE_LIMIT from 256 to 1024, which means that
fillUserbuf can be called less often.
string-tagcloud: 276.3M --> 275.3M
Attachment #444585 -
Flags: review?(cdleary)
Assignee | ||
Comment 21•15 years ago
|
||
This patch:
- Simplifies the calculation of linepos. Instead of recording linelen and
TSF_NLFLAG and using those two to calculate linepos the next time around, we
instead calculate and record a single value, lineposNext.
- Removes TSF_NLFLAG because it's no longer needed, and moves
TSF_KEYWORD_IS_NAME over it. I also realised that the 0x10 slot was free in
TokenStreamFlags, so the patch moves TSF_UNEXPECTED_EOF up as well.
string-tagcloud: 275.3M (unchanged)
Attachment #444586 -
Flags: review?(cdleary)
Assignee | ||
Comment 22•15 years ago
|
||
This patch moves the linebuf-refilling code out of getChar() into
a new function getCharFillLinebuf(). This makes getChar() small enough that
the compiler can inline it in places.
string-tagcloud: 275.3M --> 271.3M
Attachment #444587 -
Flags: review?(cdleary)
Assignee | ||
Comment 23•15 years ago
|
||
A summary of the perf improvements:
- Instruction count for string-tagcloud started at 283.4M, ended at 271.3M,
that's a 1.045x reduction overall.
- SS results are noisy, but look like roughly 3--4ms for string-tagcloud and
string-unpack-code, giving an SS improvement of about 1% overall.
- ParseMark results are noisy but good, 5--11% better on 32-bit Linux, 2--5%
better on 64-bit Linux.
Plus the new code is *much* clearer :)
Apologies for the huge stack o' patches to review.
Assignee | ||
Comment 24•15 years ago
|
||
Comment on attachment 444578 [details] [diff] [review]
patch 1
Brendan, any feedback you have on any of these patches would be welcome. The first three are equivalent to the now-obsolete patch that you previously looked at.
Attachment #444578 -
Flags: feedback?(brendan)
Comment 25•15 years ago
|
||
Comment on attachment 444578 [details] [diff] [review]
patch 1
Looks good -- no need for that null terminator I suppose. Guide disagrees with the comment style ( https://wiki.mozilla.org/JavaScript:SpiderMonkey:C_Coding_Style#Comments ).
Attachment #444578 -
Flags: review?(cdleary) → review+
Updated•15 years ago
|
Attachment #444579 -
Flags: review?(cdleary) → review+
Comment 26•15 years ago
|
||
Comment on attachment 444580 [details] [diff] [review]
patch 3
Destroying lexer flags makes me happy.
Attachment #444580 -
Flags: review?(cdleary) → review+
Comment 27•15 years ago
|
||
Comment on attachment 444581 [details] [diff] [review]
patch 4
Nice hoisting here. Idle thought: should we be using size_t/ssize_t instead of a int for vars that contain file-entity lengths? We're probably not seeing gigabyte payloads of minified JS quite yet, but future-proofing can't hurt.
Attachment #444581 -
Flags: review?(cdleary) → review+
Updated•15 years ago
|
Attachment #444582 -
Flags: review?(cdleary) → review+
Comment 28•15 years ago
|
||
Comment on attachment 444578 [details] [diff] [review]
patch 1
>+TokenStream::getLineFromFile(char *buf, int size, FILE *file)
>+{
>+ // We must be careful with ASCII newlines:
>+ //
>+ // - \n ends a line
>+ // - \r not followed by \n ends a line
>+ // - \r\n ends a line at the \n
>+ //
>+ // For the last case, we avoid splitting a \r\n pair in order to keep
>+ // things simpler for getChar(). To do this we keep one element in buf in
>+ // reserve; that way, if the nth char we get is \r, we can peek ahead one
>+ // more and get \n as the (n+1)th char if it follows.
Nits:
- Seems more readable to use prevailing major comment style (/*\n * ...\n */, indented appropriately).
- French spacing (which is really English spacing, but I'll keep calling it French spacing) is also not house style, but surely a nit.
- In any case, one space after ";" -- not two.
This is good for the shell, unused by the browser.
Great to see the old CRFLAG state machine go away in this bug.
/be
Attachment #444578 -
Flags: feedback?(brendan) → feedback+
Assignee | ||
Comment 29•15 years ago
|
||
(In reply to comment #28)
>
> - French spacing (which is really English spacing, but I'll keep calling it
> French spacing) is also not house style, but surely a nit.
> - In any case, one space after ";" -- not two.
Wow, is that in the style guide? I don't see it. I'll have to get past almost 20 years of muscle memory to not use French spacing -- I learnt it in my 8th grade typing lessons...
Comment 30•15 years ago
|
||
Comment on attachment 444585 [details] [diff] [review]
patch 8
Did you try any larger values to see what the marginal improvement?
Another weird question: why do we hardcode a doubling of the linebuf allocation in TokenStream::init when there's a file? Doesn't seem to get used anywhere...
Attachment #444585 -
Flags: review?(cdleary) → review+
Assignee | ||
Comment 31•15 years ago
|
||
(In reply to comment #30)
> (From update of attachment 444585 [details] [diff] [review])
> Did you try any larger values to see what the marginal improvement?
Yeah, 2048 didn't help any more.
> Another weird question: why do we hardcode a doubling of the linebuf allocation
> in TokenStream::init when there's a file? Doesn't seem to get used anywhere...
That puzzled me at first, too. It does get used, look closely at init(). When reading from a file the first half of the allocated chunk serves as linebuf and the second half serves as userbuf. Pretty hacky, I think it's to avoid doing two mallocs but a comment would have been nice.
Comment 32•15 years ago
|
||
(In reply to comment #31)
> I think it's to avoid doing
> two mallocs but a comment would have been nice.
Yes, that's it. Patched for bug 397210 -- sorry I didn't review harder and ask for a comment then.
/be
Updated•15 years ago
|
Attachment #444583 -
Flags: review?(cdleary) → review+
Updated•15 years ago
|
Attachment #444584 -
Flags: review?(cdleary) → review+
Updated•15 years ago
|
Attachment #444586 -
Flags: review?(cdleary) → review+
Comment 33•15 years ago
|
||
Comment on attachment 444587 [details] [diff] [review]
patch 10
Awesome stuff, Nick! Feel free to r=me on future cleanup patches. I <3 clean code.
Attachment #444587 -
Flags: review?(cdleary) → review+
Assignee | ||
Comment 34•15 years ago
|
||
(In reply to comment #27)
> (From update of attachment 444581 [details] [diff] [review])
> Idle thought: should we be using size_t/ssize_t instead of
> a int for vars that contain file-entity lengths? We're probably not seeing
> gigabyte payloads of minified JS quite yet, but future-proofing can't hurt.
Which length are you referrring to -- the return value of fillUserbuf()? I just copied the 'int' from js_fgets(). Besides, fillUserbuf()'s return value will never exceed LINE_LIMIT, which is currently 1024. Or were you referring to something else?
Comment 35•15 years ago
|
||
(In reply to comment #34)
Don't mind me, you're right.
Assignee | ||
Comment 36•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/dbeffed4edb0
http://hg.mozilla.org/tracemonkey/rev/965b6adb65c6
http://hg.mozilla.org/tracemonkey/rev/395dbcafc7fc
http://hg.mozilla.org/tracemonkey/rev/1135b2628d73
http://hg.mozilla.org/tracemonkey/rev/51399ee02290
http://hg.mozilla.org/tracemonkey/rev/65f84aca73d1
http://hg.mozilla.org/tracemonkey/rev/91a4406e7a8c
http://hg.mozilla.org/tracemonkey/rev/ad37e6e5bb42
http://hg.mozilla.org/tracemonkey/rev/01f9e6ca37e1
http://hg.mozilla.org/tracemonkey/rev/ffbe83895f69
http://hg.mozilla.org/tracemonkey/rev/2bbcccfc2398
There are 11 patches there, only 10 were reviewed. The last patch just changed some // comments into /* */ comments. I marked it as r=cdleary which was erroneous, but hey, worse things have happened.
Whiteboard: fixed-in-tracemonkey
Comment 37•14 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•