Closed Bug 120990 Opened 23 years ago Closed 23 years ago

Conversion of integer to string is slow

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: bratell, Assigned: bratell)

References

()

Details

(Keywords: js1.5, perf)

Attachments

(2 files, 4 obsolete files)

Conversions from js integers to js strings are slow, noticed in the sorting
testcase at <http://www.formula-one.nu/mozilla/jsTestcases/sortArray.htm>. It's
not the biggest fruit but it's the lowest hanging. 

Patch coming up. This replaces the sprintf call with a simple straightforward C
algorithm. This is ~5-10 times faster. A review anybody?
Attached patch Replace snprintf with homemade algorithm. (obsolete) (deleted) — Splinter Review
Summary: Conversion of integer to string → Conversion of integer to string is slow
Blocks: js-perf
Thanks for doing this -- nit-picky review in a second.

/be
Keywords: js1.5, mozilla0.9.9
Comment on attachment 65822 [details] [diff] [review]
Replace snprintf with homemade algorithm.

>>+static void
>+js_IntToString(int i, char* buf)

Statics need no js_ prefix, just IntToString will do.

i should have type jsint, not int (this actually matters on some platforms,
notably Win16, which we don't support -- but the convention should be upheld).

>+{
>+    /* the buf must be big enough for MIN_INT to fit including '-' and '\0' */
>+    if (i == 0) {
>+        buf[0] = '0';
>+        buf[1] = '\0';
>+    } else {
>+        char *startnump;
>+        if (i<0) {

Prevailing style puts a space around binary operators...

>+            *(buf++) = '-';

... and does not overparenthesize expressions such as that one.

>+            i = -i;
>+        }
>+        startnump = buf;
>+        /* build the string from behind and then reverse it */

Nits: extra newline before comment that's not preceded by a {, and capitalize
sentences in comments, ending them with full-stops. 

Non-nit: Why not build the string from the end of the buffer toward the front,
avoiding the need to reverse?

>+        while (i != 0) {
>+            /* there should be a combined % and / -> (i,r) = i %/ 10 */

(Ye old divmod instruction, yeah. Python has it, IIRC!)

>+            int newi=i/10;
>+            *(buf++) = (i-newi*10)+'0';
>+            i = newi;
>+        }
>+        *buf = '\0'; /* terminate string */
>+        buf--;
>+        /* reverse the string */
>+        while(startnump < buf) {

A space should come after the while, but I think this reversal should be
eliminated.  The caller could pass in buf and size, and IntToString would store
a '\0' at buf[size-1], formatting backwards, and finally return a char * to the
start of the formatted string in buf.

/be
Attachment #65822 - Flags: needs-work+
Attached patch Better patch (obsolete) (deleted) — Splinter Review
This should address all your comments. It was a good idea to build the string
from the end of the buffer which made the algorithm both shorter and faster. I
will try to get numbers later, but until then, here is a new patch to look at.
Attachment #65822 - Attachment is obsolete: true
Comment on attachment 65854 [details] [diff] [review]
Better patch

Nits only:

>+/* The buf must be big enough for MIN_INT to fit including '-' and '\0'. */
>+static char *
>+IntToString(jsint i, char* buf, jsint bufSize)

char *, not char*, in JS code.	Also, size_t for sizes (actual param is
sizeof...).

>+{
>+    JSBool isNegative;
>+    char *nextPos;
>+    
>+    if (i == 0) {
>+        buf[0] = '0';
>+        buf[1] = '\0';
>+        return buf;
>+    }
>+
>+    isNegative = i < 0;
>+    if (isNegative)
>+        i = -i;
>+
>+    nextPos = buf + bufSize - 1; /* Last buffer cell. */
>+    *nextPos-- = '\0'; /* Null terminate the string to be. */

Nit: comments that fit on the same line and often are sentence fragments need
no capitalization/full-stop.  Sorry, didn't say this earlier!

Less of a nit: use cp as canonical name, and (more imp., not a nit) *--cp =
'\0' (pre-decrement), setting cp = buf + bufsize first.  This should generate
better code on some platforms.

>+    
>+    /* Build the string from behind. */
>+    while (i != 0) {
>+        /* There should be a combined % and / -> (i,r) = i %/ 10. */

This comment is true, but kind of cryptic, or misplaced (write a letter to the
C standard people! :-).  How about elaborating it to say "We use multiply and
subtract rather than modulus for reduced strength."

>+        int newi = i / 10;
>+        *nextPos-- = (i - newi * 10) + '0';
>+        i = newi;
>+    }
>+    
>+    if (isNegative)
>+        *nextPos-- = '-';
>+
>+    return nextPos + 1;

The predecremented char pointer (*--cp) approach also saves an add here.

>+    if (JSDOUBLE_IS_INT(d, i)) {
>+        /* buf must be big enough for MIN_INT to fit as string */

Redundant comment, ok but we're all friends in this file (and it should be a
sentence -- cap and full-stop).

>+        numStr = IntToString(i, buf, sizeof buf);
>+    } else {
> 	numStr = JS_dtostr(buf, sizeof buf, DTOSTR_STANDARD, 0, d);
> 	if (!numStr) {
> 	    JS_ReportOutOfMemory(cx);

Fix these and sr=brendan@mozilla.org for 0.9.9.

/be
Attachment #65854 - Flags: superreview+
Keywords: perf
I've made all the changes mentioned. The last version of the patch (not
attached) is 10% faster than my "create and then reverse" code so it was a good
change. >50% of the time now is in the arithmetic that we cannot avoid. 

khanson, can you review this? I can attach the patch with be's modifications
later if that helps.
>+    if (isNegative)
>+        i = -i;

Will this work properly if i = INT_MIN?
No, it will output the wrong chars for that case, since -i will still be the
same number. Good catch for a problem I knew about but always managed to ignore.
Will attach a new patch that replaces i by u, where u is an unsigned int that
can hold -INT_MIN.
Attached patch Patch that also handles i = 0x80000000. (obsolete) (deleted) — Splinter Review
Here it is. Let the reviews fall over it like rain from a cloudy sky.
Attachment #65854 - Attachment is obsolete: true
BTW, if you use a do {} while  loop you can eliminate the special case
for 0.
Attached patch Best patch. (obsolete) (deleted) — Splinter Review
Couldn't you have said that earlier?  :-) 

Well, here is the smallest, fastest and most wonderful patch ever!
Attachment #65917 - Attachment is obsolete: true
Just out of curiousity, do you have any numbers comparing this new patch with
the conversion rutine in the current souce?
With the patch, 1 conversion is done in 0.08us on my computer. That's at least
10-20 times faster than the current code.
Comment on attachment 65919 [details] [diff] [review]
Best patch.

Almost there, but some types are not as portable or consistent as they should
be.

>+static char *
>+IntToString(jsint i, char *buf, size_t bufSize)
>+{
>+    char *cp;
>+    unsigned int u;

jsuint u;

>+    
>+    if (i < 0)
>+        u = -i;
>+    else
>+        u = i;

Why not 'u = (i < 0) ? -i : i;' on one line?

>+
>+    cp = buf + bufSize; /* one past last buffer cell */
>+    *--cp = '\0'; /* null terminate the string to be */

Make those comments start in the same column (expandtab-tabbed to 0 mod 8
worksforme).

>+    
>+    /* Build the string from behind. We use multiply and subtraction

Ultra-nit: JS multiline comment style opens /* on its own line.

>+     * instead of modulus because that's much faster.
>+     */
>+    do {
>+        int newu = u / 10;

jsuint newu = ... -- right?

>+        *--cp = (u - newu * 10) + '0';
>+        u = newu;
>+    } while (u != 0);
>+    
>+    if (i < 0)
>+        *--cp = '-';
>+
>+    return cp;
>+}

>@@ -254,9 +284,10 @@
> 	    return JS_FALSE;
> 	if (base < 2 || base > 36) {
> 	    char numBuf[12];
>-	    JS_snprintf(numBuf, sizeof numBuf, "%ld", (long) base);
>+	    char *numStr;
>+	    numStr = IntToString(base, numBuf, sizeof numBuf);

Feel free (it's a good thing for block-locals such as this) to initialize
numStr at its declaration.

> 	    JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_BAD_RADIX,
>-	    			 numBuf);
>+	    			 numStr);
> 	    return JS_FALSE;
> 	}
>     }
>@@ -583,10 +614,10 @@
> {
>     jsint i;
>     char buf[DTOSTR_STANDARD_BUFFER_SIZE];
>-    char *numStr = buf;
>+    char *numStr;
> 
>     if (JSDOUBLE_IS_INT(d, i))
>-	JS_snprintf(buf, sizeof buf, "%ld", (long)i);
>+        numStr = IntToString(i, buf, sizeof buf);
>     else {
> 	numStr = JS_dtostr(buf, sizeof buf, DTOSTR_STANDARD, 0, d);
> 	if (!numStr) {

Also fee free to expand tabs in the functions you touch, if they're not too
huge for you to take the cvsblame.  Thanks,

/be
+    /* Build the string from behind. We use multiply and subtraction
+     * instead of modulus because that's much faster.
+     */

Just a comment that this depends a great deal on the compiler. On Linux 
it is _not_ true for egcs-1.1.2 or gcc-2.95.2 but is true for gcc-3.0.3. 
The first two compilers recognize that an adjacent modulus and division 
are part of the same thing and do something tricky.
tenthumbs, I see integer division by 10 reduced to reciprocal multiply with
normalization via right shift on rh7.1 (bastard gcc version, but still).  Are
you sure that divide by constant 10 compiles to a div when not "near enough" to
a % 10 expression?

/be
Attached patch With jsuint (deleted) — Splinter Review
Good you catched the mix of int and jsuint. Could have caused problem, I'm
sure. I also addressed the rest of the comments except for the tab thing. There
are just too many of them so I leave them where they are and am satisfied that
the new function is so pure. :-)
Attachment #65919 - Attachment is obsolete: true
Comment on attachment 65952 [details] [diff] [review]
With jsuint

Beauty, sr=brendan@mozilla.org and thanks very much for this patch!

/be
Attachment #65952 - Flags: superreview+
Attached file compiler differences (deleted) —
Brendan: rather than cluttering the main bug, read this.
Comment on attachment 65952 [details] [diff] [review]
With jsuint

r=khanson

Nice code.  Good catch on min_int.
Attachment #65952 - Flags: review+
Fix checked in. Thanks for the reviews.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Marking Verified -
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: