Closed
Bug 496816
Opened 15 years ago
Closed 15 years ago
Rounding difference between JIT and interpreter
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta1-fixed |
People
(Reporter: gkw, Assigned: graydon)
Details
(Keywords: regression, testcase, Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file)
(deleted),
patch
|
Details | Diff | Splinter Review |
v = 0
for (x = 0; x < 2; ++x) {
--v;
for each(e in ['', String(''), '']) {
print(v /= --v)
}
}
Note the different output between -j on and -j off in 1.9.1 branch.
===
$ ~/Desktop/25902-191/js-dbg-moz191-intelmac -j
js> v = 0
for (x = 0; x < 2; ++x) {
--v;
for each(e in ['', String(''), '']) {
print(v /= --v)
}
}0
js>
0.5
-1
0.5
0.3333333333333333
-0.49999999999999994
0.3333333333333333
js> ^C
$ ~/Desktop/25902-191/js-dbg-moz191-intelmac
js> v = 0
for (x = 0; x < 2; ++x) {
--v;
for each(e in ['', String(''), '']) {
print(v /= --v)
}
}0
js>
0.5
-1
0.5
0.3333333333333333
-0.4999999999999999
0.33333333333333326
js> ^C
Flags: blocking1.9.1?
Reporter | ||
Comment 1•15 years ago
|
||
This is a regression between:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/3f1e6876d253 - identical -j and no -j output
and
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ff069347b3e5 - different output.
http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?fromchange=3f1e6876d253&tochange=ff069347b3e5
Keywords: regression,
regressionwindow-wanted
Comment 2•15 years ago
|
||
That range doesn't really have anything that touches JS ...
Reporter | ||
Comment 3•15 years ago
|
||
(In reply to comment #1)
> This is a regression between:
>
> http://hg.mozilla.org/releases/mozilla-1.9.1/rev/3f1e6876d253 - identical -j
> and no -j output
>
> and
>
> http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ff069347b3e5 - different
> output.
>
> http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?fromchange=3f1e6876d253&tochange=ff069347b3e5
(In reply to comment #2)
> That range doesn't really have anything that touches JS ...
Sorry, I got confused myself. That range is invalid. Both listed changeset have different outputs and thus both show the bug.
That said, I tested this on Ubuntu. I _think_ this doesn't affect Mac, but am not sure.
Comment 4•15 years ago
|
||
This is TM tip on macosx. This is a rounding issue, so it might be platform and compiler dependent. Gary, are you using Windows?
host-7-221:src gal$ ./Darwin_DBG.OBJ/js -j x.js
0.5
-1
0.5
0.3333333333333333
-0.49999999999999994
0.3333333333333333
host-7-221:src gal$ ./Darwin_DBG.OBJ/js x.js
0.5
-1
0.5
0.3333333333333333
-0.49999999999999994
0.3333333333333333
Comment 5•15 years ago
|
||
Note: I have noticed differences in rounding behavior between MSVC and gcc before. sin(x) is often different between macosx firefox and safari and our Windows builds. MSVC was usually right in those cases, and everyone else was wrong. However, I think the rounding modes are somewhat underspecified and both results were acceptable (for sin anyway).
Comment 6•15 years ago
|
||
So it seems the non-jit linux results are incorrect since jit and nojit on macosx both agree with jit linux:
$ ~/Desktop/25902-191/js-dbg-moz191-intelmac
js> v = 0
for (x = 0; x < 2; ++x) {
--v;
for each(e in ['', String(''), '']) {
print(v /= --v)
}
}0
js>
0.5
-1
0.5
0.3333333333333333
-0.4999999999999999
0.33333333333333326
js> ^C
Updated•15 years ago
|
Summary: TM: Different result from testcase involving for...in, print, for → Rounding difference between JIT and interpreter
Comment 7•15 years ago
|
||
You might not want to treat Mac output as normative. Mac forces 80-bit x87 mode while using 64-bit in SSE2, or did. See bug 338756.
ECMA-262 specifies the exact results for these inputs, I would think (for the sake of interoperability). To conform to the standard, we used to use fdlibm to gain cross-platform consistency. We gave it up for speed, OS compatibility where possible, and code footprint (both source and object).
/be
Comment 8•15 years ago
|
||
But maybe there's a tweak for Mac we need, or some way to bypass extended precision (80-bit) f.p. in the x87, if that is what is biting. From bug 338756 you can see that I found no way to convince the OS and hardware to go to 64-bit mode in that FPU.
/be
Reporter | ||
Comment 9•15 years ago
|
||
(In reply to comment #4)
> This is TM tip on macosx. This is a rounding issue, so it might be platform and
> compiler dependent. Gary, are you using Windows?
Nope, I'm on Ubuntu 9.04, for this particular testcase.
I get the same results as comment #4, using Windows (VS2k5).
Comment 11•15 years ago
|
||
Fwiw, the numbers in comment 4 look correct to me, from a general "what is the most precise output you could get given the previous value" perspective...
Comment 12•15 years ago
|
||
Rhino agrees with comment 4, and for what it's worth, so does OpenJDK:
public strictfp class Fx {
public static void main(String[] args) {
double v = 0;
for (int x = 0; x < 2; ++x) {
--v;
for (int e = 0; e < 3; e++) {
v = v / (v - 1);
System.out.println("" + v);
}
}
}
}
0.5
-1.0
0.5
0.3333333333333333
-0.49999999999999994
0.3333333333333333
The "strictfp" is supposed to ensure freshness: http://en.wikipedia.org/wiki/Strictfp
java version "1.6.0_0"
IcedTea6 1.3.1 (6b12-0ubuntu6.4) Runtime Environment (build 1.6.0_0-b12)
OpenJDK Server VM (build 1.6.0_0-b12, mixed mode)
Comment 13•15 years ago
|
||
Without -j, the results are the same in tip as in rev 1.
Comment 14•15 years ago
|
||
Interpreter:
0.3333333333333333 (0x3fd5555555555555)
/ -0.6666666666666667 (0xbfe5555555555556)
== -0.4999999999999999 (0xbfdffffffffffffe)
JIT:
0.3333333333333333 (0x3fd5555555555555)
/ -0.6666666666666667 (0xbfe5555555555556)
== -0.49999999999999994 (0xbfdfffffffffffff)
I have no idea which might be correct or whether both might be allowed. I defer to Graydon.
Comment 15•15 years ago
|
||
>>> import decimal
>>> n = decimal.Decimal("0.3333333333333333")
>>> n
Decimal("0.3333333333333333")
>>> d = decimal.Decimal("-0.6666666666666667")
>>> d
Decimal("-0.6666666666666667")
>>> n / d
Decimal("-0.4999999999999999250000000000")
-0.49999999999999994
-0.4999999999999999
JIT seems to be correct, then.
Comment 16•15 years ago
|
||
Ignore me, I'm stupid, recalculating correctly now...
Comment 17•15 years ago
|
||
find-waldo-now:/tmp jwalden$ cat foo.cpp
#include <stdio.h>
#include <fenv.h>
#include <stdint.h>
#include <math.h>
int main()
{
union { double d; uint64_t u; } a, b;
printf("%d\n", fegetround());
a.u = 0x3fd5555555555555ULL;
b.u = 0xbfe5555555555556ULL;
return (int) a.u;
}
Breakpoint 1, main () at foo.cpp:10
10 printf("%d\n", fegetround());
(gdb) n
0
12 a.u = 0x3fd5555555555555ULL;
(gdb) s
13 b.u = 0xbfe5555555555556ULL;
(gdb) s
15 return (int) a.u;
(gdb) set variable b.d = a.d / b.d
(gdb) p/x b.u
$1 = 0xbfdffffffffffffe
So interpreter is correct.
Comment 18•15 years ago
|
||
what code does gcc emit here?
Comment 19•15 years ago
|
||
gcc emits fdivl:
.loc 1 4008 0
fldl -936(%ebp)
fdivl -944(%ebp)
fstpl -936(%ebp)
Nanojit is using sse:
fsub1 = fsub $global0, #0x3ff00000:0
fdiv1 = fdiv $global0, fsub1
stqi state[748] = fdiv1
ld8 = ld state[732]
add1 = add ld8, 1
ov1 = ov add1
xt1: xt ov1 -> pc=0x8aabbe6 imacpc=(nil) sp+0 rp+0
movq xmm0,740(ecx) ecx(state) esi(sp)
movsd xmm2,(#0x8215850) // =1.0 ecx(state) esi(sp) xmm0($global0)
movsd xmm1,xmm0 ecx(state) esi(sp) xmm0($global0) xmm2(#0x3ff00000:0)
subsd xmm1,xmm2 ecx(state) esi(sp) xmm0($global0) xmm2(#0x3ff00000:0)
divsd xmm0,xmm1 ecx(state) esi(sp) xmm0($global0) xmm1(fsub1)
movq 748(ecx),xmm0 ecx(state) esi(sp) xmm0(fdiv1)
mov ebx,732(ecx) ecx(state) esi(sp)
add ebx,1 ecx(state) ebx(ld8) esi(sp)
jo 0xb7ad1fe8 ecx(state) ebx(add1) esi(sp)
Assignee | ||
Comment 20•15 years ago
|
||
Add "-msse2 -mfpmath=sse" perhaps?
Not blocker: per comment 5, we've had this deviation across platforms before.
Flags: blocking1.9.1? → blocking1.9.1-
Reporter | ||
Updated•15 years ago
|
Flags: blocking1.9.2?
Assignee | ||
Comment 22•15 years ago
|
||
If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be a blocker for any release, nor even really a "bug"; it's the price we will continue to pay for having runtime-detected sse2 math vs. compile-time-chosen conservative x87 math in the interpreter.
Other options are to step the minimum requirements for running the interpreter up to sse2 hardware and compile with -msse2 -mfpmath=sse (plausible, that only means P4/K8 or newer) or else stick a softfp library or some other widget that eliminates the 80-bit-ness of the x87 operands.
Good to check for sure, of course. I'll poke at it some more later.
Assignee: general → graydon
Reporter | ||
Comment 23•15 years ago
|
||
(In reply to comment #22)
> If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be
> a blocker for any release, nor even really a "bug"; it's the price we will
> continue to pay for having runtime-detected sse2 math vs. compile-time-chosen
> conservative x87 math in the interpreter.
This bug arguably interferes with the fuzzing in bug 465479.
Updated•15 years ago
|
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Reporter | ||
Updated•15 years ago
|
Flags: in-testsuite?
Reporter | ||
Comment 24•15 years ago
|
||
(In reply to comment #23)
> (In reply to comment #22)
> > If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be
> > a blocker for any release, nor even really a "bug"; it's the price we will
> > continue to pay for having runtime-detected sse2 math vs. compile-time-chosen
> > conservative x87 math in the interpreter.
>
> This bug arguably interferes with the fuzzing in bug 465479.
Since the fuzzer in bug 465479 just got folded into jsfunfuzz, we have had to disable JITcompare (which does find issues nowadays) on Linux due to the noise generated by this issue.
Assignee | ||
Comment 25•15 years ago
|
||
Hm. I looked into this a bit and am now more perplexed than previously.
I added a mechanism to force-disable SSE2 (an environment variable like the ARM one); when I do so, and confirm we're generating x87 code, we still differ in JIT and interp modes. Funny.
Also, I noticed a wrinkle here that the original diagnosis didn't: your original test was on mac, yes? Macs compile with SSE2 by default. All our interpreter routines use SSE there (I just disassembled them), so if it were just an SSE issue you would not see it on mac. I thought you were reporting seeing this everywhere *except* mac.
I talked to Julian and he suggested that he's seen bugs like this before due to spilling an 80 bit operand to a 64 bit slot when in x87 mode. That's a possibility -- we are in fact performing such precision-losing spills -- and it explains the phenomenon I'm seeing on linux, but *not* if it happens on mac.
But then, I cannot reproduce this on a mac, only a linux-gcc configuration. Can you confirm the exact host-compiler combinations you're seeing this on? Ideally it'll help if you objdump -d or otool -tV jsinterp.o on the offending platform, and grep for 'div', paste in what you find. Also tell me if it's a machine that *has* SSE2.
I'm CC'ing Julian on this bug so he can chime in if anything else that happens here reminds him of one of something he's seen before.
Assignee | ||
Comment 26•15 years ago
|
||
Ok, I think I know what's going on here now, and I can fix the bug, but I'd appreciate comments from anyone else who remembers this code, and/or assumptions baked into our FP code. It's pretty old.
On some platforms, the x87 FPU is set to 80 bit mode on startup. On others, 64 bit mode. We have existing code in jsnum.cpp that clobbers the FPU state back to 64 bit mode -- FIX_FPU -- but it's only called on a specific (and very narrow) set of platforms: OS/2 or WINXP when building with icc or gcc but *not* mingw.
As far as I can tell, archaeologically, this narrow ifdef-condition was arrived at merely to get the thing to build given that it uses some mask definitions and an intrinsic that are not available in the header files of all, or even most, x86 platforms. But it's not hard to just insert the necessary fpu-environment-twiddling asm manually, and ignore the issue of "getting the include files to work" entirely.
The attached patch does just that, and correspondingly relaxes the ifdef to actually fire on linux, where it seems this was biting us. It's not clear to me which (if any) other platforms were reporting the problem, but I can only reproduce it on linux-x86, and the patch fixes that case. Also removes an ancient comment about Alpha, which I'd bet money we don't even build on anymore.
(The reason the bug exists is that, when you get the x87 FPU running in 80 bit mode, it will perform more precise calculations in interpreter mode than the JIT does in SSE mode. I'll file a followup patch to force-disable SSE mode in the JIT as well, but it turns out it's not necessary in this case.)
Attachment #397078 -
Flags: review?
Assignee | ||
Updated•15 years ago
|
Attachment #397078 -
Flags: review? → review?(brendan)
Reporter | ||
Comment 27•15 years ago
|
||
Unfortunately comment #0 had a wrongly named js binary - it's not supposed to be called js-dbg-moz191-intelmac - I had hardcoded my compile scripts to output "intelmac" on every platform. :-/ It's a Linux-only bug as far as I can tell, doesn't seem to occur on Mac/WinXP. Apologies for the confusion.
The patch seems to work out great on Linux! Thanks Graydon!
Keywords: regressionwindow-wanted
Assignee | ||
Comment 28•15 years ago
|
||
Comment on attachment 397078 [details] [diff] [review]
fix the bug
Brendan reviewed in person at my desk just now, so removing the r? on him and landing.
Attachment #397078 -
Flags: review?(brendan)
Assignee | ||
Comment 29•15 years ago
|
||
Whiteboard: fixed-in-tracemonkey
Comment 30•15 years ago
|
||
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 31•15 years ago
|
||
status1.9.2:
--- → beta1-fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•