496816 - Rounding difference between JIT and interpreter

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Description

•

15 years ago

v = 0 for (x = 0; x < 2; ++x) { --v; for each(e in ['', String(''), '']) { print(v /= --v) } } Note the different output between -j on and -j off in 1.9.1 branch. === $ ~/Desktop/25902-191/js-dbg-moz191-intelmac -j js> v = 0 for (x = 0; x < 2; ++x) { --v; for each(e in ['', String(''), '']) { print(v /= --v) } }0 js> 0.5 -1 0.5 0.3333333333333333 -0.49999999999999994 0.3333333333333333 js> ^C $ ~/Desktop/25902-191/js-dbg-moz191-intelmac js> v = 0 for (x = 0; x < 2; ++x) { --v; for each(e in ['', String(''), '']) { print(v /= --v) } }0 js> 0.5 -1 0.5 0.3333333333333333 -0.4999999999999999 0.33333333333333326 js> ^C

Flags: blocking1.9.1?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 1

•

15 years ago

This is a regression between: http://hg.mozilla.org/releases/mozilla-1.9.1/rev/3f1e6876d253 - identical -j and no -j output and http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ff069347b3e5 - different output. http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?fromchange=3f1e6876d253&tochange=ff069347b3e5

Keywords: regression, regressionwindow-wanted

Mike Beltzner [:beltzner, not reading bugmail]

Comment 2

•

15 years ago

That range doesn't really have anything that touches JS ...

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 3

•

15 years ago

(In reply to comment #1) > This is a regression between: > > http://hg.mozilla.org/releases/mozilla-1.9.1/rev/3f1e6876d253 - identical -j > and no -j output > > and > > http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ff069347b3e5 - different > output. > > http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?fromchange=3f1e6876d253&tochange=ff069347b3e5 (In reply to comment #2) > That range doesn't really have anything that touches JS ... Sorry, I got confused myself. That range is invalid. Both listed changeset have different outputs and thus both show the bug. That said, I tested this on Ubuntu. I _think_ this doesn't affect Mac, but am not sure.

Andreas Gal :gal

Comment 4

•

15 years ago

This is TM tip on macosx. This is a rounding issue, so it might be platform and compiler dependent. Gary, are you using Windows? host-7-221:src gal$ ./Darwin_DBG.OBJ/js -j x.js 0.5 -1 0.5 0.3333333333333333 -0.49999999999999994 0.3333333333333333 host-7-221:src gal$ ./Darwin_DBG.OBJ/js x.js 0.5 -1 0.5 0.3333333333333333 -0.49999999999999994 0.3333333333333333

Andreas Gal :gal

Comment 5

•

15 years ago

Note: I have noticed differences in rounding behavior between MSVC and gcc before. sin(x) is often different between macosx firefox and safari and our Windows builds. MSVC was usually right in those cases, and everyone else was wrong. However, I think the rounding modes are somewhat underspecified and both results were acceptable (for sin anyway).

Andreas Gal :gal

Comment 6

•

15 years ago

So it seems the non-jit linux results are incorrect since jit and nojit on macosx both agree with jit linux: $ ~/Desktop/25902-191/js-dbg-moz191-intelmac js> v = 0 for (x = 0; x < 2; ++x) { --v; for each(e in ['', String(''), '']) { print(v /= --v) } }0 js> 0.5 -1 0.5 0.3333333333333333 -0.4999999999999999 0.33333333333333326 js> ^C

Jesse Ruderman

Updated

•

15 years ago

Summary: TM: Different result from testcase involving for...in, print, for → Rounding difference between JIT and interpreter

Brendan Eich [:brendan]

Comment 7

•

15 years ago

You might not want to treat Mac output as normative. Mac forces 80-bit x87 mode while using 64-bit in SSE2, or did. See bug 338756. ECMA-262 specifies the exact results for these inputs, I would think (for the sake of interoperability). To conform to the standard, we used to use fdlibm to gain cross-platform consistency. We gave it up for speed, OS compatibility where possible, and code footprint (both source and object). /be

Brendan Eich [:brendan]

Comment 8

•

15 years ago

But maybe there's a tweak for Mac we need, or some way to bypass extended precision (80-bit) f.p. in the x87, if that is what is biting. From bug 338756 you can see that I found no way to convince the OS and hardware to go to 64-bit mode in that FPU. /be

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 9

•

15 years ago

(In reply to comment #4) > This is TM tip on macosx. This is a rounding issue, so it might be platform and > compiler dependent. Gary, are you using Windows? Nope, I'm on Ubuntu 9.04, for this particular testcase.

David Anderson [:dvander] - inactive, e-mail if emergency

Comment 10

•

15 years ago

I get the same results as comment #4, using Windows (VS2k5).

Boris Zbarsky [:bzbarsky]

Comment 11

•

15 years ago

Fwiw, the numbers in comment 4 look correct to me, from a general "what is the most precise output you could get given the previous value" perspective...

Jason Orendorff [:jorendorff]

Comment 12

•

15 years ago

Rhino agrees with comment 4, and for what it's worth, so does OpenJDK: public strictfp class Fx { public static void main(String[] args) { double v = 0; for (int x = 0; x < 2; ++x) { --v; for (int e = 0; e < 3; e++) { v = v / (v - 1); System.out.println("" + v); } } } } 0.5 -1.0 0.5 0.3333333333333333 -0.49999999999999994 0.3333333333333333 The "strictfp" is supposed to ensure freshness: http://en.wikipedia.org/wiki/Strictfp java version "1.6.0_0" IcedTea6 1.3.1 (6b12-0ubuntu6.4) Runtime Environment (build 1.6.0_0-b12) OpenJDK Server VM (build 1.6.0_0-b12, mixed mode)

Jason Orendorff [:jorendorff]

Comment 13

•

15 years ago

Without -j, the results are the same in tip as in rev 1.

Jason Orendorff [:jorendorff]

Comment 14

•

15 years ago

Interpreter: 0.3333333333333333 (0x3fd5555555555555) / -0.6666666666666667 (0xbfe5555555555556) == -0.4999999999999999 (0xbfdffffffffffffe) JIT: 0.3333333333333333 (0x3fd5555555555555) / -0.6666666666666667 (0xbfe5555555555556) == -0.49999999999999994 (0xbfdfffffffffffff) I have no idea which might be correct or whether both might be allowed. I defer to Graydon.

Jeff Walden [:Waldo]

Comment 15

•

15 years ago

>>> import decimal >>> n = decimal.Decimal("0.3333333333333333") >>> n Decimal("0.3333333333333333") >>> d = decimal.Decimal("-0.6666666666666667") >>> d Decimal("-0.6666666666666667") >>> n / d Decimal("-0.4999999999999999250000000000") -0.49999999999999994 -0.4999999999999999 JIT seems to be correct, then.

Jeff Walden [:Waldo]

Comment 16

•

15 years ago

Ignore me, I'm stupid, recalculating correctly now...

Jeff Walden [:Waldo]

Comment 17

•

15 years ago

find-waldo-now:/tmp jwalden$ cat foo.cpp #include <stdio.h> #include <fenv.h> #include <stdint.h> #include <math.h> int main() { union { double d; uint64_t u; } a, b; printf("%d\n", fegetround()); a.u = 0x3fd5555555555555ULL; b.u = 0xbfe5555555555556ULL; return (int) a.u; } Breakpoint 1, main () at foo.cpp:10 10 printf("%d\n", fegetround()); (gdb) n 0 12 a.u = 0x3fd5555555555555ULL; (gdb) s 13 b.u = 0xbfe5555555555556ULL; (gdb) s 15 return (int) a.u; (gdb) set variable b.d = a.d / b.d (gdb) p/x b.u $1 = 0xbfdffffffffffffe So interpreter is correct.

Andreas Gal :gal

Comment 18

•

15 years ago

what code does gcc emit here?

Jason Orendorff [:jorendorff]

Comment 19

•

15 years ago

gcc emits fdivl: .loc 1 4008 0 fldl -936(%ebp) fdivl -944(%ebp) fstpl -936(%ebp) Nanojit is using sse: fsub1 = fsub $global0, #0x3ff00000:0 fdiv1 = fdiv $global0, fsub1 stqi state[748] = fdiv1 ld8 = ld state[732] add1 = add ld8, 1 ov1 = ov add1 xt1: xt ov1 -> pc=0x8aabbe6 imacpc=(nil) sp+0 rp+0 movq xmm0,740(ecx) ecx(state) esi(sp) movsd xmm2,(#0x8215850) // =1.0 ecx(state) esi(sp) xmm0($global0) movsd xmm1,xmm0 ecx(state) esi(sp) xmm0($global0) xmm2(#0x3ff00000:0) subsd xmm1,xmm2 ecx(state) esi(sp) xmm0($global0) xmm2(#0x3ff00000:0) divsd xmm0,xmm1 ecx(state) esi(sp) xmm0($global0) xmm1(fsub1) movq 748(ecx),xmm0 ecx(state) esi(sp) xmm0(fdiv1) mov ebx,732(ecx) ecx(state) esi(sp) add ebx,1 ecx(state) ebx(ld8) esi(sp) jo 0xb7ad1fe8 ecx(state) ebx(add1) esi(sp)

Graydon Hoare :graydon

Assignee

Comment 20

•

15 years ago

Add "-msse2 -mfpmath=sse" perhaps?

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 21

•

15 years ago

Not blocker: per comment 5, we've had this deviation across platforms before.

Flags: blocking1.9.1? → blocking1.9.1-

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

15 years ago

Flags: blocking1.9.2?

Graydon Hoare :graydon

Assignee

Comment 22

•

15 years ago

If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be a blocker for any release, nor even really a "bug"; it's the price we will continue to pay for having runtime-detected sse2 math vs. compile-time-chosen conservative x87 math in the interpreter. Other options are to step the minimum requirements for running the interpreter up to sse2 hardware and compile with -msse2 -mfpmath=sse (plausible, that only means P4/K8 or newer) or else stick a softfp library or some other widget that eliminates the 80-bit-ness of the x87 operands. Good to check for sure, of course. I'll poke at it some more later.

Assignee: general → graydon

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 23

•

15 years ago

(In reply to comment #22) > If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be > a blocker for any release, nor even really a "bug"; it's the price we will > continue to pay for having runtime-detected sse2 math vs. compile-time-chosen > conservative x87 math in the interpreter. This bug arguably interferes with the fuzzing in bug 465479.

Robert Sayre

Updated

•

15 years ago

Flags: blocking1.9.2? → blocking1.9.2+

Priority: -- → P2

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

15 years ago

Flags: in-testsuite?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 24

•

15 years ago

(In reply to comment #23) > (In reply to comment #22) > > If (as is worth confirming) this is just x87-vs-sse2, IMO it should neither be > > a blocker for any release, nor even really a "bug"; it's the price we will > > continue to pay for having runtime-detected sse2 math vs. compile-time-chosen > > conservative x87 math in the interpreter. > > This bug arguably interferes with the fuzzing in bug 465479. Since the fuzzer in bug 465479 just got folded into jsfunfuzz, we have had to disable JITcompare (which does find issues nowadays) on Linux due to the noise generated by this issue.

Graydon Hoare :graydon

Assignee

Comment 25

•

15 years ago

Hm. I looked into this a bit and am now more perplexed than previously. I added a mechanism to force-disable SSE2 (an environment variable like the ARM one); when I do so, and confirm we're generating x87 code, we still differ in JIT and interp modes. Funny. Also, I noticed a wrinkle here that the original diagnosis didn't: your original test was on mac, yes? Macs compile with SSE2 by default. All our interpreter routines use SSE there (I just disassembled them), so if it were just an SSE issue you would not see it on mac. I thought you were reporting seeing this everywhere *except* mac. I talked to Julian and he suggested that he's seen bugs like this before due to spilling an 80 bit operand to a 64 bit slot when in x87 mode. That's a possibility -- we are in fact performing such precision-losing spills -- and it explains the phenomenon I'm seeing on linux, but *not* if it happens on mac. But then, I cannot reproduce this on a mac, only a linux-gcc configuration. Can you confirm the exact host-compiler combinations you're seeing this on? Ideally it'll help if you objdump -d or otool -tV jsinterp.o on the offending platform, and grep for 'div', paste in what you find. Also tell me if it's a machine that *has* SSE2. I'm CC'ing Julian on this bug so he can chime in if anything else that happens here reminds him of one of something he's seen before.

Graydon Hoare :graydon

Assignee

Comment 26

•

15 years ago

Attached patch fix the bug (deleted) — Details — Splinter Review

Ok, I think I know what's going on here now, and I can fix the bug, but I'd appreciate comments from anyone else who remembers this code, and/or assumptions baked into our FP code. It's pretty old. On some platforms, the x87 FPU is set to 80 bit mode on startup. On others, 64 bit mode. We have existing code in jsnum.cpp that clobbers the FPU state back to 64 bit mode -- FIX_FPU -- but it's only called on a specific (and very narrow) set of platforms: OS/2 or WINXP when building with icc or gcc but *not* mingw. As far as I can tell, archaeologically, this narrow ifdef-condition was arrived at merely to get the thing to build given that it uses some mask definitions and an intrinsic that are not available in the header files of all, or even most, x86 platforms. But it's not hard to just insert the necessary fpu-environment-twiddling asm manually, and ignore the issue of "getting the include files to work" entirely. The attached patch does just that, and correspondingly relaxes the ifdef to actually fire on linux, where it seems this was biting us. It's not clear to me which (if any) other platforms were reporting the problem, but I can only reproduce it on linux-x86, and the patch fixes that case. Also removes an ancient comment about Alpha, which I'd bet money we don't even build on anymore. (The reason the bug exists is that, when you get the x87 FPU running in 80 bit mode, it will perform more precise calculations in interpreter mode than the JIT does in SSE mode. I'll file a followup patch to force-disable SSE mode in the JIT as well, but it turns out it's not necessary in this case.)

Attachment #397078 - Flags: review?

Graydon Hoare :graydon

Assignee

Updated

•

15 years ago

Attachment #397078 - Flags: review? → review?(brendan)

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 27

•

15 years ago

Unfortunately comment #0 had a wrongly named js binary - it's not supposed to be called js-dbg-moz191-intelmac - I had hardcoded my compile scripts to output "intelmac" on every platform. :-/ It's a Linux-only bug as far as I can tell, doesn't seem to occur on Mac/WinXP. Apologies for the confusion. The patch seems to work out great on Linux! Thanks Graydon!

Keywords: regressionwindow-wanted

Graydon Hoare :graydon

Assignee

Comment 28

•

15 years ago

Comment on attachment 397078 [details] [diff] [review] fix the bug Brendan reviewed in person at my desk just now, so removing the r? on him and landing.

Attachment #397078 - Flags: review?(brendan)

Graydon Hoare :graydon

Assignee

Comment 29

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/8ad65f099bb5

Whiteboard: fixed-in-tracemonkey

Blake Kaplan (:mrbkap) (inactive)

Comment 30

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/8ad65f099bb5

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Robert Sayre

Comment 31

•

15 years ago

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/c3d0a4cd42ce

status1.9.2: --- → beta1-fixed