Closed
Bug 580515
Opened 14 years ago
Closed 14 years ago
TM: LIR_cmovd mishandled with X86_FORCE_SSE2=no
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey)
Attachments
(1 file)
(deleted),
patch
|
edwsmith
:
review+
|
Details | Diff | Splinter Review |
Nanojit's i386 back-end has two modes. You can avoid generating SSE code (ie. use the x87 stack only for FP computation) by specifying the env var X86_FORCE_SSE2=no. I'm probably the only person who tests this configuration regularly. The following three trace-tests are failing
X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/set-outer-trace-3.js
X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/t004.js
X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/t027.js
The failure is this:
Assertion failure: (_allocator.active[FST0] && _fpuStkDepth == -1) || (!_allocator.active[FST0] && _fpuStkDepth == 0) (../nanojit/Assembler.cpp:352)
It could well be a latent bug in the i386 backend that fatvals has exposed.
Comment 1•14 years ago
|
||
(In reply to comment #0)
> It could well be a latent bug in the i386 backend that fatvals has exposed.
That error is unlike anything I've ever seen caused by the tracer. Do you know what situation it is asserting?
Assignee | ||
Comment 2•14 years ago
|
||
(In reply to comment #1)
>
> That error is unlike anything I've ever seen caused by the tracer.
Probably because you've never run with X86_FORCE_SSE2=no before.
> Do you know
> what situation it is asserting?
In x87 mode the backend is really simplistic, it only ever pushes a single FP value onto the x87 stack at a time. Sometimes you have to pop what's on the stack explicitly, and sometimes you consume it naturally. There's 'bool pop' parameters in various places in Nativei386.cpp, eg. see the first arg of FSTQ.
It also bleeds into Assembler.cpp, esp. see prepareResultReg() which contains a big comment about this stuff.
What the assertion means is that something has gone wrong with the stack management -- eg. it's trying to push a second value on the stack, or pop an empty stack (I'm not sure which). The _fpuStkDepth variable is how the back-end tracks how many elements the stack has. I've never quite understood how it works, though.
Assignee | ||
Comment 3•14 years ago
|
||
I heard on IRC that we can assume SSE2 for Firefox 4.0. In which case this bug is moot.
Shaver, can you confirm this?
Assignee | ||
Comment 4•14 years ago
|
||
The three tests mentioned in comment 0 are no longer failing. wmaddox fixed up some problems in the x87 assertion checking, that's probably what fixed it.
We do have one new failure:
X86_FORCE_SSE2=no /home/njn/moz/ws0/js/src/debug32/shell/js -j -e "const p
latform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/basic/test586387.js
/home/njn/moz/ws0/js/src/trace-test/tests/basic/test586387.js:14: Error: Assertion failed: got 37, expected 496
A regression range is needed.
Comment 5•14 years ago
|
||
For the bug 586387 test:
The first bad revision is:
changeset: 48822:e866db9165ef
user: Luke Wagner <lw@mozilla.com>
date: Tue Aug 03 22:06:44 2010 -0700
summary: Bug 584158 - ensure that typed arrays cannot produce non-canonical nans (r=gal)
Assignee | ||
Comment 6•14 years ago
|
||
(In reply to comment #5)
>
> The first bad revision is:
> changeset: 48822:e866db9165ef
> user: Luke Wagner <lw@mozilla.com>
> date: Tue Aug 03 22:06:44 2010 -0700
> summary: Bug 584158 - ensure that typed arrays cannot produce non-canonical
> nans (r=gal)
This revision added uses of LIR_cmovd, so I think it very likely that LIR_cmovd isn't handled properly on i386 without SSE2.
Summary: TM: tests failing with X86_FORCE_SSE2=no due to fatvals → TM: LIR_cmovd mishandled with X86_FORCE_SSE2=no
Assignee | ||
Comment 7•14 years ago
|
||
Here's a smaller test case:
var v = new Float32Array(32);
var t = 0;
for (var i = 0; i < v.length; ++i)
t += v[i];
print(t);
Run normally, it prints zero. Run with X86_FORCE_SSE2=no it prints a bogus
value, often -Infinity, often a very large negative double.
Here's the relevant LIR and native code:
ldf2d1 = ldf2d.tdata addi1[0]
...... fld32 0(eax)
eqd1 = eqd ldf2d1, ldf2d1
immd1 = immd nan
cmovd1 = cmovd eqd1 ? ldf2d1 : immd1/*nan*/
...... fld f0
...... fcompp
...... fnstsw_ax
...... test ah, 68
...... fldq -40(ebp) <= restore cmovd1
...... mov eax,-28(ebp) <= restore eos
...... jnp ......
...... ffree f0
...... fincstp
...... fldq (......)
The LIR is from TM's canonicalizeNaNs(). It compares ldf2d1 to itself (a
NaN) test. If it fails (ie. it's a NaN), it replaces it with a canonical
NaN value.
I haven't yet worked out the problem with the generated code, my x87
knowledge is weak.
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•14 years ago
|
||
(In reply to comment #7)
>
> eqd1 = eqd ldf2d1, ldf2d1
> immd1 = immd nan
> cmovd1 = cmovd eqd1 ? ldf2d1 : immd1/*nan*/
> ...... fld f0
> ...... fcompp
> ...... fnstsw_ax
> ...... test ah, 68
> ...... fldq -40(ebp) <= restore cmovd1
> ...... mov eax,-28(ebp) <= restore eos
> ...... jnp ......
> ...... ffree f0
> ...... fincstp
> ...... fldq (......)
The |fldq -40(ebp)| looks wrong. When generating code backwards we have these steps:
- After the cmovd1 instruction, cmovd1 is in FST0.
- When we get back to the comparison part (in asm_cmpd()) we need to use FST0 to hold ldf2d1 to do the comparison.
- So we have to "spill" cmovd1 out of FST0; because we're going backwards, this generates a restore (the |fldq -40(ebp)|)
- The problem is that -40(ebp), the spill slot assigned for cmovd1, doesn't hold something meaningful at this point. We're restoring garbage, because we're restoring cmovd1 before it's even been computed!
Man, I hate x87 code. Not sure what the fix is.
Assignee | ||
Comment 9•14 years ago
|
||
BTW, I think the |ffree f0; fincstp| sequence can be changed to just |fstp f0|. It just needs to pop the stack.
Assignee | ||
Updated•14 years ago
|
blocking2.0: --- → ?
Updated•14 years ago
|
blocking2.0: ? → betaN+
Assignee | ||
Comment 10•14 years ago
|
||
This ended up being a bit more complicated than I thought it would. The
best explanation of what this patch does comes from the following comment I
added to asm_cmov():
// The obvious way to handle this is as follows:
//
// mov rr, rt # only needed if rt is live afterwards
// do comparison
// jt end
// mov rr, rf
// end:
//
// The problem with this is that doing the comparison can cause
// registers to be evicted, possibly including 'rr', which holds
// 'ins'. And that screws things up. So instead we do this:
//
// do comparison
// mov rr, rt # only needed if rt is live afterwards
// jt end
// mov rr, rf
// end:
//
// Putting the 'mov' between the comparison and the jump is ok
// because move instructions don't modify the condition codes.
The "obvious way" was how things were done prior to the patch.
The rest of the patch is just a lot of moving stuff around to allow this to
happen. In particular:
- I had to allow separation of conditional tests and conditional branches,
which required adding asm_branch_helper() et al.
- In NativeX64.cpp, this required operand swapping performed for some double
comparisons to be done in asm_cmpd() instead of asm_branchd() and
asm_condd(). This makes the X64 back-end more like the i386 back-end,
which I think is a good thing.
- In Nativei386.cpp, I used the commonly-used FSTP(FST0) instruction to pop
the stack instead of the unusual FINCSTP()/FFREE() sequence. This allowed
FINCSTP() and FFREE() to be removed.
- I added a lirasm test. Without the patch, it asserts when in non-SSE2
mode.
The failing test mentioned in comment 4 now works, too.
Attachment #493853 -
Flags: review?(edwsmith)
Comment 11•14 years ago
|
||
Comment on attachment 493853 [details] [diff] [review]
patch (agasint TM 58013:2000c21b7910)
Tamarin doesn't currently use cmovd, so I haven't tested on the sandbox. Bug 606561 actually worked around the x87 bug; we can re-try a cmovd version of Math.abs() once this lands.
Attachment #493853 -
Flags: review?(edwsmith) → review+
Assignee | ||
Comment 12•14 years ago
|
||
http://hg.mozilla.org/projects/nanojit-central/rev/4effe362e918
http://hg.mozilla.org/tracemonkey/rev/a687492cff3d
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey
Comment 13•14 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•