Closed Bug 919958 Opened 11 years ago Closed 11 years ago

Ionmonkey: correct instruction numbering in alias analysis and re-enable heap load movement.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox26 --- fixed
firefox27 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: dougc, Assigned: dougc)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [games] )

Attachments

(3 files)

Bug 877338 improved the support for VN and LICM for the asm.js
heap load operations, but this causes errors in some asm.js demos.

For example, the BananaBench demo fails with:
JavaScript error: , line 0: uncaught exception: abort() at @bb.js:3542
ra@bb.js:6
.d@bb.js:3571
Kd@bb.js:3543

This error has been bisected to changeset 71f2968c7359.

The error appears to be related to LICM. If LICM is disable
then the error does not occur.

A workaround is to comment out the call to setMovable()
in the MAsmJSLoadHeap constructor.

Disabling Value Numbering alone does not stop the error.

The following changes do not prevent the error, suggesting
the new definitions of these are not to blame.

* Change MAsmJSLoadHeap::congruentTo to always return false.

* Change MAsmJSLoadHeap::mightAlias to always return true.

* Change MAsmJSLoadHeap getAliasSet() to return AliasSet::Load(AliasSet::Any)

* Change MAsmJSStoreHeap getAliasSet() to return AliasSet::Store(AliasSet::Any)

Perhaps an asm.js specific issue with LICM that has been tickled by
the improved support for load-heap?
Thanks for digging in to this!  One bug-finding technique I've used in the past to deal this sort of thing is to instrument the build to selectively disable the offending optimization in all methods in a range (determined by environment variables) and then bisect down to a single method, and then, from there, bisect down to a single instruction.  Once we have that, it should be a lot easier to understand what is happening.
I will try to provide all the help I can. However, as of today's Nightly (2013-09-24), I can't reproduce it on Linux x64.

Are there specific steps to raise this error? Shell or browser? All platforms?
Flags: needinfo?(dtc-moz)
This error reproduces reliably when the BananaBench demo, in the URL above, is run, selecting 'GO'.

Just tested on a x86 Linux m-c build.  Shall try some nightly builds soon.
Flags: needinfo?(dtc-moz)
I can reproduce this on 64-bit linux nightly. Says

uncaught exception: abort() at X<._abort@http://kripken.github.io/misc-js-benchmarks/banana/bb.js:3542
ra@http://kripken.github.io/misc-js-benchmarks/banana/bb.js:6
X<.__emscripten_push_uncounted_main_loop_blocker/<.d@http://kripken.github.io/misc-js-benchmarks/banana/bb.js:3571
X<._emscripten_set_main_loop/Kd@http://kripken.github.io/misc-js-benchmarks/banana/bb.js:3543
I see a similar abort on Epic Citadel now as well, http://unrealengine.com/html5 so this is looking more and more serious.
Whiteboard: [games]
Whiteboard: [games] → [games] [leave open]
Comment on attachment 810289 [details] [diff] [review]
Disable movement of the heap loads while the issue is explored.

Review of attachment 810289 [details] [diff] [review]:
-----------------------------------------------------------------

Is disabling heap loads also the solution for the Epic demo?
Attachment #810289 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #7)
> Comment on attachment 810289 [details] [diff] [review]
> Disable movement of the heap loads while the issue is explored.
> 
> Review of attachment 810289 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Is disabling heap loads also the solution for the Epic demo?

The Epic demo did work in a recent build with this patch applied,
but lets see.  It would be unexpected if disabling the heap load
movement caused a regression.
Keywords: checkin-needed
I tested locally and this fixes Citadel/BananaBread.
FIXME: convention nag.

Thanks for fast comment-out action. Is there an optimization penalty?

/be
(In reply to Brendan Eich [:brendan] from comment #12)
The optimization was only in for a few days before it was disabled and no movement on awfy.  This makes sense b/c we'd expect LLVM to have already done this to the extent it could.

It would be useful to understand what went wrong here; based on comment 0, it sounds like a bug in LICM.
Assignee: nobody → dtc-moz
OS: Linux → All
Hardware: x86 → All
Summary: Odinmonkey: error related to heap load LICM → Ionmonkey: correct instruction number in alias analysis and re-enable heap load movement.
Summary: Ionmonkey: correct instruction number in alias analysis and re-enable heap load movement. → Ionmonkey: correct instruction numbering in alias analysis and re-enable heap load movement.
Whiteboard: [games] [leave open] → [games]
The alias analysis algorithm depends on the instruction id ordering but the id of the last instruction in blocks was not being set which lead to random but systematic failures.  Code was being hoisted that should not have been.

The patch assigns an ordered id to the last instruction in blocks.

There appears to be a similar potential issue in 'edge case analysis' and although it did not contribute to any known failure the patch also assigns an ordered id to the last instruction in blocks during this pass.

The movement of asm.js heap loads is re-enabled, and Citadel runs.
Attachment #821471 - Flags: review?(jdemooij)
Comment on attachment 821471 [details] [diff] [review]
Correct instruction numbering and re-enable heap load movement

Review of attachment 821471 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, good find!
Attachment #821471 - Flags: review?(jdemooij) → review+
Keywords: checkin-needed
(In reply to Jan de Mooij [:jandem] from comment #15)
> Comment on attachment 821471 [details] [diff] [review]
> Correct instruction numbering and re-enable heap load movement

https://hg.mozilla.org/integration/mozilla-inbound/rev/22abf408f98a
Status: NEW → ASSIGNED
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/22abf408f98a
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: JIT code generation errors.
Testing completed (on m-c, etc.): The fix has landed on m-c, and corrected the bug in Citadel
Risk to taking this patch (and alternatives if risky): Low.  It fixes an omission in instruction numbering.
String or IDL/UUID changes made by this patch:

The bug in the instruction numbering exits in prior FF versions and could lead to erroneous code generation.  The heap load movement support was only added recently so is not relevant for older versions.  This patch includes the fix alone.  The source patch in attachment 821471 [details] [diff] [review] has r+.
Attachment #825654 - Flags: approval-mozilla-beta?
Attachment #825654 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: gecko-games
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: