Closed
Bug 678928
Opened 13 years ago
Closed 13 years ago
WebGL JS broken after pushing 649202
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox7 | --- | unaffected |
firefox8 | + | fixed |
firefox9 | + | fixed |
People
(Reporter: romaxa, Assigned: mjrosenb)
References
()
Details
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
adrake
:
review+
johnath
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Latest trunk show broken WebGL on ARM
Also on
http://learningwebgl.com/lessons/lesson04/index.html - it crashes
Reporter | ||
Comment 1•13 years ago
|
||
#0 0x3aaf57fc in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:45
#1 0x3bdf2280 in CheckScript (script=<value optimized out>, prev=<value optimized out>)
at js/src/jsscript.cpp:306
#2 0x3bdf2a98 in js_TraceScript (trc=0xae91571c, script=0x3b712448, owner=0x6) at js/src/jsscript.cpp:1442
#3 0x3bd7bd68 in ScanObject (this=0xae91571c) at js/src/jsgcmark.cpp:636
#4 js::GCMarker::drainMarkStack (this=0xae91571c) at js/src/jsgcmark.cpp:802
#5 0x3b67dd84 in setMarkColor (trc=0xae91571c, data=0x11c7) at js/src/jsgc.h:1314
#6 XPCJSRuntime::TraceJS (trc=0xae91571c, data=0x11c7) at js/src/xpconnect/src/xpcjsruntime.cpp:383
#7 0x3bd71288 in js::MarkRuntime (trc=0xae91571c) at js/src/jsgc.cpp:1846
#8 0x3bd73228 in MarkAndSweep (cx=0x405885c0, comp=<value optimized out>, gckind=GC_SHRINK, gcTimer=<value optimized out>)
at js/src/jsgc.cpp:2272
#9 GCCycle (cx=0x405885c0, comp=<value optimized out>, gckind=GC_SHRINK, gcTimer=<value optimized out>)
at js/src/jsgc.cpp:2642
#10 0x3bd73cd4 in js_GC (cx=0x405885c0, comp=<value optimized out>, gckind=GC_SHRINK) at js/src/jsgc.cpp:2728
#11 0x3b328c3c in nsJSContext::ScriptEvaluated (this=0x41031d80, aTerminated=4551)
at dom/base/nsJSEnvironment.cpp:3167
#12 0x3b3245bc in nsJSContext::ScriptExecuted (this=0x41031d80) at dom/base/nsJSEnvironment.cpp:3239
#13 0x3b68a818 in AutoScriptEvaluate::~AutoScriptEvaluate (this=0xae915ae8, __in_chrg=<value optimized out>)
---Type <return> to continue, or q <return> to quit---
at js/src/xpconnect/src/xpcwrappedjsclass.cpp:122
#14 0x3b68c9d0 in nsXPCWrappedJSClass::CallMethod (this=0x42150b50, wrapper=<value optimized out>, methodIndex=<value optimized out>, info=0x40468848,
nativeParams=0xae915c10) at js/src/xpconnect/src/xpcwrappedjsclass.cpp:1896
#15 0x3b6870e0 in nsXPCWrappedJS::CallMethod (this=0x42198640, methodIndex=3, info=0x40468848, params=0xae915c10)
at js/src/xpconnect/src/xpcwrappedjs.cpp:585
#16 0x3bacb780 in PrepareAndDispatch (self=<value optimized out>, methodIndex=<value optimized out>, args=0xae915cd4)
at xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:131
#17 0x3bacae00 in SharedStub () from /nfsobj-build/dist/bin/libxul.so
#18 0x3af25740 in nsRefreshDriver::Notify (this=0x3fd8c540, aTimer=<value optimized out>)
at layout/base/nsRefreshDriver.cpp:361
#19 0x3baba5a4 in nsTimerImpl::Fire (this=0x42198680) at xpcom/threads/nsTimerImpl.cpp:427
#20 0x3baba720 in nsTimerEvent::Run (this=<value optimized out>) at xpcom/threads/nsTimerImpl.cpp:520
#21 0x3bab6a00 in nsThread::ProcessNextEvent (this=0x3fd0cc50, mayWait=<value optimized out>, result=<value optimized out>)
at xpcom/threads/nsThread.cpp:631
Reporter | ||
Comment 2•13 years ago
|
||
Have this problem on Maemo6 and WebOS.
Updated•13 years ago
|
Group: core-security
Comment 3•13 years ago
|
||
GC-related crash. Its unclear to me how the ARM patch caused this. Just in case we unearthed an existing GC problem, I am going to hide the bug until someone gets around to take a look. This doesn't crash x86, at least not on mac or linux, but maybe thats just luck/different memory pressure.
Reporter | ||
Comment 4•13 years ago
|
||
I've bisected it... and before that commit everything works fine... after WebGL broken and some WEbGL examples crashes.
Reporter | ||
Comment 5•13 years ago
|
||
ARM patch might cause some VFP incorrect behavior and memory corruption.
Assignee | ||
Comment 6•13 years ago
|
||
It is entirely possible that I generated an instruction that does not exist on your particular chip. Do you know what type of chip you have or what instruction sets are supported?
While running JIT'ed code, that is no good reason for the stack to look like anything sane, but that does appear as if you have a rather sane stack that is far from any JIT'ed code.
Reporter | ||
Comment 7•13 years ago
|
||
Here is the CPUINFO on chips where I see problems with WebGL
HP Touchpad:
Processor : ARMv7 Processor rev 2 (v7l)
processor : 0
BogoMIPS : 13.52
Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x0
CPU part : 0x02d
CPU revision : 2
Hardware : TENDERLOIN
Revision : 0000
Serial : 0000000000000000
N9:
Processor : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 996.74
Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc08
CPU revision : 2
Hardware : Nokia RM-696 board
Revision : 1504
Serial : 0000000000000000
Soc Info : OMAP 3630 ES1.2-hs
IDCODE : 2b89102f
Pr. ID : 00000000 00000000 000000cc cafeb891
Reporter | ||
Comment 8•13 years ago
|
||
Also on N9 we are using hardfp toolchain if that make any difference
Assignee | ||
Comment 9•13 years ago
|
||
That looks like it supports all of the instructions that I would have used. I only have an android device for testing. I'll see if it reproduces there, and otherwise, I'll borrow someone's device that does exhibit the issue.
Reporter | ||
Comment 10•13 years ago
|
||
if you are in MV office, the you probably can take N9 from dougt... For webOS you need additional patches to compile FF... probably JS part is easy to compile there
Reporter | ||
Comment 11•13 years ago
|
||
Tested this also on Galaxy Tab 9.0xx , and it works wrong also there, and crashes too.
http://learningwebgl.com/lessons/lesson05/index.html
http://learningwebgl.com/lessons/lesson04/index.html
Reporter | ||
Comment 12•13 years ago
|
||
any progress here? this seems like on the way to 8.0 release, and presumably breaking WebGL on android (at least galaxy tab similar HW)
Comment 13•13 years ago
|
||
We cannot keep carrying this regression. If we don't have a fix today, please back out the offending patch.
Assignee | ||
Comment 14•13 years ago
|
||
I haven't been able to get a maemo development environment set up yet, and the problem doesn't reproduce on my nexus-s. I'll see if I can borrow someone's galaxy S today in order to get to the point where I can actually start debugging this.
Reporter | ||
Comment 15•13 years ago
|
||
Not sure about galaxy S, but I have Webg broken on galaxy Tab... hope HW is the same for them
Assignee | ||
Comment 16•13 years ago
|
||
Sorry, I meant galaxy tab, not galaxy S (must be too used to typing nexus S).
Assignee | ||
Comment 17•13 years ago
|
||
I still haven't been able to get an environment to reproduce this...
A simple fix may be to just disable fast typed arrays on arm rather than to back out the whole patch. It should be as simple as removing the line: ENABLE_POLYIC_TYPED_ARRAY=1
from the arm section of js/src/configure.in
Comment 18•13 years ago
|
||
Is that disabling any optimization we didn't have before the patch? And what is the path to fix that?
Comment 19•13 years ago
|
||
Why would the specific ARM chip matter if it's a GC bug?
Assignee: general → mrosenberg
Comment 20•13 years ago
|
||
Also, what does it mean that you can't get an environment to reproduce this? I have it reproduced on my galaxy s2. If you don't have one you can use mine. I am downstairs on the 2nd floor.
Reporter | ||
Comment 21•13 years ago
|
||
(In reply to Daniel Veditz from comment #19)
> Why would the specific ARM chip matter if it's a GC bug?
I think we have a deal with memory corruption, which endup in GC...
Comment 22•13 years ago
|
||
If we suspect memory corruption, the patch has to come out. In particular if we have no explanation whats going on here exactly. Lets investigate this on a project branch instead of carrying risk on trunk. Thats what project branches are there for.
Assignee | ||
Comment 23•13 years ago
|
||
Looks like I got this right in 4/5 of the callsites, and the last callsite is never hit by any of our tests.
In other news, fmem_imm_off is something that shouldn't be called from outside ARMAssembler.*, so I'm about to clean that up.
Assignee | ||
Comment 24•13 years ago
|
||
additional thought: it would appear as if storeFloat does not just store floats. It takes a double, converts it to a float, then stores that. This is blatantly obvious because storefloat takes a "floatReg", which is used for both single precision and double precision values.
If storeFloat is actually supposed to convert a double to a float before writing it to memory, then most of the variants of storeFloat are wrong, which means that they are not tested anywhere. I'm happy to change it, but we should get some tests that make sure this is doing something sane.
Reporter | ||
Comment 25•13 years ago
|
||
Comment on attachment 554036 [details] [diff] [review]
comments saying "fmem_imm_offset is expecting the offset in words" should be heeded
ok, with this patch I see webgl working fine again, and no crash.
Attachment #554036 -
Flags: feedback+
Assignee | ||
Comment 26•13 years ago
|
||
as (sort of) stated earlier, fmem_imm_off cannot be called with any offset. In order to have a sane macro assembler, this has been wraped in floatTransfer, which will ensure that the offset is encoded correctly.
for whatever reason, I'd called fmem_imm_off, rather than floatTransfer. this has been mitigated.
Attachment #554036 -
Attachment is obsolete: true
Assignee | ||
Updated•13 years ago
|
Attachment #554243 -
Flags: review?(adrake)
Updated•13 years ago
|
Attachment #554243 -
Flags: review?(adrake) → review+
Updated•13 years ago
|
tracking-firefox8:
--- → ?
tracking-firefox9:
--- → ?
Assignee | ||
Comment 27•13 years ago
|
||
Comment on attachment 554243 [details] [diff] [review]
should fix the issue *and* not have 10-bit overflow issues
This is a pretty big security risk. Using a typed array, and crafting your indexes into it, you can write single floating point values up to 3 times the length of the array past the end of the array. Most likely this would just corrupt the heap but I cannot say for sure.
The fix is plenty simple. It calling an already existent wrapper function that
will encode the offset, but it will apply the correct shift, and it will also guard against cases where the offset is out of range for the given instruction.
Attachment #554243 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Comment 28•13 years ago
|
||
Landed on m-c:
http://hg.mozilla.org/mozilla-central/rev/af2098f08fc7
Comment 29•13 years ago
|
||
Comment on attachment 554243 [details] [diff] [review]
should fix the issue *and* not have 10-bit overflow issues
Discussed and approved in triage - we're assuming this isn't needed on beta? (Please land it soon!)
Attachment #554243 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Reporter | ||
Comment 30•13 years ago
|
||
beta does not have fix for 649202 IIRC
Comment 31•13 years ago
|
||
Landed on Aurora, for Firefox 8:
http://hg.mozilla.org/releases/mozilla-aurora/rev/9a243f427279
status-firefox8:
--- → fixed
Updated•13 years ago
|
status-firefox7:
--- → unaffected
status-firefox9:
--- → fixed
Updated•13 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•