Closed
Bug 679710
Opened 13 years ago
Closed 5 years ago
FF6 is 5x slower than Chromium 15 on this JS benchmark
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
WORKSFORME
People
(Reporter: trigrou, Unassigned)
References
Details
(Keywords: perf, testcase, Whiteboard: js-triage-done)
Attachments
(2 files)
User Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Ubuntu/11.04 Chromium/15.0.849.0 Chrome/15.0.849.0 Safari/535.1
Steps to reproduce:
I ran this page: http://osgjs.org/osgjs/sandbox/key_implementation.html
Actual results:
It's 5 times slower on Firefox 6 than on Chromium 15.
Expected results:
At best, the same speed.
Updated•13 years ago
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Comment 1•13 years ago
Here's the raw data from the perf profiler. This includes Firefox warm startup, but that took a lot less time than running the benchmark, so it should still be meaningful. Below is a summary; you can get the full details by running:
$ xz -d perf.data.xz
$ perf report
Here you go:
55.93% perf-6053.map 0x7f127313ff18
8.83% libxul.so js_GetProperty(JSContext*, JSObject*, JSObject*
6.23% libxul.so js::mjit::stubs::GetElem(js::VMFrame&)
5.62% libxul.so js::PropertyTable::search(long, bool)
3.35% libxul.so js::NewBuiltinClassInstance(JSContext*, js::Cla
2.97% libxul.so RunTracer(js::VMFrame&, js::mjit::ic::TraceICIn
2.08% libxul.so void js::gc::FinalizeArenas<JSObject_Slots2>(JS
1.55% libxul.so js::MonitorTracePoint(JSContext*, bool*, void**
1.13% libxul.so js_CheckForStringIndex(long)
1.07% libxul.so js::mjit::stubs::GreaterEqual(js::VMFrame&)
0.75% libxul.so js_ValueToNonNullObject(JSContext*, js::Value c
0.66% libxul.so js_ValueToBoolean(js::Value const&)
0.48% libxul.so DisabledGetElem(js::VMFrame&, js::mjit::ic::Get
0.39% libxul.so js::ToNumberSlow(JSContext*, js::Value, double*
0.33% libxul.so js::mjit::stubs::ValueToBoolean(js::VMFrame&)
0.30% libxul.so js::gc::RefillFinalizableFreeList(JSContext*, u
0.28% libxul.so JSObject::getGlobal() const
0.25% libxul.so js_GetCurrentBytecodePC(JSContext*)
0.19% libxul.so js::mjit::stubs::InvokeTracer(js::VMFrame&, js:
0.18% [nvidia] 0x698e0c
0.16% libxul.so MOZ_Z_inflate_fast
0.16% [kernel.kallsyms] csd_lock_wait.clone.1
0.14% libpthread-2.13.so pthread_mutex_lock
0.13% ld-2.13.so do_lookup_x
0.13% [kernel.kallsyms] hpet_next_event.clone.3
0.12% libxul.so SearchTable(PLDHashTable*, void const*, unsigne
0.12% [nvidia] cache_flush
0.11% libpthread-2.13.so __pthread_mutex_unlock_usercnt
0.11% [kernel.kallsyms] clear_page_c
0.10% [kernel.kallsyms] put_mems_allowed
0.09% libxul.so JS_PropertyStub
0.08% firefox arena_dalloc
0.08% [kernel.kallsyms] handle_mm_fault
0.07% libc-2.13.so __memset_sse2
0.07% [kernel.kallsyms] page_fault
0.06% ld-2.13.so _dl_fixup
0.06% libxul.so CheckScript(JSScript*, JSScript*)
0.06% libxul.so MOZ_Z_crc32
0.06% libxul.so PickChunk(JSContext*)
Comment 2•13 years ago
Note: I got this in the Nightly from August 15 on Linux x86-64.
Comment 3•13 years ago
And I confirm that Chrome 13 is a lot faster on it.
Version: 6 Branch → Trunk
Updated•13 years ago
Hardware: x86 → All
Comment 4•13 years ago
The top 55.93% entry,
55.93% perf-6053.map 0x7f127313ff18
is methodjit code according to this call tree:
perf-6053.map 0x7f127313ff18
0x7f1273105b9d
js::mjit::EnterMethodJIT(JSContext*, js::StackFrame*, void*, js::Value*)
js::mjit::JaegerShotAtSafePoint(JSContext*, void*)
js::Interpret(JSContext*, js::StackFrame*, js::InterpMode)
js::Invoke(JSContext*, js::CallArgs const&, js::MaybeConstruct)
js::ExternalInvoke(JSContext*, js::Value const&, js::Value const&, unsig
JS_CallFunctionValue
nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMeth
nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, n
PrepareAndDispatch
SharedStub
nsEventListenerManager::HandleEventSubType(nsListenerStruct*, nsIDOMEven
nsEventListenerManager::HandleEventInternal(nsPresContext*, nsEvent*, ns
nsEventTargetChainItem::HandleEvent(nsEventChainPostVisitor&, unsigned i
nsEventTargetChainItem::HandleEventTargetChain(nsEventChainPostVisitor&,
nsEventDispatcher::Dispatch(nsISupports*, nsPresContext*, nsEvent*, nsID
DocumentViewerImpl::LoadComplete(unsigned int)
_ZN10nsDocShell11EndPageLoadEP14nsIWebProgressP10nsIChannelj.part.109
nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, un
nsDocLoader::FireOnStateChange(nsIWebProgress*, nsIRequest*, int, unsign
nsDocLoader::doStopDocumentLoad(nsIRequest*, unsigned int)
nsDocLoader::DocLoaderIsEmpty(int)
nsDocLoader::OnStopRequest(nsIRequest*, nsISupports*, unsigned int)
nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, unsigned int)
nsDocument::DoUnblockOnload()
nsDocument::DispatchContentLoadedEvents()
nsRunnableMethodImpl<void (nsHTMLStyleElement::*)(), true>::Run()
nsThread::ProcessNextEvent(int, int*)
NS_ProcessNextEvent_P(nsIThread*, int)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)
MessageLoop::Run()
nsBaseAppShell::Run()
nsAppStartup::Run()
XRE_main
main
__libc_start_main
0x7f1273138835
js::mjit::EnterMethodJIT(JSContext*, js::StackFrame*, void*, js::Value*)
Updated•13 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•13 years ago
Whiteboard: js-triage-needed
Updated•13 years ago
Summary: javascript performance issue → FF6 is 5x slower than Chromium 15 on this JS benchmark
Comment 5•13 years ago
Some numbers:
          Test 0  Test 1
d8      :   1251    2306
js -m -n:   7686    3618
js -m   :   6377    4056
Assignee: general → jandemooij
Status: NEW → ASSIGNED
Comment 6•13 years ago
The main problem here is GetElem stub calls. The script defines a Key function which inherits from Array. A Key object is just an array with length 3, some extra functions and a named property.
So this is yet another bug depending on bug 586842...
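For readers unfamiliar with the pattern, here is a minimal sketch of an array that also carries a named property and extra functions. The names makeKey, time and getValue are hypothetical and not taken from the benchmark, and the sketch attaches the extras directly to the array instead of reproducing how the script sets up inheritance from Array:

```javascript
// Hypothetical illustration; not the benchmark's actual Key code.
function makeKey(x, y, z, time) {
    var key = [x, y, z];          // dense array of length 3
    key.time = time;              // named (non-index) property on the array
    key.getValue = function () {  // extra function stored on the same object
        return this[0] + this[1] + this[2];
    };
    return key;
}

var k = makeKey(1, 2, 3, 0.5);
var sum = k.getValue();  // element reads on an array that also has named properties
```

Element reads such as this[0] on objects of this shape are the kind of access that, per the comment above, ended up in GetElem stub calls.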
Comment 7•13 years ago
On closer look, only Test 0 uses arrays with named properties. That explains why Test 0 is faster in d8 than Test 1. Test 1 uses a single array to store everything. We are 1.5x slower there; I will try to find out why.
Comment 8•13 years ago
Argh, the difference for Test 1 is due to a bug in the script. This line:
var endTime = keyEnd[keyEnd];
Should be:
var endTime = keys[keyEnd];
The problem was that we were taking stub calls for the GetElem and later on for >= because endTime was undefined.
With this fixed, for Test 1:
d8 : 1168
js -m -n: 1264
js -m : 1912
d8 no/cs: 2739
Interestingly, Test 1 is now faster than Test 0 in both SM and V8.
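A small sketch of why the typo produces undefined and then an always-false comparison. The array contents and the variable t are made up for illustration; only the indexing mistake mirrors the benchmark line quoted above:

```javascript
// Hypothetical data; only the keyEnd[keyEnd] vs keys[keyEnd] mistake is from the script.
var keys = [0.0, 0.5, 1.0];
var keyEnd = keys.length - 1;

var buggyEndTime = keyEnd[keyEnd];  // indexes the number 2, so the result is undefined
var fixedEndTime = keys[keyEnd];    // reads the last array element

var t = 0.75;
var buggyCompare = t >= buggyEndTime;  // undefined converts to NaN, so this is always false
var fixedCompare = t >= fixedEndTime;  // an ordinary number-to-number comparison
```

Because the buggy version compares a number against undefined, the >= never takes the int32/double fast path, which matches the stub calls described in this comment.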
Comment 9•13 years ago
Cedric, the -n switch enables the type inference engine in the JavaScript shell. TI will be available in Firefox 9 if no (major) problems are found. In case you want to try it out, you can download a nightly build from http://nightly.mozilla.org/. TI is enabled by default in the browser.
Thanks for the bug report and please let us know if you find other performance problems.
Comment 10•13 years ago
To beat V8 on Test 1 we have to make this fast:
--
function f() {
    var t0 = new Date;
    var x;
    for (var i=0; i<10000000; i++) {
        if (x) {};
        x = 1;
    }
    print(new Date - t0);
}
f();
--
The problem is that we call stubs::ValueToBoolean for "if (x)" because x is either undefined or int32. booleanJumpScript only supports boolean or known-int32.
With TI, booleanJumpScript should look at the possible types of x. Ideally it should support things like undefined-or-object, null-or-object, undefined-or-int32, etc. Generating inline code for 2 or 3 types would probably cover most cases.
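As a rough sketch of what "inline code for 2 or 3 types" could amount to, the truthiness test for a value known to be either undefined or int32 reduces to two cheap checks. The helper name truthyUndefinedOrInt32 is hypothetical and is written in JavaScript purely for illustration; the real change would be in the machine code the method JIT emits, not in script:

```javascript
// Hypothetical model of a type-specialized truthiness test.
// Assumes x is only ever undefined or an int32, as in the loop above.
function truthyUndefinedOrInt32(x) {
    if (x === undefined) {
        return false;   // undefined is always falsy
    }
    return x !== 0;     // an int32 is truthy unless it is zero
}

// Compare against the generic conversion for a few representative inputs.
var samples = [undefined, 0, 1, -7, 2147483647];
var agrees = samples.every(function (v) {
    return truthyUndefinedOrInt32(v) === Boolean(v);
});
```

For these inputs the specialized check gives the same answer as the generic conversion, so no call into a general ValueToBoolean routine would be needed.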
Comment 11•12 years ago
Bug 827490 just landed. It might help.
Comment 12•12 years ago
Thanks to the shell test case this was pretty easy to test.
Before:
Test 0: 5024
Test 1: 2076
After:
Test 0: 1614
Test 1: 2118
Looking pretty good.
Comment 13•12 years ago
So just to check, it's expected that we're still 3x slower than V8 on this?
Comment 14•12 years ago
Well, no. There's a shell testcase that would be good to add to awfy-assorted so we can track perf here. At a glance, the testcase is wrapped in a big closure, and closures are pretty rotten for SpiderMonkey perf. I wrote a patch for this last month in bug 821361, but it's just been sitting there waiting for review. What happens if you remove the closure?
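To make the question concrete, here is a minimal sketch of the two shapes being compared. The loop body and all names are hypothetical and far simpler than the real testcase; the sketch only shows what "wrapped in a closure" versus "closure removed" looks like:

```javascript
// Shape 1: everything lives inside one big closure.
var wrappedResult = (function () {
    var total = 0;
    function add(n) { total += n; }   // 'total' is a closed-over variable
    for (var i = 0; i < 1000; i++) {
        add(i);
    }
    return total;
})();

// Shape 2: the same work with the wrapper removed.
var total = 0;
function add(n) { total += n; }
for (var i = 0; i < 1000; i++) {
    add(i);
}
var unwrappedResult = total;
```

Both shapes compute the same value; the difference is only in how the engine has to look up total and add.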
Comment 15•12 years ago
With closure:
SpiderMonkey:
Test 0: 1767
Test 1: 2257
d8:
Test 0: 429
Test 1: 794
Without closure:
SpiderMonkey:
Test 0: 1405
Test 1: 2012
d8:
Test 0: 611
Test 1: 812
Comment 16•12 years ago
Er, wait. There are more nested closures. If I take those out too, I get:
SpiderMonkey:
Test 0: 1371
Test 1: 2028
d8:
Test 0: 646
Test 1: 840
Comment 10 might cover this.
Comment 17•10 years ago
Now I get:
SpiderMonkey:
Test 0: 279
Test 1: 2130
d8:
Test 0: 245
Test 1: 938
So test 0 is close, test 1 is still a lot slower. According to Instruments, we spend 76% under js::GetElement, will take a look.
Comment 18•10 years ago
(In reply to Jan de Mooij [:jandem] from comment #17)
> So test 0 is close, test 1 is still a lot slower. According to Instruments,
> we spend 76% under js::GetElement, will take a look.
Ah this is due to the problem I mentioned in comment 8: keyEnd[keyEnd] should be keys[keyEnd]. With that fixed:
SpiderMonkey:
Test 0: 279
Test 1: 186
d8:
Test 0: 243
Test 1: 214
So Test 1 is a bit faster than d8, test 0 is about 10% slower.
Comment 19•10 years ago
Test 0 is still a bit slower than V8 due to bug 1073587.
Unassigning myself as I'm not working on this atm and it's fixed for the most part.
Assignee: jdemooij → nobody
Status: ASSIGNED → NEW
Comment 20•5 years ago
The shell testcase (with comment #8 fixed) is now slightly faster for Test 1 when compared to V8, and noticeably faster for Test 0. Therefore resolving this issue as WFM.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME