Closed
Bug 772466
Opened 12 years ago
Closed 12 years ago
Much slower than WebKit on Dromaeo DOM getAttribute()
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jrmuizel, Unassigned)
References
Details
Attachments
(3 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details |
http://dromaeo.com/?id=174436,174430,174431
Moz WebKit
getAttribute 73.03runs/s ±27.06% 224.39runs/s ±9.30%
I will try to get a profile for this.
Reporter | ||
Comment 1•12 years ago
|
||
It seems like this is not just on mobile.
With http://people.mozilla.com/~jmuizelaar/dom-tests/dom-attr.html I get:
509 in Chrome and 1416 in FF
Summary: Much slower than WebKit on some of the DOM attribute tests on mobile → Much slower than WebKit on some of the DOM getAttribute()
Reporter | ||
Updated•12 years ago
|
Summary: Much slower than WebKit on some of the DOM getAttribute() → Much slower than WebKit on Dromaeo DOM getAttribute()
Comment 2•12 years ago
|
||
I think this is a dup, and IIRC sicking was looking at this some time ago.
Comment 3•12 years ago
|
||
A perf profile would be still great.
Comment 4•12 years ago
|
||
Jeff did a profile, and it looks like some DOM binding and string stuff, the ascii-lowercasing, and the general attribute name manipulation junk.
Bug 558516 covers a lot of the latter.
As far as DOM bindings go, on the testcase from comment 1 we make 102004000 calls to getAttribute in 15524ms on my machine, so about 152ns per call. Chrome taks 5423ms, so 53ns per call.
I tried timing a method that just takes a (constant) string arg and returns an nsString (so makes use of the same refcounting-and-sharing setup that getAttribute would), to get an idea of binding performance. Such a method with the quickstub bindings (which is what getAttribute is right now) seems to take about 90ns and with the webidl bindings takes about 72ns. It'll get a little faster still when efaust does his specialized-native stuff for methods, but that'll save maybe another 10ns.
So we need to both reduce the 62ns of actual attr-value-getting gunk we have (starting with bug 558516) and do something to speed up the actual binding code here, since 62 > 53. ;)
A profile of just the binding testcase says something like this:
17% is self time in the generated binding code; efaust's stuff might help here.
15% of the time is kernel time in vm_fault
11% is under JS_NewInternalString
10% is the overhead of the two(!) nested function calls it takes to get to there from
the binding code (xpc::NonVoidStringToJsval and XPCStringConver::ReadableToJSVal)
10% is the string assignment in the underlying C++ (the mov after the "lock add" is
particularly prominent)
9% are the destructors for the in-param dependent string and the return value
nsString
2% is JS_GetStringCharsZAndLength
17% is actual jitcode.
So yeah.
Jeff mentioned that WebKit has some sort of hashtable mapping string addresses to JS objects. We could try doing something like that and seeing how much it helps; we'd gain an atomic release in the profile if we did that, I bet, since right now we steal the nsString's ref for the JSExternalString.
Depends on: 558516
Comment 5•12 years ago
|
||
A bit more data. I did some testing with a new-binding method taking a string and returning nothing, a new-binding method taking nothing and returning a string, and new-binding method that takes nothing and returns nothing. Based on that, the times needed for the three things involved are approximately:
Method itself: 20ns
String argument: 15ns
String retval: 35s
which actually adds up to the 70ns I was seeing for the combination of the three above, yay.
Again, I think we can get the method itself down closer to 10-12ns. We should probably get separate bugs filed for string arguments and string return values....
For string arguments, I wonder whether we can use a "fake" nsDependentString with an inline constructor and destructor. Basically just have a class with the same member layout and a no-op destructor, which is enough for nsDependentString, and reinterpret_cast it when making the call. This relies on there being no virtual stuff on strings, of course. The current setup involves two function calls just to get to code that decides to do nothing for our dependent string. :( Peter, thoughts?
For string return values we should try this hashtable thing. 35ns is a pretty long time compared to even a hashtable lookup.
Comment 6•12 years ago
|
||
Filed bug 773519 on string arguments and bug 773520 on string return values.
Comment 7•12 years ago
|
||
Oddly enough, an experiment to use non-atomic refcounting on stringbuffers doesn't seem to change much in the timing...
IIRC Justin saw the same thing last time he looked at that.
Comment 9•12 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> IIRC Justin saw the same thing last time he looked at that.
I did, and it's one of those experimental results I had difficulty believing, but never re-ran. Maybe there's some other atomic op nearby which forces us to flush the string buffer refcount cache line even if we don't use atomic ops on the refcount. I dunno.
Comment 10•12 years ago
|
||
We should make methods non-virtual too. InternalGetExistingAttrNameFromQName and GetAttr
(And we should have a method like InternalGetExistingAttrNameFromQName which returns the value)
Comment 11•12 years ago
|
||
So with the patches in the three bugs this depends on, we're down to about 90ns (from 145).
Switching to new bindings will buy us at least another 20, though we may be able to get some of that now if we switch to calling a non-virtual method from the custom quickstub. If efaust's method patches go well, it might buy us 30.
Olli's removal of the XUL prototype stuff will let us remove a bit more code.
At that point we'll at least be in the same ballpark and it'll be worth reprofiling and filing off more rough edges (e.g. making the string-return fast path even faster by inlining the simple case, maybe).
Comment 12•12 years ago
|
||
Just as an update, compared to comment 5 I see these numbers with the patches this bug depends on:
Method itself: 16ns
String argument: 5ns
String retval: 26s
getAttribute is down to about 90ns. Of that time, about 35%, so about 31ns, is in the GetAttribute code itself. That includes the string assignment inside nsAttrValue::ToString, etc.
efaust's patches should get the "Method itself" number down to 9s or so. So then we'll be looking at a total of 71ns.
Doing better will involve getting rid of the two remaining layers of function calls for NonVoidStringToJsval and maybe speeding up the attr value get somehow...
Comment 13•12 years ago
|
||
Oh, and I tried hacking the stringbuffer refcounting to be non-atomic.... it doesn't seem to change things much(!). I still get a lot of time on the memory traffic after the increment or decrement.
Another thought on what we could do better here: an inline-ish version of JS_GetStringCharsZAndLength.
Comment 14•12 years ago
|
||
Though the "String retval" time above includes some time that's currently spent under getAttribute. So we might do better than 71, but I doubt we'd get below 60.
Comment 15•12 years ago
|
||
Comment 16•12 years ago
|
||
Comment 17•12 years ago
|
||
I don't think we end up using stringbuffer's refcounting. Seems like we just copy.
In this particular test we handle atom attr values, so we should just share that atom to
JS. Would need to create a new kind of JSExternalString.
Atom addref/release should be cheap.
Comment 18•12 years ago
|
||
I'm wrong here. We do end up addref stringbuffer.
Comment 19•12 years ago
|
||
We should remeasure here now that the WebIDL bindings have landed.
Comment 20•12 years ago
|
||
Modified the previous test.
Nightly 105-110, Chrome 90-95.
Comment 22•12 years ago
|
||
This is totally fixed on Mac.
Should still measure on ARM, I guess...
Updated•12 years ago
|
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•