Closed
Bug 118933
Opened 23 years ago
Closed 3 years ago
[meta] DOM Performance Issues
Categories
(Core :: DOM: Core & HTML, defect, P3)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
People
(Reporter: markushuebner, Unassigned)
References
()
Details
(Keywords: meta, perf, testcase, Whiteboard: [FIXED ON TRUNK])
Attachments
(8 files, 5 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
hjtoi-bugzilla
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jst
:
review+
jst
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
text/html
|
Details | |
(deleted),
patch
|
peterv
:
review+
vidur
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jesup
:
review+
jesup
:
superreview+
jesup
:
approval+
|
Details | Diff | Splinter Review |
10 runs on each testcase with build 2002010803 on win-xp
with a 1.1ghz machine.
mozilla ie6
getElementById() 985 280
getElementsByTagName() 1230 293
createElement() 1314 235
getAttribute() 465 148
setAttribute() 447 157
Reporter | ||
Updated•23 years ago
|
Reporter | ||
Updated•23 years ago
|
Comment 1•23 years ago
|
||
I looked at the source for getElementById and it works by doing a linear search
in the document the first time an id is asked for. After that it is cached in a
hashtable. I guess it could be made somewhat quicker by putting all id:s in the
hashtable from the beginning. The testcase looks up the same id all the time so
it shouldn't be that.
I think that the linear search could be made quicker though if that's whats
slowing us down. It's done by recursion now and rewriting it into a more
iterative method could save a lot of function calls.
It could also be the translation from javascript to C++ and back again thats
slow. I would really like to run Quantify on these tests but I don't know if I
have the time.
Comment 2•23 years ago
|
||
I have a reason for the slowness of Mozilla getElementById after running that
testcase in Quantify.
53% of the time is spent in nsScriptSecurityManager::CanAccess.
More than two thirds of that time is spent in
nsScriptSecurityManager::GetSecurityLevel with nsCodebasePrincipal::GetOrigin
being the worst offender. I would guess that security level can't change during
the course of a script run so shouldn't that info be cached somewhere? The rest
of the time in CanAccess is spent in nsXPIDLCString::PrepareForUseAsOutParam and
nsScriptSecurityManager::GetObjectPrincipal.
The rest of the time (when the sucurity part is removed) is spent as follows:
31% XPC_WN_Helper_NewResolve
19% XPCConvert::NativeData2JS
13% nsHTMLDocument::GetElementById (this is ~97 ms on my computer)
10% xptiInterfaceEntry::GetIID
6% XPCConvert::JSData2Native
21% Various stuff like memory managment and JS and XPC overhead
So I would first look at why the security is so damn slow, then figure out what
kind of animal XPC_WN_Helper_NewResolve is.
Also, can NativeData2JS be made faster? It seems to be dominated by
thread safety overhead. PR_EnterMonitor, PR_ExitMonitor and PR_Lock together is
~70% of the time. The rest being AddRef and js_AddRootRT.
Is there a bug for the slowness of nsScriptSecurityManager::CanAccess? Should I
open one?
Comment 3•23 years ago
|
||
mstoltz has some *really* nice goodies in his pocket that should make a world of
difference here. Don't know if there's a bug for those speed improvements though.
Comment 4•23 years ago
|
||
Thanks Daniel for the precious information! CC'ing dbradley for
XPCConvert::NativeData2JS, and XPC_WN_Helper_NewResolve
Comment 5•23 years ago
|
||
I've seen CanAccess show up in lots of Jprofs - never major, but often
noticable. Check bug 49141 (window open, CanAccess is 17/929, or about 2%) and
ones like it. I'll check my jprof's on Monday.
Bug 107746 (DHTML Sliding menu perf) shows 6% in CanAccess (12/207)
Comment 6•23 years ago
|
||
The biggest offender in XPConnect is the JS_AddNamedRootRT and
JS_RemoveNamedRootRT call in XPCWrappedNative::AddRef. Removing this was
discussed in bug 112152.
The XPC_WN_Helper_NewResolve looks like it boils down to nsHashTable/nsStringKey
performance. nsStringKey's conversion to unicode, the PL_HashTableLookup, and
the freeing of the key. This appears in both paths nsWindowSH::NewResolve and
nsWindowSH::GlobalResolve.
Comment 7•23 years ago
|
||
Don't use nsHashtable, it's malloc-happy.
Can we not expedite bug 112152? Is it an infrequent crash bug? Or if we don't
yet have thread hazards in the current Mozilla code involved in that bug, can we
not get rid of the rooting now in a single-threaded fashion?
/be
Comment 8•23 years ago
|
||
I have changes that change the DOM code to use PLDHash in stead of nsHashtable,
it still needs some cleaning up, but it's getting closer...
Priority: -- → P2
Target Milestone: --- → mozilla0.9.9
Comment 9•23 years ago
|
||
I might be out on thin ice with my pldhash useage here, so this needs some
pldhash familiar eyes. Brendan, would you have a look?
Comment 10•23 years ago
|
||
Oh, the 10+ number was from initial testing in Quantify with the
getElementById() testcase.
Comment 11•23 years ago
|
||
I've never quite seen PL_DHashGetStubOps used that way, but if it works, why not...
Please don't copy the performance disaster (yes, it shows up on profiles) that
is nsCRT::HashCode(const PRUnichar*) any further. Those HashCodes on nsAStrings
are much better written using character sinks, since writing them the way you
did gives you a lot of iterator overhead. I wrote them once, and I'll try to
dig up the patch.
Comment 12•23 years ago
|
||
Here's part of a patch I was working on for bug 109129, although I never
figured out how to solve the atom-related part of the problem.
Comment 13•23 years ago
|
||
Comment on attachment 65088 [details] [diff] [review]
10+ times speedup of nsScriptNameSpaceManager::LookupName()
GetKey, HashKey, and MatchEntry callbacks, but no ClearEntry?
That's wrong, it leaks -- it also explains why you thought you needed to
Enumerate with a Cleanup callback before Finishing.
>-static PRBool PR_CALLBACK
>-NameStructCleanupCallback(nsHashKey *aKey, void *aData, void* closure)
>+PR_STATIC_CALLBACK(PLDHashOperator)
>+NameStructCleanupCallback(PLDHashTable *table, PLDHashEntryHdr *hdr,
>+ PRUint32 number, void *arg)
> {
>- nsGlobalNameStruct *s = (nsGlobalNameStruct *)aData;
>+ GlobalNameMapEntry *entry = NS_STATIC_CAST(GlobalNameMapEntry *, hdr);
>
>- delete s;
>+ entry->mKey.~nsString();
>
>- return PR_TRUE;
>+ return PL_DHASH_NEXT;
> }
The above method should be morphed into a ClearEntry op, and then you won't
need the Enuerate below:
> nsScriptNameSpaceManager::~nsScriptNameSpaceManager()
> {
>- mGlobalNames.Reset(NameStructCleanupCallback);
>+ if (mIsInitialized) {
>+ // Clear all hash keys and values.
>+ PL_DHashTableEnumerate(&mGlobalNames, NameStructCleanupCallback, nsnull);
>+
>+ // Destroy the hash
>+ PL_DHashTableFinish(&mGlobalNames);
>+ }
>+}
See, Finish calls the clearEntry op for you.
BTW, I'd rather you initialize the ops static explicitly, but that's mostly
style (isn't the static set from a function return value more expensive?).
>+
>+nsGlobalNameStruct *
>+nsScriptNameSpaceManager::AddToHash(const nsAString& aKey)
>+{
>+ GlobalNameMapEntry *entry =
>+ NS_STATIC_CAST(GlobalNameMapEntry *,
>+ PL_DHashTableOperate(&mGlobalNames, &aKey, PL_DHASH_ADD));
>+
>+ if (!entry) {
>+ return nsnull;
>+ }
>+
>+ // The memory for the string key has already been allocated, but the
>+ // string constructor was never called. Use placement new to
>+ // initialize the key string.
>+ nsString *str = new (&entry->mKey) nsString(aKey);
>+
>+ return &entry->mGlobalName;
> }
Don't you need to set other members?
>
> nsresult
>@@ -111,25 +165,18 @@
> continue;
> }
>
>- // XXX Mac chokes _later_ if we put the |NS_Conv...| right in
>- // the |nsStringKey| ctor, so make a temp
>- NS_ConvertASCIItoUCS2 temp(categoryEntry);
>+ NS_ConvertASCIItoUCS2 keyStr(categoryEntry);
>
>- nsStringKey key(temp);
>+ nsGlobalNameStruct *s = FindInHash(keyStr);
>
>- if (!mGlobalNames.Get(&key)) {
>- nsGlobalNameStruct *s = new nsGlobalNameStruct;
>+ if (!s) {
>+ s = AddToHash(keyStr);
> NS_ENSURE_TRUE(s, NS_ERROR_OUT_OF_MEMORY);
>
> s->mType = aType;
> s->mCID = cid;
Oh, here are those member assignments -- but why hash twice when adding? You
should use PL_DHASH_ADD to do the lookup, and if the entry already exists,
you'll know by null- or zero-testing the members.
If you need initial values other than 0/null, use the optional initEntry hook
and make sure clearEntry stores the non-zero default values.
How about a new patch fixing the above and capturing dbaron's HashCode work?
/be
Attachment #65088 -
Flags: needs-work+
Comment 14•23 years ago
|
||
This should be much closer, and I know it's even faster than the previous
patch, thanks to dbaron's HashString() code.
Comment 15•23 years ago
|
||
Comment on attachment 65210 [details] [diff] [review]
Fixes all Brendan's issues and includes the new HashCode and HashString code.
>+class GlobalNameMapEntry : public PLDHashEntryHdr
>+{
>+public:
>+ nsString mKey; // must be first, to look like PLDHashEntryStub
The comment seems false now -- you don't depend on Stub ops that want a void
*key first, do you?
>+GlobalNameHashClearEntry(PLDHashTable *table, PLDHashEntryHdr *entry)
> {
>- nsGlobalNameStruct *s = (nsGlobalNameStruct *)aData;
>+ GlobalNameMapEntry *e = NS_STATIC_CAST(GlobalNameMapEntry *, entry);
>+
>+ // An entry is being cleared, let the key (nsString) do it's own
Possessive "its", not contracted it-is.
>+ if (s->mType == nsGlobalNameStruct::eTypeNotInitialized) {
> s->mType = aType;
> s->mCID = cid;
>
>- nsGlobalNameStruct *old =
>- (nsGlobalNameStruct *)mGlobalNames.Put(&key, s);
>-
>- NS_WARN_IF_FALSE(!old, "Redefining name in global namespace hash!");
>-
>- delete old;
>+ return NS_OK;
> } else {
return before else -- nuke the else and unbrace/exdent this:
> NS_WARNING("Global script name not overwritten!");
> }
Fix these nits and sr=brendan@mozilla.org.
/be
Attachment #65210 -
Flags: superreview+
Comment 16•23 years ago
|
||
> The comment seems false now -- you don't depend on Stub ops that want a void
> *key first, do you?
True, comment removed.
"it's" replaced with "its"
Re the else after return, the return was not supposed to be there, no idea how
that ended up there. The else remains, but the return is gone.
Thanks for the sr
Comment 17•23 years ago
|
||
Brendan, this is kindof a sidenote from this bug, but is there a good reason for
the entry that's passed to the initEntry hook being const? Seems kinda weird,
the whole idea is that you're supposed to muck with the entry in that hook, so
why the const?
Also, do we really want to name the new HashString(const nsA[C]String&) methods
HashString, I'd vote for naming them HashCode to match the nsCRT ones. Thoughts?
Comment 18•23 years ago
|
||
Comment on attachment 65210 [details] [diff] [review]
Fixes all Brendan's issues and includes the new HashCode and HashString code.
r=peterv with Whaa and the comment for LookupName corrected.
Attachment #65210 -
Flags: review+
Comment 19•23 years ago
|
||
> Also, do we really want to name the new HashString(const nsA[C]String&) methods
> HashString, I'd vote for naming them HashCode to match the nsCRT ones. Thoughts?
As a global (rather than a static method of nsCRT), I slightly prefer
HashString, since HashCode is a little general.
Comment 20•23 years ago
|
||
Dammit! How did that happen? The PLDHashInitEntry typedef's entry param should
*not* be a pointer-to-const, of course. Patch here coming up, I want this fixed
ASAP.
/be
Comment 21•23 years ago
|
||
pldhash.h is derived via js/src/plify_jsdhash.sed from jsdhash.h, which
accounts for the off-by-four underhanging indentation glitch in pldhash.h's
patch. Quick r= and sr=? jst, can you roll this into your patch and remove
your const_cast?
/be
Comment 22•23 years ago
|
||
This one's ready to go, if I can get an a= I'll check it in today.
Comment 23•23 years ago
|
||
I prefer HashString too, the noun-verb combo works better than a noun phrase
with an implied verb (which is what HashCode seems to be). I have a dumb
question about these names, though: are we comfortable with all our global
namespace claims? Pellucid names are great, but the problem is that other
subsystems with which we may have to integrate will choose them too. Are we
saved by inline somehow?
/be
Comment 24•23 years ago
|
||
We should be saved by C++'s capability for overloading, at least to some degree,
since the mangled name for HashString would have to have nsAString/nsACString in it.
Comment 26•23 years ago
|
||
For 0.9.8, I'd say it's probably better to hold off on the nsCRT.{h,cpp}
changes, since there's a chance someone could be depending on the current
behavior. I'm willing to file a separate bug to land those later. How risky
are the DOM changes?
Comment 27•23 years ago
|
||
dbaron, I'm cool with holding off on the nsCRT changes till 0.9.9, but the
namespace manager stuff is as safe as any other pldhash usage in the system.
/be
Comment 28•23 years ago
|
||
in http://bugzilla.mozilla.org/show_bug.cgi?id=118933#c7 brendan asked...
> Can we not expedite bug 112152? Is it an infrequent crash bug? Or if we don't
> yet have thread hazards in the current Mozilla code involved in that bug, can we
> not get rid of the rooting now in a single-threaded fashion?
As I read the post removing that rooting would account for *part* of %70 of %19
of 47%. Pushing forward with gc-time marking rather than rooting without fixing
the new problems that would uncover *might* cause new real crashes. The only
crashes we know of with the current scheme are when using thread stress tests.
But, yes I believe this can be done. I'm doubtful that the improvement will be
significant (yes, I know.. 'every little bit...').
Comment 29•23 years ago
|
||
nsCRT changes are now bug 120363.
Updated•23 years ago
|
Keywords: mozilla0.9.7+
Comment 30•23 years ago
|
||
The latest patch modulo the nsCRT changes have been checked in. Leaving bug open
for further optimizations.
Status: NEW → ASSIGNED
Comment 31•23 years ago
|
||
Markus, would you be able to run the tests again with a build from tomorrow or
newer on the same hardware you ran the initial tests with and report back how
much an imporvment we've made so far?
Comment 32•23 years ago
|
||
Another resolve function that's called a lot in the testcases is
nsHTMLDocument::ResolveName. Could the exact same solution, with pldhash, be
used there too? There doesn't seem to be very many references to mNameHashTable
(15 in nsHTMLDocument.cpp if emacs counted correctly).
Comment 33•23 years ago
|
||
Yup, I already have that in my tree, still needs some cleaning up though...
Reporter | ||
Comment 34•23 years ago
|
||
Running again (1.1ghz, winxp) the test with bulid 2002011703
I get the following results:
getElementById() 929
getElementsByTagName() 1107
createElement() 1227
getAttribute() 441
setAttribute() 421
nice work! :))
Comment 35•23 years ago
|
||
I did the test before/after on my slow 300MHz PIII NT4 SP6a
Here are the results (10 runs):
Test Before(20010116) After(20010117) Win(%) IE5.5
|---------------------|-------------------|-------------------|-------|----------|
|getElementById | 4131 ms | 4017 ms | 3% | 972 ms |
|---------------------|-------------------|-------------------|-------|----------|
|getElementsByTagName | 5031 ms | 4382 ms | 13% | 1093 ms |
|---------------------|-------------------|-------------------|-------|----------|
|createElement | 5101 ms | 5062 ms | 1% | 760 ms |
|---------------------|-------------------|-------------------|-------|----------|
|getAttribute | 1799 ms | 1800 ms | 0% | 496 ms |
|---------------------|-------------------|-------------------|-------|----------|
|setAttribute | 1795 ms | 1803 ms | 0% | 541 ms |
Biggest gain is for getElementsByTagName with very little gain for other tests.
Still light-years from IE5.5.
Comment 36•23 years ago
|
||
Thanks for the testing, there's definately more to be done, the patches in this
bug so far only fixed a small part of the performance problems. There's more on
its way.
Comment 37•23 years ago
|
||
I'm getting hundreds of assertions like below when I start mozilla fetched &
built from this morning. s->mType is eTypeNotInitialized so it complains. Anyone
know what is causing them?
###!!! ASSERTION: Whaaa, JS environment name clash!: '!(s && s->mType != nsGloba
lNameStruct::eTypeInterface)', file d:\m\source_clean\mozilla\dom\src\build\nsSc
riptNameSpaceManager.cpp, line 414
Comment 38•23 years ago
|
||
Yup, I have a fix...
Comment 39•23 years ago
|
||
This is already checked in btw, feel free to do a post-checkin review if you
feel it's necessary.
Updated•23 years ago
|
Whiteboard: [driver:brendan]
Comment 40•23 years ago
|
||
Aren't all the 0.9.8 changes in now? If so, let's remove this from 115520's
dependency list and let the TM of 0.9.9 mean that it'll get pursued then.
/be
Comment 42•23 years ago
|
||
This patch makes the documents (indirect) JS resolve hook and the
implementation of document.getElementById() about 4 times faster (when combined
with the fix for bug 120718) by eliminating nsStringKey and nsHashtable in
favor of pldhash. Also combinging the id and name hashes in the document into
one hash. This lets us do fewer and faster hash lookups. Over all this doesn't
improve the times in the testcases in this bug by all that much since most of
the time is spent outside this code (~70% in the security manager, see bug
119646).
Attachment #65088 -
Attachment is obsolete: true
Attachment #65095 -
Attachment is obsolete: true
Attachment #65210 -
Attachment is obsolete: true
Attachment #65269 -
Attachment is obsolete: true
Comment 43•23 years ago
|
||
Oh, this latest patch should also speed up pageload times a little bit since we
do get some traffic in these hashtables while loading HTML pages.
Keywords: mozilla0.9.7+
Whiteboard: [driver:brendan] → [HAVE FIX]
Comment 44•23 years ago
|
||
Brendan, dbaron: Would nsHTMLDocument::InvalidateHashTables() (in the patch) be
faster if I'd just do a PL_DHashTableFinish() + PL_DHashTableInit() in stead of
enumerating the hash to remove every entry, or would that cause more heap traffic?
Comment 45•23 years ago
|
||
Comment on attachment 65745 [details] [diff] [review]
Above patch as diff -w
>Index: content/html/document/src/nsHTMLDocument.cpp
>===================================================================
>+ // An entry is being cleared, let the key (nsString) do its own
>+ // cleanup and release the content list if there is one.
>+ e->mKey.~nsString();
>+ // Inititlize the key in the entry with placement new
>+ nsString *str = new (&e->mKey) nsString(*keyStr);
I'll have to talk to you about these (and I guess refresh my memory on some
C++ as well...)
>+ mIdAndNameHashIsLive = PL_DHashTableInit(&mIdAndNameHashTable,
>+ &hash_table_ops, nsnull,
>+ sizeof(IdAndNameMapEntry), 16);
I think you might be better off allocating a larger value initially. I checked
some 20 pages in the page load test (like Netscape, Microsoft, Google,
mapQuest etc. pages) and on average they had 20-30 name attributes. Google
had only 4, while the LXR test page has one per source line. A value of 32
might be
better.
>+ // We're looked for this id before and we didn't find it, so it
Typo, "We've".
>+ entry->mIdContent = NS_STATIC_CAST(nsIContent *, ID_NOT_IN_DOCUMENT);
Don't need this cast.
Comment on attachment 65745 [details] [diff] [review]
Above patch as diff -w
r=heikki
After discussing this again with jst things look good as they are. The hash
isn't allocated until it is actually used, so even on LXR pages you may end up
using less than the 16 entries.
Attachment #65745 -
Flags: review+
Comment 48•23 years ago
|
||
Comment 49•23 years ago
|
||
Comment on attachment 66419 [details] [diff] [review]
Final patch for this round
jband sez sr=jband.
Attachment #66419 -
Flags: superreview+
Attachment #66419 -
Flags: review+
Comment 50•23 years ago
|
||
Latest patch (attachment 66419 [details] [diff] [review]) checked in. Leaving bug open in case we find
more to optimize.
Would someone run some more tests and report back the difference from when we
started optimizing this?
Comment 51•23 years ago
|
||
Result of 10 runs on a P3 450MHz / Windows NT4. Tested with Mozilla 0.9.7 and
the latest nighly build.
----------------------------------------------------------------------------------
| IE6 Gecko/2001122106 Gecko/2002012403 % gained
----------------------------------------------------------------------------------
getElementById | 865 3374 3077 8,80
getElementByTagName | 869 3334 3313 0,62
createElement | 687 4075 3793 6,92
getAttribute | 442 1407 1390 1,20
setAttribute | 461 1387 1348 2,81
----------------------------------------------------------------------------------
Comment 52•23 years ago
|
||
Emmanuel, you will need a build from the 25th likely to accurately test this.
The build you were testing was built about 18 hours _before_ this patch went in.
Comment 53•23 years ago
|
||
Result of 10 runs on a P3 450MHz / Windows NT4, with the right build this time ;)
| MSIE 6.0 Opera 6.0 Mozilla 0.9.7 2002012403
2001022503 %-m0.9.7 %-nightly
----------------------------------------------------------------------------------------------------------
getElementById | 865 1166 3374 3077 2983
11,59% 3,05%
getElementByTagName | 869 1168 3334 3313 3030
9,12% 8,54%
createElement | 687 1162 4075 3793 3742
8,17% 1,34%
getAttribute | 442 N/A 1407 1390 1342
4,62% 3,45%
setAttribute | 461 N/A 1387 1348 1295
6,63% 3,93%
Comment 54•23 years ago
|
||
Reformating the table for clarity, sorry for the spam.
| IE6 Moz 0.9.7 2002012403 2002012503 %-m0.9.7 %-nightly
--------------------------------------------------------------------------------
getElementById | 865 3374 3077 2983 11,59% 3,05%
getElementByTagName| 869 3334 3313 3030 9,12% 8,54%
createElement | 687 4075 3793 3742 8,17% 1,34%
getAttribute | 442 1407 1390 1342 4,62% 3,45%
setAttribute | 461 1387 1348 1295 6,63% 3,93%
Comment 55•23 years ago
|
||
Hereafter are my timings still with PII 300 MHz NT4 SP6
I added a column with the gain ouside security manager. for that, I
took the assumption that SM takes 70% of the time, as per jst's comment #42.
I took 70% of the original time and substract this time from all
measurements. FWIW.
+---------------------+--------+----------+----------+-------+-------------+
|Test | before | w/ 65273 | w/ 66419 | Gain% |Gain SM excl.|
|---------------------|--------|----------|----------|-------|-------------|
|getElementByID |4131 ms | 4017 ms | 3715 ms | 10% | 50% |
|---------------------|--------|----------|----------|-------|-------------|
|getElementsByTagName |5031 ms | 4382 ms | 4016 ms | 28% | 305% |
|---------------------|--------|----------|----------|-------|-------------|
|CreateElement |5101 ms | 5062 ms | 4956 ms | 3% | --- |
|---------------------|--------|----------|----------|-------|-------------|
|getAttribute |1799 ms | 1800 ms | 1734 ms | 4% | --- |
|---------------------|--------|----------|----------|-------|-------------|
|setAttribute |1795 ms | 1803 ms | 1686 ms | 6% | --- |
+---------------------+--------+----------+----------+-------+-------------+
Comment 56•23 years ago
|
||
Read 66% instead of 305%. Sorry for the spam.
Comment 57•23 years ago
|
||
Mstoltz's security manager changes should cut those numbers nearly in half.
Should land soon...
Reporter | ||
Comment 58•23 years ago
|
||
Running the tests again (1.1ghz, winxp)
I get the following results:
-------------------------------------------------------------------
| 2002010803 2002011703 2002012503
-------------------------------------------------------------------
getElementById | 985 929 893
getElementByTagName | 1230 1107 1155
createElement | 1314 1227 1218
getAttribute | 465 441 434
setAttribute | 447 421 411
-------------------------------------------------------------------
Comment 59•23 years ago
|
||
Results on 566MHz Celeron with Linux:
Mozilla 0.9.8 (original binary talkback build from mozilla.org)
and
Mozilla 0.9.8 with the patch from bug 119646 compiled with gcc-2.96-98 and -O2
and -march=i686
Test Original m0.9.8 m0.9.8 with patch
getElementById | 5541 3526
getElementByTagName | 5410 5797
createElement | 7205 5526
getAttribute | 2077 2657
setAttribute | 2110 2672
This is really strange :-(
There is really big speedup on getElementById and createElement but small
slowdown on the rest of the tests.
Comment 60•23 years ago
|
||
I have a question about the patch. Why are the id strings stored in Unicode when
ID & NAME are alphanumeric (+ a few other symbols)?
Comment 61•23 years ago
|
||
Becsause we don't want to be converting back n' fourth every time we hit the
hashes, which we do a *lot*, and also because there are non-ascii characters in
names on existing webpages already. We'be had bugs on non-ascii names not
working already...
Comment 62•23 years ago
|
||
Can someone double check that setAttribute() n' friends really did slow down?
Comment 63•23 years ago
|
||
Since my last measurements (comment #55 2002-01-25),
I'm seeing a small slowdown with 2002020703.
getElementByID 3715 ---> 3928
createElement 4956 ---> 5124
getAttribute 1734 ---> 1790
setAttribute 1686 ---> 1702
More strange is the behavior of getElementsByTagName.
When running the test with 10 runs, I see a great variation
between the runs (from 4s to 8s). Moreover, I see a "great memory
activity with NT task manager. Something have definitively changed here.
I noted also that odd runs are quite fast (4s) while even runs take
much more time (up to 8s). A hash table lookup should not have such
variations.
getElementsByTagName 4016 ---> 5628 (not stable)
I'm rather skeptical on Tom's measurements in comment #59.
With a 500 MHz Pentium, he is slower than me with my 300 MHz.
Maybe some background task running on his PC.
Comment 64•23 years ago
|
||
The getElementsByTagName() tests might very well trigger GC's while the tests
are running (since we create new objects for every iteration through the loop),
that could easily cause large variations in the test results.
Comment 65•23 years ago
|
||
Here are my results :
(10 runs on a P3 450MHz / Windows NT4 )
| m0.9.7 2002012403 20020125 m0.9.8 2002020703
------------------------------------------------------------------------
getElementById | 3374 3077 2983 3163 2989
getElementByTagName | 3334 3313 3030 3229 3053
createElement | 4075 3793 3742 3824 3774
getAttribute | 1407 1390 1342 1453 1357
setAttribute | 1387 1348 1295 1426 1336
m0.9.8 is slower but that's not a big surprise since it branched 2 days before
jst's patch. Latest build doesn't show any significant change over the post
patch build (20020125)
Comment 66•23 years ago
|
||
OK in my last comment I've compared two uncomparable builds (different compiler
and settings). I've made a build with the gcc-2.96-98 compiler and WITHOUT the
mstoltz patch from bug 119646 and compared the results with the same compiled
build with the patch and there is bigger difference.
Which leads me that the speed of the testcase is heavily depending on build
environment (compiler, options...).
So here are the comparable results with and without the mstoltz patch:
Test Orig. m0.9.8 Our m0.9.8 W patch Our m0.9.8 W/O patch
getElementById | 5541 3526 6166
getElementByTagName | 5410 5797 5680
createElement | 7205 5526 7996
getAttribute | 2077 2657 2482
setAttribute | 2110 2672 2455
So there definitely is some minor performance degradation on set/getAttribute.
The getElemntByTagName case may be caused by the GC. The getElementById and
createElement are highly improved.
Comment 67•23 years ago
|
||
I don't know what happened, but bug 87808 returned somewhere between 02/06 and
02/09 Might be related to this.
Depends on: 87808
Reporter | ||
Comment 68•23 years ago
|
||
There seems to be a problem concerning the testcases at
http://www.formula-one.nu/mozilla/domtestcases/
Using build 2002020908 on win-xp pressing "run" doesn't seem to trigger any
action - a new bug ?!
Comment 69•23 years ago
|
||
Markus, you're right.
I'm unable to run the tests (at least getElementById: I didn't try
the others) with 2002020908 NT4.
Seems to perform the first run , display the result (briefly) and
re-init.
Did you file a bug or mentionning here is enough ?
Reporter | ||
Comment 70•23 years ago
|
||
Filed a new bug on the regression seen in the testcases - bug 124841.
Comment 71•23 years ago
|
||
That regression was due to bug 124749.
Comment 72•23 years ago
|
||
Here are the results with build from 2002021303 which includes
Mstoltz's security manager changes.
+---------------------+--------+----------+------------+-------+----------+
|Test | before | w/ 66419 | w/ new SM | Gain% | IE5.5 |
|---------------------|--------|----------|------------|-------|----------|
|getElementByID |4131 ms | 3715 ms | 2176 ms | 47% | 972 ms |
|---------------------|--------|----------|------------|-------|----------|
|getElementsByTagName |5031 ms | 4016 ms | 4309 ms | 14% | 1093 ms |
|---------------------|--------|----------|------------|-------|----------|
|CreateElement |5101 ms | 4956 ms | 3197 ms | 37% | 760 ms |
|---------------------|--------|----------|------------|-------|----------|
|getAttribute |1799 ms | 1734 ms | 1751 ms | -- | 496 ms |
|---------------------|--------|----------|------------|-------|----------|
|setAttribute |1795 ms | 1686 ms | 1688 ms | 6% | 541 ms |
+---------------------+--------+----------+------------+-------+----------+
Any other ideas to narrow even more the gap with IE.
Comment 73•23 years ago
|
||
There was a mail in n.p.m.performance, "Fast RAM semaphores" by Michael Kaply.
He pointed to this bug, and how with the patch Moz OS/2 beats Windows in
javascript speed tests. He points to this bug...
http://bugzilla.mozilla.org/show_bug.cgi?id=125123
Comment 74•23 years ago
|
||
I read that too. Any volunteer to port this code
on the other OS.
Comment 75•23 years ago
|
||
Well...
on linux _MD_NEW_LOCK is implemented using a macro:
#define _MD_NEW_LOCK(lock) PR_SUCCESS
I dont know many implementations that will do that faster...
Comment 76•23 years ago
|
||
I just made profiles of those tests with a cvs build from this morning (has
mstoltz's fix). You can see them at
http://boris.zbarsky.org:8000/~bzbarsky/dom/perf/
Four of them nothing is leaping at me.... getElementsByTagName we're spending
lots of time in nsVoidArray::IndexOf. Here's what happens there:
Nodelists are live. This means they add themselves to the document observer
list when they get created and remove themselves when they get destroyed. Now
this page creates thousands of nodelists in a tight loop. I put a debug printf
in nsVoidArray::IndexOf and it looks like the length of the list starts out at a
fairly low number, grows up to 10008 (10000 nodelists and 8 legitimate
observers) and then shrinks to a low number.
The interesting thing is that if I tell the page to run the test twice I get
_two_ spikes of about 10000 observers. So it looks like the observers are being
deleted one by one when the function the nodelists were created in returns....
Is this business of creating lots of nodelists in a tight loop actually common
out there? As things stand the addition and removal of observers takes time
that is O(N^2) in the number of nodelists in the testcase. Do observers need to
be notified in a consistent order? That is, could we store them in a hashtable
instead of an array?
Comment 77•23 years ago
|
||
Creating nodeList's in a tight loop is not common as far as I know -- all good
js coders know they have to cache stuff before entering a tight loop.
Reporter | ||
Comment 79•23 years ago
|
||
Another testcase is at
http://www.formula-one.nu/mozilla/domtestcase2/
Seems we are doing very badly on them.
test1 test2
---------------------------------
msie6 3831 4738
2002021303 30206 28952
The problem on test2 (no animation displayed) seems to be due to bug 117061
Comment 80•23 years ago
|
||
On my 300MHz PII, animation is not visible in both testcases.
Since the DIVs are overlapping, I think bug #115247 may do something here, but
I'm pretty sure there are other causes too.
Comment 81•23 years ago
|
||
Moving Netscape owned 0.9.9 and 1.0 bugs that don't have an nsbeta1, nsbeta1+,
topembed, topembed+, Mozilla0.9.9+ or Mozilla1.0+ keyword. Please send any
questions or feedback about this to adt@netscape.com. You can search for
"Moving bugs not scheduled for a project" to quickly delete this bugmail.
Target Milestone: mozilla1.0 → mozilla1.2
Keywords: mozilla1.0,
nsbeta1
Target Milestone: mozilla1.2 → mozilla1.0
Comment 82•23 years ago
|
||
90% is nsBlockFrame::Reflow (surprise)
33% f+d in nsContainerFrame::SyncFrameViewAfterReflow, almost 1/2 of that in
nsCSSRendering::FindBackground(). Almost all (12%) of FindBackground's time is
spent in QI(!)
12% in nsContainerFrame::PositionFrameView() (1/2 from
SyncFrameViewAfterReflow)
10% in ComputeLineHeight, most that from nsFontCache::GetMetricsFor, and much
of that from nsFont::Equals()
14% in nsHTMLReflowState::nsHTMLReflowState, most of that (12%) in
nsHTMLReflowState::InitConstraints()
12% in nsContainerFrame::PositionFrameView(), split between
nsViewManager::MoveViewTo, nsFrame::GetView, and nsFrame::GetOffsetFromView
12% in nsHTMLDivElement::QueryInterface()(!)
5% direct hits in nsViewManager::IsViewInserted()
5% direct hits in the combination nsStyleContext::GetStyleData() and its callee
nsRuleNode::GetStyleData()
5% in nsBindingManager::GetBindingImplementation()
Comment 83•23 years ago
|
||
Bug 129079 has a fix for the FindBackground issue; cuts FindBackground from 15%
to <5%, about a 10% speedup on the overall testcase. Thanks BZ.
Comment 84•23 years ago
|
||
FYI, checked the Images version of the testcase. jprof is almost identical to
the text version; i.e. almost all the time is spent doing reflow.
Comment 85•23 years ago
|
||
I believe we've done what we can here for mozilla1.0. The reflow coalescing work
that's going on in bug 129115 will help with DHTML perf across the board, but
other than that there's no time to work on this now.
Comment 86•23 years ago
|
||
Um, that was a comment from me, not from Nisheeth.
Comment 87•23 years ago
|
||
This patch makes mozilla pre-define the document and window properties on the
global object, this means that accessing those properties will be handled
directly by the JS engine in stead of going through XPConnect. This might cause
the security checks for window and document to not work as expected though,
need to think about that...
Comment 88•23 years ago
|
||
Make doCheckPropertyAccess() not do a security check for document and window.
Attachment #81102 -
Attachment is obsolete: true
Comment 89•23 years ago
|
||
Comment on attachment 81103 [details] [diff] [review]
50% speedup of the getElementById() testcase, security problem fixed...
>+ } else if (mJSObject) {
>+ JSContext *cx = (JSContext *)mContext->GetNativeContext();
>+
>+ nsWindowSH::CacheDocumentProperty(cx, mJSObject, this);
>+ }
I prefer
nsWindowSH::CacheDocumentProperty((JSContext *)mContext->GetNativeContext(),
mJSObject, this);
r=peterv.
Attachment #81103 -
Flags: review+
Updated•23 years ago
|
Attachment #81103 -
Flags: superreview+
Comment 90•23 years ago
|
||
Comment on attachment 81103 [details] [diff] [review]
50% speedup of the getElementById() testcase, security problem fixed...
sr=vidur. Comment the about:blank case in nsGlobalWindow to explain why it's
necessary.
Comment 91•23 years ago
|
||
attachment 81103 [details] [diff] [review] checked in on the trunk.
Comment 92•23 years ago
|
||
Wooooh! another one like that and we'll catch IE !!!!!
+---------------------+--------+----------+-------+----------+
|Test | before | w/ 81103 | Gain% | IE5.5 |
|---------------------|--------|----------|-------|----------|
|getElementByID |2176 ms | 1169 ms | 46% | 972 ms |
|---------------------|--------|----------|-------|----------|
|getElementsByTagName |4309 ms | 4119 ms | --- | 1093 ms |
|---------------------|--------|----------|-------|----------|
|CreateElement |3197 ms | 2193 ms | 31% | 760 ms |
|---------------------|--------|----------|-------|----------|
|getAttribute |1751 ms | 1472 ms | 16% | 496 ms |
|---------------------|--------|----------|-------|----------|
|setAttribute |1688 ms | 1425 ms | 16% | 541 ms |
+---------------------+--------+----------+-------+----------+
Comment 93•23 years ago
|
||
Turns out that the security related change in this patch wasn't needed, I backed
that part out of the trunk. Patch for the branch coming up.
Comment 94•23 years ago
|
||
Comment 95•23 years ago
|
||
Comment on attachment 82101 [details] [diff] [review]
Same as above but for the 1.0 branch and w/o the security related change
Pulling forward existing r/sr from trunk
a=rjesup@wgate.com for branch
Attachment #82101 -
Flags: superreview+
Attachment #82101 -
Flags: review+
Attachment #82101 -
Flags: approval+
Comment 96•23 years ago
|
||
let's wait and get this one in for rtm. adt1.0.0- [adt2 RTM]
Reporter | ||
Comment 97•23 years ago
|
||
Wolfgang Schwarz has made a new great test-suite at
http://www.umsu.de/jsperf/index2.php ! :)
Reporter | ||
Comment 98•23 years ago
|
||
Won't this be checked into the 1.0 branch ?!
Comment 99•23 years ago
|
||
I was wondering about that.... the patch has driver approval and adt1.0.0-. Does
that not mean that a non-Netscape-employee could check it into the branch?
Comment 100•23 years ago
|
||
The ADT will re-evaluate this fix after the MachV Beta is released, until then
*I* am unable to check this in on the branch.
Updated•23 years ago
|
Updated•23 years ago
|
Whiteboard: [HAVE FIX] [adt2 RTM] → [FIXED ON TRUNK] [adt2 RTM]
Updated•23 years ago
|
Comment 101•23 years ago
|
||
adding adt1.0.0+. Please checkin to the 1.0 branch.
Comment 102•23 years ago
|
||
Fixed on branch too, leaving bug open though since there's more to do here...
Keywords: fixed1.0.0
Clearing keywords to get this off my query radars. I would advice closing this
and opening new bugs for specific issues as this has become so big.
Comment 104•23 years ago
|
||
I ran the tests on 10 runs on each testcase with following 3 browsers.
Branch: 2002-05-22-18-1.0.0
Trunk: 2002-05-23-08-trunk.
Other: IE 5.5
& here are results.
TEST Trunk : Branch : Other (IE)
getElementById() 1165ms : 1435ms : 1180ms
getElementsByTagName() 1222ms : 4127ms : 1438ms
createElement() 2137ms : 2410ms : 869ms
getAttribute() 1435ms : 1747ms : 550ms
setAttribute() 1391ms : 1665ms : 658ms
Looking at these results branch takes more time than trunk & other browser also.
Removing fixed1.0.0 keyword. Please correct me If I'm wrong.
Keywords: fixed1.0.0
Comment 105•23 years ago
|
||
This was "fixed" on the branch, but there's a lot of other improvments on the
trunk, so we can't really compare the two. In any case, the last patch in this
bug was checked in on the branch, so I'm adding fixed1.0.0 and verified1.0.0
keywords to this bug.
Keywords: fixed1.0.0,
verified1.0.0
Comment 106•22 years ago
|
||
If the changes on the trunk prove out well during 1.1alpha, someone should
gather a branch patch and nominate it to drivers for inclusion on the branch, in
1.0.1 (after 1.0 ships).
/be
Updated•22 years ago
|
Target Milestone: mozilla1.0 → Future
Reporter | ||
Comment 107•22 years ago
|
||
just an update on tests:
trunk build 2002082208 on win-xp pro,1.1ghz,512ram
10 runs for each ...
TEST Trunk : MSIE6.0
getElementById() 287ms : 270ms
getElementsByTagName() 305ms : 302ms
createElement() 792ms : 231ms
getAttribute() 378ms : 145ms
setAttribute() 340ms : 152ms
huge difference on createElement()
Reporter | ||
Comment 108•22 years ago
|
||
btw: will this get now on the branch - regarding brendan's comment #106 ?
Comment 109•22 years ago
|
||
If what /be suggests is not already on the 1.0 branch for 1.0.1, then it will
not make it. But, we should nomintae it for landing in the 1.0.2 milestone.
Blocks: 143047
Comment 110•22 years ago
|
||
So which bugs are the other issues referred to that we should consider for
1.0.2? Please mention them here, and also mark them with the keyword
"mozilla1.0.2" to nominate them.
Comment 111•22 years ago
|
||
I took a new look at createElement testcase and I can easily say what's slow,
but I don't know how fixable it is.
Of the time taken by createElement:
43% is converting from C++ to JS
22% is doing the createElement()
7% is security related
7% is converting from JS to C++
6% is getting some property (js_GetProperty called from js_Interpret)
6% is javascript garbage collecting
5% is 2 calls to some hook (jsd_FunctionCallHook) per createElement()
the rest is minor overhead everywhere
So there are two major offenders and a couple of small ones. Let's take them one
by one.
Converting from C++ to JS (43%):
=================================
This is done by XPCConvert::NativeData2JS / XPCConvert::NativeInterface2JSObject
which I've seen in profiles before. I don't know the workings of this but it is
slow. It's almost certainly this that makes setAttribute and getAttribute slow
as well, even though I should check that.
Anyway, it breaks as this:
(90% is XPCWrappedNative::GetNewOrUsed which contains the functions below)
40% is XPCWrappedNative::Init which only does a JS_NewObject
11% is XPCWrappedNative::FindTearOff which is
60% QueryInterface (explicit and through nsCOMPtr::assign_from_helper)
36% XPCWrappedNative::ExtendSet (hashtable and locks)
11% is calls to new
9% is locks (the same lock used several times per call)
7% is QueryInterface
7% is XPCNativeInterface::GetNewOrUsed
90% of this is locking, 8% is looking in an hashtable
so a strategy here seems to be to try to QueryInterface less and use locks less.
If that's possible, and I expect it to be, I guess we can save 10-20% at least.
I'm reluctant to touch that code though. It seems crash prone (magical commented
wrappers to avoid crashes).
The createElement call (22%):
==============================
40% here ends in NS_NewHTMLDivElement (i.e. only 10% of the time of the DOM call
is actually _creating_ the DIV element)
24% is getting some service in NS_CreateHTMLElement
11% is in nsNodeInfoManager::GetNodeInfo which is an hashtable lookup
8% is Querying an Interface on the created nsHTMLDivElement
7% is converting some string ("div"?) to an ID in nsParserService
7% is copying a string into an nsAutoString in nsHTMLDocument::CreateElement
I don't know if there's very much to do here. We could maybe avoid getting the
service for every call, and maybe we could avoid doing the QueryInterface. That
could, if possible, save a couple of percent of the total time.
Security (7%):
==============
I imagine security (CanAccess) is quite optimized now so I don't think it's easy
to find time to win here.
Converting from JS to C++ (7%):
================================
Nothing to win here either. All the time is spent allocating a wrapper for the
JS string and we can't easily avoid that.
Accessing some property (6%):
===============================
This is the js_GetProperty called from js_Interpret. I don't know what property
this is. |document| maybe?
40% is creating a XPCCallContext object
30% is calling nsHTMLDocumentSH::NewResolve
12% is looking something up in an hashtable in js_LookupProperty
8% is destroying the XPCCallContext object
4% is XPCNativeSet::FindMember
Maybe we can keep the XPCCallContext object around somewhere? It seems expensive
to create and destroy.
Javascript garbage collecting (6%)
==================================
I don't even knows if this happens every time.
Hooks (jsd_FunctionCallHook) (5%):
====================================
For every createElement there are 2 calls to jsd_FunctionCallHook in js_Invoke.
98% Locking (jsd_Lock, jsd_Unlock)
2% callHook overhead
0% JS_GetFrameScript and JS_IsConstructorFrame
So here we must either avoid the call altogether or skip the locking. I've no
idea what this is.
Comment 112•22 years ago
|
||
> 24% is getting some service in NS_CreateHTMLElement
The first three lines of NS_CreateHTMLElement are:
845 // Cache this service! The XML content sink uses this method!
846
847 nsCOMPtr<nsIParserService> parserService =
848 do_GetService(kParserServiceCID, &rv);
This sounds like "low-hanging-fruit which should help perf outside
createElement" to me...
Comment 113•22 years ago
|
||
Oh, and one more thing.
> 7% is converting some string ("div"?) to an ID in nsParserService
This code looks like it could be made faster too... right now it takes an atom,
gets the string out, hashes it, and does a hashtable lookup. If we can fix
nsHTMLTags::LookupTag to also use atoms, we could hash the atoms instead of raw
strings (which will make the hash key calculation faster, if nothing else; just
use the pointer as the key).
Comment 114•22 years ago
|
||
Daniel, you must have the js debugger (venkman) open for jsd_FunctionCallHook to
show up at all, I think. Or at least, venkman must be installed and must have
been opened? Cc'ing rginda. Normal users will never see cycles in jsd_*, so I
think you should reprofile with venkman off to get rid of this noise.
dbradley is cc'd -- I wonder whether we could eliminate most locking costs using
techniques used by the JS engine (see bug 54743). Also, if this createElement
benchmark is wasting time looking for a cached wrapper or tear-off or whatever
when it just needs to allocate a new one constantly, maybe the benchmark doesn't
really represent real-world workloads. Or if it does represent some interesting
workload to optimize, perhaps xpconnect can tell when it needs to stop reusing
and switch to new allocation?
/be
Comment 115•22 years ago
|
||
I guess building the js debugger is enough to have it "installed"? In that case
I have it installed and I might have used it in some profile of mine, but it
hasn't been used in the profile I used for the tests. Maybe some lock is
positioned outside the debugger-in-use-check.
If the test case is relevant? That's always a good question. In this case it
shows that JS to C++-conversion is a bottle neck and that's interesting. It's
interesting because it affects much more than createElement since (I assume)
lots of code in DHTML and in mozilla pass over that bridge, though I'm not
familiar enough with that code to really understand the meaning of everything it
does.
Anyway, two of the mentioned possibilities for improvement are, including
patches, in bug 165890 and 165893 and those two improved createElement
performace for me from ~1000 ms to ~900 ms. I was looking at reducing the number
of QueryInterface:s too, but each one seem to be needed and I don't see a way to
make them faster.
Comment 116•22 years ago
|
||
>If the test case is relevant? That's always a good question. In this case it
>shows that JS to C++-conversion is a bottle neck and that's interesting. It's
>interesting because it affects much more than createElement since (I assume)
>lots of code in DHTML and in mozilla pass over that bridge,
Do they? My point was that many (most?) DHTML pages do not create elements by
the thousands, or constantly, or frequently. The bridge crossed when creating a
new element is nsXPCWrappedNative::GetNewOrUsed, but once you've crossed that
bridge, to operate on a referenced instance is cheaper, and crosses a different
code bridge. Right?
/be
Comment 117•22 years ago
|
||
You mean that when an object has been converted once, then it's much cheaper to
convert (cached)? In that case, maybe the createElement test case doesn't tell
us very much as long as it isn't ridicously slow. getElementById would tell us more.
Comment 118•22 years ago
|
||
If you've got the debugger installed it will add some hooks to the js engine by
default. To uninstall these hooks, type "/startup-init off" in the Interactive
Session view. You need write access to the component registry, as this removes
a component from the startup observer category. The setting will be saved until
you do a "/startup-init on", or until something causes the js component
venkman-service.js to be re-registered.
Comment 119•22 years ago
|
||
Some thoughts on the xpconnect related suggestions...
- As noted by others we should be sure we are focusing on a real world benchmark
before going off and optimizing.
- On the issue raised regarding XPCCallContexts... They are stack only objects.
The cost is not in creation and destruction but rather is in initialization and
completion. Various bits of call state needs to be calculated before the call
and various bits of work need to done after the call. These just happen to be
done in the ctor and dtor. Recycling the actual objects would do no good.
- On locking... Brendan suggested a lock avoidance scheme like that used in the
js engine. XPConnect already has a lock avoidance scheme that saves *lots* of
DOM wrapper locking in the normal access paths. Rather than use the (relatively)
complicated scheme as used in the js engine to track occurance of access from
secondary threads, the xpconnect scheme just leverages the MAIN_THREAD_ONLY
class info flag to avoid locking overhead when dealing with wrapper specific
data. That is, many of the uses of autolocks in xpconnect use a null lock and
are almost NOPs. However, xpconnect also factors out interface info specific
data and per scope tables of wrappers. These are shared across the entire
runtime and require synchronized access. Most of this data is only diddled at
wrapper creation time - though there is one lock/unlock pair in the normal path
of looking up an existing wrapper.
These runtime-wide subsystems generally require synchronized access. Someone
*could* factor out parallel subsystems that would be used only on the main
thread; e.g. factoring out the XPCNativeInterface/XPCNativeSet stuff so that one
instance of the graph is for use on main thread only wrappers and the other is
used for everything else. Same story on wrapper sets. This would add complexity.
And, for some paths, it might mean an unlocked lookup followed by a locked
lookup before the right wrapper was found. But it might reduce DOM wrapper
locking significantly.
There is the question of whether or not the long term vision for the DOM
requires main thread only use. There has been some discussion of running DOMs on
multiple threads; i.e. perhaps each DOM would be single threaded, but different
DOMs might run on different threads. I don't know if this is likely to happen.
But if it did then the refactoring I suggest above could blow up into a lot of
duplicated data if we tried to give each DOM thread its own set of stuff.
Anyway, I'm not so sure that the locking overhead through realistic code paths
is significant. If it is then someone could look into some refactoring of shared
data in order to cut out some of the locks. I don't know who would be doing
that. The effort required to get up to speed on xpconnect internals in not trivial.
Comment 120•22 years ago
|
||
Naive question, but performance be improved by writing some custom marshalers
rather than going through XPConnect each time?
Reporter | ||
Updated•22 years ago
|
Keywords: mozilla1.0.2
Comment 121•22 years ago
|
||
www.formula-one.nu seems to be gone. Did anyone make a copy of the tests in
comment 68 or does anyone know their new location?
Reporter | ||
Comment 122•22 years ago
|
||
Updated URL of testcases ... I have a backup of all testcases. Just drop me a
line if you need anything more.
Comment 123•22 years ago
|
||
on 20021210 we're still far far away from IE or even Opera 7!!!
| IE6 2002121008 Opera 7
---------------------------------------------
getElementById | 326 413 267
getElementByTagName| 333 528 283
createElement | 271 1266 568
getAttribute | 131 649 656
setAttribute | 152 587 726
I strongly belive that this should be 1.3beta blocker. It's low level part of
almost all DHTML scripts, and speeding this one up will result in speed up in
many other bugs.
Flags: blocking1.3b?
Comment 124•22 years ago
|
||
Talking about 1.3beta blockers in a metabug is silly. First, get a patch or
five going that solve the underlying problem(s). Then we may well be in
1.4alpha. Any big and/or scary patches belong in an alpha, not in a beta,
milestone.
/be
Updated•22 years ago
|
Flags: blocking1.3b? → blocking1.3b-
Reporter | ||
Comment 125•22 years ago
|
||
just retesting this ... and the differences are bigger
then originally in comment #107 --> performance regression.
10 runs for each ...
TEST Trunk 2002121108 : Trunk 2002121108 : MSIE6.0 SP1
getElementById() 313ms : 287ms : 270ms
getElementsByTagName() 370ms : 305ms : 281ms
createElement() 702ms : 792ms : 231ms
getAttribute() 442ms : 378ms : 140ms
setAttribute() 407ms : 340ms : 144ms
Speaking for myself and (I think) brendan, I have no interest in ever running
DOMs in multiple threads.
Reporter | ||
Comment 127•22 years ago
|
||
Filed bug 186133 about the regression in performance.
No longer depends on: 186133
Comment 128•22 years ago
|
||
Mass-reassigning bugs to dom_bugs@netscape.com
Assignee: jst → dom_bugs
Status: ASSIGNED → NEW
Comment 129•21 years ago
|
||
Test: CVS-Trunk from yesterday Mozilla 20030107 (1.3b
trunk)
getElementById(): Average (10runs) :323ms Average (10runs) :156ms
getElementsByTagName(): Average (10runs) :372ms Average (10runs) :197ms
createElement(): Average (10runs) :787ms Average (10runs) :497ms
getAttribute(): Average (10runs) :438ms Average (10runs) :248ms
setAttribute(): Average (10runs) :413ms Average (10runs) :233ms
(to compare IE 5.5:
getElementById(): Average (10runs) :156ms
getElementsByTagName(): Average (10runs) :146ms
createElement(): Average (10runs) :125ms
getAttribute(): Average (10runs) :64ms
setAttribute(): Average (10runs) :62ms
CVS trunk, not optimized build; Mozilla 20030107 is a nightly downloaded from
ftp.mozilla.org; tests performed on a P4 2,53ghz on Windows 2000)
Comment 130•21 years ago
|
||
I'm afraid that comparing debug to non-debug (nightly) builds of Mozilla is an
apples-to-oranges comparison. It would be more helpful to compare the 20030107
nightly to the most recent nightly, for example, to remove most of the variables
in the comparison.
Comment 131•21 years ago
|
||
Athlon 1000, 512MB:
Mozilla Firebird Nightly:
481, 590, 1101, 704, 659
Mozilla Firebird 0.6 (optimised debian Package):
423, 495, 958, 605, 562
Mozilla 1.3.1 (also debian Package):
395, 461, 877, 560, 519
Opera 7.11:
93, 220, 246, 507, 3210
comments:
- optimisations matter a lot
- everything loses totally against opera
Comment 132•21 years ago
|
||
to Comment 130:
My cvs trunk build is a non-debug build (i disabled debug)
Comment 133•16 years ago
|
||
Still seeing various issues related to DOM performance in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1a1pre) Gecko/2008060905 Minefield/3.1a1pre ID:2008060905
~B
Flags: blocking1.9.1?
Comment 134•16 years ago
|
||
I was referring to some of the filed dependent bugs!
~B
File specific bugs on issues you are seeing. This is just a tracker bug at this point so nothing to block on here.
Given the amount of DOM fixes we've done since the last activity here (2002) i'm marking this WORKSFORME.
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: blocking1.9.1? → blocking1.9.1-
Resolution: --- → WORKSFORME
Comment 136•16 years ago
|
||
The testcase still shows a number of issues, and as you said this _is_ a tracker. I don't think we should be resolving it until we're happy with our performance on that testcase or at least fix the dependent bugs.
Component: DOM: Core → DOM: Core & HTML
QA Contact: stummala → general
Comment 137•14 years ago
|
||
This doesn't block "domperf", because this isn't an actionable bug.
No longer blocks: domperf
Comment 138•14 years ago
|
||
Can this block Peacekeeper (bug #499198)? Test 15 tests the speed of getElementById and getElementsByTagName and FF is slower than Chrome at it. Or should I file another bug with the test case from Peacekeeper?
Comment 139•14 years ago
|
||
Probably better to file specific bug for specific issues. See comment 135.
Comment 140•8 years ago
|
||
url down
Updated•5 years ago
|
Summary: DOM Performance Issues → [meta] DOM Performance Issues
Updated•5 years ago
|
Priority: P2 → P3
Comment 141•3 years ago
|
||
Meta bug with no activity for years (11 in this case), closing.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 3 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•