118933 - [meta] DOM Performance Issues

Reporter

Description

•

23 years ago

10 runs on each testcase with build 2002010803 on win-xp with a 1.1ghz machine. mozilla ie6 getElementById() 985 280 getElementsByTagName() 1230 293 createElement() 1314 235 getAttribute() 465 148 setAttribute() 447 157

Markus Hübner

Reporter

Updated

•

23 years ago

Keywords: perf, testcase

Hardware: PC → All

Markus Hübner

Reporter

Updated

•

23 years ago

Blocks: 21762, 71066

Daniel Bratell

Comment 1

•

23 years ago

I looked at the source for getElementById and it works by doing a linear search in the document the first time an id is asked for. After that it is cached in a hashtable. I guess it could be made somewhat quicker by putting all id:s in the hashtable from the beginning. The testcase looks up the same id all the time so it shouldn't be that. I think that the linear search could be made quicker though if that's whats slowing us down. It's done by recursion now and rewriting it into a more iterative method could save a lot of function calls. It could also be the translation from javascript to C++ and back again thats slow. I would really like to run Quantify on these tests but I don't know if I have the time.

Daniel Bratell

Comment 2

•

23 years ago

I have a reason for the slowness of Mozilla getElementById after running that testcase in Quantify. 53% of the time is spent in nsScriptSecurityManager::CanAccess. More than two thirds of that time is spent in nsScriptSecurityManager::GetSecurityLevel with nsCodebasePrincipal::GetOrigin being the worst offender. I would guess that security level can't change during the course of a script run so shouldn't that info be cached somewhere? The rest of the time in CanAccess is spent in nsXPIDLCString::PrepareForUseAsOutParam and nsScriptSecurityManager::GetObjectPrincipal. The rest of the time (when the sucurity part is removed) is spent as follows: 31% XPC_WN_Helper_NewResolve 19% XPCConvert::NativeData2JS 13% nsHTMLDocument::GetElementById (this is ~97 ms on my computer) 10% xptiInterfaceEntry::GetIID 6% XPCConvert::JSData2Native 21% Various stuff like memory managment and JS and XPC overhead So I would first look at why the security is so damn slow, then figure out what kind of animal XPC_WN_Helper_NewResolve is. Also, can NativeData2JS be made faster? It seems to be dominated by thread safety overhead. PR_EnterMonitor, PR_ExitMonitor and PR_Lock together is ~70% of the time. The rest being AddRef and js_AddRootRT. Is there a bug for the slowness of nsScriptSecurityManager::CanAccess? Should I open one?

Johnny Stenback (:jst)

Comment 3

•

23 years ago

mstoltz has some *really* nice goodies in his pocket that should make a world of difference here. Don't know if there's a bug for those speed improvements though.

Fabian Guisset

Comment 4

•

23 years ago

Thanks Daniel for the precious information! CC'ing dbradley for XPCConvert::NativeData2JS, and XPC_WN_Helper_NewResolve

Daniel Bratell

Updated

•

23 years ago

Depends on: 119646

Randell Jesup [:jesup] (needinfo me)

Comment 5

•

23 years ago

I've seen CanAccess show up in lots of Jprofs - never major, but often noticable. Check bug 49141 (window open, CanAccess is 17/929, or about 2%) and ones like it. I'll check my jprof's on Monday. Bug 107746 (DHTML Sliding menu perf) shows 6% in CanAccess (12/207)

Daniel Bratell

Updated

•

23 years ago

Depends on: 119745

David Bradley

Comment 6

•

23 years ago

The biggest offender in XPConnect is the JS_AddNamedRootRT and JS_RemoveNamedRootRT call in XPCWrappedNative::AddRef. Removing this was discussed in bug 112152. The XPC_WN_Helper_NewResolve looks like it boils down to nsHashTable/nsStringKey performance. nsStringKey's conversion to unicode, the PL_HashTableLookup, and the freeing of the key. This appears in both paths nsWindowSH::NewResolve and nsWindowSH::GlobalResolve.

Brendan Eich [:brendan]

Comment 7

•

23 years ago

Don't use nsHashtable, it's malloc-happy. Can we not expedite bug 112152? Is it an infrequent crash bug? Or if we don't yet have thread hazards in the current Mozilla code involved in that bug, can we not get rid of the rooting now in a single-threaded fashion? /be

Johnny Stenback (:jst)

Comment 8

•

23 years ago

I have changes that change the DOM code to use PLDHash in stead of nsHashtable, it still needs some cleaning up, but it's getting closer...

Priority: -- → P2

Target Milestone: --- → mozilla0.9.9

Johnny Stenback (:jst)

Comment 9

•

23 years ago

Attached patch 10+ times speedup of nsScriptNameSpaceManager::LookupName() (obsolete) (deleted) — Details — Splinter Review

I might be out on thin ice with my pldhash useage here, so this needs some pldhash familiar eyes. Brendan, would you have a look?

Johnny Stenback (:jst)

Comment 10

•

23 years ago

Oh, the 10+ number was from initial testing in Quantify with the getElementById() testcase.

David Baron :dbaron:

Comment 11

•

23 years ago

I've never quite seen PL_DHashGetStubOps used that way, but if it works, why not... Please don't copy the performance disaster (yes, it shows up on profiles) that is nsCRT::HashCode(const PRUnichar*) any further. Those HashCodes on nsAStrings are much better written using character sinks, since writing them the way you did gives you a lot of iterator overhead. I wrote them once, and I'll try to dig up the patch.

David Baron :dbaron:

Comment 12

•

23 years ago

Attached patch work in progress on HashString and nsCRT::HashCode improvements (obsolete) (deleted) — Details — Splinter Review

Here's part of a patch I was working on for bug 109129, although I never figured out how to solve the atom-related part of the problem.

Brendan Eich [:brendan]

Comment 13

•

23 years ago

Comment on attachment 65088 [details] [diff] [review] 10+ times speedup of nsScriptNameSpaceManager::LookupName() GetKey, HashKey, and MatchEntry callbacks, but no ClearEntry? That's wrong, it leaks -- it also explains why you thought you needed to Enumerate with a Cleanup callback before Finishing. >-static PRBool PR_CALLBACK >-NameStructCleanupCallback(nsHashKey *aKey, void *aData, void* closure) >+PR_STATIC_CALLBACK(PLDHashOperator) >+NameStructCleanupCallback(PLDHashTable *table, PLDHashEntryHdr *hdr, >+ PRUint32 number, void *arg) > { >- nsGlobalNameStruct *s = (nsGlobalNameStruct *)aData; >+ GlobalNameMapEntry *entry = NS_STATIC_CAST(GlobalNameMapEntry *, hdr); > >- delete s; >+ entry->mKey.~nsString(); > >- return PR_TRUE; >+ return PL_DHASH_NEXT; > } The above method should be morphed into a ClearEntry op, and then you won't need the Enuerate below: > nsScriptNameSpaceManager::~nsScriptNameSpaceManager() > { >- mGlobalNames.Reset(NameStructCleanupCallback); >+ if (mIsInitialized) { >+ // Clear all hash keys and values. >+ PL_DHashTableEnumerate(&mGlobalNames, NameStructCleanupCallback, nsnull); >+ >+ // Destroy the hash >+ PL_DHashTableFinish(&mGlobalNames); >+ } >+} See, Finish calls the clearEntry op for you. BTW, I'd rather you initialize the ops static explicitly, but that's mostly style (isn't the static set from a function return value more expensive?). >+ >+nsGlobalNameStruct * >+nsScriptNameSpaceManager::AddToHash(const nsAString& aKey) >+{ >+ GlobalNameMapEntry *entry = >+ NS_STATIC_CAST(GlobalNameMapEntry *, >+ PL_DHashTableOperate(&mGlobalNames, &aKey, PL_DHASH_ADD)); >+ >+ if (!entry) { >+ return nsnull; >+ } >+ >+ // The memory for the string key has already been allocated, but the >+ // string constructor was never called. Use placement new to >+ // initialize the key string. >+ nsString *str = new (&entry->mKey) nsString(aKey); >+ >+ return &entry->mGlobalName; > } Don't you need to set other members? > > nsresult >@@ -111,25 +165,18 @@ > continue; > } > >- // XXX Mac chokes _later_ if we put the |NS_Conv...| right in >- // the |nsStringKey| ctor, so make a temp >- NS_ConvertASCIItoUCS2 temp(categoryEntry); >+ NS_ConvertASCIItoUCS2 keyStr(categoryEntry); > >- nsStringKey key(temp); >+ nsGlobalNameStruct *s = FindInHash(keyStr); > >- if (!mGlobalNames.Get(&key)) { >- nsGlobalNameStruct *s = new nsGlobalNameStruct; >+ if (!s) { >+ s = AddToHash(keyStr); > NS_ENSURE_TRUE(s, NS_ERROR_OUT_OF_MEMORY); > > s->mType = aType; > s->mCID = cid; Oh, here are those member assignments -- but why hash twice when adding? You should use PL_DHASH_ADD to do the lookup, and if the entry already exists, you'll know by null- or zero-testing the members. If you need initial values other than 0/null, use the optional initEntry hook and make sure clearEntry stores the non-zero default values. How about a new patch fixing the above and capturing dbaron's HashCode work? /be

Attachment #65088 - Flags: needs-work+

Johnny Stenback (:jst)

Comment 14

•

23 years ago

Attached patch Fixes all Brendan's issues and includes the new HashCode and HashString code. (obsolete) (deleted) — Details — Splinter Review

This should be much closer, and I know it's even faster than the previous patch, thanks to dbaron's HashString() code.

Brendan Eich [:brendan]

Comment 15

•

23 years ago

Comment on attachment 65210 [details] [diff] [review] Fixes all Brendan's issues and includes the new HashCode and HashString code. >+class GlobalNameMapEntry : public PLDHashEntryHdr >+{ >+public: >+ nsString mKey; // must be first, to look like PLDHashEntryStub The comment seems false now -- you don't depend on Stub ops that want a void *key first, do you? >+GlobalNameHashClearEntry(PLDHashTable *table, PLDHashEntryHdr *entry) > { >- nsGlobalNameStruct *s = (nsGlobalNameStruct *)aData; >+ GlobalNameMapEntry *e = NS_STATIC_CAST(GlobalNameMapEntry *, entry); >+ >+ // An entry is being cleared, let the key (nsString) do it's own Possessive "its", not contracted it-is. >+ if (s->mType == nsGlobalNameStruct::eTypeNotInitialized) { > s->mType = aType; > s->mCID = cid; > >- nsGlobalNameStruct *old = >- (nsGlobalNameStruct *)mGlobalNames.Put(&key, s); >- >- NS_WARN_IF_FALSE(!old, "Redefining name in global namespace hash!"); >- >- delete old; >+ return NS_OK; > } else { return before else -- nuke the else and unbrace/exdent this: > NS_WARNING("Global script name not overwritten!"); > } Fix these nits and sr=brendan@mozilla.org. /be

Attachment #65210 - Flags: superreview+

Johnny Stenback (:jst)

Comment 16

•

23 years ago

> The comment seems false now -- you don't depend on Stub ops that want a void > *key first, do you? True, comment removed. "it's" replaced with "its" Re the else after return, the return was not supposed to be there, no idea how that ended up there. The else remains, but the return is gone. Thanks for the sr

Johnny Stenback (:jst)

Comment 17

•

23 years ago

Brendan, this is kindof a sidenote from this bug, but is there a good reason for the entry that's passed to the initEntry hook being const? Seems kinda weird, the whole idea is that you're supposed to muck with the entry in that hook, so why the const? Also, do we really want to name the new HashString(const nsA[C]String&) methods HashString, I'd vote for naming them HashCode to match the nsCRT ones. Thoughts?

Peter Van der Beken [:peterv]

Comment 18

•

23 years ago

Comment on attachment 65210 [details] [diff] [review] Fixes all Brendan's issues and includes the new HashCode and HashString code. r=peterv with Whaa and the comment for LookupName corrected.

Attachment #65210 - Flags: review+

David Baron :dbaron:

Comment 19

•

23 years ago

> Also, do we really want to name the new HashString(const nsA[C]String&) methods > HashString, I'd vote for naming them HashCode to match the nsCRT ones. Thoughts? As a global (rather than a static method of nsCRT), I slightly prefer HashString, since HashCode is a little general.

Brendan Eich [:brendan]

Comment 20

•

23 years ago

Dammit! How did that happen? The PLDHashInitEntry typedef's entry param should *not* be a pointer-to-const, of course. Patch here coming up, I want this fixed ASAP. /be

Brendan Eich [:brendan]

Comment 21

•

23 years ago

Attached patch proposed fix for JS/PLDHashInitEntry typedef's bogus const entry ptr (obsolete) (deleted) — Details — Splinter Review

pldhash.h is derived via js/src/plify_jsdhash.sed from jsdhash.h, which accounts for the off-by-four underhanging indentation glitch in pldhash.h's patch. Quick r= and sr=? jst, can you roll this into your patch and remove your const_cast? /be

Johnny Stenback (:jst)

Comment 22

•

23 years ago

Attached patch Complete patch including peterv's nit fixes and the jsdhash/pldhash const fixes. (deleted) — Details — Splinter Review

This one's ready to go, if I can get an a= I'll check it in today.

Brendan Eich [:brendan]

Comment 23

•

23 years ago

I prefer HashString too, the noun-verb combo works better than a noun phrase with an implied verb (which is what HashCode seems to be). I have a dumb question about these names, though: are we comfortable with all our global namespace claims? Pellucid names are great, but the problem is that other subsystems with which we may have to integrate will choose them too. Are we saved by inline somehow? /be

David Baron :dbaron:

Comment 24

•

23 years ago

We should be saved by C++'s capability for overloading, at least to some degree, since the mangled name for HashString would have to have nsAString/nsACString in it.

Christopher Blizzard (:blizzard)

Comment 25

•

23 years ago

a=blizzard on behalf of drivers for 0.9.7

Blocks: 115520

David Baron :dbaron:

Comment 26

•

23 years ago

For 0.9.8, I'd say it's probably better to hold off on the nsCRT.{h,cpp} changes, since there's a chance someone could be depending on the current behavior. I'm willing to file a separate bug to land those later. How risky are the DOM changes?

Brendan Eich [:brendan]

Comment 27

•

23 years ago

dbaron, I'm cool with holding off on the nsCRT changes till 0.9.9, but the namespace manager stuff is as safe as any other pldhash usage in the system. /be

John Bandhauer

Comment 28

•

23 years ago

in http://bugzilla.mozilla.org/show_bug.cgi?id=118933#c7 brendan asked... > Can we not expedite bug 112152? Is it an infrequent crash bug? Or if we don't > yet have thread hazards in the current Mozilla code involved in that bug, can we > not get rid of the rooting now in a single-threaded fashion? As I read the post removing that rooting would account for *part* of %70 of %19 of 47%. Pushing forward with gc-time marking rather than rooting without fixing the new problems that would uncover *might* cause new real crashes. The only crashes we know of with the current scheme are when using thread stress tests. But, yes I believe this can be done. I'm doubtful that the improvement will be significant (yes, I know.. 'every little bit...').

David Baron :dbaron:

Comment 29

•

23 years ago

nsCRT changes are now bug 120363.

Christopher Blizzard (:blizzard)

Updated

•

23 years ago

Keywords: mozilla0.9.7+

Johnny Stenback (:jst)

Comment 30

•

23 years ago

The latest patch modulo the nsCRT changes have been checked in. Leaving bug open for further optimizations.

Status: NEW → ASSIGNED

Johnny Stenback (:jst)

Comment 31

•

23 years ago

Markus, would you be able to run the tests again with a build from tomorrow or newer on the same hardware you ran the initial tests with and report back how much an imporvment we've made so far?

Daniel Bratell

Comment 32

•

23 years ago

Another resolve function that's called a lot in the testcases is nsHTMLDocument::ResolveName. Could the exact same solution, with pldhash, be used there too? There doesn't seem to be very many references to mNameHashTable (15 in nsHTMLDocument.cpp if emacs counted correctly).

Daniel Bratell

Updated

•

23 years ago

Depends on: 120506

Johnny Stenback (:jst)

Comment 33

•

23 years ago

Yup, I already have that in my tree, still needs some cleaning up though...

Markus Hübner

Reporter

Comment 34

•

23 years ago

Running again (1.1ghz, winxp) the test with bulid 2002011703 I get the following results: getElementById() 929 getElementsByTagName() 1107 createElement() 1227 getAttribute() 441 setAttribute() 421 nice work! :))

Marc Boullet

Comment 35

•

23 years ago

I did the test before/after on my slow 300MHz PIII NT4 SP6a Here are the results (10 runs): Test Before(20010116) After(20010117) Win(%) IE5.5 |---------------------|-------------------|-------------------|-------|----------| |getElementById | 4131 ms | 4017 ms | 3% | 972 ms | |---------------------|-------------------|-------------------|-------|----------| |getElementsByTagName | 5031 ms | 4382 ms | 13% | 1093 ms | |---------------------|-------------------|-------------------|-------|----------| |createElement | 5101 ms | 5062 ms | 1% | 760 ms | |---------------------|-------------------|-------------------|-------|----------| |getAttribute | 1799 ms | 1800 ms | 0% | 496 ms | |---------------------|-------------------|-------------------|-------|----------| |setAttribute | 1795 ms | 1803 ms | 0% | 541 ms | Biggest gain is for getElementsByTagName with very little gain for other tests. Still light-years from IE5.5.

Johnny Stenback (:jst)

Comment 36

•

23 years ago

Thanks for the testing, there's definately more to be done, the patches in this bug so far only fixed a small part of the performance problems. There's more on its way.

Adam Lock

Comment 37

•

23 years ago

I'm getting hundreds of assertions like below when I start mozilla fetched & built from this morning. s->mType is eTypeNotInitialized so it complains. Anyone know what is causing them? ###!!! ASSERTION: Whaaa, JS environment name clash!: '!(s && s->mType != nsGloba lNameStruct::eTypeInterface)', file d:\m\source_clean\mozilla\dom\src\build\nsSc riptNameSpaceManager.cpp, line 414

Johnny Stenback (:jst)

Comment 38

•

23 years ago

Yup, I have a fix...

Johnny Stenback (:jst)

Comment 39

•

23 years ago

Attached patch Fix bogus assertion and remove unecessary null check (diff -w) (deleted) — Details — Splinter Review

This is already checked in btw, feel free to do a post-checkin review if you feel it's necessary.

Christopher Blizzard (:blizzard)

Updated

•

23 years ago

Whiteboard: [driver:brendan]

Brendan Eich [:brendan]

Comment 40

•

23 years ago

Aren't all the 0.9.8 changes in now? If so, let's remove this from 115520's dependency list and let the TM of 0.9.9 mean that it'll get pursued then. /be

Johnny Stenback (:jst)

Comment 41

•

23 years ago

Removing dependency on bug 115520.

No longer blocks: 115520

Johnny Stenback (:jst)

Comment 42

•

23 years ago

Attached patch 4x speedup to nsHTMLDocument::ResolveName() and ::GetElementById(). (deleted) — Details — Splinter Review

This patch makes the documents (indirect) JS resolve hook and the implementation of document.getElementById() about 4 times faster (when combined with the fix for bug 120718) by eliminating nsStringKey and nsHashtable in favor of pldhash. Also combinging the id and name hashes in the document into one hash. This lets us do fewer and faster hash lookups. Over all this doesn't improve the times in the testcases in this bug by all that much since most of the time is spent outside this code (~70% in the security manager, see bug 119646).

Attachment #65088 - Attachment is obsolete: true

Attachment #65095 - Attachment is obsolete: true

Attachment #65210 - Attachment is obsolete: true

Attachment #65269 - Attachment is obsolete: true

Johnny Stenback (:jst)

Comment 43

•

23 years ago

Oh, this latest patch should also speed up pageload times a little bit since we do get some traffic in these hashtables while loading HTML pages.

Keywords: mozilla0.9.7+

Whiteboard: [driver:brendan] → [HAVE FIX]

Johnny Stenback (:jst)

Comment 44

•

23 years ago

Brendan, dbaron: Would nsHTMLDocument::InvalidateHashTables() (in the patch) be faster if I'd just do a PL_DHashTableFinish() + PL_DHashTableInit() in stead of enumerating the hash to remove every entry, or would that cause more heap traffic?

Johnny Stenback (:jst)

Comment 45

•

23 years ago

Attached patch Above patch as diff -w (deleted) — Details — Splinter Review

Tom Mraz

Updated

•

23 years ago

Depends on: 40988

[not reading bugmail]

Updated

•

23 years ago

Keywords: review

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 46

•

23 years ago

Comment on attachment 65745 [details] [diff] [review] Above patch as diff -w >Index: content/html/document/src/nsHTMLDocument.cpp >=================================================================== >+ // An entry is being cleared, let the key (nsString) do its own >+ // cleanup and release the content list if there is one. >+ e->mKey.~nsString(); >+ // Inititlize the key in the entry with placement new >+ nsString *str = new (&e->mKey) nsString(*keyStr); I'll have to talk to you about these (and I guess refresh my memory on some C++ as well...) >+ mIdAndNameHashIsLive = PL_DHashTableInit(&mIdAndNameHashTable, >+ &hash_table_ops, nsnull, >+ sizeof(IdAndNameMapEntry), 16); I think you might be better off allocating a larger value initially. I checked some 20 pages in the page load test (like Netscape, Microsoft, Google, mapQuest etc. pages) and on average they had 20-30 name attributes. Google had only 4, while the LXR test page has one per source line. A value of 32 might be better. >+ // We're looked for this id before and we didn't find it, so it Typo, "We've". >+ entry->mIdContent = NS_STATIC_CAST(nsIContent *, ID_NOT_IN_DOCUMENT); Don't need this cast.

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 47

•

23 years ago

Comment on attachment 65745 [details] [diff] [review] Above patch as diff -w r=heikki After discussing this again with jst things look good as they are. The hash isn't allocated until it is actually used, so even on LXR pages you may end up using less than the 16 entries.

Attachment #65745 - Flags: review+

Johnny Stenback (:jst)

Comment 48

•

23 years ago

Attached patch Final patch for this round (deleted) — Details — Splinter Review

Johnny Stenback (:jst)

Comment 49

•

23 years ago

Comment on attachment 66419 [details] [diff] [review] Final patch for this round jband sez sr=jband.

Attachment #66419 - Flags: superreview+

Attachment #66419 - Flags: review+

Johnny Stenback (:jst)

Comment 50

•

23 years ago

Latest patch (attachment 66419 [details] [diff] [review]) checked in. Leaving bug open in case we find more to optimize. Would someone run some more tests and report back the difference from when we started optimizing this?

Emmanuel Bourg

Comment 51

•

23 years ago

Result of 10 runs on a P3 450MHz / Windows NT4. Tested with Mozilla 0.9.7 and the latest nighly build. ---------------------------------------------------------------------------------- | IE6 Gecko/2001122106 Gecko/2002012403 % gained ---------------------------------------------------------------------------------- getElementById | 865 3374 3077 8,80 getElementByTagName | 869 3334 3313 0,62 createElement | 687 4075 3793 6,92 getAttribute | 442 1407 1390 1,20 setAttribute | 461 1387 1348 2,81 ----------------------------------------------------------------------------------

Christopher Aillon (sabbatical, not receiving bugmail)

Comment 52

•

23 years ago

Emmanuel, you will need a build from the 25th likely to accurately test this. The build you were testing was built about 18 hours _before_ this patch went in.

Emmanuel Bourg

Comment 53

•

23 years ago

Result of 10 runs on a P3 450MHz / Windows NT4, with the right build this time ;) | MSIE 6.0 Opera 6.0 Mozilla 0.9.7 2002012403 2001022503 %-m0.9.7 %-nightly ---------------------------------------------------------------------------------------------------------- getElementById | 865 1166 3374 3077 2983 11,59% 3,05% getElementByTagName | 869 1168 3334 3313 3030 9,12% 8,54% createElement | 687 1162 4075 3793 3742 8,17% 1,34% getAttribute | 442 N/A 1407 1390 1342 4,62% 3,45% setAttribute | 461 N/A 1387 1348 1295 6,63% 3,93%

Emmanuel Bourg

Comment 54

•

23 years ago

Reformating the table for clarity, sorry for the spam. | IE6 Moz 0.9.7 2002012403 2002012503 %-m0.9.7 %-nightly -------------------------------------------------------------------------------- getElementById | 865 3374 3077 2983 11,59% 3,05% getElementByTagName| 869 3334 3313 3030 9,12% 8,54% createElement | 687 4075 3793 3742 8,17% 1,34% getAttribute | 442 1407 1390 1342 4,62% 3,45% setAttribute | 461 1387 1348 1295 6,63% 3,93%

Marc Boullet

Comment 55

•

23 years ago

Hereafter are my timings still with PII 300 MHz NT4 SP6 I added a column with the gain ouside security manager. for that, I took the assumption that SM takes 70% of the time, as per jst's comment #42. I took 70% of the original time and substract this time from all measurements. FWIW. +---------------------+--------+----------+----------+-------+-------------+ |Test | before | w/ 65273 | w/ 66419 | Gain% |Gain SM excl.| |---------------------|--------|----------|----------|-------|-------------| |getElementByID |4131 ms | 4017 ms | 3715 ms | 10% | 50% | |---------------------|--------|----------|----------|-------|-------------| |getElementsByTagName |5031 ms | 4382 ms | 4016 ms | 28% | 305% | |---------------------|--------|----------|----------|-------|-------------| |CreateElement |5101 ms | 5062 ms | 4956 ms | 3% | --- | |---------------------|--------|----------|----------|-------|-------------| |getAttribute |1799 ms | 1800 ms | 1734 ms | 4% | --- | |---------------------|--------|----------|----------|-------|-------------| |setAttribute |1795 ms | 1803 ms | 1686 ms | 6% | --- | +---------------------+--------+----------+----------+-------+-------------+

Marc Boullet

Comment 56

•

23 years ago

Read 66% instead of 305%. Sorry for the spam.

Johnny Stenback (:jst)

Comment 57

•

23 years ago

Mstoltz's security manager changes should cut those numbers nearly in half. Should land soon...

Markus Hübner

Reporter

Comment 58

•

23 years ago

Running the tests again (1.1ghz, winxp) I get the following results: ------------------------------------------------------------------- | 2002010803 2002011703 2002012503 ------------------------------------------------------------------- getElementById | 985 929 893 getElementByTagName | 1230 1107 1155 createElement | 1314 1227 1218 getAttribute | 465 441 434 setAttribute | 447 421 411 -------------------------------------------------------------------

Tom Mraz

Comment 59

•

23 years ago

Results on 566MHz Celeron with Linux: Mozilla 0.9.8 (original binary talkback build from mozilla.org) and Mozilla 0.9.8 with the patch from bug 119646 compiled with gcc-2.96-98 and -O2 and -march=i686 Test Original m0.9.8 m0.9.8 with patch getElementById | 5541 3526 getElementByTagName | 5410 5797 createElement | 7205 5526 getAttribute | 2077 2657 setAttribute | 2110 2672 This is really strange :-( There is really big speedup on getElementById and createElement but small slowdown on the rest of the tests.

Adam Lock

Comment 60

•

23 years ago

I have a question about the patch. Why are the id strings stored in Unicode when ID & NAME are alphanumeric (+ a few other symbols)?

Johnny Stenback (:jst)

Comment 61

•

23 years ago

Becsause we don't want to be converting back n' fourth every time we hit the hashes, which we do a *lot*, and also because there are non-ascii characters in names on existing webpages already. We'be had bugs on non-ascii names not working already...

Johnny Stenback (:jst)

Comment 62

•

23 years ago

Can someone double check that setAttribute() n' friends really did slow down?

Marc Boullet

Comment 63

•

23 years ago

Since my last measurements (comment #55 2002-01-25), I'm seeing a small slowdown with 2002020703. getElementByID 3715 ---> 3928 createElement 4956 ---> 5124 getAttribute 1734 ---> 1790 setAttribute 1686 ---> 1702 More strange is the behavior of getElementsByTagName. When running the test with 10 runs, I see a great variation between the runs (from 4s to 8s). Moreover, I see a "great memory activity with NT task manager. Something have definitively changed here. I noted also that odd runs are quite fast (4s) while even runs take much more time (up to 8s). A hash table lookup should not have such variations. getElementsByTagName 4016 ---> 5628 (not stable) I'm rather skeptical on Tom's measurements in comment #59. With a 500 MHz Pentium, he is slower than me with my 300 MHz. Maybe some background task running on his PC.

Johnny Stenback (:jst)

Comment 64

•

23 years ago

The getElementsByTagName() tests might very well trigger GC's while the tests are running (since we create new objects for every iteration through the loop), that could easily cause large variations in the test results.

Emmanuel Bourg

Comment 65

•

23 years ago

Here are my results : (10 runs on a P3 450MHz / Windows NT4 ) | m0.9.7 2002012403 20020125 m0.9.8 2002020703 ------------------------------------------------------------------------ getElementById | 3374 3077 2983 3163 2989 getElementByTagName | 3334 3313 3030 3229 3053 createElement | 4075 3793 3742 3824 3774 getAttribute | 1407 1390 1342 1453 1357 setAttribute | 1387 1348 1295 1426 1336 m0.9.8 is slower but that's not a big surprise since it branched 2 days before jst's patch. Latest build doesn't show any significant change over the post patch build (20020125)

Tom Mraz

Comment 66

•

23 years ago

OK in my last comment I've compared two uncomparable builds (different compiler and settings). I've made a build with the gcc-2.96-98 compiler and WITHOUT the mstoltz patch from bug 119646 and compared the results with the same compiled build with the patch and there is bigger difference. Which leads me that the speed of the testcase is heavily depending on build environment (compiler, options...). So here are the comparable results with and without the mstoltz patch: Test Orig. m0.9.8 Our m0.9.8 W patch Our m0.9.8 W/O patch getElementById | 5541 3526 6166 getElementByTagName | 5410 5797 5680 createElement | 7205 5526 7996 getAttribute | 2077 2657 2482 setAttribute | 2110 2672 2455 So there definitely is some minor performance degradation on set/getAttribute. The getElemntByTagName case may be caused by the GC. The getElementById and createElement are highly improved.

Aaron Kaluszka

Comment 67

•

23 years ago

I don't know what happened, but bug 87808 returned somewhere between 02/06 and 02/09 Might be related to this.

Depends on: 87808

Markus Hübner

Reporter

Comment 68

•

23 years ago

There seems to be a problem concerning the testcases at http://www.formula-one.nu/mozilla/domtestcases/ Using build 2002020908 on win-xp pressing "run" doesn't seem to trigger any action - a new bug ?!

Marc Boullet

Comment 69

•

23 years ago

Markus, you're right. I'm unable to run the tests (at least getElementById: I didn't try the others) with 2002020908 NT4. Seems to perform the first run , display the result (briefly) and re-init. Did you file a bug or mentionning here is enough ?

Markus Hübner

Reporter

Comment 70

•

23 years ago

Filed a new bug on the regression seen in the testcases - bug 124841.

Johnny Stenback (:jst)

Comment 71

•

23 years ago

That regression was due to bug 124749.

Marc Boullet

Comment 72

•

23 years ago

Here are the results with build from 2002021303 which includes Mstoltz's security manager changes. +---------------------+--------+----------+------------+-------+----------+ |Test | before | w/ 66419 | w/ new SM | Gain% | IE5.5 | |---------------------|--------|----------|------------|-------|----------| |getElementByID |4131 ms | 3715 ms | 2176 ms | 47% | 972 ms | |---------------------|--------|----------|------------|-------|----------| |getElementsByTagName |5031 ms | 4016 ms | 4309 ms | 14% | 1093 ms | |---------------------|--------|----------|------------|-------|----------| |CreateElement |5101 ms | 4956 ms | 3197 ms | 37% | 760 ms | |---------------------|--------|----------|------------|-------|----------| |getAttribute |1799 ms | 1734 ms | 1751 ms | -- | 496 ms | |---------------------|--------|----------|------------|-------|----------| |setAttribute |1795 ms | 1686 ms | 1688 ms | 6% | 541 ms | +---------------------+--------+----------+------------+-------+----------+ Any other ideas to narrow even more the gap with IE.

Chris Nelson

Comment 73

•

23 years ago

There was a mail in n.p.m.performance, "Fast RAM semaphores" by Michael Kaply. He pointed to this bug, and how with the patch Moz OS/2 beats Windows in javascript speed tests. He points to this bug... http://bugzilla.mozilla.org/show_bug.cgi?id=125123

Marc Boullet

Comment 74

•

23 years ago

I read that too. Any volunteer to port this code on the other OS.

Dennis Haney

Comment 75

•

23 years ago

Well... on linux _MD_NEW_LOCK is implemented using a macro: #define _MD_NEW_LOCK(lock) PR_SUCCESS I dont know many implementations that will do that faster...

Boris Zbarsky [:bzbarsky]

Comment 76

•

23 years ago

I just made profiles of those tests with a cvs build from this morning (has mstoltz's fix). You can see them at http://boris.zbarsky.org:8000/~bzbarsky/dom/perf/ Four of them nothing is leaping at me.... getElementsByTagName we're spending lots of time in nsVoidArray::IndexOf. Here's what happens there: Nodelists are live. This means they add themselves to the document observer list when they get created and remove themselves when they get destroyed. Now this page creates thousands of nodelists in a tight loop. I put a debug printf in nsVoidArray::IndexOf and it looks like the length of the list starts out at a fairly low number, grows up to 10008 (10000 nodelists and 8 legitimate observers) and then shrinks to a low number. The interesting thing is that if I tell the page to run the test twice I get _two_ spikes of about 10000 observers. So it looks like the observers are being deleted one by one when the function the nodelists were created in returns.... Is this business of creating lots of nodelists in a tight loop actually common out there? As things stand the addition and removal of observers takes time that is O(N^2) in the number of nodelists in the testcase. Do observers need to be notified in a consistent order? That is, could we store them in a hashtable instead of an array?

Fabian Guisset

Comment 77

•

23 years ago

Creating nodeList's in a tight loop is not common as far as I know -- all good js coders know they have to cache stuff before entering a tight loop.

Johnny Stenback (:jst)

Comment 78

•

23 years ago

Pushing to mozilla1.0

Target Milestone: mozilla0.9.9 → mozilla1.0

Markus Hübner

Reporter

Updated

•

23 years ago

Depends on: 29805

Markus Hübner

Reporter

Comment 79

•

23 years ago

Another testcase is at http://www.formula-one.nu/mozilla/domtestcase2/ Seems we are doing very badly on them. test1 test2 --------------------------------- msie6 3831 4738 2002021303 30206 28952 The problem on test2 (no animation displayed) seems to be due to bug 117061

Marc Boullet

Comment 80

•

23 years ago

On my 300MHz PII, animation is not visible in both testcases. Since the DIVs are overlapping, I think bug #115247 may do something here, but I'm pretty sure there are other causes too.

scottputterman

Comment 81

•

23 years ago

Moving Netscape owned 0.9.9 and 1.0 bugs that don't have an nsbeta1, nsbeta1+, topembed, topembed+, Mozilla0.9.9+ or Mozilla1.0+ keyword. Please send any questions or feedback about this to adt@netscape.com. You can search for "Moving bugs not scheduled for a project" to quickly delete this bugmail.

Target Milestone: mozilla1.0 → mozilla1.2

timeless

Updated

•

23 years ago

Keywords: mozilla1.0, nsbeta1

Target Milestone: mozilla1.2 → mozilla1.0

Randell Jesup [:jesup] (needinfo me)

Comment 82

•

23 years ago

Attached file jprof of new testcase, text version, with text "Test" (deleted) — Details

90% is nsBlockFrame::Reflow (surprise) 33% f+d in nsContainerFrame::SyncFrameViewAfterReflow, almost 1/2 of that in nsCSSRendering::FindBackground(). Almost all (12%) of FindBackground's time is spent in QI(!) 12% in nsContainerFrame::PositionFrameView() (1/2 from SyncFrameViewAfterReflow) 10% in ComputeLineHeight, most that from nsFontCache::GetMetricsFor, and much of that from nsFont::Equals() 14% in nsHTMLReflowState::nsHTMLReflowState, most of that (12%) in nsHTMLReflowState::InitConstraints() 12% in nsContainerFrame::PositionFrameView(), split between nsViewManager::MoveViewTo, nsFrame::GetView, and nsFrame::GetOffsetFromView 12% in nsHTMLDivElement::QueryInterface()(!) 5% direct hits in nsViewManager::IsViewInserted() 5% direct hits in the combination nsStyleContext::GetStyleData() and its callee nsRuleNode::GetStyleData() 5% in nsBindingManager::GetBindingImplementation()

Randell Jesup [:jesup] (needinfo me)

Updated

•

23 years ago

Depends on: 129079

Randell Jesup [:jesup] (needinfo me)

Comment 83

•

23 years ago

Bug 129079 has a fix for the FindBackground issue; cuts FindBackground from 15% to <5%, about a 10% speedup on the overall testcase. Thanks BZ.

Randell Jesup [:jesup] (needinfo me)

Comment 84

•

23 years ago

FYI, checked the Images version of the testcase. jprof is almost identical to the text version; i.e. almost all the time is spent doing reflow.

Randell Jesup [:jesup] (needinfo me)

Updated

•

23 years ago

Depends on: 129115

Nisheeth Ranjan

Comment 85

•

23 years ago

I believe we've done what we can here for mozilla1.0. The reflow coalescing work that's going on in bug 129115 will help with DHTML perf across the board, but other than that there's no time to work on this now.

Keywords: nsbeta1 → nsbeta1-

Target Milestone: mozilla1.0 → mozilla1.1

Johnny Stenback (:jst)

Comment 86

•

23 years ago

Um, that was a comment from me, not from Nisheeth.

Boris Zbarsky [:bzbarsky]

Updated

•

23 years ago

Depends on: 132974

Johnny Stenback (:jst)

Comment 87

•

23 years ago

Attached patch 50% speedup of the getElementById() testcase... (obsolete) (deleted) — Details — Splinter Review

This patch makes mozilla pre-define the document and window properties on the global object, this means that accessing those properties will be handled directly by the JS engine in stead of going through XPConnect. This might cause the security checks for window and document to not work as expected though, need to think about that...

Johnny Stenback (:jst)

Comment 88

•

23 years ago

Attached patch 50% speedup of the getElementById() testcase, security problem fixed... (deleted) — Details — Splinter Review

Make doCheckPropertyAccess() not do a security check for document and window.

Attachment #81102 - Attachment is obsolete: true

Peter Van der Beken [:peterv]

Comment 89

•

23 years ago

Comment on attachment 81103 [details] [diff] [review] 50% speedup of the getElementById() testcase, security problem fixed... >+ } else if (mJSObject) { >+ JSContext *cx = (JSContext *)mContext->GetNativeContext(); >+ >+ nsWindowSH::CacheDocumentProperty(cx, mJSObject, this); >+ } I prefer nsWindowSH::CacheDocumentProperty((JSContext *)mContext->GetNativeContext(), mJSObject, this); r=peterv.

Attachment #81103 - Flags: review+

vidur (gone)

Updated

•

23 years ago

Attachment #81103 - Flags: superreview+

vidur (gone)

Comment 90

•

23 years ago

Comment on attachment 81103 [details] [diff] [review] 50% speedup of the getElementById() testcase, security problem fixed... sr=vidur. Comment the about:blank case in nsGlobalWindow to explain why it's necessary.

Johnny Stenback (:jst)

Comment 91

•

23 years ago

attachment 81103 [details] [diff] [review] checked in on the trunk.

Boris Zbarsky [:bzbarsky]

Updated

•

23 years ago

Depends on: 140758

Marc Boullet

Comment 92

•

23 years ago

Wooooh! another one like that and we'll catch IE !!!!! +---------------------+--------+----------+-------+----------+ |Test | before | w/ 81103 | Gain% | IE5.5 | |---------------------|--------|----------|-------|----------| |getElementByID |2176 ms | 1169 ms | 46% | 972 ms | |---------------------|--------|----------|-------|----------| |getElementsByTagName |4309 ms | 4119 ms | --- | 1093 ms | |---------------------|--------|----------|-------|----------| |CreateElement |3197 ms | 2193 ms | 31% | 760 ms | |---------------------|--------|----------|-------|----------| |getAttribute |1751 ms | 1472 ms | 16% | 496 ms | |---------------------|--------|----------|-------|----------| |setAttribute |1688 ms | 1425 ms | 16% | 541 ms | +---------------------+--------+----------+-------+----------+

Johnny Stenback (:jst)

Comment 93

•

23 years ago

Turns out that the security related change in this patch wasn't needed, I backed that part out of the trunk. Patch for the branch coming up.

Johnny Stenback (:jst)

Comment 94

•

23 years ago

Attached patch Same as above but for the 1.0 branch and w/o the security related change (deleted) — Details — Splinter Review

Randell Jesup [:jesup] (needinfo me)

Comment 95

•

23 years ago

Comment on attachment 82101 [details] [diff] [review] Same as above but for the 1.0 branch and w/o the security related change Pulling forward existing r/sr from trunk a=rjesup@wgate.com for branch

Attachment #82101 - Flags: superreview+

Attachment #82101 - Flags: review+

Attachment #82101 - Flags: approval+

Johnny Stenback (:jst)

Updated

•

23 years ago

Keywords: adt1.0.0

Target Milestone: mozilla1.1alpha → mozilla1.0

Jaime Rodriguez, Jr.

Comment 96

•

23 years ago

let's wait and get this one in for rtm. adt1.0.0- [adt2 RTM]

Keywords: adt1.0.0 → adt1.0.0-

Whiteboard: [HAVE FIX] → [HAVE FIX] [adt2 RTM]

Markus Hübner

Reporter

Comment 97

•

23 years ago

Wolfgang Schwarz has made a new great test-suite at http://www.umsu.de/jsperf/index2.php ! :)

Markus Hübner

Reporter

Comment 98

•

23 years ago

Won't this be checked into the 1.0 branch ?!

Boris Zbarsky [:bzbarsky]

Comment 99

•

23 years ago

I was wondering about that.... the patch has driver approval and adt1.0.0-. Does that not mean that a non-Netscape-employee could check it into the branch?

Johnny Stenback (:jst)

Comment 100

•

23 years ago

The ADT will re-evaluate this fix after the MachV Beta is released, until then *I* am unable to check this in on the branch.

Heikki Toivonen (remove -bugzilla when emailing directly)

Updated

•

23 years ago

Keywords: nsbeta1- → nsbeta1+

Johnny Stenback (:jst)

Updated

•

23 years ago

Whiteboard: [HAVE FIX] [adt2 RTM] → [FIXED ON TRUNK] [adt2 RTM]

Heikki Toivonen (remove -bugzilla when emailing directly)

Updated

•

23 years ago

Keywords: adt1.0.0- → adt1.0.0

scottputterman

Comment 101

•

23 years ago

adding adt1.0.0+. Please checkin to the 1.0 branch.

Keywords: adt1.0.0 → adt1.0.0+

Johnny Stenback (:jst)

Comment 102

•

23 years ago

Fixed on branch too, leaving bug open though since there's more to do here...

Keywords: fixed1.0.0

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 103

•

23 years ago

Clearing keywords to get this off my query radars. I would advice closing this and opening new bugs for specific issues as this has become so big.

Keywords: adt1.0.0+, nsbeta1+, review

Whiteboard: [FIXED ON TRUNK] [adt2 RTM] → [FIXED ON TRUNK]

Prashant Desale

Comment 104

•

23 years ago

I ran the tests on 10 runs on each testcase with following 3 browsers. Branch: 2002-05-22-18-1.0.0 Trunk: 2002-05-23-08-trunk. Other: IE 5.5 & here are results. TEST Trunk : Branch : Other (IE) getElementById() 1165ms : 1435ms : 1180ms getElementsByTagName() 1222ms : 4127ms : 1438ms createElement() 2137ms : 2410ms : 869ms getAttribute() 1435ms : 1747ms : 550ms setAttribute() 1391ms : 1665ms : 658ms Looking at these results branch takes more time than trunk & other browser also. Removing fixed1.0.0 keyword. Please correct me If I'm wrong.

Keywords: fixed1.0.0

Johnny Stenback (:jst)

Comment 105

•

23 years ago

This was "fixed" on the branch, but there's a lot of other improvments on the trunk, so we can't really compare the two. In any case, the last patch in this bug was checked in on the branch, so I'm adding fixed1.0.0 and verified1.0.0 keywords to this bug.

Keywords: fixed1.0.0, verified1.0.0

Brendan Eich [:brendan]

Comment 106

•

22 years ago

If the changes on the trunk prove out well during 1.1alpha, someone should gather a branch patch and nominate it to drivers for inclusion on the branch, in 1.0.1 (after 1.0 ships). /be

Markus Hübner

Reporter

Updated

•

22 years ago

Depends on: 23187

Johnny Stenback (:jst)

Updated

•

22 years ago

Target Milestone: mozilla1.0 → Future

Markus Hübner

Reporter

Comment 107

•

22 years ago

just an update on tests: trunk build 2002082208 on win-xp pro,1.1ghz,512ram 10 runs for each ... TEST Trunk : MSIE6.0 getElementById() 287ms : 270ms getElementsByTagName() 305ms : 302ms createElement() 792ms : 231ms getAttribute() 378ms : 145ms setAttribute() 340ms : 152ms huge difference on createElement()

Markus Hübner

Reporter

Comment 108

•

22 years ago

btw: will this get now on the branch - regarding brendan's comment #106 ?

Jaime Rodriguez, Jr.

Comment 109

•

22 years ago

If what /be suggests is not already on the 1.0 branch for 1.0.1, then it will not make it. But, we should nomintae it for landing in the 1.0.2 milestone.

Blocks: 143047

Randell Jesup [:jesup] (needinfo me)

Comment 110

•

22 years ago

So which bugs are the other issues referred to that we should consider for 1.0.2? Please mention them here, and also mark them with the keyword "mozilla1.0.2" to nominate them.

Daniel Bratell

Comment 111

•

22 years ago

I took a new look at createElement testcase and I can easily say what's slow, but I don't know how fixable it is. Of the time taken by createElement: 43% is converting from C++ to JS 22% is doing the createElement() 7% is security related 7% is converting from JS to C++ 6% is getting some property (js_GetProperty called from js_Interpret) 6% is javascript garbage collecting 5% is 2 calls to some hook (jsd_FunctionCallHook) per createElement() the rest is minor overhead everywhere So there are two major offenders and a couple of small ones. Let's take them one by one. Converting from C++ to JS (43%): ================================= This is done by XPCConvert::NativeData2JS / XPCConvert::NativeInterface2JSObject which I've seen in profiles before. I don't know the workings of this but it is slow. It's almost certainly this that makes setAttribute and getAttribute slow as well, even though I should check that. Anyway, it breaks as this: (90% is XPCWrappedNative::GetNewOrUsed which contains the functions below) 40% is XPCWrappedNative::Init which only does a JS_NewObject 11% is XPCWrappedNative::FindTearOff which is 60% QueryInterface (explicit and through nsCOMPtr::assign_from_helper) 36% XPCWrappedNative::ExtendSet (hashtable and locks) 11% is calls to new 9% is locks (the same lock used several times per call) 7% is QueryInterface 7% is XPCNativeInterface::GetNewOrUsed 90% of this is locking, 8% is looking in an hashtable so a strategy here seems to be to try to QueryInterface less and use locks less. If that's possible, and I expect it to be, I guess we can save 10-20% at least. I'm reluctant to touch that code though. It seems crash prone (magical commented wrappers to avoid crashes). The createElement call (22%): ============================== 40% here ends in NS_NewHTMLDivElement (i.e. only 10% of the time of the DOM call is actually _creating_ the DIV element) 24% is getting some service in NS_CreateHTMLElement 11% is in nsNodeInfoManager::GetNodeInfo which is an hashtable lookup 8% is Querying an Interface on the created nsHTMLDivElement 7% is converting some string ("div"?) to an ID in nsParserService 7% is copying a string into an nsAutoString in nsHTMLDocument::CreateElement I don't know if there's very much to do here. We could maybe avoid getting the service for every call, and maybe we could avoid doing the QueryInterface. That could, if possible, save a couple of percent of the total time. Security (7%): ============== I imagine security (CanAccess) is quite optimized now so I don't think it's easy to find time to win here. Converting from JS to C++ (7%): ================================ Nothing to win here either. All the time is spent allocating a wrapper for the JS string and we can't easily avoid that. Accessing some property (6%): =============================== This is the js_GetProperty called from js_Interpret. I don't know what property this is. |document| maybe? 40% is creating a XPCCallContext object 30% is calling nsHTMLDocumentSH::NewResolve 12% is looking something up in an hashtable in js_LookupProperty 8% is destroying the XPCCallContext object 4% is XPCNativeSet::FindMember Maybe we can keep the XPCCallContext object around somewhere? It seems expensive to create and destroy. Javascript garbage collecting (6%) ================================== I don't even knows if this happens every time. Hooks (jsd_FunctionCallHook) (5%): ==================================== For every createElement there are 2 calls to jsd_FunctionCallHook in js_Invoke. 98% Locking (jsd_Lock, jsd_Unlock) 2% callHook overhead 0% JS_GetFrameScript and JS_IsConstructorFrame So here we must either avoid the call altogether or skip the locking. I've no idea what this is.

Boris Zbarsky [:bzbarsky]

Comment 112

•

22 years ago

> 24% is getting some service in NS_CreateHTMLElement The first three lines of NS_CreateHTMLElement are: 845 // Cache this service! The XML content sink uses this method! 846 847 nsCOMPtr<nsIParserService> parserService = 848 do_GetService(kParserServiceCID, &rv); This sounds like "low-hanging-fruit which should help perf outside createElement" to me...

Boris Zbarsky [:bzbarsky]

Comment 113

•

22 years ago

Oh, and one more thing. > 7% is converting some string ("div"?) to an ID in nsParserService This code looks like it could be made faster too... right now it takes an atom, gets the string out, hashes it, and does a hashtable lookup. If we can fix nsHTMLTags::LookupTag to also use atoms, we could hash the atoms instead of raw strings (which will make the hash key calculation faster, if nothing else; just use the pointer as the key).

Daniel Bratell

Updated

•

22 years ago

Depends on: 165890

Daniel Bratell

Updated

•

22 years ago

Depends on: 165893

Brendan Eich [:brendan]

Comment 114

•

22 years ago

Daniel, you must have the js debugger (venkman) open for jsd_FunctionCallHook to show up at all, I think. Or at least, venkman must be installed and must have been opened? Cc'ing rginda. Normal users will never see cycles in jsd_*, so I think you should reprofile with venkman off to get rid of this noise. dbradley is cc'd -- I wonder whether we could eliminate most locking costs using techniques used by the JS engine (see bug 54743). Also, if this createElement benchmark is wasting time looking for a cached wrapper or tear-off or whatever when it just needs to allocate a new one constantly, maybe the benchmark doesn't really represent real-world workloads. Or if it does represent some interesting workload to optimize, perhaps xpconnect can tell when it needs to stop reusing and switch to new allocation? /be

Daniel Bratell

Comment 115

•

22 years ago

I guess building the js debugger is enough to have it "installed"? In that case I have it installed and I might have used it in some profile of mine, but it hasn't been used in the profile I used for the tests. Maybe some lock is positioned outside the debugger-in-use-check. If the test case is relevant? That's always a good question. In this case it shows that JS to C++-conversion is a bottle neck and that's interesting. It's interesting because it affects much more than createElement since (I assume) lots of code in DHTML and in mozilla pass over that bridge, though I'm not familiar enough with that code to really understand the meaning of everything it does. Anyway, two of the mentioned possibilities for improvement are, including patches, in bug 165890 and 165893 and those two improved createElement performace for me from ~1000 ms to ~900 ms. I was looking at reducing the number of QueryInterface:s too, but each one seem to be needed and I don't see a way to make them faster.

Brendan Eich [:brendan]

Comment 116

•

22 years ago

>If the test case is relevant? That's always a good question. In this case it >shows that JS to C++-conversion is a bottle neck and that's interesting. It's >interesting because it affects much more than createElement since (I assume) >lots of code in DHTML and in mozilla pass over that bridge, Do they? My point was that many (most?) DHTML pages do not create elements by the thousands, or constantly, or frequently. The bridge crossed when creating a new element is nsXPCWrappedNative::GetNewOrUsed, but once you've crossed that bridge, to operate on a referenced instance is cheaper, and crosses a different code bridge. Right? /be

Daniel Bratell

Comment 117

•

22 years ago

You mean that when an object has been converted once, then it's much cheaper to convert (cached)? In that case, maybe the createElement test case doesn't tell us very much as long as it isn't ridicously slow. getElementById would tell us more.

Robert Ginda

Comment 118

•

22 years ago

If you've got the debugger installed it will add some hooks to the js engine by default. To uninstall these hooks, type "/startup-init off" in the Interactive Session view. You need write access to the component registry, as this removes a component from the startup observer category. The setting will be saved until you do a "/startup-init on", or until something causes the js component venkman-service.js to be re-registered.

John Bandhauer

Comment 119

•

22 years ago

Some thoughts on the xpconnect related suggestions... - As noted by others we should be sure we are focusing on a real world benchmark before going off and optimizing. - On the issue raised regarding XPCCallContexts... They are stack only objects. The cost is not in creation and destruction but rather is in initialization and completion. Various bits of call state needs to be calculated before the call and various bits of work need to done after the call. These just happen to be done in the ctor and dtor. Recycling the actual objects would do no good. - On locking... Brendan suggested a lock avoidance scheme like that used in the js engine. XPConnect already has a lock avoidance scheme that saves *lots* of DOM wrapper locking in the normal access paths. Rather than use the (relatively) complicated scheme as used in the js engine to track occurance of access from secondary threads, the xpconnect scheme just leverages the MAIN_THREAD_ONLY class info flag to avoid locking overhead when dealing with wrapper specific data. That is, many of the uses of autolocks in xpconnect use a null lock and are almost NOPs. However, xpconnect also factors out interface info specific data and per scope tables of wrappers. These are shared across the entire runtime and require synchronized access. Most of this data is only diddled at wrapper creation time - though there is one lock/unlock pair in the normal path of looking up an existing wrapper. These runtime-wide subsystems generally require synchronized access. Someone *could* factor out parallel subsystems that would be used only on the main thread; e.g. factoring out the XPCNativeInterface/XPCNativeSet stuff so that one instance of the graph is for use on main thread only wrappers and the other is used for everything else. Same story on wrapper sets. This would add complexity. And, for some paths, it might mean an unlocked lookup followed by a locked lookup before the right wrapper was found. But it might reduce DOM wrapper locking significantly. There is the question of whether or not the long term vision for the DOM requires main thread only use. There has been some discussion of running DOMs on multiple threads; i.e. perhaps each DOM would be single threaded, but different DOMs might run on different threads. I don't know if this is likely to happen. But if it did then the refactoring I suggest above could blow up into a lot of duplicated data if we tried to give each DOM thread its own set of stuff. Anyway, I'm not so sure that the locking overhead through realistic code paths is significant. If it is then someone could look into some refactoring of shared data in order to cut out some of the locks. I don't know who would be doing that. The effort required to get up to speed on xpconnect internals in not trivial.

Adam Lock

Comment 120

•

22 years ago

Naive question, but performance be improved by writing some custom marshalers rather than going through XPConnect each time?

Markus Hübner

Reporter

Updated

•

22 years ago

Keywords: mozilla1.0.2

Daniel Bratell

Comment 121

•

22 years ago

www.formula-one.nu seems to be gone. Did anyone make a copy of the tests in comment 68 or does anyone know their new location?

Markus Hübner

Reporter

Comment 122

•

22 years ago

Updated URL of testcases ... I have a backup of all testcases. Just drop me a line if you need anything more.

URL: http://www.formula-one.nu/mozilla/dom... → http://www.world-direct.com/mozilla/d...

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 123

•

22 years ago

on 20021210 we're still far far away from IE or even Opera 7!!! | IE6 2002121008 Opera 7 --------------------------------------------- getElementById | 326 413 267 getElementByTagName| 333 528 283 createElement | 271 1266 568 getAttribute | 131 649 656 setAttribute | 152 587 726 I strongly belive that this should be 1.3beta blocker. It's low level part of almost all DHTML scripts, and speeding this one up will result in speed up in many other bugs.

Flags: blocking1.3b?

Brendan Eich [:brendan]

Comment 124

•

22 years ago

Talking about 1.3beta blockers in a metabug is silly. First, get a patch or five going that solve the underlying problem(s). Then we may well be in 1.4alpha. Any big and/or scary patches belong in an alpha, not in a beta, milestone. /be

Asa Dotzler [:asa]

Updated

•

22 years ago

Flags: blocking1.3b? → blocking1.3b-

Markus Hübner

Reporter

Comment 125

•

22 years ago

just retesting this ... and the differences are bigger then originally in comment #107 --> performance regression. 10 runs for each ... TEST Trunk 2002121108 : Trunk 2002121108 : MSIE6.0 SP1 getElementById() 313ms : 287ms : 270ms getElementsByTagName() 370ms : 305ms : 281ms createElement() 702ms : 792ms : 231ms getAttribute() 442ms : 378ms : 140ms setAttribute() 407ms : 340ms : 144ms

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 126

•

22 years ago

Speaking for myself and (I think) brendan, I have no interest in ever running DOMs in multiple threads.

Markus Hübner

Reporter

Updated

•

22 years ago

Depends on: 186133

Markus Hübner

Reporter

Comment 127

•

22 years ago

Filed bug 186133 about the regression in performance.

No longer depends on: 186133

Boris Zbarsky [:bzbarsky]

Updated

•

22 years ago

Depends on: 186133

Johnny Stenback (:jst)

Comment 128

•

22 years ago

Mass-reassigning bugs to dom_bugs@netscape.com

Assignee: jst → dom_bugs

Status: ASSIGNED → NEW

Frank Wein [:mcsmurf]

Comment 129

•

21 years ago

Test: CVS-Trunk from yesterday Mozilla 20030107 (1.3b trunk) getElementById(): Average (10runs) :323ms Average (10runs) :156ms getElementsByTagName(): Average (10runs) :372ms Average (10runs) :197ms createElement(): Average (10runs) :787ms Average (10runs) :497ms getAttribute(): Average (10runs) :438ms Average (10runs) :248ms setAttribute(): Average (10runs) :413ms Average (10runs) :233ms (to compare IE 5.5: getElementById(): Average (10runs) :156ms getElementsByTagName(): Average (10runs) :146ms createElement(): Average (10runs) :125ms getAttribute(): Average (10runs) :64ms setAttribute(): Average (10runs) :62ms CVS trunk, not optimized build; Mozilla 20030107 is a nightly downloaded from ftp.mozilla.org; tests performed on a P4 2,53ghz on Windows 2000)

Adam D. Moss

Comment 130

•

21 years ago

I'm afraid that comparing debug to non-debug (nightly) builds of Mozilla is an apples-to-oranges comparison. It would be more helpful to compare the 20030107 nightly to the most recent nightly, for example, to remove most of the variables in the comparison.

Kai Lahmann (is there, where MNG is)

Comment 131

•

21 years ago

Athlon 1000, 512MB: Mozilla Firebird Nightly: 481, 590, 1101, 704, 659 Mozilla Firebird 0.6 (optimised debian Package): 423, 495, 958, 605, 562 Mozilla 1.3.1 (also debian Package): 395, 461, 877, 560, 519 Opera 7.11: 93, 220, 246, 507, 3210 comments: - optimisations matter a lot - everything loses totally against opera

Frank Wein [:mcsmurf]

Comment 132

•

21 years ago

to Comment 130: My cvs trunk build is a non-debug build (i disabled debug)

Markus Hübner

Reporter

Updated

•

21 years ago

Blocks: 203448

Target Milestone: Future → ---

Markus Hübner

Reporter

Updated

•

21 years ago

Depends on: 213943

Markus Hübner

Reporter

Updated

•

21 years ago

Depends on: 214139

Markus Hübner

Reporter

Updated

•

21 years ago

Depends on: 166013

Bryan

Comment 133

•

16 years ago

Still seeing various issues related to DOM performance in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1a1pre) Gecko/2008060905 Minefield/3.1a1pre ID:2008060905 ~B

Flags: blocking1.9.1?

Bryan

Comment 134

•

16 years ago

I was referring to some of the filed dependent bugs! ~B

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 135

•

16 years ago

File specific bugs on issues you are seeing. This is just a tracker bug at this point so nothing to block on here. Given the amount of DOM fixes we've done since the last activity here (2002) i'm marking this WORKSFORME.

Status: NEW → RESOLVED

Closed: 16 years ago

Flags: blocking1.9.1? → blocking1.9.1-

Resolution: --- → WORKSFORME

Boris Zbarsky [:bzbarsky]

Comment 136

•

16 years ago

The testcase still shows a number of issues, and as you said this _is_ a tracker. I don't think we should be resolving it until we're happy with our performance on that testcase or at least fix the dependent bugs.

Status: RESOLVED → REOPENED

Keywords: meta

Resolution: WORKSFORME → ---

timeless

Updated

•

16 years ago

Component: DOM: Core → DOM: Core & HTML

QA Contact: stummala → general

:Ms2ger (he/him; ⌚ UTC+1/+2)

Updated

•

14 years ago

Assignee: general → nobody

Blocks: domperf

Boris Zbarsky [:bzbarsky]

Comment 137

•

14 years ago

This doesn't block "domperf", because this isn't an actionable bug.

No longer blocks: domperf

Trev

Comment 138

•

14 years ago

Can this block Peacekeeper (bug #499198)? Test 15 tests the speed of getElementById and getElementsByTagName and FF is slower than Chrome at it. Or should I file another bug with the test case from Peacekeeper?

Boris Zbarsky [:bzbarsky]

Comment 139

•

14 years ago

Probably better to file specific bug for specific issues. See comment 135.

u572625

Comment 140

•

8 years ago

url down

BugBot [:suhaib / :marco/ :calixte]

Updated

•

5 years ago

Summary: DOM Performance Issues → [meta] DOM Performance Issues

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

5 years ago

Priority: P2 → P3

Sylvestre Ledru [:Sylvestre]

Comment 141

•

3 years ago

Meta bug with no activity for years (11 in this case), closing.

Status: REOPENED → RESOLVED

Closed: 16 years ago → 3 years ago

Resolution: --- → FIXED

10+ times speedup of nsScriptNameSpaceManager::LookupName() 23 years ago Johnny Stenback (:jst) (deleted), patch		Details \| Diff \| Splinter Review
work in progress on HashString and nsCRT::HashCode improvements 23 years ago David Baron :dbaron: (deleted), patch		Details \| Diff \| Splinter Review
Fixes all Brendan's issues and includes the new HashCode and HashString code. 23 years ago Johnny Stenback (:jst) (deleted), patch	peterv : review+ brendan : superreview+	Details \| Diff \| Splinter Review
proposed fix for JS/PLDHashInitEntry typedef's bogus const entry ptr 23 years ago Brendan Eich [:brendan] (deleted), patch		Details \| Diff \| Splinter Review
Complete patch including peterv's nit fixes and the jsdhash/pldhash const fixes. 23 years ago Johnny Stenback (:jst) (deleted), patch		Details \| Diff \| Splinter Review
Fix bogus assertion and remove unecessary null check (diff -w) 23 years ago Johnny Stenback (:jst) (deleted), patch		Details \| Diff \| Splinter Review
4x speedup to nsHTMLDocument::ResolveName() and ::GetElementById(). 23 years ago Johnny Stenback (:jst) (deleted), patch		Details \| Diff \| Splinter Review
Above patch as diff -w 23 years ago Johnny Stenback (:jst) (deleted), patch	hjtoi-bugzilla : review+	Details \| Diff \| Splinter Review
Final patch for this round 23 years ago Johnny Stenback (:jst) (deleted), patch	jst : review+ jst : superreview+	Details \| Diff \| Splinter Review
jprof of new testcase, text version, with text "Test" 23 years ago Randell Jesup [:jesup] (needinfo me) (deleted), text/html		Details
50% speedup of the getElementById() testcase... 23 years ago Johnny Stenback (:jst) (deleted), patch		Details \| Diff \| Splinter Review
50% speedup of the getElementById() testcase, security problem fixed... 23 years ago Johnny Stenback (:jst) (deleted), patch	peterv : review+ vidur : superreview+	Details \| Diff \| Splinter Review
Same as above but for the 1.0 branch and w/o the security related change 23 years ago Johnny Stenback (:jst) (deleted), patch	jesup : review+ jesup : superreview+ jesup : approval+	Details \| Diff \| Splinter Review