Closed
Bug 490695
Opened 15 years ago
Closed 15 years ago
implement cycle collector's purple buffer with entries that can be cheaply removed from object in buffer
Categories
(Core :: XPCOM, defect, P3)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla1.9.1
People
(Reporter: dbaron, Assigned: dbaron)
References
Details
(Keywords: fixed1.9.1, perf)
Attachments
(2 files, 3 obsolete files)
(deleted),
text/plain; charset=UTF-8
|
Details | |
(deleted),
patch
|
peterv
:
review+
benjamin
:
review+
peterv
:
superreview+
beltzner
:
approval1.9.1+
|
Details | Diff | Splinter Review |
pldhash's SearchTable called from within nsCycleCollector::Suspect and Forget shows up in profiles a bit. We can fix this by replacing the purple buffer with a circularly linked list: * Each entry in the list has: + pointer to object + next and prev entry pointers + object's refcount * nsCycleCollectingAutoRefcnt replaces its mValue with a tagged value: when untagged it's a reference count, when tagged it's a pointer to the entry in the purple buffer list, which contains the reference count
![]() |
||
Comment 1•15 years ago
|
||
> When untagged it's a reference count
I'd think we want it the other way around. If the tagged value is the refcount, then we start with a test; if it's tagged we just need to shift to get the value, and if not we directly deref the pointer. If the tagged value is the pointer, we still need to shift in the untagged case, but in the tagged case we have to mask and deref.
Comment 2•15 years ago
|
||
Do we perhaps want to recycle entries in the list? And/or use arena allocation for them?
Comment 3•15 years ago
|
||
Huh. I was actually part way through writing up a task-specific container for the CC -- a task-specific digital trie -- but this approach sounds faster still. At least for the purple buffer. (Perhaps that's the only place the speed of pointer-based associative lookup counts? Maybe both..) (In reply to comment #2) > Do we perhaps want to recycle entries in the list? And/or use arena allocation > for them? I think we'd want a freelist of cells next to the active list -- pointers enter and leave the purple buffer at very high speed, constantly -- but I doubt there'd be a benefit from an arena. I don't imagine the resident-count is terribly bursty. I imagine it's reasonably-near-average for a long while, and then drops to zero during collection, and then quickly trends back up to near-average. But of course, real measurements are the only way to tell.
![]() |
||
Comment 4•15 years ago
|
||
> I don't imagine the resident-count is terribly bursty.
I think it actually is.... At least if I increment a counter in Put() and decrement it in Remove() and log the high-water mark value, I get something like 600k or so on the testcase I used for my profiling talk. That testcase has aroung 1500k total Put() calls. I suspect the issue is that each DOM node we work with ends up suspect when we're done with it, since the last thing that happens is a release. Then we cc sometime after we're done running the testcase.
Assignee | ||
Comment 5•15 years ago
|
||
Here's a patch that's getting close to something I'm happy with. I'm unhappy with the fact that we need to worry about binary compatibility here; nsXPCOMCycleCollectionParticipant is effectively frozen, but I think the changes that that forced me to undo probably were better off undone anyway. This leaves a chunk of old code for backwards compatibility with users of old XPCOM glue that I'm really not sure will get enough testing going forward, though.
Comment 6•15 years ago
|
||
next and prev pointers seem like a waste of space and time to update. Order in the list is not important, right? In which case just put the pointer to Obj and refcount into a dense array that you can iterate through during cycle collection. When you delete something put it on a freelist, when you allocate take it from the freelist if non-empty, otherwise extend the array. (the array could be in chunks with all items in a chunk initially added to the freelist when the chunk is allocated. Obviously you then need some list of the chunks) If it turns out that a large fraction of all objects are in the purple buffer at some point in their life then it would make the code a lot simpler and faster to always keep the refcount in the object itself, and just make each object 4 bytes larger. That will also save a pointer chase just find find the refcount (not to mention conditional code to decide where the refcount is)
Assignee | ||
Comment 7•15 years ago
|
||
Hmmm. I've written the patch before seeing these comments. (In reply to comment #1) > > When untagged it's a reference count > > I'd think we want it the other way around. If the tagged value is the > refcount, then we start with a test; if it's tagged we just need to shift to > get the value, and if not we directly deref the pointer. If the tagged value > is the pointer, we still need to shift in the untagged case, but in the tagged > case we have to mask and deref. I came to the same conclusion without seeing your comment. :-) (In reply to comment #2) > Do we perhaps want to recycle entries in the list? And/or use arena allocation > for > them? I allocated them in blocks, but decided to free the blocks each time we empty the list for a collection since I suspect using freelists for a long time is bad for memory locality. I haven't measured anything here yet, though.
Assignee | ||
Comment 8•15 years ago
|
||
(In reply to comment #6) > next and prev pointers seem like a waste of space and time to update. Order in > the list is not important, right? In which case just put the pointer to Obj > and refcount into a dense array that you can iterate through during cycle > collection. Yeah, I think you're probably right. I suspect it's rare that the purple buffer shrinks substantially from its maximum size between collections before the next collection happens.
Assignee | ||
Comment 9•15 years ago
|
||
Here's a patch using arrays rather than a linked list (we just step over the items on the free list when we enumerate). I might have performance comparison numbers from the try server by... well, tomorrow morning, most likely.
Attachment #375239 -
Attachment is obsolete: true
Assignee | ||
Comment 10•15 years ago
|
||
There's something wrong with the second patch (but not the first) that makes it leak.
Assignee | ||
Comment 11•15 years ago
|
||
Corrected version of previous patch, in which I actually link new blocks into the list when I allocate them.
Attachment #375255 -
Attachment is obsolete: true
Assignee | ||
Comment 12•15 years ago
|
||
Performance results from try server talos show that: (1) [Based on the 2 runs Friday morning] This patch makes Twinopen about 2-3% faster. (2) [Based on the 2 runs Thursday] the array might be slightly faster than the linked list, although that could have been confounded by the leak of Blocks in the linked list run. (8 of the 10 runs were faster for Twinopen) (3) [Based on 2 runs Friday morning] It looks like this makes Tp slightly faster, but not much. (8 of 10 runs were faster with the patch than without; the 2 that weren't were the 2 Vista runs)
Assignee | ||
Updated•15 years ago
|
Summary: implement cycle collector's purple buffer using circularly linked list → implement cycle collector's purple buffer with entries that can be cheaply removed from object in buffer
Assignee | ||
Comment 13•15 years ago
|
||
Polished up a few things, and now requesting review.
Attachment #375302 -
Attachment is obsolete: true
Attachment #375735 -
Flags: superreview?(peterv)
Attachment #375735 -
Flags: review?(peterv)
Assignee | ||
Comment 14•15 years ago
|
||
Comment on attachment 375735 [details] [diff] [review] patch And requesting additional review from Benjamin regarding the XPCOM glue code and the frozenness of things (including nsPurpleBufferEntry). Also note that we might want to consider this for 1.9.1.
Attachment #375735 -
Flags: review?(benjamin)
Comment 15•15 years ago
|
||
Comment on attachment 375735 [details] [diff] [review] patch Ugh... maybe we should have hidden CC from the glue altogether: I forget why we added it in the first place.
Attachment #375735 -
Flags: review?(benjamin) → review+
Comment 16•15 years ago
|
||
(In reply to comment #15) > I forget why we > added it in the first place. Maybe because it is useful for binary extensions.
Comment 17•15 years ago
|
||
Comment on attachment 375735 [details] [diff] [review] patch >diff --git a/xpcom/base/nsCycleCollector.cpp b/xpcom/base/nsCycleCollector.cpp > nsPurpleBuffer::NoteAll(GCGraphBuilder &builder) > { >- SpillAll(); >- mBackingStore.Enumerate(noteAllCallback, &builder); >+ mCompatObjects.EnumerateEntries(selectionCallback, &builder); s/selectionCallback/noteAllCallback/ >@@ -2164,19 +2188,18 @@ nsCycleCollector::Suspect(nsISupports *n >+ // Caller is responsible for filling in result's mRefCnt. This isn't needed for compat objects, right? >+ return mPurpleBuf.PutCompatObject(n); >diff --git a/xpcom/glue/nsISupportsImpl.h b/xpcom/glue/nsISupportsImpl.h > nsrefcnt decr(nsISupports *owner) >+ if (NS_UNLIKELY(refcount == 0)) { >+ if (NS_UNLIKELY(!NS_CycleCollectorForget2(e))) { >+ NS_NOTREACHED("forget should not fail when reference count hits 0"); >+ } >+ mTagged = NS_CCAR_REFCNT_TO_TAGGED(refcount); or mTagged = NS_CCAR_TAGGED_STABILIZED_REFCNT?
Attachment #375735 -
Flags: superreview?(peterv)
Attachment #375735 -
Flags: superreview+
Attachment #375735 -
Flags: review?(peterv)
Attachment #375735 -
Flags: review+
Assignee | ||
Comment 18•15 years ago
|
||
(In reply to comment #17) > >+ if (NS_UNLIKELY(refcount == 0)) { > >+ if (NS_UNLIKELY(!NS_CycleCollectorForget2(e))) { > >+ NS_NOTREACHED("forget should not fail when reference count hits 0"); > >+ } > >+ mTagged = NS_CCAR_REFCNT_TO_TAGGED(refcount); > > or mTagged = NS_CCAR_TAGGED_STABILIZED_REFCNT? I'd rather keep the stabilizeForDeletion logic separate, as it was before. (These values are actually different; the stabilized refcount is PURPLE_ENTRY_TO_TAGGED(0) rather than REFCNT_TO_TAGGED(0).) Other comments addressed, though.
Comment 19•15 years ago
|
||
Fine by me.
Assignee | ||
Comment 20•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/f152b230cd48
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Priority: -- → P3
Resolution: --- → FIXED
Assignee | ||
Comment 21•15 years ago
|
||
Txul improvement is visible on the graphs for the more sensitive machines: http://graphs-new.mozilla.org/#tests=[{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2251%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2252%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2253%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2254%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2273%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2258%22}]&sel=1241538147,1241710862
Assignee | ||
Comment 22•15 years ago
|
||
There's also a Tp3 improvement (around 1%) visible in the graphs for all platforms other than Vista (which is way too noisy to detect a <5% change).
Assignee | ||
Comment 23•15 years ago
|
||
Comment on attachment 375735 [details] [diff] [review] patch This gets a measurable somewhat-across-the-board performance improvement, most noticeable on Twinopen and probably some "DHTML" testcases.
Attachment #375735 -
Flags: approval1.9.1?
Updated•15 years ago
|
Attachment #375735 -
Flags: approval1.9.1? → approval1.9.1+
Comment 24•15 years ago
|
||
Comment on attachment 375735 [details] [diff] [review] patch a191=beltzner
Comment 25•15 years ago
|
||
(please land this today so we can have some extra time to shake it out for topcrashes)
Assignee | ||
Comment 26•15 years ago
|
||
I'm unable to land it today, but I will probably be able to land it tomorrow morning EDT (UTC-4). If somebody else could land it today that would be helpful.
Keywords: checkin-needed
Assignee | ||
Comment 27•15 years ago
|
||
(I'll have it merged by this evening if somebody else can watch the tree for me, though.)
Assignee | ||
Comment 28•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0866be0a6600 (I still need to fix up the comments about version numbers in the XPCOM frozen functions, on both branches, to reflect that this landed on 1.9.1.)
Keywords: checkin-needed → fixed1.9.1
Target Milestone: mozilla1.9.2a1 → mozilla1.9.1
Assignee | ||
Comment 29•15 years ago
|
||
Comments fixed: http://hg.mozilla.org/releases/mozilla-1.9.1/rev/db5055c04dff http://hg.mozilla.org/mozilla-central/rev/8ff39c36bd2f
Assignee | ||
Comment 30•15 years ago
|
||
Yesterday afternoon I backed this out temporarily from 1.9.1 and then relanded it as part of the investigation of bug 494255. It looks like it did not cause that bug; however, it appears that this patch did hurt some performance numbers on Windows (Tdhtml on XP and Vista; Tp3 on XP) while helping them on Linux and Mac (Txul and Tdhtml (and maybe also Tp3) on Linux, Txul on Mac OS X 10.4 and 10.5, Tp3 on Mac OS X 10.4) Perhaps that means I should back it out permanently, though I really don't see why it would cause a regression.
You need to log in
before you can comment on or make changes to this bug.
Description
•