Closed Bug 490695 Opened 15 years ago Closed 15 years ago

implement cycle collector's purple buffer with entries that can be cheaply removed from object in buffer

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla1.9.1

People

(Reporter: dbaron, Assigned: dbaron)

References

Details

(Keywords: fixed1.9.1, perf)

Attachments

(2 files, 3 obsolete files)

patch 15 years ago David Baron :dbaron: (deleted), patch		Details \| Diff \| Splinter Review
patch 15 years ago David Baron :dbaron: (deleted), patch		Details \| Diff \| Splinter Review
patch 15 years ago David Baron :dbaron: (deleted), patch		Details \| Diff \| Splinter Review
performance results from try server talos 15 years ago David Baron :dbaron: (deleted), text/plain; charset=UTF-8		Details
patch 15 years ago David Baron :dbaron: (deleted), patch	peterv : review+ benjamin : review+ peterv : superreview+ beltzner : approval1.9.1+	Details \| Diff \| Splinter Review

David Baron :dbaron:

Assignee

Description

•

15 years ago

pldhash's SearchTable called from within nsCycleCollector::Suspect and Forget shows up in profiles a bit.  We can fix this by replacing the purple buffer with a circularly linked list:

 * Each entry in the list has:
   + pointer to object
   + next and prev entry pointers
   + object's refcount

 * nsCycleCollectingAutoRefcnt replaces its mValue with a tagged value:  when untagged it's a reference count, when tagged it's a pointer to the entry in the purple buffer list, which contains the reference count

Boris Zbarsky [:bzbarsky]

Comment 1

•

15 years ago

> When untagged it's a reference count

I'd think we want it the other way around.  If the tagged value is the refcount, then we start with a test; if it's tagged we just need to shift to get the value, and if not we directly deref the pointer.  If the tagged value is the pointer, we still need to shift in the untagged case, but in the tagged case we have to mask and deref.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 2

•

15 years ago

Do we perhaps want to recycle entries in the list? And/or use arena allocation for
them?

Graydon Hoare :graydon

Comment 3

•

15 years ago

Huh. I was actually part way through writing up a task-specific container for the CC -- a task-specific digital trie -- but this approach sounds faster still. At least for the purple buffer.

(Perhaps that's the only place the speed of pointer-based associative lookup counts? Maybe both..)

(In reply to comment #2)
> Do we perhaps want to recycle entries in the list? And/or use arena allocation
> for them?

I think we'd want a freelist of cells next to the active list -- pointers enter and leave the purple buffer at very high speed, constantly -- but I doubt there'd be a benefit from an arena. I don't imagine the resident-count is terribly bursty. I imagine it's reasonably-near-average for a long while, and then drops to zero during collection, and then quickly trends back up to near-average. 

But of course, real measurements are the only way to tell.

Boris Zbarsky [:bzbarsky]

Comment 4

•

15 years ago

> I don't imagine the resident-count is terribly bursty.

I think it actually is....  At least if I increment a counter in Put() and decrement it in Remove() and log the high-water mark value, I get something like 600k or so on the testcase I used for my profiling talk.  That testcase has aroung 1500k total Put() calls.  I suspect the issue is that each DOM node we work with ends up suspect when we're done with it, since the last thing that happens is a release.  Then we cc sometime after we're done running the testcase.

David Baron :dbaron:

Assignee

Comment 5

•

15 years ago

Attached patch patch (obsolete) (deleted) — Details — Splinter Review

Here's a patch that's getting close to something I'm happy with.

I'm unhappy with the fact that we need to worry about binary compatibility here; nsXPCOMCycleCollectionParticipant is effectively frozen, but I think the changes that that forced me to undo probably were better off undone anyway.  This leaves a chunk of old code for backwards compatibility with users of old XPCOM glue that I'm really not sure will get enough testing going forward, though.

Bruce Hoult

Comment 6

•

15 years ago

next and prev pointers seem like a waste of space and time to update.  Order in
the list is not important, right?  In which case just put the pointer to Obj
and refcount into a dense array that you can iterate through during cycle
collection. When you delete something put it on a freelist, when you allocate
take it from the freelist if non-empty, otherwise extend the array.  (the array
could be in chunks with all items in a chunk initially added to the freelist
when the chunk is allocated. Obviously you then need some list of the chunks)

If it turns out that a large fraction of all objects are in the purple buffer
at some point in their life then it would make the code a lot simpler and
faster to always keep the refcount in the object itself, and just make each
object 4 bytes larger. That will also save a pointer chase just find find the
refcount (not to mention conditional code to decide where the refcount is)

David Baron :dbaron:

Assignee

Comment 7

•

15 years ago

Hmmm.  I've written the patch before seeing these comments.

(In reply to comment #1)
> > When untagged it's a reference count
> 
> I'd think we want it the other way around.  If the tagged value is the
> refcount, then we start with a test; if it's tagged we just need to shift to
> get the value, and if not we directly deref the pointer.  If the tagged value
> is the pointer, we still need to shift in the untagged case, but in the tagged
> case we have to mask and deref.

I came to the same conclusion without seeing your comment. :-)

(In reply to comment #2)
> Do we perhaps want to recycle entries in the list? And/or use arena allocation
> for
> them?

I allocated them in blocks, but decided to free the blocks each time we empty the list for a collection since I suspect using freelists for a long time is bad for memory locality.  I haven't measured anything here yet, though.

David Baron :dbaron:

Assignee

Comment 8

•

15 years ago

(In reply to comment #6)
> next and prev pointers seem like a waste of space and time to update.  Order in
> the list is not important, right?  In which case just put the pointer to Obj
> and refcount into a dense array that you can iterate through during cycle
> collection.

Yeah, I think you're probably right.  I suspect it's rare that the purple buffer shrinks substantially from its maximum size between collections before the next collection happens.

David Baron :dbaron:

Assignee

Comment 9

•

15 years ago

Attached patch patch (obsolete) (deleted) — Details — Splinter Review

Here's a patch using arrays rather than a linked list (we just step over the items on the free list when we enumerate).

I might have performance comparison numbers from the try server by... well, tomorrow morning, most likely.

Attachment #375239 - Attachment is obsolete: true

David Baron :dbaron:

Assignee

Comment 10

•

15 years ago

There's something wrong with the second patch (but not the first) that makes it leak.

David Baron :dbaron:

Assignee

Comment 11

•

15 years ago

Attached patch patch (obsolete) (deleted) — Details — Splinter Review

Corrected version of previous patch, in which I actually link new blocks into the list when I allocate them.

Attachment #375255 - Attachment is obsolete: true

David Baron :dbaron:

Assignee

Comment 12

•

15 years ago

Attached file performance results from try server talos (deleted) — Details

Performance results from try server talos show that:

 (1) [Based on the 2 runs Friday morning] This patch makes Twinopen about 2-3% faster.

 (2) [Based on the 2 runs Thursday] the array might be slightly faster than the linked list, although that could have been confounded by the leak of Blocks in the linked list run.  (8 of the 10 runs were faster for Twinopen)

 (3) [Based on 2 runs Friday morning] It looks like this makes Tp slightly faster, but not much.  (8 of 10 runs were faster with the patch than without; the 2 that weren't were the 2 Vista runs)

David Baron :dbaron:

Assignee

Updated

•

15 years ago

Summary: implement cycle collector's purple buffer using circularly linked list → implement cycle collector's purple buffer with entries that can be cheaply removed from object in buffer

David Baron :dbaron:

Assignee

Updated

•

15 years ago

Blocks: 470684

David Baron :dbaron:

Assignee

Updated

•

15 years ago

Blocks: 377787

David Baron :dbaron:

Assignee

Comment 13

•

15 years ago

Attached patch patch (deleted) — Details — Splinter Review

Polished up a few things, and now requesting review.

Attachment #375302 - Attachment is obsolete: true

Attachment #375735 - Flags: superreview?(peterv)

Attachment #375735 - Flags: review?(peterv)

David Baron :dbaron:

Assignee

Comment 14

•

15 years ago

Comment on attachment 375735 [details] [diff] [review]
patch

And requesting additional review from Benjamin regarding the XPCOM glue code and the frozenness of things (including nsPurpleBufferEntry).  Also note that we might want to consider this for 1.9.1.

Attachment #375735 - Flags: review?(benjamin)

Benjamin Smedberg

Comment 15

•

15 years ago

Comment on attachment 375735 [details] [diff] [review]
patch

Ugh... maybe we should have hidden CC from the glue altogether: I forget why we added it in the first place.

Attachment #375735 - Flags: review?(benjamin) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 16

•

15 years ago

(In reply to comment #15)
> I forget why we
> added it in the first place.
Maybe because it is useful for binary extensions.

Peter Van der Beken [:peterv]

Comment 17

•

15 years ago

Comment on attachment 375735 [details] [diff] [review]
patch

>diff --git a/xpcom/base/nsCycleCollector.cpp b/xpcom/base/nsCycleCollector.cpp

> nsPurpleBuffer::NoteAll(GCGraphBuilder &builder)
> {
>-    SpillAll();
>-    mBackingStore.Enumerate(noteAllCallback, &builder);
>+    mCompatObjects.EnumerateEntries(selectionCallback, &builder);

s/selectionCallback/noteAllCallback/

>@@ -2164,19 +2188,18 @@ nsCycleCollector::Suspect(nsISupports *n

>+    // Caller is responsible for filling in result's mRefCnt.

This isn't needed for compat objects, right?

>+    return mPurpleBuf.PutCompatObject(n);

>diff --git a/xpcom/glue/nsISupportsImpl.h b/xpcom/glue/nsISupportsImpl.h

>   nsrefcnt decr(nsISupports *owner)

>+      if (NS_UNLIKELY(refcount == 0)) {
>+        if (NS_UNLIKELY(!NS_CycleCollectorForget2(e))) {
>+          NS_NOTREACHED("forget should not fail when reference count hits 0");
>+        }
>+        mTagged = NS_CCAR_REFCNT_TO_TAGGED(refcount);

or mTagged = NS_CCAR_TAGGED_STABILIZED_REFCNT?

Attachment #375735 - Flags: superreview?(peterv)

Attachment #375735 - Flags: superreview+

Attachment #375735 - Flags: review?(peterv)

Attachment #375735 - Flags: review+

David Baron :dbaron:

Assignee

Comment 18

•

15 years ago

(In reply to comment #17)
> >+      if (NS_UNLIKELY(refcount == 0)) {
> >+        if (NS_UNLIKELY(!NS_CycleCollectorForget2(e))) {
> >+          NS_NOTREACHED("forget should not fail when reference count hits 0");
> >+        }
> >+        mTagged = NS_CCAR_REFCNT_TO_TAGGED(refcount);
> 
> or mTagged = NS_CCAR_TAGGED_STABILIZED_REFCNT?

I'd rather keep the stabilizeForDeletion logic separate, as it was before.  (These values are actually different; the stabilized refcount is PURPLE_ENTRY_TO_TAGGED(0) rather than REFCNT_TO_TAGGED(0).)

Other comments addressed, though.

Peter Van der Beken [:peterv]

Comment 19

•

15 years ago

Fine by me.

David Baron :dbaron:

Assignee

Comment 20

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/f152b230cd48

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Priority: -- → P3

Resolution: --- → FIXED

David Baron :dbaron:

Assignee

Comment 21

•

15 years ago

Txul improvement is visible on the graphs for the more sensitive machines:

http://graphs-new.mozilla.org/#tests=[{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2251%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2252%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2253%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2254%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2273%22},{%22test%22:%2217%22,%22branch%22:%221%22,%22machine%22:%2258%22}]&sel=1241538147,1241710862

David Baron :dbaron:

Assignee

Comment 22

•

15 years ago

There's also a Tp3 improvement (around 1%) visible in the graphs for all platforms other than Vista (which is way too noisy to detect a <5% change).

Jesse Ruderman

Updated

•

15 years ago

Depends on: 492324

David Baron :dbaron:

Assignee

Comment 23

•

15 years ago

Comment on attachment 375735 [details] [diff] [review]
patch

This gets a measurable somewhat-across-the-board performance improvement, most noticeable on Twinopen and probably some "DHTML" testcases.

Attachment #375735 - Flags: approval1.9.1?

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

15 years ago

Attachment #375735 - Flags: approval1.9.1? → approval1.9.1+

Mike Beltzner [:beltzner, not reading bugmail]

Comment 24

•

15 years ago

Comment on attachment 375735 [details] [diff] [review]
patch

a191=beltzner

Mike Beltzner [:beltzner, not reading bugmail]

Comment 25

•

15 years ago

(please land this today so we can have some extra time to shake it out for topcrashes)

David Baron :dbaron:

Assignee

Comment 26

•

15 years ago

I'm unable to land it today, but I will probably be able to land it tomorrow morning EDT (UTC-4).  If somebody else could land it today that would be helpful.

Keywords: checkin-needed

David Baron :dbaron:

Assignee

Comment 27

•

15 years ago

(I'll have it merged by this evening if somebody else can watch the tree for me, though.)

David Baron :dbaron:

Assignee

Comment 28

•

15 years ago

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0866be0a6600

(I still need to fix up the comments about version numbers in the XPCOM frozen functions, on both branches, to reflect that this landed on 1.9.1.)

Keywords: checkin-needed → fixed1.9.1

Target Milestone: mozilla1.9.2a1 → mozilla1.9.1

David Baron :dbaron:

Assignee

Comment 29

•

15 years ago

Comments fixed:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/db5055c04dff
http://hg.mozilla.org/mozilla-central/rev/8ff39c36bd2f

David Baron :dbaron:

Assignee

Comment 30

•

15 years ago

Yesterday afternoon I backed this out temporarily from 1.9.1 and then relanded it as part of the investigation of bug 494255.  It looks like it did not cause that bug; however, it appears that this patch did hurt some performance numbers on Windows (Tdhtml on XP and Vista; Tp3 on XP) while helping them on Linux and Mac (Txul and Tdhtml (and maybe also Tp3) on Linux, Txul on Mac OS X 10.4 and 10.5, Tp3 on Mac OS X 10.4)

Perhaps that means I should back it out permanently, though I really don't see why it would cause a regression.

You need to log in before you can comment on or make changes to this bug.