Closed Bug 1275755 Opened 9 years ago Closed 8 years ago

Make Atoms threadsafe

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla49

Tracking Flags:

Tracking

Status

firefox49

---

fixed

People

(Reporter: bholley, Assigned: bholley)

References

(Blocks 2 open bugs)

Details

Attachments

(6 files, 3 obsolete files)

Make atoms refcountable off-main-thread. v1 9 years ago Bobby Holley (:bholley) (deleted), patch		Details \| Diff \| Splinter Review
Part 1 - Remove MOZ_DUMP_ATOM_LEAKS. v1 9 years ago Bobby Holley (:bholley) (deleted), patch	froydnj : review+	Details \| Diff \| Splinter Review
Part 2 - Use an explicit init routine for the atom table. v1 9 years ago Bobby Holley (:bholley) (deleted), patch	froydnj : review+	Details \| Diff \| Splinter Review
Part 3 - Protect gAtomTable with a lock. v1 9 years ago Bobby Holley (:bholley) (deleted), patch	froydnj : review+	Details \| Diff \| Splinter Review
Part 4 - Use threadsafe refcounting for atoms and remove main-thread restrictions. v1 9 years ago Bobby Holley (:bholley) (deleted), patch		Details \| Diff \| Splinter Review
Part 4 - Use a GC scheme to free unused atoms. v1 9 years ago Bobby Holley (:bholley) (deleted), patch		Details \| Diff \| Splinter Review
Part 5 - Remove main-thread restrictions on atoms. v1 9 years ago Bobby Holley (:bholley) (deleted), patch	froydnj : review+	Details \| Diff \| Splinter Review
Part 4 - Use a GC scheme to free unused atoms. v2 8 years ago Bobby Holley (:bholley) (deleted), patch	froydnj : review+	Details \| Diff \| Splinter Review
set/removeAttribute microbenchmark 8 years ago Olli Pettay [:smaug][bugs@pettay.fi] (deleted), text/html		Details

Bobby Holley (:bholley)

Assignee

Description

•

9 years ago

This was removed in bug 387445, on the grounds that the atom table isn't threadsafe, and so atoms shouldn't be refcounted off the main thread either. For stylo, we may be able to get away with doing any actual atomization off-main-thread, but we definitely need to refcount the atoms. This is a relatively small and uninvasive change. Unittest run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b369c7269884 Talos run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b62e64a76ee&selectedJob=21455153 [1] I consulted with jmaher about the Talos results and he confirmed that it looked fine. [1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=5511d54a3f172c1d68f98cc55dce4de1d0ba1b51&newProject=try&newRevision=3b62e64a76ee&framework=1&showOnlyImportant=0

Bobby Holley (:bholley)

Assignee

Comment 1

•

9 years ago

Attached patch Make atoms refcountable off-main-thread. v1 (obsolete) (deleted) — Details — Splinter Review

Attachment #8756603 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Updated

•

9 years ago

Blocks: 1275766

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 2

•

9 years ago

How are you planning on handling the last release of an atom happening off the main thread?

Bobby Holley (:bholley)

Assignee

Comment 3

•

9 years ago

(In reply to Jonas Sicking (:sicking) from comment #2) > How are you planning on handling the last release of an atom happening off > the main thread? See the patch.

Bobby Holley (:bholley)

Assignee

Comment 4

•

9 years ago

Though now I realize there's a race in there if somebody else grabs the entry in the mean time. Let me see if I can fix that.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 5

•

9 years ago

Any measurements whether this affects to microbenchmarks? (I just optimized out some atom getting because it was too slow, but sure, that was less about atom addref/release and more about hashtable usage.)

Bobby Holley (:bholley)

Assignee

Comment 6

•

9 years ago

(In reply to Olli Pettay [:smaug] from comment #5) > Any measurements whether this affects to microbenchmarks? Not really. > (I just optimized out some atom getting because it was too slow, but sure, > that was less about atom addref/release and more about hashtable usage.) Yeah, I'm pretty sure the hash table lookup would dominate. And realistically, of the only impact of this was a microbenchmark, we'd probably still want to land it.

Bobby Holley (:bholley)

Assignee

Updated

•

9 years ago

Attachment #8756603 - Flags: review?(nfroyd)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 7

•

9 years ago

Sure, but we do optimize rather heavily even for microbenchmarks in some cases. Bug 1273481 as an example.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 8

•

9 years ago

In other words, would be just good to know how much this affects in some microbenchmark. Not objecting the change or anything.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 9

•

9 years ago

A silly test which addrefs/releases atoms quite a bit data:text/html,<script>window.mo = new MutationObserver(function(){}); function test() { for (var i = 0; i < 10000; ++i) { mo.observe(document, { attributes: true, attributeFilter: ["a", "b", "c", "d", "e", "f", "g", "h"]}); } console.log(new Date()); }; for (var i = 0; i < 100; ++i) { setTimeout(test, i * 50); }</script>

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 10

•

9 years ago

(that console.log() is just random debug output to see that something is happening.)

Bobby Holley (:bholley)

Assignee

Comment 11

•

9 years ago

Discussed this a bit with sicking. We decided that it's probably worth just making atoms threadsafe in general, since it would make it easier to write code in Workers. I have some patches which I'm running through try. They make the microbenchmark in comment 9 about 5% slower on average, which is almost certainly acceptable.

Summary: Give Atoms threadsafe refcounting → Make Atoms threadsafe

Bobby Holley (:bholley)

Assignee

Updated

•

9 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=387445

Bobby Holley (:bholley)

Assignee

Comment 12

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b369c7269884 https://treeherder.mozilla.org/#/jobs?repo=try&revision=19e3dff4870e&selectedJob=21481765 https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=5511d54a3f17&newProject=try&newRevision=19e3dff4870e&framework=1

Bobby Holley (:bholley)

Assignee

Comment 13

•

9 years ago

Attached patch Part 1 - Remove MOZ_DUMP_ATOM_LEAKS. v1 (deleted) — Details — Splinter Review

I don't think anyone is using this anymore. It would be good to assert that there are no leaks, but that doesn't pass for me in a local build, and I don't have time to chase it.

Attachment #8756711 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Comment 14

•

9 years ago

Attached patch Part 2 - Use an explicit init routine for the atom table. v1 (deleted) — Details — Splinter Review

Attachment #8756712 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Comment 15

•

9 years ago

Attached patch Part 3 - Protect gAtomTable with a lock. v1 (deleted) — Details — Splinter Review

Attachment #8756713 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Comment 16

•

9 years ago

Attached patch Part 4 - Use threadsafe refcounting for atoms and remove main-thread restrictions. v1 (obsolete) (deleted) — Details — Splinter Review

Attachment #8756714 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Updated

•

9 years ago

Attachment #8756603 - Attachment is obsolete: true

Bobby Holley (:bholley)

Assignee

Comment 17

•

9 years ago

Comment on attachment 8756714 [details] [diff] [review] Part 4 - Use threadsafe refcounting for atoms and remove main-thread restrictions. v1 Review of attachment 8756714 [details] [diff] [review]: ----------------------------------------------------------------- Ugh, there's still a race here, between the threadsafe refcount decrement and the acquisition of the lock to destroy. I'll think about this some more.

Attachment #8756714 - Flags: review?(nfroyd)

Nathan Froyd [:froydnj]

Comment 18

•

9 years ago

Comment on attachment 8756711 [details] [diff] [review] Part 1 - Remove MOZ_DUMP_ATOM_LEAKS. v1 Review of attachment 8756711 [details] [diff] [review]: ----------------------------------------------------------------- Works for me.

Attachment #8756711 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Updated

•

9 years ago

Attachment #8756712 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Updated

•

9 years ago

Attachment #8756713 - Flags: review?(nfroyd) → review+

Bobby Holley (:bholley)

Assignee

Comment 19

•

9 years ago

As for the last patch - I talked about it with bz and smaug on IRC, and the current plan is to try a GC mechanism whereby AddRef/Release just twiddle refcounts, and we periodically call into the (locked) atom table to evict entries whose refcount is zero. We can probably keep a counter on the number of times we've Release()-ed and found refcount zero, and perform a GC when that counter gets high (and on debug-build shutdown). We can probably make it really high and GC pretty infrequently - waiting for 10000 possibly-dead atoms may be fine (with a smaller threshold in debug builds to make sure the code gets exercised). I may not get to this before PTO though.

Bobby Holley (:bholley)

Assignee

Comment 20

•

9 years ago

Ok, I managed to get this working this afternoon. Overhead on the microbenchmark is about 10-12% (presumably from the atomics usage), which is probably acceptable. I'll post patches.

Bobby Holley (:bholley)

Assignee

Comment 21

•

9 years ago

Attached patch Part 4 - Use a GC scheme to free unused atoms. v1 (obsolete) (deleted) — Details — Splinter Review

Please review carefully.

Attachment #8757097 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Updated

•

9 years ago

Attachment #8756714 - Attachment is obsolete: true

Bobby Holley (:bholley)

Assignee

Comment 22

•

9 years ago

Attached patch Part 5 - Remove main-thread restrictions on atoms. v1 (deleted) — Details — Splinter Review

Attachment #8757098 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Comment 23

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fff250b1689d

Bobby Holley (:bholley)

Assignee

Comment 24

•

9 years ago

https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=fff250b1689d

Bobby Holley (:bholley)

Assignee

Comment 25

•

8 years ago

Looks like we also run leak checking in ASAN builds, so need to GC there on shutdown as well. https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d8b71f5a8ef

Bobby Holley (:bholley)

Assignee

Comment 26

•

8 years ago

Attached patch Part 4 - Use a GC scheme to free unused atoms. v2 (deleted) — Details — Splinter Review

Attachment #8757288 - Flags: review?(nfroyd)

Bobby Holley (:bholley)

Assignee

Updated

•

8 years ago

Attachment #8757097 - Attachment is obsolete: true

Attachment #8757097 - Flags: review?(nfroyd)

Andrew McCreight [:mccr8]

Comment 27

•

8 years ago

Please use NS_FREE_PERMANENT_DATA instead of defined(DEBUG) || defined(MOZ_ASAN). (Currently it is defined to be the same thing, but it conveys the intent better, and will ensure this keeps working in case we add some other testing that cares about leaks.)

Bobby Holley (:bholley)

Assignee

Comment 28

•

8 years ago

(In reply to Andrew McCreight [:mccr8] from comment #27) > Please use NS_FREE_PERMANENT_DATA instead of defined(DEBUG) || > defined(MOZ_ASAN). (Currently it is defined to be the same thing, but it > conveys the intent better, and will ensure this keeps working in case we add > some other testing that cares about leaks.) Ok. And maybe I'll also set the opt threshold to something lower, since it will probably improve hashtable performance to a large number of dead entries. Maybe 500 or 1000? What do you think Nathan?

Nathan Froyd [:froydnj]

Comment 29

•

8 years ago

(In reply to Bobby Holley (PTO through June 13) from comment #28) > And maybe I'll also set the opt threshold to something lower, since it > will probably improve hashtable performance to a large number of dead > entries. Maybe 500 or 1000? What do you think Nathan? The first sentence is trying to say that a lower threshold would improve hashtable performance, because we wouldn't retain so many dead entries over time and we'd thus have a smaller (or at least a more lightly-loaded) hashtable, correct? That sounds plausible.

Bobby Holley (:bholley)

Assignee

Comment 30

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #29) > (In reply to Bobby Holley (PTO through June 13) from comment #28) > > And maybe I'll also set the opt threshold to something lower, since it > > will probably improve hashtable performance to a large number of dead > > entries. Maybe 500 or 1000? What do you think Nathan? > > The first sentence is trying to say that a lower threshold would improve > hashtable performance, because we wouldn't retain so many dead entries over > time and we'd thus have a smaller (or at least a more lightly-loaded) > hashtable, correct? That sounds plausible. Correct.

Nathan Froyd [:froydnj]

Comment 31

•

8 years ago

Comment on attachment 8757288 [details] [diff] [review] Part 4 - Use a GC scheme to free unused atoms. v2 Review of attachment 8757288 [details] [diff] [review]: ----------------------------------------------------------------- A couple comments, but this looks OK. I think the microbenchmark slowdown is just because the refcount handling in our mutation observer implementation leaves something to be desired; I think we could make that microbenchmark faster. ::: xpcom/ds/nsAtomTable.cpp @@ +64,5 @@ > }; > > //---------------------------------------------------------------------- > > +static Atomic<uint32_t> gUnusedAtomCount(0); I think we can get away with making this Atomic<uint32_t, ReleaseAcquire>, which would make things a little cheaper...at least on x86oids (no memory barrier required on store, and I'm pretty sure it requires fewer barriers on ARM, too). @@ +71,5 @@ > { > public: > DynamicAtom(const nsAString& aString, uint32_t aHash) > { > + ++gUnusedAtomCount; I think this deserves a comment...because I'm not completely sure what this is here for. I think it's there because this can now happen with our GC'ing scheme: 1. Atom A drops to zero refcount; gUnusedAtomCount is incremented. 2. A GC is *not* triggered. 3. Somebody comes along and re-atomizes a string corresponding to A. 4. We find A still in our hashtable, even with a zero refcount. 5. We AddRef A. 6. To be consistent, we now have to decrement gUnusedAtomCount. And so: 7. To balance out the case where we're creating an atom, we need this increment here. Is that correct? If so, OK, that makes sense. But the increment here just to decrement in AddRef for the initial construction seems like needless work. How about we do this: 1. Make this constructor private. 2. Change the constructor so that it starts like: DynamicAtom(const nsAString&, uint32_t) : mRefCnt(1) 3. Add a public static Create function: already_AddRefed<DynamicAtom> DynamicAtom::Create(const nsAString& aString, uint32_t aHash) { // The refcount is appropriately initialized in the constructor. return dont_AddRef(new DynamicAtom(aString, aHash)); } 4. Replace |new DynamicAtom| calls with calls to Create(). 5. Put a comment describing the above scenario in AddRef, so that the comment is close to the code that needs describing. This is admittedly gnarly microoptimization, since atoms aren't created all that often (a non-rigorous stress test of popular websites only created about 4K atoms). @@ +393,5 @@ > > +void > +DynamicAtom::GCAtomTable() > +{ > + gAtomTableLock->AssertCurrentThreadOwns(); Let's move the lock acquisition into this function. Or make the function take a |const MutexAutoLock&| parameter. @@ +402,5 @@ > + if (atom->mRefCnt == 0) { > + i.Remove(); > + delete atom; > + MOZ_ASSERT(gUnusedAtomCount > 0); > + --gUnusedAtomCount; In the interest of micro-optimization--as operations on gUnusedAtomCount are relatively expensive--why don't we do something like: uint32_t removedCount = 0; for (...) { ... if (atom->mRefCnt == 0) { ... removedCount++; } } MOZ_ASSERT(removedCount <= gUnusedAtomCount); gUnusedAtomCount -= removedCount;

Attachment #8757288 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Updated

•

8 years ago

Attachment #8757098 - Flags: review?(nfroyd) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 32

•

8 years ago

the mutationobserver case is just something where I knew we are using nsIAtoms so that it gets exposed to the web. (it is just that some/most real world benchmarks contains that kind of totally silly microbenchmarks. bug 1273481 being an example of such: the test is testing the performance of a no-op method call.)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 33

•

8 years ago

setAttribute performance might be more important than MutationObserver.observe. That is a case which we have optimized heavily.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 34

•

8 years ago

Attached file set/removeAttribute microbenchmark (deleted) — Details

Pulsebot

Comment 35

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/5bab999902fc https://hg.mozilla.org/integration/mozilla-inbound/rev/2ec4d49ae88d https://hg.mozilla.org/integration/mozilla-inbound/rev/10ce0f01ee4a https://hg.mozilla.org/integration/mozilla-inbound/rev/bec37fdc70f9 https://hg.mozilla.org/integration/mozilla-inbound/rev/de635a6b22cf

David Baron :dbaron:

Comment 36

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #31) > > +static Atomic<uint32_t> gUnusedAtomCount(0); > > I think we can get away with making this Atomic<uint32_t, ReleaseAcquire>, > which would make things a little cheaper...at least on x86oids (no memory > barrier required on store, and I'm pretty sure it requires fewer barriers on > ARM, too). One other thought here is that this patch is using ThreadSafeAutoRefCnt for DynamicAtom's refcount, which uses sequential consistency not because refcounting really needs it, but because it could break pre-existing code that was depending on the memory barriers for other things. Perhaps DynamicAtom's reference count could be ReleaseAcquire rather than SequentiallyConsistent. Or, even more efficient, no memory barrier for AddRef, Release(ordering) semantics for Release(refcount)-to-nonzero, and Acquire semantics for Release(refcount)-to-zero, which IIRC is what's needed for threadsafe reference counting. Or, better -- Atoms are special, since they're immutable once they're created, and for immutable objects, I actually don't see why reference counting can't just use Relaxed semantics (since the reason for the need for release and acquire semantics, is, as I understand it, the need for the object's destruction on a given thread to take into account all writes to the object and things it references on other threads before the object's last Release(refcounting) on those threads). Somebody should definitely check this idea, though. And, additionally, I don't see why gUnusedAtomCount can't use Relaxed semantics -- even without any of those immutability assumptions. It seems like the key thing that's needed there is that there's a memory barrier from unlocking/locking the AtomTableLock, and when the atom table hands out a new atom, it does the initial AddRef of that atom on the thread on which it's handing it out while holding the atom table lock (which NS_Atomize does), which in turn assures that DynamicAtom::GCAtomTable (which holds the atom table lock) sees that reflected in the reference count. Not sure if this is worth filing followups for or if it's micro-optimization that won't actually make a difference.

Carsten Book [:Tomcat]

Comment 37

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/5bab999902fc https://hg.mozilla.org/mozilla-central/rev/2ec4d49ae88d https://hg.mozilla.org/mozilla-central/rev/10ce0f01ee4a https://hg.mozilla.org/mozilla-central/rev/bec37fdc70f9 https://hg.mozilla.org/mozilla-central/rev/de635a6b22cf

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox49: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla49

David Baron :dbaron:

Updated

•

8 years ago

Blocks: 1277711

David Baron :dbaron:

Updated

•

8 years ago

Blocks: 1277713

David Baron :dbaron:

Comment 38

•

8 years ago

I turned comment 36 into bug 1277709, bug 1277711, and bug 1277713.

Nicholas Nethercote [inactive]

Comment 39

•

8 years ago

If I understand this bug correctly, I think we can now remove the static atoms table. It only existed to allow some atoms (used by the HTML parser) to be looked up off the main thread, but now we can do that for any atom. See bug 529808 comment 2 and onwards.

Nicholas Nethercote [inactive]

Comment 40

•

8 years ago

(In reply to Nicholas Nethercote [:njn] from comment #39) > If I understand this bug correctly, I think we can now remove the static > atoms table. It only existed to allow some atoms (used by the HTML parser) > to be looked up off the main thread, but now we can do that for any atom. > See bug 529808 comment 2 and onwards. I think it will also help with bug 1269490.

Blocks: 1269490

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

8 years ago

Depends on: 1351303

You need to log in before you can comment on or make changes to this bug.