Closed
Bug 1275755
Opened 9 years ago
Closed 8 years ago
Make Atoms threadsafe
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla49
Tracking | Status | |
---|---|---|
firefox49 | --- | fixed |
People
(Reporter: bholley, Assigned: bholley)
References
(Blocks 2 open bugs)
Details
Attachments
(6 files, 3 obsolete files)
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/html
|
Details |
This was removed in bug 387445, on the grounds that the atom table isn't threadsafe, and so atoms shouldn't be refcounted off the main thread either.
For stylo, we may be able to get away with doing any actual atomization off-main-thread, but we definitely need to refcount the atoms. This is a relatively small and uninvasive change.
Unittest run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b369c7269884
Talos run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b62e64a76ee&selectedJob=21455153 [1]
I consulted with jmaher about the Talos results and he confirmed that it looked fine.
[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=5511d54a3f172c1d68f98cc55dce4de1d0ba1b51&newProject=try&newRevision=3b62e64a76ee&framework=1&showOnlyImportant=0
Assignee | ||
Comment 1•9 years ago
|
||
Attachment #8756603 -
Flags: review?(nfroyd)
How are you planning on handling the last release of an atom happening off the main thread?
Assignee | ||
Comment 3•9 years ago
|
||
(In reply to Jonas Sicking (:sicking) from comment #2)
> How are you planning on handling the last release of an atom happening off
> the main thread?
See the patch.
Assignee | ||
Comment 4•9 years ago
|
||
Though now I realize there's a race in there if somebody else grabs the entry in the mean time. Let me see if I can fix that.
Comment 5•9 years ago
|
||
Any measurements whether this affects to microbenchmarks?
(I just optimized out some atom getting because it was too slow, but sure, that was less about atom addref/release and more about hashtable usage.)
Assignee | ||
Comment 6•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #5)
> Any measurements whether this affects to microbenchmarks?
Not really.
> (I just optimized out some atom getting because it was too slow, but sure,
> that was less about atom addref/release and more about hashtable usage.)
Yeah, I'm pretty sure the hash table lookup would dominate. And realistically, of the only impact of this was a microbenchmark, we'd probably still want to land it.
Assignee | ||
Updated•9 years ago
|
Attachment #8756603 -
Flags: review?(nfroyd)
Comment 7•9 years ago
|
||
Sure, but we do optimize rather heavily even for microbenchmarks in some cases.
Bug 1273481 as an example.
Comment 8•9 years ago
|
||
In other words, would be just good to know how much this affects in some microbenchmark.
Not objecting the change or anything.
Comment 9•9 years ago
|
||
A silly test which addrefs/releases atoms quite a bit
data:text/html,<script>window.mo = new MutationObserver(function(){}); function test() { for (var i = 0; i < 10000; ++i) { mo.observe(document, { attributes: true, attributeFilter: ["a", "b", "c", "d", "e", "f", "g", "h"]}); } console.log(new Date()); }; for (var i = 0; i < 100; ++i) { setTimeout(test, i * 50); }</script>
Comment 10•9 years ago
|
||
(that console.log() is just random debug output to see that something is happening.)
Assignee | ||
Comment 11•9 years ago
|
||
Discussed this a bit with sicking. We decided that it's probably worth just making atoms threadsafe in general, since it would make it easier to write code in Workers.
I have some patches which I'm running through try. They make the microbenchmark in comment 9 about 5% slower on average, which is almost certainly acceptable.
Summary: Give Atoms threadsafe refcounting → Make Atoms threadsafe
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 12•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b369c7269884
https://treeherder.mozilla.org/#/jobs?repo=try&revision=19e3dff4870e&selectedJob=21481765
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=5511d54a3f17&newProject=try&newRevision=19e3dff4870e&framework=1
Assignee | ||
Comment 13•9 years ago
|
||
I don't think anyone is using this anymore. It would be good to assert that there
are no leaks, but that doesn't pass for me in a local build, and I don't have time
to chase it.
Attachment #8756711 -
Flags: review?(nfroyd)
Assignee | ||
Comment 14•9 years ago
|
||
Attachment #8756712 -
Flags: review?(nfroyd)
Assignee | ||
Comment 15•9 years ago
|
||
Attachment #8756713 -
Flags: review?(nfroyd)
Assignee | ||
Comment 16•9 years ago
|
||
Attachment #8756714 -
Flags: review?(nfroyd)
Assignee | ||
Updated•9 years ago
|
Attachment #8756603 -
Attachment is obsolete: true
Assignee | ||
Comment 17•9 years ago
|
||
Comment on attachment 8756714 [details] [diff] [review]
Part 4 - Use threadsafe refcounting for atoms and remove main-thread restrictions. v1
Review of attachment 8756714 [details] [diff] [review]:
-----------------------------------------------------------------
Ugh, there's still a race here, between the threadsafe refcount decrement and the acquisition of the lock to destroy. I'll think about this some more.
Attachment #8756714 -
Flags: review?(nfroyd)
Comment 18•9 years ago
|
||
Comment on attachment 8756711 [details] [diff] [review]
Part 1 - Remove MOZ_DUMP_ATOM_LEAKS. v1
Review of attachment 8756711 [details] [diff] [review]:
-----------------------------------------------------------------
Works for me.
Attachment #8756711 -
Flags: review?(nfroyd) → review+
Updated•9 years ago
|
Attachment #8756712 -
Flags: review?(nfroyd) → review+
Updated•9 years ago
|
Attachment #8756713 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 19•9 years ago
|
||
As for the last patch - I talked about it with bz and smaug on IRC, and the current plan is to try a GC mechanism whereby AddRef/Release just twiddle refcounts, and we periodically call into the (locked) atom table to evict entries whose refcount is zero.
We can probably keep a counter on the number of times we've Release()-ed and found refcount zero, and perform a GC when that counter gets high (and on debug-build shutdown). We can probably make it really high and GC pretty infrequently - waiting for 10000 possibly-dead atoms may be fine (with a smaller threshold in debug builds to make sure the code gets exercised).
I may not get to this before PTO though.
Assignee | ||
Comment 20•9 years ago
|
||
Ok, I managed to get this working this afternoon. Overhead on the microbenchmark is about 10-12% (presumably from the atomics usage), which is probably acceptable. I'll post patches.
Assignee | ||
Comment 21•9 years ago
|
||
Please review carefully.
Attachment #8757097 -
Flags: review?(nfroyd)
Assignee | ||
Updated•9 years ago
|
Attachment #8756714 -
Attachment is obsolete: true
Assignee | ||
Comment 22•9 years ago
|
||
Attachment #8757098 -
Flags: review?(nfroyd)
Assignee | ||
Comment 23•9 years ago
|
||
Assignee | ||
Comment 24•9 years ago
|
||
Assignee | ||
Comment 25•8 years ago
|
||
Looks like we also run leak checking in ASAN builds, so need to GC there on shutdown as well.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d8b71f5a8ef
Assignee | ||
Comment 26•8 years ago
|
||
Attachment #8757288 -
Flags: review?(nfroyd)
Assignee | ||
Updated•8 years ago
|
Attachment #8757097 -
Attachment is obsolete: true
Attachment #8757097 -
Flags: review?(nfroyd)
Comment 27•8 years ago
|
||
Please use NS_FREE_PERMANENT_DATA instead of defined(DEBUG) || defined(MOZ_ASAN). (Currently it is defined to be the same thing, but it conveys the intent better, and will ensure this keeps working in case we add some other testing that cares about leaks.)
Assignee | ||
Comment 28•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #27)
> Please use NS_FREE_PERMANENT_DATA instead of defined(DEBUG) ||
> defined(MOZ_ASAN). (Currently it is defined to be the same thing, but it
> conveys the intent better, and will ensure this keeps working in case we add
> some other testing that cares about leaks.)
Ok. And maybe I'll also set the opt threshold to something lower, since it will probably improve hashtable performance to a large number of dead entries. Maybe 500 or 1000? What do you think Nathan?
Comment 29•8 years ago
|
||
(In reply to Bobby Holley (PTO through June 13) from comment #28)
> And maybe I'll also set the opt threshold to something lower, since it
> will probably improve hashtable performance to a large number of dead
> entries. Maybe 500 or 1000? What do you think Nathan?
The first sentence is trying to say that a lower threshold would improve hashtable performance, because we wouldn't retain so many dead entries over time and we'd thus have a smaller (or at least a more lightly-loaded) hashtable, correct? That sounds plausible.
Assignee | ||
Comment 30•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #29)
> (In reply to Bobby Holley (PTO through June 13) from comment #28)
> > And maybe I'll also set the opt threshold to something lower, since it
> > will probably improve hashtable performance to a large number of dead
> > entries. Maybe 500 or 1000? What do you think Nathan?
>
> The first sentence is trying to say that a lower threshold would improve
> hashtable performance, because we wouldn't retain so many dead entries over
> time and we'd thus have a smaller (or at least a more lightly-loaded)
> hashtable, correct? That sounds plausible.
Correct.
Comment 31•8 years ago
|
||
Comment on attachment 8757288 [details] [diff] [review]
Part 4 - Use a GC scheme to free unused atoms. v2
Review of attachment 8757288 [details] [diff] [review]:
-----------------------------------------------------------------
A couple comments, but this looks OK. I think the microbenchmark slowdown is just because the refcount handling in our mutation observer implementation leaves something to be desired; I think we could make that microbenchmark faster.
::: xpcom/ds/nsAtomTable.cpp
@@ +64,5 @@
> };
>
> //----------------------------------------------------------------------
>
> +static Atomic<uint32_t> gUnusedAtomCount(0);
I think we can get away with making this Atomic<uint32_t, ReleaseAcquire>, which would make things a little cheaper...at least on x86oids (no memory barrier required on store, and I'm pretty sure it requires fewer barriers on ARM, too).
@@ +71,5 @@
> {
> public:
> DynamicAtom(const nsAString& aString, uint32_t aHash)
> {
> + ++gUnusedAtomCount;
I think this deserves a comment...because I'm not completely sure what this is here for. I think it's there because this can now happen with our GC'ing scheme:
1. Atom A drops to zero refcount; gUnusedAtomCount is incremented.
2. A GC is *not* triggered.
3. Somebody comes along and re-atomizes a string corresponding to A.
4. We find A still in our hashtable, even with a zero refcount.
5. We AddRef A.
6. To be consistent, we now have to decrement gUnusedAtomCount.
And so:
7. To balance out the case where we're creating an atom, we need this increment here.
Is that correct? If so, OK, that makes sense. But the increment here just to decrement in AddRef for the initial construction seems like needless work. How about we do this:
1. Make this constructor private.
2. Change the constructor so that it starts like:
DynamicAtom(const nsAString&, uint32_t) : mRefCnt(1)
3. Add a public static Create function:
already_AddRefed<DynamicAtom>
DynamicAtom::Create(const nsAString& aString, uint32_t aHash)
{
// The refcount is appropriately initialized in the constructor.
return dont_AddRef(new DynamicAtom(aString, aHash));
}
4. Replace |new DynamicAtom| calls with calls to Create().
5. Put a comment describing the above scenario in AddRef, so that the comment is close to the code that needs describing.
This is admittedly gnarly microoptimization, since atoms aren't created all that often (a non-rigorous stress test of popular websites only created about 4K atoms).
@@ +393,5 @@
>
> +void
> +DynamicAtom::GCAtomTable()
> +{
> + gAtomTableLock->AssertCurrentThreadOwns();
Let's move the lock acquisition into this function. Or make the function take a |const MutexAutoLock&| parameter.
@@ +402,5 @@
> + if (atom->mRefCnt == 0) {
> + i.Remove();
> + delete atom;
> + MOZ_ASSERT(gUnusedAtomCount > 0);
> + --gUnusedAtomCount;
In the interest of micro-optimization--as operations on gUnusedAtomCount are relatively expensive--why don't we do something like:
uint32_t removedCount = 0;
for (...) {
...
if (atom->mRefCnt == 0) {
...
removedCount++;
}
}
MOZ_ASSERT(removedCount <= gUnusedAtomCount);
gUnusedAtomCount -= removedCount;
Attachment #8757288 -
Flags: review?(nfroyd) → review+
Updated•8 years ago
|
Attachment #8757098 -
Flags: review?(nfroyd) → review+
Comment 32•8 years ago
|
||
the mutationobserver case is just something where I knew we are using nsIAtoms so that it gets exposed to the web.
(it is just that some/most real world benchmarks contains that kind of totally silly microbenchmarks.
bug 1273481 being an example of such: the test is testing the performance of a no-op method call.)
Comment 33•8 years ago
|
||
setAttribute performance might be more important than MutationObserver.observe.
That is a case which we have optimized heavily.
Comment 34•8 years ago
|
||
Comment 35•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/5bab999902fc
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ec4d49ae88d
https://hg.mozilla.org/integration/mozilla-inbound/rev/10ce0f01ee4a
https://hg.mozilla.org/integration/mozilla-inbound/rev/bec37fdc70f9
https://hg.mozilla.org/integration/mozilla-inbound/rev/de635a6b22cf
Comment 36•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #31)
> > +static Atomic<uint32_t> gUnusedAtomCount(0);
>
> I think we can get away with making this Atomic<uint32_t, ReleaseAcquire>,
> which would make things a little cheaper...at least on x86oids (no memory
> barrier required on store, and I'm pretty sure it requires fewer barriers on
> ARM, too).
One other thought here is that this patch is using ThreadSafeAutoRefCnt for DynamicAtom's refcount, which uses sequential consistency not because refcounting really needs it, but because it could break pre-existing code that was depending on the memory barriers for other things. Perhaps DynamicAtom's reference count could be ReleaseAcquire rather than SequentiallyConsistent. Or, even more efficient, no memory barrier for AddRef, Release(ordering) semantics for Release(refcount)-to-nonzero, and Acquire semantics for Release(refcount)-to-zero, which IIRC is what's needed for threadsafe reference counting.
Or, better -- Atoms are special, since they're immutable once they're created, and for immutable objects, I actually don't see why reference counting can't just use Relaxed semantics (since the reason for the need for release and acquire semantics, is, as I understand it, the need for the object's destruction on a given thread to take into account all writes to the object and things it references on other threads before the object's last Release(refcounting) on those threads). Somebody should definitely check this idea, though.
And, additionally, I don't see why gUnusedAtomCount can't use Relaxed semantics -- even without any of those immutability assumptions. It seems like the key thing that's needed there is that there's a memory barrier from unlocking/locking the AtomTableLock, and when the atom table hands out a new atom, it does the initial AddRef of that atom on the thread on which it's handing it out while holding the atom table lock (which NS_Atomize does), which in turn assures that DynamicAtom::GCAtomTable (which holds the atom table lock) sees that reflected in the reference count.
Not sure if this is worth filing followups for or if it's micro-optimization that won't actually make a difference.
Comment 37•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5bab999902fc
https://hg.mozilla.org/mozilla-central/rev/2ec4d49ae88d
https://hg.mozilla.org/mozilla-central/rev/10ce0f01ee4a
https://hg.mozilla.org/mozilla-central/rev/bec37fdc70f9
https://hg.mozilla.org/mozilla-central/rev/de635a6b22cf
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox49:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Comment 38•8 years ago
|
||
Comment 39•8 years ago
|
||
If I understand this bug correctly, I think we can now remove the static atoms table. It only existed to allow some atoms (used by the HTML parser) to be looked up off the main thread, but now we can do that for any atom. See bug 529808 comment 2 and onwards.
Comment 40•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #39)
> If I understand this bug correctly, I think we can now remove the static
> atoms table. It only existed to allow some atoms (used by the HTML parser)
> to be looked up off the main thread, but now we can do that for any atom.
> See bug 529808 comment 2 and onwards.
I think it will also help with bug 1269490.
Blocks: 1269490
You need to log in
before you can comment on or make changes to this bug.
Description
•