Closed Bug 1360889 Opened 8 years ago Closed 8 years ago

stylo: Add a custom Arc to use for style structs

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: bholley, Assigned: bholley)

References

Details

Attachments

(4 files)

Part 1 - Fork std::arc. v1 8 years ago Bobby Holley (:bholley) (deleted), patch	emilio : review+	Details \| Diff \| Splinter Review
Part 2 - Strip down StyleArc to what we need. v1 8 years ago Bobby Holley (:bholley) (deleted), patch	emilio : review+	Details \| Diff \| Splinter Review
Part 3 - Use StyleArc in the style system. v1 8 years ago Bobby Holley (:bholley) (deleted), patch	emilio : review+	Details \| Diff \| Splinter Review
Part 4 - Make StyleBuilder more efficient using stylearc. v1 8 years ago Bobby Holley (:bholley) (deleted), patch	emilio : review+	Details \| Diff \| Splinter Review

Bobby Holley (:bholley)

Assignee

Description

•

8 years ago

As discussed in bug 1360883, there are a few disadvantages of Arc for us: * The weak refcount adds extra memory overhead (certainly on 32-bit), and we don't use it. * We do several extra atomic RMU operations to handle the weak refcount. * It doesn't have a free get_mut()-like API (i.e. one that's not RMU). * We can't use a custom allocator, which makes it inappropriate for arenas, which we may want here. I took a few minutes to prototype a forked version of Arc and it looks like it'll work fine. Need some more time to finish it up, but I should have patches here soon.

Emilio Cobos Álvarez (:emilio)

Comment 1

•

8 years ago

Do we really want a custom Arc? I mean, we only use make/get_mut dufing the cascade, right? Seems to me that we could do something like: ``` enum StyleBuilderStruct<'a, T> { Borrowed(&'a Arc<T>), Owned(T), } impl<'a, T> StyleBuilderStruct<'a, T> { pub fn make_mut(&mut self) -> &mut T { if let StyleBuilderStruct::Borrowed(arc) = *self { *self = StyleBuilderStruct::Owned((**arc).clone()); } match *self { StyleBuilderStruct::Owned(ref mut v) => v, StyleBuilderStruct::Borrowed(..) => debug_unreachable!(), } } pub fn build(self) -> Arc<T> { match self { StyleBuilderStruct::Borrowed(s) => s.clone(), StyleBuilderStruct::Owned(s) => Arc::new(s), } } } ``` That would avoid the initial cloning of the parent/default style structs, and make subsequent make_mut calls basically free.

Bobby Holley (:bholley)

Assignee

Comment 2

•

8 years ago

But that would make StyleBuilder enormous (because it needs inline storage for 20 style structs), and trigger a bunch of extra memmoving, right? That seems really suboptimal. I don't see what's so bad about forking arc. With the weak stuff stripped out, it's not much code at all.

Simon Sapin (:SimonSapin)

Comment 3

•

8 years ago

> that would make StyleBuilder enormous Is that bad, if it’s on the stack and never moves?

Bobby Holley (:bholley)

Assignee

Comment 4

•

8 years ago

(In reply to Simon Sapin (:SimonSapin) from comment #3) > > that would make StyleBuilder enormous > > Is that bad, if it’s on the stack and never moves? It's certainly likely to cause a few more cache misses. Maybe that's ok and not as bad as I was imagining, but we can definitely do better.

Bobby Holley (:bholley)

Assignee

Comment 5

•

8 years ago

Update: We decided to land emilio's approach in comment 1 over the weekend and build on that, which landed in [1]. We also landed [2] and [3] to significantly reduce allocations. I just profiled obama: * before: https://perfht.ml/2oU1fRY * after: https://perfht.ml/2oYGJzC The upshot is that we significantly reduced allocations, but significantly increased memmov traffic, which is roughly what I'd expect. The exact overhead of the cache misses is hard to measure, but to me the memmov traffic is enough justification to proceed with the custom Arc (also, it sounds like we will indeed need to do arenas). I'll try to get that finished up today. [1] https://github.com/servo/servo/pull/16663 [2] https://github.com/servo/servo/pull/16661 [3] https://github.com/servo/servo/pull/16664

Emilio Cobos Álvarez (:emilio)

Comment 6

•

8 years ago

I think the biggest offender there now is the font-size stuff. I think we should fix that too. Manish, you wrote in a comment that you need to cascade font-size regardless of whether it was specified to handle mathml's script min size and language changes. Both of those are super-rare, is there a reason that shouldn't be guarded by the seen bitfield? Seems quite wasteful to cascade all the time unnecessarily.

Flags: needinfo?(manishearth)

Emilio Cobos Álvarez (:emilio)

Comment 7

•

8 years ago

(I believe that accounts for most of the memmove traffic, given the font struct is specially large)

Bobby Holley (:bholley)

Assignee

Comment 8

•

8 years ago

Let's move the font-size discussion to bug 1361126.

Flags: needinfo?(manishearth)

Bobby Holley (:bholley)

Assignee

Comment 9

•

8 years ago

Attached patch Part 1 - Fork std::arc. v1 (deleted) — Details — Splinter Review

This is a verbatim copy of the source at [1]. It won't compile on stable rust, but I'm including it here to make the changes clear. [1] https://doc.rust-lang.org/src/alloc/arc.rs.html MozReview-Commit-ID: XEbOK4fXQX

Attachment #8863881 - Flags: review?(emilio+bugs)

Bobby Holley (:bholley)

Assignee

Comment 10

•

8 years ago

Attached patch Part 2 - Strip down StyleArc to what we need. v1 (deleted) — Details — Splinter Review

We remove most of the doc comments to minimize the number of lines of forked code. MozReview-Commit-ID: LehEisKxkJW

Attachment #8863882 - Flags: review?(emilio+bugs)

Bobby Holley (:bholley)

Assignee

Comment 11

•

8 years ago

Attached patch Part 3 - Use StyleArc in the style system. v1 (deleted) — Details — Splinter Review

MozReview-Commit-ID: flF0fv9E9M

Attachment #8863883 - Flags: review?(emilio+bugs)

Bobby Holley (:bholley)

Assignee

Comment 12

•

8 years ago

Attached patch Part 4 - Make StyleBuilder more efficient using stylearc. v1 (deleted) — Details — Splinter Review

MozReview-Commit-ID: 8wSsYPEmYE4

Attachment #8863884 - Flags: review?(emilio+bugs)

Bobby Holley (:bholley)

Assignee

Comment 13

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2e469521f57a190f73fe1eb73dc16324d7c2c1a5

Bobby Holley (:bholley)

Assignee

Comment 14

•

8 years ago

These patches improve cascade performance by about 25%. It appears to be mostly patch 4. StyleArc itself improves release and drop performance for arcs by a couple of nanoseconds each.

Bobby Holley (:bholley)

Assignee

Comment 15

•

8 years ago

here is a profile of the cascade on the obama testcase with these patches: https://perfht.ml/2p2K9RU . It's a bit noisy (a lot of which seems to depend on malloc performance), but this profile shows us a 22ms for the cascade, which is a lot better than the 35ms we were at last week.

Blocks: 1360878

Emilio Cobos Álvarez (:emilio)

Updated

•

8 years ago

Attachment #8863881 - Flags: review?(emilio+bugs) → review+

Emilio Cobos Álvarez (:emilio)

Comment 16

•

8 years ago

Comment on attachment 8863882 [details] [diff] [review] Part 2 - Strip down StyleArc to what we need. v1 Review of attachment 8863882 [details] [diff] [review]: ----------------------------------------------------------------- It's worth perhaps to put it into a standalone crate? ::: servo/components/style/stylearc.rs @@ +158,5 @@ > + // Note: in stable rust we can't use intrinsics::abort. > + // Panic is good enough in practice here (it will trigger > + // an abort at least in Gecko, and this case is degenerate > + // enough that Servo shouldn't have code that triggers it). > + panic!(); If you keep the panic! there's no need for unsafe {}. We can also call at ::std::process::abort, which is both stable and safe, as you prefer. @@ -583,5 @@ > - // Use Acquire to ensure that we see any writes to `weak` that happen > - // before release writes (i.e., decrements) to `strong`. Since we hold a > - // weak count, there's no chance the ArcInner itself could be > - // deallocated. > - if this.inner().strong.compare_exchange(1, 0, Acquire, Relaxed).is_err() { ugh, yeah, I see now why this would be way more slower than needed. @@ +206,5 @@ > +impl<T: ?Sized> Arc<T> { > + #[inline] > + pub fn get_mut(this: &mut Self) -> Option<&mut T> { > + // See make_mut() for documentation of memory ordering and threadsafety. > + if this.inner().count.load(Relaxed) == 1 { Can we keep this in is_unique()? Seemed a bit clear, even if it desugars to the same. @@ +327,5 @@ > } > } > > #[cfg(test)] > mod tests { Do we really want to keep the tests? We don't run them except on travis, perhaps it's a good time to run them from test-unit instead? Otherwise, perhaps move the tests to tests/unit, assuming they don't rely on internals?

Attachment #8863882 - Flags: review?(emilio+bugs) → review+

Emilio Cobos Álvarez (:emilio)

Updated

•

8 years ago

Attachment #8863883 - Flags: review?(emilio+bugs) → review+

Emilio Cobos Álvarez (:emilio)

Comment 17

•

8 years ago

Comment on attachment 8863884 [details] [diff] [review] Part 4 - Make StyleBuilder more efficient using stylearc. v1 Review of attachment 8863884 [details] [diff] [review]: ----------------------------------------------------------------- I told you the StyleBuilder refactoring was going to be worth it regardless of this, hope this patch wasn't too unpleasant to write ;-) ::: servo/components/style/properties/properties.mako.rs @@ +2024,5 @@ > pub enum StyleStructRef<'a, T: 'a> { > /// A borrowed struct from the parent, for example, for inheriting style. > Borrowed(&'a Arc<T>), > /// An owned struct, that we've already mutated. > + Owned(Arc<T>), It'd be nice to compare this with and without StyleArc (that is, reverting part 3), to see what's the impact of the weak count on perf.

Attachment #8863884 - Flags: review?(emilio+bugs) → review+

Bobby Holley (:bholley)

Assignee

Comment 18

•

8 years ago

(In reply to Emilio Cobos Álvarez [:emilio] from comment #16) > Comment on attachment 8863882 [details] [diff] [review] > Part 2 - Strip down StyleArc to what we need. v1 > > Review of attachment 8863882 [details] [diff] [review]: > ----------------------------------------------------------------- > > It's worth perhaps to put it into a standalone crate? We can eventually, but given that we might do arena stuff, I want to leave it all in the style creates for simplicity. Once things settle down, we can break out what makes sense. > > ::: servo/components/style/stylearc.rs > @@ +158,5 @@ > > + // Note: in stable rust we can't use intrinsics::abort. > > + // Panic is good enough in practice here (it will trigger > > + // an abort at least in Gecko, and this case is degenerate > > + // enough that Servo shouldn't have code that triggers it). > > + panic!(); > > If you keep the panic! there's no need for unsafe {}. We can also call at > ::std::process::abort, which is both stable and safe, as you prefer. It's only stable as of 1.17, which we don't require yet. I'll add a comment. > > @@ -583,5 @@ > > - // Use Acquire to ensure that we see any writes to `weak` that happen > > - // before release writes (i.e., decrements) to `strong`. Since we hold a > > - // weak count, there's no chance the ArcInner itself could be > > - // deallocated. > > - if this.inner().strong.compare_exchange(1, 0, Acquire, Relaxed).is_err() { > > ugh, yeah, I see now why this would be way more slower than needed. > > @@ +206,5 @@ > > +impl<T: ?Sized> Arc<T> { > > + #[inline] > > + pub fn get_mut(this: &mut Self) -> Option<&mut T> { > > + // See make_mut() for documentation of memory ordering and threadsafety. > > + if this.inner().count.load(Relaxed) == 1 { > > Can we keep this in is_unique()? Seemed a bit clear, even if it desugars to > the same. Done. > > @@ +327,5 @@ > > } > > } > > > > #[cfg(test)] > > mod tests { > > Do we really want to keep the tests? We don't run them except on travis, > perhaps it's a good time to run them from test-unit instead? Yeah, good catch. I'll just remove them. > > Otherwise, perhaps move the tests to tests/unit, assuming they don't rely on > internals?

Bobby Holley (:bholley)

Assignee

Comment 19

•

8 years ago

(In reply to Emilio Cobos Álvarez [:emilio] from comment #17) > Comment on attachment 8863884 [details] [diff] [review] > Part 4 - Make StyleBuilder more efficient using stylearc. v1 > > Review of attachment 8863884 [details] [diff] [review]: > ----------------------------------------------------------------- > > I told you the StyleBuilder refactoring was going to be worth it regardless > of this, hope this patch wasn't too unpleasant to write ;-) Yeah totally. :-) > > ::: servo/components/style/properties/properties.mako.rs > @@ +2024,5 @@ > > pub enum StyleStructRef<'a, T: 'a> { > > /// A borrowed struct from the parent, for example, for inheriting style. > > Borrowed(&'a Arc<T>), > > /// An owned struct, that we've already mutated. > > + Owned(Arc<T>), > > It'd be nice to compare this with and without StyleArc (that is, reverting > part 3), to see what's the impact of the weak count on perf. I did that. The numbers were definitely worse than with part 4, but it was a bit noisy and hard to tell how much they improved over std::sync::arc. Given my microbenchmarks (a few ns improvement on release and drop), my guess is that it isn't a huge difference. A more interesting question is how much the performance would regress if we used the CAS get_mut() on std::sync::arc rather than the free get_mut() on stylearc::Arc. If we end up not doing arena allocation, and we at some point want to consider switching back to std::Arc, we could measure that.

Bobby Holley (:bholley)

Assignee

Comment 20

•

8 years ago

https://github.com/servo/servo/pull/16702

Bobby Holley (:bholley)

Assignee

Updated

•

8 years ago

Summary: stylo: Add a StyleArc to use for style structs → stylo: Add a custom Arc to use for style structs

Emilio Cobos Álvarez (:emilio)

Comment 21

•

8 years ago

I think this can be marked as fixed now, please reopen if it's not the case.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.