Closed
Bug 1397976
Opened 7 years ago
Closed 7 years ago
stylo: Do a second pass on the style sharing cache after computing the rule node
Categories
(Core :: CSS Parsing and Computation, defect, P3)
Core
CSS Parsing and Computation
Tracking
()
People
(Reporter: bholley, Assigned: bholley)
References
(Blocks 1 open bug)
Details
Attachments
(7 files, 6 obsolete files)
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
I had this idea yesterday.
The style sharing cache operates before selector matching, so that we can skip selector matching if we get a hit. But the cache's mechanism for determining which rules would match is inherently less precise than the rule node, which describes what rules _actually_ matched, and is what controls sharing in gecko (via the style context tree).
So the idea here would be to have a second look at the entries in the cache after computing the rule node. If any of the rule nodes match, then we reuse the ComputedValues. To avoid regressing performance with lots of superfluous lookups, we can use a bloom filter for rule node pointers.
I just hacked this up, and added instrumentation to measure the number of style contexts. The results on the HTML spec look really good:
> Gecko: 27600
> stylo 1 thread: 46000
> stylo 1 thread + 2ndPass: 27400
> stylo 3 threads: 83800
> stylo 3 threads + 2ndPass: 60000
Note that all the stylo runs have a few hundred gecko contexts floating around, which explains why we do a tiny bit better than gecko on stylo 1 thread + 2ndPass.
My patch doesn't currently support pseudos or visited styles (I haven't thought through exactly how those would work, or measured if it's worthwhile), and doesn't implement the bloom filter yet. But I'm posting it for feedback here in the interim and pushing to try.
Assignee | ||
Comment 1•7 years ago
|
||
MozReview-Commit-ID: 7q9DGzE0mFQ
Assignee | ||
Comment 2•7 years ago
|
||
MozReview-Commit-ID: EcVQDLoxwFP
Assignee | ||
Comment 3•7 years ago
|
||
MozReview-Commit-ID: H67j3Sbt3gr
Assignee | ||
Comment 4•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8e09344ec1b39e1d12796db74048fd5e4ba5690e
cam/bz - would be interesting to see numbers on how this moves the needle on its own, and how it interacts with the reset struct sharing.
Assignee | ||
Comment 5•7 years ago
|
||
Hm so adding a check for !self.element.is_native_anonymous() (which I added between the initial measurements and the posted patch) brings stylo 1 thread + 2ndPass from 27400 to 31503. So it seems like we should try to find a way to share those correctly.
Assignee | ||
Comment 6•7 years ago
|
||
Also, I'm missing the can_share_across_parents check. Will add that, but need to head out right now.
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #5)
> Hm so adding a check for !self.element.is_native_anonymous() (which I added
> between the initial measurements and the posted patch) brings stylo 1 thread
> + 2ndPass from 27400 to 31503. So it seems like we should try to find a way
> to share those correctly.
I was wrong - there aren't any native anonymous elements here that match, and the numbers are just noisy.
Assignee | ||
Comment 8•7 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #0)
> > Gecko: 27600
> > stylo 1 thread: 46000
> > stylo 1 thread + 2ndPass: 27400
> > stylo 3 threads: 83800
> > stylo 3 threads + 2ndPass: 60000
So, with the proper parent checks in place, I get:
1 thread: ~40k
3 threads: 71.7
So that's still an improvement, but a fair bit worse than gecko on sequential mode. The two hypotheses I have are that we're not handling pseudos yet, or that the cache is too small. I'll experiment with those tomorrow.
Assignee | ||
Comment 9•7 years ago
|
||
Now with correct parent checks.
MozReview-Commit-ID: H67j3Sbt3gr
Assignee | ||
Updated•7 years ago
|
Attachment #8905738 -
Attachment is obsolete: true
Assignee | ||
Comment 10•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Priority: -- → P3
Assignee | ||
Comment 11•7 years ago
|
||
I took some measurements a little while ago for different cache sizes:
127 entries:
* 1 thread: 31.4k
* 3 threads: panics for some reason
64 entries:
* 1 thread: 35.1k
* 3 threads: 65.3k
39 entries:
* 1 thread: 39.1k
* 3 threads: 69.1k
47 entries:
* 1 thread: 35.4k / 36.1k / 38.5k
* 3 threads: 70k / 68.6k / 67.3k
So it seems that cache size does make a difference. Something to consider tweaking, but that can probably be done separately. I'll file a bug so I don't forget.
Assignee | ||
Comment 12•7 years ago
|
||
Assignee | ||
Comment 13•7 years ago
|
||
Assignee | ||
Comment 14•7 years ago
|
||
Assignee | ||
Comment 15•7 years ago
|
||
Assignee | ||
Comment 16•7 years ago
|
||
So I've got this all done, including sharing on recascade. I _think_ it should be green now, but doing another try push to be sure. The only thing remaining is the measure the lookup performance on the cache and figure out whether we want a bloom filter or not.
Patches: https://github.com/bholley/gecko/commits/2ndpass_only
Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c921f26432066b910e27e20d2eb163821747ddc0
Assignee | ||
Comment 17•7 years ago
|
||
Using traversal_parent here is wrong.
MozReview-Commit-ID: GHCIjkgx4VE
Attachment #8907418 -
Flags: review?(emilio)
Assignee | ||
Comment 18•7 years ago
|
||
This will make the linear probing faster. If we end up implementing the two-tier
cache setup, this code will be unnecessary and can go away.
MozReview-Commit-ID: BRfV5ump34n
Attachment #8907419 -
Flags: review?(emilio)
Assignee | ||
Comment 19•7 years ago
|
||
This gives us more flexibility, and doesn't cost us anything.
MozReview-Commit-ID: CuvOEcLA3My
Attachment #8907420 -
Flags: review?(emilio)
Assignee | ||
Comment 20•7 years ago
|
||
MozReview-Commit-ID: H67j3Sbt3gr
Attachment #8907421 -
Flags: review?(emilio)
Assignee | ||
Updated•7 years ago
|
Attachment #8905733 -
Attachment is obsolete: true
Attachment #8905737 -
Attachment is obsolete: true
Attachment #8905791 -
Attachment is obsolete: true
Assignee | ||
Comment 21•7 years ago
|
||
MozReview-Commit-ID: AFTwtzi4P93
Attachment #8907422 -
Flags: review?(emilio)
Comment 22•7 years ago
|
||
Comment on attachment 8907418 [details] [diff] [review]
Part 1 - Use inheritance_parent to control style_sharing. v1
Review of attachment 8907418 [details] [diff] [review]:
-----------------------------------------------------------------
> Using traversal_parent here is wrong.
Well, it's not to big of a deal though, right?
I mean, it only differs from traversal_parent when we're NAC, and we don't share in that case anyway.
Attachment #8907418 -
Flags: review?(emilio) → review+
Updated•7 years ago
|
Attachment #8907420 -
Flags: review?(emilio) → review+
Comment 23•7 years ago
|
||
Comment on attachment 8907421 [details] [diff] [review]
Part 4 - Do a second pass on the sharing cache to reuse style by rule node identity. v2
Review of attachment 8907421 [details] [diff] [review]:
-----------------------------------------------------------------
All the PRIMARY_STYLE_REUSED_VIA_RULE_NODE stuff is super out-of-band in style_resolver.rs
You should make StyleResolver return something like:
struct ResolvedElementStyles {
primary: PrimaryStyle,
pseudos: EagerPseudoStyles,
}
And make PrimaryStyle, contain whether it was reused via rule node.
After that you can just query it directly, and implement From<ResolvedElementStyles> for ElementStyles.
You can make the cache return a ResolvedElementStyles instead of an element, too, by querying the element data.
::: servo/components/style/sharing/mod.rs
@@ +115,5 @@
> fn from(cv: &ComputedValues) -> Self {
> let p = NonZeroPtrMut::new(cv as *const ComputedValues as *const () as *mut ());
> OpaqueComputedValues(p)
> }
> + fn eq(&self, cv: &ComputedValues) -> bool { Self::from(cv) == *self }
nit: I think this should move to the next line.
::: servo/components/style/traversal.rs
@@ +733,5 @@
>
> + if primary_style_reused_via_rule_node {
> + data.flags.insert(data::PRIMARY_STYLE_REUSED_VIA_RULE_NODE);
> + } else {
> + data.flags.remove(data::PRIMARY_STYLE_REUSED_VIA_RULE_NODE);
This feels super out-of-band... This really belongs to ElementStyles.
Attachment #8907421 -
Flags: review?(emilio) → review-
Comment 24•7 years ago
|
||
Comment on attachment 8907419 [details] [diff] [review]
Part 2 - Cache the parent CV identity on ValidationData. v1
Review of attachment 8907419 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not sure I buy the "this is faster" claim. In general the fast path when comparing siblings doesn't require to go and grab the element data, and in this case it does?
Is this required for something else? I think that unless I'm missing something obvious, or you have measurements, I'd like to drop this patch, or move it to another bug.
::: servo/components/style/sharing/checks.rs
@@ +13,4 @@
> use sharing::{StyleSharingCandidate, StyleSharingTarget};
>
> +/// Determines whether a target and a candidate have compatible parents for sharing.
> +pub fn parents_allow_sharing<E>(target: &mut StyleSharingTarget<E>,
nit: Indentation.
@@ +24,3 @@
>
> + // Siblings can always share.
> + let parent = match target.inheritance_parent() {
parent_style_identity calls unwrap() here, so you can just do the same. It's a bit unfortunate that this check is after the style_identity
Attachment #8907419 -
Flags: review?(emilio)
Comment 25•7 years ago
|
||
Comment on attachment 8907422 [details] [diff] [review]
Part 5 - Share styles during recascades. v1
Review of attachment 8907422 [details] [diff] [review]:
-----------------------------------------------------------------
This patch doesn't insert into the cache when we're recascading, so this code will do close-to-nothing, I believe.
I'd need to reason about when would it be correct to share the whole ElementStyles and not, etc.
I've opened https://github.com/servo/servo/pull/18482 which should make trivial to share the code for the first part in a `cascade_primary_style` function, hopefully making what I proposed there easier.
Meanwhile, I propose to move this to another bug, since the insertion seems non-trivial to me.
Attachment #8907422 -
Flags: review?(emilio) → review-
Comment 26•7 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #25)
> Comment on attachment 8907422 [details] [diff] [review]
> Part 5 - Share styles during recascades. v1
>
> Review of attachment 8907422 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> This patch doesn't insert into the cache when we're recascading, so this
> code will do close-to-nothing, I believe.
This is wrong, and I missed the traversal.rs changes.
Comment 27•7 years ago
|
||
Comment on attachment 8907419 [details] [diff] [review]
Part 2 - Cache the parent CV identity on ValidationData. v1
Review of attachment 8907419 [details] [diff] [review]:
-----------------------------------------------------------------
Per IRC discussion, r=me if you think this is a clear win. With the formatting comments addressed though.
Attachment #8907419 -
Flags: review+
Assignee | ||
Comment 28•7 years ago
|
||
We'll use these next to propagate information about style reuse to the ElementDataFlags.
MozReview-Commit-ID: Dya6vgzydpL
Attachment #8907804 -
Flags: review?(emilio)
Assignee | ||
Updated•7 years ago
|
Attachment #8907421 -
Attachment is obsolete: true
Attachment #8907422 -
Attachment is obsolete: true
Assignee | ||
Comment 29•7 years ago
|
||
MozReview-Commit-ID: H67j3Sbt3gr
Attachment #8907805 -
Flags: review?(emilio)
Assignee | ||
Comment 30•7 years ago
|
||
MozReview-Commit-ID: AFTwtzi4P93
Attachment #8907806 -
Flags: review?(emilio)
Assignee | ||
Comment 31•7 years ago
|
||
Assignee | ||
Comment 32•7 years ago
|
||
This is emilio's patch.
MozReview-Commit-ID: VuWHKzYVod
Attachment #8907809 -
Flags: review+
Assignee | ||
Comment 33•7 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #31)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=ff2e48c5d7a93c5df1bd866ef9e4bb2a1c22f940
I figured I might as well rebase. Cancelled the above and repushed:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c8cc8bd0128193b87b188154ec2542ec873e4bca
Comment 34•7 years ago
|
||
Comment on attachment 8907804 [details] [diff] [review]
Part 4 - Add some wrapper types to propagate styles out of style resolver. v1
Review of attachment 8907804 [details] [diff] [review]:
-----------------------------------------------------------------
::: servo/components/style/style_resolver.rs
@@ +69,5 @@
> +impl ResolvedStyle {
> + /// Creates a new ResolvedStyle.
> + pub fn new(style: Arc<ComputedValues>) -> Self {
> + ResolvedStyle {
> + style: style,
nit: I'd just use the shorthand initializer (ResolvedStyle { style })
@@ +76,5 @@
> +}
> +
> +impl PrimaryStyle {
> + /// Convenience accessor for the style.
> + pub fn style(&self) -> &Arc<ComputedValues> {
I think we don't clone this anywhere, may it make sense just returning &ComputedValues here? It'd save a few &** in a few other places.
Attachment #8907804 -
Flags: review?(emilio) → review+
Comment 35•7 years ago
|
||
Comment on attachment 8907805 [details] [diff] [review]
Part 5 - Do a second pass on the sharing cache to reuse style by rule node identity. v3
Review of attachment 8907805 [details] [diff] [review]:
-----------------------------------------------------------------
::: servo/components/style/data.rs
@@ +290,5 @@
>
> /// Sets a new set of styles, returning the old ones.
> pub fn set_styles(&mut self, new_styles: ResolvedElementStyles) -> ElementStyles {
> + if new_styles.primary.0.reused_via_rule_node {
> + self.flags.insert(PRIMARY_STYLE_REUSED_VIA_RULE_NODE);
Ok, I can live with it here :)
::: servo/components/style/sharing/mod.rs
@@ +247,5 @@
> /// FakeCandidate below.
> #[derive(Debug)]
> pub struct StyleSharingCandidate<E: TElement> {
> /// The element.
> + pub element: E,
Is this `pub` used?
@@ +756,5 @@
> + // landed when computed values were fused, see
> + // https://bugzilla.mozilla.org/show_bug.cgi?id=1381635.)
> + // So at the moment, we need to additionally compare visitedness,
> + // since that is not accounted for by rule nodes alone.
> + // FIXME(jryans): This seems like it breaks the constant time
We already do the same for the normal sharing cache (comparing the element state without ignoring visitedness), and we ran it through jryans and dbaron iirc, and there didn't seem to be a problem.
See the:
if target.element.get_state() != candidate.get_state() {
check in the sharing code.
::: servo/components/style/style_resolver.rs
@@ +74,3 @@
> ResolvedStyle {
> style: style,
> + reused_via_rule_node: reused_via_rule_node,
ditto re. the shorthand syntax.
@@ +277,5 @@
> + // Before doing the cascade, check the sharing cache and see if we can reuse
> + // the style via rule node identity.
> + let may_reuse = pseudo.is_none() && !self.element.is_native_anonymous() &&
> + parent_style.is_some() && inputs.rules.is_some();
> + let cached = if !may_reuse { None } else {
nit: Let's indent this properly, even though it takes a bit more space.
but actually, since we don't use `cached` for anything else, why not:
if may_reuse {
let cached = cache.lookup_by_rules(
// ...
);
if let Some(el) = cached {
styles_reused += 1;
return ...
}
}
Attachment #8907805 -
Flags: review?(emilio) → review+
Comment 36•7 years ago
|
||
Comment on attachment 8907806 [details] [diff] [review]
Part 6 - Share styles during recascades. v3
Review of attachment 8907806 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with that fixed.
Much more obvious than the first series, thanks!
Kinda unrelated, but I'd prefer the part 0 to include the two commits, because the first one was logically separated.
You can use:
curl -L https://github.com/servo/servo/pull/18482.patch | git am -
in your servo tree for that.
::: servo/components/style/sharing/mod.rs
@@ +438,3 @@
> self.entries.insert(StyleSharingCandidate {
> element: el,
> + validation_data: validation_data,
You really dislike the shorthand syntax ;)
::: servo/components/style/traversal.rs
@@ +723,5 @@
> + context.thread_local.sharing_cache.insert_if_possible(
> + &element,
> + new_styles.primary.style(),
> + None,
> + context.thread_local.bloom_filter.matching_depth(),
You probably want to use `traversal_data.current_dom_depth`, given we don't set up the bloom filter in this path.
We should use it in the other insert_if_possible, for consistency, I guess, and probably
debug_assert_eq!(bloom_filter.matching_depth(), traversal_data.current_dom_depth)
in the MatchAndCascade path.
Attachment #8907806 -
Flags: review?(emilio) → review+
Assignee | ||
Comment 37•7 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #34)
> Comment on attachment 8907804 [details] [diff] [review]
> Part 4 - Add some wrapper types to propagate styles out of style resolver. v1
>
> Review of attachment 8907804 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: servo/components/style/style_resolver.rs
> @@ +69,5 @@
> > +impl ResolvedStyle {
> > + /// Creates a new ResolvedStyle.
> > + pub fn new(style: Arc<ComputedValues>) -> Self {
> > + ResolvedStyle {
> > + style: style,
>
> nit: I'd just use the shorthand initializer (ResolvedStyle { style })
Fixed.
>
> @@ +76,5 @@
> > +}
> > +
> > +impl PrimaryStyle {
> > + /// Convenience accessor for the style.
> > + pub fn style(&self) -> &Arc<ComputedValues> {
>
> I think we don't clone this anywhere, may it make sense just returning
> &ComputedValues here? It'd save a few &** in a few other places.
It does mean that we can't have similar semantics with style_mut(), but I can probably just inline that. Fixed.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #35)
> ::: servo/components/style/sharing/mod.rs
> @@ +247,5 @@
> > /// FakeCandidate below.
> > #[derive(Debug)]
> > pub struct StyleSharingCandidate<E: TElement> {
> > /// The element.
> > + pub element: E,
>
> Is this `pub` used?
Not anymore, removed.
>
> @@ +756,5 @@
> > + // landed when computed values were fused, see
> > + // https://bugzilla.mozilla.org/show_bug.cgi?id=1381635.)
> > + // So at the moment, we need to additionally compare visitedness,
> > + // since that is not accounted for by rule nodes alone.
> > + // FIXME(jryans): This seems like it breaks the constant time
>
> We already do the same for the normal sharing cache (comparing the element
> state without ignoring visitedness), and we ran it through jryans and dbaron
> iirc, and there didn't seem to be a problem.
>
> See the:
>
> if target.element.get_state() != candidate.get_state() {
>
> check in the sharing code.
This part was written by jryans, who investigated the visited failures. I'm going to just leave it because I'm trying to avoid thinking about visited style, feel free to NI him if you want it changed.
> ::: servo/components/style/style_resolver.rs
> @@ +74,3 @@
> > ResolvedStyle {
> > style: style,
> > + reused_via_rule_node: reused_via_rule_node,
>
> ditto re. the shorthand syntax.
Fixed.
>
> @@ +277,5 @@
> > + // Before doing the cascade, check the sharing cache and see if we can reuse
> > + // the style via rule node identity.
> > + let may_reuse = pseudo.is_none() && !self.element.is_native_anonymous() &&
> > + parent_style.is_some() && inputs.rules.is_some();
> > + let cached = if !may_reuse { None } else {
>
> nit: Let's indent this properly, even though it takes a bit more space.
>
> but actually, since we don't use `cached` for anything else, why not:
>
> if may_reuse {
> let cached = cache.lookup_by_rules(
> // ...
> );
>
> if let Some(el) = cached {
> styles_reused += 1;
> return ...
> }
> }
I was trying to avoid too much indenting, but I guess this is fine.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #36)
> Kinda unrelated, but I'd prefer the part 0 to include the two commits,
> because the first one was logically separated.
Ok.
> ::: servo/components/style/sharing/mod.rs
> @@ +438,3 @@
> > self.entries.insert(StyleSharingCandidate {
> > element: el,
> > + validation_data: validation_data,
>
> You really dislike the shorthand syntax ;)
Fixed.
>
> ::: servo/components/style/traversal.rs
> @@ +723,5 @@
> > + context.thread_local.sharing_cache.insert_if_possible(
> > + &element,
> > + new_styles.primary.style(),
> > + None,
> > + context.thread_local.bloom_filter.matching_depth(),
>
> You probably want to use `traversal_data.current_dom_depth`, given we don't
> set up the bloom filter in this path.
>
> We should use it in the other insert_if_possible, for consistency, I guess,
> and probably
>
> debug_assert_eq!(bloom_filter.matching_depth(),
> traversal_data.current_dom_depth)
>
> in the MatchAndCascade path.
Good catch.
Updated•7 years ago
|
status-firefox55:
--- → wontfix
status-firefox56:
--- → wontfix
status-firefox57:
--- → affected
status-firefox-esr52:
--- → wontfix
Assignee | ||
Comment 38•7 years ago
|
||
Assignee | ||
Comment 39•7 years ago
|
||
Assignee | ||
Comment 40•7 years ago
|
||
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 41•7 years ago
|
||
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•