Closed
Bug 1398322
Opened 7 years ago
Closed 7 years ago
stylo: PropertyDeclarationBlock storage is inefficient
Categories
(Core :: CSS Parsing and Computation, enhancement, P3)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox57 | --- | fixed |
People
(Reporter: bholley, Assigned: mbrubeck)
References
Details
On gmail, bz measures 5.5MBs of PDBs, which is a lot. I think we can shrink it.
The definition is at [1]. The main inefficiency is that we store an Importance alongside every PropertyDeclaration, costing a word where we only need a bit. We should instead store this bit out-of-band, and use minimal storage for doing so.
My idea is as follows. Adjustments / counter-proposals welcome:
* We introduce a type called SmallBitVec, that just wraps a usize.
* If the rightmost bit of the usize is unset, then we treat it as inline storage.
* In the inline storage case, the rightmost bit is a sentinel from which len() can be derived.
* If the rightmost bit is set, then the usize is treated as heap pointer (modulo the rightmost bit, which needs to be masked off) to a buffer that is capacity + len + buffer.
The type would expose indexed access to the bits, as well as methods for computing whether all of the bits are set or none of the bits are set. This would allow us to eliminate important_count, since that's all the consumer really wants to know. This means that the inline size of PDB will not increase, which is a nice bonus.
[1] http://searchfox.org/mozilla-central/rev/44c693914255638d74bcf1ec3b0bcd448dfab2fd/servo/components/style/properties/declaration_block.rs#72
Reporter | ||
Comment 1•7 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #0)
> * In the inline storage case, the rightmost bit is a sentinel from which
> len() can be derived.
> * If the rightmost bit is set, then the usize is treated as heap pointer
> (modulo the rightmost bit, which needs to be masked off) to a buffer that is
> capacity + len + buffer.
I realize that I'm overloading "rightmost". In the first case (the sentinel), it refers to "the lowest order bit that is set". In the second case (the discriminant), it refers to "the bit of order zero".
Reporter | ||
Updated•7 years ago
|
Priority: -- → P3
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(mbrubeck)
Comment 3•7 years ago
|
||
I have slightly different idea that we introduce a new type called VecWithBit which contains a Vec and a union of usize and pointer, which serves as bitvec with length identical to the capacity of the Vec, so that we don't need any extra bits for checking whether the bitvec is inlined as well as the length of bitvec, because we can simply derive them from Vec's length.
Reporter | ||
Comment 4•7 years ago
|
||
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #3)
> I have slightly different idea that we introduce a new type called
> VecWithBit which contains a Vec and a union of usize and pointer, which
> serves as bitvec with length identical to the capacity of the Vec, so that
> we don't need any extra bits for checking whether the bitvec is inlined as
> well as the length of bitvec, because we can simply derive them from Vec's
> length.
That works too, at the cost of being slightly less reusable.
If we made it reusable, we could replace the BitVec in [1] and save some heap-allocations, which would be nice.
[1] http://searchfox.org/mozilla-central/rev/44c693914255638d74bcf1ec3b0bcd448dfab2fd/servo/components/style/sharing/mod.rs#126
Reporter | ||
Comment 5•7 years ago
|
||
(I think eliminating the BitVec for revalidation_match_results would be a nice improvement, so I'm inclined to prefer the approach in comment 0 unless we discover other downsides)
Assignee | ||
Comment 6•7 years ago
|
||
I have started implementing the SmallBitVec type at https://github.com/mbrubeck/smallbitvec
Assignee | ||
Comment 7•7 years ago
|
||
Reporter | ||
Comment 8•7 years ago
|
||
(In reply to Matt Brubeck (:mbrubeck) from comment #6)
> I have started implementing the SmallBitVec type at
> https://github.com/mbrubeck/smallbitvec
Nice! Looks like it's missing a Drop impl though?
Reporter | ||
Comment 9•7 years ago
|
||
Also might be worth adding some cargo bench microbenchmarks to compare perf against BitVec.
Reporter | ||
Comment 10•7 years ago
|
||
Also: probably worth using an unchecked get from the iterator, so that we don't have to do the somewhat-complicated len() computation on each call to next()?
Reporter | ||
Comment 11•7 years ago
|
||
(In reply to Matt Brubeck (:mbrubeck) from comment #7)
> https://github.com/servo/servo/pull/18431
This looks great on a quick skim. May be worth pushing through gecko try before landing.
Assignee | ||
Comment 12•7 years ago
|
||
Added the Drop impl, get_unchecked method, and some microbenchmarks. Also made some tweaks to improve performance on some of the benchmarks. Currently, SmallBitVec is about 20% slower than BitVec in benchmarks that test the `set` method, and about 200% slower in benchmarks that iterate over the vector.
Assignee | ||
Comment 13•7 years ago
|
||
Reporter | ||
Comment 14•7 years ago
|
||
(In reply to Matt Brubeck (:mbrubeck) from comment #12)
> Added the Drop impl, get_unchecked method, and some microbenchmarks. Also
> made some tweaks to improve performance on some of the benchmarks.
> Currently, SmallBitVec is about 20% slower than BitVec in benchmarks that
> test the `set` method, and about 200% slower in benchmarks that iterate over
> the vector.
I noticed you added a commit to iterator performance by fiddling with inlining. How much did it help? In practice, we probably only really care about the non-spilled case, so we could make that logic inline and make the spill-handling #[inline(never)] / #[cold].
Also, what about eq performance? I noticed that SmallBitVec compares via iterators, whereas BitVec compares the words. That potentially makes BitVec 64x faster on eq, which probably matters when comparing the bitvecs for revalidation selectors. Should be straightforward to add a similar eq impl?
Assignee | ||
Comment 15•7 years ago
|
||
The inlining changes improved iteration speed about 30% in both spilled and non-spilled cases.
I'll push a fix for eq performance shortly.
Assignee | ||
Comment 16•7 years ago
|
||
Reporter | ||
Comment 17•7 years ago
|
||
Thanks Matt!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•7 years ago
|
status-firefox57:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•