568664 - Improve Hashtable

Reporter

Description

•

14 years ago

Our current Hashtable code (used for dynamic objects as well as other misc use) isn't bad, but it can be improved in both space and speed.

Steven Johnson

Reporter

Comment 1

•

14 years ago

Attached patch New template-based hashtable (obsolete) (deleted) — Details — Splinter Review

Asking for feedback on general approach for now, though the profiling I've done so far has been promising. We currently use a one-size-fits-all Hashtable, with runtime checks for special cases, and masquerading non-Atoms as Atoms. This keeps the basic quadradic-probe lookup but makes a number of improvements: -- sizeof(Hashtable) == sizeof(void*) (current InlineHashtable is 8 bytes) -- empty Hashtables all point to the same data, so "creating" one is cheap -- fewer runtime checks for things we know at compiletime (eg "dontEnumMask") -- better encapsulation of "weakness"; the WeakAtom and WeakPointer variants have the same API as the non-weak versions, so it's harder for the caller to do the wrong thing This patch only defines the new class(es); the old ones will be removed in a subsequent patch. Note that this makes heavy use of templates; I've only tested on GCC so far, but it's possible some mobile compilers might generate suboptimal code with this (obviously we'd need to test others too). Impact on codesize isn't known yet either (though I'd be surprised if it's significant; the core code is actually pretty small, despite the expanse of source code).

Assignee: nobody → stejohns

Attachment #447853 - Flags: feedback?(edwsmith)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #447853 - Flags: feedback?(lhansen)

Steven Johnson

Reporter

Comment 2

•

14 years ago

Attached patch Use the new hashtables (obsolete) (deleted) — Details — Splinter Review

Use the new Hashtables everywhere instead of the old ones. The most interesting changes are to ScriptObject and Dictionary. Testing shows modest but positive gains on various benchmarks (generally in the 1-2% range, with a few up to 5 or 6%); testing on a microbenchmark that just happens on dynamic property access shows up to 25% gain. (More testing is required.)

Attachment #447857 - Flags: feedback?(edwsmith)

Steven Johnson

Reporter

Comment 3

•

14 years ago

Size comparison: a MacTel32 build went from 1,669,976 to 1,706,916 bytes with these two patches applied.

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #447857 - Flags: feedback?(lhansen)

Steven Johnson

Reporter

Comment 4

•

14 years ago

Attached file Microbenchmark (deleted) — Details

Simple microbenchmark to hammer on get/set action. On MacTel I get about 5% improvement on the setter loop and about 25% improvement on the getter loop.

Edwin Smith

Comment 5

•

14 years ago

Symbian and solaris are the typical places where template code falls over. at first glance, this new code is not fundamentally more template-heavy than anything we already have, so i'm not opposed on principle. I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? non-universal binary, right? hard to stomach, but whenever code size matters, we should be looking for savings *off* the critical path, so its not enough to reject the patch. (ahem, error generation is very bloaty, for example).

Edwin Smith

Updated

•

14 years ago

Attachment #447853 - Flags: feedback?(edwsmith) → feedback+

Edwin Smith

Updated

•

14 years ago

Attachment #447857 - Flags: feedback?(edwsmith) → feedback+

Edwin Smith

Comment 6

•

14 years ago

presumably we'll get some of that code size back when the old code is removed too, or was that already accounted for?

Steven Johnson

Reporter

Comment 7

•

14 years ago

(In reply to comment #5) > I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? > non-universal binary, right? hard to stomach, but whenever code size matters, > we should be looking for savings *off* the critical path, so its not enough to > reject the patch. (ahem, error generation is very bloaty, for example). That's using the standard x-plat build script on i386, which I assume is dead-stripped and minus symbols... but I'll doublecheck. (Also will check on MSVC.) (In reply to comment #6) > presumably we'll get some of that code size back when the old code is removed > too, or was that already accounted for? I think I did that test with the old code removed, but again, will doublecheck.

Steven Johnson

Reporter

Comment 8

•

14 years ago

On a related note: is there a version of the sizereport utility that works on Mac builds? Last time I used it (loooong time ago), it was Windows-only.

Steven Johnson

Reporter

Comment 9

•

14 years ago

(In reply to comment #5) > I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? Win32 (built via IDE, not xplat) goes from 849408 to 850944. Win64 (built via IDE, not xplat) goes from 1152512 to 1158656.

Steven Johnson

Reporter

Comment 10

•

14 years ago

FWIW, I noticed that neither the mac xplat build scripts, nor xcode, were enabling dead-stripping at linktime... doing so shaved a nice chunk off final size (167k->152k) but didn't affect the delta in size for the Mac build. (It is mighty disappointing though that the Mac build is nearly twice the size of the Win build for the same CPU architecture... might be worth revisiting compiler settings to see how much might be affected by our choices vs. intrinsic differences between GCC and VS2008)

Edwin Smith

Comment 11

•

14 years ago

i would not be surprised at all if gcc -O3 was much more aggressive at inlining than msvc -O2

Steven Johnson

Reporter

Comment 12

•

14 years ago

I tried marking various methods as no_inline as an experiment, and only succeeded in making the code (trivially) larger... so it may be FOL with this version of GCC. Still, this expansion is likely worthwhile for the perf improvement, assuming it doesn't balloon substantially further in Flash; I'll do some checking there first.

Steven Johnson

Reporter

Comment 13

•

14 years ago

Timing on Win32 gives similar speedups for the microbenchmarks and various perf tests.

Lars T Hansen

Comment 14

•

14 years ago

Comment on attachment 447853 [details] [diff] [review] New template-based hashtable Mmm, nested classes with non-public members, inside a template class. :-) Since a comment is necessary to clarify that Hashtable::getSize returns the size of the population, not the size of the table, maybe this method should be called something else. (I tend to go for getPopulation, but since we have bytesUsed maybe elementsUsed may work.) emptyKey and emptyValue are static and can't be const but it may be good to state that their return values don't change. Using the word "distinguished" somewhere in the comments should suffice. Comments missing on getIsEnumerable, setIsEnumerable, next, keyAt, valueAt. "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, acronyms missing. The semantics of the various KH, VH, DH implementations needs to be pinned down properly, but there's nothing here. All that said the approach seems nice enough. I've not looked at algorithms etc in a lot of detail. I'm guessing eg Hashtable::add could usefully be split into something inlinish and something definitely out-of-line.

Lars T Hansen

Updated

•

14 years ago

Attachment #447853 - Flags: feedback?(lhansen) → feedback+

Lars T Hansen

Comment 15

•

14 years ago

Comment on attachment 447857 [details] [diff] [review] Use the new hashtables Seems nice.

Attachment #447857 - Flags: feedback?(lhansen) → feedback+

Felix S. Klock II [:pnkfelix, :fklock]

Comment 16

•

14 years ago

Why did the traits()->needsHashtable() check below go away? @@ -191,20 +184,21 @@ bool ScriptObject::hasMultinameProperty(const Multiname* multiname) const { - if (traits()->needsHashtable() && multiname->isValidDynamicName()) - { - return hasAtomProperty(multiname->getName()->atom()); - } - else - { - // ISSUE should this walk the proto chain? + if (multiname->isValidDynamicName()) + { + return hasAtomProperty(multiname->getName()->atom()); + } + else + { + // ISSUE should this walk the proto chain? return false; } }

Felix S. Klock II [:pnkfelix, :fklock]

Comment 17

•

14 years ago

(In reply to comment #16) > Why did the traits()->needsHashtable() check below go away? (I can see in other similar cases that you replaced the call with a check that the htoffset is non-zero, but I do not see the analogous change here.)

Steven Johnson

Reporter

Comment 18

•

14 years ago

> (In reply to comment #16) > > Why did the traits()->needsHashtable() check below go away? it's redundant; if you examine the logic in hasAtomProperty then you see we end up with the same result.

Steven Johnson

Reporter

Comment 19

•

14 years ago

(In reply to comment #15) > (From update of attachment 447857 [details] [diff] [review]) > Seems nice. Remaining issues: -- I have updated patches with various modest bug fixes in place, I'll upload those for actual review as I get time -- Flash glue code needs nontrivial (but non-awful) changes for this, those will need buy-in from their owners -- have to vet Flash for code size change and ensure it's acceptable -- plus general size/speed testing on ARM platforms this is not on my critical path so it's likely to trickle out over the next week or two.

Steven Johnson

Reporter

Comment 20

•

14 years ago

(In reply to comment #14) > Mmm, nested classes with non-public members, inside a template class. :-) Yep :-) > Since a comment is necessary to clarify that Hashtable::getSize returns the > size of the population, not the size of the table, maybe this method should be > called something else. (I tend to go for getPopulation, but since we have > bytesUsed maybe elementsUsed may work.) Doable, there aren't that many callers. > emptyKey and emptyValue are static and can't be const but it may be good to > state that their return values don't change. Using the word "distinguished" > somewhere in the comments should suffice. Will do. (I'd love to make them real constants but haven't figured out how to massage C++ to allow static pointer constants... this seems like an absurd language wart but it is what it is) > Comments missing on getIsEnumerable, setIsEnumerable, next, keyAt, valueAt. Will add. > "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, > acronyms missing. The semantics of the various KH, VH, DH implementations > needs to be pinned down properly, but there's nothing here. Fair cop, will document. (Are you happy with the names assuming they are documented? I tried to err on the side of terseness, as long names made things unreadable in a different way...) > All that said the approach seems nice enough. I've not looked at algorithms > etc in a lot of detail. Algorithm is basically the same as it used to be, just tuned more carefully. > I'm guessing eg Hashtable::add could usefully be split into something inlinish > and something definitely out-of-line. Good thought, though add() doesn't seem to be the hottest spot in most cases I've profiled; that honor tends to belong to get() and the iterators.

Lars T Hansen

Comment 21

•

14 years ago

(In reply to comment #20) > > "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, > > acronyms missing. The semantics of the various KH, VH, DH implementations > > needs to be pinned down properly, but there's nothing here. > > Fair cop, will document. (Are you happy with the names assuming they are > documented? I tried to err on the side of terseness, as long names made things > unreadable in a different way...) I can sort of intuit what the names mean but I don't actually know, yet. (I'm lost on STG.) But yes, documentation goes a long way. > > I'm guessing eg Hashtable::add could usefully be split into something inlinish > > and something definitely out-of-line. > > Good thought, though add() doesn't seem to be the hottest spot in most cases > I've profiled; that honor tends to belong to get() and the iterators. Right now you give the compiler the choice between inlining something massive or always going out-of-line. Most will choose the latter but some will choose the former. If add() isn't a hotspot you probably want to move it to the cpp file because you don't want something that big inlined ever.

Steven Johnson

Reporter

Comment 22

•

14 years ago

(In reply to comment #21) > If add() isn't a hotspot you probably want to move it to the cpp > file because you don't want something that big inlined ever. As a template, we can't really do that, but your point is taken; I'll restructure to try to avoid this necessity.

Steven Johnson

Reporter

Comment 23

•

14 years ago

Attached patch New template-based hashtable, v2 (deleted) — Details — Splinter Review

Revised, with various bugfixes, tweaks, changes per suggestion, and comments. Most notably, I abandoned the "DH" parameter (used to control whether we used WB when storing the data), as it was just too fiddly and unnecessary; instead, use the approach that List<> does (call FindBeginning), which is more robust and seems to make no meaningful performance difference. Also unrolled find() into a set of macros, which actually does move the needle by about 5% on certain microbenchmarks. I think this is ready for an actual review.

Attachment #447853 - Attachment is obsolete: true

Attachment #449736 - Flags: review?(lhansen)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #447857 - Flags: review?(edwsmith)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #447857 - Flags: review?(edwsmith)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #449736 - Flags: review?(edwsmith)

Steven Johnson

Reporter

Comment 24

•

14 years ago

Attached patch Use the new hashtables, v2 (deleted) — Details — Splinter Review

Review-ready version.

Attachment #447857 - Attachment is obsolete: true

Attachment #449737 - Flags: review?(lhansen)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #449737 - Flags: review?(edwsmith)

Steven Johnson

Reporter

Comment 25

•

14 years ago

Attached patch Remove old Hashtables (deleted) — Details — Splinter Review

Remove the old Hashtable classes.

Attachment #449738 - Flags: review?(lhansen)

Steven Johnson

Reporter

Updated

•

14 years ago

Attachment #449738 - Flags: review?(edwsmith)

Lars T Hansen

Comment 26

•

14 years ago

(Of course there's no way this can land without test-compiling with the Symbian and Brew compilers.)

Steven Johnson

Reporter

Comment 27

•

14 years ago

(In reply to comment #26) > (Of course there's no way this can land without test-compiling with the Symbian > and Brew compilers.) Yep. Symbian I can probably make happen (with assistance from Samuli) but I have no clue about Brew (or is it BREW?)... are we doing any such builds in-house?

Edwin Smith

Updated

•

14 years ago

Blocks: 449886

Edwin Smith

Updated

•

14 years ago

Priority: -- → P2

Whiteboard: PACMAN

Target Milestone: --- → flash10.2

Lars T Hansen

Comment 28

•

14 years ago

(In reply to comment #4) > Created an attachment (id=447862) [details] > Microbenchmark > > Simple microbenchmark to hammer on get/set action. On MacTel I get about 5% > improvement on the setter loop and about 25% improvement on the getter loop. Neat. Most objects don't have 100000 properties, but less than 10 (probably less than six, actually, but it's hard to generalize I'm sure). A complementary benchmark that tests many small objects instead of one large object, or indeed one small object, would be useful. There's a strange mixture of annotated and un-annotated variables here. Accident or deliberate? Warrants an explanation.

Lars T Hansen

Comment 29

•

14 years ago

(In reply to comment #27) > (In reply to comment #26) > > (Of course there's no way this can land without test-compiling with the Symbian > > and Brew compilers.) > > Yep. Symbian I can probably make happen (with assistance from Samuli) but I > have no clue about Brew (or is it BREW?)... are we doing any such builds > in-house? We do; I'd be astonished if Rupen did not know the details.

Lars T Hansen

Comment 30

•

14 years ago

Comment on attachment 449736 [details] [diff] [review] New template-based hashtable, v2 r- because of likely missing write barriers and FIXMEs; consider the rest to be nits, though the missing spec for handler methods probably warrants an r- as well. avmplusHashtable.h: I question the wisdom of choosing "STG" for "STORAGE" when the latter is only four letters longer and there are "only" 139 occurrences of the word with all patches applied. "C++ language issues" is used as a reason for the emptyValue and emptyKey methods; the issue should be described briefly so that we can fix the code if/when C++ changes (and for general information). "Temove" should presumably be "Remove". About "getIsEnumerable" it is written that "If the hashtable doesn't support enumeration, this always returns true." Would it not make more sense for this method to throw an error like setIsEnumerable does, if the hashtable does not support enumeration? I mention this because the VM is continually bitten by providing clever default behavior which, once provided, is hard to change. (Bete noir of the moment is that the combination of NEED_REST and NEED_ARGUMENTS is allowed with the former taking precedence over the latter.) This comment for KVH_Atom is cryptic but is probably meant to warn about something important, thus might be fleshed out: "Note that "undefinedAtom" is a legal value." Generally about your comments: the initial letter of the initial sentence should be capitalized, you do this about 2/3 of the time. In the comments for KVH_Atom, etc the word "but" is used in comment to introduce clauses about enumerability, but that's inappropriate. The clause about enumerability does not follow from the constraints on the storage type, it should be a stand-alone clause. Regarding KVH_GCObject and KVH_RCObject, will the implementations somehow check that only the proper types are used with these handlers? A comment to clarify would be useful perhaps. Maybe I'm fussy but there's still no explanation about what the handler methods do, whether there are constraints on them, etc - what does isStale() do? Why (or under what circumstances) is it OK not to define setIsEnumerable(), and what's the consequence? Is it correct to use store() even if store_in_empty() could be used? There are assertions for conditions in the definitions of some of those methods, but it's far from obvious - given the lack of a spec - that those assertions are correct / valid - I mean, since there's an isEmpty predicate, why is it wrong to call load() on an empty slot. Etc. avmplusHashtable-inlines.h: Not obvious that KVH_Atom::transfer is correct, since object scanning is not atomic - if transfering from "late" in the object to "early", you could be hiding a pointer from the garbage collector. Ditto for other transfer methods that operate on pointer values. Tests like "atomKind(uintptr_t(p) & ~kDontEnumBit) <= kNamespaceType" should presumably be abstracted in avmcore somehow? FIXME in KVH_WeakAtom::mark_empty and KVH_WeakAtom::mark_deleted, also later in the file, also in grow(). The comment about RC objects on VH_RawPointer<_TYPE>::transfer seems to be unwarranted?

Attachment #449736 - Flags: review?(lhansen) → review-

Steven Johnson

Reporter

Comment 31

•

14 years ago

Lots of good comments. I'll address these and re-post a patch. Re: KVH_Atom::transfer() not being correct, I suspect this is a job for GC::movePointers, which I didn't know about when I wrote this patch...

Steven Johnson

Reporter

Comment 32

•

14 years ago

(In reply to comment #28) > Most objects don't have 100000 properties, but less than 10 (probably less than > six, actually, but it's hard to generalize I'm sure). A complementary > benchmark that tests many small objects instead of one large object, or indeed > one small object, would be useful. Good call. I'll add that and rebench.

Steven Johnson

Reporter

Comment 33

•

14 years ago

(In reply to comment #27) > (In reply to comment #26) > > (Of course there's no way this can land without test-compiling with the Symbian > > and Brew compilers.) BTW, on this front: I have not tested on either of these yet, but existing template usage in List<> is actually more complex than anything in this Hashtable rewrite, so the odds of introducing new problems is small.

Lars T Hansen

Comment 34

•

14 years ago

(In reply to comment #33) > (In reply to comment #27) > > (In reply to comment #26) > > > (Of course there's no way this can land without test-compiling with the Symbian > > > and Brew compilers.) > > BTW, on this front: I have not tested on either of these yet, but existing > template usage in List<> is actually more complex than anything in this > Hashtable rewrite, so the odds of introducing new problems is small. The reason I brought it up (one reason anyway) is that we've had problems with nested classes on Symbian - they can't be private, they have to be public. Whether that was a temporary issue or is still an issue... anyway the less-standard compilers bite in bizarre ways.

Lars T Hansen

Updated

•

14 years ago

Attachment #449738 - Flags: review?(lhansen) → review+

Steven Johnson

Reporter

Comment 35

•

14 years ago

(In reply to comment #30) > About "getIsEnumerable" it is written that "If the hashtable doesn't support > enumeration, this always returns true." Would it not make more sense for this > method to throw an error like setIsEnumerable does, if the hashtable does not > support enumeration? I looked at this, but specializing it in this way turned out to be more trouble than it's worth, IMHO; the gotcha is that we have a use case for enumerating all hashtable varieties (even the ones without individual enum control). So unless this is a strong objection, it's easier to leave this one as-is.

Edwin Smith

Updated

•

14 years ago

Attachment #449736 - Flags: review?(edwsmith)

Edwin Smith

Updated

•

14 years ago

Attachment #449738 - Flags: review?(edwsmith) → review+

Edwin Smith

Updated

•

14 years ago

Attachment #449737 - Flags: review?(lhansen)

Edwin Smith

Updated

•

14 years ago

Attachment #449737 - Flags: review?(edwsmith) → review?(lhansen)

Lars T Hansen

Comment 36

•

14 years ago

Comment on attachment 449737 [details] [diff] [review] Use the new hashtables, v2 I'd like to postpone this review until a rev of the basic functionality has been offered.

Attachment #449737 - Flags: review?(lhansen)

Steven Johnson

Reporter

Comment 37

•

14 years ago

(In reply to comment #36) > I'd like to postpone this review until a rev of the basic functionality has > been offered. Agreed; this patch set needs another round of cleanup, which I'm not likely to get to for a little while.

Trevor Baker

Updated

•

14 years ago

Flags: flashplayer-bug+

Whiteboard: PACMAN → PACMAN, has-patch, must-fix-candidate

Andre Kruetzfeldt

Updated

•

14 years ago

Blocks: 645018

Dan Smith

Updated

•

13 years ago

Flags: flashplayer-qrb+

Flags: flashplayer-injection-

Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan

Dan Smith

Updated

•

13 years ago

Whiteboard: PACMAN, has-patch, must-fix-candidate → PACMAN, has-patch

Dan Smith

Updated

•

13 years ago

Assignee: stejohns → nobody

Sylvestre Ledru [:Sylvestre]

Comment 38

•

6 years ago

Tamarin isn't maintained anymore. WONTFIX remaining bugs.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

New template-based hashtable 14 years ago Steven Johnson (deleted), patch	edwsmith : feedback+ lhansen : feedback+	Details \| Diff \| Splinter Review
Use the new hashtables 14 years ago Steven Johnson (deleted), patch	edwsmith : feedback+ lhansen : feedback+	Details \| Diff \| Splinter Review
Microbenchmark 14 years ago Steven Johnson (deleted), text/plain		Details
New template-based hashtable, v2 14 years ago Steven Johnson (deleted), patch	lhansen : review-	Details \| Diff \| Splinter Review
Use the new hashtables, v2 14 years ago Steven Johnson (deleted), patch		Details \| Diff \| Splinter Review
Remove old Hashtables 14 years ago Steven Johnson (deleted), patch	lhansen : review+ edwsmith : review+	Details \| Diff \| Splinter Review