Closed Bug 568664 Opened 14 years ago Closed 6 years ago

Improve Hashtable

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX
Q1 12 - Brannan

People

(Reporter: stejohns, Unassigned)

References

Details

(Whiteboard: PACMAN, has-patch)

Attachments

(4 files, 2 obsolete files)

Our current Hashtable code (used for dynamic objects as well as other misc use) isn't bad, but it can be improved in both space and speed.
Attached patch New template-based hashtable (obsolete) (deleted) — Splinter Review
Asking for feedback on general approach for now, though the profiling I've done so far has been promising. We currently use a one-size-fits-all Hashtable, with runtime checks for special cases, and masquerading non-Atoms as Atoms. This keeps the basic quadradic-probe lookup but makes a number of improvements: -- sizeof(Hashtable) == sizeof(void*) (current InlineHashtable is 8 bytes) -- empty Hashtables all point to the same data, so "creating" one is cheap -- fewer runtime checks for things we know at compiletime (eg "dontEnumMask") -- better encapsulation of "weakness"; the WeakAtom and WeakPointer variants have the same API as the non-weak versions, so it's harder for the caller to do the wrong thing This patch only defines the new class(es); the old ones will be removed in a subsequent patch. Note that this makes heavy use of templates; I've only tested on GCC so far, but it's possible some mobile compilers might generate suboptimal code with this (obviously we'd need to test others too). Impact on codesize isn't known yet either (though I'd be surprised if it's significant; the core code is actually pretty small, despite the expanse of source code).
Assignee: nobody → stejohns
Attachment #447853 - Flags: feedback?(edwsmith)
Attachment #447853 - Flags: feedback?(lhansen)
Attached patch Use the new hashtables (obsolete) (deleted) — Splinter Review
Use the new Hashtables everywhere instead of the old ones. The most interesting changes are to ScriptObject and Dictionary. Testing shows modest but positive gains on various benchmarks (generally in the 1-2% range, with a few up to 5 or 6%); testing on a microbenchmark that just happens on dynamic property access shows up to 25% gain. (More testing is required.)
Attachment #447857 - Flags: feedback?(edwsmith)
Size comparison: a MacTel32 build went from 1,669,976 to 1,706,916 bytes with these two patches applied.
Attachment #447857 - Flags: feedback?(lhansen)
Attached file Microbenchmark (deleted) —
Simple microbenchmark to hammer on get/set action. On MacTel I get about 5% improvement on the setter loop and about 25% improvement on the getter loop.
Symbian and solaris are the typical places where template code falls over. at first glance, this new code is not fundamentally more template-heavy than anything we already have, so i'm not opposed on principle. I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? non-universal binary, right? hard to stomach, but whenever code size matters, we should be looking for savings *off* the critical path, so its not enough to reject the patch. (ahem, error generation is very bloaty, for example).
Attachment #447853 - Flags: feedback?(edwsmith) → feedback+
Attachment #447857 - Flags: feedback?(edwsmith) → feedback+
presumably we'll get some of that code size back when the old code is removed too, or was that already accounted for?
(In reply to comment #5) > I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? > non-universal binary, right? hard to stomach, but whenever code size matters, > we should be looking for savings *off* the critical path, so its not enough to > reject the patch. (ahem, error generation is very bloaty, for example). That's using the standard x-plat build script on i386, which I assume is dead-stripped and minus symbols... but I'll doublecheck. (Also will check on MSVC.) (In reply to comment #6) > presumably we'll get some of that code size back when the old code is removed > too, or was that already accounted for? I think I did that test with the old code removed, but again, will doublecheck.
On a related note: is there a version of the sizereport utility that works on Mac builds? Last time I used it (loooong time ago), it was Windows-only.
(In reply to comment #5) > I'm a bit surprised this generates 36K more code (+2.2%) but is that stripped? Win32 (built via IDE, not xplat) goes from 849408 to 850944. Win64 (built via IDE, not xplat) goes from 1152512 to 1158656.
FWIW, I noticed that neither the mac xplat build scripts, nor xcode, were enabling dead-stripping at linktime... doing so shaved a nice chunk off final size (167k->152k) but didn't affect the delta in size for the Mac build. (It is mighty disappointing though that the Mac build is nearly twice the size of the Win build for the same CPU architecture... might be worth revisiting compiler settings to see how much might be affected by our choices vs. intrinsic differences between GCC and VS2008)
i would not be surprised at all if gcc -O3 was much more aggressive at inlining than msvc -O2
I tried marking various methods as no_inline as an experiment, and only succeeded in making the code (trivially) larger... so it may be FOL with this version of GCC. Still, this expansion is likely worthwhile for the perf improvement, assuming it doesn't balloon substantially further in Flash; I'll do some checking there first.
Timing on Win32 gives similar speedups for the microbenchmarks and various perf tests.
Comment on attachment 447853 [details] [diff] [review] New template-based hashtable Mmm, nested classes with non-public members, inside a template class. :-) Since a comment is necessary to clarify that Hashtable::getSize returns the size of the population, not the size of the table, maybe this method should be called something else. (I tend to go for getPopulation, but since we have bytesUsed maybe elementsUsed may work.) emptyKey and emptyValue are static and can't be const but it may be good to state that their return values don't change. Using the word "distinguished" somewhere in the comments should suffice. Comments missing on getIsEnumerable, setIsEnumerable, next, keyAt, valueAt. "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, acronyms missing. The semantics of the various KH, VH, DH implementations needs to be pinned down properly, but there's nothing here. All that said the approach seems nice enough. I've not looked at algorithms etc in a lot of detail. I'm guessing eg Hashtable::add could usefully be split into something inlinish and something definitely out-of-line.
Attachment #447853 - Flags: feedback?(lhansen) → feedback+
Comment on attachment 447857 [details] [diff] [review] Use the new hashtables Seems nice.
Attachment #447857 - Flags: feedback?(lhansen) → feedback+
Why did the traits()->needsHashtable() check below go away? @@ -191,20 +184,21 @@ bool ScriptObject::hasMultinameProperty(const Multiname* multiname) const { - if (traits()->needsHashtable() && multiname->isValidDynamicName()) - { - return hasAtomProperty(multiname->getName()->atom()); - } - else - { - // ISSUE should this walk the proto chain? + if (multiname->isValidDynamicName()) + { + return hasAtomProperty(multiname->getName()->atom()); + } + else + { + // ISSUE should this walk the proto chain? return false; } }
(In reply to comment #16) > Why did the traits()->needsHashtable() check below go away? (I can see in other similar cases that you replaced the call with a check that the htoffset is non-zero, but I do not see the analogous change here.)
> (In reply to comment #16) > > Why did the traits()->needsHashtable() check below go away? it's redundant; if you examine the logic in hasAtomProperty then you see we end up with the same result.
(In reply to comment #15) > (From update of attachment 447857 [details] [diff] [review]) > Seems nice. Remaining issues: -- I have updated patches with various modest bug fixes in place, I'll upload those for actual review as I get time -- Flash glue code needs nontrivial (but non-awful) changes for this, those will need buy-in from their owners -- have to vet Flash for code size change and ensure it's acceptable -- plus general size/speed testing on ARM platforms this is not on my critical path so it's likely to trickle out over the next week or two.
(In reply to comment #14) > Mmm, nested classes with non-public members, inside a template class. :-) Yep :-) > Since a comment is necessary to clarify that Hashtable::getSize returns the > size of the population, not the size of the table, maybe this method should be > called something else. (I tend to go for getPopulation, but since we have > bytesUsed maybe elementsUsed may work.) Doable, there aren't that many callers. > emptyKey and emptyValue are static and can't be const but it may be good to > state that their return values don't change. Using the word "distinguished" > somewhere in the comments should suffice. Will do. (I'd love to make them real constants but haven't figured out how to massage C++ to allow static pointer constants... this seems like an absurd language wart but it is what it is) > Comments missing on getIsEnumerable, setIsEnumerable, next, keyAt, valueAt. Will add. > "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, > acronyms missing. The semantics of the various KH, VH, DH implementations > needs to be pinned down properly, but there's nothing here. Fair cop, will document. (Are you happy with the names assuming they are documented? I tried to err on the side of terseness, as long names made things unreadable in a different way...) > All that said the approach seems nice enough. I've not looked at algorithms > etc in a lot of detail. Algorithm is basically the same as it used to be, just tuned more carefully. > I'm guessing eg Hashtable::add could usefully be split into something inlinish > and something definitely out-of-line. Good thought, though add() doesn't seem to be the hottest spot in most cases I've profiled; that honor tends to belong to get() and the iterators.
(In reply to comment #20) > > "STG"? More generally, KH, VH, DH? Seems like documentation about strategy, > > acronyms missing. The semantics of the various KH, VH, DH implementations > > needs to be pinned down properly, but there's nothing here. > > Fair cop, will document. (Are you happy with the names assuming they are > documented? I tried to err on the side of terseness, as long names made things > unreadable in a different way...) I can sort of intuit what the names mean but I don't actually know, yet. (I'm lost on STG.) But yes, documentation goes a long way. > > I'm guessing eg Hashtable::add could usefully be split into something inlinish > > and something definitely out-of-line. > > Good thought, though add() doesn't seem to be the hottest spot in most cases > I've profiled; that honor tends to belong to get() and the iterators. Right now you give the compiler the choice between inlining something massive or always going out-of-line. Most will choose the latter but some will choose the former. If add() isn't a hotspot you probably want to move it to the cpp file because you don't want something that big inlined ever.
(In reply to comment #21) > If add() isn't a hotspot you probably want to move it to the cpp > file because you don't want something that big inlined ever. As a template, we can't really do that, but your point is taken; I'll restructure to try to avoid this necessity.
Revised, with various bugfixes, tweaks, changes per suggestion, and comments. Most notably, I abandoned the "DH" parameter (used to control whether we used WB when storing the data), as it was just too fiddly and unnecessary; instead, use the approach that List<> does (call FindBeginning), which is more robust and seems to make no meaningful performance difference. Also unrolled find() into a set of macros, which actually does move the needle by about 5% on certain microbenchmarks. I think this is ready for an actual review.
Attachment #447853 - Attachment is obsolete: true
Attachment #449736 - Flags: review?(lhansen)
Attachment #447857 - Flags: review?(edwsmith)
Attachment #447857 - Flags: review?(edwsmith)
Attachment #449736 - Flags: review?(edwsmith)
Attached patch Use the new hashtables, v2 (deleted) — Splinter Review
Review-ready version.
Attachment #447857 - Attachment is obsolete: true
Attachment #449737 - Flags: review?(lhansen)
Attachment #449737 - Flags: review?(edwsmith)
Attached patch Remove old Hashtables (deleted) — Splinter Review
Remove the old Hashtable classes.
Attachment #449738 - Flags: review?(lhansen)
Attachment #449738 - Flags: review?(edwsmith)
(Of course there's no way this can land without test-compiling with the Symbian and Brew compilers.)
(In reply to comment #26) > (Of course there's no way this can land without test-compiling with the Symbian > and Brew compilers.) Yep. Symbian I can probably make happen (with assistance from Samuli) but I have no clue about Brew (or is it BREW?)... are we doing any such builds in-house?
Blocks: 449886
Priority: -- → P2
Whiteboard: PACMAN
Target Milestone: --- → flash10.2
(In reply to comment #4) > Created an attachment (id=447862) [details] > Microbenchmark > > Simple microbenchmark to hammer on get/set action. On MacTel I get about 5% > improvement on the setter loop and about 25% improvement on the getter loop. Neat. Most objects don't have 100000 properties, but less than 10 (probably less than six, actually, but it's hard to generalize I'm sure). A complementary benchmark that tests many small objects instead of one large object, or indeed one small object, would be useful. There's a strange mixture of annotated and un-annotated variables here. Accident or deliberate? Warrants an explanation.
(In reply to comment #27) > (In reply to comment #26) > > (Of course there's no way this can land without test-compiling with the Symbian > > and Brew compilers.) > > Yep. Symbian I can probably make happen (with assistance from Samuli) but I > have no clue about Brew (or is it BREW?)... are we doing any such builds > in-house? We do; I'd be astonished if Rupen did not know the details.
Comment on attachment 449736 [details] [diff] [review] New template-based hashtable, v2 r- because of likely missing write barriers and FIXMEs; consider the rest to be nits, though the missing spec for handler methods probably warrants an r- as well. avmplusHashtable.h: I question the wisdom of choosing "STG" for "STORAGE" when the latter is only four letters longer and there are "only" 139 occurrences of the word with all patches applied. "C++ language issues" is used as a reason for the emptyValue and emptyKey methods; the issue should be described briefly so that we can fix the code if/when C++ changes (and for general information). "Temove" should presumably be "Remove". About "getIsEnumerable" it is written that "If the hashtable doesn't support enumeration, this always returns true." Would it not make more sense for this method to throw an error like setIsEnumerable does, if the hashtable does not support enumeration? I mention this because the VM is continually bitten by providing clever default behavior which, once provided, is hard to change. (Bete noir of the moment is that the combination of NEED_REST and NEED_ARGUMENTS is allowed with the former taking precedence over the latter.) This comment for KVH_Atom is cryptic but is probably meant to warn about something important, thus might be fleshed out: "Note that "undefinedAtom" is a legal value." Generally about your comments: the initial letter of the initial sentence should be capitalized, you do this about 2/3 of the time. In the comments for KVH_Atom, etc the word "but" is used in comment to introduce clauses about enumerability, but that's inappropriate. The clause about enumerability does not follow from the constraints on the storage type, it should be a stand-alone clause. Regarding KVH_GCObject and KVH_RCObject, will the implementations somehow check that only the proper types are used with these handlers? A comment to clarify would be useful perhaps. Maybe I'm fussy but there's still no explanation about what the handler methods do, whether there are constraints on them, etc - what does isStale() do? Why (or under what circumstances) is it OK not to define setIsEnumerable(), and what's the consequence? Is it correct to use store() even if store_in_empty() could be used? There are assertions for conditions in the definitions of some of those methods, but it's far from obvious - given the lack of a spec - that those assertions are correct / valid - I mean, since there's an isEmpty predicate, why is it wrong to call load() on an empty slot. Etc. avmplusHashtable-inlines.h: Not obvious that KVH_Atom::transfer is correct, since object scanning is not atomic - if transfering from "late" in the object to "early", you could be hiding a pointer from the garbage collector. Ditto for other transfer methods that operate on pointer values. Tests like "atomKind(uintptr_t(p) & ~kDontEnumBit) <= kNamespaceType" should presumably be abstracted in avmcore somehow? FIXME in KVH_WeakAtom::mark_empty and KVH_WeakAtom::mark_deleted, also later in the file, also in grow(). The comment about RC objects on VH_RawPointer<_TYPE>::transfer seems to be unwarranted?
Attachment #449736 - Flags: review?(lhansen) → review-
Lots of good comments. I'll address these and re-post a patch. Re: KVH_Atom::transfer() not being correct, I suspect this is a job for GC::movePointers, which I didn't know about when I wrote this patch...
(In reply to comment #28) > Most objects don't have 100000 properties, but less than 10 (probably less than > six, actually, but it's hard to generalize I'm sure). A complementary > benchmark that tests many small objects instead of one large object, or indeed > one small object, would be useful. Good call. I'll add that and rebench.
(In reply to comment #27) > (In reply to comment #26) > > (Of course there's no way this can land without test-compiling with the Symbian > > and Brew compilers.) BTW, on this front: I have not tested on either of these yet, but existing template usage in List<> is actually more complex than anything in this Hashtable rewrite, so the odds of introducing new problems is small.
(In reply to comment #33) > (In reply to comment #27) > > (In reply to comment #26) > > > (Of course there's no way this can land without test-compiling with the Symbian > > > and Brew compilers.) > > BTW, on this front: I have not tested on either of these yet, but existing > template usage in List<> is actually more complex than anything in this > Hashtable rewrite, so the odds of introducing new problems is small. The reason I brought it up (one reason anyway) is that we've had problems with nested classes on Symbian - they can't be private, they have to be public. Whether that was a temporary issue or is still an issue... anyway the less-standard compilers bite in bizarre ways.
Attachment #449738 - Flags: review?(lhansen) → review+
(In reply to comment #30) > About "getIsEnumerable" it is written that "If the hashtable doesn't support > enumeration, this always returns true." Would it not make more sense for this > method to throw an error like setIsEnumerable does, if the hashtable does not > support enumeration? I looked at this, but specializing it in this way turned out to be more trouble than it's worth, IMHO; the gotcha is that we have a use case for enumerating all hashtable varieties (even the ones without individual enum control). So unless this is a strong objection, it's easier to leave this one as-is.
Attachment #449736 - Flags: review?(edwsmith)
Attachment #449738 - Flags: review?(edwsmith) → review+
Attachment #449737 - Flags: review?(lhansen)
Attachment #449737 - Flags: review?(edwsmith) → review?(lhansen)
Comment on attachment 449737 [details] [diff] [review] Use the new hashtables, v2 I'd like to postpone this review until a rev of the basic functionality has been offered.
Attachment #449737 - Flags: review?(lhansen)
(In reply to comment #36) > I'd like to postpone this review until a rev of the basic functionality has > been offered. Agreed; this patch set needs another round of cleanup, which I'm not likely to get to for a little while.
Flags: flashplayer-bug+
Whiteboard: PACMAN → PACMAN, has-patch, must-fix-candidate
Blocks: 645018
Flags: flashplayer-qrb+
Flags: flashplayer-injection-
Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan
Whiteboard: PACMAN, has-patch, must-fix-candidate → PACMAN, has-patch
Assignee: stejohns → nobody
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: