Closed Bug 697479 Opened 13 years ago Closed 13 years ago

Implement Harmony simple Map and Set builtins

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla12

People

(Reporter: jorendorff, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-complete)

Attachments

(1 file, 6 obsolete files)

Blocks: es6
Assignee: general → jorendorff
Attached patch WIP 1 (obsolete) (deleted) — Splinter Review
The basics. Notes: - for-in iteration, per spec, is property enumeration, not data structure iteration. - for-of iteration is supposed to walk the data structure, but we don't support the for-of syntax yet. - The spec doesn't give a way to query the Map/Set size. - Can't use InlineMap because 0 is a valid value for HashableValue. I could change that, or change InlineMap to let the user specify which value to use as a tombstone. - The hash function needs some more work. - Need more tests. - Per spec, var s = new Set; s.add(0); s.has(Math.ceil(-0.1)) === false (!) This is because +0 and -0 are distinct keys. I am usually a big fan of the SameValue algorithm but this is going to drive people up the wall. No one will predict when out of the blue some mathematical operation will produce a -0, especially since ceil() round() and parseInt() can do it.
Attached patch v2 (obsolete) (deleted) — Splinter Review
Some of the notes above still apply, but the hash function is as sorted as it gets. The intent here is to get the functionality in. Someone else would be much better than me at perf-tuning the thing, I'm sure. There's probably plenty of upside here.
Attachment #571151 - Attachment is obsolete: true
Attachment #571363 - Flags: review?(luke)
Oops, delete() is supposed to return a boolean. Btw, evilpie wonders if NaN Values aren't supposed to be canonicalized already. If that is the case, we can dispense with a branch to check for NaN in HashableValue::setValue. But I didn't see anywhere in Value::setDouble where we canonicalize NaNs or assert that they are canonicalized.
Attachment #571363 - Attachment is obsolete: true
Attachment #571363 - Flags: review?(luke)
Attachment #571368 - Flags: review?(luke)
(In reply to Jason Orendorff [:jorendorff] from comment #3) > Btw, evilpie wonders if NaN Values aren't supposed to be canonicalized > already. If that is the case, we can dispense with a branch to check for NaN > in HashableValue::setValue. But I didn't see anywhere in Value::setDouble > where we canonicalize NaNs or assert that they are canonicalized. The JS_CANONICALIZE_NAN occurs at JS engine boundaries JS_NewNumberValue, DOUBLE_TO_JSVAL, TypedArrayTemplate<float/double>::copyIndexToValue, etc. So the question for whether to canonicalize is: where did your double come from? If it came from an engine-internal double or operation on an engine internal double (which we assume preserves canonical-ness), no need to check.
Comment on attachment 571368 [details] [diff] [review] v3 - now with boolean return from .delete method Stealing review; luke says he's backed up.
Attachment #571368 - Flags: review?(luke) → review?(jimb)
(In reply to Jason Orendorff [:jorendorff] from comment #1) > - Per spec, > var s = new Set; > s.add(0); > s.has(Math.ceil(-0.1)) === false (!) > This is because +0 and -0 are distinct keys. I am usually a big fan of the > SameValue algorithm but this is going to drive people up the wall. No one > will predict when out of the blue some mathematical operation will produce > a -0, especially since ceil() round() and parseInt() can do it. I think this is a really important issue. I've raised it with Allen Wirfs-Brock; if you can find other appropriate ways to raise awareness, please do so.
> I think this is a really important issue. I've raised it with Allen > Wirfs-Brock; if you can find other appropriate ways to raise awareness, > please do so. Fear not, I'm tracking this process, and taking all issues that come up back to TC39. Dave
Comment on attachment 571368 [details] [diff] [review] v3 - now with boolean return from .delete method Review of attachment 571368 [details] [diff] [review]: ----------------------------------------------------------------- Yay! ::: js/src/builtin/MapObject.cpp @@ +61,5 @@ > + proto->setPrivate(NULL); > + > + JSAtom *atom = cx->runtime->atomState.classAtoms[key]; > + JSFunction *ctor = global->createConstructor(cx, construct, clasp, atom, 0); > + if (!ctor) Nit: an '||' chain seems like it would be a little less noisy here. @@ +88,5 @@ > + } else if (v.isDouble()) { > + jsdouble d = v.toDouble(); > + int32 i; > + if (JSDOUBLE_IS_NaN(d)) { > + /* All NaN values are the same. Make the bit-pattern reflect this. */ Luke's comment #4 makes it seem like we could just do an JS_ASSERT_IF here, but I think it's fine to just go ahead and avoid the whole question by doing this. @@ +92,5 @@ > + /* All NaN values are the same. Make the bit-pattern reflect this. */ > + value.setDouble(js_NaN); > + } else if (JSDOUBLE_IS_INT32(d, &i)) { > + /* Normalize int32-valued doubles to int32 for faster hashing and testing. */ > + value.setInt32(i); Nice to see that Math.sqrt(81) reliably returns a non-INT32 int, so this can be tested. @@ +128,5 @@ > +HashableValue::equals(const HashableValue &other) const > +{ > + /* Two HashableValues are equal if they have equal bits or they're equal strings. */ > + bool b = (value.asRawBits() == other.value.asRawBits()) || > + (value.isString() && I think this should be indented so that tokens appear to the right of parens they're within. @@ +186,5 @@ > + MapObject *mapobj = static_cast<MapObject *>(obj); > + if (ValueMap *map = mapobj->getData()) { > + for (ValueMap::Range r = map->all(); !r.empty(); r.popFront()) { > + gc::MarkValue(trc, r.front().key, "key"); > + gc::MarkValue(trc, r.front().value, "key"); "value", right? @@ +302,5 @@ > + JS_StrictPropertyStub, /* setProperty */ > + JS_EnumerateStub, > + JS_ResolveStub, > + JS_ConvertStub, > + finalize, That this works without qualification is really neat. ::: js/src/builtin/MapObject.h @@ +47,5 @@ > + > +#include "js/HashTable.h" > + > +namespace js { > + /* We don't usually indent namespace contents. @@ +51,5 @@ > + /* > + * Comparing two ropes for equality can fail. The js::HashTable template > + * requires infallible hash() and match() operations. Therefore we require > + * all values to be converted to hashable form before being used as a key > + * in a Map or Set object. bhackett assures me that, while ropes can mutate into non-ropes, the reverse never happens. So ensuring keys aren't ropes when they're inserted should be good enough. @@ +81,5 @@ > + public: > + static JSObject *initClass(JSContext *cx, JSObject *obj); > + static Class class_; > + private: > + typedef ValueMap Data; These 'Data' member typedefs are for the benefit of the UNPACK_THIS macro, right? I get it, but it seems kind of obscure to use it outside the macro as well. ::: js/src/jit-test/tests/collections/Map-gc-1.js @@ +6,5 @@ > + n.set(m, i); > + m = n; > +} > + > +gc(); It seems like this would be a great place to use findReferences / referencesVia. ::: js/src/jit-test/tests/collections/Map-gc-2.js @@ +6,5 @@ > + n.set(i, m); > + m = n; > +} > + > +gc(); findReferences/referencesVia here as well. ::: js/src/jit-test/tests/collections/Map-surfaces-1.js @@ +5,5 @@ > +assertEq(desc.configurable, true); > +assertEq(desc.writable, true); > + > +assertEq(typeof Map, 'function'); > +assertEq(Object.keys(Map).join(), ""); Better to use Object.keys(map).length here, too. @@ +24,5 @@ > + > +assertEq(Map.prototype.get.length, 1); > +assertEq(Map.prototype.has.length, 1); > +assertEq(Map.prototype.set.length, 2); > +assertEq(Map.prototype.delete.length, 1); Obviously, need tests for the rest of the interface. ::: js/src/jit-test/tests/collections/Set-gc-1.js @@ +6,5 @@ > + t.add(s); > + s = t; > +} > + > +gc(); consider findReferences/referencesVia ::: js/src/jit-test/tests/collections/Set-surfaces-1.js @@ +4,5 @@ > +assertEq(desc.configurable, true); > +assertEq(desc.writable, true); > + > +assertEq(typeof Set, 'function'); > +assertEq(Object.keys(Set).join(), ""); Test keys' length instead of using join. ::: js/src/jit-test/tests/collections/for-in.js @@ +1,4 @@ > +// for-in loops on Maps and Sets enumerate properties. > + > +var test = function test(obj) { > + assertEq(Object.keys(obj).join(), ""); Don't you want Object.keys(obj).length == 0? If the object has a property named '', this will pass. @@ +8,5 @@ > + i++; > + assertEq(i, 0); > + > + obj.ownProp = 1; > + assertEq(Object.keys(obj).join(), "ownProp"); Here, I think you want an argument to 'join', so that { '':true, 'ownProp' } wouldn't pass. Saving the world, one nit at a time. ::: js/src/jit-test/tests/collections/key-equality-0.js @@ +28,5 @@ > +assertEq(m.delete(-0), true); > +assertEq(m.has(0), true); > +assertEq(m.get(0), 'y'); > +assertEq(m.has(-0), false); > +assertEq(m.get(-0), undefined); Seems like it might be good to check that they're both in there simultaneously. ::: js/src/jit-test/tests/collections/key-equality-1.js @@ +14,5 @@ > +test(9, Math.sqrt(81)); > + > +// Ordinary strings and ropes are the same key. > +var a = "a", b = "b"; > +test("ab", a + b); This 'a + b' doesn't produce a rope, in my testing. I think they need to be longer. With this, 'b' seems to be a rope: var a = Array(1000).join('x') var b = Array(500).join('x') + Array(500).join('x') @@ +21,5 @@ > +a = ""; > +b = ""; > +for (var i = 0; i < 10; i++) { > + a = 'x' + a; > + b = b + 'x'; These don't end up being ropes either. ::: js/src/jit-test/tests/collections/key-equality-2.js @@ +1,1 @@ > +// Number keys are distinct from string keys that would name the same property. That didn't occur to me; neat.
It might be nice to also check that two {} are treated as distinct keys. Similarly for [].
Comment on attachment 571368 [details] [diff] [review] v3 - now with boolean return from .delete method It looks great so far; just need to finish off the tests.
Attachment #571368 - Flags: review?(jimb)
(In reply to Jim Blandy :jimb from comment #8) > Nit: an '||' chain seems like it would be a little less noisy here. Done. > Luke's comment #4 makes it seem like we could just do an JS_ASSERT_IF here, I changed it to an assertion. > > + bool b = (value.asRawBits() == other.value.asRawBits()) || > > + (value.isString() && > > I think this should be indented so that tokens appear to the right of parens > they're within. Fixed. (in Map marking) > "value", right? Yep. Fixed. > We don't usually indent namespace contents. Fixed. > @@ +81,5 @@ > > + public: > > + static JSObject *initClass(JSContext *cx, JSObject *obj); > > + static Class class_; > > + private: > > + typedef ValueMap Data; > > These 'Data' member typedefs are for the benefit of the UNPACK_THIS macro, > right? I get it, but it seems kind of obscure to use it outside the macro as > well. That would be obscure. The Data typedefs are not really used anywhere else. Do you mean that the stated return types of the getData() methods should be ValueMap * and ValueSet * rather than Data *? I'll change that in case that's what you meant. > ::: js/src/jit-test/tests/collections/Map-gc-1.js > ::: js/src/jit-test/tests/collections/Map-gc-2.js > ::: js/src/jit-test/tests/collections/Set-gc-1.js > It seems like this would be a great place to use findReferences / > referencesVia. Done. Please take a look. (I kept the loops and gc() calls because I still think testing that stuff end-to-end is good. It's not a lot of code.) > ::: js/src/jit-test/tests/collections/Map-surfaces-1.js > @@ +5,5 @@ > > +assertEq(desc.configurable, true); > > +assertEq(desc.writable, true); > > + > > +assertEq(typeof Map, 'function'); > > +assertEq(Object.keys(Map).join(), ""); > > Better to use Object.keys(map).length here, too. I used join() to generate a more informative message if it ever fails. > > +assertEq(Map.prototype.get.length, 1); > > +assertEq(Map.prototype.has.length, 1); > > +assertEq(Map.prototype.set.length, 2); > > +assertEq(Map.prototype.delete.length, 1); > > Obviously, need tests for the rest of the interface. That's the whole interface so far. Or else I don't understand the question. > ::: js/src/jit-test/tests/collections/for-in.js > > + assertEq(Object.keys(obj).join(), ""); > > Don't you want Object.keys(obj).length == 0? If the object has a property > named '', this will pass. Fixed. > @@ +8,5 @@ > > + i++; > > + assertEq(i, 0); > > + > > + obj.ownProp = 1; > > + assertEq(Object.keys(obj).join(), "ownProp"); > > Here, I think you want an argument to 'join', so that { '':true, 'ownProp' } > wouldn't pass. The default is ','. js> Object.keys({"": 1, "ownProp": 2}).join() ",ownProp" > ::: js/src/jit-test/tests/collections/key-equality-0.js > @@ +28,5 @@ > > +assertEq(m.delete(-0), true); > > +assertEq(m.has(0), true); > > +assertEq(m.get(0), 'y'); > > +assertEq(m.has(-0), false); > > +assertEq(m.get(-0), undefined); > > Seems like it might be good to check that they're both in there > simultaneously. Yep. Added it. Thanks. > ::: js/src/jit-test/tests/collections/key-equality-1.js > This 'a + b' doesn't produce a rope, in my testing. I think they need to be > longer. With this, 'b' seems to be a rope: > var a = Array(1000).join('x') > var b = Array(500).join('x') + Array(500).join('x') Good catch! Fixed. Mini-puzzle: my first attempt to fix the test using those two lines of code caused the test to fail. But the functionality being tested is correct. What was wrong? > > +a = ""; > > +b = ""; > > +for (var i = 0; i < 10; i++) { > > + a = 'x' + a; > > + b = b + 'x'; > > These don't end up being ropes either. Good catch again. Fixed.
Attached patch Interdiff from v3 to v4 (obsolete) (deleted) — Splinter Review
Attached patch v4 (obsolete) (deleted) — Splinter Review
Attachment #571368 - Attachment is obsolete: true
Attachment #578625 - Flags: review?(jimb)
(In reply to Jason Orendorff [:jorendorff] from comment #12) > > @@ +81,5 @@ > > > + public: > > > + static JSObject *initClass(JSContext *cx, JSObject *obj); > > > + static Class class_; > > > + private: > > > + typedef ValueMap Data; > > > > These 'Data' member typedefs are for the benefit of the UNPACK_THIS macro, > > right? I get it, but it seems kind of obscure to use it outside the macro as > > well. > > That would be obscure. The Data typedefs are not really used anywhere else. > Do you mean that the stated return types of the getData() methods should be > ValueMap * and ValueSet * rather than Data *? I'll change that in case that's > what you meant. Yeah, it seems more descriptive. > > ::: js/src/jit-test/tests/collections/Map-gc-1.js > > ::: js/src/jit-test/tests/collections/Map-gc-2.js > > ::: js/src/jit-test/tests/collections/Set-gc-1.js > > It seems like this would be a great place to use findReferences / > > referencesVia. > > Done. Please take a look. That looks great. > > ::: js/src/jit-test/tests/collections/Map-surfaces-1.js > > @@ +5,5 @@ > > > +assertEq(desc.configurable, true); > > > +assertEq(desc.writable, true); > > > + > > > +assertEq(typeof Map, 'function'); > > > +assertEq(Object.keys(Map).join(), ""); > > > > Better to use Object.keys(map).length here, too. > > I used join() to generate a more informative message if it ever fails. I wasn't remembering that the default separator was "," (I've seen it often enough). So that seems fine. > > > +assertEq(Map.prototype.get.length, 1); > > > +assertEq(Map.prototype.has.length, 1); > > > +assertEq(Map.prototype.set.length, 2); > > > +assertEq(Map.prototype.delete.length, 1); > > > > Obviously, need tests for the rest of the interface. > > That's the whole interface so far. Or else I don't understand the question. Sorry, misplaced comment. Immediately before, you got the descriptor for Map.prototype.get, and checked all its properties. I was assuming 'has', 'set', and 'delete' needed the same treatment. > > ::: js/src/jit-test/tests/collections/key-equality-1.js > > This 'a + b' doesn't produce a rope, in my testing. I think they need to be > > longer. With this, 'b' seems to be a rope: > > var a = Array(1000).join('x') > > var b = Array(500).join('x') + Array(500).join('x') > > Good catch! Fixed. > > Mini-puzzle: my first attempt to fix the test using those two lines of code > caused the test to fail. But the functionality being tested is correct. What > was wrong? I peeked at the new test; was it because Array(500).join('x').length == 499?
Attachment #578625 - Flags: review?(jimb) → review+
Keywords: dev-doc-needed
> Sorry, misplaced comment. Immediately before, you got the descriptor for > Map.prototype.get, and checked all its properties. I was assuming 'has', > 'set', and 'delete' needed the same treatment. OK. > I peeked at the new test; was it because Array(500).join('x').length == 499? Yes. Since findReferences is broken at the moment, this must wait for bug 708261. I hope it doesn't have to wait long!
Depends on: 708261
Relanded as: https://hg.mozilla.org/integration/mozilla-inbound/rev/ee420d0f03df However causes win64 make-check failures: https://tbpl.mozilla.org/php/getParsedLog.php?id=7846633&tree=Mozilla-Inbound And also appears to be the cause of Win opt crashes (first build was green, but every push after was purple, and the push after this only changed mobile/xul files): https://tbpl.mozilla.org/php/getParsedLog.php?id=7846927&tree=Mozilla-Inbound { TEST-PASS | jit_test.py -m -d -n | e:\builds\moz2_slave\m-in-w32\build\js\src\jit-test\tests\basic\testConvertibleObjectEqUndefined.js command timed out: 300 seconds without output, attempting to kill SIGKILL failed to kill process using fake rc=-1 program finished with exit code -1 remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last): Failure: exceptions.RuntimeError: SIGKILL failed to kill process ] } Backed out: https://hg.mozilla.org/integration/mozilla-inbound/rev/02ec94922e96
Man, this sucks. (Thanks, Ed.)
All this just goes to show: no patch is too simple for try server. I'll figure out the Windows breakage later today.
The tests pass for me on Windows. I'll ask the try server about it tomorrow.
Attached patch v5 - unbitrotted (obsolete) (deleted) — Splinter Review
This is going through the Try Server right now.
Attachment #578624 - Attachment is obsolete: true
Attachment #578625 - Attachment is obsolete: true
btw, dherman mentioned that Map and Set may end up being specified with deterministic enumeration order. Since as it stands there's no enumeration API at all, this would not affect the bits of API introduced in this patch. However, it would call for a rather different implementation strategy.
The patch failed every Map and Set test on 64-bit Windows: https://tbpl.mozilla.org/php/getParsedLog.php?id=8366440&full=1&branch=try#error0 I can't reproduce any of the test failures on my Windows machine here. Whether I build 32-bit or 64-bit, debug or release, with or without jemalloc, they all pass. Next step: get a tinderbox and debug remotely.
Reproduced on a build machine, even with PGO disabled. During shutdown, GC is calling MapObject::finalize with obj == 0x30. I can't tell why though. Next I'll try it with --disable-optimize. Each build takes rather a long time. :-\
The fix, in short: >- delete map; >+ cx->delete_(map); >- delete set; >+ cx->delete_(set); I'll re-Try with this change.
Attached patch v6 (deleted) — Splinter Review
Unbitrotted again, for checkin. This one passed Try Server. Carrying forward jimb's review.
Attachment #586319 - Attachment is obsolete: true
Attachment #590168 - Flags: review+
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
For documentation beyond just the Map and Set references, I'd like to ask for a separate guide page, stating when it's best to use a particular data type for a given purpose. For example, if the keys are non-negative integers, it might be better to use an array ([]) instead of a Map. If the keys are strings, perhaps a raw object map might be better ({}). If the keys are objects and you might not have a reference to the key forever, perhaps a WeakMap. Note that these examples are just guesses. Jason and Jim, or other JSAPI developers may say that it's better to use Map for in-memory data storage, or they may have better ideas.
(In reply to Alex Vincent [:WeirdAl] from comment #31) > For documentation beyond just the Map and Set references, I'd like to ask > for a separate guide page, stating when it's best to use a particular data > type for a given purpose. I'll be documenting these in a minute. It sounds like a good idea to do a page that lists basic data structures and explain which is better suited for which use case. As an example, I see a lot of people coming from PHP creating arrays all over the place where a JavaScript object would a better fit. > For example, if the keys are non-negative integers, it might be better to > use an array ([]) instead of a Map. If the keys are strings, perhaps a raw > object map might be better ({}). > > If the keys are objects and you might not have a reference to the key > forever, perhaps a WeakMap. > > Note that these examples are just guesses. Jason and Jim, or other JSAPI > developers may say that it's better to use Map for in-memory data storage, > or they may have better ideas. Performance-related issues are interesting to note, but are a different concern. Performance depends of on the browser/platform and changes over time. I would consider these aspects as secondary for documentation purposes. Maybe a section could be dedicated to performance tips.
(In reply to Alex Vincent [:WeirdAl] from comment #31) > If the keys are strings, perhaps a raw > object map might be better ({}). Note that using {} is rarely the right choice, even for string-keyed maps, because of pollution from Object.prototype, __proto__, and such. Object.create(null) makes this somewhat better, but Maps have a very simple model with no corner cases.
Started https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Map (overall structure of the article copied from WeakMap).
Map is just the JavaScript equivalent of a very common and popular data structure that tons of other languages already have: Python has dict Ruby has Hash Java has java.util.HashMap .NET has System.Collections.Generic.Hashtable C++ has std::unordered_map and now JS has Map Set is the same thing: Python has set Ruby has Set (you have to require 'set' first) Java has java.util.HashSet .NET has System.Collections.Generic.HashSet C++ has std::unordered_set and now JS has Set
Assignee: jorendorff → blackconnect
Status: RESOLVED → REOPENED
Component: JavaScript Engine → Java-Implemented Plugins
QA Contact: general → blackconnect
Resolution: FIXED → ---
Target Milestone: mozilla12 → ---
Thanks bugzilla.
Assignee: blackconnect → general
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Component: Java-Implemented Plugins → JavaScript Engine
QA Contact: blackconnect → general
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
(In reply to Sam Tobin-Hochstadt [:samth] from comment #33) > Note that using {} is rarely the right choice, even for string-keyed maps, > because of pollution from Object.prototype, __proto__, and such. > Object.create(null) makes this somewhat better, but Maps have a very simple > model with no corner cases. Right; concretely, consider what happens when your users give you a key named 'toString' or 'valueOf'. This is a fundamental problem with using JS objects as hash tables. The idea of making objects that have properties, arrays that have indices, and hash tables that have keys all be the same thing may have seemed like a good idea when JS was designed, but in practice it works poorly.
The reference doc for Set is ready. https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Set As David is planning to write an article on when to use any of the {Weak|}{Map|Set} objects (comment 32), I'm not marking this as dev-doc-complete yet.
Depends on: 723219
Features documented (by marcoos): https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Set https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Map I'll do the "data structure page" during the next doc sprint (in my MDN TODO list). Marking as dev-doc-complete.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: