Closed Bug 1387968 Opened 7 years ago Closed 7 years ago

Reduce allocations in RegExpGetSubstitution

Categories

(Core :: JavaScript: Standard Library, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: anba, Assigned: anba)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

RegExpGetSubstitution is often called when the RegExp contains capturing groups and the replacement string uses "$n" replacement formats. To optimize this case, we should avoid copying the capturing groups in http://searchfox.org/mozilla-central/rev/b52285fffc13f36eca6b47de735d4e4403b3859e/js/src/builtin/RegExp.js#443-466 and give the captures vector in http://searchfox.org/mozilla-central/rev/b52285fffc13f36eca6b47de735d4e4403b3859e/js/src/builtin/RegExp.cpp#1434-1436 a bit of stack space.
The patch changes js::RegExpGetSubstitution to directly take the regexp match-result object instead of separate arguments for the matched string and the captured groups. This allows us to directly pass the match-result to RegExpGetSubstitution in the replace fast paths, which avoids the extra array-copy for the capturing groups.
Attachment #8894423 - Flags: review?(till)
This gives a bit of stack allocated space for the captures vector in js::RegExpGetSubstitution (2 would be enough for Speedometer, but it probably doesn't hurt to double the stack allocated space to 4 elements). It also optimizes the array accesses slightly by directly accessing the dense elements instead of going through the more general GetElement(). And in intrinsic_RegExpGetSubstitution, we can easily avoid rooting the strings twice by directly calling ensureLinear() on the arguments, so let's do that, too.
Attachment #8894429 - Flags: review?(till)
And a bit of additional clean-up for RegExp code: - IsTrailSurrogateWithLeadSurrogate had an unused JSContext*, so removed that. - ExecuteRegExp was re-rooting an already rooted object. - It should be able to avoid the extra rooting in GetFirstDollarIndex for the arguments string. - And intrinsic_GetStringDataProperty had some unnecessary rooting, too. I'll run the hazard analysis in the try-push to make sure the rooting changes are all okay to do.
Attachment #8894431 - Flags: review?(till)
(In reply to André Bargull from comment #3) > I'll run the hazard analysis in the try-push to make sure the rooting > changes are all okay to do. Try looks good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=22c5f47704485b72a606cc88fbf541897d4b0811
Comment on attachment 8894423 [details] [diff] [review] bug1387968-part1-reduce-array-copies.patch Review of attachment 8894423 [details] [diff] [review]: ----------------------------------------------------------------- This looks good, thanks. I left a bunch of comments for some further optimizations. I can't see a reason why those wouldn't work, but applying them is optional. r=me either way. ::: js/src/builtin/RegExp.cpp @@ +1420,5 @@ > > + // Step 10 (reordered). > + uint32_t matchResultLength; > + if (!GetLengthProperty(cx, matchResult, &matchResultLength)) > + return false; If you change this function to take matchResult as a HandleArrayObject - see comments for SelfHosting.cpp below -, you can replace this with uint32_t matchResultLength = matchResult->length() @@ +1424,5 @@ > + return false; > + MOZ_ASSERT(matchResultLength > 0); > + > + RootedValue matchedValue(cx); > + if (!GetElement(cx, matchResult, matchResult, 0, &matchedValue)) This and the GetElement calls below should be replaceable by matchResult->getDenseElement(n), assuming you change matchResult to be an ArrayObject. ::: js/src/vm/SelfHosting.cpp @@ +1536,5 @@ > +#ifdef DEBUG > + bool isArray = false; > + MOZ_ALWAYS_TRUE(IsArray(cx, matchResult, &isArray)); > + MOZ_ASSERT(isArray); > +#endif You should be able to replace all this with MOZ_ASSERT(matchResult->is<ArrayObject>()). Or, even better, actually coerce to ArrayObject and pass that to RegExpGetSubstitution: RootedArrayObject matchResult(cx, &args[0].toObject().as<ArrayObject>())
Attachment #8894423 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #5) > Comment on attachment 8894423 [details] [diff] [review] > bug1387968-part1-reduce-array-copies.patch > > Review of attachment 8894423 [details] [diff] [review]: > ----------------------------------------------------------------- > > This looks good, thanks. I left a bunch of comments for some further > optimizations. I can't see a reason why those wouldn't work, but applying > them is optional. r=me either way. Oh, now I do see a reason: an UnboxedArrayObject isn't a NativeObject. That probably makes these optimizations too annoying to be worth it.
Comment on attachment 8894429 [details] [diff] [review] bug1387968-part2-optimize-getsubstitution.patch Review of attachment 8894429 [details] [diff] [review]: ----------------------------------------------------------------- Hah, and now I see that part of these optimizations largely boil down to those I suggested in the previous review. Excellent.
Attachment #8894429 - Flags: review?(till) → review+
Comment on attachment 8894431 [details] [diff] [review] bug1387968-part3-extra-rooting.patch Review of attachment 8894431 [details] [diff] [review]: ----------------------------------------------------------------- r=me ::: js/src/builtin/RegExp.cpp @@ +897,5 @@ > * otherwise. YOU HAVE BEEN WARNED. > */ > > /* Steps 1-2 performed by the caller. */ > + Handle<RegExpObject*> reobj = regexp.as<RegExpObject>(); There must be thousands of places in our code base where we could avoid re-rooting like this ...
Attachment #8894431 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #8) > ::: js/src/builtin/RegExp.cpp > @@ +897,5 @@ > > * otherwise. YOU HAVE BEEN WARNED. > > */ > > > > /* Steps 1-2 performed by the caller. */ > > + Handle<RegExpObject*> reobj = regexp.as<RegExpObject>(); > > There must be thousands of places in our code base where we could avoid > re-rooting like this ... Nah, we don't seem to do that bad! :-) http://searchfox.org/mozilla-central/search?q=Rooted.*%26%5Cw%2B(%5C.%7C-%3E)as%3C%5Cw%2B%3E%5C(%5C)%5C)&case=false&regexp=true&path=js%2Fsrc
(In reply to André Bargull from comment #9) > Nah, we don't seem to do that bad! :-) "Number of results: 1000 (maximum is 1000)" I fear we do ;)
(In reply to Till Schneidereit [:till] from comment #10) > (In reply to André Bargull from comment #9) > > Nah, we don't seem to do that bad! :-) > > "Number of results: 1000 (maximum is 1000)" > > I fear we do ;) That's strange, I get: "Number of results: 141 (maximum is 1000)".
(In reply to André Bargull from comment #11) > That's strange, I get: "Number of results: 141 (maximum is 1000)". Oh, interesting: I'm using the Skip Redirect extension[1], and that somehow messes up the link when clicked in gmail. If I disable that or click the link in bugzilla, I get your result. 141 is indeed much better than I'd have hoped. [1] https://addons.mozilla.org/en-US/firefox/addon/skip-redirect/
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/769b9e5ce199 Part 1: Reduce array copies for RegExpGetSubstitution with captures. r=till https://hg.mozilla.org/integration/mozilla-inbound/rev/cd1d769cabe8 Part 2: Optimize array accesses and allocations in RegExpGetSubstitution. r=till https://hg.mozilla.org/integration/mozilla-inbound/rev/1f294885ea45 Part 3: Remove unnecessary or duplicate rooting in RegExp code. r=till
Keywords: checkin-needed
Depends on: 1389904
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: