Closed
Bug 808245
Opened 12 years ago
Closed 12 years ago
Use YARR's new MatchOnly JIT mode
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla20
People
(Reporter: dvander, Assigned: sstangl)
References
(Blocks 1 open bug)
Details
Attachments
(6 files, 3 obsolete files)
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
On benchmarks, particularly v8-regexp (and on the web, too), it's common to have code like:
/(pattern)/.exec(string);
And never consume the matched subexpressions. We currently optimize this by not creating the output array, but we can do better by using YARR's new MatchOnly mode. This mode compiles the regex code to not bother saving the subexpression positions in the first place.
Unfortunately there is one complication: the global RegExp object has unspecified, defacto behavior that the last match results are available via RegExp.$1, RegExp.$2, etc. In SpiderMonkey, these are implemented in RegExpStatics.*. To avoid breaking this, JavaScriptCore has an internal variable on the global RegExp constructor that caches the last match inputs (that is, the string and regex pattern). If the user requests RegExp.$n, and the results were optimized away, it will re-run the regex to compute the results.
We expect this to fix bug 806646, which would account for the largest single performance improvement we can currently make on our V8 score. It should be pretty straightforward.
Assignee | ||
Comment 1•12 years ago
|
||
If the result of exec() is unused, we can actually get away with executing almost nothing at all -- all that's necessary is to cache the last input for reasons given in the above comment.
Reporter | ||
Comment 2•12 years ago
|
||
(In reply to Sean Stangl from comment #1)
> If the result of exec() is unused, we can actually get away with executing
> almost nothing at all -- all that's necessary is to cache the last input for
> reasons given in the above comment.
Good point!
Assignee | ||
Updated•12 years ago
|
Assignee: general → sstangl
Reporter | ||
Updated•12 years ago
|
Status: NEW → ASSIGNED
Comment 3•12 years ago
|
||
How much of an improvement is this bug expected to make? Currently, our v8-regex score is really bad. ETA on fix?
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Paul Wright from comment #3)
> How much of an improvement is this bug expected to make? Currently, our
> v8-regex score is really bad. ETA on fix?
Adding MatchOnly mode moves v8-regexp's score from 1800 to 2000 locally; we require 3600. MatchOnly mode roughly matches our performance on .test() and unused .exec() with the v8 engine (8800 versus 9000 on a v8-regexp benchmark with only .exec() calls), but our benchmark score is held back by poor .replace() performance.
I have patches to address that, but I'm working on breaking them up to post in separate bugs for ease of review. The patch I originally wrote for this bug turned out to be way too complicated.
Comment 5•12 years ago
|
||
> but our benchmark score is held back by poor .replace() performance.
>
> I have patches to address that, but I'm working on breaking them up to post
> in separate bugs for ease of review.
Can you point me in the direction of the applicable bugs / patches?
Assignee | ||
Comment 6•12 years ago
|
||
This is the first patch in a three-part series, the latter two to be resolved by Bug 820118 and Bug 820124. I have broken it up in this way in the hope of making the patches easier to review.
This patch implements YARR MatchOnly support and uses it in regexp_test(). In order to support this new mode, it was necessary to change the behavior of ExecuteRegExp() and its callsites. Because ExecuteRegExp() behavior is baked into the JSAPI, a function ExecuteRegExpLegacy() now exists that provides the old behavior, including output array creation.
A number of data structure changes facilitate the MatchOnly support. Instead of requiring ExecuteRegExp() callers to hold a LifoAllocScope, re.execute() now accepts a |MatchPairs| outparam, which is a resizable array of |MatchPair| objects, with memory details fully handled by the caller via |ScopedMatchPairs| and |VectorMatchPairs| (for RegExpStatics). This allows the same code to store directly into the RegExpStatics matches buffer without copying in the lazy evaluation case. Most uses of raw integers in matches have also been changed into uses of MatchPair objects, improving readability.
In total, the patch makes the following changes:
> Removes the RegExpCode/RegExpShared distinction, keeping RegExpShared,
> Compiles RegExp code at execution time, not when getting the RegExpShared,
> Changes RegExp execution to always output MatchPairs or a MatchPair,
> Uses a MatchConduit (union) to switch between MatchOnly/IncludeSubpatterns,
> Moves most RegExpShared querying logic into MatchPairs,
> Lazifies RegExpShared.
Local improvement to v8-regexp is about 1280 -> 1380, but the score is primarily held back by str_replace() performance. Bug 820124 will fix that, but requires MatchOnly mode to exist. On a version of the v8-regexp benchmark with only calls to .exec() (which are executed as .test()), we roughly match v8, ~8800 versus ~9000.
Assignee | ||
Updated•12 years ago
|
Attachment #690572 -
Flags: review?(dvander)
Reporter | ||
Updated•12 years ago
|
Attachment #690572 -
Flags: feedback?(choller)
Reporter | ||
Updated•12 years ago
|
Attachment #690572 -
Flags: feedback?(gary)
Assignee | ||
Comment 7•12 years ago
|
||
Version 2, with some additional cleanup and bugfixes around lazy RegExpStatics::makeMatch().
Attachment #690572 -
Attachment is obsolete: true
Attachment #690572 -
Flags: review?(dvander)
Attachment #690572 -
Flags: feedback?(gary)
Attachment #690572 -
Flags: feedback?(choller)
Attachment #691128 -
Flags: review?(dvander)
Assignee | ||
Updated•12 years ago
|
Attachment #691128 -
Flags: feedback?(gary)
Attachment #691128 -
Flags: feedback?(choller)
Comment 8•12 years ago
|
||
Comment on attachment 691128 [details] [diff] [review]
Implement MatchOnly support, v2
feedback+ given that it didn't blow up jsfunfuzz after about half an hour, but this may be a red herring given that the regex fuzzer in jsfunfuzz keeps hitting bug 808478 before anything else.
Attachment #691128 -
Flags: feedback?(gary) → feedback+
Assignee | ||
Comment 9•12 years ago
|
||
To make the patch easier to review, I have retroactively broken it up into 5 parts. I didn't go too crazy with rewriting data structures, so occasionally a component from a later patch will creep into an earlier one. I verified that all of the patches compile, but did not bother running or testing them.
The series appears to be green on try, with some random orange: https://tbpl.mozilla.org/?tree=Try&rev=050160f7bc4e
Assignee | ||
Comment 10•12 years ago
|
||
Changes are mostly to vim modelines. At some point I will run through the repo and update all of them, but these were the ones I changed when my editor behaved obnoxiously.
Attachment #691128 -
Attachment is obsolete: true
Attachment #691128 -
Flags: review?(dvander)
Attachment #691128 -
Flags: feedback?(choller)
Attachment #691640 -
Flags: review?(dvander)
Assignee | ||
Comment 11•12 years ago
|
||
Precursor to supporting compilation at execution time. Unused, and OK'd by Luke.
Attachment #691641 -
Flags: review?(dvander)
Assignee | ||
Comment 12•12 years ago
|
||
Data structure simplification.
Attachment #691642 -
Flags: review?(dvander)
Assignee | ||
Comment 13•12 years ago
|
||
Attachment #691643 -
Flags: review?(dvander)
Assignee | ||
Comment 14•12 years ago
|
||
This is the main data structure change patch, which rewrites re.execute() and all associated functions to pass the full matches list up to the caller. The interface to MatchPairs is also improved.
Most of the changes to jsstr.cpp are in preparation for lazifying RegExpStatics. They were too obnoxious to rip out.
Attachment #691646 -
Flags: review?(dvander)
Assignee | ||
Comment 15•12 years ago
|
||
Adds executeMatchOnly(). In order to use MatchOnly mode in regexp_test(), lazy RegExpStatics need to be in place, so this patch does that also. The cleanup to RegExpStatics was done as part of the previous patch -- I just deleted anything to do with the lazy mode.
Attachment #691647 -
Flags: review?(dvander)
Reporter | ||
Updated•12 years ago
|
Attachment #691640 -
Flags: review?(dvander) → review+
Reporter | ||
Updated•12 years ago
|
Attachment #691641 -
Flags: review?(dvander) → review+
Reporter | ||
Updated•12 years ago
|
Attachment #691642 -
Flags: review?(dvander) → review+
Reporter | ||
Comment 16•12 years ago
|
||
Comment on attachment 691643 [details] [diff] [review]
Part 3/5 - Compile at execution time
Review of attachment 691643 [details] [diff] [review]:
-----------------------------------------------------------------
talked IRL, map code needs to be added back
Attachment #691643 -
Flags: review?(dvander)
Assignee | ||
Comment 17•12 years ago
|
||
Oh my. This is the correct version of the patch. dvander pointed out that I got carried away with code removal, and removed waaaay too much.
1) Adding the RegExpShared into map_ is necessary for code reuse.
2) Adding the RegExpShared into inUse_ is necessary for object cleanup.
Attachment #691643 -
Attachment is obsolete: true
Attachment #692092 -
Flags: review?(dvander)
Reporter | ||
Updated•12 years ago
|
Attachment #691647 -
Flags: review?(dvander) → review+
Reporter | ||
Updated•12 years ago
|
Attachment #691646 -
Flags: review?(dvander) → review+
Reporter | ||
Comment 18•12 years ago
|
||
Comment on attachment 692092 [details] [diff] [review]
Part 3/5 - Compile at execution time [v2]
Review of attachment 692092 [details] [diff] [review]:
-----------------------------------------------------------------
Nice, glad to see all this RegExp stuff get cleaned up.
Attachment #692092 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 19•12 years ago
|
||
Thank you for the quick reviews!
Try run submitted for final state, green with random orange:
https://tbpl.mozilla.org/?tree=Try&rev=d8a45a696e8b
Assignee | ||
Comment 20•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9121bae603a2
https://hg.mozilla.org/integration/mozilla-inbound/rev/787527f064da
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea7d93401f96
https://hg.mozilla.org/integration/mozilla-inbound/rev/d229008d60ac
https://hg.mozilla.org/integration/mozilla-inbound/rev/7711a36c2771
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7e2ba73b2ff
Comment 21•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/9121bae603a2
https://hg.mozilla.org/mozilla-central/rev/787527f064da
https://hg.mozilla.org/mozilla-central/rev/ea7d93401f96
https://hg.mozilla.org/mozilla-central/rev/d229008d60ac
https://hg.mozilla.org/mozilla-central/rev/7711a36c2771
https://hg.mozilla.org/mozilla-central/rev/b7e2ba73b2ff
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Comment 22•12 years ago
|
||
Comment on attachment 691646 [details] [diff] [review]
Part 4/5 - Use MatchPairs for RegExp output
Review of attachment 691646 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/vm/RegExpObject.cpp
@@ +122,5 @@
>
> + /* Initialize all MatchPair objects to invalid locations. */
> + for (size_t i = 0; i < pairCount; i++) {
> + pairs_[i].start = size_t(-1);
> + pairs_[i].limit = size_t(-1);
Sooo... start and limit are ints, which gives me
js/src/vm/RegExpObject.cpp:125:36 [-Woverflow] overflow in implicit constant conversion
js/src/vm/RegExpObject.cpp:126:36 [-Woverflow] overflow in implicit constant conversion
Assignee | ||
Comment 23•12 years ago
|
||
(In reply to :Ms2ger from comment #22)
> Comment on attachment 691646 [details] [diff] [review]
> Part 4/5 - Use MatchPairs for RegExp output
>
> Review of attachment 691646 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: js/src/vm/RegExpObject.cpp
> @@ +122,5 @@
> >
> > + /* Initialize all MatchPair objects to invalid locations. */
> > + for (size_t i = 0; i < pairCount; i++) {
> > + pairs_[i].start = size_t(-1);
> > + pairs_[i].limit = size_t(-1);
>
> Sooo... start and limit are ints, which gives me
>
> js/src/vm/RegExpObject.cpp:125:36 [-Woverflow] overflow in implicit constant
> conversion
> js/src/vm/RegExpObject.cpp:126:36 [-Woverflow] overflow in implicit constant
> conversion
That initialization isn't even necessary, since Yarr's JIT performs the same initialization as the first step of execution. I'll fix it as part of one of the other regexp bugs.
You need to log in
before you can comment on or make changes to this bug.
Description
•