Closed Bug 820124 Opened 12 years ago Closed 12 years ago

Optimize str_replace() for RegExps using YARR's MatchOnly mode

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: sstangl, Assigned: sstangl)

References

(Blocks 1 open bug)

Details

Attachments

(6 files, 6 obsolete files)

(deleted), application/x-bzip
Details
(deleted), application/x-bzip
Details
(deleted), patch
dvander
: review+
Details | Diff | Splinter Review
(deleted), patch
dvander
: review+
Details | Diff | Splinter Review
(deleted), text/plain
Details
(deleted), patch
decoder
: feedback+
Details | Diff | Splinter Review
Once Bug 808245 and Bug 820118 are landed, performance on regexp.exec() is roughly comparable with v8's: 8800 versus 9000 on a modified version of v8-regexp that only includes calls to .exec(). Half of v8-regexp uses str.replace(), however, and on a modified version that only includes those calls, v8 receives 3x our points. We can optimize our implementation by using MatchOnly mode in the non-lambda case, and in the frequent case of replacement with the empty string and no perl-style $ variables, write a simple accumulator looped around MatchOnly. This should hopefully be the last thing required to match v8 on v8-regexp.
Assignee: general → sstangl
Blocks: 806646
Attached patch [WIP] Implement fast removal for str_replace(). (obsolete) (deleted) — Splinter Review
Work-in-progress patch. Numbers posted for a reduced v8-regexp benchmark that only includes calls to .replace(). The corresponding code from JSC is Source/JavaScriptCore/runtime/StringPrototype.cpp's removeUsingRegExpSearch(). The patch works completely, but only raises perf from ~2400 -> ~2650. Without any string concatenation whatsovever, perf is ~4600. v8's implementation manages ~6300. From a profile, the hot functions are: > 32.7% in JM/Ion/Yarr code -- likely mostly Yarr > 13.3% in AppendSubstrings() > 8.0% in Yarr's interpreter > 5.0% in str_replace() > 3.5% in str_replace_regexp_remove() At this point, the only difference we have between our implementation of Yarr and Apple's is that Apple's JS engine optimizes for ASCII strings, and compiles special Yarr JIT code to handle ASCII strings more efficiently. The v8-regexp benchmark uses ASCII strings exclusively. The v8 engine makes the same optimization.
Version of the previous benchmark that forces strings to be 16-bit. No effect on our engine performance, but v8 drops from 6400 to 4470. This benchmark is more useful to make comparisons between the two engines, until we implement known-ASCII strings.
Using RegExpObject is a fine idea, but it turns out that most of the regular expressions evaluated in jsstr.cpp are not bound to a RegExpObject. Therefore, in order to use MatchOnly mode in string operations, it's necessary to hold a strong reference to the RegExpShared. Checked for safety with Bill.
Attachment #694695 - Flags: review?(dvander)
This is a 9% performance improvement on the v8-regexp benchmark. It's not what we need, but I would prefer to land this while we continue investigating.
Attachment #694634 - Attachment is obsolete: true
Attachment #694696 - Flags: review?(dvander)
Left in some debugging stuff.
Attachment #694696 - Attachment is obsolete: true
Attachment #694696 - Flags: review?(dvander)
Attachment #695006 - Flags: review?(dvander)
Final version of this patch: builds ropes instead of flattened strings. Perf on regexp-replace-16.js moves from ~2650 -> ~3860, which is the majority of the perf gain that this patch was expected to have. On the same benchmark, v8 manages 4470, but we're not far off at all now.
Attachment #695006 - Attachment is obsolete: true
Attachment #695006 - Flags: review?(dvander)
Attachment #695027 - Flags: review?(dvander)
Attachment #694695 - Flags: review?(dvander) → review+
Comment on attachment 695027 [details] [diff] [review] Part 2/2 - Implement fast removal code for str_replace(). [v3] Review of attachment 695027 [details] [diff] [review]: ----------------------------------------------------------------- Nice!
Attachment #695027 - Flags: review?(dvander) → review+
Backed out for causing failures while parsing reftest manifest files (which make use of regexes). https://hg.mozilla.org/integration/mozilla-inbound/rev/95c2a38b92ad https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=f95b0378d4ee (orange on every crashtest and reftest)
Based on the error and the fact that these patches are regex-related makes me think the error is most likely in the code in lines 736-754 here: http://hg.mozilla.org/mozilla-central/file/ae6237161b6c/layout/tools/reftest/reftest.js#l736 I was going to say 754 was most likely, but now that I see this is about str.replace, I'm thinking maybe 751.
And it looks like it turned a bunch of other tests orange too.
...and it still turned things orange like crazy. Did you push the right patches to inbound? https://hg.mozilla.org/integration/mozilla-inbound/rev/c55e74c41184 https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=038194a2ffc3 for lots of orange.
This is intermittently asserting in Mac debug xpcshell tests: https://tbpl.mozilla.org/php/getParsedLog.php?id=18299360&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=18299858&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=18299992&tree=Mozilla-Inbound TEST-INFO | (xpcshell/head.js) | 0 check(s) todo WARNING: nsExceptionService ignoring thread destruction after shutdown: file ../../../xpcom/base/nsExceptionService.cpp, line 166 Assertion failure: shared->activeUseCount == 0, at ../../../js/src/vm/RegExpObject.cpp:656 <<<<<<< PROCESS-CRASH | /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_resource.js | application crashed [@ js::RegExpCompartment::~RegExpCompartment()] Crash dump filename: /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/F9A481CF-5811-4EB9-BD90-E58230BBD1F9.dmp Operating system: Mac OS X 10.8.0 12A269 CPU: amd64 family 6 model 42 stepping 7 8 CPUs Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS Crash address: 0x0 Thread 0 (crashed) 0 XUL!js::RegExpCompartment::~RegExpCompartment() [HashTable.h : 1232 + 0x0] rbx = 0x00007fff7dd03c68 r12 = 0x00000001071ae400 r13 = 0x000000010585b158 r14 = 0x00000001071ae9a0 r15 = 0x0000000000000015 rip = 0x000000010231a1ec rsp = 0x00007fff5fbfeca0 rbp = 0x00007fff5fbfece0 Found by: given as instruction pointer in context 1 XUL!JSCompartment::~JSCompartment() [jscompartment.cpp : 108 + 0xb] rbx = 0x0000000000000000 r12 = 0x00000001071ae400 r13 = 0x000000010585b158 r14 = 0x00000001071ae400 r15 = 0x0000000000000015 rip = 0x00000001020d424a rsp = 0x00007fff5fbfecf0 rbp = 0x00007fff5fbfed00 Found by: call frame info 2 XUL!SweepCompartments [jscntxt.h : 358 + 0x7] rbx = 0x00007fff5fbfedf8 r12 = 0x00000001071ae400 r13 = 0x000000010585b158 r14 = 0x00007fff5fbfedf8 r15 = 0x0000000000000015 rip = 0x0000000102110211 rsp = 0x00007fff5fbfed10 rbp = 0x00007fff5fbfed70 Found by: call frame info 3 XUL!IncrementalCollectSlice [jsgc.cpp : 3685 + 0x4] rbx = 0x0000000104583000 r12 = 0x00000001045833c8 r13 = 0x0000000104583000 r14 = 0x00000001045833c8 r15 = 0x0000000105877170 rip = 0x000000010210fa2d rsp = 0x00007fff5fbfed80 rbp = 0x00007fff5fbfef10 Found by: call frame info 4 XUL!GCCycle [jsgc.cpp : 4160 + 0x10] rbx = 0x0000000104583000 r12 = 0x0000000000000000 r13 = 0x0000000000000000 r14 = 0x00000001027b843a r15 = 0x0000000104583000 rip = 0x000000010210d51e rsp = 0x00007fff5fbfef20 rbp = 0x00007fff5fbfef60 Found by: call frame info 5 XUL!Collect [jsgc.cpp : 4278 + 0x14] rbx = 0x0000000104583000 r12 = 0x0000000000000000 r13 = 0x00000001045833c8 r14 = 0x0000000000000002 r15 = 0x0000000000000000 rip = 0x000000010210be5c rsp = 0x00007fff5fbfef70 rbp = 0x00007fff5fbfefb0 Found by: call frame info 6 XUL!js::DestroyContext(JSContext*, js::DestroyContextMode) [jscntxt.cpp : 359 + 0x4]
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Sorry, backed out until we can get more information about the intermittent assertions: https://hg.mozilla.org/mozilla-central/rev/f2a500997116 So far we have seen 5 failures out of 90 Mac debug xpcshell runs. That's pretty frequent relative to our average intermittent test bug, but still rare enough that it's not easy to reproduce on demand (e.g. on the Try server).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla20 → ---
The earliest occurrence of the failure is on this changeset (bug 780549), which landed shortly after this bug. There's at least a small chance that the failure is triggered or exposed by bug 780549: https://hg.mozilla.org/integration/mozilla-inbound/rev/58d263c1c3dc
Attached file stack (obsolete) (deleted) —
eval("\ r = RegExp(\"(?!()(((!))))\", \"g\");\ \"^\".replace(r, '');\ r = (\"1+\")\ ") gc() RegExp.$8 causes Assertion failure: JSString::isLinear(), on 32-bit debug Windows shell on m-c changeset 23549b4dffb1, and this no longer occurs on the following changeset f2a500997116 which is a backout of the patches in this bug. Fuzzing is our friend!
Attached file stack (obsolete) (deleted) —
eval("\ x = RegExp(\"()\", \"y\");\ x.test();\ x = {};\ ") gc() RegExp.$6 crashes at js::VectorImpl on 32-bit debug Windows shell on m-c changeset 23549b4dffb1, and this no longer occurs on the following changeset f2a500997116 which is a backout of the patches in this bug.
(In reply to Matt Brubeck (:mbrubeck) from comment #19) > The earliest occurrence of the failure is on this changeset (bug 780549), > which landed shortly after this bug. After more retriggers, the test failure did occur on a changeset before 780549 landed. So it's definitely unrelated to bug 780549.
Attached patch Full patch, for fuzzing. (obsolete) (deleted) — Splinter Review
This is the complete patch, with additional bug fixes. I'm unable to reproduce the random OS X xpcshell failures for which the patch was backed out. Would you mind fuzzing the patch, to see whether it's reproducible in some other way?
Attachment #699517 - Flags: feedback?(gary)
Attachment #699517 - Flags: feedback?(choller)
Comment on attachment 699517 [details] [diff] [review] Full patch, for fuzzing. verifyprebarriers() r = /()()()\3/.test() gc() RegExp().test() Assertion failure: [barrier verifier] Unmarked edge: regexpshared source, (tested on 32-bit Linux shell)
Attachment #699517 - Flags: feedback?(gary) → feedback-
Attached file stack for testcase in comment 24 (deleted) —
Attachment #696157 - Attachment is obsolete: true
Attachment #696180 - Attachment is obsolete: true
Seeing the same error as Gary is seeing and it's trigger happy. I'll leave it running a bit longer but I doubt it'll find a lot because this issue keeps triggering so often.
Comment on attachment 699517 [details] [diff] [review] Full patch, for fuzzing. '123456'.replace(/1(\d+)3/, ''); RegExp.lastMatch.toString(); Assertion failure: end <= matchesInput->length(), at ../vm/RegExpStatics-inl.h:37
Attachment #699517 - Flags: feedback?(choller) → feedback-
Attached patch Full patch, for fuzzing. [v2] (deleted) — Splinter Review
Thanks. This version fixes both of those bugs, thanks to help from Bill. The first bug may have been responsible for the intermittent failures on TBPL.
Attachment #699517 - Attachment is obsolete: true
Attachment #700062 - Flags: feedback?(gary)
Attachment #700062 - Flags: feedback?(choller)
Comment on attachment 700062 [details] [diff] [review] Full patch, for fuzzing. [v2] Marking feedback+ based on the fact that nothing bad has showed up in a few hours' worth of fuzzing. I'll continue to leave it running overnight though.
Attachment #700062 - Flags: feedback?(gary) → feedback+
Comment on attachment 700062 [details] [diff] [review] Full patch, for fuzzing. [v2] No new issues, feedback+ :)
Attachment #700062 - Flags: feedback?(choller) → feedback+
I have a fix; the issue is pre-existing in the tree. Since the patches were backed out before I could push the fix, I guess I'll spin on tryserver yet again.
Landed for the 5th and hopefully the final time. Bug 826581, which unfortunately caused the last backout of this patch, landed, fixing the RegExpShared destructor crashes. https://tbpl.mozilla.org/?tree=Try&rev=2b21065f9ba4 https://hg.mozilla.org/integration/mozilla-inbound/rev/fabb72141c55 https://hg.mozilla.org/integration/mozilla-inbound/rev/d13d8a663302
Follow-up fix: use a prebarrier on the raw source pointer. https://hg.mozilla.org/integration/mozilla-inbound/rev/861fb57b93d8
Depends on: 830676
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
\o/
(In reply to Yuan Pengfei from comment #38) > Some websites are broken on win64 > > First bad: > http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla- > inbound-win64/1358209518/firefox-21.0a1.en-US.win64-x86_64.installer.exe > Last good: > http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla- > inbound-win64/1358209097/firefox-21.0a1.en-US.win64-x86_64.installer.exe > > Steps to reproduce: > 1. Open http://en.mail.qq.com/ > 2. Press tab key Thanks for the report. YARR, the RegExp JIT, contains a new execution mode that is broken on Win64. I have filed Bug 831884 to work around the issue by avoiding the crashy code on Win64 until it can be fixed more properly.
Depends on: 854124
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: