Crash [@ v8::internal::RegExpCapture::ToNode] with too much recursion on 32-bit
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | wontfix |
firefox77 | --- | unaffected |
firefox78 | --- | wontfix |
firefox79 | --- | wontfix |
firefox80 | --- | wontfix |
firefox81 | --- | verified |
People
(Reporter: decoder, Assigned: iain)
References
(Regression)
Details
(Keywords: crash, regression, testcase, Whiteboard: [bugmon:update,bisected,confirmed])
Crash Data
Attachments
(3 files)
The following testcase crashes on mozilla-central revision 20200608-63dc5e9b1b02 (debug build, run with --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off):
evalInWorker(`
var N = 10000;
var left = repeat_str('(1&', N);
var right = repeat_str(')', N);
var str = 'actual = '.concat(left, '1', right, ';');
function repeat_str(str, repeat_count) {
var arr = new Array(--repeat_count);
while (repeat_count != 0)
arr[--repeat_count] = str;
return str.concat.apply(str, arr);
}
var re = new RegExp(str);
re.exec(N - 1);
`);
Backtrace:
received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf60ffb40 (LWP 23404)]
0x571e52a6 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpTree*, int, v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#0 0x571e52a6 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpTree*, int, v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#1 0x571e5255 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#2 0x571e55ad in v8::internal::RegExpAlternative::ToNode(v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#3 0x571e5310 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpTree*, int, v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#4 0x571e5255 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
#5 0x571e55ad in v8::internal::RegExpAlternative::ToNode(v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
[...]
#127 0x571e5255 in v8::internal::RegExpCapture::ToNode(v8::internal::RegExpCompiler*, v8::internal::RegExpNode*) ()
eax 0x4672 18034
ebx 0x591c8000 1495040000
ecx 0x4672 18034
edx 0x59177560 1494709600
esi 0x4673 18035
edi 0xf593d910 -174860016
ebp 0xf5fc0028 4126933032
esp 0xf5fc0000 4126932992
eip 0x571e52a6 <v8::internal::RegExpCapture::ToNode(v8::internal::RegExpTree*, int, v8::internal::RegExpCompiler*, v8::internal::RegExpNode*)+70>
=> 0x571e52a6 <_ZN2v88internal13RegExpCapture6ToNodeEPNS0_10RegExpTreeEiPNS0_14RegExpCompilerEPNS0_10RegExpNodeE+70>: call 0x571dc770 <_ZN2v88internal4Zone3NewEj>
0x571e52ab <_ZN2v88internal13RegExpCapture6ToNodeEPNS0_10RegExpTreeEiPNS0_14RegExpCompilerEPNS0_10RegExpNodeE+75>: add $0x10,%esp
I didn't see this on 64-bit. It might be that we have limits in place that work out for 64-bit platforms, but they might be insufficient for 32-bit.
Reporter | ||
Comment 1•4 years ago
|
||
Comment 2•4 years ago
|
||
Iain, usually this is solved with (CheckrecursionLimit)[https://searchfox.org/mozilla-central/rev/fac90408bcf52ca88a3dcd2ef30a379b68ab24e2/js/src/jsfriendapi.h#1009], however I do not know how this would play with the iregexp integration.
Assignee | ||
Comment 3•4 years ago
|
||
Yeah, V8 has StackLimitCheck, which we implement as a wrapper around CheckRecursionLimit. It's not being used here, though, and it's awkward to insert, because the code where we overflow is doing a recursive walk of the AST, so we would have to rewrite many different calls to be fallible. I upstreamed a patch to fix a similar bug by making it not use a recursive algorithm, but it would be a much larger rewrite here.
There are a few ways to fix this, none of them ideal. We can reduce the maximum number of regexp captures on 32 bit. We actually care about the number of nested captures, not the total number of captures, so if we want to be more precise, we can walk the AST and figure that out. (Unfortunately, although irregexp provides an interface for walking the AST, it is not well designed. Lots of void* floating around.)
Uploading a patch to do the latter, although I'm also open to doing the former.
Assignee | ||
Comment 4•4 years ago
|
||
On 32-bit, pathological regexps can overflow the stack just from recursively walking the AST during compilation. One way to avoid this is by checking the AST ahead of time to make sure it is not too deep.
The visitor API for RegExpTree is sad and full of void*
spiders, but I don't think it's worth upstreaming a patch to make it nicer.
An alternative to this patch would be to change JSRegExp::maxCaptures (in RegExpShim.h) from 1 << 15
to 1 << 13
on 32-bit. This has the upside of not having to write a gross visitor, but the downside of preventing us from compiling regexps that we could otherwise compile.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 5•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 6•4 years ago
|
||
We observe this but on 64
uname -a
: Linux my.host.name 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
this regex (from old systemjs):
var esmRegEx = /(^\s*|[}\);\n]\s*)(import\s*(['"]|(\*\s+as\s+)?(?!type)([^"'\(\)\n; ]+)\s*from\s*['"]|\{)|export\s+\*\s+from\s+["']|export\s*(\{|default|function|class|var|const|let|async\s+function))/;
fails on this file:
https://github.com/tutao/tutanota/blob/8b51475de7e0d4a2f1b5e3e52aac6d59308a1034/src/api/worker/crypto/lib/crypto-sjcl-1.0.7.js
The bug which breaks module loader (and possible any other regex) sounds pretty bad if you ask me.
Fortunately we don't really need this regex but who knows where else it will crop up.
Reporter | ||
Comment 7•4 years ago
|
||
Sounds like we might have to retriage priority for this based on comment 6?
Assignee | ||
Comment 8•4 years ago
|
||
The problem in this bug occurs when parsing extremely large regexp patterns with deeply nested capture groups (thousands of levels deep). It's not likely to occur in any regexp written by a human. I don't think comment 6 is describing the same failure.
Ivan, what do you mean by "fails on this file"? Does it fail to match? Does it throw an exception? Does the tab crash? I can't replicate any problem locally.
Comment 9•4 years ago
|
||
Lain, sorry for being unclear, I get the same exception.
You can try to clone https://github.com/tutao/tutanota
npm ci
node make prod
open http://localhost:9000/
I get such a stack trace (maybe on the next reload):
Unhandled rejection translateAndInstantiate/<@http://localhost:9000/build/libs/system.src.js:3467:76
tryCatcher@http://localhost:9000/build/libs/bluebird.js:5370:23
[22]</module.exports/Promise.prototype._settlePromiseFromHandler@http://localhost:9000/build/libs/bluebird.js:3366:31
[22]</module.exports/Promise.prototype._settlePromise@http://localhost:9000/build/libs/bluebird.js:3423:18
[22]</module.exports/Promise.prototype._settlePromise0@http://localhost:9000/build/libs/bluebird.js:3468:10
[22]</module.exports/Promise.prototype._settlePromises@http://localhost:9000/build/libs/bluebird.js:3548:18
_drainQueueStep@http://localhost:9000/build/libs/bluebird.js:145:12
_drainQueue@http://localhost:9000/build/libs/bluebird.js:138:24
[2]</Async.prototype._drainQueues@http://localhost:9000/build/libs/bluebird.js:154:16
Async/this.drainQueues@http://localhost:9000/build/libs/bluebird.js:67:14
after modifying build/libs/system.src.js on line 3467 like this to print the exception:
try {
if (metadata.load.format !== 'esm' && (metadata.load.format || !source.match(esmRegEx))) {
return source;
}
} catch (e) {
console.log(e)
throw e
}
I see InternalError: too much recursion
.
I would expect that it depends on the system stack settings.
➜ ~ ulimit -s
8192
It only occurs in Firefox and only started happening recently and looks very similar from the outside.
If you need more info, please let me know.
Assignee | ||
Comment 10•4 years ago
|
||
Throwing InternalError: too much recursion
is the expected result here. It means that you have overflowed your stack. If your stack is small enough and your regexp is large enough, we will run out of stack space and there's not much we can do about it besides throwing an exception.
This bug is about a gap in our stack overflow detection code. When this bug triggers, we do not throw a JS exception; we overflow the native stack and the process crashes.
It's a little interesting that Firefox crashes on your testcase and Chrome doesn't, because we're using the same regexp engine under the hood. Are you using the same stack limit in both cases?
Comment 11•4 years ago
|
||
Lain, thank you for the clarification, I indeed misunderstood it.
I believe this is another issue with new regexp engine. I am aware that you use the same one as in V8 now but I only observe the issue in Firefox.
Shall I open a separate issue then? As you can see in the comment 6 regexp is not too large.
Assignee | ||
Comment 12•4 years ago
|
||
Before opening a bug, can you confirm which version of Firefox you're using? We replaced our regexp engine in Firefox 78.
Comment 13•4 years ago
|
||
Yes, it is indeed Firefox 78
Version 78.0.1
Build ID 20200630195452
Built from https://hg.mozilla.org/releases/mozilla-release/rev/31d99efb370ca7525ebd376a62ec503f0f6de6d9
Assignee | ||
Comment 14•4 years ago
|
||
Huh, okay. Go ahead and open a bug and I'll take a look, although there's a good chance there's not much to be done about it.
Comment 15•4 years ago
|
||
Comment 16•4 years ago
|
||
I'm seeing this on my 64-bit linux laptop, as a failure in the jit-test regexp/huge-02.js
. (Uploaded the log.) My mozconfig is:
ac_add_options --enable-project=js --with-project=js
ac_add_options --enable-debug
ac_add_options --disable-optimize
ac_add_options --enable-js-shell
ac_add_options --enable-fuzzing
ac_add_options --enable-linker=lld
mk_add_options MOZ_MAKE_FLAGS="-j8 -s"
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-debug-js
(Yes, that's a silly mozconfig. But it is indeed what I'm using.)
It's reproducible on the exact build I have now. I have an rr pack of it, if that helps. (But it's 649MB uncompressed.)
Comment 17•4 years ago
|
||
Comment 18•4 years ago
|
||
bugherder |
Comment 19•4 years ago
|
||
Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 21•4 years ago
|
||
Comment 22•4 years ago
|
||
Should we land a test for this?
Assignee | ||
Comment 23•4 years ago
|
||
It's hard to write a portable testcase for this, because the size of regexp necessary to trigger it depends on the machine / compiler / config options and so on. I tried writing a test that would do a binary search to find the correct size, but it turns out that we don't get enough information from an exception to determine whether we are above or below the target.
As mentioned in comment 16, we already have a test in the test suite that will trigger this in at least one configuration. I think that's about the best we can reasonably do.
Description
•