Closed Bug 1317397 Opened 8 years ago Closed 8 years ago

Implement RegExp.lastIndex changes from ES2017

Categories

(Core :: JavaScript: Standard Library, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: anba, Assigned: anba)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

lastIndex is no longer read resp. written to when the RegExp object is neither global nor sticky. Spec changes: https://github.com/tc39/ecma262/pull/627/files
The PR was accepted, but it seems to be erroneous and hence not implementable for us. Consider the following case: ``` var re = /(.)\1/gu; re.lastIndex = { valueOf() { // Recompile without Unicode-flag. re.compile(re.source, "g"); return 0; } }; var r = re.exec("\u{D83D}\u{DCA3}\u{DCA3}"); print(r ? r[0] : null); ``` With the new semantics, it seems that the expected result value is `null`. And here's why: In RegExpBuiltinExec, the [[OriginalFlags]] is retrieved and stored, followed by calling ToLength(Get(R, "lastIndex")). Side-effects in ToLength can recompile the RegExp object! When the [[RegExpMatcher]] tries to find a match at string index 0, a failure is returned, because the recompiled RegExp works on code units and the sequence <U+D83D, U+DCA3> doesn't match "(.)\1". AdvanceStringIndex is called to advance to the next string index, but AdvanceStringIndex uses fullUnicode=true, so it skips over the complete code point U+1F4A3 and returns 2. The matcher also won't be able to find a match at string index 2, and therefore RegExpBuiltinExec returns `null`. I don't think we want to support these semantics, so back to TC39 to get this corrected.
partially reverted: https://github.com/tc39/ecma262/pull/798/files remaining changes: https://tc39.github.io/ecma262/#sec-regexpbuiltinexec > 21.2.5.2.2 Runtime Semantics: RegExpBuiltinExec ( R, S ) > ... > 12. Repeat, while matchSucceeded is false > a. If lastIndex > length, then > i. If global is true or sticky is true, then > 1. Perform ? Set(R, "lastIndex", 0, true). > ... https://tc39.github.io/ecma262/#sec-regexp.prototype-@@search > 21.2.5.9 RegExp.prototype [ @@search ] ( string ) > ... > 5. If SameValue(previousLastIndex, 0) is false, then > a. Perform ? Set(rx, "lastIndex", 0, true). > ... > 7. Let currentLastIndex be ? Get(rx, "lastIndex"). > 8. If SameValue(currentLastIndex, previousLastIndex) is false, then > a. Perform ? Set(rx, "lastIndex", previousLastIndex, true).
Depends on: 1343375
Attached patch bug1317397.patch (obsolete) (deleted) — Splinter Review
The changes for the optimized path in RegExpSearch are bit more tricky, please let me know if you think it needs more explanations.
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Attachment #8843468 - Flags: review?(arai.unmht)
Comment on attachment 8843468 [details] [diff] [review] bug1317397.patch Review of attachment 8843468 [details] [diff] [review]: ----------------------------------------------------------------- Thank you :) great testcases! the comment for RegExpSearch opt path should be sufficient.
Attachment #8843468 - Flags: review?(arai.unmht) → review+
Attached patch bug1317397.patch (deleted) — Splinter Review
Updated the patch to: - fix an eslint style warning - use correct git hashes for ES2017 draft references - and to apply cleanly on inbound. Carrying r+ from arai.
Attachment #8843468 - Attachment is obsolete: true
Attachment #8844060 - Flags: review+
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/21743640e4d1 Only set lastIndex for global or sticky RegExps in RegExpBuiltinExec per ES2017. r=arai
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: