Closed Bug 123437 Opened 23 years ago Closed 21 years ago

regexp backreferences /(a)? etc./ must hold |undefined| if not used

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: pschwartau, Assigned: rogerl)

References

Details

(Whiteboard: [QA note: verify HTML testcase as well as the JS testcase])

From correspondence between rogerl and waldemar: Waldemar: "The match array always has N+1 elements, where N is the number of capturing left parentheses in the regular expression, regardless of whether all of them are actually used or not." EXAMPLE: arr = /(a)?a/("a") arr.toSource() SHOULD BE: ["a", undefined] BUT IN CURRENT SPIDERMONKEY: ["a"] EXAMPLE: arr = /a|(b)/("a") arr.toSource() SHOULD BE: ["a", undefined] BUT IN CURRENT SPIDERMONKEY: ["a"] EXAMPLE: arr = /(a)?(a)/("a") arr.toSource() SHOULD BE: ["a", undefined, "a"] BUT IN CURRENT SPIDERMONKEY: ["a", "", "a"]
Note in the last example, ["a", , "a"] would also be acceptable output. That is, as an array ["a", , "a"] is equivalent to ["a", undefined, "a"]. The same contraction cannot be made in the first two examples above. That is, the array ["a",] is not equivalent to ["a", undefined]. Why? Trailing commas are ignored; ["a",] is equivalent to ["a"].
Note the contrast between a capturing left parenthesis "(" and a NON-capturing left parenthesis "(?:"
Testcase added to JS testsuite: mozilla/js/tests/ecma_3/RegExp/regress-123437.js
*** Bug 156465 has been marked as a duplicate of this bug. ***
cc'ing Brendan for his advice. This bug will be fixed by the big RegExp rewrite in bug 85721. But should we push in a fix for this separately (since it is a small fix) to ensure this gets into JS1.5?
Here is the HTML testcase from the duplicate bug 156465: http://bugzilla.mozilla.org/attachment.cgi?id=90721&action=view OUTPUT IN NN4.7, IE6: // A regexp to trim strings - re = /^\s*(\S+(\s+\S+)*)*\s*$/g str = " hello world! " (note single spaces on each side) str.replace(re, "\"$1\"") = "hello world!" str = " " (i.e. just a single space) str.replace(re, "\"$1\"") = "" OUTPUT IN MOZILLA (trunk 20020701xx): the latter case fails: str = " " (i.e. just a single space) str.replace(re, "\"$1\"") = "$1"
Whiteboard: [QA note: verify HTML testcase as well as the JS testcase]
Summary: Backreferences /(a)? etc./ must hold |undefined| if not used → regexp backreferences /(a)? etc./ must hold |undefined| if not used
Note: this bug will be fixed by the patch for bug 85721; adding that bug as a dependency -
Depends on: RegExpPerf
Blocks: 223247
The RegExp rewrite (bug 85721) has now gone in. It has fixed this bug (the JS testcase in Comment #3 now passes). However, the HTML testcase given in Comment #6 still fails. Therefore I will resolve this bug as FIXED, but re-open bug 156465, where the HTML testcase came from. Apparently bug 156465 was not a duplicate of the current bug after all -
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Oops, I was checking the HTML testcase with an outdated build! It now passes, too. All to the good; I will not re-open bug 156465. Marking Verified -
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.