Closed
Bug 1033946
Opened 10 years ago
Closed 10 years ago
Differential Testing: Different output message involving RegExp, exec, --latin1-strings
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla34
Tracking | Status | |
---|---|---|
firefox31 | --- | unaffected |
firefox32 | --- | wontfix |
firefox33 | --- | wontfix |
firefox34 | --- | fixed |
firefox35 | --- | fixed |
People
(Reporter: gkw, Assigned: bhackett1024)
References
Details
(Keywords: regression, testcase)
Attachments
(1 file, 2 obsolete files)
(deleted),
patch
|
jandem
:
review+
Sylvestre
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
print((/(?!(?!(?!6)[\Wc]))/i).exec())
$ ./js-dbgDisabled-opt-64-prof-dm-ts-darwin-5d9af625f42e --ion-offthread-compile=off --ion-eager 370.js
$ ./js-dbgDisabled-opt-64-prof-dm-ts-darwin-5d9af625f42e --ion-offthread-compile=off --ion-eager --latin1-strings 370.js
null
(Tested this on 64-bit Mac js opt threadsafe deterministic shell off m-c rev 5d9af625f42e)
My configure flags are:
CC="clang -Qunused-arguments" CXX="clang++ -Qunused-arguments" AR=ar sh /Users/skywalker/trees/mozilla-central/js/src/configure --target=x86_64-apple-darwin12.5.0 --disable-debug --enable-optimize --enable-profiling --enable-gczeal --enable-debug-symbols --disable-tests --enable-more-deterministic --with-ccache --enable-threadsafe <other NSPR options>
Guessing this is related to bug 998392 as well, so setting needinfo? from Jan.
Flags: needinfo?(jdemooij)
Comment 1•10 years ago
|
||
Hah, this seems to be a bug in our irregexp port somewhere that only affects TwoByte strings:
I get the following output for:
print((/(?!(?!(?!6)[\Wc]))/i).test());
d8: : false
SM before irregexp : false
SM --latin1-strings: false
SM : true
Comment 2•10 years ago
|
||
Not sure but I think this is a bug in CharacterRange::AddCaseEquivalents.
Flags: needinfo?(jdemooij) → needinfo?(bhackett1024)
Updated•10 years ago
|
status-firefox31:
--- → unaffected
status-firefox32:
--- → affected
status-firefox33:
--- → affected
Assignee | ||
Comment 3•10 years ago
|
||
This is a problem with the same \u0130 'capital I with a dot over it' character that also caused problems in one of the existing tests in bug 976446. That bug just changed the test, though I guess this fix is better (which also restores the original test).
Assignee: nobody → bhackett1024
Attachment #8456580 -
Flags: review?(jdemooij)
Flags: needinfo?(bhackett1024)
Comment 4•10 years ago
|
||
Comment on attachment 8456580 [details] [diff] [review]
patch
Review of attachment 8456580 [details] [diff] [review]:
-----------------------------------------------------------------
I don't think this is correct, with the patch the following test fails:
assertEq("foobar\xff5baz\u1200".search(/bar\u0178\d/i), 3);
Attachment #8456580 -
Flags: review?(jdemooij)
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #4)
> Comment on attachment 8456580 [details] [diff] [review]
> patch
>
> Review of attachment 8456580 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I don't think this is correct, with the patch the following test fails:
>
> assertEq("foobar\xff5baz\u1200".search(/bar\u0178\d/i), 3);
OK, I have no idea what the correct fix is here then.
We have this comment:
// The standard requires that non-ASCII characters cannot have ASCII
// character codes in their equivalence class.
But this is contradicted by our unicode tables (which we presumably pull from some standard source), in which the lowercase of \u0130 is a lowercase 'i'.
We could just watch for cases where the unicode tables have a pattern like this, and filter it out, like this patch does, but doing that on Latin1 characters is apparently not correct. We could do it on Ascii characters (<= 127) but the irregexp source already seems to conflate the concepts of Ascii strings and Latin1 strings all over the place, including when invoking this mysterious standard.
Should I just do a spotfix for this stupid \u0130 character?
Assignee: bhackett1024 → nobody
Comment 6•10 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #5)
> OK, I have no idea what the correct fix is here then.
I think the problem is in CharacterRange::AddCaseEquivalents. IIRC we call GetCaseIndependentLetters, but V8 doesn't. We should look at their code and see what they do exactly, and in which cases it's different from what we do.
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #6)
> (In reply to Brian Hackett (:bhackett) from comment #5)
> > OK, I have no idea what the correct fix is here then.
>
> I think the problem is in CharacterRange::AddCaseEquivalents. IIRC we call
> GetCaseIndependentLetters, but V8 doesn't. We should look at their code and
> see what they do exactly, and in which cases it's different from what we do.
v8's unicode library's interface is totally different from ours, so to avoid needing to port all of that stuff (and any additional dependencies) I implemented it less efficiently using GetCaseIndependentLetters and our library's ToUpperCase and ToLowerCase. Barring spec insanity (to wit: this bug) these should be equivalent.
Comment 8•10 years ago
|
||
I know it sucks but we should still fix it somehow :(
Assignee | ||
Comment 9•10 years ago
|
||
Attachment #8456580 -
Attachment is obsolete: true
Attachment #8458125 -
Flags: review?(jdemooij)
Comment 10•10 years ago
|
||
Hm there's also a problem with "k" if you disable Latin1 strings.. (EnableLatin1Strings in String.cpp
assertEq((/(?!(?!(?!6)[\Wc]))/i).test("k"), false);
Comment 11•10 years ago
|
||
The spec says this:
The abstract operation Canonicalize takes a character parameter ch and performs the following steps:
1. If IgnoreCase is false, return ch.
2. Let u be ch converted to upper case as if by calling the standard built-in method String.prototype.toUpperCase on the one-character String ch.
3. If u does not consist of a single character, return ch.
4. Let cu be u's character.
5. If ch's code unit value is greater than or equal to decimal 128 and cu's code unit value is less than decimal 128, then return ch.
6. Return cu.
Does that help?
Updated•10 years ago
|
Attachment #8458125 -
Flags: review?(jdemooij)
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 12•10 years ago
|
||
Attachment #8458125 -
Attachment is obsolete: true
Attachment #8466291 -
Flags: review?(jdemooij)
Flags: needinfo?(bhackett1024)
Comment 13•10 years ago
|
||
Comment on attachment 8466291 [details] [diff] [review]
patch
Review of attachment 8466291 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/irregexp/RegExpEngine.cpp
@@ +215,5 @@
> + // Watch for duplicates.
> + bool found = false;
> + for (size_t j = 0; j < count; j++) {
> + if (letters[j] == c)
> + found = true;
Nit: can |break;| here.
Attachment #8466291 -
Flags: review?(jdemooij) → review+
Reporter | ||
Comment 14•10 years ago
|
||
Thanks for fixing! Perhaps this is ready for landing soon?
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 15•10 years ago
|
||
Flags: needinfo?(bhackett1024)
Comment 16•10 years ago
|
||
Assignee: nobody → bhackett1024
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Flags: qe-verify-
Updated•10 years ago
|
Comment 18•10 years ago
|
||
Comment on attachment 8466291 [details] [diff] [review]
patch
The patch was not backported but apparently people are running into this (see bug 1076728). I know it's late in the cycle but requesting approval just in case.
Approval Request Comment
[Feature/regressing bug #]: Bug 976446.
[User impact if declined]: Broken websites.
[Describe test coverage new/current, TBPL]: Tested on TBPL, patch is in 34/35.
[Risks and why]: Low risk.
[String/UUID change made/needed]: None.
Attachment #8466291 -
Flags: approval-mozilla-beta?
Comment 19•10 years ago
|
||
Jan, that is unfortunate. However, having it on m-c for a month demonstrates that the change should be safe. You confirm that it didn't cause any side effect?
Flags: needinfo?(jdemooij)
Comment 20•10 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #19)
> You confirm that it didn't cause any side effect?
I'm not aware of any side effects, no.
Flags: needinfo?(jdemooij)
Updated•10 years ago
|
status-firefox34:
--- → affected
status-firefox35:
--- → fixed
Updated•10 years ago
|
Comment 21•10 years ago
|
||
Comment on attachment 8466291 [details] [diff] [review]
patch
We are too late in the cycle to take it. Sorry, this will be fixed with the release of 34...
Attachment #8466291 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in
before you can comment on or make changes to this bug.
Description
•