Closed Bug 1026438 Opened 10 years ago Closed 10 years ago

Make irregexp work with Latin1 strings

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(9 files, 2 obsolete files)

(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
(deleted), patch
bhackett1024
: review+
Details | Diff | Splinter Review
      No description provided.
Pass HandleLinearString instead of chars + length, to work better with Latin1 strings and moving GC.
Attachment #8441271 - Flags: review?(bhackett1024)
Attached patch Part 2 - Sampling (deleted) — Splinter Review
Attachment #8441273 - Flags: review?(bhackett1024)
Attached patch Part 3 - Store Latin1/TwoByte code separately (obsolete) (deleted) — Splinter Review
Allow storing both Latin1 and TwoByte regex JitCode or byteCode.
Attachment #8441308 - Flags: review?(bhackett1024)
Forgot to qref.
Attachment #8441308 - Attachment is obsolete: true
Attachment #8441308 - Flags: review?(bhackett1024)
Attachment #8441311 - Flags: review?(bhackett1024)
Attached patch Part 4 - StringMatch case (deleted) — Splinter Review
Attachment #8441323 - Flags: review?(bhackett1024)
Attached patch Part 5 - Run regexps (deleted) — Splinter Review
Just enough changes to run some simple regexps. There are other places in irregexp that I have to fix, but I think I'll do that as part of fixing jit-tests/jstests.
Attachment #8441386 - Flags: review?(bhackett1024)
Attachment #8441271 - Flags: review?(bhackett1024) → review+
Attachment #8441273 - Flags: review?(bhackett1024) → review+
Comment on attachment 8441311 [details] [diff] [review]
Part 3 - Store Latin1/TwoByte code separately

Review of attachment 8441311 [details] [diff] [review]:
-----------------------------------------------------------------

Is isCompiled() used anywhere now?
Attachment #8441311 - Flags: review?(bhackett1024) → review+
Attachment #8441323 - Flags: review?(bhackett1024) → review+
Comment on attachment 8441386 [details] [diff] [review]
Part 5 - Run regexps

Review of attachment 8441386 [details] [diff] [review]:
-----------------------------------------------------------------

I think irregexp::InterpretCode can GC via the interrupt callback.
Attachment #8441386 - Flags: review?(bhackett1024) → review+
Parts 3 and 4:

https://hg.mozilla.org/integration/mozilla-inbound/rev/ea0a96320d86
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd69b2caad20

(In reply to Brian Hackett (:bhackett) from comment #7)
> Is isCompiled() used anywhere now?

Yes, an assert in getParenCount: isCompiled() || canStringMatch.
And part 5:

https://hg.mozilla.org/integration/mozilla-inbound/rev/4df3e4664b11

(In reply to Brian Hackett (:bhackett) from comment #8)
> I think irregexp::InterpretCode can GC via the interrupt callback.

Right, good catch! I changed it to use AutoStableStringChars instead for the regex interpreter.
Attachment #8442749 - Flags: review?(bhackett1024)
Attachment #8442749 - Flags: review?(bhackett1024) → review+
Attachment #8442996 - Flags: review?(bhackett1024)
Attached patch Part 7 - GetCaseIndependentLetters (obsolete) (deleted) — Splinter Review
V8 has the code below, but I think it's wrong. Latin1 char 0xff has an upper-case char outside the Latin1 range. This patch is more conservative just to be safe. I'll also see if I can write a testcase that fails with V8.

  if (!ascii_subject || character <= String::kMaxOneByteCharCode) {
    return length;
  }
  // The standard requires that non-ASCII characters cannot have ASCII
  // character codes in their equivalence class.
  return 0;
Attachment #8443735 - Flags: review?(bhackett1024)
OK, what V8 does is fine (see previous comment), because their FilterASCII code converts 0x178 to 0xff. This patch makes our FilterASCII and GetCaseIndependentLettersbehave work more like theirs, and the test I wrote passes with this.
Attachment #8443735 - Attachment is obsolete: true
Attachment #8443735 - Flags: review?(bhackett1024)
Attachment #8443763 - Flags: review?(bhackett1024)
Attachment #8442996 - Flags: review?(bhackett1024) → review+
Attachment #8443763 - Flags: review?(bhackett1024) → review+
Fixes the last regex-related jit-test/jstest failures.
Attachment #8444299 - Flags: review?(bhackett1024)
Comment on attachment 8444299 [details] [diff] [review]
Part 9 - CheckNotBackReferenceIgnoreCase

Review of attachment 8444299 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/irregexp/NativeRegExpMacroAssembler.cpp
@@ +734,5 @@
> +        // Save register contents to make the registers available below. After
> +        // this, the temp0, backtrack_stack_pointer, and current_position
> +        // registers are available.
> +        masm.push(current_position);
> +        masm.push(backtrack_stack_pointer);

I think you can use temp2 (which v8's version doesn't have) to avoid one of these pushes.
Attachment #8444299 - Flags: review?(bhackett1024) → review+
Depends on: 1029422
And part 9:

https://hg.mozilla.org/integration/mozilla-inbound/rev/61400645a576

(In reply to Brian Hackett (:bhackett) from comment #24)
> I think you can use temp2 (which v8's version doesn't have) to avoid one of
> these pushes.

Ah great, done.
Keywords: leave-open
No longer depends on: 1029422
https://hg.mozilla.org/mozilla-central/rev/61400645a576
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: