Closed Bug 1019585 Opened 10 years ago Closed 10 years ago

Latin1 strings: support string.match, string.search and string.replace

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(8 files)

These are the more complicated functions, because they have to work with regular expressions and have all kinds of optimizations for flat strings and rope matching.

I'm going to handle the simple cases first and we'll see what's needed.
Attached patch Part 1 - StringMatch case (deleted) — Splinter Review
Just enough changes to make the StringMatch case work.
Attachment #8433331 - Flags: review?(luke)
Attached patch Part 2 - RopeMatch case (deleted) — Splinter Review
Uses rope matching if all leaf nodes have the same encoding. This made it pretty easy to convert the code to handle Latin1 ropes/patterns.

I verified the test calls RopeMatchImpl (and removing the all-leaf-nodes-have-same-encoding check triggers assertion failures).
Attachment #8433469 - Flags: review?(luke)
Comment on attachment 8433331 [details] [diff] [review]
Part 1 - StringMatch case

Review of attachment 8433331 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +1827,5 @@
>  
>  /* Result of a successfully performed flat match. */
>  class FlatMatch
>  {
> +    RootedAtom pat;

Pre-existing, but can you append _ to pat?  Mixed use of _ looks tacky to me.
Attachment #8433331 - Flags: review?(luke) → review+
Comment on attachment 8433469 [details] [diff] [review]
Part 2 - RopeMatch case

Review of attachment 8433469 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +1398,4 @@
>              return false;
>          while (!r.empty()) {
> +            if (threshold-- == 0 ||
> +                r.front()->hasLatin1Chars() != text->hasLatin1Chars() ||

Might be worth it to hoist text->hasLatin1Chars(); I expect the compiler can't automatically.
Attachment #8433469 - Flags: review?(luke) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3c9dcbc34c0
https://hg.mozilla.org/integration/mozilla-inbound/rev/f2b79650ac7b

There's more to do here, but it depends on the regexp engine. Yarr will hopefully be gone next week so I'll wait for that to avoid breaking it.
Keywords: leave-open
Attached patch Part 3 - Fix JSSubString (deleted) — Splinter Review
This patch changes JSSubString from a (chars, length) struct to a (base, offset, length) struct. JSSubString isn't used across GC calls as far as I can see.
Attachment #8440792 - Flags: review?(terrence)
Comment on attachment 8440792 [details] [diff] [review]
Part 3 - Fix JSSubString

Review of attachment 8440792 [details] [diff] [review]:
-----------------------------------------------------------------

And now that it contains a raw JSString*, we'll hear about it from Hf if we ever do accidentally introduce GC while one is live. Nice!
Attachment #8440792 - Flags: review?(terrence) → review+
Attached patch Part 4 - FindDollarIndex (deleted) — Splinter Review
Attachment #8442100 - Flags: review?(terrence)
Attached patch Part 5 - BuildDollarReplacement (deleted) — Splinter Review
Attachment #8442101 - Flags: review?(terrence)
Attached patch Part 6 - RegExp replacement (deleted) — Splinter Review
Attachment #8442102 - Flags: review?(terrence)
Attached patch Part 7 - flattenPattern (deleted) — Splinter Review
Attachment #8442103 - Flags: review?(terrence)
Attached patch Part 8 - StrReplaceRegexpRemove (deleted) — Splinter Review
And of course there's an optimization for the case where the replacement is the empty string... So many paths str_replace can take, but I think I got all of them.
Attachment #8442104 - Flags: review?(terrence)
Comment on attachment 8442100 [details] [diff] [review]
Part 4 - FindDollarIndex

Review of attachment 8442100 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442100 - Flags: review?(terrence) → review+
Comment on attachment 8442101 [details] [diff] [review]
Part 5 - BuildDollarReplacement

Review of attachment 8442101 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/latin1/replace.js
@@ +38,5 @@
> +    assertEq(s.replace(pat, "A$`\u1300"), "FooAFoo\u1300baz123");
> +    assertEq(s.replace(pat, "A$&\u1300"), "FooAbar\u1200\u1300baz123");
> +    assertEq(s.replace(pat, "A$'\u1300"), "FooAbaz123\u1300baz123");
> +}
> +testDollarReplacement();

Nice!
Attachment #8442101 - Flags: review?(terrence) → review+
Comment on attachment 8442102 [details] [diff] [review]
Part 6 - RegExp replacement

Review of attachment 8442102 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442102 - Flags: review?(terrence) → review+
Comment on attachment 8442103 [details] [diff] [review]
Part 7 - flattenPattern

Review of attachment 8442103 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442103 - Flags: review?(terrence) → review+
Comment on attachment 8442104 [details] [diff] [review]
Part 8 - StrReplaceRegexpRemove

Review of attachment 8442104 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442104 - Flags: review?(terrence) → review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: