Closed
Bug 1032726
Opened 10 years ago
Closed 10 years ago
Make DOM bindings work with Latin1 strings and nursery-allocated strings
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla33
People
(Reporter: jandem, Assigned: jandem)
References
(Blocks 1 open bug)
Details
Attachments
(6 files, 2 obsolete files)
(deleted),
patch
|
bzbarsky
:
review+
terrence
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
terrence
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
terrence
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
terrence
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
I prototyped this last week and discussed it with bz, and I got it to the point where it was a pretty small slowdown (in the worst case).
Assignee | ||
Comment 1•10 years ago
|
||
ConvertJSValueToByteString scans the string for chars > 255, we can skip this for Latin1 strings. The patch also adds the AutoCheckCannotGC API, so that the GC analysis checks that we don't use the chars across a GC call. I had to move the ThrowErrorMessage outside the loop to make that work. r? bz for the DOM changes, Terrence for the rest.
Attachment #8448606 -
Flags: review?(terrence)
Attachment #8448606 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 2•10 years ago
|
||
Note that Codegen.py still generates code that calls GetTwoByteAtomChars unconditionally. I'll fix this in a later patch.
Attachment #8448621 -
Flags: review?(terrence)
Attachment #8448621 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 3•10 years ago
|
||
The inner loop can easily be templatized to work on either Latin1Char or jschar with no overhead.
Attachment #8448662 -
Flags: review?(terrence)
Attachment #8448662 -
Flags: review?(bzbarsky)
Comment 4•10 years ago
|
||
Comment on attachment 8448606 [details] [diff] [review] Part 1 - ConvertJSValueToByteString r=me
Attachment #8448606 -
Flags: review?(bzbarsky) → review+
Comment 5•10 years ago
|
||
Comment on attachment 8448621 [details] [diff] [review] Part 2 - GetArrayIndexFromId >+ ${argName}.SetData(js::GetTwoByteAtomChars(nogc, atom), js::GetAtomLength(atom)); How do we know the atom is 2-byte here? r=me with that addressed.
Attachment #8448621 -
Flags: review?(bzbarsky) → review+
Comment 6•10 years ago
|
||
Oh, I see, that's what you plan to fix in a later patch. OK.
Comment 7•10 years ago
|
||
Comment on attachment 8448662 [details] [diff] [review] Part 3 - FindEnumStringIndex >+ if (JS_StringHasLatin1Chars(str)) { We'll want an inline API for that eventually, then use it here? Followup is ok. r=me
Attachment #8448662 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 8•10 years ago
|
||
The main goal is to add an inlineable js::CopyStringChars friend API. This requires some extra machinery, but this stuff is perf-critical and we need this to keep the regression as small as possible.
Attachment #8448767 -
Flags: review?(terrence)
Assignee | ||
Comment 9•10 years ago
|
||
This is the main patch. It renames FakeDependentString to FakeString. Inline storage (64 chars atm) is used for short strings. nsStringBuffer is used for longer strings, this potentially avoids copies later on because nsStringBuffer can be shared. In a follow-up patch I want to move this class + AssignJSString to its own header, so that we can use it in other code instead of nsDependentJSString. As I mentioned last week, with this patch a (worst case) getAttribute micro-benchmark is about 9% slower with a 32-bit opt build on OS X.
Attachment #8448839 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 10•10 years ago
|
||
Sorry for the bugspam; I always notice something after requesting review :(
Attachment #8448839 -
Attachment is obsolete: true
Attachment #8448839 -
Flags: review?(bzbarsky)
Attachment #8448843 -
Flags: review?(bzbarsky)
Comment 11•10 years ago
|
||
Comment on attachment 8448843 [details] [diff] [review] Part 5 - Fake(Dependent)String >+ if (len + 1 <= sInlineCapacity) { This is subject to integer overflow, right? Better to check |len <= sInlineCapacity - 1|. >+ nsRefPtr<nsStringBuffer> buf = nsStringBuffer::Alloc((len + 1) * sizeof(nsString::char_type)); This too seems subject to integer overflow... Of course nsStringBuffer::Alloc will add more stuff to it. :( We should actually verify that (len + 1) * sizeof(nsString::char_type) + sizeof(nsStringBuffer) will fit in a uint32_t. >+ SetData(static_cast<nsString::char_type*>(buf->Data()), len); >+ mozilla::unused << buf.forget(); // Take ownership. How about: SetData(static_cast<nsString::char_type*>(buf.forget().take()->Data()), len); and then no need for the second line. >+ mFlags = nsString::F_SHARED | nsString::F_TERMINATED; Please document that this method leaves the string in an inconsistent state where it has F_TERMINATED set but no actual null terminator and that the caller is expected to put the null terminator in appropriately. >+static bool >+AssignJSString(JSContext *cx, FakeString &dest, JSString *s) inline bool? >+ if (!dest.SetCapacityAndLength(len) || >+ !js::CopyStringChars(cx, dest.BeginWriting(), s, len)) >+ { >+ return false; Doesn't failure to SetCapacityAndLength need to throw an exception? >- if isMember == "Variadic": No, I think you want to keep that branching. In particular, for variadics of short strings this will cause an extra copy (complete with malloc) here that's not really desirable. The comment probably should be updated with s/have to make a copy/may have to make a copy/ r=me with those fixed
Attachment #8448843 -
Flags: review?(bzbarsky) → review+
Comment 12•10 years ago
|
||
Comment on attachment 8448606 [details] [diff] [review] Part 1 - ConvertJSValueToByteString Review of attachment 8448606 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Attachment #8448606 -
Flags: review?(terrence) → review+
Comment 13•10 years ago
|
||
Comment on attachment 8448621 [details] [diff] [review] Part 2 - GetArrayIndexFromId Review of attachment 8448621 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Attachment #8448621 -
Flags: review?(terrence) → review+
Comment 14•10 years ago
|
||
Comment on attachment 8448662 [details] [diff] [review] Part 3 - FindEnumStringIndex Review of attachment 8448662 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Attachment #8448662 -
Flags: review?(terrence) → review+
Comment 15•10 years ago
|
||
Comment on attachment 8448767 [details] [diff] [review] Part 4 - Beef up friend APIs Review of attachment 8448767 [details] [diff] [review]: ----------------------------------------------------------------- I would not be opposed to moving all of this new API to js/public/String.h and just having inlinable string methods as part of the official API, although we can always do that later. ::: js/src/jsfriendapi.h @@ +574,5 @@ > const JSJitInfo *jitinfo; > void *_1; > }; > > +struct String { { on newline.
Attachment #8448767 -
Flags: review?(terrence) → review+
Assignee | ||
Comment 16•10 years ago
|
||
Thanks for the fast reviews! (In reply to Boris Zbarsky [:bz] from comment #11) > r=me with those fixed I'll post a follow-up patch to address your comments + rebase the patch on top of Peter's patch. (In reply to Terrence Cole [:terrence] from comment #15) > I would not be opposed to moving all of this new API to js/public/String.h > and just having inlinable string methods as part of the official API, > although we can always do that later. Yeah, Waldo suggested that too... It's probably a good idea and will make some callers nicer; I'll file a follow-up bug.
Assignee | ||
Comment 17•10 years ago
|
||
Part 1: https://hg.mozilla.org/integration/mozilla-inbound/rev/4ffff7bdb4c8
Keywords: leave-open
Comment 18•10 years ago
|
||
Comment on attachment 8448843 [details] [diff] [review] Part 5 - Fake(Dependent)String >+// A struct that has the same layout as an nsString but much faster >+// constructor and destructor behavior. FakeString uses inline storage >+// for small strings and a nsStringBuffer for longer strings. >+struct FakeString { >+ FakeString() : >+ mFlags(nsString::F_TERMINATED) > { > } > >+ ~FakeString() { >+ if (mFlags & nsString::F_SHARED) { >+ nsStringBuffer::FromData(mData)->Release(); >+ } >+ } [Is this still much faster than nsString?]
Assignee | ||
Comment 19•10 years ago
|
||
(In reply to neil@parkwaycc.co.uk from comment #18) > [Is this still much faster than nsString?] Yes, definitely. Last week I prototyped this and at first I used a class that inherited from nsAutoString and it was much slower (even for small strings that should fit in the inline storage).
Assignee | ||
Comment 20•10 years ago
|
||
I wanted to post an interdiff, but the rebase touched a bunch of code so I think it's best to post the new patch.
Attachment #8448843 -
Attachment is obsolete: true
Attachment #8449405 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 21•10 years ago
|
||
Parts 2 + 3: https://hg.mozilla.org/integration/mozilla-inbound/rev/60e70f9d98cb https://hg.mozilla.org/integration/mozilla-inbound/rev/219b6b2a4a29 (In reply to Boris Zbarsky [:bz] from comment #7) > We'll want an inline API for that eventually, then use it here? Followup is > ok. Yup, will post a follow-up patch for that.
Comment 22•10 years ago
|
||
> [Is this still much faster than nsString?]
Yes. nsString has out-of-line (effectively, in the case of the dtor) ctor/dtor, the ctor does a bit more work than FakeString's, and the dtor also does a bit more work.
Comment 23•10 years ago
|
||
Comment on attachment 8449405 [details] [diff] [review] Part 5 - Fake(Dependent)String, v2 >- nsDependentString string; >+ binding_detail::FakeString string; No, we really do want to keep the nsDependentString here. >@@ -4440,44 +4440,35 @@ def getJSToNativeConversionInfo(type, de > # We have to make a copy, except in the variadic case, because our This comment is no longer relevant. Perhaps replace it with: # Convert directly into the nsString member we have. ? > return JSToNativeConversionInfo( > fill( I don't think this complexity is needed. There is no more $*{assign} or "str" temporary, so all this is doing is returning $*{convert} inside a now-unnecessary block. We should just do: return JSToNativeConversionInfo( getConversionCode("${declName}"), declType=declType, dealWithOptional=isOptional) r=me with those fixed.
Attachment #8449405 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 24•10 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #23) > No, we really do want to keep the nsDependentString here. We can no longer use a nsDependentString for JS strings right, because we might GC or have Latin1 strings? Am I missing something?
Comment 25•10 years ago
|
||
Comment on attachment 8449405 [details] [diff] [review] Part 5 - Fake(Dependent)String, v2 > pval.set(JS::StringValue(s)); // Root the new string. > } > >- size_t len; >- const jschar *chars = JS_GetStringCharsZAndLength(cx, s, &len); >- if (!chars) { >- return false; >- } >- >- result.Rebind(chars, len); >- return true; >+ return binding_detail::AssignJSString(cx, result, s); If you're copying, rather than rebinding, the string, do you still need to root it? (Should you try to avoid copying in the two byte char cases?)
Comment 26•10 years ago
|
||
> We can no longer use a nsDependentString for JS strings right Oh, right. Just use nsString or nsAutoString there there. > If you're copying, rather than rebinding, the string, do you still need to root it? Only if some part of AssignJSString can GC. If not, we can nix this pval argument altogether, I think. > (Should you try to avoid copying in the two byte char cases?) That would fail for the nursery-allocated string case, right? Because if we don't copy and pass a pointer directly into the inline nursery string into Gecko and then cause a GC, the string can move and suddenly we're screwed.
Comment 27•10 years ago
|
||
(In reply to Boris Zbarsky from comment #26) > > If you're copying, rather than rebinding, the string, do you still need to root it? > > Only if some part of AssignJSString can GC. If not, we can nix this pval > argument altogether, I think. > > > (Should you try to avoid copying in the two byte char cases?) > > That would fail for the nursery-allocated string case, right? Because if we > don't copy and pass a pointer directly into the inline nursery string into > Gecko and then cause a GC, the string can move and suddenly we're screwed. I don't know which case(s) apply here, I was just concerned that we were rooting a JS string and then copying it, and I wasn't sure which, if either, was wrong. (Unfortunately DXR isn't very good with generated files so I don't even know whether it's a good idea to keep trying to share code between FakeString and the old FakeDependentString.)
Comment 28•10 years ago
|
||
Basically, the JS folks do not want us holding char pointers into JS strings across GC. nsAString provides a char pointer. That means either copying when we produce an nsAString or no longer using nsAString for Web IDL API string arguments and instead using a class that keeps alive a JSString* and asks it for the chars as needed, with no-gc static asserts or something. The former is way more doable on a sane timeframe. :(
Assignee | ||
Comment 29•10 years ago
|
||
(In reply to neil@parkwaycc.co.uk from comment #25) > If you're copying, rather than rebinding, the string, do you still need to > root it? (In reply to Boris Zbarsky [:bz] from comment #26) > Only if some part of AssignJSString can GC. If not, we can nix this pval > argument altogether, I think. Oh, good point. AssignJSString can't GC AFAIK (rope flattening is fallible but won't GC), I'll remove pval in the followup patch I'm about to post.
Assignee | ||
Comment 30•10 years ago
|
||
After removing pval from ConvertJSValueToString I could remove mutableVal from a bunch of places in Codegen.py (I think..)
Attachment #8449616 -
Flags: review?(bzbarsky)
Comment 31•10 years ago
|
||
Comment on attachment 8449616 [details] [diff] [review] Part 6 - Followup changes > ConvertJSValueToByteString(JSContext* cx, JS::Handle<JS::Value> v, > if (!JS_StringHasLatin1Chars(s)) { Worth using the inline thing here too? Might not be.... r=me
Attachment #8449616 -
Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/mozilla-central/rev/4ffff7bdb4c8 https://hg.mozilla.org/mozilla-central/rev/60e70f9d98cb https://hg.mozilla.org/mozilla-central/rev/219b6b2a4a29
Assignee | ||
Comment 33•10 years ago
|
||
Parts 4 and 5: https://hg.mozilla.org/integration/mozilla-inbound/rev/3de3f4f8a970 https://hg.mozilla.org/integration/mozilla-inbound/rev/be135c627773
Comment 34•10 years ago
|
||
(In reply to Boris Zbarsky from comment #28) > Basically, the JS folks do not want us holding char pointers into JS strings > across GC. Fair enough. Copying seems sane in that case, except for external strings. (In reply to Jan de Mooij from comment #29) > Oh, good point. AssignJSString can't GC AFAIK (rope flattening is fallible > but won't GC), I'll remove pval in the followup patch I'm about to post. (Seems a waste to flatten the string and then copy it again.) (From update of attachment 8449616 [details] [diff] [review]) > // pval must not be null and must point to a rooted JS::Value > template<typename T> > static inline bool > ConvertJSValueToString(JSContext* cx, JS::Handle<JS::Value> v, >- JS::MutableHandle<JS::Value> pval, Nit: comment is out of date.
Assignee | ||
Comment 35•10 years ago
|
||
Part 6: https://hg.mozilla.org/integration/mozilla-inbound/rev/a133199d09e2 (In reply to Boris Zbarsky [:bz] from comment #31) > > ConvertJSValueToByteString(JSContext* cx, JS::Handle<JS::Value> v, > > if (!JS_StringHasLatin1Chars(s)) { > > Worth using the inline thing here too? Might not be.... Done. (In reply to neil@parkwaycc.co.uk from comment #34) > Nit: comment is out of date. Fixed.
Assignee | ||
Updated•10 years ago
|
Keywords: leave-open
Comment 36•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3de3f4f8a970 https://hg.mozilla.org/mozilla-central/rev/be135c627773 https://hg.mozilla.org/mozilla-central/rev/a133199d09e2
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Comment 37•10 years ago
|
||
Comment on attachment 8449405 [details] [diff] [review] Part 5 - Fake(Dependent)String, v2 >+ if (MOZ_UNLIKELY(!dest.SetCapacity(len + 1, mozilla::fallible_t()))) { >+ JS_ReportOutOfMemory(cx); >+ return false; >+ } >+ if (MOZ_UNLIKELY(!js::CopyStringChars(cx, dest.BeginWriting(), s, len))) { >+ return false; >+ } >+ dest.BeginWriting()[len] = '\0'; >+ dest.SetLength(len); >+ return true; I overlooked that this is probably overkill; SetLength calls SetCapacity anyway, and SetCapacity allows for and writes the null terminator, so you can just write something like this: if (MOZ_UNLIKELY(!dest.SetLength(len, mozilla::fallible_t()))) { JS_ReportOutOfMemory(cx); return false; } return js::CopyStringChars(cx, dest.BeginWriting(), s, len);
Assignee | ||
Comment 38•10 years ago
|
||
(In reply to neil@parkwaycc.co.uk from comment #37) > I overlooked that this is probably overkill; SetLength calls SetCapacity > anyway, and SetCapacity allows for and writes the null terminator, so you > can just write something like this: Ah interesting. This won't work atm because FakeString's SetLength implementation doesn't behave like this, but we could change that to match ns*String more...
Flags: needinfo?(jdemooij)
Assignee | ||
Comment 39•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #38) > Ah interesting. This won't work atm because FakeString's SetLength > implementation doesn't behave like this, but we could change that to match > ns*String more... Neil fixed this in bug 1041140; clearing needinfo.
Flags: needinfo?(jdemooij)
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•