Add a UTF-8 WebIDL string type
Categories
(Core :: DOM: Bindings (WebIDL), enhancement, P3)
Tracking
 | Tracking | Status
---|---|---
firefox73 | --- | fixed
People
(Reporter: hsivonen, Assigned: emilio)
References
(Blocks 2 open bugs)
Details
Attachments
(4 files, 1 obsolete file)
Comment 5 • 5 years ago (Assignee)
Tests in the next patch.
Comment 6 • 5 years ago (Assignee)
Depends on D58628
Comment 7 • 5 years ago (Assignee)
So as to avoid the heap allocation for small strings.
Depends on D58629
Comment 8 • 5 years ago (Assignee)
In particular, the ones where we transcode unconditionally atm (property names
and such).
There are others like cssText getters and setters which are a bit harder,
because I either need to rewrite all our serialization code to work with UTF8
(which is fine, but a lot of work), or teach webidl to have a setter that takes
UTF8String as input but returns DOMString as output (which is at best hacky).
Depends on D58630
Comment 13 • 5 years ago (bugherder)
https://hg.mozilla.org/mozilla-central/rev/f92bcbfef7aa
https://hg.mozilla.org/mozilla-central/rev/7cdbfc327a81
https://hg.mozilla.org/mozilla-central/rev/6e17f3a19fd8
https://hg.mozilla.org/mozilla-central/rev/b751b548afb7
Comment 14 • 5 years ago
Is there anything we could do here to ease reviewing changes to .webidl, since now we'll have more webidl not following what the specs have?
Comment 15 • 5 years ago (Assignee)
(In reply to Olli Pettay [:smaug] from comment #14)
> Is there anything we could do here to ease reviewing changes to .webidl, since now we'll have more webidl not following what the specs have?
Observables should be effectively the same as USVString, or DOMString + conversion to UTF-8... I don't know, I guess I could change this so that this is something like an annotation on DOMString / USVString, if you think that'd be easier to review... Not sure.
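As an illustration of the claim above, here is a minimal Python sketch (not Gecko code) of what "USVString semantics, then conversion to UTF-8" amounts to. It assumes a JS string can be modeled as a list of 16-bit code units; the function names are hypothetical. The point is that unpaired surrogates become U+FFFD before encoding, so the UTF-8 the C++ side would see carries the same information as a USVString.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def to_scalar_values(units):
    """Map UTF-16 code units to Unicode scalar values, USVString-style.

    Valid surrogate pairs are combined; unpaired surrogates become U+FFFD.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        has_next = i + 1 < len(units)
        if 0xD800 <= u <= 0xDBFF and has_next and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # High surrogate followed by low surrogate: one supplementary code point.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: not a scalar value, so replace it.
            out.append(REPLACEMENT)
            i += 1
        else:
            out.append(u)
            i += 1
    return out


def utf16_units_to_utf8(units):
    """Encode the scalar values as UTF-8 bytes (hypothetical helper)."""
    return "".join(chr(cp) for cp in to_scalar_values(units)).encode("utf-8")
```

For example, `utf16_units_to_utf8([0x61, 0xD800, 0x62])` yields the bytes for "a", U+FFFD, "b", which is the same result as applying the USVString conversion first and encoding afterwards.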
Comment 16 • 5 years ago
If UTF8String does not match USVString, bug 1607083 will violate the spec. If UTF8String does not match DOMString, bug 1607080 will violate the spec. Is it possible for UTF8String to match both?
Comment 17 • 5 years ago (Assignee)
(In reply to Masatoshi Kimura [:emk] from comment #16)
> If UTF8String does not match USVString, bug 1607083 will violate the spec. If UTF8String does not match DOMString, bug 1607080 will violate the spec. Is it possible for UTF8String to match both?
All our CSSOM / CSS parsing code converts to UTF-8, so the DOMString in bug 1607080 is a lie. Semantics are the same as USVString already. This is ok per https://github.com/w3c/csswg-drafts/issues/1217.
Comment 18 • 5 years ago
So I support USVString + annotation. At least CSSOM APIs should use typedef UTF8String CSSOMString.
Comment 19 • 5 years ago (Assignee)
(In reply to Masatoshi Kimura [:emk] from comment #18)
> At least CSSOM APIs should use typedef UTF8String CSSOMString.
That's the ideal final state, yeah... Though it is not quite as easy, see bug 1606995.
TL;DR: returned strings are UTF-16 as of now, and I need to do at least a fair bit of profiling to prove that the UTF8String -> JSString conversion step is not worse than what we're doing. And if it's worse, then I have to optimize it somehow.
Comment 20 • 5 years ago
> So I support USVString + annotation.
I'm torn about this. The problem with that approach is that then you need to find all places USVString is handled and make sure the annotation is handled right. With a separate type there's less chance of people messing things up in the codegen. That said, there's definitely value in having the IDL say "USVString semantics, but C++ will see UTF-8, not UTF-16", and the right way to do that is in fact via an annotation on the type. If we did that, then we could in fact have the string be UTF-8 coming in from JS and UTF-16 going out to JS if we wanted. Pretty straightforward to do...
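To make the "UTF-8 coming in, UTF-16 going out" idea concrete, here is a small Python sketch under stated assumptions. The class and method names are hypothetical, this is not how the Gecko codegen is written, and the incoming text is assumed to already be a sequence of scalar values (that is, the USVString conversion has already happened).

```python
class FakeDeclaration:
    """Hypothetical binding target: stores UTF-8, hands back UTF-16 code units."""

    def __init__(self):
        self._utf8 = b""  # what the C++ side would see

    def set_value(self, text):
        # Incoming direction: scalar values are encoded as UTF-8.
        # Assumption: `text` has no unpaired surrogates (already USVString-like).
        self._utf8 = text.encode("utf-8")

    def get_value(self):
        # Outgoing direction: the stored UTF-8 is re-encoded as UTF-16
        # and returned as a list of 16-bit code units.
        data = self._utf8.decode("utf-8").encode("utf-16-le")
        return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]
```

The asymmetry is the cost being weighed in the thread: each direction does its own transcoding, so the annotation would have to be honored consistently wherever the type is handled.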
In either case, I agree we should:
- typedef UTF8String CSSOMString (or whatever syntax we decide on for the UTF8 case, e.g. [UTF8] USVString)
- File issues on specs like https://drafts.fxtf.org/geometry/#dommatrixreadonly that should really be using CSSOMString as defined at https://drafts.csswg.org/cssom/#cssomstring-type but are not.
- Use CSSOMString in our own IDL for cases where it would make us match the CSSOM specs.