Closed Bug 1642154 Opened 4 years ago Closed 4 years ago

Compilation error: incomplete type ‘JSLinearString’ used in nested name specifier

Categories

(Core :: JavaScript: GC, defect)

76 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla79
Tracking Status
firefox79 --- affected

People

(Reporter: oss, Assigned: allstars.chh)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0

Steps to reproduce:

I'm tried to compile a SpiderMonkey embedded project using 76.0.1, but I'm getting a compilation error when I try to compile a function containing a rooted linear string:

JS::RootedString rs( cx, JS::ToString(cx,args[0]) );
JS::Rooted<JSLinearString*> rls( cx, rs->ensureLinear(cx) );

Actual results:

g++ -c -g -O3 -std=c++17 -funsigned-char -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DMOZ_HAS_MOZGLUE -DWASM_SUPPORTS_HUGE_MEMORY -DEXPORT_JS_API -DMOZILLA_CLIENT -Dtopsrcdir=./firefox-76.0.1/js/src -include ./firefox-76.0.1/js/src/obj/dist/include/js/RequiredDefines.h -I./firefox-76.0.1/js/src/obj/dist/include -o ../out/js_api.o js_api.cpp
In file included from ./firefox-76.0.1/js/src/obj/dist/include/jspubtd.h:18,
from ./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:27,
from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h: In instantiation of ‘const JS::TraceKind JS::MapTypeToTraceKind<JSLinearString>::kind’:
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h:171:29: required from ‘const JS::RootKind JS::MapTypeToRootKind<JSLinearString*>::kind’
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1040:51: required by substitution of ‘template<class T> using MaybeWrapped = std::conditional_t<(JS::MapTypeToRootKind<T>::kind == JS::RootKind::Traceable), js::DispatchWrapper<T>, T> [with T = JSLinearString*]’
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1147:27: required from ‘class JS::Rooted<JSLinearString*>’
js_api.cpp:553:38: required from here
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h:82:30: error: incomplete type ‘JSLinearString’ used in nested name specifier
82 | static const JS::TraceKind kind = T::TraceKind;
| ^~~~
js_api.cpp: In static member function ‘static bool Js_Api::toCSV(JSContext*, unsigned int, JS::Value*)’:
js_api.cpp:553:44: error: invalid use of incomplete type ‘class JSString’
553 | JS::Rooted<JSLinearString*> rls( cx, rs->ensureLinear(cx) );
| ^~
In file included from ./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h:12,
from ./firefox-76.0.1/js/src/obj/dist/include/jspubtd.h:18,
from ./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:27,
from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/js/TypeDecls.h:36:21: note: forward declaration of ‘class JSString’
36 | class JS_PUBLIC_API JSString;
| ^~~~~~~~

Expected results:

js/public/TraceKind.h looks like it should have a specialization of mapTypeToTraceKind for JSLinearString, since it's part of the public API, and it doesn't.

I believe that there should be a RootedLinearString typedef as well.

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → JavaScript Engine
Product: Firefox → Core

This is by design, JSLinearString is an internal type and its methods are not exposed to embedders. Use JS_EnsureLinearString instead.

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → INVALID

Or do you get the same error with JS_EnsureLinearString? Maybe there are two problems here. In any case, ensureLinear won't work...

Status: RESOLVED → REOPENED
Component: JavaScript Engine → JavaScript: GC
Ever confirmed: true
Resolution: INVALID → ---

Yes I get the same error. JSLinearString shouldn't be an internal type. There's even an example of its use in jsapi.h. Please take a look ~ line 2161. Here's the output when I use JS_EnsureLinearString.

g++ -c -g -O3 -std=c++17 -funsigned-char -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DMOZ_HAS_MOZGLUE -DWASM_SUPPORTS_HUGE_MEMORY -DEXPORT_JS_API -DMOZILLA_CLIENT -Dtopsrcdir=./firefox-76.0.1/js/src -include ./firefox-76.0.1/js/src/obj/dist/include/js/RequiredDefines.h -I./firefox-76.0.1/js/src/obj/dist/include -o ../out/js_api.o js_api.cpp
In file included from ./firefox-76.0.1/js/src/obj/dist/include/jspubtd.h:18,
from ./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:27,
from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h: In instantiation of ‘const JS::TraceKind JS::MapTypeToTraceKind<JSLinearString>::kind’:
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h:171:29: required from ‘const JS::RootKind JS::MapTypeToRootKind<JSLinearString*>::kind’
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1040:51: required by substitution of ‘template<class T> using MaybeWrapped = std::conditional_t<(JS::MapTypeToRootKind<T>::kind == JS::RootKind::Traceable), js::DispatchWrapper<T>, T> [with T = JSLinearString*]’
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1147:27: required from ‘class JS::Rooted<JSLinearString*>’
js_api.cpp:552:38: required from here
./firefox-76.0.1/js/src/obj/dist/include/js/TraceKind.h:82:30: error: incomplete type ‘JSLinearString’ used in nested name specifier
82 | static const JS::TraceKind kind = T::TraceKind;
| ^~~~
js_api.cpp: In static member function ‘static bool Js_Api::toCSV(JSContext*, unsigned int, JS::Value*)’:
In file included from ./firefox-76.0.1/js/src/obj/dist/include/js/CallArgs.h:73,
from ./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:31,
from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h: In instantiation of ‘JS::Rooted<T>::Rooted(const RootingContext&, S&&) [with RootingContext = JSContext*; S = JSLinearString*; T = JSLinearString*]’:
js_api.cpp:552:71: required from here
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1108:37: error: using invalid field ‘JS::Rooted<T>::ptr’
1108 | : ptr(std::forward<S>(initial)) {
| ^
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h: In instantiation of ‘const T& JS::Rooted<T>::get() const [with T = JSLinearString*]’:
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1133:3: required from ‘JS::Rooted<T>::operator const T&() const [with T = JSLinearString*]’
js_api.cpp:553:10: required from here
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:1135:39: error: using invalid field ‘JS::Rooted<T>::ptr’
192 | const T& get() const { return (ptr); }
| ~~~~~
......
1135 | DECLARE_NONPOINTER_ACCESSOR_METHODS(ptr);
./firefox-76.0.1/js/src/obj/dist/include/js/RootingAPI.h:192:34: note: in definition of macro ‘DECLARE_NONPOINTER_ACCESSOR_METHODS’
192 | const T& get() const { return (ptr); }
| ^~~
make: *** [makefile:95: ../out/js_api.o] Error 1

Steve already pointed out that JSLinearString is missing an implementation of MapTypeToTraceKind on the mailinglist: https://groups.google.com/forum/#!topic/mozilla.dev.tech.js-engine/CCHKc-N72ic

Blocks: sm-embedding

Yoshi, can you take a look at this when you get a chance?

Flags: needinfo?(allstars.chh)
Assignee: nobody → allstars.chh
Flags: needinfo?(allstars.chh)

Can I reproduce this in m-c? just tried this with a jsapi-test but it is compiled successfully.
see https://gist.github.com/allstarschh/a3bc437f2acb94a0949ddf784fbc6fc6

Originally I tried https://github.com/mozilla-spidermonkey/spidermonkey-embedding-examples but it crashes with firefox76 when I run 'meson _build'
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(rr) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007f19b377eb63 in _GLOBAL__sub_I_Unified_cpp_js_src9.cpp () at /home/allstars/src/gecko_76/js/src/threading/Mutex.h:39
#2 0x00007f19b519e95a in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffa3df8d58, env=env@entry=0x7fffa3df8d68) at dl-init.c:72
#3 0x00007f19b519ea59 in call_init (env=0x7fffa3df8d68, argv=0x7fffa3df8d58, argc=1, l=<optimized out>) at dl-init.c:30
#4 _dl_init (main_map=0x7f19b51ba190, argc=1, argv=0x7fffa3df8d58, env=0x7fffa3df8d68) at dl-init.c:119
#5 0x00007f19b518f0ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#6 0x0000000000000001 in ?? ()
#7 0x00007fffa3df9d12 in ?? ()
#8 0x0000000000000000 in ?? ()
(rr)

(In reply to Yoshi Cheng-Hao Huang [:allstars.chh][:allstarschh] from comment #7)

Can I reproduce this in m-c? just tried this with a jsapi-test but it is compiled successfully.

jsapi-tests pull in the internal headers too, so this doesn't affect them. You should be able to reproduce it in the browser.

(In reply to oss from comment #0)

js/public/TraceKind.h looks like it should have a specialization of mapTypeToTraceKind for JSLinearString

It needs a specialization of MapTypeToRootKind (like we already have for JSFunction and JSScript).

Pushed by allstars.chh@gmail.com: https://hg.mozilla.org/integration/autoland/rev/c272f2f9aaac Add JSLinearString to MapTypeToTraceKind. r=jonco
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

Hello,
I've tested the result. I'm still having a few problems.

a) I don't think the JS::RootedLinearString typedef was added? It was mentioned in the original bug report...

b) I'm having problems passing in RootedLinearString's to functions. Maybe I'm doing something wrong here?

g++ -c -g -O3 -std=c++17 -funsigned-char -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DMOZ_HAS_MOZGLUE -DWASM_SUPPORTS_HUGE_MEMORY -DEXPORT_JS_API -DMOZILLA_CLIENT -Dtopsrcdir=./firefox-76.0.1/js/src -include ./firefox-76.0.1/js/src/obj/dist/include/js/RequiredDefines.h -I./firefox-76.0.1/js/src/obj/dist/include -o ../out/js_api.o js_api.cpp
js_api.cpp: In static member function ‘static bool Js_Api::toCSV(JSContext*, unsigned int, JS::Value*)’:
js_api.cpp:566:33: error: cannot convert ‘JS::Rooted<JSLinearString*>’ to ‘JSString*’
566 | if( JS_StringHasLatin1Chars(rls) )
| ^~~
| |
| JS::Rooted<JSLinearString*>
In file included from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:2187:61: note: initializing argument 1 of ‘bool JS_StringHasLatin1Chars(JSString*)’
2187 | extern JS_PUBLIC_API bool JS_StringHasLatin1Chars(JSString* str);
| ^

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

(In reply to oss from comment #12)

b) I'm having problems passing in RootedLinearString's to functions. Maybe I'm doing something wrong here?

g++ -c -g -O3 -std=c++17 -funsigned-char -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DMOZ_HAS_MOZGLUE -DWASM_SUPPORTS_HUGE_MEMORY -DEXPORT_JS_API -DMOZILLA_CLIENT -Dtopsrcdir=./firefox-76.0.1/js/src -include ./firefox-76.0.1/js/src/obj/dist/include/js/RequiredDefines.h -I./firefox-76.0.1/js/src/obj/dist/include -o ../out/js_api.o js_api.cpp
js_api.cpp: In static member function ‘static bool Js_Api::toCSV(JSContext*, unsigned int, JS::Value*)’:
js_api.cpp:566:33: error: cannot convert ‘JS::Rooted<JSLinearString*>’ to ‘JSString*’
566 | if( JS_StringHasLatin1Chars(rls) )
| ^~~
| |
| JS::Rooted<JSLinearString*>
In file included from js.hpp:25,
from js_api.hpp:23,
from js_api.cpp:23:
./firefox-76.0.1/js/src/obj/dist/include/jsapi.h:2187:61: note: initializing argument 1 of ‘bool JS_StringHasLatin1Chars(JSString*)’
2187 | extern JS_PUBLIC_API bool JS_StringHasLatin1Chars(JSString* str);
| ^

While inside SpiderMonkey JSLinearString is a subclass of JSString, externally and in the public API they are typedefs with no exposed relationship. You can't pass the one where the other is expected. If you have the former, however, it can be infallibly converted to an instance of the latter using JS_FORGET_STRING_LINEARNESS -- which is in jsapi.h at least, not sure if it's in the Rust bindings, we could add it there if desired (but probably in a separate bug).

OK, that makes sense, and I need to study jsapi.h some more...
So where is the best place to test a string argument to see if it's latin1?

bool foo( JSContext* cx, unsigned argc, JS::Value* vp )
{
JS::CallArgs args = JS::CallArgsFromVp( argc, vp );

// 1. Here on args[0] ?

JS::RootedString rs( cx, JS::ToString(cx,args[0]) );

// 2. Here on rs ?

JS::Rooted<JSLinearString*> rls( cx, JS_EnsureLinearString(cx,rs) );

// 3. Here on rls after using JS_FORGET_STRING_LINEARNESS ?

}

I really wish you guys had just implemented UTF-8 internally everywhere. Every string function now has to have multiple implementations which is not great.

(In reply to oss from comment #14)

OK, that makes sense, and I need to study jsapi.h some more...
So where is the best place to test a string argument to see if it's latin1?

Your #2 looks best to me, since you have the right type. Also note that if you're linearizing the string in order to grab its characters, be careful: the pointer returned by eg JS_GetTwoByteLinearStringChars is pretty unstable; if you GC it might become invalid and most JSAPI calls can GC. (That's why it forces you to pass in an AutoRequireNoGC& token, but there are many ways to shoot yourself in the foot by storing the pointer where it outlives the token.)

I really wish you guys had just implemented UTF-8 internally everywhere. Every string function now has to have multiple implementations which is not great.

I wish we could turn back time and do that too, but now it's too late. JS strings are specced to be weird sequences of 16-bit numbers that may or may not represent valid UTF-8 strings. And the Web uses even funkier encodings.

We might someday become more UTF-8 friendly, but I don't understand the situation there. Waldo does.

(In reply to Steve Fink [:sfink] [:s:] from comment #15)

We might someday become more UTF-8 friendly, but I don't understand the situation there. Waldo does.

Have the decency to join me under the bus you just threw me under, please. :-P

At least theoretically, it seems like maybe strings that are repeatedly indexed into could accrue some sort of cached-offset data structure to speed up sequential indexed accesses. We have such a thing for the DOM in Gecko (nodes contain pointers to previous and next sibling, childNodes[i] does some magic to speed up certain patterns of access at a cost in memory). Such could be introduced for strings that need it. But it's going to be a fair amount of complexity. Still, probably necessary if we were ever to move to fully UTF-8.

I would not bet at all on strings becoming UTF-8 in the next five years, in the next ten years I'd say maybe 20% chance, and in twenty...at least conceivably maybe more likely than not, by then. Or it could be strings will never ever ever become UTF-8.

Oh, note -- we wouldn't use UTF-8, we'd use WTF-8 (which allows encoding UTF-16 surrogate code points), and then every consumer gets to have to deal with the possibility of non-UTF-8 in their string data. That, I do not think will ever be fixable.

Thanks for the pointers and warning about GC. I'll make sure not do call any JS calls while I'm doing string processing. And please don't forget to add the RootedLinearString typedef.

The JavaScript string thing is complex I agree. But I really would love just a single UNICODE STANDARD representation. The most cache-friendly encoding is obviously UTF-8, and it's also easier to send over the wire.

I believe the real problem here is that the JavaScript 'String' type is currently trying to serve too many masters. It needs to be split up into distinct types. One type to representing Unicode strings alone as UTF-8. And if there's really a need for another kind of two-byte encoded string that's indexable, then that should be a different type altogether.

(In reply to oss from comment #17)

The JavaScript string thing is complex I agree. But I really would love just a single UNICODE STANDARD representation. The most cache-friendly encoding is obviously UTF-8, and it's also easier to send over the wire.

I believe the real problem here is that the JavaScript 'String' type is currently trying to serve too many masters. It needs to be split up into distinct types. One type to representing Unicode strings alone as UTF-8. And if there's really a need for another kind of two-byte encoded string that's indexable, then that should be a different type altogether.

Yeah, we all wish JavaScript had been designed differently twenty-five years ago. No one thinks it is sensible for strings to be UCS-2 in effect. But these fundamental details are not something that can be changed now, not given how many users of JavaScript there are who would be affected by it.

I assume that the string pointers returned below are managed by the JavaScript engine and I don't have to do anything special to root or free them?

if (JS_StringHasLatin1Chars(str)) {
const JS::Latin1Char* chars = JS_GetLatin1LinearStringChars(nogc, lstr);
// do stuff
} else {
const char16_t* chars = JS_GetTwoByteLinearStringChars(nogc, lstr);
// do stuff
}

From the non-linear string API, I was expecting 'AndLength' versions of these functions instead of null terminated string versions.

(In reply to oss from comment #19)

I assume that the string pointers returned below are managed by the JavaScript engine and I don't have to do anything special to root or free them?

Well...it's tricky. If you get the character data pointer and then do not do other JS operations, you can use the pointer and characters as long as desired. But often you're going to want to do something that could run JS, and then you have to be careful.

In general, the absolute safest way to deal with this (and not have to think "will these steps inadvertently invoke SpiderMonkey at all and potentially GC") is to use JS::AutoStableStringChars. It's in newer SpiderMonkey in js/StableStringChars.h, and in older ones I think in either jsapi.h or jsfriendapi.h.

if (JS_StringHasLatin1Chars(str)) {
const JS::Latin1Char* chars = JS_GetLatin1LinearStringChars(nogc, lstr);
// do stuff
} else {
const char16_t* chars = JS_GetTwoByteLinearStringChars(nogc, lstr);
// do stuff
}

As noted above, it depends what the "stuff" is. (Copying the characters with std::copy_n into some separate array? Sure, safe. Using those characters as a property name to pass to JS_DefineProperty? Not at all safe, that function could trigger a GC that would invalidate the pointer.) And how str happens to store its internals. And other implementation details you should not rely on. The nogc should "protect" you in the sense of asserting or maybe crashing or something if you accidentally do something that GC's...but your program is still effectively crashing which is surely not what you want.

From the non-linear string API, I was expecting 'AndLength' versions of these functions instead of null terminated string versions.

Those pointers, if memory serves, have never been guaranteed to be null-terminated. You'll have to separately get the length and bound character accesses based on it. We do have "flat strings" that are null-terminated in some versions of SpiderMonkey -- but we removed them sometime in the last year or two, so you shouldn't use them because they're no longer forward-compatible.

Oh -- JSLinearString has never been guaranteed to be null-terminated, it only guarantees its characters are stored in contiguous memory such that a pointer to them can be handed out. Some linear strings are the sole owners of their characters, others merely depend upon a subrange of characters in some larger string (that obviously wouldn't ordinarily have a '\0' right after such a range).

Thanks for your help Jeff. Yes I understood not to call any JS functions while processing the string.

Sorry, I misread the comments in the header file about null termination.

/**

  • Copies the string's characters to a null-terminated char16_t buffer.
  • Returns nullptr on OOM.
    */

I now realise it was only referring to JS_CopyStringCharsZ rather then the entire section.

Regarding string length, does that mean that the derived linear string is guaranteed to have the same length as the rooted string? i.e.

JS::RootedString rs( cx, JS::ToString(cx,args[0]) );
if( !rs )
    return false;
bool latin1 = JS_StringHasLatin1Chars(rs);
size_t slen = JS_GetStringLength(rs);
JS::Rooted<JSLinearString*> rls( cx, JS_EnsureLinearString(cx,rs) );
if( !rls )
    return false;
{   
    JS::AutoCheckCannotGC nogc;
    if( latin1 )
    {
        const JS::Latin1Char* s = JS_GetLatin1LinearStringChars( nogc, rls );

       // Q: is slen still valid here when accessing 's' ?
   }

}

(In reply to oss from comment #22)

Regarding string length, does that mean that the derived linear string is guaranteed to have the same length as the rooted string? i.e.

Yes. JS strings are immutable sequences of characters with fixed length. Whether the string is a plain ol' string or linear, or stored as Latin-1 or UTF-16/UCS-2, doesn't change any of that. If you have a string and its length, the linear string you can derive from it has the same length (and same character data in it).

Hi oss,
After Waldo's comments, do you still think there is something we should fix?
If Waldo's comment 13 is your concern then please file a new bug for it.
Otherwise I will close this.

Flags: needinfo?(oss)

It sounds like there's still a need for a RootedLinearString typedef. (Though I believe that is purely a convenience thing; it wouldn't be any different than Rooted<LinearString*>.)

If I were an embedder, I wouldn't use the typedefs anyway. (Gecko doesn't, for the most part.) Rooted<T> makes more sense to me than some magic type name.

Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Flags: needinfo?(oss)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: