987069 - [jsdbg2] Unicode characters in script URLs are not correctly encoded in Debugger.Script.url

Reporter

Description

•

11 years ago

If a URL contains non-ASCII characters, they are not correctly encoded in Debugger.Script.url. For example, when accessing http://öl.de, the Debugger.Script.url property returns: "http://Ã¶l.de/" Sebastian

Boris Zbarsky [:bzbarsky]

Comment 1

•

11 years ago

DebuggerScript_getUrl is just buggy. It does: `str = js_NewStringCopyZ<CanGC>(cx, script->filename());` where `script->filename()` is a char*. But this assumes ASCII (or at best ISO-8859-1) encoding for the char*, whereas it's actually more likely to be UTF-8 in the case of Firefox (and in general is in no particular encoding!).
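To make the failure mode concrete, here is a minimal standalone sketch of what byte-wise inflation does to UTF-8 input. It does not use any SpiderMonkey code; `inflateAsLatin1` is a hypothetical helper that only mimics the assumption described above, namely that each byte of the char* is one character.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a SpiderMonkey function: widen each byte of a
// NUL-terminated char* to one UTF-16 code unit, i.e. treat it as Latin-1.
std::u16string inflateAsLatin1(const char* bytes) {
    assert(bytes != nullptr);
    std::u16string out;
    for (const char* p = bytes; *p != '\0'; ++p) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(*p)));
    }
    return out;
}
```

Feeding it the UTF-8 bytes of the reporter's URL, `inflateAsLatin1("http://\xC3\xB6l.de/")`, yields the code units U+00C3 U+00B6 where a single U+00F6 (ö) was intended, which is the "Ã¶" seen in the description.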

Status: UNCONFIRMED → NEW

Ever confirmed: true

Boris Zbarsky [:bzbarsky]

Comment 2

•

11 years ago

terrence points out that js/public/CharacterEncoding.h has utilities that could be used here.

James Long (:jlongster)

Comment 4

•

10 years ago

I will take the lead on this and figure out who can help fix this. I definitely don't know the C++ functions to use here but I'll get help.

James Long (:jlongster)

Updated

•

10 years ago

Assignee: nobody → jlong

James Long (:jlongster)

Comment 5

•

10 years ago

Jim, you filed the bug that was marked as a duplicate of this one. Any idea of which string encoding function we should be using instead? I noticed that `DebuggerSource_getDisplayURL` uses `JS_NewUCStringCopyZ` instead, but that function internally just does `NewStringCopyZ<CanGC>(cx, s);` as well. Looks like there is some good stuff in CharacterEncoding.h.

Flags: needinfo?(jimb)

Boris Zbarsky [:bzbarsky]

Comment 6

•

10 years ago

There are two overloads of NewStringCopyZ: one takes char* and the other takes char16_t*. JS_NewUCStringCopyZ takes char16_t* and calls the second overload. But that already starts off with UTF-16 data, since it has char16_t*.

Boris Zbarsky [:bzbarsky]

Comment 7

•

10 years ago

So the point is you want to be using one of the JS_NewUCString* functions, but that's the easy part; the "hard" part is getting your data in UTF-16.

James Long (:jlongster)

Comment 8

•

10 years ago

This is the core problem: http://dxr.mozilla.org/mozilla-central/source/js/src/jsscript.h#473 The url/filename of the script isn't stored as a wide char. That's wrong.

Boris Zbarsky [:bzbarsky]

Comment 9

•

10 years ago

The point is, it's stored as UTF-8.

Jim Blandy :jimb

Comment 10

•

10 years ago

I believe that at the time that code was written, JSScript filenames were indeed Latin-1. When the encoding was changed to UTF-8, Debugger wasn't updated. In light of this code: https://hg.mozilla.org/mozilla-central/file/57e4e9c33bef/js/src/jsapi.cpp#l3993 I would guess that UTF8CharsToNewTwoByteCharsZ is the conversion function to use, much as it's used there.
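For readers unfamiliar with what such a conversion involves, below is a minimal standalone sketch of decoding UTF-8 into UTF-16. It is not the SpiderMonkey implementation and makes no claim about how UTF8CharsToNewTwoByteCharsZ works internally; the name `utf8ToUtf16` and its bool/out-parameter signature are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a UTF-8 -> UTF-16 conversion. Returns false on
// malformed input; the contents of `out` are unspecified in that case.
bool utf8ToUtf16(const char* bytes, std::u16string& out) {
    assert(bytes != nullptr);
    out.clear();
    const unsigned char* p = reinterpret_cast<const unsigned char*>(bytes);
    while (*p != 0) {
        uint32_t cp;
        int extra;  // number of continuation bytes expected
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return false;  // stray continuation byte or invalid lead byte
        ++p;
        for (int i = 0; i < extra; ++i, ++p) {
            // A NUL terminator fails this check too, so we never read past it.
            if ((*p & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (*p & 0x3F);
        }
        // Reject overlong forms, surrogate code points, out-of-range values.
        static const uint32_t minForLength[] = {0x0, 0x80, 0x800, 0x10000};
        if (cp < minForLength[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return true;
}
```

With this sketch, the bytes `"http://\xC3\xB6l.de/"` decode to a string containing a single U+00F6 for the ö, and a truncated sequence such as `"\xC3"` is reported as malformed rather than silently widened.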

Flags: needinfo?(jimb)

Boris Zbarsky [:bzbarsky]

Comment 11

•

10 years ago

JS script filenames in Gecko have been UTF-8 basically since time immemorial, as far as I can tell. It just passed in necko URI stringifications, which are UTF-8.... James, are you still planning to work on this?

Flags: needinfo?(jlong)

Jim Blandy :jimb

Comment 12

•

10 years ago

(In reply to Boris Zbarsky [:bz] from comment #11)
> JS script filenames in Gecko have been UTF-8 basically since time
> immemorial, as far as I can tell. It just passed in necko URI
> stringifications, which are UTF-8....

Huh. I wonder what I'm thinking of. Anyway, this is all very fine.

James Long (:jlongster)

Comment 13

•

10 years ago

(In reply to Boris Zbarsky [:bz] from comment #11)
> JS script filenames in Gecko have been UTF-8 basically since time
> immemorial, as far as I can tell. It just passed in necko URI
> stringifications, which are UTF-8....
>
> James, are you still planning to work on this?

I was hoping to push this through with a mentor's help, but at this point I'm not sure I'll be able to do that. I'm only vaguely familiar with our C++ string APIs and I still don't understand where the problem is, even after talking to a few people about it. Jim, do you think you could take a whack at it? I know you're busy, so I can try to find someone else if not.

Flags: needinfo?(jlong)

make ScriptSource filename encoding consistent 10 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fixes after rebasing 10 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix spots passing ConstUTF8CharsZ through varargs 10 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
Review comments Splinter page, printed to PDF with too-small fonts 10 years ago Jeff Walden [:Waldo] (deleted), application/pdf		Details
Code to do sanity-checking on a "UTF-8" string 10 years ago Jeff Walden [:Waldo] (deleted), text/x-c++src		Details
make ScriptSource filename encoding consistent 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix spots passing ConstUTF8CharsZ through varargs 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
trivial fixes from the review 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fixes needed due to rebasing 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
add ConstUTF8CharsZ validation 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix up FormatStackDump 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix up NotableScriptSourceInfo 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix up ubinodes 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
fix up debugger 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
update tests for more kinds of characters 9 years ago Tom Tromey :tromey (deleted), patch		Details \| Diff \| Splinter Review
make ScriptSource filename encoding consistent 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
fix spots passing ConstUTF8CharsZ through varargs 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
trivial fixes from the review 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
fixes needed due to rebasing 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review-	Details \| Diff \| Splinter Review
add ConstUTF8CharsZ validation 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
fix up FormatStackDump 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
fix up NotableScriptSourceInfo 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
fix up ubinodes 9 years ago Tom Tromey :tromey (deleted), patch	fitzgen : review+	Details \| Diff \| Splinter Review
fix up debugger 9 years ago Tom Tromey :tromey (deleted), patch	fitzgen : review+	Details \| Diff \| Splinter Review
update tests for more kinds of characters 9 years ago Tom Tromey :tromey (deleted), patch	Waldo : review+	Details \| Diff \| Splinter Review
Bug 987069 - make ScriptSource filename encoding consistent 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - restore an optimization requested in an earlier review; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - set skip-if on encoding.js; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - profiler event marker request; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - rename some fields; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - telemetry fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - nsIHangReport fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - nsScriptLoader fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review-	Details
Bug 987069 - ProcessFile argument type; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - ReflectParse changes; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - TestingFunctions change; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - RootedString changes; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - use encodeUtf8; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - fix testStructuredClone; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - OptimizationTracking fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - OSObject; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - ReportError fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - MemoryProfiler.cpp fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - ReadSPSProfilingStack fix; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - ScriptedCaller cleanup in wasm; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - change nsPACMan to use GetSpec, not GetAsciiSpec; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details
Bug 987069 - introduce NativePathToUTF8; 8 years ago Tom Tromey :tromey (deleted), text/x-review-board-request	Waldo : review+	Details