Cyrillic URL parameter is split when passed from the command line
Categories
(Toolkit :: Startup and Profile System, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox81 | --- | fixed |
People
(Reporter: haswellready, Assigned: toshi)
References
Details
Attachments
(4 files)
User Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0
Steps to reproduce:
- Run:
"C:\Program Files (x86)\Mozilla Firefox\firefox.exe" "https://www.google.com/search?q=Эр-рияд" - everything is OK (the tab is opened correctly)
- run:
"C:\Program Files (x86)\Mozilla Firefox\firefox.exe" "https://www.google.com/search?q=Эр-Рияд" - the result is incorrect
Actual results:
- two tabs are opened and both are with wrong URLs
Expected results:
- one tab is opened and the URL is the same as in the command line (in spite of upper/lower case of the text passed as argument)
Reporter | ||
Comment 1•4 years ago
|
||
Could someone take a look at this bug?
I think it is important.
Can't open many web-pages correctly due to that.
Comment 2•4 years ago
|
||
Hello,
I tried reproducing this issue on the latest version of Firefox Nightly 80.0a1 (2020-07-12), beta 79.0b7 and release 78.0.1 using Run with both urls : https://www.google.com/search?q=Эр-рияд and https://www.google.com/search?q=Эр-Pияд but the result was the same : a google search page for the city Riad. This has the same behaviour when I open them with Chrome.
Switching to a "p" or "P" on an English keyboard provides with different search results.
Setting a component for this issue in order to get the dev team involved.
If you feel it's an incorrect one please feel free to change it to a more appropriate one.
Reporter | ||
Comment 3•4 years ago
|
||
Still reproducible on my side on Firefox Nightly 80a.
Here are my steps on a video (a little bit changed):
https://youtu.be/6mHXBmMZ8Ts
Looks like it depends on multiple Firefox windows and doesn't depend on the letter passed as a parameter.
I believe this is a Launcher Process bug, but confirming.
Moving to Firefox::File Handling per :mossop.
:haswellready, I can't replicate this with Nightly on Windows 10, however my Windows 10 system is en-US locale.
Comment 6•4 years ago
|
||
The video shows that for the case with the Cyrillic character Р
Firefox displays a unicode unavailable glyph. Then splits the rest of the URL to another tab with the address xn--d1ah3e
which is ияд
.
Assignee | ||
Comment 7•4 years ago
|
||
I was able to reproduce the issue on Win8.1 + ru locale, while no repro on Win10.
Assignee | ||
Comment 8•4 years ago
|
||
The Cyrillic character "Р" is represented as "D0 A0" in utf-8. Because the function isspace
considers 0xA0 (= NBSP) as a space, HandleCommandLine
incorrectly split a URL into two parts as Kevin mentioned. I think the problem is the launcher process drops quotes when launching the browser process. Adding 0xA0 to here would fix this problem.
The behavior of isspace
is mysterious. It's causing the difference of Win10 vs Win8.1. And also, it returns false for 0xA0 n my private version of Firefox somehow and no repro, while the official Nightly returns true, causing this problem.
Comment 9•4 years ago
|
||
Do no use CRT isspace
, please.
https://groups.google.com/forum/#!topic/mozilla.dev.platform/-PdmjyA2EFQ
Assignee | ||
Comment 10•4 years ago
|
||
The problem is nsWinRemoteClient converts a command string from utf16 to utf8 before putting it into a window message, and then nsWinRemoteServer parses that utf8 string byte by byte, as ascii. This is causing a white-space problem. We need a utf16 version of HandleCommandLine
, convert a handled string to utf-8, and then pass it to nsCommandLine::Init
.
Given that, changing the component to toolkit.
Updated•4 years ago
|
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Comment 11•4 years ago
|
||
The bug is Windows-specific so I've cleared the flag "needinfo".
Comment 12•4 years ago
|
||
Thank you, the bug will now show up in the Toolkit team's triage queue.
Assignee | ||
Comment 13•4 years ago
|
||
I have an idea to implement comment 10, refactoring HandleCommandLine
in nsWinRemoteServer.cpp. Let me prepare a patch.
Assignee | ||
Comment 14•4 years ago
|
||
This patch takes our the logic to parse a command line string as a new template
class CommandLineParserWin
from HandleCommandLine
to prepare to convert it
into the utf-16 version.
Assignee | ||
Comment 15•4 years ago
|
||
This patch just moves the CommandLineParserWin class to toolkit/xre/
without any change and adds a unittest for it.
Depends on D84226
Assignee | ||
Comment 16•4 years ago
|
||
This patch introduces two classes WinRemoteMessageSender/Receiver to
generate/parse a payload of WM_COPYDATA message which is sent from
nsWinRemoteClient to nsWinRemoteServer.
And, we change the encoding of the payload from utf-8 to utf-16,
incrementing the message version from 1 to 2, because our parsing
logic on nsWinRemoteServer does not support a variable-width character
encoding like utf-8.
Depends on D84227
Assignee | ||
Comment 17•4 years ago
|
||
We should not use isspace() according to bug 1485588.
Depends on D84228
Comment 18•4 years ago
|
||
The severity field is not set for this bug.
:mossop, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 19•4 years ago
|
||
Comment 20•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f2de31a4a8c4
https://hg.mozilla.org/mozilla-central/rev/34307f33b722
https://hg.mozilla.org/mozilla-central/rev/2fb1abdecbc7
https://hg.mozilla.org/mozilla-central/rev/3f85eff2ecbe
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Description
•