Open Bug 1468923 Opened 6 years ago Updated 2 years ago

Characters in custom protocol links are falsely decoded (myscheme://abc/%C3%BC)

Categories

(Core :: Networking, defect, P3)

60 Branch
defect

Tracking

()

UNCONFIRMED

People

(Reporter: abtmp-github, Unassigned)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
Build ID: 20180605171542

Steps to reproduce:
Enter a link like myscheme://abc/%C3%BC
In this example, %C3%BC is a UTF-8-encoded u-umlaut ("ü").

Actual results:
The target shell extension receives "myscheme://abc/ü".

Expected results:
The target shell extension should receive "myscheme://abc/%C3%BC".
This conversion does not occur when using Chrome, IE/Edge or Windows Explorer. There, the URL is passed to the shell extension without modification.
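To make the difference between the expected and actual results concrete, here is a minimal Python sketch. The helper name `is_percent_encoded_form` is hypothetical and only approximates "is this string still in its escaped form" by checking for non-ASCII characters; it is not something Firefox or the shell extension uses.

```python
from urllib.parse import unquote

# The two strings from the report above.
EXPECTED = "myscheme://abc/%C3%BC"  # what the shell extension should receive
ACTUAL = "myscheme://abc/ü"         # what it receives from Firefox 60


def is_percent_encoded_form(url: str) -> bool:
    """Rough check (an assumption for illustration): a fully escaped URL
    contains only ASCII characters."""
    return url.isascii()


# Decoding the expected string as UTF-8 gives exactly the observed string,
# so the handler is being given an already-unescaped URL.
decoded = unquote(EXPECTED, encoding="utf-8")
```

A handler could apply a check like this to its argument to detect whether the calling browser altered the URL before passing it on.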
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0 (20180619102337)

I've tried to test this report on the latest Nightly and Firefox release builds. However, when I enter "myscheme://abc/%C3%BC" in the URL bar it gets resolved to "myscheme://abc/%C3%BC". I think I'm missing a step needed to reproduce it.

Could you let me know where exactly you enter "myscheme://abc/%C3%BC"?
Are you using an extension? If yes, please provide a link to it.
If possible, please provide a short video showing the issue.
Flags: needinfo?(abtmp-github)
(In reply to Stefan [:StefanG_QA] from comment #2)
> Could you let me know where exactly you enter "myscheme://abc/%C3%BC"?
> Are you using an extension? If yes, please provide a link to it?.
> If possible please provide a short video showing the issue.

Thanks for trying it out. Maybe there is a misunderstanding regarding the issue. We have a Windows binary (a so-called "shell extension") that should be triggered when a link with a specific protocol is opened from a browser. The problem is that this shell extension receives decoded characters instead of "%C3%BC". We do not use Firefox extensions.

If the above does not help, please tell me and we'll try to create a video.
Flags: needinfo?(abtmp-github)
A short video showing the issue would be helpful. Thanks! Based on other reports, I think the proper component would be Core::Networking. Please change it if this is not the correct component.
Component: Untriaged → Networking
Product: Firefox → Core
Flags: needinfo?(valentin.gosu)
Priority: -- → P3
Whiteboard: [necko-triaged]
Valentin, I know you've been working on URL stuff (incl. decoding). Mind taking a look when you get a chance?
Attached image shell_extension_reg.png (deleted)
Shell extension as it is registered in the Windows registry.
We have a registered shell extension for our URL scheme 'iecm' (see shell_extension_reg.png). This shell extension expects a properly escaped URL. But when the entered URL contains escaped UTF-8 characters, the parameter %1 received by the shell extension contains the unescaped UTF-8 characters, which makes the URL invalid. (Interestingly, some reserved characters like ":" that were escaped in the original URL remain escaped when passed to the shell extension!)

E.g. when we submit the following path in a URL:
test:withü
it must be encoded as follows:
iecm://host:port/test%3Awith%C3%BC
When the encoded URL is entered in Firefox, it is passed to the shell extension as the following string:
iecm://host:port/test%3Awithü

So the umlaut is passed in a _decoded_ form, which is invalid for a URL. Again, this works differently (the URL is passed as is) when using Chrome, IE/Edge or Windows Explorer.
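The encoding step and the observed result can be sketched in Python as follows. Both function names are hypothetical. `unescape_non_ascii` is only a model that happens to reproduce the string reported in this comment (escapes for bytes >= 0x80 are decoded as UTF-8, ASCII escapes such as %3A are left alone); it is an assumption about the effect, not a description of Firefox's actual code.

```python
import re
from urllib.parse import quote


def encode_path_segment(segment: str) -> str:
    """Percent-encode a path segment as UTF-8, escaping ':' as well."""
    return quote(segment, safe="")


# One or more consecutive escapes whose byte value is >= 0x80.
_ESCAPED_HIGH_BYTES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def unescape_non_ascii(url: str) -> str:
    """Model of the observed behaviour (assumption): decode only the
    non-ASCII escapes, leave ASCII escapes such as %3A untouched."""
    def decode_run(match: re.Match) -> str:
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        return raw.decode("utf-8", errors="replace")

    return _ESCAPED_HIGH_BYTES.sub(decode_run, url)


encoded_url = "iecm://host:port/" + encode_path_segment("test:withü")
received = unescape_non_ascii(encoded_url)
```

Here `encoded_url` is the string entered in the browser and `received` matches what the shell extension reports getting as %1.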
How is the URL passed to the shell extension? Do you just enter it in the URL bar and it is then opened by the extension? Or do you use a Firefox addon to do something?
Flags: needinfo?(valentin.gosu) → needinfo?(abtmp-github)
The URL is entered in URL bar or called from <a href>. No Firefox extension is used.
Flags: needinfo?(abtmp-github)
My findings so far:

I also tried this on Linux, though I'll skip those details and talk about Windows. It seems there are two ways to add an external protocol: via regedit, and via about:config.

I tried the about:config way first. I built a simple program to test how params are passed: https://pastebin.mozilla.org/9088290 and compiled it.
Then I went to about:config and added this pref: network.protocol-handler.expose.iecm;false
Then you load this URL: data:text/html;base64,PGEgaHJlZj0iaWVjbTovL2hvc3Q6cG9ydC90ZXN0JTNBd2l0aCVDMyVCQyI+IGxpbmsgPC9hPg==
Click it, select the app you just compiled, then check the output file.
This method actually displays the original percent-encoded URI.

Then I tried the regedit version, with the same compiled program.
https://pastebin.mozilla.org/9088295 save this as a .reg and import it into regedit.
Load data:text/html;base64,PGEgaHJlZj0ic2hvdGd1bjovL2hvc3Q6cG9ydC90ZXN0JTNBd2l0aCVDMyVCQyI+IGxpbmsgPC9hPg==
Click the link and check the output file.
As mentioned in this bug, this method unescapes the character, and you end up with shotgun://host:port/test%3Awithü

I'm guessing it's the registry command with %1 that's to blame, but I can't figure out why Chrome is behaving differently.

It seems we are calling: https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsLocalHandlerApp.cpp#101
This code is very old, and I currently have no idea how it works :)

Since Chrome has a different behaviour here, it's still quite possible we are doing something wrong, probably around the following code:
https://searchfox.org/mozilla-central/source/uriloader/exthandler/win/nsOSHelperAppService.cpp
https://searchfox.org/mozilla-central/source/uriloader/exthandler/win/nsMIMEInfoWin.cpp#448
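For anyone repeating this test, the two data: URLs above decode to a single link, `<a href="SCHEME://host:port/test%3Awith%C3%BC"> link </a>`, with SCHEME being iecm or shotgun. The Python sketch below rebuilds them and shows one way a handler could record its arguments. The pastebin program itself is not reproduced in this bug, so `log_handler_arguments` is a hypothetical stand-in rather than a copy of it, and both function names are assumptions.

```python
import base64
from typing import List, TextIO


def make_test_page(scheme: str) -> str:
    """Build a data: URL holding one link to the percent-encoded test URL."""
    html = f'<a href="{scheme}://host:port/test%3Awith%C3%BC"> link </a>'
    payload = base64.b64encode(html.encode("ascii")).decode("ascii")
    return "data:text/html;base64," + payload


def log_handler_arguments(argv: List[str], out: TextIO) -> None:
    """Write each argument in ASCII-escaped form, so an unescaped 'ü'
    shows up visibly as \\xfc instead of depending on the console charset."""
    for index, arg in enumerate(argv):
        out.write(f"argv[{index}] = {arg!a}\n")


iecm_page = make_test_page("iecm")
shotgun_page = make_test_page("shotgun")
```

In a real handler, `log_handler_arguments(sys.argv, some_open_file)` would be called at startup; writing the escaped form makes it easy to tell "%C3%BC" from a decoded "ü" in the output file.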
Ok, I found it. It seems the cause is this: https://searchfox.org/mozilla-central/rev/9a3f8590f807d449e790c8ba0e39eb14f41066d8/uriloader/exthandler/win/nsMIMEInfoWin.cpp#235 It was added in bug 394974, though I don't know for sure whether it still applies.
Flags: needinfo?(dveditz)
The bug you mention sounds plausible, but to me it looks like the line you identified came from bug 227268, which also deals with unescaping strings passed to other programs. None of these bugs look like they have tests, so it's going to be dangerous messing with it. In addition, Windows may have changed behavior since WinXP and may differ from version to version.

bug 389580: URLs with %00 were dangerous because some APIs on some versions of Windows unescaped it to a null and truncated the string to something different than what we had previously security checked.

bug 394974 backed out the simple %00-only fix in favor of a more general approach. So if we change this we need to make sure %00 stays escaped and is not mis-converted by Windows.

bug 227268 was fixing mangled text passed to mail clients. I didn't dig in, but it sounded like we were sending Windows the %-encoded UTF-8 and Windows was unescaping it assuming the local charset. That makes me worry that it'll still unescape %00.

So we need tests of URLs with %00, and then some international mailto: URLs with Thunderbird and Outlook as the default mail handlers to make sure appropriate things happen. We DEFINITELY need to get some i18n folks involved here.

And then we need to test the above on different versions of Windows, at least Win7 and Win10. WinXP/Vista are thankfully not supported, and we can assume Win8/8.1 work like Win10 (plus there are very few users on those versions).
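The two hazards described above can be illustrated with a small Python model. This does not call any Windows API: `as_c_string` simply models a consumer that stops at the first NUL byte, and `unescape_with_charset` models an unescaper that interprets the bytes in a local charset. cp1252 is picked here only as an example of a non-UTF-8 local charset, and all three function names are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def as_c_string(data: bytes) -> bytes:
    """Model of a NUL-terminated consumer: everything after the first
    0x00 byte is silently dropped."""
    return data.split(b"\x00", 1)[0]


def unescape_with_charset(url: str, charset: str) -> str:
    """Model of an unescaper that decodes the raw bytes in `charset`."""
    return unquote_to_bytes(url).decode(charset, errors="replace")


def contains_raw_nul(passed_to_handler: str) -> bool:
    """Check a test could apply to the string handed to the handler."""
    return "\x00" in passed_to_handler


# %00 hazard: the string seen after unescaping and truncation differs
# from the string that was security checked.
checked = "mailto:a%00b@example.com"
truncated = as_c_string(unquote_to_bytes(checked))

# Charset hazard: UTF-8 escapes read as cp1252 produce mojibake.
mangled = unescape_with_charset("iecm://host/%C3%BC", "cp1252")
intact = unescape_with_charset("iecm://host/%C3%BC", "utf-8")
```

Any change to the unescaping would need test URLs along these lines, run against real handlers on each supported Windows version, since the model says nothing about what Windows itself does.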
Flags: needinfo?(dveditz)
Severity: normal → S3