Closed Bug 170241 Opened 22 years ago Closed 18 years ago

URL: escaped characters in hostname

Categories

(Core :: Networking, enhancement)

x86
Windows 2000
enhancement
Not set
normal

RESOLVED DUPLICATE of bug 309671

People

(Reporter: twb0, Assigned: nhottanscp)

Details

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2a) Gecko/20020910
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2a) Gecko/20020910

Some HTML E-mails (e.g. from MSN and others) contain links which are not formed correctly. For example, one link from MSN: http://g%2Emsn%2Ecom/0NL37384/2836 The %2E should be a "." Microsoft IE and Opera are aware of this problem and automatically fix the hex characters for you. Obviously the people who create these poorly-written HTML E-mails should fix their problem, but it would be helpful if the browser compensated for this common error.

Reproducible: Always

Steps to Reproduce:
1. Open HTML-based E-mail from MSN
2. Click on any URL link (for example, http://g%2Emsn%2Ecom/0NL37384/2836)

Actual Results: Netscape/Mozilla will complain that "http://g%2Emsn%2Ecom/0NL37384/2836" could not be found.

Expected Results: Mozilla should have expanded the %2E (and any other hex escapes) and re-written the URL to appear as: http://g.msn.com/0NL37384/2836
Of course expanding the hex digits in http://g%2Emsn%2Ecom/0NL37384/2836%3Ffoo would be an error (the %3F expands to an illegal character).... Over to networking to decide whether we want to unescape just the hostname part of urls...
Assignee: hewitt → new-network-bugs
Component: URL Bar → Networking
QA Contact: claudius → benc
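The idea of unescaping only the hostname part, raised in the comment above, can be illustrated outside the browser. The following is a minimal Python sketch, not Necko code: the function name `unescape_host_only` is hypothetical, it assumes the standard-library `urllib.parse` splitting rules, and it performs no validation of the decoded host.

```python
from urllib.parse import urlsplit, urlunsplit, unquote


def unescape_host_only(url: str) -> str:
    """Percent-decode the host of `url`, leaving every other component as-is."""
    parts = urlsplit(url)
    # Split the authority into optional userinfo, host, and optional port.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if hostport.startswith("["):
        # Assumption: bracketed IPv6 literals are passed through untouched.
        return url
    host, colon, port = hostport.partition(":")
    new_netloc = userinfo + at + unquote(host) + colon + port
    # Path, query, and fragment are reassembled without any decoding.
    return urlunsplit(parts._replace(netloc=new_netloc))
```

With this sketch, the MSN link from the report becomes http://g.msn.com/0NL37384/2836, while the %3F in the path of the second example stays escaped.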
I think the answer is: no. But I know who knows the answer :)
Summary: Browser should fix "malformed" URLs with hex digits → URL: escaped characters in hostname
Don't be so sure ... hostnames and escaping is tricky stuff. I remember we did this once and it was changed ... had to do with international characters in hostnames. ccing darin who did the IDN stuff.
we decided to do away with unescaping hostname characters because it helped avoid security bugs and because there are no DNS characters that require escaping. at the time cookies and necko were not using the same path for URL parsing, but that has since changed... so, we might want to revisit this.
*** Bug 170708 has been marked as a duplicate of this bug. ***
confirming while we debate this
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 171172 has been marked as a duplicate of this bug. ***
*** Bug 157019 has been marked as a duplicate of this bug. ***
andreas, thoughts on this?
nhotta, is this a problem?
Assignee: new-network-bugs → nhotta
Some Windows programs use this technique when trying to launch a URL to avoid 'overly-smart shell' problems. Of course, they probably only test it on IE. I looked through RFC 2396 quickly and didn't see an express prohibition on escaping the hostname in the text, but I also can't see a way in the BNF where it would be allowed. Is the security concern that an invalid hostname might slip through some security checks? Could the unescaping be done early enough in the code that it could not slip past the checks?
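One way to picture "unescape early, then check" is the sketch below. It is an illustration in Python under stated assumptions, not how Mozilla handles this: the helper name `decode_and_check_host` is hypothetical, and the check assumes plain letter-digit-hyphen DNS labels (no IDN, no IP literals). The host is decoded exactly once, and anything that is not a plain DNS name afterwards is rejected, so later checks only ever see the decoded form.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Assumption: a valid label is 1-63 letters, digits, or hyphens,
# and does not start or end with a hyphen.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def decode_and_check_host(raw_host: str) -> Optional[str]:
    """Decode percent-escapes once; return the host, or None if it is not plain DNS."""
    host = unquote(raw_host)
    if "%" in host:
        # A leftover "%" means double encoding (e.g. "%2540"); refuse it.
        return None
    labels = host.split(".")
    if all(_LDH_LABEL.fullmatch(label) for label in labels):
        return host.lower()
    # Decoded delimiters such as "/", "@", or "?" end up here.
    return None
```

A host such as `www.example.com%2F.evil.test` decodes to a string containing "/", so the sketch returns `None` rather than letting the delimiter reach later parsing stages.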
see Bug 191388 for escaped IDN hostnames
*** Bug 261276 has been marked as a duplicate of this bug. ***
Based on the spoofing problems, I'd like to NOT support unescaping the hostname.
As far as I can tell from the spec, you're not allowed to escape characters in hostnames anyway. INVALID based on that and the number of comments in this bug that seem reluctant to change it.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
V/invalid.
Status: RESOLVED → VERIFIED
According to RFC-3986 (the new URI spec that obsoletes RFC-2396) that is now allowed. See bug 309671.
Status: VERIFIED → RESOLVED
Closed: 20 years ago → 18 years ago
Resolution: INVALID → DUPLICATE
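For reference, RFC 3986 defines the registered-name form of a host as `reg-name = *( unreserved / pct-encoded / sub-delims )`, which is why the escaped hostname from the original report is syntactically valid under the newer spec. The Python sketch below only checks that grammar; the names `REG_NAME` and `is_rfc3986_reg_name` are hypothetical, and passing the check says nothing about whether the decoded host is a usable DNS name.

```python
import re

# unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
# sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
# pct-encoded = "%" HEXDIG HEXDIG
REG_NAME = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*")


def is_rfc3986_reg_name(host: str) -> bool:
    """Return True if `host` matches the RFC 3986 reg-name production."""
    return REG_NAME.fullmatch(host) is not None
```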