Closed Bug 86616 Opened 23 years ago Closed 20 years ago

URL: "special non-english character"

Categories

(Core :: Internationalization, defect)

x86
Windows ME
defect
Not set
minor

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: knocte, Assigned: nhottanscp)

References

Details

(Keywords: intl)

Attachments

(2 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.1) Gecko/20010607
BuildID: 2001060703
Within this tag, in an html file: <body background="dir\file.gif"> If "dir" or "file" contains the special non-english character '¡', the image is not displayed. This character is used in Spanish language as the opposite of '!', so as to start an exclamation sentence. I have tested this, using a local HTM file placed on my hard drive. Both cases (character in the name of the directory, and in the name of the file) don't work.
Reproducible: Always
I have to say, also, that this example works perfectly on IE.
RFC 2396, Section 2.1, notes that specifying a method of encoding non-ASCII characters in URIs is, at present, left to be defined by the relevant URI scheme. The "file" scheme is defined only in predecessor RFC 1738, which does not seem to allow for non-ASCII characters. Realistically, we should probably extend parsing capabilities to allow for non-ASCII characters in this scheme, but I don't know how much modification that would require.
This character is not a non-ASCII character, it is an extended-ASCII character, which I think is not the same. I think that, if IE solves this, why can't Mozilla? Perhaps it is a big effort in time, but I would never mark this as WONTFIX. Additional information: when inserting an image in Netscape Composer, the character '!' is replaced by "%21" on the HTML code and the non-english character '¡' is replaced by "%A1".
->parser?
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → bsharma
I vote invalid because according to the spec those characters should not be there.
Keywords: correctness
QA Contact: bsharma → moied
What's the status on this? knocte: if possible, please attach a testcase, and could you explain how IE solves this?
Setting bug status to New
Status: UNCONFIRMED → NEW
Ever confirmed: true
knocte: Could you please attach a testcase? Also, could you point me to the actual url exhibiting the problem. Until then FUTUREing the bug.
Target Milestone: --- → Future
This should be in Networking, anyway (URL parser).
Assignee: harishd → neeti
Component: Parser → Networking
QA Contact: moied → benc
moving neeti's futured bugs for triaging.
Assignee: neeti → new-network-bugs
Summary: Not displaying background image when non-english character '¡' is present in the body tag → URL: "special non-english character"
Desperately needing a testcase; this could be a charset encoding problem. Or was it the \ instead of the /?
Testcase 1: http://212.73.175.223/~nik/bug_86616/ (Apache on my Mac OS X computer, not necessarily up and running at all times...) Testcase 2: http://www.student.lu.se/~kin02ndo/bug_86616/ Testcase 2 doesn't yield a background image, whereas testcase 1 does. The only difference is the server. Thus, it is a server configuration problem (or an ftp problem... it could be tricky moving non-ascii stuff to a server). file:// doesn't work, and neither does directly accessing the file ¡Sophie!.jpg in the URL bar (auto-escaping takes place).
Also note the following:
curl -I "http://www.student.lu.se/~kin02ndo/bug_86616/%C1Sophie\!.jpg"
HTTP/1.1 200 OK
Date: Sat, 05 Oct 2002 06:26:49 GMT
Server: Apache/1.3.14 (Unix) mod_ssl/2.7.1 OpenSSL/0.9.6 mod_jk
Last-Modified: Sat, 05 Oct 2002 05:50:12 GMT
ETag: "7b021-c50b-3d9e7d94"
Accept-Ranges: bytes
Content-Length: 50443
Content-Type: image/jpeg
curl -I "http://localhost/~nik/bug_86616/%C2%A1Sophie\!.jpg"
HTTP/1.1 200 OK
Date: Sat, 05 Oct 2002 06:25:38 GMT
Server: Apache/1.3.26 (Darwin)
Last-Modified: Fri, 20 Sep 2002 17:06:00 GMT
ETag: "b8e9d-c50b-3d8b5578"
Accept-Ranges: bytes
Content-Length: 50443
Content-Type: image/jpeg
¡ translates to %c1 in ISO-8859-1 and to %c2a1 in UTF-8. Mozilla itself uses %a1 in the URL escaping, which is bound to fail. This may be a bug. Direct access in Mozilla: http://www.student.lu.se/~kin02ndo/bug_86616/%c1Sophie!.jpg and http://212.73.175.223/~nik/bug_86616/%c2%a1Sophie!.jpg respectively. Interchanging: http://212.73.175.223/~nik/bug_86616/%c1Sophie!.jpg and http://www.student.lu.se/~kin02ndo/bug_86616/%c2%a1Sophie!.jpg will both fail. The solution is to 1) reference the file unescaped if the server can handle it _and_ if the file name survives the ftp transport; or 2) escape the file URI according to the server configuration.
I don't see a background image in testcase 1. ccing darin.
Let's look at the access logs for my web server:
h24-69-10-100.gv.shawcable.net - - [05/Oct/2002:09:09:54 +0200] "GET /~nik/bug_86616/ HTTP/1.1" 200 1468 "http://bugzilla.mozilla.org/show_bug.cgi?id=86616" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2b) Gecko/20021004"
h24-69-10-100.gv.shawcable.net - - [05/Oct/2002:09:09:57 +0200] "GET /~nik/bug_86616/%C2%A1Sophie!.jpg HTTP/1.1" 200 50443 "http://212.73.175.223/~nik/bug_86616/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2b) Gecko/20021004"
Mozilla 1.2b 20021004 WinNT fetches the background correctly from the document.
p5082a764.dip0.t-ipconnect.de - - [05/Oct/2002:10:16:34 +0200] "GET /~nik/bug_86616/ HTTP/1.1" 200 1468 "http://bugzilla.mozilla.org/show_bug.cgi?id=86616" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2a) Gecko/20020910"
p5082a764.dip0.t-ipconnect.de - - [05/Oct/2002:10:16:35 +0200] "GET /~nik/bug_86616/%A1Sophie!.jpg HTTP/1.1" 404 299 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2a) Gecko/20020910"
Mozilla 1.2a 20020910 Linux does NOT fetch the background and does NOT get a referrer. It escapes the ¡ as %a1.
And finally, Mozilla 1.1 release Mac OS X fetches the background correctly. So is this a bug on Linux now? Does it work on Linux with 1.2b? Some other URL escaping stuff has been considered in other bugs lately.
Just tested with latest linux trunk. Does not work.
-> intl nhotta has some very similar bugs. our current implementation tries to send URLs in the charset of the origin document. this is usually what servers want, although there is a movement toward UTF-8. there was talk of a fallback scheme, but i know that hasn't been implemented yet (at least not for images). nhotta: is this a duplicate of another bug?
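A minimal sketch of the "document charset first, UTF-8 as fallback" idea mentioned in the comment above. This is not Mozilla's implementation; the function name `candidate_paths` and its behaviour are hypothetical, and it only shows which escaped paths a client could try, in order.

```python
from urllib.parse import quote


def candidate_paths(path, document_charset):
    """Return percent-escaped forms of `path` to try, in order.

    Hypothetical helper: the document charset is tried first, then UTF-8.
    A charset that cannot represent the path is skipped.
    """
    candidates = []
    for charset in (document_charset, "utf-8"):
        try:
            # Keep "/", "~" and "!" literal so only the non-ASCII bytes are escaped.
            escaped = quote(path, safe="/~!", encoding=charset, errors="strict")
        except UnicodeEncodeError:
            continue
        if escaped not in candidates:
            candidates.append(escaped)
    return candidates


if __name__ == "__main__":
    # "\u00a1" is the inverted exclamation mark from this bug's test file name.
    for candidate in candidate_paths("/~nik/bug_86616/\u00a1Sophie!.jpg", "iso-8859-1"):
        print(candidate)
```

With an ISO-8859-1 document this gives the %A1 form first and the %C2%A1 form second; with a UTF-8 document only one candidate remains.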
Assignee: new-network-bugs → yokoyama
Component: Networking → Internationalization
QA Contact: benc → ruixu
I'll check if this is a dup.
Assignee: yokoyama → nhotta
Re comment 17: note that the test case is in UTF-16... I find it strange that the WinNT and Linux builds of Mozilla do not behave the same way. Mozilla Linux explicitly escapes ¡ as %a1, which is ISO-8859-1 escaping. Mozilla Win/Mac OS X explicitly escapes ¡ as %c2%a1, which is UTF-8 escaping. The file ¡Sophie!.jpg is stored on Mac OS X as UTF-8 (as all files are). However, if one uses an ftp client to upload it to an external server, the name is translated to a MacRoman encoding. This is why the file is named %c1Sophie!.jpg on the external test case, and also why it doesn't work (the server doesn't use MacRoman). Now, http://212.73.175.223/~nik/bug_86616/index2.html with encoding x-mac-roman still fetches the background using %c2%a1, which means that Mozilla (except Linux) uses UTF-8 regardless of document encoding.
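For reference, the three escapings of ¡ discussed in this thread can be reproduced with Python's standard library. This is only an illustration of the byte values involved; the helper name `escape_as` is made up for the example.

```python
from urllib.parse import quote, unquote_to_bytes

# "\u00a1" is the inverted exclamation mark used in the test file name.
FILENAME = "\u00a1Sophie!.jpg"


def escape_as(name, charset):
    """Percent-escape `name` after encoding it in `charset` (keeps "!" literal)."""
    return quote(name, safe="!", encoding=charset)


escaped = {cs: escape_as(FILENAME, cs) for cs in ("iso-8859-1", "utf-8", "mac_roman")}

if __name__ == "__main__":
    for charset, value in escaped.items():
        print(f"{charset:11} {value}")
    # The single byte %C1 reads differently depending on the charset assumed.
    raw = unquote_to_bytes("%C1")
    print(raw.decode("mac_roman"), raw.decode("iso-8859-1"))
```

The byte 0xC1 is ¡ only under MacRoman; under ISO-8859-1 the same byte is Á, which matches the observation that a server not using MacRoman cannot resolve the %c1 name to the intended file.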
> which means that Mozilla (except Linux) uses UTF-8 regardless of document encoding.
Yes, that is the problem. I am not sure about the Linux case: whether it really uses the document charset or it's just hard-coded as ISO-8859-1.
Status: NEW → ASSIGNED
Depends on: 162407
Keywords: intl
QA Contact: ruixu → ylong
I am now using Mozilla 1.7.3 [Gecko/20040910] and it seems this is fixed. I will attach a testcase and resolve the bug. Please verify.
This attachment is not important by itself. It is only needed by the HTML file used as the testcase that will be added to the bug.
Instructions to test this testcase: First download the example image attached to this bug (https://bugzilla.mozilla.org/attachment.cgi?id=167183&action=view) into a folder on your local computer. Then download this HTML file to the same folder and try to render the document with Mozilla.
Changing bug status to RESOLVED FIXED. Should I change it to WORKSFORME? Please verify.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
-> WORKSFORME
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME