Closed
Bug 86616
Opened 23 years ago
Closed 20 years ago
URL: "special non-english character"
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
WORKSFORME
Future
People
(Reporter: knocte, Assigned: nhottanscp)
References
Details
(Keywords: intl)
Attachments
(2 files)
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.1) Gecko/20010607
BuildID: 2001060703
Within this tag, in an html file:
<body background="dir\file.gif">
If "dir" or "file" contains the non-English character '¡', the image is
not displayed. Spanish uses this character as the counterpart of '!', to
open an exclamatory sentence.
I tested this with a local HTM file on my hard drive. It fails in both cases:
with the character in the directory name and with it in the file name.
Reproducible: Always
Reporter
Comment 1•23 years ago
I have to say, also, that this example works perfectly on IE.
Comment 2•23 years ago
RFC 2396, Section 2.1, notes that specifying a method of encoding non-ASCII
characters in URIs is, at present, left to be defined by the relevant URI scheme.
The "file" scheme is defined only in predecessor RFC 1738, which does not seem to
allow for non-ASCII characters. Realistically, we should probably extend parsing
capabilities to allow for non-ASCII characters in this scheme, but I don't know
how much modification that would require.
Reporter
Comment 3•23 years ago
This character is not a non-ASCII character, it is an extended-ASCII character,
which I think is not the same.
I think that, if IE solves this, why can't Mozilla? Perhaps it is a big effort
in time, but I would never mark this as WONTFIX.
Additional information: when inserting an image in Netscape Composer, the
character '!' is replaced by "%21" on the HTML code and the non-english
character '¡' is replaced by "%A1".
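A minimal sketch of the distinction discussed above, in Python (which is not used anywhere in this bug; the helper name `describe` is hypothetical). It shows that '!' lies inside 7-bit ASCII while '¡' does not, and that "%A1" is what results when '¡' is escaped as a single ISO-8859-1 byte. That Composer actually used ISO-8859-1 is an assumption here, not something confirmed in this report.

```python
from urllib.parse import quote


def describe(ch: str) -> tuple[int, bool, str]:
    """Return (code point, is 7-bit ASCII, percent-escape as one ISO-8859-1 byte)."""
    code_point = ord(ch)
    in_ascii = code_point < 0x80  # 7-bit ASCII ends at 0x7F
    # Assumption: the escaping charset is ISO-8859-1 (Latin-1).
    escaped = quote(ch, safe="", encoding="iso-8859-1")
    return code_point, in_ascii, escaped


for ch in ("!", "¡"):
    code_point, in_ascii, escaped = describe(ch)
    print(f"{ch!r}: U+{code_point:04X} ascii={in_ascii} escape={escaped}")
```

Under these assumptions, "extended ASCII" here just means a byte value above 0x7F, which is exactly the range the URI RFCs leave undefined.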
Comment 4•23 years ago
->parser?
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → bsharma
Comment 5•23 years ago
I vote invalid because according to the spec those characters should not be there.
Keywords: correctness
What's the status on this? knocte: if possible, please attach a testcase, and
could you explain how IE solves this?
knocte: Could you please attach a testcase? Also, could you point me to the
actual url exhibiting the problem. Until then FUTUREing the bug.
Target Milestone: --- → Future
Comment 9•23 years ago
This should be in Networking, anyway (URL parser).
Assignee: harishd → neeti
Component: Parser → Networking
QA Contact: moied → benc
Comment 10•22 years ago
moving neeti's futured bugs for triaging.
Assignee: neeti → new-network-bugs
Summary: Not displaying background image when non-english character '¡' is present in the body tag → URL: "special non-english character"
Comment 11•22 years ago
Desperately needing a testcase; this could be a charset encoding problem. Or was
it the \ instead of the /?
Comment 12•22 years ago
Testcase 1: http://212.73.175.223/~nik/bug_86616/ (Apache on my Mac OS X
computer, not necessarily up and running at all times...)
Testcase 2: http://www.student.lu.se/~kin02ndo/bug_86616/
Testcase 2 doesn't yield a background image, whereas testcase 1 does. The only
difference is the server. Thus, it is a server configuration problem (or an ftp
problem... it could be tricky moving non-ascii stuff to a server).
file:// doesn't work, and neither does directly accessing the file ¡Sophie!.jpg
in the URL bar (auto-escaping takes place).
Comment 13•22 years ago
Also note the following:
curl -I "http://www.student.lu.se/~kin02ndo/bug_86616/%C1Sophie\!.jpg"
HTTP/1.1 200 OK
Date: Sat, 05 Oct 2002 06:26:49 GMT
Server: Apache/1.3.14 (Unix) mod_ssl/2.7.1 OpenSSL/0.9.6 mod_jk
Last-Modified: Sat, 05 Oct 2002 05:50:12 GMT
ETag: "7b021-c50b-3d9e7d94"
Accept-Ranges: bytes
Content-Length: 50443
Content-Type: image/jpeg
curl -I "http://localhost/~nik/bug_86616/%C2%A1Sophie\!.jpg"
HTTP/1.1 200 OK
Date: Sat, 05 Oct 2002 06:25:38 GMT
Server: Apache/1.3.26 (Darwin)
Last-Modified: Fri, 20 Sep 2002 17:06:00 GMT
ETag: "b8e9d-c50b-3d8b5578"
Accept-Ranges: bytes
Content-Length: 50443
Content-Type: image/jpeg
¡ translates to %c1 in ISO-8859-1 and to %c2%a1 in UTF-8. Mozilla itself uses %a1
in the URL escaping, which is bound to fail. This may be a bug.
Direct access in Mozilla:
http://www.student.lu.se/~kin02ndo/bug_86616/%c1Sophie!.jpg and
http://212.73.175.223/~nik/bug_86616/%c2%a1Sophie!.jpg respectively.
Interchanging: http://212.73.175.223/~nik/bug_86616/%c1Sophie!.jpg and
http://www.student.lu.se/~kin02ndo/bug_86616/%c2%a1Sophie!.jpg will both fail.
The solution is to 1) reference the file unescaped if the server can handle it
_and_ if the file name survives the ftp transport; or 2) escape the file URI
according to the server configuration.
Comment 14•22 years ago
I don't see a background image in testcase 1. ccing darin.
Comment 15•22 years ago
Let's look at the access logs for my web server:
h24-69-10-100.gv.shawcable.net - - [05/Oct/2002:09:09:54 +0200] "GET
/~nik/bug_86616/ HTTP/1.1" 200 1468
"http://bugzilla.mozilla.org/show_bug.cgi?id=86616" "Mozilla/5.0 (Windows; U;
Windows NT 5.1; en-US; rv:1.2b) Gecko/20021004"
h24-69-10-100.gv.shawcable.net - - [05/Oct/2002:09:09:57 +0200] "GET
/~nik/bug_86616/%C2%A1Sophie!.jpg HTTP/1.1" 200 50443
"http://212.73.175.223/~nik/bug_86616/" "Mozilla/5.0 (Windows; U; Windows NT
5.1; en-US; rv:1.2b) Gecko/20021004"
Mozilla 1.2b 20021004 WinNT fetches the background correctly from the document.
p5082a764.dip0.t-ipconnect.de - - [05/Oct/2002:10:16:34 +0200] "GET
/~nik/bug_86616/ HTTP/1.1" 200 1468
"http://bugzilla.mozilla.org/show_bug.cgi?id=86616" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.2a) Gecko/20020910"
p5082a764.dip0.t-ipconnect.de - - [05/Oct/2002:10:16:35 +0200] "GET
/~nik/bug_86616/%A1Sophie!.jpg HTTP/1.1" 404 299 "-" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.2a) Gecko/20020910"
Mozilla 1.2a 20020910 Linux does NOT fetch the background and does NOT get a
referrer. It escapes the ¡ as %a1.
And finally, Mozilla 1.1 release Mac OS X fetches the background correctly. So
is this a bug on Linux now? Does it work on Linux with 1.2b? Some other URL
escaping stuff has been considered in other bugs lately.
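The two log entries above are consistent with the server comparing the raw bytes of the unescaped request against the bytes of the stored file name. A rough sketch of that matching follows. It assumes the name is stored on disk as UTF-8 bytes, and `status_for` is a made-up helper for illustration, not a description of what Apache really does internally.

```python
from urllib.parse import unquote_to_bytes

# Assumption: the file name is stored on the server as UTF-8 bytes.
STORED_NAME = "¡Sophie!.jpg".encode("utf-8")


def status_for(request_path: str) -> int:
    """Return 200 if the unescaped last path segment matches the stored bytes, else 404."""
    last_segment = request_path.rsplit("/", 1)[-1]
    requested = unquote_to_bytes(last_segment)  # percent-escapes become raw bytes
    return 200 if requested == STORED_NAME else 404


print(status_for("/~nik/bug_86616/%C2%A1Sophie!.jpg"))  # request from the WinNT build
print(status_for("/~nik/bug_86616/%A1Sophie!.jpg"))     # request from the Linux build
```

With this byte-for-byte comparison, the %C2%A1 request matches and the %A1 request does not, which mirrors the 200 and 404 in the logs.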
Comment 16•22 years ago
Just tested with latest linux trunk. Does not work.
Comment 17•22 years ago
-> intl
nhotta has some very similar bugs. our current implementation tries to send
URLs in the charset of the origin document. this is usually what servers want,
although there is a movement toward UTF-8. there was talk of a fallback scheme,
but i know that hasn't been implemented yet (at least not for images).
nhotta: is this a duplicate of another bug?
Assignee: new-network-bugs → yokoyama
Component: Networking → Internationalization
QA Contact: benc → ruixu
Comment 19•22 years ago
Re comment 17: note that the test case is in UTF-16... I find it strange that
the WinNT and Linux builds of Mozilla do not behave the same way.
Mozilla Linux explicitly escapes ¡ as %a1, which is ISO-8859-1 escaping.
Mozilla Win/Mac OS X explicitly escapes ¡ as %c2%a1, which is UTF-8 escaping.
The file ¡Sophie!.jpg is stored on Mac OS X as UTF-8 (as all files are).
However, if one uses an ftp client to upload it to an external server, the name
is translated to a MacRoman encoding. This is why the file is named
%c1Sophie!.jpg on the external test case, and also why it doesn't work (the
server doesn't use MacRoman).
Now, http://212.73.175.223/~nik/bug_86616/index2.html with encoding x-mac-roman
still fetches the background using %c2%a1, which means that Mozilla (except
Linux) uses UTF-8 regardless of document encoding.
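The three escapes mentioned in this comment can be reproduced with a short Python sketch. The helper name `escape_filename` is hypothetical, and Python's `mac_roman` codec stands in for whatever conversion the ftp client performed, which is an assumption.

```python
from urllib.parse import quote

FILENAME = "¡Sophie!.jpg"


def escape_filename(name: str, charset: str) -> str:
    """Percent-escape a file name after encoding it in the given charset."""
    # '!' is kept literal to match the URLs quoted in this bug.
    return quote(name, safe="!", encoding=charset)


escapes = {
    charset: escape_filename(FILENAME, charset)
    for charset in ("iso-8859-1", "utf-8", "mac_roman")
}
for charset, escaped in escapes.items():
    print(f"{charset:>10}: {escaped}")
```

The same character therefore yields %A1, %C2%A1, or %C1 depending on the charset, so a request only succeeds when the client's escaping charset matches the bytes of the name as stored on the server.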
Assignee
Comment 20•22 years ago
>which means that Mozilla (except
>Linux) uses UTF-8 regardless of document encoding.
Yes, that is the problem. I am not sure about the Linux case: whether it really
uses the document charset or is just hard-coded as ISO-8859-1.
Status: NEW → ASSIGNED
Depends on: 162407
Reporter
Comment 21•20 years ago
I am now using Mozilla 1.7.3 [Gecko/20040910] and it seems this is fixed. I will
attach a testcase and resolve the bug. Please verify.
Reporter
Comment 22•20 years ago
This attachment is not important. It is only needed for the HTML file used as
the testcase that will be added to the bug.
Reporter
Comment 23•20 years ago
Instructions for this testcase: first download the example image attached
to this bug (https://bugzilla.mozilla.org/attachment.cgi?id=167183&action=view)
into a folder on your local computer. Then download this HTML file to the same
folder and try to render the document with Mozilla.
Reporter
Comment 24•20 years ago
Changing bug status to RESOLVED FIXED. Should I change it to WORKSFORME? Please
verify.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Updated•20 years ago
Status: REOPENED → RESOLVED
Closed: 20 years ago → 20 years ago
Resolution: --- → WORKSFORME