Closed
Bug 28474
Opened 25 years ago
Closed 25 years ago
illegal use of nsString-external JavaScript convert charset incorrectly
Categories
(Core :: Layout, defect, P3)
Tracking
()
VERIFIED
FIXED
M16
People
(Reporter: cwang, Assigned: jbetak)
References
()
Details
(Whiteboard: [PDT+]have fix. r=ftang,rickg. a=bobj, partial fix in on Friday, resolving performance issues with ftang, RickG)
Attachments
(5 files)
OS Win 98
Netscape 6 2000021808-M14
Steps to reproduce
Load page www.cww.com
Results:
Links at the bottom of the page and the date at the top display incorrectly.
Comment 2•25 years ago
I've isolated the problem to a single instance culled from
the above. This is the Chinese (GB2312) date display.
Here's the image of incorrect date display with Mozilla on a test page
I created. I also append the correct display image obtained on
the same test page by 4.72.
URL: cww.com → http://www.cww.com/
Comment 3•25 years ago
Comment 4•25 years ago
Comment 5•25 years ago
Here's what the source of the page looks like:
....
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
....
<font color=#000080><strong>WXYZ</strong>
<script language="javascript" src="time.js">
</script>
where WXYZ = 4 Chinese characters
Note that the date is actually read via an external JS script. Note also
that this page has a meta charset tag indicating the GB2312 charset.
The external script looks like this:
var today = new Date();
document.write('(');
document.write(today.getMonth()+1);
document.write('A');
document.write(today.getDate());
document.write('B');
document.write(')');
where A = GB2312-encoded character for "Month" and
B = GB2312-encoded character for "Day"
------------
So the problem is that Mozilla does not process the characters
generated by the external js script as GB2312. Looking at the
incorrect image, the values look like GB2312 characters read in
as ASCII data rather than as GB2312 as indicated by the meta
tag. 4.7x does read them in and regards them as characters
matching the meta tag.
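For illustration, the symptom can be reproduced outside the browser. The sketch below is not Mozilla code: inflateBytes is a hypothetical helper, and the sample data is assumed to be the GB2312 bytes for the "Month" character (0xD4 0xC2). Widening each byte to its own character, which is what reading the data as ASCII/Latin-1 amounts to, yields two accented Latin letters instead of one Chinese character.

```javascript
// Hypothetical helper (not a Mozilla API): widen each byte to one
// UTF-16 code unit, i.e. interpret the bytes as ISO-8859-1 (Latin-1).
function inflateBytes(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// Assumed sample data: GB2312 encoding of U+6708 ("Month") is 0xD4 0xC2.
const monthBytes = Uint8Array.of(0xd4, 0xc2);
const misread = inflateBytes(monthBytes);

// Two Latin-1 characters (U+00D4, U+00C2) where one Chinese character
// was intended.
console.log(misread.length, misread === '\u00d4\u00c2');
```

The same byte-for-byte widening applies to every GB2312 character the external script writes, which would match the garbled date in the attached image.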
I attach 2 test cases + 1 external JS file below.
Case 1: A brief page containing the 4 Chinese words in GB2312
and an external JS file which generates today's date
and surrounds it with the Chinese words for "Month" and
"Date".
This does not display OK on Mozilla but does OK on 4.7x.
Case 2: Identical file to the above -- 4 Chinese words followed by
a JS function which generates today's date. However, in this
example, I wrote in the JS function inside the page.
This displays OK on both Mozilla and 4.7x.
Comment 6•25 years ago
Comment 7•25 years ago
Comment 8•25 years ago
Comment 9•25 years ago
If you use the first html page, china.html, with "time.js" file,
Mozilla will not display the date well even with a Simplified
Chinese font. 4.7x does OK with this file.
If you use the 2nd file, Mozilla displays the GB2312 date OK.
I don't know who should get this bug. I18n, JS, or layout?
Comment 10•25 years ago
Ftang: I think this belongs to you. The problem with this page is the author has
encoded the charset spec wrong. They coded it like this:
WRONG: <meta http-equiv="Content-Type" content="text/html charset="gb2312">
RIGHT: <meta http-equiv="Content-Type" content="text/html" charset="gb2312">
Nonetheless, you may need to support this style.
Comment 11•25 years ago
As far as I know the correct meta charset line should look like this:
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-5">
See this W3C document:
http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#adef-http-equiv
I also note that the above Chinese page(s) use this same style -- copied
from their main and frame pages:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
Also Netscape Japanese Home Page uses exactly the same style:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
and we display this page OK on Mozilla. Actually Mozilla can display
the Chinese page OK. The only thing it cannot display OK is
the external JS generated Chinese date.
I don't think we can blame this problem on the incorrect meta-tag usage.
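To make the meta-tag style above concrete, here is a minimal sketch of pulling the charset parameter out of the content value. charsetFromContentType is a hypothetical helper written for this note, not the parser Mozilla or 4.x uses, and the regular expression is an assumption about what a lenient reader might accept.

```javascript
// Hypothetical helper (not a Mozilla API): extract the charset
// parameter from a Content-Type value such as the content attribute
// of <meta http-equiv="Content-Type" content="text/html; charset=gb2312">.
function charsetFromContentType(value) {
  const match = /;\s*charset\s*=\s*"?([^\s;"]+)"?/i.exec(value);
  // Charset names are case-insensitive, so normalize to lower case.
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType('text/html; charset=gb2312'));
console.log(charsetFromContentType('text/html; charset=ISO-8859-5'));
console.log(charsetFromContentType('text/html'));
```

All three meta lines quoted in this comment use the "type; charset=name" form this sketch expects.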
Comment 12•25 years ago
I've dealt with this problem in Netscape 4.X. The problem is that the external
JS script does not work because it is assumed to be in some charset other than
the one that the HTML file uses. If the external JS script arrives over HTTP,
and the HTTP Content-Type header does not have a charset parameter, the browser
must make an assumption, and the assumption that we chose in Netscape 4.X is to
use the HTML document's charset, which is quite reasonable. Now, for backward
compatibility reasons, I would suggest that Mozilla do the same.
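The fallback described here can be sketched as follows. This is an illustration only: scriptCharset is a hypothetical function name, and the header parsing is a simplified assumption rather than the actual 4.x logic.

```javascript
// Hypothetical sketch of the 4.x behaviour described above: use the
// charset parameter of the script's HTTP Content-Type header when it
// is present, otherwise fall back to the HTML document's charset.
function scriptCharset(httpContentType, documentCharset) {
  const match = /;\s*charset\s*=\s*"?([^\s;"]+)"?/i.exec(httpContentType || '');
  return match ? match[1].toLowerCase() : documentCharset;
}

// No charset parameter on the script response: inherit from the page.
console.log(scriptCharset('application/x-javascript', 'gb2312'));
// Explicit charset parameter on the script response: it wins.
console.log(scriptCharset('application/x-javascript; charset=Shift_JIS', 'gb2312'));
```

With this rule, time.js served without a charset parameter would be read as GB2312 on the test page.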
Comment 13•25 years ago
I am very very confused now.
1. Is the JS embedded in the HTML file or in a separate JS file?
2. If it is a separate file, then
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=5485 is not a good case
for this bug since it is embedded in HTML.
RickG's comment about the meta tag is not related to this bug, since the "today's
news" part displays correctly and only the date part is wrong. If this were related
to the META tag problem, then the whole thing would not display correctly.
Updated•25 years ago
Status: NEW → ASSIGNED
Comment 14•25 years ago
I'm sorry about the confusion.
Bugzilla does not recognize file types very well.
What you need to do is save these files on your local disk:
A. You need to save (id=5483) and (id=5484) into the
same directory. Call the first one "china.html" and
the 2nd one "time.js". The 2nd name must be that name
because the first page calls it by that name.
These 2 files will show you the original problem in a much shorter
document.
B. Save the 3rd file, (id=5485), as "china2.html". This file
is essentially the same as "china.html" but I included the
content of "time.js" in this file itself. This page will
not have a problem displaying GB2312 characters generated
by JS.
Looking through Bugzilla, teruko reported on a related issue
in Bug 12813. This is a bug where the source charset for
external .js file is not supported.
The current problem is where there is no source charset indicated
except the charset indicated by the meta charset tag on
the web page. I believe that this is more common in web pages
today than the source charset case.
It does not look like browser QA is on the CC line. CC'ing teruko
and blee.
Comment 15•25 years ago
If you don't mind an internal URL, here is a page which
has exactly the same data as the first (problematical) test
case files.
http://kaze:8000/bugs/bug28474.html
Comment 16•25 years ago
try http://warp/u/ftang/tmp/cww.html instead
Comment 17•25 years ago
I turned on the assertion code described in 28424 and visited
http://warp/u/ftang/tmp/cww.html . I caught the problem right there!!!
This is another misuse-of-nsString bug.
It asserts in
4088 warren 3.263 NS_IMETHODIMP
4089 valeski 3.301 HTMLContentSink::OnStreamComplete(nsIStreamLoader* aLoader,
4090 nsISupports* aContext,
4091 nsresult aStatus,
4092 PRUint32 stringLen,
4093 const char* string)
4094 vidur 3.132 {
4095 warren 3.263 nsresult rv = NS_OK;
4096 warren 3.277 nsString aData(string, stringLen);
4097 vidur 3.132
warren in 3.277 (probably a change from vidur's code) passes a char* to nsString
aData, which causes this problem. The string contains non-ASCII, non-ISO-8859-1
data.
How to fix it? Call GetDocumentCharacterSet() on mDocument (a method of
nsIDocument) to get the charset, call the character set converter manager to get
an nsIUnicodeDecoder, and use the decoder to convert the char* string into PRUnichar
before passing it to nsString.
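The shape of that fix can be sketched outside Gecko. The block below is an analogy in JavaScript, not the actual patch: decodeScript is a hypothetical name, TextDecoder stands in for nsIUnicodeDecoder, and the documentCharset argument stands in for the value GetDocumentCharacterSet() would return. It assumes a runtime whose TextDecoder supports the "gb2312" label (for example, Node.js built with full ICU).

```javascript
// Hypothetical stand-in for the fix described above: decode the raw
// script bytes with the document's charset instead of widening each
// byte to one character.
function decodeScript(bytes, documentCharset) {
  // TextDecoder plays the role of nsIUnicodeDecoder in this sketch.
  return new TextDecoder(documentCharset).decode(bytes);
}

// Assumed sample data: a quoted "Month" character as it might appear
// in time.js, i.e. ' 0xD4 0xC2 ' in GB2312.
const scriptBytes = Uint8Array.of(0x27, 0xd4, 0xc2, 0x27);
const text = decodeScript(scriptBytes, 'gb2312');

// One Chinese character (U+6708) between the two quotes.
console.log(text.length, text === "'\u6708'");
```

The design point is the same as in the proposal: the byte-to-Unicode step needs a charset, and the document's charset is the one available at this point in the content sink.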
Reassign this to vidur, cc warren
add 28424 to the depend list.
Updated•25 years ago
Summary: Incorrect character display (day format and links) → illegal use of nsString-Incorrect character display (day format and links)
Comment 18•25 years ago
So it seems to me like it's hardly ever going to be valid to construct an
nsString from a char* without a charset decoder. Maybe we should remove that
constructor in favor of one that requires a decoder.
Comment 19•25 years ago
Frank, you seem to know what the right fix is. Rather than have me stumble
through this, I'd appreciate it if you could make the fix and have me review it.
Thanks.
Assignee: vidur → ftang
Comment 20•25 years ago
Reassign to module owner. Vidur, I simply cannot fix all these kinds of bugs. I
am playing your whitebox QA here. You should not expect your QA to fix your code.
Assignee: ftang → vidur
Comment 21•25 years ago
Then what exactly does the i18n group do? Fine - it'll get done at some point.
Status: NEW → ASSIGNED
Target Milestone: M16
Comment 22•25 years ago
>Then what exactly does the i18n group do?
The i18n group writes all the libraries under mozilla/intl, just as the Gecko
group writes all the libraries under mozilla/layout and the JS group writes all
the code under mozilla/js. The i18n group does write some sample usage of the
intl library, but we don't fix every misuse of the intl library, just as the
JavaScript group won't fix bugs in every .js file.
Comment 23•25 years ago
When do you plan to fix this? Please provide an ETA.
Comment 24•25 years ago
Vidur's on sabbatical. I know it's his bug, but Frank, can you just fix this for
us? Thanks.
Assignee: vidur → ftang
Status: ASSIGNED → NEW
Updated•25 years ago
Status: NEW → ASSIGNED
Comment 26•25 years ago
Naming this beta1 because it blocks 32215. See the screenshot in 23315 for detail.
jbetak - can you verify this with your fix? For 32215, checking the document
itself is enough.
Comment 27•25 years ago
as discussed with ftang - we have a fix for this problem. It's a very contained
modification to one file (HTMLContentSink) ensuring that an external JavaScript
file gets loaded using the HTML document encoding instead of the HTML default
Latin-1.
Comment 28•25 years ago
Must for Beta1. Sidebar must work for Japanese content. We have a Japanese
3rd party contracted for sidebar content but it's blocked by this bug.
See comments in bug 32215.
Comment 29•25 years ago
*** Bug 32215 has been marked as a duplicate of this bug. ***
Comment 30•25 years ago
Putting on PDT+ radar for beta1. Please contact rickg for approval to check
in.
Whiteboard: have fix. r=ftang → [PDT+]have fix. r=ftang
Comment 31•25 years ago
If you have a chance to check this in on Sunday night or sooner, please call
Rick at home to get his approval. Clearing this by very early Monday morning
will get it in the verification build.
We really need these items checked into the branch, or we are going to be forced
to miss them RSN.
Thanks,
Jim
Comment 32•25 years ago
ftang and jbetak talked to rickg on Fri evening. jbetak is working on some
suggestions made by rickg and expects to have this done by Monday.
Whiteboard: [PDT+]have fix. r=ftang → [PDT+]have fix. r=ftang,rickg. a=bobj
Updated•25 years ago
Whiteboard: [PDT+]have fix. r=ftang,rickg. a=bobj → [PDT+]have fix. r=ftang,rickg. a=bobj, partial fix in on Friday, resolving performance issues with ftang, RickG
Comment 33•25 years ago
jbetak and rickg have agreed on the remaining fix; juraj will be checking in at
around 7:30 today (3/20/00).
Comment 34•25 years ago
OK, prechecking tests look good - closing down. Thanks for all your help RickG!
I'm opening a new bug 32604 for the trunk fix; we didn't put in all the
necessary functionality and changes for Beta1 because of the perceived risk.
Comment 36•25 years ago
** Checked with 3/21/2000 Win32 build **
The original problem at the CWW China site no longer occurs, either for the date or for the
boilerplate link template at the very bottom of the page. My test case and ftang's test case
work. The most critical test case at Arukikata also works now. I still cannot check
the portion which contains a layer, but I assume the part without a layer is working even though
the data are read in from an external source.
I think these are proof enough that the fix has achieved its mission.
Marking it verified as fixed.
Status: RESOLVED → VERIFIED
SPAM. HTML Element component deprecated, changing component to Layout. See bug
88132 for details.
Component: HTML Element → Layout