Bug 116143: windows-1252 conversion is not round trip
Status: RESOLVED FIXED (Closed)
Opened 23 years ago, Closed 23 years ago
Categories: Core :: Internationalization, defect
Target Milestone: mozilla0.9.8
People: Reporter: ftang, Assigned: ftang
Attachments: 1 file: patch (deleted); brendan: review+; brendan: superreview+
This is also true for other encodings; we may want to fix all single-byte
encodings to make them round trip for CGI.
This bug covers windows-1252.
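A minimal sketch of what "round trip" means for a single-byte encoding, using Python's built-in codecs as a stand-in for Mozilla's converters (an assumption; their tables may differ). The helper name `non_round_trip_bytes` is hypothetical and not taken from the patch. It decodes each byte value, re-encodes it, and reports the values that do not come back unchanged.

```python
def non_round_trip_bytes(encoding: str) -> list:
    """Return the byte values that do not survive decode -> encode unchanged."""
    failures = []
    for value in range(256):
        raw = bytes([value])
        try:
            if raw.decode(encoding).encode(encoding) != raw:
                failures.append(value)
        except UnicodeError:
            # The codec has no mapping for this byte, so it cannot round trip.
            failures.append(value)
    return failures


if __name__ == "__main__":
    for name in ("cp1252", "latin-1"):
        print(name, [f"x{v:02X}" for v in non_round_trip_bytes(name)])
```

With Python's codecs, `cp1252` reports the five byte values it leaves undefined (x81, x8D, x8F, x90, x9D), while `latin-1` reports none.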
Comment 1•23 years ago (Assignee)
Here is the patch from bug 87736 to make it round trip:
http://bugzilla.mozilla.org/attachment.cgi?id=40024&action=view
Could we get r= on this patch regardless of whether we want to fix 87736 or not?
Assignee: yokoyama → ftang
Target Milestone: --- → mozilla0.9.8
Updated•23 years ago (Assignee)
Status: NEW → ASSIGNED
Comment 2•23 years ago
The above URL contains test cases created by bclary. Here's the explanation
of the cases:
** These files contain a form with hidden field values. The hidden
field values can contain every possible 8-bit value, including control
characters. When the page is loaded, the hidden values are sent to an
echo script, which is currently Netscape-internal. We may substitute
an echo script on an external server.
This type of technique is apparently used in secure authentication,
and we have had 2 inquiries about this problem just this week. These
sites generate random encrypted values for login names or other
key values and then send these values back to the server.
** Explanation of files
** The hidden field values are identical in all the test cases below:
x5Ex89x7Ex7Fx80x81xA4x82xA5x83x84x85x86x87x88x89x8Ax8Bx8Cx8Dx8E
x90x91x92x93x94x95x96x97x98x99x9Ax9Bx9Cx9Dx9Ex9FxA2xA3xA4xA5xA6
xA7xA8xA9xAAxABxACxADxAExAFxB0xB1xB2xB3xB4xB5xA7xB6xB7xB8xB9xBA
xBBxBCxBDxBExBFxC5xC6xC7xCBxCCxD0xD1x3DxD6xD7xD8xDCxDDxDExDFxE5
xE6xE7xEBxECxF0xF1x83xC7xF6xF7xF8xFCxFDxFExF
1. iso-8859-1.html
This file has meta charset info indicating that it is in ISO-8859-1.
2. windows-1252.html
The same hidden value as case 1 except that this page has meta
charset info indicating that it is in Windows-1252.
3. nocharset.html
The same hidden value as case 1 except that this page has NO meta
charset info. Use "User-defined" encoding for testing.
4. Client-nocharset.html
The same hidden value as case 1 except that this page has NO meta
charset info. There is also client-side escaping of all 8-bit
values.
Our initial test shows that the 2001-12-19 win32 trunk build fails
all 4 test cases. For cases 1 - 3, the buffer seems to show only
about 10 characters or fewer, with many of them missing. Each case
returns somewhat different results.
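The client-side escaping in case 4 can be sketched as follows. This assumes the escaping is ordinary percent-encoding of the raw bytes and that the echo script reverses it byte for byte; neither detail is spelled out in the test description, and the function names are hypothetical.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def escape_field_value(raw: bytes) -> str:
    """Percent-escape the raw field bytes so only ASCII goes on the wire."""
    return quote_from_bytes(raw, safe="")


def echo_received_bytes(escaped: str) -> bytes:
    """What a byte-preserving echo script would recover from the escaped value."""
    return unquote_to_bytes(escaped)


if __name__ == "__main__":
    # First few bytes of the hidden field value used in the test cases.
    sample = bytes([0x5E, 0x89, 0x7E, 0x7F, 0x80, 0x81, 0xA4])
    escaped = escape_field_value(sample)
    print(escaped)
    print(echo_received_bytes(escaped) == sample)
```

Because the escaped form is pure ASCII, it does not depend on the page's charset conversion, which is the point of a test case that escapes on the client.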
Comment 4•23 years ago
I said above that the 2001-12-19 Win32 trunk build truncates
the input values to 10 or fewer characters in all 3 cases.
I tried the Netscape 6.2.1-RTM build with the above test cases
and the results are much better.
For the iso-8859-1 page, it fails to convert characters that do not
exist in this code page (and also x7F) and turns them into x3F (?). But
all bytes are there.
For windows-1252, the results are better than the
iso-8859-1 case because it also processes values x80 - x9F.
The test showed that it missed only 1 character in conversion, i.e.
x81, which is probably still undefined for this encoding.
If we use User-defined encoding on the 'nocharset.html' test
case, the results are correct and preserve all values in the
echo buffer.
Comparing these results shows that there has been a regression
in this area between the 0.9.4 branch (NS 6.2.1) and the latest
trunk.
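The difference between the lossy x3F substitution and a byte-preserving conversion can be sketched as below, again using Python's `cp1252` codec as a stand-in. Mapping each undefined byte to the code point with the same value is one common way to make a single-byte converter round trip; whether the patch under review does exactly this is an assumption, since the attachment is not shown in this thread. All names here are hypothetical.

```python
# Byte values that Python's cp1252 codec leaves undefined (a property of this
# codec, assumed here to approximate the converter discussed in the bug).
UNDEFINED_CP1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def lossy_echo(raw: bytes) -> bytes:
    """Decode and re-encode with replacement: undefined bytes come back as x3F."""
    text = raw.decode("cp1252", errors="replace")
    return text.encode("cp1252", errors="replace")


def decode_preserving(raw: bytes) -> str:
    """Decode cp1252, mapping undefined bytes to the same-valued code point."""
    return "".join(
        chr(b) if b in UNDEFINED_CP1252 else bytes([b]).decode("cp1252")
        for b in raw
    )


def encode_preserving(text: str) -> bytes:
    """Inverse of decode_preserving: restore the original byte values."""
    out = bytearray()
    for ch in text:
        if ord(ch) in UNDEFINED_CP1252:
            out.append(ord(ch))
        else:
            out += ch.encode("cp1252")
    return bytes(out)


if __name__ == "__main__":
    every_byte = bytes(range(256))
    print(lossy_echo(b"\x81"))
    print(encode_preserving(decode_preserving(every_byte)) == every_byte)
```

The lossy path keeps the byte count but loses the value of each undefined byte, which matches the "all bytes are there" observation above; the preserving path returns every value unchanged.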
Comment 5•23 years ago
I tried the patch at:
http://bugzilla.mozilla.org/attachment.cgi?id=40024&action=view
on the current trunk build compiled from source tonight.
The 4 test cases at:
http://bclary.com/dia
produced exactly the same results -- they correctly reflected
that all the characters in the form hidden field value were
sent to the server. The client-side escaping seems to be
working with or without this patch.
So this patch produces the desired result.
There are 2 remaining issues:
1. One other Western encoding that is likely to be used by
Latin 1 web sites is ISO-8859-15. We don't have a patch for
that encoding yet. ISO-8859-1 will be taken care of by the
above patch.
I will file a separate bug for it. It should be fixed in the
next milestone.
2. When Mozilla finds a character that is undefined in a
given code page, form submission truncates the characters
after that character, as reported above in comment 2 for
the current trunk builds. Such truncation does not happen for 0.9.4
builds. This truncation is likely to occur in other encodings
even if we take care of 8859-1, 8859-15 and Windows-1252.
I will file a separate bug for this.
Comment 6•23 years ago
Surprisingly, when I tested the 2 remaining problem scenarios
with encodings other than Western ones, such as 8859-7, 8859-5, etc.,
I found that there was no truncation when I used the patched build.
I have not tested fully yet, but can ftang explain this result?
Comment 7•23 years ago (Assignee)
momoi, let's move the other charsets to a separate bug and discuss them there. Let's keep
the comments in this bug report clear for landing the current patch to fix the
1252 issue.
nhotta, can you r= it?
Comment 8•23 years ago
Looks fine. r=shanjian
Comment 9•23 years ago (Assignee)
Comment 10•23 years ago
Comment on attachment 62891 [details] [diff] [review]
the same patch r=shanjian without mpl change
Recording r=shanjian too. sr=brendan@mozilla.org.
/be
Attachment #62891 - Flags: superreview+
Attachment #62891 - Flags: review+
Comment 11•23 years ago (Assignee)
Fixed and checked in. File a separate bug for the other charsets.
Comment 12•23 years ago (Assignee)
Fixed and checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED