Closed Bug 116143 Opened 23 years ago Closed 23 years ago

windows-1252 conversion is not round trip

Status:

RESOLVED FIXED

Milestone:

mozilla0.9.8

People

(Reporter: ftang, Assigned: ftang)

Attachments

(1 file)

the same patch r=shanjian without mpl change, 23 years ago, Frank Tang (deleted), patch; brendan: review+, brendan: superreview+

Frank Tang

Assignee

Description

•

23 years ago

This is true for other encodings as well; we may want to fix all single-byte encodings so that they round trip for CGI. This bug covers windows-1252.
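To make "round trip" concrete, here is a minimal sketch that checks which byte values survive a decode followed by an encode in a single-byte encoding. It uses Python's built-in codecs purely as a stand-in; the function name `non_round_trip_bytes` is hypothetical, and Python's conversion tables are not Mozilla's, so the result illustrates the property rather than reproducing this bug.

```python
from typing import List


def non_round_trip_bytes(encoding: str) -> List[int]:
    """Return the byte values that do not survive decode -> encode.

    Hypothetical helper for illustration only; it exercises Python's
    codec tables, not the Mozilla converters discussed in this bug.
    """
    failures = []
    for value in range(256):
        raw = bytes([value])
        try:
            # A byte round trips if decoding to Unicode and encoding
            # back yields exactly the same byte.
            if raw.decode(encoding).encode(encoding) != raw:
                failures.append(value)
        except UnicodeError:
            # Undefined positions raise instead of converting.
            failures.append(value)
    return failures


if __name__ == "__main__":
    # In Python's cp1252 table the undefined positions are
    # 0x81, 0x8D, 0x8F, 0x90 and 0x9D.
    print([hex(b) for b in non_round_trip_bytes("cp1252")])
    # Python's latin-1 codec maps all 256 byte values.
    print(non_round_trip_bytes("latin-1"))
```

A converter that is round trip in the sense of this bug would return an empty list for every byte value, including the positions that the encoding leaves undefined.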

Frank Tang

Assignee

Comment 1

•

23 years ago

Here is the patch from bug 87736 that makes the conversion round trip: http://bugzilla.mozilla.org/attachment.cgi?id=40024&action=view Could we get r= on this patch regardless of whether we want to fix 87736 or not?

Assignee: yokoyama → ftang

Target Milestone: --- → mozilla0.9.8

Frank Tang

Assignee

Updated

•

23 years ago

Status: NEW → ASSIGNED

Katsuhiko Momoi

Comment 2

•

23 years ago

The above URL contains test cases created by bclary. Here is the explanation of the cases:

** These files contain a form with hidden field values. The hidden field values can contain every possible 8-bit value, including control characters. When the page is loaded, the hidden values are sent to an echo script, which is currently Netscape-internal. We may substitute an echo script on an external server. This type of technique is apparently used in secure authentication, and we have had 2 inquiries about this problem just this week. These sites generate random encrypted values for login names or other key values and then send these values back to the server.

** Explanation of files **

The hidden field values are identical in all the test cases below:

x5Ex89x7Ex7Fx80x81xA4x82xA5x83x84x85x86x87x88x89x8Ax8Bx8Cx8Dx8E
x90x91x92x93x94x95x96x97x98x99x9Ax9Bx9Cx9Dx9Ex9FxA2xA3xA4xA5xA6
xA7xA8xA9xAAxABxACxADxAExAFxB0xB1xB2xB3xB4xB5xA7xB6xB7xB8xB9xBA
xBBxBCxBDxBExBFxC5xC6xC7xCBxCCxD0xD1x3DxD6xD7xD8xDCxDDxDExDFxE5
xE6xE7xEBxECxF0xF1x83xC7xF6xF7xF8xFCxFDxFExF

1. iso-8859-1.html
This file has meta charset info indicating that it is in ISO-8859-1.

2. windows-1252.html
The same hidden value as case 1, except that this page has meta charset info indicating that it is in Windows-1252.

3. nocharset.html
The same hidden value as case 1, except that this page has NO meta charset info. Use "User-defined" encoding for testing.

4. Client-nocharset.html
The same hidden value as case 1, except that this page has NO meta charset info. There is also client-side escaping of all 8-bit values.

Our initial test shows that the 2001-12-19 win32 trunk build fails all 4 test cases. For cases 1 through 3, the echo buffer shows only about 10 characters or fewer, so many of the submitted characters are missing. Each case returns somewhat different results.
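The client-side escaping in test case 4 can be sketched as percent-encoding each byte before submission and undoing it on the server. This is only an assumption about what "escaping" means here; the actual test page and the Netscape-internal echo script are not shown in this bug, and `escape_field` and `echo_field` are hypothetical names.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def escape_field(raw: bytes) -> str:
    """Percent-encode a hidden field value byte by byte.

    Hypothetical stand-in for the test page's client-side escaping.
    With safe="" only unreserved ASCII characters stay literal.
    """
    return quote_from_bytes(raw, safe="")


def echo_field(escaped: str) -> bytes:
    """Hypothetical stand-in for the echo script: undo the escaping."""
    return unquote_to_bytes(escaped)


if __name__ == "__main__":
    all_bytes = bytes(range(256))
    # No charset conversion is involved, so every 8-bit value,
    # control characters included, comes back unchanged.
    assert echo_field(escape_field(all_bytes)) == all_bytes
    print(escape_field(b"\x80\x81"))
```

Because the escaped form is plain ASCII, it never passes through a charset converter, which would be consistent with the client-side escaping case behaving differently from cases 1 through 3.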

Katsuhiko Momoi

Comment 3

•

23 years ago

Forgot to add the URL.

URL: http://bclary.com/dia/

Katsuhiko Momoi

Comment 4

•

23 years ago

I said above that the 2001-12-19 Win32 trunk build truncates the input values to 10 or fewer characters in all 3 cases. I tried the Netscape 6.2.1-RTM build with the above test cases and the results are much better.

For the iso-8859-1 page, it fails to convert characters that do not exist in this code page (and also x7F) and turns them into x3F (?). But all bytes are there.

For windows-1252, the results are better than in the iso-8859-1 case because it also processes values x80 - x9F. The test showed that it missed only 1 character in conversion, i.e. x81, which is probably still undefined for this encoding.

If we use User-defined encoding on the 'nocharset.html' test case, the results are correct and preserve all values in the echo buffer.

Comparing these results shows that there has been a regression in this area between the 0.9.4 branch (NS 6.2.1) and the latest trunk.
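The x3F substitution described above is what a lossy decode and encode pair produces: the byte count is preserved, but undefined positions collapse to "?". The sketch below reproduces that effect with Python's cp1252 codec and its `errors="replace"` mode. It is an analogy, not the Netscape 6.2.1 converter, and `lossy_round_trip` is a hypothetical name.

```python
def lossy_round_trip(raw: bytes, encoding: str) -> bytes:
    """Decode and re-encode, substituting for unmappable values.

    Illustration only: uses Python's codec error handling, not the
    Mozilla converters discussed in this bug.
    """
    # Undefined bytes decode to U+FFFD (REPLACEMENT CHARACTER).
    text = raw.decode(encoding, errors="replace")
    # U+FFFD has no mapping in the target encoding, so it is
    # written back as "?" (0x3F).
    return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    sample = bytes([0x80, 0x81, 0x82])
    result = lossy_round_trip(sample, "cp1252")
    # 0x80 and 0x82 are defined in cp1252 and survive;
    # 0x81 is undefined and comes back as 0x3F.
    print(result)  # b'\x80?\x82'
```

All bytes are still present after such a conversion, but the original value at the undefined position cannot be recovered, which is why a server comparing the echoed value against what it generated would see a mismatch.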

Katsuhiko Momoi

Comment 5

•

23 years ago

I tried the patch at: http://bugzilla.mozilla.org/attachment.cgi?id=40024&action=view on the current trunk build compiled from source tonight. The 4 test cases at: http://bclary.com/dia produced exactly the same results -- they correctly reflected that all the characters in the form hidden field value were sent to the server. The client-side escaping seems to be working with or without this patch. So this patch produces the desired result.

There are 2 remaining issues:

1. One other Western encoding that is likely to be used by Latin 1 web sites is ISO-8859-15. We don't have a patch for that encoding yet. ISO-8859-1 will be taken care of by the above patch. I will file a separate bug for it. It should be fixed in the next milestone.

2. In cases where Mozilla found characters undefined in a certain code page, form submission truncated everything after that character, as reported above in comment 2 for the current trunk builds. Such truncation does not happen for 0.9.4 builds. This truncation is likely to occur in other encodings even if we take care of 8859-1, 8859-15 and Windows-1252. I will file a separate bug for this.

Katsuhiko Momoi

Comment 6

•

23 years ago

Surprisingly, when I tested the 2 remaining problem scenarios with encodings other than Western ones, such as 8859-7, 8859-5, etc., I found that there was no truncation when I used the patched build. I have not tested fully yet, but can ftang explain this result?

Frank Tang

Assignee

Comment 7

•

23 years ago

momoi, let's move the other charsets to a separate bug and discuss them there. Let's keep the comments on this bug report clear for landing the current patch to fix the 1252 issue. nhotta, can you r= it?

Frank Tang

Assignee

Updated

•

23 years ago

Blocks: 104056

Shanjian Li

Comment 8

•

23 years ago

Looks fine. r=shanjian

Frank Tang

Assignee

Updated

•

23 years ago

Blocks: 104148
No longer blocks: 104056

Frank Tang

Assignee

Comment 9

•

23 years ago

Attached patch: the same patch r=shanjian without mpl change (deleted)

Brendan Eich [:brendan]

Comment 10

•

23 years ago

Comment on attachment 62891: the same patch r=shanjian without mpl change. Recording r=shanjian too. sr=brendan@mozilla.org. /be

Attachment #62891 - Flags: superreview+

Attachment #62891 - Flags: review+

Frank Tang

Assignee

Updated

•

23 years ago

Blocks: 104160
No longer blocks: 104148

Frank Tang

Assignee

Updated

•

23 years ago

Blocks: 104060
No longer blocks: 104160

Frank Tang

Assignee

Comment 11

•

23 years ago

Fixed and checked in. File a separate bug for the other charsets.

Frank Tang

Assignee

Comment 12

•

23 years ago

Fixed and checked in.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Katsuhiko Momoi

Comment 13

•

23 years ago

Let me take this QA work.

QA Contact: teruko → momoi

Frank Tang

Assignee

Updated

•

23 years ago

No longer blocks: 104060
