Closed
Bug 200984
Opened 22 years ago
Closed 21 years ago
[FIX]Arabic text in Javascript "unescape" function returns the wrong output
Categories
(Core :: Internationalization, defect, P1)
Core
Internationalization
Tracking
()
VERIFIED
FIXED
mozilla1.4beta
People
(Reporter: neokuwait, Assigned: bzbarsky)
References
Details
Attachments
(3 files, 2 obsolete files)
Gecko/20030328
TO REPRODUCE:
1. view testcase
ACTUAL:
the unescape function produces Latin characters instead of Arabic
EXPECTED:
only characters in their %xx.. hexadecimal form should be affected? (see IE6)
Reporter | ||
Comment 1•22 years ago
|
||
Assignee | ||
Comment 2•22 years ago
|
||
Intl -- not arabic specific.
Assignee: mkaply → smontagu
Status: UNCONFIRMED → NEW
Component: BiDi Hebrew & Arabic → Internationalization
Ever confirmed: true
OS: Windows XP → All
QA Contact: zach → ylong
Hardware: PC → All
Assignee | ||
Comment 3•22 years ago
|
||
Assignee | ||
Comment 4•22 years ago
|
||
Attachment #119646 -
Attachment is obsolete: true
Assignee | ||
Comment 5•22 years ago
|
||
Comment on attachment 119647 [details] [diff] [review]
Oops, that had an extra chunk in it...
Would you review? The key here is that ToNewCString does lossy "conversion" to
a CString.... This patch makes the conversion use the proper charset encoder,
does a bit of cleanup, and removes the unnecessary call to Reset() (new
decoders already come blank).
Attachment #119647 -
Flags: superreview?(jst)
Attachment #119647 -
Flags: review?(smontagu)
Assignee | ||
Comment 6•22 years ago
|
||
mine.
Assignee: smontagu → bzbarsky
Priority: -- → P1
Summary: Arabic text in Javascript "unescape" function returns the wrong output → [FIX]Arabic text in Javascript "unescape" function returns the wrong output
Target Milestone: --- → mozilla1.4beta
Comment 7•22 years ago
|
||
Comment on attachment 119647 [details] [diff] [review]
Oops, that had an extra chunk in it...
// Allocate a buffer of the maximum length
PRUnichar *dest = (PRUnichar*)nsMemory::Alloc(sizeof(PRUnichar) *
maxLength);
+ NS_ENSURE_TRUE(dest, NS_ERROR_OUT_OF_MEMORY);
+
...
+ rv = decoder->Convert(encodedData, &unescapedByteCount, dest, &destLen);
+ if (NS_FAILED(rv)) {
nsMemory::Free(dest);
- return result;
+ return rv;
}
aReturn.Assign(dest, destLen);
nsMemory::Free(dest);
Could you make dest an nsAutoPtr<PRUnichar *> here? If not, you may want to
flip the code around so that you need only one call to nsMemory::Free(dest)...
sr=jst
Attachment #119647 -
Flags: superreview?(jst) → superreview+
Assignee | ||
Comment 8•22 years ago
|
||
nsAutoPtr will call "delete" on the pointer, so that's no good here (unless I
actually do a "new PRUnichar[]" and cast or something like that....
I'll flip the test around; good catch.
Comment 9•22 years ago
|
||
Before I review, can you quiet my suspicions that this is a non-problem?
Specifically, is non-ASCII text legal input to unescape(), and does calling
unescape() with such text have defined output?
Comment 10•22 years ago
|
||
... and another question (possibly for another bug) : is %DA%D1%C8%ED the
correct escaping of the string in the testcase? Why not %0639%0631%0628%064A? Or
possibly %D8%B9%D8%B1%D8%A8%D9%89?
Comment 11•22 years ago
|
||
Comment on attachment 119647 [details] [diff] [review]
Oops, that had an extra chunk in it...
OK, I accept that we need this for compatibility, although we need a new bug to
consider what the correct behaviour is, especially of escape().
Opera, IE, and Konqueror all render the second line of the testcase as:
escaped: %u0639%u0631%u0628%u064A%20%u0639%u0631%u0628%u064A
>+ // To gracefully deal with encoding issues, we have to do the following:
Nit: don't split infinitives
r=smontagu
Attachment #119647 -
Flags: review?(smontagu) → review+
Assignee | ||
Comment 12•22 years ago
|
||
Comment on attachment 119647 [details] [diff] [review]
Oops, that had an extra chunk in it...
Waldemar, Simon asked me to make sure you were OK with this patch.. could you
take a look, please?
Attachment #119647 -
Flags: review+ → review?(waldemar)
Comment 13•22 years ago
|
||
The "new bug" I requested in comment 11 would be a dupe of bug 44272 :-)
Comment 14•22 years ago
|
||
I'll take a look at what the patch does. I'm currently building Mozilla so I
can try it.
Comment 15•22 years ago
|
||
I'm not familiar with the character set used in the first test case, so I used
a UTF-16 test case. encodeURI/decodeURI work as expected, but escape and
unescape don't produce visible output with the patch applied. Hmmmm.....
Assignee | ||
Comment 16•22 years ago
|
||
Comment on attachment 120504 [details]
UTF-16 test case
With this testcase, "escape" produces no output even without my patch...
The problem is the use of flat char* buffers to handle UTF-16 text, of course.
Things die at the first null, which is the first byte if your endianness is
right.
I suppose I could go ahead and rewrite all that stuff in nsGlobalWindow to not
be broken like that....
Attachment #120504 -
Attachment mime type: text/utf16 → text/html
Assignee | ||
Comment 17•22 years ago
|
||
ccing some intl people for opinions
Comment 18•22 years ago
|
||
Asking the more general question: why do the JavaScript 'escape' and 'unescape'
functions care what the document character set is? They are always passed
UTF-16 from the JS engine and should send UTF-16 back to it. 'encodeURI' and
'decodeURI' don't care and work properly.
Assignee | ||
Comment 19•22 years ago
|
||
> why do the JavaScript 'escape' and 'unescape' functions care what the document
> character set is?
Because that's the behavior real web sites expect. In particular, one will
often have a site escape() a string, then on the server unescape it and expect
it to be in the document encoding.... Also, some sites will escape a string
server-side, converting a string in document encoding to %-escapes. When
unescaped, we get a byte sequence that's in the document encoding; to give it
back to the JS engine we have to convert it to UTF-16.
Assignee | ||
Comment 20•21 years ago
|
||
Waldemar, please see comment 19?
Assignee | ||
Comment 21•21 years ago
|
||
Attachment #119647 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #119647 -
Flags: review?(waldemar)
Assignee | ||
Updated•21 years ago
|
Attachment #125691 -
Attachment filename: 琀攀猀琀⸀瀀愀琀挀栀 → test.patch
Attachment #125691 -
Flags: review?(waldemar)
Updated•21 years ago
|
Attachment #125691 -
Flags: review?(waldemar) → review+
Assignee | ||
Comment 22•21 years ago
|
||
Checked in.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 23•21 years ago
|
||
Verified fixed in 06-17 trunk build on WinXP except the problem in bug 44272.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•