Closed
Bug 96716
Opened 23 years ago
Closed 23 years ago
DOM escape() truncates characters inappropriately
Categories
(Core :: Internationalization, defect)
People
(Reporter: rich.foyle, Assigned: smontagu)
Attachments
(3 files)
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
BuildID: 0.9.2
After calling document.getSelection() and verifying that the selection was intact, I
escaped the selection. escape() truncated the text at the special character '— ',
dropping that character and all text after it. The URL has a document that contains the
character.
Reproducible: Always
Steps to Reproduce:
1. Create a bookmark with this as its target URL:
javascript:void(alert(escape(document.getSelection())));
2. Select text that contains the '— ' character.
3. Select the bookmark and observe the output.
Actual Results: The selection was truncated when it hit the character
Expected Results: Should not have truncated the selection
Comment 1•23 years ago (Reporter)
The special character in the form submission for this bug got mangled; it is
the dash. See the text on the specified URL: "KANSAS CITY, Mo. — Investigators
were"
Comment 2•23 years ago
Comment 3•23 years ago
Comment 4•23 years ago
Comment 5•23 years ago
Trying the (corrected) HTML testcase in IE4.x, NN4.x, and Mozilla:
IE4.7
----------------------------------------------------------------
CHARACTERS IN THE STRING "A—B"
char = A charCode = 65
char = — charCode = 8212
char = B charCode = 66
CHARACTERS IN escape("A—B") = "A%u2014B"
char = A charCode = 65
char = % charCode = 37
char = u charCode = 117
char = 2 charCode = 50
char = 0 charCode = 48
char = 1 charCode = 49
char = 4 charCode = 52
char = B charCode = 66
----------------------------------------------------------------
NN4.7
----------------------------------------------------------------
CHARACTERS IN THE STRING "A—B"
char = A charCode = 65
char = — charCode = 8212
char = B charCode = 66
CHARACTERS IN escape("A—B") = "A%97B"
char = A charCode = 65
char = % charCode = 37
char = 9 charCode = 57
char = 7 charCode = 55
char = B charCode = 66
----------------------------------------------------------------
Mozilla 2001-08-22
----------------------------------------------------------------
CHARACTERS IN THE STRING "A—B"
char = A charCode = 65
char = — charCode = 8212
char = B charCode = 66
CHARACTERS IN escape("A—B") = "A"
char = A charCode = 65
----------------------------------------------------------------
You can see the bug here: escape("A—B") gets truncated.
Status: UNCONFIRMED → NEW
Ever confirmed: true
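The per-character dumps in Comment 5 can be reproduced with a short script. This is only a sketch: the original attachment is not shown here, so the helper name `dumpChars` and the output layout are assumptions modeled on the listings above. It relies on the legacy global `escape()`, which browsers and Node.js still provide.

```javascript
// Hypothetical helper (not the original attachment): list each character
// of a string together with its charCode, in the layout used in Comment 5.
function dumpChars(label, s) {
  const lines = ['CHARACTERS IN ' + label];
  for (let i = 0; i < s.length; i++) {
    lines.push('char = ' + s.charAt(i) + ' charCode = ' + s.charCodeAt(i));
  }
  return lines;
}

const sample = 'A\u2014B'; // \u2014 is the dash from the report
const escapedSample = escape(sample);

const report = dumpChars('THE STRING "' + sample + '"', sample).concat(
  dumpChars('escape("' + sample + '") = "' + escapedSample + '"', escapedSample)
);
console.log(report.join('\n'));
```

In an engine that follows the IE4.7 and JS shell behavior, the second listing has eight entries; the Mozilla build tested above would stop after the first one.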
Comment 6•23 years ago
Running the JS shell version of the testcase, I get the following output.
The character "—" (Unicode 2014 or JS charCode 8012) is interpreted by
my Cygwin shell as "ù"
Testing the string "AùB"
CHARACTERS IN THE STRING "AùB"
char = A charCode = 65
char = ù charCode = 151
char = B charCode = 66
CHARACTERS IN escape("AùB") = "A%97B"
char = A charCode = 65
char = % charCode = 37
char = 9 charCode = 57
char = 7 charCode = 55
char = B charCode = 66
Comment 7•23 years ago
Sorry, that's JS charCode 8212 (the decimal representation of hex 2014)
I mistyped '8012'...
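The numbers in Comments 6 and 7 can be checked directly. The sketch below confirms that decimal 8212 is hex 2014, and shows one plausible reading of the shell output: in Windows-1252 the dash is stored as the single byte 0x97 (decimal 151), so a shell that reads the source file one byte per character would see charCode 151. That reading of the Cygwin result is an assumption, not something the comments confirm.

```javascript
// 8212 (decimal) and 2014 (hex) name the same character, U+2014.
const dashCode = 0x2014;
console.log(dashCode);            // decimal form
console.log((8212).toString(16)); // hex form

// Assumption: the shell read the testcase one byte per character.
// In Windows-1252 the dash is the single byte 0x97, i.e. decimal 151.
const singleByteDash = String.fromCharCode(0x97);
const singleByteEscaped = escape('A' + singleByteDash + 'B');
console.log(singleByteDash.charCodeAt(0));
console.log(singleByteEscaped);
```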
Comment 8•23 years ago
Let me ask rogerl if this bug is a JS Engine bug or not. Is it a dupe
of bug 72964, "Pattern matching failing on non-Latin1 characters"?
There, we discovered something wrong with JS Engine string processing
for high Unicode characters (> 00FF).
This comment was made there:
----- Additional Comments From nhotta@netscape.com 2001-03-23 16:19 -----
Looks like non Latin1 characters without unicode escape are regarded as spaces
when a document charset is multibyte (e.g. UTF-8, EUC-JP, GB2312).
Of course, the document charset is a browser-based issue, but we found
problems directly within the JS Engine. This bug may be another consequence.
(?)
Comment 9•23 years ago
NOTE: if I alter the JS shell testcase by defining var cnTEST = 'A\u2014B'
instead of var cnTEST = 'A—B', I get this output:
Testing the string "A¶B"
CHARACTERS IN THE STRING "A¶B"
char = A charCode = 65
char = ¶ charCode = 8212
char = B charCode = 66
CHARACTERS IN escape("A¶B") = "A%u2014B"
char = A charCode = 65
char = % charCode = 37
char = u charCode = 117
char = 2 charCode = 50
char = 0 charCode = 48
char = 1 charCode = 49
char = 4 charCode = 52
char = B charCode = 66
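The two output shapes seen so far, "%97" and "%u2014", follow from how escape() is specified in ECMAScript: characters in a small safe set pass through, charCodes below 256 become %XX, and everything else becomes %uXXXX. The function below is a minimal sketch of that rule, not the JS Engine or DOM source; the names `sketchEscape` and `UNESCAPED_SET` are made up for illustration.

```javascript
// Characters that escape() leaves untouched.
const UNESCAPED_SET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./';

// Illustrative re-implementation of the escape() rule (hypothetical name).
function sketchEscape(s) {
  let out = '';
  for (let i = 0; i < s.length; i++) {
    const ch = s.charAt(i);
    const code = s.charCodeAt(i);
    if (UNESCAPED_SET.indexOf(ch) !== -1) {
      out += ch; // safe character, copied as-is
    } else if (code < 256) {
      // Latin-1 range: percent sign plus two uppercase hex digits
      out += '%' + code.toString(16).toUpperCase().padStart(2, '0');
    } else {
      // Above Latin-1: %u plus four uppercase hex digits
      out += '%u' + code.toString(16).toUpperCase().padStart(4, '0');
    }
  }
  return out;
}

console.log(sketchEscape('A\u2014B')); // input with the real U+2014
console.log(sketchEscape('A\u0097B')); // input with charCode 151
```

Under this rule no input character is ever dropped, which is why output that stops at "A" points to a problem outside the escaping rule itself.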
Comment 10•23 years ago
Looks like the JS Engine is not to blame here. The JS shell is performing
as expected on the testcase above...
One must remember that the DOM has its own escape() function. This supersedes
the JS Engine escape() in the browser. Using a debug build of Mozilla, in fact,
we found that the JS Engine escape() is not even called by the HTML testcase.
Reassigning to DOM Level 0 -
Assignee: rogerl → jst
Component: Javascript Engine → DOM Level 0
OS: Windows 2000 → All
QA Contact: pschwartau → desale
Hardware: PC → All
Summary: escaped text is truncated → DOM escape() truncates characters inappropriately
Comment 11•23 years ago
Here is a very reduced testcase:
javascript: alert(escape('A\u2014B').length);
RESULTS:
IE4.7 : 8
NN4.7 : 5
Moz : 1
JS shell : 8
This summarizes the results from the testcases above, and shows
the truncation occurring in the Mozilla DOM escape().
I'm also curious why the NN4.7 escape() differs from IE4.7's,
but that's another story...
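The lengths in Comment 11 line up with the escaped strings recorded in Comments 5 and 9. As a rough check, the sketch below takes those recorded strings as literals (they are copied from the comments, not regenerated in the old browsers), prints their lengths, and tests whether unescape() gets back to 'A\u2014B'. The `observed` table and `roundTrips` helper are illustrative names.

```javascript
// Escaped strings as recorded in the comments above.
const observed = {
  'IE4.7': 'A%u2014B',
  'NN4.7': 'A%97B',
  'Moz': 'A',
  'JS shell': 'A%u2014B',
};

const original = 'A\u2014B';

// True when unescape() recovers the original three-character string.
function roundTrips(escapedText) {
  return unescape(escapedText) === original;
}

for (const name of Object.keys(observed)) {
  const text = observed[name];
  console.log(name + ' : ' + text.length + ' (round-trips: ' + roundTrips(text) + ')');
}
```

Only the %u2014 form survives a round trip through unescape(). The NN4.7 form decodes to charCode 151 rather than 8212, and the truncated Mozilla result loses the dash and the trailing "B" entirely.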
Comment 12•23 years ago
Over to internationalization.
Assignee: jst → yokoyama
Component: DOM Level 0 → Internationalization
QA Contact: desale → teruko
Comment 14•23 years ago (Assignee)
Dupe of bug 44272. Refer also to discussion in bug 42221.
*** This bug has been marked as a duplicate of 44272 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE