Closed
Bug 246194
Opened 21 years ago
Closed 21 years ago
Crash setting character encoding to Unicode (UTF-16) on any random website [@ nsBlockFrame::GetFrameForPointUsing ]
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: martijn.martijn, Assigned: jshin1987)
References
()
Details
(Keywords: crash, verified1.7, Whiteboard: fixed-aviary1.0)
Crash Data
Attachments
(3 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
smontagu
:
review+
dbaron
:
superreview+
asa
:
approval1.7+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040608 Firefox/0.8.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040608 Firefox/0.8.0+
Talkback ID: TB81836Z
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=TB81836Z
It happens when I change the Character Encoding to UTF-16. Do this:
View -> Character Encoding -> More -> Unicode (UTF-16)
It does not happen with Unicode(UTF-16 Little Endian) or Unicode (UTF-16 Little
Endian) or Unicode (UTF-8).
Originally discussed here (second part of it):
http://forums.mozillazine.org/viewtopic.php?p=570536#570536
Also happens with:
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/2004-05-17
Reproducible: Always
Steps to Reproduce:
1. View -> Character Encoding -> More -> Unicode (UTF-16)
2.
3.
Actual Results:
Crash
Expected Results:
No crash
Updated•21 years ago
|
Summary: Crash setting character encoding to Unicode (UTF-16) on any random website → Crash setting character encoding to Unicode (UTF-16) on any random website [@nsBlockFrame::GetFrameForPointUsing]
Comment 3•21 years ago
|
||
I get this assertion before crashing
Charset Overlay menu item pressed: UTF-16
###!!! ASSERTION: Too few bytes in input: '*aSrcLength >= 2', file
nsUCS2BEToUnicode.cpp, line 221
Break: at file nsUCS2BEToUnicode.cpp, line 221
Updated•21 years ago
|
Keywords: crash
OS: Windows 2000 → All
Summary: Crash setting character encoding to Unicode (UTF-16) on any random website [@nsBlockFrame::GetFrameForPointUsing] → Crash setting character encoding to Unicode (UTF-16) on any random website [@ nsBlockFrame::GetFrameForPointUsing ]
Comment 4•21 years ago
|
||
The assertion was added in bug 235090 on 2004-04-23. I'm not sure if this is a
regression caused by that particular patch but can someone perhaps test that by
backing out that patch?
Reporter | ||
Comment 5•21 years ago
|
||
Well, I don't see a Unicode (UTF-16) menu option in:
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7b) Gecko/2004-04-21
But I see it in:
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/2004-04-26
And in that build it crashes too.
Comment 6•21 years ago
|
||
Stack is similar to bug 242620 and bug 235405. It seems that changing to UTF-16
is making all the content disappear from under us and triggering that bug(s).
jshin, do you think that at
http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvlatin/nsUCS2BEToUnicode.cpp#250
we should be setting mState to 0?
Comment 7•21 years ago
|
||
Wrong again: the source of the problem is in zeroing *aSrcLength and *aDstLength
at the same place.
The caller at
http://lxr.mozilla.org/seamonkey/source/parser/htmlparser/src/nsScanner.cpp#354
then consumes one byte from source and inserts a replacement character in
destination, decrements source length and destination length by one each and
tries to proceed with the conversion.
First problem: mState is still 2, so the converter tries again to find a BOM
where no BOM should be (and my idea in the last comment of zeroing mState in the
converter doesn't help, because the caller calls Reset()). The converter returns
failure again and the loop continues to the end of the source buffer.
Second problem: since the caller expects source length to be in octets and
destination length to be in PRUnichars, the destination buffer is overrun
halfway through the loop.
Comment 8•21 years ago
|
||
Do we even want to be returning NS_ERROR_ILLEGAL_INPUT when we don't suceed in
detecting endianness? Maybe better to default to big-endian and carry on.
Obviously in the situation here we're going to end up displaying garbage
whatever we do, but if the input is really some flavour of UTF-16, we have a 50%
chance of getting it right.
Assignee: general → smontagu
Component: Browser-General → Internationalization
QA Contact: general → amyy
Assignee | ||
Comment 9•21 years ago
|
||
That's exactly what I've been thinking :-).
Comment 10•21 years ago
|
||
Comment on attachment 150462 [details] [diff] [review]
patch
while you're here, please correct the typo "edinanness". With that, r=smontagu
Attachment #150462 -
Flags: review+
Assignee | ||
Comment 11•21 years ago
|
||
Comment on attachment 150462 [details] [diff] [review]
patch
thanks for the prompt r and the analysis that led to the patch. I'll fix the
typos when landing.
i asked dbaron for sr on irc.
i'm asking a1.7 at the same time because it's a crash and the patch is simple
enough that makes mozilla work the way it works for non-UTF-16 pages when
UTF-16LE/UTF-16BE is explicitly selected.
Attachment #150462 -
Flags: superreview?(dbaron)
Attachment #150462 -
Flags: approval1.7?
Comment 12•21 years ago
|
||
Comment on attachment 150462 [details] [diff] [review]
patch
Since this is for the Web, it seems more like you should pick an endianness
rather than using platform endianness. How about little-endian (since x86 is)?
Assignee | ||
Comment 13•21 years ago
|
||
you mean x86 is predominant so that LE is a lot more likely than BE, right? I
agree. I'll do that.
Comment 14•21 years ago
|
||
David, by your own logic I would say pick big-endian, following clause D42 in
chapter 3 of the Unicode standard
http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf:
"The UTF-16 encoding scheme may or may not begin with a BOM. However, when
there is no BOM, and in the absence of a higher-level protocol, the byte order
of the UTF-16 encoding scheme is big-endian."
Assignee: smontagu → jshin
Comment 15•21 years ago
|
||
OK, big-endian then. and sr=dbaron with that change
Comment 16•21 years ago
|
||
Comment on attachment 150462 [details] [diff] [review]
patch
a=asa (on behalf of drivers) for checkin to 1.7 branch. Time is short so please
try to get this in in the next couple of hours if at all possible. Thanks.
Attachment #150462 -
Flags: approval1.7? → approval1.7+
Assignee | ||
Comment 17•21 years ago
|
||
thanks for sr/a. fix landed in both trunk and 1.7branch
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 18•21 years ago
|
||
There's still the "edinanness" typo.
Updated•21 years ago
|
Attachment #150462 -
Flags: superreview?(dbaron) → superreview+
Assignee | ||
Comment 19•21 years ago
|
||
thanks for the note. it turned out that there were two typos (it was
'edinaness') and I fixed only the first 'ed->end' in the prev. checkin. i've
just fixed the second.
Assignee | ||
Comment 20•21 years ago
|
||
Just thought that for aviary-1.0 and ff 0.9 branch landing, I'd better put this
up.
Updated•21 years ago
|
Whiteboard: fixed-aviary1.0
Updated•21 years ago
|
Flags: blocking1.7?
Comment 21•21 years ago
|
||
Verified on Mozilla 1.7 branch. Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
rv:1.7) Gecko/20040617.
Keywords: fixed1.7 → verified1.7
Updated•14 years ago
|
Crash Signature: [@ nsBlockFrame::GetFrameForPointUsing ]
You need to log in
before you can comment on or make changes to this bug.
Description
•