Open
Bug 260728
Opened 20 years ago
Updated 2 years ago
Encoding coercion to the default encoding should only happen for 'standard'/unspecified encodings
Categories
(MailNews Core :: Internationalization, defect)
MailNews Core
Internationalization
Tracking
(Not tracked)
NEW
People
(Reporter: eyalroz1, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: intl, Whiteboard: [patchlove][has draft patch][needs new assignee?])
Attachments
(1 file)
(deleted),
patch
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a3) Gecko/20040817
Build Identifier:
(this is part of a split-up of dupe bug 260706 into non-dupe pieces to be
tracked from bug 254868)
The current default encoding coercion scheme is not the most effective 'cheap'
coercion possible. Even without checking the message body to see whether the
selected encoding matches the contents, results would be better if the
coercion option were not "always coerce to the default encoding" but
rather "coerce to the default encoding whenever the headers say nothing or name a
'standard' charset, e.g. ISO-8859-1 or US-ASCII". The reason is that it is
extremely rare for a message to arrive with, say, "charset=windows-1255" in the
Content-Type header when its content is neither windows-1255 nor plain English in
ASCII but rather, say, UTF-8 or Arabic in Windows-1256. In fact, I don't think this
has ever happened to me.
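For illustration only, the proposed rule can be sketched as a headers-only decision. This is a minimal sketch and not the code from the attached patch; the names `STANDARD_CHARSETS` and `effective_charset` are hypothetical, and the set of 'standard' charsets is an assumption taken from the examples above.

```python
# Hypothetical sketch of the proposed rule; not taken from the actual patch.
# Assumption: "standard" means no charset at all, US-ASCII, or ISO-8859-1.
STANDARD_CHARSETS = {"", "us-ascii", "iso-8859-1"}


def effective_charset(declared, default, always_coerce=False):
    """Choose the charset used to display a message, looking at headers only.

    declared      -- charset named in the Content-Type header, or None
    default       -- the user's default encoding (e.g. "windows-1255")
    always_coerce -- True models the existing "always coerce" behaviour
    """
    name = (declared or "").strip().lower()
    if always_coerce or name in STANDARD_CHARSETS:
        return default
    # An explicit, non-standard charset is assumed to be deliberate.
    return name


if __name__ == "__main__":
    for header in (None, "ISO-8859-1", "windows-1255", "UTF-8"):
        print(header, "->", effective_charset(header, "windows-1255"))
```

With this rule a message that explicitly declares UTF-8 keeps UTF-8, while a message with no charset or with ISO-8859-1 is shown in the user's default encoding.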
Reproducible: Always
Steps to Reproduce:
Reporter | ||
Updated•20 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 1•20 years ago
|
||
What's extremely rare to you may not necessarily be very rare to other people.
For instance, Japanese and Russian users may have different experiences
(especially with Usenet news postings).
Keywords: intl
Reporter | ||
Comment 2•20 years ago
|
||
So what Jungshik (I hope that's the first name) is saying is that this should be
controlled by a pref.
Reporter | ||
Comment 3•20 years ago
|
||
Here's a working, albeit quite ugly, patch.
Assignee: smontagu → eyalroz
Status: NEW → ASSIGNED
Reporter | ||
Comment 4•20 years ago
|
||
Comment on attachment 160489 [details] [diff] [review]
'draft' patch
I don't expect a review+, but I want some input on how to un-uglify the code.
Specifically, there has to be a more elegant way to determine whether a charset
is one of the charsets commonly used by mail clients which don't know any
better (rather than the current use of a new function written by me which does
a few strcasecmp's).
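One possible direction, shown here in Python purely as an illustration (the patch itself is not reproduced here, and Mozilla's charset alias facilities are not used): resolve the label through an alias table first, then test membership in a small set of canonical names. `is_fallback_charset` and `_FALLBACK_CODECS` are hypothetical names, and the choice of US-ASCII and ISO-8859-1 as the "don't know any better" charsets is an assumption.

```python
import codecs

# Assumption: these are the charsets a client emits when it does not really
# know the encoding of the content. Stored under their canonical codec names.
_FALLBACK_CODECS = {codecs.lookup(n).name for n in ("us-ascii", "iso-8859-1")}


def is_fallback_charset(label):
    """Return True if the label is missing or is an alias of a fallback charset."""
    if label is None or not label.strip():
        return True  # no charset stated at all
    try:
        # codecs.lookup() resolves aliases and ignores case, so "Latin1",
        # "ISO_8859-1" and "iso-8859-1" all map to the same canonical name.
        return codecs.lookup(label.strip()).name in _FALLBACK_CODECS
    except LookupError:
        # Unknown label: treated as "not a fallback" here, which is a
        # design choice of this sketch rather than a requirement.
        return False
```

Compared with a chain of case-insensitive string comparisons, this keeps the list of charsets in one place and lets the alias table handle spelling variants.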
Attachment #160489 -
Flags: review?(smontagu)
Comment 5•20 years ago
|
||
It's not clear to me what problem is being addressed by this bug, other than a
stated lack of an "effective, 'cheap' coercion."
Reporter | ||
Comment 6•20 years ago
|
||
The problem is the following: some people send me e-mail encoded in
windows-1255 whose headers say it is iso-8859-1 or us-ascii; some people send
me windows-1255 messages whose headers say they are windows-1255; and
some people send me messages in utf-8. Now, if I choose coercion to
windows-1255, the utf-8 messages are displayed incorrectly, but if I don't, some
of the windows-1255 messages are displayed incorrectly (because it is assumed
they are iso-8859-1 or us-ascii).
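The three cases can be reproduced at the byte level. The sketch below is an illustration under stated assumptions (hypothetical names `pick` and `readable`, a sample Hebrew word, windows-1255 as the default encoding); it decodes each message under three policies and reports whether the text comes out intact.

```python
# Illustration only; names and sample data are assumptions of this sketch.
TEXT = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"
DEFAULT = "windows-1255"           # the user's default encoding
STANDARD = {"", "us-ascii", "iso-8859-1"}

# message description -> (raw body bytes, charset claimed in the headers)
MESSAGES = {
    "1255 labelled iso-8859-1": (TEXT.encode("windows-1255"), "iso-8859-1"),
    "1255 labelled windows-1255": (TEXT.encode("windows-1255"), "windows-1255"),
    "utf-8 labelled utf-8": (TEXT.encode("utf-8"), "utf-8"),
}


def pick(declared, policy):
    """Charset used for display under 'never', 'always' or 'standard-only'."""
    if policy == "always":
        return DEFAULT
    if policy == "standard-only" and declared.lower() in STANDARD:
        return DEFAULT
    return declared


def readable(policy):
    """Map each message to True if it decodes back to the intended text."""
    return {
        name: body.decode(pick(declared, policy), errors="replace") == TEXT
        for name, (body, declared) in MESSAGES.items()
    }


if __name__ == "__main__":
    for policy in ("never", "always", "standard-only"):
        print(policy, readable(policy))
```

Under "never" the mislabelled message is garbled, under "always" the utf-8 message is garbled, and only "standard-only" displays all three correctly.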
Comment 7•20 years ago
|
||
(In reply to comment #6)
> Now, if I choose coercion to windows-1255, the utf-8 messages are displayed
> incorrectly, but if I don't, some of the windows-1255 messages are displayed
> incorrectly (because it is assumed they are iso-8859-1 or us-ascii).
Displayed incorrectly in the compose window, or in the received message?
Reporter | ||
Comment 8•20 years ago
|
||
Displayed incorrectly as received messages, of course. Now that bug 260728 has
been fixed, once I correct the display and press 'reply', there's no problem (I
think).
Comment 9•20 years ago
|
||
(In reply to comment #8)
> Now that bug 260728 has been fixed, once I correct the display and
> press 'reply', there's no problem (I think).
Um, *this* is bug 260728. :) Which one do you mean has been fixed?
Bug 234958?
Reporter | ||
Comment 10•20 years ago
|
||
Uh, sorry, I meant to say now that bug 260725 has been fixed.
Updated•20 years ago
|
Product: MailNews → Core
Assignee | ||
Updated•16 years ago
|
Product: Core → MailNews Core
Comment 11•16 years ago
|
||
It's not clear to me whether Eyal thinks this works. Thoughts?
reset QA (was empty)
QA Contact: i18n
Reporter | ||
Comment 12•16 years ago
|
||
The patch has presumably bit-rotted by now, and it was never ready for a review+ anyway (e.g. I hard-coded the charsets for which to apply coercion). But I'm not sure I understand your question, Wayne.
In general, I still believe this should be done, although my extension (BiDi Mail UI) works around the issue by inspecting the content and detecting the actual charset itself, which even allows for multiple charsets used within the same message:
https://addons.mozilla.org/en/thunderbird/addon/310
It's pretty slow, though, being JS code which needs to inspect the message body and apply a bunch of regexes to it.
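To give a rough idea of what content-based detection involves, here is a deliberately simplified sketch in Python. It is not the extension's actual code (which is JavaScript and more elaborate); `guess_charset`, the regex, and the fallback value are all assumptions made for illustration.

```python
import re

# Assumption: a run of two or more bytes in 0xE0-0xFA is taken as a hint of
# Hebrew letters in windows-1255. Accented Latin-1 letters share this range,
# so this is a heuristic that can misfire on some Western European text.
_HEBREW_1255_RUN = re.compile(rb"[\xe0-\xfa]{2,}")


def guess_charset(body, fallback="iso-8859-1"):
    """Guess a charset from the raw message body, ignoring the headers."""
    try:
        body.decode("utf-8")  # strict decode: fails on most non-UTF-8 data
    except UnicodeDecodeError:
        pass
    else:
        # Pure 7-bit data is valid UTF-8 too; report it as US-ASCII.
        return "us-ascii" if all(b < 0x80 for b in body) else "utf-8"
    if _HEBREW_1255_RUN.search(body):
        return "windows-1255"
    return fallback
```

Every message body has to be scanned in full, which is the cost that the headers-only rule proposed in this bug avoids.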
Updated•15 years ago
|
Attachment #160489 -
Flags: review?(smontagu)
Updated•14 years ago
|
Whiteboard: [patchlove][has draft patch][needs new assignee?]
Updated•12 years ago
|
Assignee: eyalroz → nobody
Status: ASSIGNED → NEW
Is this bug still relevant after the changes made to this area in the last couple of years?
Reporter | ||
Comment 14•7 years ago
|
||
Well, unless one of those changes did what I suggested, then yes. Frankly, though, I have not been following the code for a while already.
Updated•2 years ago
|
Severity: normal → S3