Closed Bug 172337 Opened 22 years ago Closed 22 years ago

Attachment mechanism incompatible with Unicode

Categories

(MailNews Core :: Internationalization, defect)

Hardware: x86
OS: Windows 2000
Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.3alpha

People

(Reporter: lapsap7+mz, Assigned: tetsuroy)

References

Details

(Keywords: intl)

Attachments

(6 files, 2 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2b) Gecko/20020930
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2b) Gecko/20020930
If a filename contains Unicode characters outside the system locale (e.g. a Chinese filename under a Western European system locale) and the file is attached to an email, every unrecognised character is replaced by a '?'. As a result Mozilla can no longer access the file: try to send or save the mail and you'll see what I mean.
Reproducible: Always
Steps to Reproduce:
This worksforme all the time with Russian mail (in a Western locale).... could you attach a message showing the problem to this bug?
I'd rather attach a test file. It's archived inside a RAR file because that's the only way to attach a file whose name isn't in Latin-1 while preserving its filename. You can save the attachment under any .rar name (e.g. test.rar). With this test file, you should be able to try it yourself. Attach it to a mail and you should get the same thing as in the attached image. The test file's name has two Chinese characters, and you can see that they are changed to ??.txt. To recap: once the file is attached, it's not possible to save or send the email.
So how do I get that rar file out of the archive? Is this being done on an English Win2k or a Chinese Win2k (system language, I mean). ccing yokoyama in case this has anything to do with the recent "Make Mozilla Unicode app on windows" changes.
RAR's site is http://www.rarlab.com/ The file and the archive were created under a Western Europe system locale, but I don't think this matters, because WinRAR has used Unicode since version 2.x. That's why I used RAR rather than Zip to preserve the name.
See, the problem is that we may be assuming the filename is in the system locale. Which in this case is Western Europe, then?
You're using W2k, right? If so, the file can be extracted without any problem and its filename should display properly (assuming you've added the East Asian languages beforehand). It seems you didn't understand what I meant. The Chinese filename is preserved WITHIN the RAR archive. That means that when you download the attached RAR file, you can name it whatever you want, e.g. test.rar. The archive filename doesn't matter. Extracting the file seems to be posing some problem, though. An easy way is to use the context menu on the RAR file and pick WinRAR's menu item. A more complicated way is to switch the system locale to Big5 to extract the file, then switch back to your initial system locale.
No, _you_ misunderstand. You have a Win2k in Western locale with a file with a Chinese name. We may be assuming the name is in ISO-8859-1 or something like that because the OS as a whole is in the Western locale. This is about the original bug, not the archive (and no, I am in fact not on Win2k, or even Windows).
I haven't tried this, but I think this problem can be shown the other way round:
1) With your system locale set to Western Europe, create a file whose name contains some non-Big5 characters. An example is "français.txt": 'ç' isn't in Big5.
2) Switch your system locale to Traditional Chinese (Big5).
3) Now try to attach "français.txt" to an email.
You should see that the file becomes "fran?ais.txt". If you then try to save or send your email, normally you can't. I think that's because Mozilla uses an ANSI function instead of a W function to read the filename.
Correction: the filename should read "français.txt", with a plain ç. My browser was in UTF-8, so the character may have come through garbled in my previous comment. Sorry!
> I think that's because Mozilla uses an ANSI function instead of W
> function to read the filename.
Correct; however, even after changing the commdlg calls to W functions, the problem still occurs. GetMessage(), PeekMessage() and DispatchMessage() need to be changed as well. Unless they are changed to GetMessageW() and so on, the filename returned from CommonDlg still includes '?'. Please wait until bug 104934 gets fixed. I'll make sure this gets fixed as well.
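The lossy conversion discussed in the comments above can be sketched in a few lines of Python. This is only an illustration, not Mozilla code: `to_ansi_name` is a hypothetical helper, "中文.txt" is a made-up stand-in for the test file's name, and Python's `errors="replace"` is assumed to approximate what an ANSI API does with characters missing from the system code page (real Windows conversions may also apply best-fit mappings).

```python
def to_ansi_name(filename: str, codepage: str) -> str:
    """Round-trip a filename through a legacy code page.

    Hypothetical helper: characters that the code page cannot
    represent are replaced by '?', one per character.
    """
    raw = filename.encode(codepage, errors="replace")
    return raw.decode(codepage)


# Made-up Chinese filename under a Western European code page.
print(to_ansi_name("中文.txt", "cp1252"))      # ??.txt

# The reverse case from the thread: 'ç' under Big5.
print(to_ansi_name("français.txt", "big5"))   # fran?ais.txt
```

Once the name has been flattened this way it no longer matches the file on disk, which is consistent with the report that the mail cannot be saved or sent afterwards.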
Depends on: 104934
Re comment #9: Ahem! Boris, please note that this bug was marked as specific to Win2k (and WinXP too), so your comments didn't apply. But thanks for adding Yokoyama to the CC list. Yokoyama, so you get the same problem too, right? Could you confirm this bug? Should I change the "Component" of this bug from Attachments to I18N? Furthermore, wouldn't it be better to add this bug to bug 104934 too?
Sure. Confirming and taking this bug.
> we add this bug to bug 104934 too?
Making 172337 depend on 104934 links the two bugs. (In other words, it's already added to 104934's block list.)
Assignee: mscott → yokoyama
Status: UNCONFIRMED → NEW
Component: Attachments → Internationalization
Ever confirmed: true
Target Milestone: --- → mozilla1.2beta
Keywords: intl
QA Contact: trix → kasumi
Status: NEW → ASSIGNED
Attached patch file URL is now in UTF-8 with MOZ_UNICODE flag (obsolete) (deleted)
With this and the previous patch, we can attach a file whose name is outside the system locale to an email. I tested sending and receiving the attachment.
Attached patch File URL is now in UTF-8 with MOZ_UNICODE flag (obsolete) (deleted)
Oops, wrong attachment. Trying again. nhotta: can you review this patch?
Attachment #105399 - Attachment is obsolete: true
Blocks: 107941
*** Bug 107941 has been marked as a duplicate of this bug. ***
Comment on attachment 105467: File URL is now in UTF-8 with MOZ_UNICODE flag
r=nhotta; you can move tempStr inside #else
Attachment #105467 - Flags: review+
Attached patch as per suggesion (deleted)
Attachment #105467 - Attachment is obsolete: true
david: can you super review?
Comment on attachment 105639: as per suggesion
r=nhotta; carrying over his stamp
Attachment #105639 - Flags: review+
Target Milestone: mozilla1.2beta → mozilla1.3alpha
Comment on attachment 105639: as per suggesion
sr=bienvenu
Attachment #105639 - Flags: superreview+
Attachment #101983 - Attachment mime type: application/octet-stream → application/x-rar-compressed
This problem is also evident on Mac OS X (FizzillaCFM/2002110808). Is this patch XP?
Summary: Attachment mechanism uncompatible to Unicode → Attachment mechanism incompatible with Unicode
Greg: Sorry, it's not XP. It's only for Windows (see the patch: +ifeq ($(OS_ARCH),WINNT) ). On Windows, we are going to UTF-8 in file URLs, but I can't speak for Mac. (I have zero knowledge ...) Naoki?
Adding a couple more dependencies.
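To illustrate what "UTF-8 in file URLs" means in practice, here is a minimal Python sketch. It is not the patch itself (the patch is no longer attached); `to_file_url` and `from_file_url` are hypothetical helpers, and the path "C:\tmp\中文.txt" is made up. The idea is that the path is percent-encoded from its UTF-8 bytes, so the URL stays the same whatever the system code page is.

```python
from urllib.parse import quote, unquote

PREFIX = "file:///"


def to_file_url(path: str) -> str:
    """Build a file URL whose escapes are the UTF-8 bytes of the path."""
    posix_style = path.replace("\\", "/")
    # Keep '/' and the drive colon; escape everything else as UTF-8.
    return PREFIX + quote(posix_style, safe="/:", encoding="utf-8")


def from_file_url(url: str) -> str:
    """Recover the path by decoding the escapes as UTF-8."""
    return unquote(url[len(PREFIX):], encoding="utf-8")


url = to_file_url("C:\\tmp\\中文.txt")
print(url)                 # file:///C:/tmp/%E4%B8%AD%E6%96%87.txt
print(from_file_url(url))  # C:/tmp/中文.txt
```

Because the escapes never pass through the system code page, the round trip returns the original characters rather than '?'.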
Depends on: 162358, 162361
Fixed. For verification, please wait until bugs 162358 and 162361 get fixed.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Hi Roy, I'm using the Mozilla 1.6 stable build. I'm testing again to see the "state" of this bug. Now, if I attach a file with non-system-locale characters in its filename, every such character is replaced by '_' instead of '?' as it was a year ago. However, the file is still inaccessible and the mail can't be sent. I understand that this bug still depends on bug 162361. So why was this bug marked as resolved? PS: Is there any abbreviation for "non system locale characters"? It's quite long to type :) Could it be called "NSL characters"?
Oh, by the way, there's a similar bug about Unicode filename I/O. Actually, I would call that the "reverse process" of this bug. I understand that you would like a separate bug. So, here it is. It is bug 234681.
Roy, after two years I still don't understand why you marked this bug as fixed when in fact it isn't!? I actually don't understand what "for verification" meant for you. I'm now using TB 0.7.3, and the problem presents differently. 1) If I drag and drop a file whose name isn't in the system locale onto the attachment zone, the filename disappears (I'll attach an image). 2) If I use the "Attach" button and choose the file using the dialog, every Unicode character becomes an underscore. In either case, I can't send out the mail because TB thinks the file doesn't exist, since the filename is different. I know, I know, we're waiting for bug 162361 :(
> I actually don't understand what "for verification" meant for you.
"Verification" is when you test to make sure the bug is fixed. Roy's comment meant that this code-level bug is fixed, but that can't be noticed until those other two bugs are fixed as well...
Product: MailNews → Core
Product: Core → MailNews Core