Closed
Bug 54135
Opened 24 years ago
Closed 23 years ago
conversion (fromU/toU) problem- Sjis code x'81ca' becomes x'fa54'
Categories
(Core :: DOM: Editor, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.9.6
People
(Reporter: hobbit_mak, Assigned: ftang)
References
()
Details
(Keywords: intl)
Attachments
(20 files)
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20000924
BuildID: 2000092408
If you edit a Shift JIS page and save it, the proper character x'81ca' becomes x'fa54'.
Reproducible: Always
Steps to Reproduce:
1. Edit the page at
http;//homepage1.nifty.com/hobbit/html/utf8.html
2. Save it to a local file.
Actual Results: x'81ca' (proper code) is changed to x'fa54' (Windows code).
Expected Results: x'81ca' is retained.
May be related to bug 35166.
http://bugzilla.mozilla.org/show_bug.cgi?id=35166
Assignee | ||
Comment 2•24 years ago
|
||
Minor issue. Marking it as assigned.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Target Milestone: --- → Future
Reporter | ||
Comment 3•24 years ago
|
||
x'fa54'(Windows code) cannot be displayed by Mozilla itself. (Build 2000112704)
Comment 4•24 years ago
|
||
It is reported that the Linux build also has this problem.
http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=474
Reporter | ||
Comment 5•24 years ago
|
||
Also, SJIS code 0x81E0 becomes 0x8790 and
SJIS code 0x81E6 becomes 0xFA5B.
Reporter | ||
Comment 6•24 years ago
|
||
Reporter | ||
Comment 7•24 years ago
|
||
The patch above was verified on Windows 2000.
Reporter | ||
Comment 8•24 years ago
|
||
Reporter | ||
Comment 9•24 years ago
|
||
The attached list is in UTF-8 encoding.
Assignee | ||
Comment 10•24 years ago
|
||
remove Future from the target milestone.
Keywords: intl
Target Milestone: Future → ---
Reporter | ||
Comment 11•24 years ago
|
||
This problem is fixed in Build 2001011720.
Reporter | ||
Comment 12•24 years ago
|
||
Sorry, I tested with modified modules.
The problem is still reproduced on Build ID 2001012304.
Reporter | ||
Comment 13•24 years ago
|
||
Reporter | ||
Comment 14•24 years ago
|
||
Because
http://bugzilla.mozilla.org/show_bug.cgi?id=44374
was fixed,
81BE becomes 879C.
81BF becomes 879B.
81DA becomes 8797.
81DB becomes 8796.
81DF becomes 8791.
81E3 becomes 8795.
81E7 becomes 8792.
The patch is also updated.
Assignee | ||
Updated•24 years ago
|
Summary: Sjis code x'81ca' becomes x'fa54' → conversion problem- Sjis code x'81ca' becomes x'fa54'
Assignee | ||
Comment 15•24 years ago
|
||
hobbit.makoto@nifty.ne.jp:
How did you generate these patches? Did you change the source table and use the ufrom
and uto tools to generate them? If so, can you give us the changes to the source
table?
Summary: conversion problem- Sjis code x'81ca' becomes x'fa54' → conversion (fromU/toU) problem- Sjis code x'81ca' becomes x'fa54'
Reporter | ||
Comment 16•24 years ago
|
||
I could not find out how to use the tool, so I changed both the source in the comments and
the object data by hand.
Reporter | ||
Comment 17•24 years ago
|
||
Reporter | ||
Comment 18•24 years ago
|
||
Mozilla converts U+FFE2 to 7C7B (ISO-2022-JP).
It must be 224C (ISO-2022-JP).
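For reference, the two JIS codes in this comment can be related to the Shift JIS codes discussed in this bug with the usual JIS X 0208 to Shift JIS arithmetic. This is a minimal sketch, not Mozilla's converter code; the function name jis_to_sjis is only illustrative.

```python
def jis_to_sjis(jis: int) -> int:
    """Convert a two-byte JIS code (as used in ISO-2022-JP) to Shift JIS."""
    j1, j2 = jis >> 8, jis & 0xFF
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError(f"not a two-byte JIS code: {jis:#06x}")
    row = j1 - 0x21  # zero-based row index
    # Two JIS rows share one Shift JIS lead byte.
    s1 = (row >> 1) + (0x81 if j1 < 0x5F else 0xC1)
    if row % 2 == 0:
        # First row of the pair: trail bytes 0x40..0x9E, skipping 0x7F.
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1
    else:
        # Second row of the pair: trail bytes 0x9F..0xFC.
        s2 = j2 + 0x7E
    return (s1 << 8) | s2


print(f"{jis_to_sjis(0x224C):04X}")  # expected code for U+FFE2
print(f"{jis_to_sjis(0x7C7B):04X}")  # code Mozilla actually produced
```

Under this arithmetic 224C corresponds to Shift JIS 81CA and 7C7B corresponds to EEF9, so the ISO-2022-JP symptom appears to be the same duplicate-mapping problem as the Shift JIS one.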
Reporter | ||
Comment 19•24 years ago
|
||
How can I change the source table and use the ufrom and uto tools to generate it?
I could not find these tools in the source tree.
Assignee | ||
Comment 20•24 years ago
|
||
The tool is at mozilla/intl/uconv/tools/umaptable.c
nhotta, can you help drive this? I am overloaded.
Assignee: ftang → nhotta
Status: ASSIGNED → NEW
Comment 21•24 years ago
|
||
hobbit.makoto@nifty.ne.jp,
could you summarize the current remaining problem?
Reporter | ||
Comment 22•24 years ago
|
||
The problem left in build 2001050804 is:
- Ten characters are changed if you edit a Shift JIS source and save it as Shift
JIS.
0x81be becomes 0x879c
0x81bf becomes 0x879b
0x81ca becomes 0xfa54
0x81da becomes 0x8797
0x81db becomes 0x8796
0x81df becomes 0x8791
0x81e0 becomes 0x8790
0x81e3 becomes 0x8795
0x81e6 becomes 0xfa5b
0x81e7 becomes 0x8792
The problem with ISO-2022-JP was fixed.
I could not download the latest source yet, so I could not use the tool yet.
Comment 23•24 years ago
|
||
Updated•24 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.1
Comment 24•24 years ago
|
||
hobbit.makoto@nifty.ne.jp:
Please try the attached file and update your patch, thanks.
Reporter | ||
Comment 25•24 years ago
|
||
I downloaded mozilla/intl/uconv/tools/,
but I could not find out how you made sjis.ut and sjis.uf.
I went to mozilla/intl/uconv/tools/,
ran nmake on make.win, and got umaptable.exe.
Maybe you made sjis.ut and sjis.uf with umaptable and the original conversion table,
but I could not find where and how sjis.ut and sjis.uf are made.
Comment 26•24 years ago
|
||
Let me ask Frank and I will update.
Assignee | ||
Comment 27•24 years ago
|
||
To convert from SJIS to Unicode:
I run /intl/uconv/tools/cp932tojdx.pl against
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
and it generates source/intl/uconv/ucvja/jis0208.ump.
This is shared by the SJIS/EUC/ISO-2022-JP to Unicode conversions.
To convert from Unicode to Shift JIS:
I run intl/uconv/tools/jis0208fromcp932.pl against
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
It generates a file, which I then pipe into umaptable -uf > 0208.uf
to generate jis0208.uf.
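As an illustration of the input these scripts consume, here is a minimal sketch of reading lines in the CP932.TXT layout (code in column 1, Unicode in column 2, name after '#'). It is not a port of cp932tojdx.pl or jis0208fromcp932.pl; the SAMPLE text and the parse_mapping name are assumptions made for the example.

```python
# A few lines in the layout of CP932.TXT (assumed sample, not the full file).
SAMPLE = """\
# Column 1: cp932 code, column 2: Unicode, column 3: name
0x81\t      \t#DBCS LEAD BYTE
0x81CA\t0xFFE2\t# FULLWIDTH NOT SIGN
0xEEF9\t0xFFE2\t# FULLWIDTH NOT SIGN
0xFA54\t0xFFE2\t# FULLWIDTH NOT SIGN
0x81E6\t0x2235\t# BECAUSE
"""


def parse_mapping(text):
    """Return {cp932 code: Unicode code point} for lines that define both."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment, then split the remaining columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # comment-only line, lead byte, or undefined entry
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


to_unicode = parse_mapping(SAMPLE)
for code, ucs in sorted(to_unicode.items()):
    print(f"{code:04X} -> U+{ucs:04X}")
```

The sample already shows three cp932 codes that decode to the same Unicode value, which is where the ambiguity in the reverse direction comes from.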
Reporter | ||
Comment 28•23 years ago
|
||
Reporter | ||
Comment 29•23 years ago
|
||
Reporter | ||
Comment 30•23 years ago
|
||
I got cp932.txt from unicode.org and made sjis.uf from it.
However, some characters are mapped to two SJIS positions,
so I commented out the SJIS locations that are not proper in JIS X 0208 and 0212.
I attached the diff list and sjis.uf, and confirmed that this sjis.uf solves the
problems.
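A minimal sketch of how such doubly mapped characters can be listed, assuming the mapping has already been read into (SJIS code, Unicode) pairs. The PAIRS sample and the function name are illustrative only and are not taken from the attached files.

```python
from collections import defaultdict

# Assumed sample of (Shift JIS code, Unicode code point) pairs from cp932.txt.
PAIRS = [
    (0x8140, 0x3000),
    (0x81CA, 0xFFE2), (0xEEF9, 0xFFE2), (0xFA54, 0xFFE2),
    (0x81E6, 0x2235), (0x879A, 0x2235), (0xFA5B, 0x2235),
]


def duplicated_targets(pairs):
    """Return {Unicode: sorted SJIS codes} for values reachable from 2+ codes."""
    by_ucs = defaultdict(list)
    for sjis, ucs in pairs:
        by_ucs[ucs].append(sjis)
    return {ucs: sorted(codes) for ucs, codes in by_ucs.items() if len(codes) > 1}


for ucs, codes in sorted(duplicated_targets(PAIRS).items()):
    print(f"U+{ucs:04X}: " + ", ".join(f"{c:04X}" for c in codes))
```

Each listed Unicode value needs a decision about which single SJIS code the Unicode to SJIS table should emit.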
Updated•23 years ago
|
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Updated•23 years ago
|
Target Milestone: mozilla0.9.2 → mozilla0.9.1
Comment 31•23 years ago
|
||
Comment 32•23 years ago
|
||
I attached a diff for sjis.uf; it is very big.
I expected something similar to the patch of 02/14/01 06:13.
hobbit.makoto@nifty.ne.jp, do you have any idea why the diff is so large? Which
characters did you actually change? Please list the character codes of the changed
characters.
Reporter | ||
Comment 33•23 years ago
|
||
I suppose that the original table was not derived from CP932.txt.
I would also like to know what the original table was, but I could not find it.
Comment 34•23 years ago
|
||
I am going to ask Frank.
Are the characters you changed the same as those listed in your comment of 2001-05-08 18:07?
Reporter | ||
Comment 35•23 years ago
|
||
No, the characters I changed from cp932.txt are listed in 05/15/01 07:30.
None of the characters in 2001-05-08 18:07 are changed. They are the same as in
cp932.txt.
Reporter | ||
Comment 36•23 years ago
|
||
It is strongly recommended to record which tool, table, or other
resource the source was created from, preferably in the source file itself.
The lack of such a record may be why this bug is difficult to solve.
In
http://bugzilla.mozilla.org/show_bug.cgi?id=35166
you concluded that you use cp932 for the Unicode to SJIS conversion.
Comment 37•23 years ago
|
||
Bug 67374 - sources and tools to build unicode converters not in tree.
Depends on: 67374
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf
Comment 38•23 years ago
|
||
TM to 0.9.2 per PDT triage (it's OK to check it in by Friday or after 0.9.1
branch is made).
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 40•23 years ago
|
||
PDT+ based on the 6/11 PDT meeting.
Assignee | ||
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf → [PDT+]ftang to provide a source file for the current sjis.uf
Assignee | ||
Comment 41•23 years ago
|
||
I don't think we have time to address this problem by moz0.9.2. Push to moz0.9.3
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Assignee | ||
Comment 42•23 years ago
|
||
remove PDT+
Whiteboard: [PDT+]ftang to provide a source file for the current sjis.uf → ftang to provide a source file for the current sjis.uf
Assignee | ||
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf → no progress yet. ftang to provide a source file for the current sjis.uf
Comment 44•23 years ago
|
||
I read part of the Japanese-Unicode conversion code,
but I could not identify the sources or the way some of the mapping tables were generated.
So I made a tool to generate jis0201.uf, jis0208.uf, jis0208.ump, jis0208ext.uf
and sjis.uf from CP932.TXT and SHIFTJIS.TXT.
The *.uf files are generated with 'umaptable'.
The diffs are so large because the original mapping policy for codes where
SJIS:UCS2 = N:1 is to use the HIGHER SJIS code. That is not a good idea. They should
be mapped to the LOWER SJIS code (without IBM ext codes: bug 82678).
See http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP.
testpage : http://rh.vinelinux.org/~shom/sjis-cp932.html
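The difference between the two policies can be sketched as follows. This is only an illustration, not mkjpconv.pl: the PAIRS sample and the build_from_unicode name are assumptions, and the special handling of IBM ext codes mentioned above is left out.

```python
# Assumed sample of (Shift JIS code, Unicode code point) pairs with N:1 cases.
PAIRS = [
    (0x81CA, 0xFFE2), (0xEEF9, 0xFFE2), (0xFA54, 0xFFE2),
    (0x81E6, 0x2235), (0x879A, 0x2235), (0xFA5B, 0x2235),
    (0x81E0, 0x2252), (0x8790, 0x2252),
]


def build_from_unicode(pairs, prefer_lower=True):
    """Build a Unicode -> SJIS table, resolving N:1 cases by lowest or highest code."""
    pick = min if prefer_lower else max
    table = {}
    for sjis, ucs in pairs:
        table[ucs] = pick(table[ucs], sjis) if ucs in table else sjis
    return table


old_policy = build_from_unicode(PAIRS, prefer_lower=False)
new_policy = build_from_unicode(PAIRS, prefer_lower=True)
for ucs in sorted(new_policy):
    print(f"U+{ucs:04X}: higher={old_policy[ucs]:04X} lower={new_policy[ucs]:04X}")
```

With the higher-code policy, U+FFE2 is written back as FA54 and U+2252 as 8790, which matches the symptoms reported in this bug; the lower-code policy keeps 81CA and 81E0.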
----------
In addition, this tool can generate tables from APPLE_JAPANESE.TXT.
# ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT
If it is possible to add "Shift_JIS (Macintosh)", some problems will be resolved:
1) SJIS in/out problems (bookmark import, saving mail drafts, compose, etc.)
on Mac OS 8 and 9
SJIS 815C (U+2014) EM DASH
8160 (U+301C) WAVE DASH
8161 (U+2016) DOUBLE VERTICAL LINE
817C (U+2212) MINUS SIGN
8191 (U+00A2) CENT SIGN (questionable : U+FFE0?)
8192 (U+00A3) POUND SIGN (questionable : U+FFE1?)
81CA (U+00AC) NOT SIGN (questionable : U+FFE2?)
2) Apple extended Shift JIS codes (SJIS 8540-886D, EB41-ED96)
# Only partly, because Apple defined some codes as Unicode sequences.
# Mozilla cannot process Unicode sequences.
testpage : http://rh.vinelinux.org/~shom/sjis-mac.html
Comment 45•23 years ago
|
||
Comment 46•23 years ago
|
||
usage: mkjpconv.pl SHIFTJIS.TXT CP932.TXT
(or mkjpconv.pl SHIFTJIS.TXT APPLE_JAPANESE.TXT
APPLE_JAPANESE.TXT is generated (CR->LF) from APPLE/JAPANESE.TXT)
SHIFTJIS.TXT is:
ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/JIS/SHIFTJIS.TXT
Comment 48•23 years ago
|
||
Matsumoto-san, could you attach the sjis.uf generated by your tool?
I think the current problem is that it is hard to identify the modifications.
For example, if we want to change the mapping for Shift_JIS 0x81ca, we want to
identify that change in sjis.uf. Then we can make sure the change won't affect
other characters.
Comment 49•23 years ago
|
||
Comment 50•23 years ago
|
||
Comment 51•23 years ago
|
||
The diffs are very large, but I can see all the glyphs defined in
SHIFTJIS.TXT on http://rh.vinelinux.org/~shom/sjis-cp932.html.
I think the current mapping table has many (hidden) problems, especially with the dual-mapped
codes in CP932.TXT.
Do you have a tool (or method) to generate the SJIS->UCS2, UCS2->SJIS, JIS->UCS2,
and UCS2->JIS mapping tables?
Comment 52•23 years ago
|
||
Comment 53•23 years ago
|
||
I made a tool to check all codes in CP932.TXT.
# to generate Shift JIS encoded HTML page
perl mksjistest.pl CP932.TXT > sjis-cp932.html
# to generate UTF-8 encoded HTML page
perl mksjistest.pl CP932.TXT UTF-8 > sjis-cp932-utf8.html
I edited sjis-cp932-utf8.html with 0.9.2 and with 0.9.2 + the generated maps, and used 'Save
As Charset' with Shift_JIS.
(I'm using Linux, so please check on Windows.)
diffs are: SRC = SJIS, ORG = modified by 0.9.2, NEW = modified by newmap
SRC ORG NEW
------------------ JIS defined region
81BE 879C 81BE
81BF 879B 81BF
81CA FA54 81CA
81DA 8797 81DA
81DB 8796 81DB
81DF 8791 81DF
81E0 8790 81E0
81E3 8795 81E3
81E6 FA5B 81E6
81E7 8792 81E7
------------------ NEC specific codes
8754 FA4A 8754
8755 FA4B 8755
: : :
875D FA53 875D
8782 FA59 8782
8784 FA5A 8784
878A FA58 878A
8790 8790 81E0
8791 8791 81DF
8792 8792 81E7
8795 8795 81E3
8796 8796 81DB
8797 8797 81DA
879A FA5B 81E6
879B 879B 81BF
879C 879C 81BE
----------------- NEC selected IBM ext region
ED40 FA5C ED40
: : :
EEF8 FA49 EEF8
EEF9 FA54 81CA
EEFA FA55 EEFA
EEFB FA56 EEFB
EEFC FA57 EEFC
------------------ IBM ext region
FA40 FA40 EEFA
: : :
FA49 FA49 EEF8
FA4A FA4A 8754
: : :
FA53 FA53 875D
FA54 FA54 81CA
FA55 FA55 EEFA
: : :
FA57 FA57 EEFC
FA58 FA58 878A
FA59 FA59 8782
FA5B FA5B 81E6
FA5C FA5C ED40
: : :
FC4B FC4B EEEC
-------------------------
I think the new mapping policy is the same as OE's.
(I heard that OE maps codes in the IBM ext region to the NEC selected region.)
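To make the region headings in the table above easier to apply to other codes, here is a small sketch that classifies a Shift JIS code by range. The upper bounds follow the last rows shown in the table; the 0x8740 lower bound for the NEC specific region is an assumption (the table itself starts at 8754), and the names are illustrative.

```python
# (name, first code, last code); boundaries are read off the table above,
# except 0x8740, which is assumed.
REGIONS = [
    ("NEC specific", 0x8740, 0x879C),
    ("NEC selected IBM ext", 0xED40, 0xEEFC),
    ("IBM ext", 0xFA40, 0xFC4B),
]


def sjis_region(code):
    """Return the name of the vendor region containing code, if any."""
    for name, first, last in REGIONS:
        if first <= code <= last:
            return name
    return "JIS defined or other"


for code in (0x81CA, 0x879C, 0xEEF9, 0xFA54):
    print(f"{code:04X}: {sjis_region(code)}")
```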
Comment 54•23 years ago
|
||
Comment 55•23 years ago
|
||
Comment 56•23 years ago
|
||
Comment 57•23 years ago
|
||
Comment 58•23 years ago
|
||
Comment 59•23 years ago
|
||
Assignee | ||
Comment 60•23 years ago
|
||
roy yokoyama, can you help check in the changes?
shoji-san, which diffs should we pick?
Assignee: ftang → yokoyama
Status: ASSIGNED → NEW
Comment 62•23 years ago
|
||
Please use the *.uf and *.ump files in the next attachment (the old newmap.zip did not include
jisx0208ext.uf, sorry) or create them with mkjpconv.pl (from SHIFTJIS.TXT and
CP932.TXT).
'jisx0201gl.uf' is obsolete (not used in any source).
If these are acceptable (I'll make test cases), please add mkjpconv.pl to
intl/uconv/tools.
# cp932tojdx.pl and jis0208fromcp932.pl will be obsolete.
I don't know where the source of jis0212.{uf,ump} is.
I want to change mkjpconv.pl to generate jis0212.{uf,ump} as well.
Comment 63•23 years ago
|
||
Comment 65•23 years ago
|
||
nsbranch- since Frank moved it to 0.9.5
Comment 66•23 years ago
|
||
shoji-san: what is the status of this bug?
Are we waiting for ftang to provide the sjis.uf source as stated in the whiteboard?
Note: I'd appreciate it if you could change the status of patches that are already obsolete.
=== cc'ing ftang
Comment 67•23 years ago
|
||
Comment 68•23 years ago
|
||
Please test new maps on Windows, Mac and OS/2.
testcases.zip has SJIS encoded texts to test.
1. display
ALL chars in raw.txt must be shown.
On Windows, ALL chars in rawext.txt, rawibmext.txt must be shown.
2. compose (round trip)
1) edit raw{,ext,ibmext}.txt.html on composer
2) save as with ShiftJIS
3) rawdump.pl <saved html>
"<ORG>:<NEW>:DIFF" are not round tripped codes.
New codes must be "SJIS lower" in
http://bugzilla.mozilla.org/attachment.cgi?id=44509&action=view
(see http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP)
3. mail
1) compose new mail
2) CUT & PASTE all chars in raw.txt
3) send
ALL chars in the mail with raw.txt must be shown.
on Windows, ALL chars in the mail with raw{ext,ibmext}.txt must be shown.
------
If any problem occurs on Mac or OS/2, especially with the 9 chars in
http://rh.vinelinux.org/~shom/sjisprob.html , it should not be corrected by
changing the mapping tables.
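For testers without Perl, the round-trip check in step 2 can be approximated with a short script. This is a hypothetical stand-in for rawdump.pl, whose actual behavior is not shown in this bug; it only assumes the "<ORG>:<NEW>:DIFF" output format quoted above and that both files contain the same number of characters.

```python
def sjis_codes(data: bytes):
    """Split Shift JIS bytes into a list of one-byte and two-byte codes."""
    codes = []
    i = 0
    while i < len(data):
        b = data[i]
        is_lead = 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC
        if is_lead and i + 1 < len(data):
            codes.append((b << 8) | data[i + 1])
            i += 2
        else:
            codes.append(b)
            i += 1
    return codes


def round_trip_diffs(original: bytes, saved: bytes):
    """List codes that changed between the original and the saved file."""
    pairs = zip(sjis_codes(original), sjis_codes(saved))
    return [f"{org:04X}:{new:04X}:DIFF" for org, new in pairs if org != new]


# Example with in-memory data instead of files.
print(round_trip_diffs(b"\x81\xca\x81\x40A", b"\xfa\x54\x81\x40A"))
```

An empty list means every code survived the edit and save cycle unchanged.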
Comment 69•23 years ago
|
||
nhotta is back from sabbatical. Assigning back to him.
Assignee: yokoyama → nhotta
Status: ASSIGNED → NEW
Comment 70•23 years ago
|
||
move to 0.9.6
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Comment 71•23 years ago
|
||
I think the tool has to be reviewed first.
Frank, please review mkjpconv.pl included in the attachment of 08/08/01 03:17.
Assignee: nhotta → ftang
Status: ASSIGNED → NEW
Whiteboard: no progress yet. ftang to provide a source file for the current sjis.uf
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Comment 72•23 years ago
|
||
Viewing the following diff in 4.x, we can see that Mozilla is generating codes
which 4.x cannot show, so I am adding the 4xp keyword.
diff between ...-sjis-0.9.2.html and ..--sjis-new.html
http://bugzilla.mozilla.org/attachment.cgi?id=44534&action=view
Keywords: 4xp
Assignee | ||
Comment 73•23 years ago
|
||
Comment on attachment 45060 [details]
newmap.zip (mkjpconv.pl, jis0208.uf, jis0208ext.uf, jis0201.uf, sjis.uf, IBMNEC.map )
rs=ftang.
Attachment #45060 -
Flags: review+
Assignee | ||
Comment 74•23 years ago
|
||
Please check them in.
Assignee | ||
Comment 75•23 years ago
|
||
give back to nhotta for check in.
Assignee: ftang → nhotta
Status: ASSIGNED → NEW
Updated•23 years ago
|
Status: NEW → ASSIGNED
Comment 76•23 years ago
|
||
rs=blizzard
Comment 77•23 years ago
|
||
should someone from international QA be the qa_contact for this bug ?
Comment 79•23 years ago
|
||
Checked in to the trunk.
The tool still needs to be checked in. Frank, please review the tool.
http://bugzilla.mozilla.org/attachment.cgi?id=51199&action=view
Comment 80•23 years ago
|
||
The tool issue will be handled in bug 67374. Marking this as FIXED.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED