Closed Bug 73446 Opened 24 years ago Closed 20 years ago

Need to know how to convert between local encoding and UCS2, e.g., Need NS_ConvertUCS2ToLocalEncoding() and NS_ConvertLocalEncodingToUCS2()

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

RESOLVED FIXED
Future

People

(Reporter: roland.mainz, Assigned: ftang)

References

Details

(Keywords: intl)

(Based on discussion on IRC #mozilla with scc): There's a need for a function which converts a "PRUnichar *" to a "char *" in the system's local encoding, whatever it is (en_US.UTF-8, ja.UTF-8, xzy.UTF-123). An example usage would be a function which gets a document title for printing (which is a PRUnichar *title) and wants to feed it to "/usr/bin/lp -t $MYTITLE", which is called via a function like lpPrintSetTitle( x, y, const char *title ) where "title" must be in the system's locale...
On scc's request, making it a blocker for bug 73009. Making it block bug 72087 ("Xprint major revamp") as the current method to get a char * from a PRUnichar * is more than silly (quoting it here just for fun as an example of how "not to do this" =:-) :
-- snip --
// stolen from mozilla/webshell/embed/xlib/qt/QMozillaContainer.cpp
// helper fuction for BeginLoadURL, ProgressLoadURL, EndLoadURL
// XXX Dont forget to delete this 'C' String since we create it here
static char* makeCString( const PRUnichar* aString )
{
  int len = 0;
  const PRUnichar* ptr = aString;
  while ( *ptr )
    len++, ptr++;
  char *cstring = new char[ ++len ];
  // just cast down to a character
  while ( len >= 0 )
  {
    cstring[len] = ( char )aString[len];
    len--;
  }
  return cstring;
}
-- snip --
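To make the problem with the quoted helper concrete, here is a minimal, bounds-safe sketch of the same narrowing cast. The name narrowByCast and the use of char16_t in place of PRUnichar are assumptions made for this illustration; this is not Mozilla code. Note that the quoted loop also appears to start indexing at len, which is one element past the buffer it has just allocated.

```cpp
#include <string>

// Hypothetical stand-in for makeCString: same "just cast down" approach,
// but without reading or writing past the end of either buffer.
std::string narrowByCast(const char16_t* text) {
    std::string out;
    for (; *text != 0; ++text) {
        // Keeps only the low 8 bits of each UTF-16 code unit.
        // ASCII survives; everything else is silently corrupted.
        out.push_back(static_cast<char>(*text));
    }
    return out;
}
```

With this sketch, an ASCII title comes through unchanged, while a title containing U+3042 (HIRAGANA LETTER A) comes out as the single byte 0x42, i.e. "B", on typical platforms. The result has nothing to do with the locale's encoding, which is why a real charset conversion is needed.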
Blocks: 72087, 73009
1. Reverse way, too (please :-)
2. It would be useful to have a function like NS_ConvertUCS2ToCOMPOUND_TEXT (and the reverse way). I assume the function would be identical to NS_ConvertUCS2ToLocalEncoding()/NS_ConvertLocalEncodingToUCS2(), but a set of wrapper functions would be useful to indicate that these functions are _specially_ for handling the X11 COMPOUND_TEXT datatype (maybe there are differences... and having a "special" function which catches possible exceptions from normal LocalEncoding behaviour would be useful in such cases).
Summary: RFE: Need NS_ConvertUCS2ToLocalEncoding() → RFE: Need NS_ConvertUCS2ToLocalEncoding() and NS_ConvertLocalEncodingToUCS2()
All he wants is the knowledge of how to get a UCS2 string into the appropriate encoding for the current locale (and perhaps vice versa); I'm sure this functionality already exists in i18n land, and someone just needs to explain how to use it. The names he suggests in the summary are just based on his knowledge of existing string routines. He doesn't necessarily need the conversion functionality in that form ... that's just the only way he knew how to ask for it. So how do you do this?
Assignee: scc → nhotta
Severity: enhancement → normal
Component: String → Internationalization
QA Contact: scc → andreasb
Summary: RFE: Need NS_ConvertUCS2ToLocalEncoding() and NS_ConvertLocalEncodingToUCS2() → Need to know how to convert between local encoding and UCS2, e.g., Need NS_ConvertUCS2ToLocalEncoding() and NS_ConvertLocalEncodingToUCS2()
No longer blocks: 73009
I think a similar thing is already taken care of by nsILocalFile, which converts between the OS file system charset and UCS2. So that may be used if it fits your requirements. nsICharsetConverterManager2 is the interface for charset conversion. Using lxr, you should be able to find many examples which use that interface.
I am looking for a function which explicitly says that "I am converting UCS2 to X11_COMPOUND_TEXT" (and backwards). Basically (except that there _may_ be exceptions...) COMPOUND_TEXT is the same as Xserver's local encoding ($LANG). Does nsICharsetConverterManager2 support this ?
I am not familiar with "COMPOUND_TEXT" (and UNIX in general). I searched lxr and found "COMPOUND_TEXT" in widget/src/gtk/nsClipboard.cpp. I believe charset conversion is also happening there.
What you can do now:
1. Call nsIPlatformCharset to find the charset of the current system. You need to pass down a parameter saying which charset you are asking for, because it is possible that in the future your clipboard encoding may be different from your window manager encoding.
2. Use that charset to find an nsIUnicodeDecoder or nsIUnicodeEncoder, and then you can convert between PRUnichar* and char*.
3. We want to keep the string classes in the low-level code without a dependency on the Unicode converters for now.

Regarding your comment:
>COMPOUND_TEXT is the same as Xserver's local encoding ($LANG).
This is a false statement. COMPOUND_TEXT is a universal encoding scheme which is locale independent. We currently have no function to convert to compound text, but we may one day. The selector we pass to nsIPlatformCharset is designed for exactly this: one day we may need to use COMPOUND_TEXT as the clipboard format but ISO-8859-2 for the window title. Compound text uses ISO-2022 escape sequences to switch between charsets.

The reason I don't want to make a convenience function to convert Unicode to the local encoding is that there are a lot of cases in which the encoding is NOT the local encoding (for example, if I view an RFC 822 message, the charset is whatever got labeled in the message, not the one you label). An implicit function which does not take a charset as a parameter will encourage API designers to ignore passing charset information from top to bottom. Not a perfect argument, but it is much easier to change an implementation than an interface.
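Steps 1 and 2 above can be sketched outside of Mozilla with plain POSIX calls. This is only an analogue under stated assumptions: nl_langinfo(CODESET) stands in for the nsIPlatformCharset lookup, iconv stands in for the encoder/decoder pair, std::u16string stands in for PRUnichar*, and all function names below are made up for this example. In line with the argument above, the charset is always an explicit parameter.

```cpp
#include <cerrno>
#include <cstring>
#include <iconv.h>
#include <langinfo.h>
#include <stdexcept>
#include <string>

// Step 1 analogue: charset name of the current locale.
// Only meaningful after the program has called setlocale(LC_ALL, "").
std::string platformCharset() {
    return nl_langinfo(CODESET);
}

// iconv name for UTF-16 in the byte order char16_t uses on this host.
static const char* hostUtf16() {
    const char16_t probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1 ? "UTF-16LE"
                                                                : "UTF-16BE";
}

// Shared worker: convert a byte buffer from one named charset to another.
static std::string convertBytes(const char* to, const char* from,
                                const std::string& input) {
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        throw std::runtime_error(std::string("no converter from ") + from +
                                 " to " + to);
    char* in = const_cast<char*>(input.data());  // iconv does not modify it
    size_t inLeft = input.size();
    std::string out;
    char buf[256];
    while (inLeft > 0) {
        char* dst = buf;
        size_t dstLeft = sizeof buf;
        size_t rc = iconv(cd, &in, &inLeft, &dst, &dstLeft);
        out.append(buf, sizeof buf - dstLeft);
        // E2BIG only means "output chunk full"; anything else is a real error
        // (invalid, truncated, or unrepresentable input).
        if (rc == (size_t)-1 && errno != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error(std::string("conversion to ") + to +
                                     " failed");
        }
    }
    // Flush pending shift state (matters for stateful charsets such as ISO-2022).
    char* dst = buf;
    size_t dstLeft = sizeof buf;
    iconv(cd, nullptr, nullptr, &dst, &dstLeft);
    out.append(buf, sizeof buf - dstLeft);
    iconv_close(cd);
    return out;
}

// Step 2 analogue, encoder direction: UTF-16 to bytes in the given charset.
std::string encodeFromUtf16(const std::u16string& text,
                            const std::string& charset) {
    std::string raw(reinterpret_cast<const char*>(text.data()),
                    text.size() * sizeof(char16_t));
    return convertBytes(charset.c_str(), hostUtf16(), raw);
}

// Step 2 analogue, decoder direction: bytes in the given charset to UTF-16.
std::u16string decodeToUtf16(const std::string& bytes,
                             const std::string& charset) {
    std::string raw = convertBytes(hostUtf16(), charset.c_str(), bytes);
    std::u16string out(raw.size() / sizeof(char16_t), u'\0');
    std::memcpy(&out[0], raw.data(), out.size() * sizeof(char16_t));
    return out;
}
```

A caller that really does want the locale's encoding, such as the print-title case from the original report, would write something like encodeFromUtf16(title, platformCharset()). A caller handling a mail message would pass the charset labeled in the message instead.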
Target Milestone: --- → Future
> regarding to your comment about
> COMPOUND_TEXT is the same as Xserver's local encoding ($LANG).
> This is a false statement.
That means that http://lxr.mozilla.org/mozilla/source/widget/src/xlib/nsClipboard.cpp#147 is _wrong_, right ?
> COMPOUND_TEXT is an universial encoding scheme
> which is locale independent. We currently have no function to convert to
> compound text yet but we may one day. The reason we pass a selector to the
> nsIPlatformCharset is exactly design for it. One day we may need to use
> COMPOUND_TEXT as the clipboard format but ISO-8859-2 for window title.
What about implementing a function which claims to convert from/to COMPOUND_TEXT but is currently only a dummy ? IMHO it would be better to have something _now_ which offers the "correct" API instead of forcing the programmers to introduce all their own "workarounds"... which need to be _found_ and removed later...
Reassign to ftang.
Assignee: nhotta → ftang
mark all future new as assigned after move from erik to ftang
Status: NEW → ASSIGNED
Switching qa contact to teruko for now.
Keywords: intl
QA Contact: andreasb → teruko
nsNativeCharsetUtils.cpp was checked in on 2002-06-10. (in xpcom/io). Is this bug still valid with that implemented? http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsNativeCharsetUtils.cpp
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED