Closed
Bug 71247
Opened 24 years ago
Closed 23 years ago
change strings in LDAP XPCOM SDK to unicode and/or UTF8 for i18n
Categories
(Directory :: LDAP XPCOM SDK, defect, P3)
Directory
LDAP XPCOM SDK
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla0.9.2
People
(Reporter: dmosedale, Assigned: leif)
References
Details
(Keywords: intl, Whiteboard: [PDT+] Have a fix, r=, sr=, a= critical for 0.9.2)
Attachments
(6 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
We need to switch to XPIDL wstrings for all the interfaces in LDAP XPCOM SDK so
that unicode data doesn't quietly get thrown out when it passes that boundary.
Assignee | ||
Updated•24 years ago
|
Priority: -- → P2
Assignee | ||
Updated•24 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•24 years ago
|
||
Apparently we are switching to something called "astring" now (abstract
strings).
Keywords: nsbeta1
Priority: P2 → P1
Assignee | ||
Updated•24 years ago
|
Target Milestone: --- → mozilla0.9.1
Updated•24 years ago
|
Severity: normal → major
Summary: change strings in LDAP XPCOM SDK to wstrings for i18n → change strings in LDAP XPCOM SDK to astrings for i18n
Due to this bug, Japanese base DN is displayed garbled on directory edit window.
Please see bug 79869 for details.
Assignee | ||
Comment 3•24 years ago
|
||
Assignee | ||
Comment 4•24 years ago
|
||
Two "notes" to this patch:
1. It does use wstring and not AString. There are problems with using the latter
in an array context in IDL.
2. Because nsIURI uses "string", nsLDAPURL.cpp (which implements nsIURI) can't
use wstring either. This is a big problem, and we haven't decided on a solution.
One alternative is to make a new interface, nsIUnicodeURI or something like
that, which would be subclassed from nsIURI.
#2 means that DNs are still not wide string enabled, and I suggest we open a new
bug on that issue, since it requires some other hackage.
Dmose: Review this patch please, and if you can, test on Mac/PC.
Comment 5•24 years ago
|
||
Setting target milestone to 0.9.2 (check it in anytime, even before, when the
tree is open for). Per PDT triage.
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Reporter | ||
Comment 6•24 years ago
|
||
Reviewer comments:
Can you file a new bug on nsILDAPURL?
My reading of RFC 2251 makes me think that attribute names _can_ be
UTF8, not just ASCII. So I think all the uses of NS_ConvertASCIItoUCS2 should
actually be NS_ConvertUTF8toUCS2. Fix that, and you've got r=dmose@netscape.com.
The patch builds fine on Mac, but I don't have a build that actually runs, so I
can't test that part just yet.
Reporter | ||
Comment 7•24 years ago
|
||
sdagley just showed me the wonders of the "dx" command to macsbug, so I now do
have a usable mac build. LDAP autocomplete is working nicely.
Assignee | ||
Comment 8•24 years ago
|
||
Hmmm, that's not how I read the RFC, it says:
A specification may also assign one or more textual names for an
attribute type. These names MUST begin with a letter, and only
contain ASCII letters, digit characters and hyphens. They are case
insensitive. (These ASCII characters are identical to ISO 10646
characters whose UTF-8 encoding is a single byte between 0x00 and
0x7F.)
I read this as meaning you can have *SOME* 7-bit ASCII characters, which happens
to be a subset of UTF-8. It does exclude "_" for instance, which clearly is
allowed in UTF8 (but is ASCII).
No?
Reporter | ||
Comment 9•24 years ago
|
||
I was interpreting the AttributeType ::= LDAPString to mean that the attribute
name itself could be an LDAPString, not the attribute value. But I think your
reading is right. Never mind. :-/
Assignee | ||
Comment 10•24 years ago
|
||
Scott, can you SR this bug please?
Keywords: review
Whiteboard: SR requested from mscott 5/16
Assignee | ||
Comment 11•24 years ago
|
||
Asking "scc" for SR, mscott is swamped. :)
Comment 12•24 years ago
|
||
I'll get to this but it won't happen until tonight. if someone wants to do it b4
me then cool.
Comment 13•24 years ago
|
||
changing interface methods from string to wstring looks very safe. Looks like
you are using NS_LITERAL_STRING macros were appropriate in here too. good.
these changes looks safe and correct to me.
sr=mscott
I assume you have some test cases to demonstrate preservation of unicode data now?
Assignee | ||
Comment 14•24 years ago
|
||
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 17•24 years ago
|
||
I realize I'm reviewing your code after it's already been checked in :-) Sorry.
I hope you find this useful anyway.
+ *_retval = NS_ConvertUTF8toUCS2(rv).ToNewUnicode();
You can go directly to a newly allocated hunk of UCS2 from ASCII with the global
|ToNewUnicode|
*_retval = ToNewUnicode(rv);
If |rv| didn't happen to be a string yet, you would say
*_retval = ToNewUnicode( nsDependentCString(rv) );
I think you could use this better pattern in a couple of places. Other than
that, though, everything looks fine.
Comment 18•24 years ago
|
||
With 05/22 trunk build, the Japanese base DN is still garbled.
dmose, should the base DN display problem be fixed in bug 79869?
Reporter | ||
Comment 19•24 years ago
|
||
scc: only the attributes themselves are actually guaranteed to be ascii.
values, error messages, etc may contain non-ASCII UTF8 chars
ji: no; that depends on nsLDAPURL being fixed to deal with UTF8; since a
separate bug didn't get filed for that, I'm reopening this one.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 20•24 years ago
|
||
Questions:
* rv = mCurrentOperation->SearchExt(NS_ConvertASCIItoUCS2(baseDn).get()
Is baseDn ASCII only?
* Is there a bug for UTF-8 nsLDAPURL?
Reporter | ||
Comment 21•24 years ago
|
||
nhotta:
My idea behind reopening (and retitling this bug slightly) is so that we can use
it to also track the UTF8 work.
As you surmised, the base DN can indeed contain UTF8; it's one of the fixes need
to close this but out.
Summary: change strings in LDAP XPCOM SDK to astrings for i18n → change strings in LDAP XPCOM SDK to unicode and/or UTF8 for i18n
Reporter | ||
Comment 22•23 years ago
|
||
Because the scope of the UTF8 changes are bit unclear, we're dropping priority
on this slightly, but we hope to still see it land for 0.9.2. Current theory is
to write a little C++ code that calls the charset conversion code and exposes
UCS2<->UTF8 conversion goop to JS via the JSAPI.
Priority: P1 → P3
Comment 23•23 years ago
|
||
PDT+ per 6/12 mtg.
Whiteboard: SR requested from mscott 5/16 → [PDT+] SR requested from mscott 5/16
Assignee | ||
Comment 24•23 years ago
|
||
Assignee | ||
Comment 25•23 years ago
|
||
Since getting something into JSAPI (or something similar) seems unlikely at this
point, I've made a couple of additions to the nsILDAPService interface:
|UTF8toUCS2()|
and
|UCS2toUTF8()|
which are both scritable, e.g. from JS you can do stuff like
|url.spec = ldapService.UCS2toUTF8(prefsURL);|
(with appropriate code to get the LDAP service etc). This is less than ideal,
and I've marked the code appropriately. But until we get a real fix for this, I
think this is what we have to do.
dmose, review please?
-- Leif
Reporter | ||
Comment 26•23 years ago
|
||
A couple of nits:
* doxygen @return statements don't have the type or variable name; just the comment
* from the UCS2toUTF8 commentary: s/nsLDAPURL/nsILDAPURL/
* if you could put the |NS_IMETHODIMP|s on their own lines so that the "find
functions using 'grep ^functionName *' still works
Fix those, and you've got r=dmose. No need to post another patch.
Assignee | ||
Comment 27•23 years ago
|
||
Assignee | ||
Comment 28•23 years ago
|
||
Created a 2nd patch, with dmose's changes, and also updated patch for changes
checked in to ns[I]LDAPService.
mscott, can you SR= please? The last patch, from 6/17/01
-- Leif
Whiteboard: [PDT+] SR requested from mscott 5/16 → [PDT+] SR requested from mscott 6/18
Comment 29•23 years ago
|
||
Your changes look good to me. Although I didn't see any callers of your new
conversion APIs on the LDAP service?
sr=mscott
Assignee | ||
Comment 30•23 years ago
|
||
mscott: Aye, this would only be called from Javascript. We have some LDAP
related bugs depending on UTF8 encoding strings to be stored in the C strings as
defined by |nsIURI|.
Thanks!
-- leif
Comment 31•23 years ago
|
||
jumping in, since I'm reviewing #79869
shouldn't the dn attribute on nsILDAPURL be a wstring now?
Reporter | ||
Comment 32•23 years ago
|
||
sspitzer: Because of the choice of nsIURI to stick to UTF8-encoded |string|s, I
think it makes sense to have nsILDAPURL (which inherits from it) do the same.
Of course, now the XPCOM SDK is a nasty mishmash of |wstring| (UCS2), AString,
and UTF8-encoded |string|. So we're gonna have to go back and clean this up. I
wanna wait and see how 84196 shakes out, as well as learn whether nsACString is
going to be exposed to XPIDL before deciding exactly how best to do this cleanup.
Assignee | ||
Comment 33•23 years ago
|
||
dmose/sspitzer: I already file a bug on the "mess" in the LDAP XPCOM SDK re:
strings, it's bug 86000.
Comment 34•23 years ago
|
||
a= asa@mozilla.org for checkin to the trunk.
(on behalf of drivers)
Assignee | ||
Comment 35•23 years ago
|
||
Checked in.
Status: REOPENED → RESOLVED
Closed: 24 years ago → 23 years ago
Resolution: --- → FIXED
Comment 36•23 years ago
|
||
With 06/20 build, the multi-byte base DN display problem is fixed, but the the
directory with this kind of search root is not accessible. For example, if I
specify a directory with Japanese base DN, autocomplete is not working. With the
same directory, it's working perfectly with 4.75.
Steps to reproduce:
1. Setup a directory with hostname to polyglot.mcom.com, port to 8000, base DN
to o=ネットスケープ (view this page in Japanese (Shift-JIS) and copy this string
to your directory create window).
2. Go to the preference window for addressing and set autocomplete to use this
directory.
3. Bring up a new mail compose window, in To: field, type Mike Smith
no match comes up. You should be able to see
Mike Smith <ji2@seamonkey.mcom.com> auto-filled into To: field.
Comment 37•23 years ago
|
||
Should I open a new bug for this?
Comment 39•23 years ago
|
||
ji, seems like a new bug was the right answer. The problem you're describing
isn't similar to the summary for this bug.
Comment 40•23 years ago
|
||
Dealing **properly** with UTF-8 Base DN in LDAP protcol should be
part of LDAP XPCOM SDK. So, it was appropriate to re-open this
bug.
In asnwer to questions by nhotta, dmose says:
-- Additional Comments From Dan Mosedale 2001-05-29 16:20 --
...
My idea behind reopening (and retitling this bug slightly) is so that we can use
it to also track the UTF8 work.
As you surmised, the base DN can indeed contain UTF8; it's one of the fixes need
to close this bug out.
--- End of the quoted comment ---
The remaining problem is that of Base DN and UTF-8 handling and
according to dmose this is the bug to deal with such isssues.
Comment 41•23 years ago
|
||
please update the status whiteboard with your eta.
Whiteboard: [PDT+] SR requested from mscott 6/18 → [PDT+] need eta
Assignee | ||
Comment 42•23 years ago
|
||
Can I get access to the polyglot machine, and the access log file from the LDAP
server? It would help a lot when debugging this.
Thanks,
-- leif
Assignee | ||
Comment 43•23 years ago
|
||
With the help of Katsuhiko Momoi, I've narrowed this down quite a lot. If I
manually edit the prefs.js file, and put in the appropriate UTF8 "byte" string
as the Base-DN, everything works fine. When I looked at the existing base in the
prefs.js, it looked wrong.
So, the code work for reading the prefs, and performing the actual search (see
attached log from the LDAP server). Somehow the actual prefs.js setting is
getting the wrong base-DN.
Before I edited prefs.js, I'd see this in the access log:
[21/Jun/2001:16:23:15 -0700] conn=117 fd=33 slot=33 connection from
208.12.32.152
[21/Jun/2001:16:23:15 -0700] conn=117 op=0 BIND dn="" method=128 version=2
[21/Jun/2001:16:23:15 -0700] conn=117 op=0 RESULT err=0 tag=97 nentries=0
[21/Jun/2001:16:23:17 -0700] conn=117 op=1 SRCH
base="o=\c3\a7\c2\b9\c2\9d\c3\a9\c2\98\c2\aa\c3\a3\c2\83\c2\a3\c3\a7\c2\b9\c2\9d\c3\a5\c2\8c\c2\bb\c3\a3\c2\81\c2\9b\c3\a7\c2\b9\c2\a7\c3\af\c2\bd\c2\b1\c3\a7\c2\b9\c2\9d\c3\af\c2\bd\c2\bc\c3\a7\c2\b9\c2\9d"
scope=2 filter="(|(cn=smith*)(mail=smith*)(sn=smith*))"
[21/Jun/2001:16:23:17 -0700] conn=117 op=1 RESULT err=32 tag=101 nentries=0
[21/Jun/2001:16:31:57 -0700] conn=117 op=-1 fd=33 closed error 32 (Broken pipe)
After I edited prefs.js, and put in the string as provide to me, I get:
[21/Jun/2001:16:33:28 -0700] conn=118 fd=22 slot=22 connection from
208.12.32.152
[21/Jun/2001:16:33:28 -0700] conn=118 op=0 BIND dn="" method=128 version=2
[21/Jun/2001:16:33:28 -0700] conn=118 op=0 RESULT err=0 tag=97 nentries=0
[21/Jun/2001:16:33:28 -0700] conn=118 op=1 SRCH
base="o=\e3\83\8d\e3\83\83\e3\83\88\e3\82\b9\e3\82\b1\e3\83\bc\e3\83\97" scope=2
filter="(|(cn=smith*)(mail=smith*)(sn=smith*))"
[21/Jun/2001:16:33:28 -0700] conn=118 op=1 RESULT err=0 tag=101 nentries=2
And this looks right to me.
-- leif
Comment 44•23 years ago
|
||
It looks to me what is saved in the prefs.js is not
UTF-8 data. I actually cannot read my BaseDN name
as Japanese when I use UTF-8 capable text editor to
open my prefs.js.
Comment 45•23 years ago
|
||
base="o=\e3\83\8d\e3\83\83\e3\83\88\e3\82\b9\e3\82\b1\e3\83\bc\e3\83\97"
As leif says, the above is indeed the correct representation in UTF-8 of the
BaseDN name we are using.
Comment 46•23 years ago
|
||
Since the preference is saved using SetUnicharPref the data in the prefs.js will
not be utf8. I think the problem is we are using CopyCharPref instead of
CopyUnicharPref to read the .uri preference.
I tried changing the call on windows and tested it and seems to be working fine.
I will post a patch right away.
Leif can you try in Linux.
Comment 47•23 years ago
|
||
Comment 48•23 years ago
|
||
To confirm, are you saying that the BaseDN data in prefs.js
will not be in UTF-8 even after the patch?
Comment 49•23 years ago
|
||
Sorry if I sound a bit paranoid, but by agreement, all non-ASCII
data saved in prefs.js must be in UTF-8. Pls ignore this comment
if I am off target.
Comment 50•23 years ago
|
||
Comment 51•23 years ago
|
||
Comment 52•23 years ago
|
||
mommoi, we are saving basedn in utf-8 but I think it is converted to utf-8
twice.
The new patch should save basedn in utf-8.
Comment 53•23 years ago
|
||
I looked at my prefs.js and it shows basedn as
o=\e3\83\8d\e3\83\83\e3\83\88\e3\82\b9\e3\82\b1\e3\83\bc\e3\83\97
tested the last patch on windows and linux. building on mac right now
Whiteboard: [PDT+] need eta → [PDT+] Have a fix, need reviews
Reporter | ||
Comment 54•23 years ago
|
||
r=dmose
Updated•23 years ago
|
Whiteboard: [PDT+] Have a fix, need reviews → [PDT+] Have a fix, r=, need sr=
Comment 55•23 years ago
|
||
works on mac tooo.
Comment 56•23 years ago
|
||
sr=sspitzer
Updated•23 years ago
|
Whiteboard: [PDT+] Have a fix, r=, need sr= → [PDT+] Have a fix, r=, sr=, need a=
Comment 57•23 years ago
|
||
a=blizzard on behalf of drivers for 0.9.2
Whiteboard: [PDT+] Have a fix, r=, sr=, need a= → [PDT+] Have a fix, r=, sr=, a= critical for 0.9.2
Comment 58•23 years ago
|
||
Fix checked in
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
Comment 59•23 years ago
|
||
Verified with 06/22 linux and Mac latest trunk build. It's fixed.
Will verify it on windows when there is an installable build available.
Comment 60•23 years ago
|
||
Verified with win32 06/25 branch build. It's working on windows build too.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•