Closed Bug 485690 Opened 16 years ago Closed 15 years ago

LDAP SSL connections with clientauth sometimes use wrong credentials

Categories

(Directory :: LDAP C SDK, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nelson, Assigned: nelson)

References

Details

Attachments

(2 files, 2 obsolete files)

Attached patch Patch v1 for LDAP C SDK (obsolete) (deleted) — Splinter Review
This problem is seen in client programs that use SSL clientauth and that use different credentials (client certificates) on different connections to the same server during the same process lifetime. It is also seen in programs that use clientauth on some connections to a server, but not on other connections to the same server during the same process lifetime. This is seen most commonly is test clients that are trying to simulate different users, but it may also be seen in servers that are trying to replicate different data sets to a remote server with a different client credential for each different data set. The problem is that after one connection has been established between the client and the server using some (or no) credential, all subsequent connections to the same server may use the same credential (or same absent credential) even if the client has called ldapssl_enable_clientauth specifying different credentials to be used for each subsequent connection. Here is an explanation. SSL remembers ("caches") information about previously established sessions with "remote" clients and servers, and reuses that information from old sessions whenever possible to minimize the expensive computational overhead of establishing new sessions with each new connection. When a new connection is made, if the server says "Let's reuse the session we last used", then the most recent session from that socket's cache will be used, even if that previous session was established using different credentials than the client now wishes to use. NSS's SSL client library supports the use of multiple separate named caches, to allow the sessions associated with one set of credentials to be remembered separately from the sessions associated with other sets of credentials. So a client program that uses credentials for "Alice" with some sockets, and uses credentials for "Bob" with other sockets, can have separate caches for Alice and Bob. If the connections for Alice are configured to use Alice's cache, and Bob's connections use Bob's cache, all confusion over different credentials can be eliminated. When a program (or thread) uses a socket for which it has not configured an explicitly named socket, then NSS uses a default unnamed cache for that socket. There is only one default unnamed cache. If multiple threads, trying to act as separate users, do not specify explicitly named caches for their respective users' identities, then the separate users will end up using a common set of credentials from the common cache, resulting in confusion. It may be reasonable for all LDAP connections that do NOT use client auth to share a common cache, because each connections that uses no client auth typically is (re)authenticated using some other method, such as simple name and password authentication. But it is not reasonable for client auth connections to share a cache with connections that use no client auth. That is the problem experienced with the LDAP C SDK. The LDAP C SDK never explicitly names a cache to be used with any SSL socket, so all SSL sockets share the common unnamed cache, with the resulting identity confusion. A named cache is implicitly created the first time its name is configured to be used by a socket. Cache entries have a lifetime after which they expire. When all the cache entries for a named cache have expired, the cache disappears. So, it's very easy to use named caches. You just give the name to the SSL socket before doing the first handshake on that socket, and SSL does the rest. I have attempted to modify the LDAP C SDK to do that, to configure each SSL socket that will be used with a client certificate to use a cache with the same name as the "nickname" of the cert to be used. I have attached to this bug an untested patch that I believe will solve the problem. I would like one of the LDAP C SDK cognoscenti (you know who you are :) to test this patch and see if it does indeed satisfactorily solve the problem for them.
Attachment #369808 - Flags: review?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: ssl connections with clientauth sometimes use wrong credentials → LDAP SSL connections with clientauth sometimes use wrong credentials
Attached patch alternative patch v1 (obsolete) (deleted) — Splinter Review
This may be slightly more pleasing.
Attachment #369845 - Flags: review?
Attached patch Patch v2 for LDAP C SDK (deleted) — Splinter Review
I found a third function in this source file that almost identical to the other two, so I patched it, too.
Attachment #369808 - Attachment is obsolete: true
Attachment #369845 - Attachment is obsolete: true
Attachment #369957 - Flags: review?
Attachment #369808 - Flags: review?
Attachment #369845 - Flags: review?
Comment on attachment 369957 [details] [diff] [review] Patch v2 for LDAP C SDK I'm asking Mark for this review, but any peer of this module could review it.
Attachment #369957 - Flags: review? → review?(mcs)
hey Nelson, thanx for the patch and explanation! i dont have time to test it right now but should be able to do so by the end of the week or so so if somebody can test it earlier than that please do. on a side note i think related NSS caching design is broken because it allows for such things to happen in the first place pretty much without user consent [i'm sure you have it documented somewhere but still]. i understand it would be difficult to fix because of the performance implications and existing API users expectations in terms of baseline performance they come to expect but you should at least think about fixing it somehow. it is probably a very good and efficient caching mechanism for browsers and client software alike and i can certainly understand how that design came around.
Anton, I don't think the existing API or cache design in broken. It is necessary to tell/configure the socket in advance, saying "if the server asks you to re-use an existing session, use one associated with this identity, and if the server asks you to create a new session, associate that new session with this identity". There is no other basis on which NSS can figure out the right identity to use for those purposes than to be told, and the SSL_SetSockPeerID function is how you tell it.
This patch looks ok Is it safe to call SSL_SetSockPeerID() even if the peer ID has already been set? I'm not sure how this can happen, but do we need to get the current peer ID and free/release it before we set it again?
> Is it safe to call SSL_SetSockPeerID() even if the peer ID has already > been set? It will leak the previous value. But that's easy for us to fix in NSS. > do we need to get the current peer ID and free/release it before we set > it again? I think there's no public NSS API by which to do that, and in any case, that should not be necessary. If you think this is really a possibility, Please let me know.
Depends on: 486999
It is a little surprising to me that the SSL session cache ignores differences in client auth credentials, etc. but I know it would be difficult to code it so it did the right thing automatically (the right thing is likely to be application specific). I think this is a good fix for the LDAP C SDK, but I am not able to test it right now.
The SSL code isn't ignoring anything. It hasn't received any information to ignore. If you carefully trace through the calls made by LDAP SDK to libSSL, you will see that none of them tells libSSL what identity to use before the handshake begins. The patch fixes that. The patch gives the SSL library the information with which it can correctly choose an SSL session to restart. Without that information, the SSL library can only guess which session to restart. It guesses the most recently established session with that server. That's generally right for the client that represents just a single identity. It's a problem for a client that represents multiple identities. But if such a client doesn't tell SSL "For this socket, I want to be identity X", then libSSL has no basis on which to make such a decision. I got a similar reaction from some DS people at Sun. There's clearly some fundamental misunderstanding at work here. I wish I understood the cause of that. Perhaps there is some fundamental difference in models between SSL and LDAP. Or perhaps the way that SSL manages sessions is not generally understood. Please help me to determine the cause of this misunderstanding. Once that is understood, I think we can help many Directory folks with this issue.
In Fedora Directory Server we do this for outgoing LDAP connections (e.g. if you want to use SSL with client cert auth in replication): http://cvs.fedoraproject.org/viewvc/ldapserver/ldap/servers/slapd/ssl.c?revision=1.26&root=dirsec&view=markup function slapd_SSL_client_auth ... rc = ldapssl_enable_clientauth (ld, SERVER_KEY_NAME, pw, cert_name); if (rc != 0) { errorCode = PR_GetError(); slapd_SSL_warn("ldapssl_enable_clientauth(%s, %s) %i (" SLAPI_COMPONENT_NAME_NSPR " error %d - %s)", SERVER_KEY_NAME, cert_name, rc, errorCode, slapd_pr_strerror(errorCode)); } else { /* We cannot allow NSS to cache outgoing client auth connections - each client auth connection must have it's own non-shared SSL connection to the peer so that it will go through the entire handshake protocol every time including the use of its own unique client cert - see bug 605457 */ ldapssl_set_option(ld, SSL_NO_CACHE, PR_TRUE); } So instead of trying to get the correct credentials to be used by the cached connection, we just avoid caching altogether.
In reply to comment 10, disabling the cache is another way to prevent the problem, but it is extermely inefficient, forcing a new full handshake for every client auth connection, and never reusing SSL sessions. It's a "solution" but not an optimal one, IMO. The patch I proposed is much more efficient, reusing sessions properly.
BTW, Rich, where is this bug 605457 ? Got URL?
(In reply to comment #11) > In reply to comment 10, disabling the cache is another way to prevent the > problem, but it is extermely inefficient, forcing a new full handshake for > every client auth connection, and never reusing SSL sessions. It's a > "solution" but not an optimal one, IMO. > > The patch I proposed is much more efficient, reusing sessions properly. Sure. We will investigate changing this in an upcoming version of DS.
(In reply to comment #12) > BTW, Rich, where is this bug 605457 ? Got URL? It is not a public bug. We found it mid-2002, so the bug was in the internal private AOL bug tracking system. In fact, nelsonb@netscape.com at 2002-07-26 14:55:17 said to use SSL_SetSockPeerID with a unique identity string.
(In reply to comment #9) > The SSL code isn't ignoring anything. It hasn't received any information to > ignore. If you carefully trace through the calls made by LDAP SDK to libSSL, > you will see that none of them tells libSSL what identity to use before the > handshake begins. The patch fixes that. The patch gives the SSL library > the information with which it can correctly choose an SSL session to restart. Thanks for the explanation. I think I understand now. When I wrote comment # 8, I was thinking that the SSL library could, for example, look at what certificate is being used for client authentication and use that info. to decide whether to reuse a cached session. But it does not have that information at the time it needs to make the decision about session reuse.
sorry, i didnt really get enough time to have a go with a test setup. this stuff is a royal pain to setup from scratch. i have tried to do it quickly but failed with SSLException : bad_certificate on the server side thrown for certutil generated self-signed cert. its not the issue of trust or validity, it is something with the cert itself. i'm gonna try to generate self signed cert with Java tools then import it into NSS, maybe that will work, maybe not, maybe it all depends on the current moon phase.
doh, but of course i cant just move the keystore from JKS to NSS.
well, I'd say that testing SSL client auth with self-signed certs is doomed. When an SSL server requests client auth, it sends a list of the names of the CAs whose client certs it accepts. SO, when you're doing SSL client testing, you should a) have your own CA for issuing client certs, with its own CA cert. (Certutil can do this nicely). b) have that CA cert marked as trusted to issue client certs in the server that is requesting client auth, so that the server will take the name from that CA cert and send it to the clients as the name of an issuer whose certs will be accepted. c) have that CA cert be imported into the client's cert DB, along with the client's own cert (issued by that CA). It's easy to setup a shell script that - creates a cert DB for a CA - creates a self signed CA cert in that CA's cert DB. - exports the CA cert for that CA - creates a cert DB for the client - imports the CA's cert into the client's DB - creates a cert signing request for the client using the client's DB - signs and issues a new client cert, using the CA's DB - imports the newly issued client cert into the client's DB. NSS's QA test scripts do all those steps (and more) for every NSS QA test run. - creates a client auth cert signed by that CA -
we have so called Blind Trust manager provider on the server side, just for testing so it doesnt matter. the problem i have is that NSS self signed certs are rejected while JKS self signed certs work fine. SSLException isnt very informative, just saying bad_certificate and thats it. any idea how to generate Java Security friendly self signed cert with certutil ?
nevermind, i think i got it. what was happening is NSS refusing a handshake because it didnt know about the server cert [insert another doh! here] and didnt trust it. the thing that got me baffled was that SSLException with bad_certificate error. i thought it was about some certificate extension missing or some other extension not recognized or something like that. will try to test it further now, sorry for the noise.
We use such a script as mentioned by Nelson: http://github.com/richm/scripts/blob/702a161b5ad83c47d03ff5ac37ab52dcf691e16f/setupssl2.sh Creates a self signed CA cert, a server cert for directory server, and a server cert for admin server (that both can be used for client cert auth too).
ok, verified the patch now and it does fix the problem as described and i dont see it rendering anything broken on my simple tests so we should get this fix in asap. here is some proof output on the server side just in case : - unpatched 08/Apr/2009:02:49:01 +0200] CONNECT conn=7 from=x.x.x.x:37556 to=y.y.y.y:636 protocol=LDAP [08/Apr/2009:02:49:02 +0200] BIND REQ conn=7 op=0 msgID=1 type=SASL mechanism=EXTERNAL dn="" [08/Apr/2009:02:49:02 +0200] BIND RES conn=7 op=0 msgID=1 result=0 authDN="cn=Server-Cert,dc=example,dc=com" etime=4 [08/Apr/2009:02:49:02 +0200] CONNECT conn=8 from=x.x.x.x:37557 to=y.y.y.y:636 protocol=LDAP [08/Apr/2009:02:49:02 +0200] BIND REQ conn=8 op=0 msgID=1 type=SASL mechanism=EXTERNAL dn="" [08/Apr/2009:02:49:02 +0200] BIND RES conn=8 op=0 msgID=1 result=0 authDN="cn=Server-Cert,dc=example,dc=com" etime=4 - patched [08/Apr/2009:02:54:30 +0200] CONNECT conn=9 from=x.x.x.x:37558 to=y.y.y.y:636 protocol=LDAP [08/Apr/2009:02:54:30 +0200] BIND REQ conn=9 op=0 msgID=1 type=SASL mechanism=EXTERNAL dn="" [08/Apr/2009:02:54:30 +0200] BIND RES conn=9 op=0 msgID=1 result=0 authDN="cn=Server-Cert,dc=example,dc=com" etime=4 [08/Apr/2009:02:54:30 +0200] CONNECT conn=10 from=x.x.x.x:37559 to=y.y.y.y:636 protocol=LDAP [08/Apr/2009:02:54:30 +0200] BIND REQ conn=10 op=0 msgID=1 type=SASL mechanism=EXTERNAL dn="" [08/Apr/2009:02:54:30 +0200] BIND RES conn=10 op=0 msgID=1 result=49 authFailureID=1245310 authFailureReason="The SASL EXTERNAL bind request could not be processed because the client did not present an certificate chain during SSL/TLS negotiation" etime=2 this is from a single client process just using two separate LDAP handles.
I agree - this looks good.
i already told you the reason. API users dont expect blanket NSS caching in the first place. yes, it is described in the docs, yes, rtfm is in order, still from the API user pov this is wrong. if i wanna enable blanket cache i would go ahead and enable it and i dont expect the API to set me up. the reason we have this bug in the first place is just that, well that and not rtfm, that is if it was documented at the time at all. again, i understand why NSS does it and how that design came around but it is not a good design from where i look at it [as API user]. i also know that there isnt much you can do about it now even if you recognize there is a design problem and are willing to fix it. one way to go about it would be introducing NSS handles where each handle is associated with a given db, cache and whathaveyou but i digress. anyway, this is just to tell you how it is from API user pov. i have no time or energy to argue about it so take it or leave it, up to you. (In reply to comment #9) > I got a similar reaction from some DS people at Sun. There's clearly some > fundamental misunderstanding at work here. I wish I understood the cause > of that. Perhaps there is some fundamental difference in models between SSL > and LDAP. Or perhaps the way that SSL manages sessions is not generally > understood. Please help me to determine the cause of this misunderstanding. > Once that is understood, I think we can help many Directory folks with this > issue.
Anton, I think your answer can be summarized as "caching of sessions is entirely unexpected". That's a helpful answer. Thanks. If we ever do an NSS 4.0, I think that lots of issues like this would change. But I doubt that we ever will do so.
btw this is how it works in the current Sun Directory Server as well. (In reply to comment #10) > In Fedora Directory Server we do this for outgoing LDAP connections (e.g. if > you want to use SSL with client cert auth in replication): > So instead of trying to get the correct credentials to be used by the cached > connection, we just avoid caching altogether.
Well, if Sun's DS disabled caching for auth connections that intended or desired to use client auth, this bug would not exist. This bug was brought to my attention by DS folks (QA, I think, not sure) because their connections that they intended to use client auth were reusing old non-client auth sessions.
last week i spoke to the person responsible for related code in Sun DS and who also filed related Sun CR against NSS and this is what he told me. i think it is still a problem for them since there are scenarios where many short lived connections can take a perf hit with disabled cache. hopefully once your patch is committed they can take advantage of it as well. (In reply to comment #27) > Well, if Sun's DS disabled caching for auth connections that intended or > desired to use client auth, this bug would not exist. This bug was brought > to my attention by DS folks (QA, I think, not sure) because their > connections that they intended to use client auth were reusing old non-client > auth sessions.
I think this is an issue for SASL, as well as for SSL client auth. Basically, any use of LDAP C SDK over SSL that does some sort of binding of identity/credentials to the SSL session needs to separate the SSL sessions so that no single SSL session is used with multiple credentials. The patch attached here attempts to do that for connections authenticated with SSL client auth (certs), but I suspect this is also an issue for SASL. See bug 488449, and see if you think that is the same problem. If so, a patch similar to the one that helps this bug may help that bug. I don't know how you would get the SASL-client to give you a string that you could use as a client session cache name. But I suspect that would be part of the solution.
What can I do to move this patch along?
Priority: -- → P1
Comment on attachment 369957 [details] [diff] [review] Patch v2 for LDAP C SDK Sorry for the delay. The patch looks fine, although I would add a comment block like this before each of the new code blocks: /* * If using client auth., arrange to use a separate SSL session cache * for each distinct certificate nickname. */ r=mcs if comments are added. Please commit or ask me to do so. Thanks!!!!!
Attachment #369957 - Flags: review?(mcs) → review+
Attached patch patch as committed to CVS trunk (deleted) — Splinter Review
Checking in ldapsinit.c; new revision: 5.17; previous revision: 5.16 Is there a tinderbox I can watch to see if this checkin has any unexpected consequences on other platforms?
What release number should go into the target milestone field for this bug? What will be the next expected release from the trunk?
(In reply to comment #33) > What release number should go into the target milestone field for this bug? 6.0.7 > What will be the next expected release from the trunk? 6.0.7
Target Milestone: --- → 6.0
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: