Closed Bug 58107 Opened 24 years ago Closed 24 years ago

QA suites slow because of missing synchronization between tstclnt and selfserv

Categories

(NSS :: Tools, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sonja.mirtitsch, Assigned: sonja.mirtitsch)

Details

Attachments

(2 files)

In ssl.sh we currently use a sleep 20 to wait until selfserv is ready before we start tstclnt. Some platforms / machines are substantially faster, so we need a mechanism (a signal or a file) that tells the calling shell script that the server is ready.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Status: NEW → ASSIGNED
This bug is also related to bug 58176 (if port in use try next higher port). Ian had additional comments about it:

These are all failed connections, which is likely to be the old problem of the server not being launched or not being killed properly. Unfortunately, our QA tests are very fragile with respect to timing, and I think failures like these are likely to occur periodically.

Right now, our SSL tests use the following strategy:
1. launch a server on port 8443
2. sleep for some preset amount of time
3. launch a client that tries to connect to the server
4. kill the server
5. wait until "selfserv" leaves the process table
6. goto 1

Would it make more sense to use the following strategy?
1. launch a server on port 8443
2. wait until port 8443 accepts connections (eventually breaking with an error if it takes too long)
3. launch a client
4. kill the server
5. wait until "selfserv" leaves the process table
6. goto 1

This should be faster for platforms that launch the server quickly, and also work for slow machines that take longer than our preset amount of time to launch the server.
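A minimal shell sketch of step 2 of the proposed strategy: poll until the port accepts connections, and give up with an error after a timeout. The helper names wait_until and port_open are hypothetical and not part of ssl.sh. port_open relies on bash's /dev/tcp redirection, so it assumes the script runs under bash.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND about once per second until it succeeds.
# Returns 0 as soon as COMMAND succeeds, 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
    wu_timeout=$1
    shift
    wu_elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$wu_elapsed" -ge "$wu_timeout" ]; then
            return 1                      # took too long: report an error
        fi
        sleep 1
        wu_elapsed=$((wu_elapsed + 1))
    done
    return 0
}

# port_open HOST PORT
# Hypothetical probe: succeeds if a TCP connection to HOST:PORT can be opened.
# Uses bash's /dev/tcp pseudo-device (an assumption: not available in plain sh).
port_open() {
    (: < "/dev/tcp/$1/$2") 2>/dev/null
}
```

With these helpers, something like `wait_until 20 port_open localhost 8443 || exit 1` could stand in for the fixed sleep 20: fast machines proceed as soon as the port answers, and slow ones still get up to 20 seconds. Note that a probe of this kind opens and immediately drops a connection, which the server has to tolerate.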
Assignee: wtc → sonmi
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Nelson suggested:

Yes, and we repeat this for every separate client test, which is very inefficient and unnecessary. The server can be started once and used for all the different test client runs. No need to keep killing and restarting it for each client. The only exception is if you want to test client auth. To switch client auth on and off, it is necessary to restart the server.

> Would it make more sense to use the following strategy?
> 1. launch a server on port 8443
> 2. wait until port 8443 accepts connections (eventually breaking with an
> error if it takes too long)

Yes, that makes sense.

> 3. launch a client

The next 3 don't, however.

> 4. kill the server
> 5. wait until "selfserv" leaves the process table
> 6. goto 1
>
> This should be faster for platforms that launch the server quickly, and also
> work for slow machines that take longer than our preset amount of time to
> launch the server.
>
> Thoughts?

I suggest this:
1. launch a server
2. launch a special client that retries connection attempts until it can successfully connect to the server, then exits with a success status
3. launch the (first, next) test client program
4. repeat step 3 for each test client program
5. kill the server
6. wait for it to die
7. stop
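The seven steps Nelson suggests could be sketched in shell roughly as follows. This is only an illustration under stated assumptions: run_suite and the three hooks it calls (start_server, server_ready, run_client_test) are hypothetical names that the calling script would have to define, and none of them are taken from ssl.sh.

```shell
# run_suite TEST_NAME...
# The caller must define three hooks (all hypothetical names):
#   start_server     should "exec" the server so that $! is the server's own PID
#   server_ready     returns 0 once a probe client can connect
#   run_client_test  runs one named client test and returns 0 on success
run_suite() {
    start_server &                              # step 1: launch the server once
    rs_server_pid=$!

    until server_ready; do                      # step 2: retry until the probe connects
        # Give up if the server process has already gone away.
        kill -0 "$rs_server_pid" 2>/dev/null || return 2
        sleep 1
    done

    rs_failures=0
    for rs_test in "$@"; do                     # steps 3-4: every test reuses the server
        run_client_test "$rs_test" || rs_failures=$((rs_failures + 1))
    done

    kill "$rs_server_pid" 2>/dev/null || true   # step 5: kill the server
    wait "$rs_server_pid" 2>/dev/null || true   # step 6: wait for it to die

    [ "$rs_failures" -eq 0 ]                    # step 7: stop; non-zero if any test failed
}
```

The point of the design is that the server start-up cost is paid once per group of tests rather than once per client run. Tests that need different server settings, such as client auth on or off, would each get their own run_suite call with a different start_server hook.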
I checked patch #2 in, along with changes to ssl.sh. Now the server is only launched once, and the special tstclnt is run to block until the server is awake. sonmi- this breaks the QA suite for the "NULL MD5" case, you mentioned you could fix that quickly.
I changed ssl.sh; for the NULL cipher the server is restarted right now. Also, for the Client Auth and Stress tests, tstclnt -q is used to determine whether the server is ready before starting the real client. Nelson thinks we should try to lump the NULL cipher tests together with the others.

---- email ----
> We took it out and start the server once for all the SSL Cipher tests, once
> for the two Null Cipher test, once for each SSL Client Auth test.

That will work just fine. The stress test can also run with a single start of the server (currently; if we add client auth stress tests, the picture changes). It is possible to combine the NULL cipher test with the other cipher tests. There's no real need to test the null cipher separately. Only the client auth tests need the server to be restarted. Bob, I'm surprised you disagreed with that statement.

It depends somewhat on what you want to test. If you enable the NULL ciphers, then you can get tests that run with a single start of the SSL server. If I remember right, however, some of the tests are testing combinations of connecting to the server with different server options (TLS on/off by default, for instance). It may be better, though, just to do coverage, then do some real tests with the server (like connecting to an SSL2 server with TLS and SSL3 turned on, for instance). Also, giving tstclnt an option to verify that it actually connected with the cipher and protocol it was supposed to use would remove the need to restart the server side with so many different options.
------

An additional problem exists under NT: if no directory for the cert database is specified, a default one will be opened in ~/.netscape. That could be just about anywhere on Windows, and is almost sure to be wrong. The way to fix this is to add "-d ${SERVERDIR}" to the end of all of the tstclnt -q calls in the script. I will fix the script for this.
I made a mistake: it should be "-d ${CLIENTDIR}". It doesn't really matter, since no certs will be used, but it should at least be consistent.
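For illustration, a readiness probe with the explicit database directory might look like the sketch below. The flags are the ones that appear in the tstclnt calls quoted in this bug. The function name probe_server and the variables PORT, HOST and REQUEST_FILE are hypothetical stand-ins for whatever ssl.sh actually uses; only CLIENTDIR comes from the comment above.

```shell
# probe_server: hypothetical wrapper around the "tstclnt -q" readiness check.
# Passing -d explicitly keeps tstclnt from opening a default cert database
# in ~/.netscape, which is the problem described for NT.
# PORT, HOST and REQUEST_FILE are assumed to be set by the calling script.
probe_server() {
    tstclnt -p "${PORT}" -h "${HOST}" -q -d "${CLIENTDIR}" < "${REQUEST_FILE}"
}
```

Wrapping the call in one function means the -d argument only has to be kept consistent in a single place rather than at every tstclnt -q call site.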
Reusing the selfserv will make all subsequent tests fail. Do we need to fix this behavior, or can we live with it?

********************* SSL Cipher Coverage ****************************
selfserv -v -p 8443 -d w:/.../20001117.1/blowfish_NT4.0_Win95/.../CLIO.8/server -n CLIO.red.iplanet.com -i C:/.../Temp/tests_pid.888 -w nss &
tstclnt -p 8443 -h CLIO -q -d w:/.../20001117.1/blowfish_NT4.0_Win95/.../CLIO.8/client < W:/.../20001117.1/blowfish_NT4.0_Win95/.../sslreq.txt
selfserv: Launched thread in slot 0
selfserv: About to call accept.
selfserv: Thread in slot 0 returned 0
selfserv: PR_Accept returned error -5961: TCP connection reset by peer.
selfserv: Closing listen socket.

After selfserv closes its socket, should we try to reopen it?
Took the NULL cipher server restarts out. The server now gets the additional options -c ABCDEFabcdefghijklm.

----
Still open: Need to fix the server so that it does not close and exit after an error, but instead ignores the error or reopens the socket.
----

I think you're saying that if any test makes selfserv fail, then all subsequent tests fail. I guess that's true. I don't think selfserv should fail (unless there's a bug in the NSS code).

> + Do we need to fix this behavior, or can we live with it?

I think that when PR_Accept returns this error (connection reset by peer), selfserv should not close the listen socket and exit. This error should be one that selfserv just ignores and continues.
The goal of this bug has been achieved by the -q option for tstclnt. You might want to mark this bug fixed unless there are other remaining issues.
Target Milestone: --- → 3.2
This has been fixed; additionally, tstclnt has been modified.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED