Closed Bug 1350506 Opened 8 years ago Closed 7 years ago

sea-puppet's puppet setup is horked.

Categories

(SeaMonkey :: Release Engineering, defect)

Type: defect
Priority: Not set
Severity: blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ewong, Assigned: ewong)

Attachments

(1 file, 3 obsolete files)

This looks like bug 1311822. (No, I don't recall running puppet cert on sea-puppet.)

|puppet agent --test| on -3 (sea-hp-linux64-3) returns:

  Error: Could not request certificate: Error 400 on SERVER: this master is not a CA
  Exiting; failed to retrieve certificate and waitforcert is disabled
Following the instructions in that bug, I:

1) backed up all the stuff in var/lib/puppetmaster/ssl/ca to ../backup.zip
2) removed the last entry in inventory.txt
3) removed the *.pems

Rebooted puppetmaster. Rebooted -3. Waiting for it to return.
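For anyone retracing this, the three cleanup steps in the comment above could be sketched roughly as below. This is only an illustration: `reset_ca_state` is a made-up helper name, it uses a tarball instead of the zip mentioned above, and it assumes the *.pem files in question sit at the top level of the CA directory (the comment doesn't say which ones were removed).

```shell
# Hypothetical helper (not from the bug): reset_ca_state CA_DIR BACKUP_FILE
# BACKUP_FILE should live outside CA_DIR.
reset_ca_state() {
    ca_dir=$1
    backup=$2

    # 1) Back up the whole CA directory first (tar here; the comment used a zip).
    tar -czf "$backup" -C "$ca_dir" . || return 1

    # 2) Drop the last entry (last line) of inventory.txt.
    sed '$d' "$ca_dir/inventory.txt" > "$ca_dir/inventory.txt.new" || return 1
    mv "$ca_dir/inventory.txt.new" "$ca_dir/inventory.txt" || return 1

    # 3) Remove the *.pem files. Assumption: top-level files only.
    rm -f "$ca_dir"/*.pem
}
```

Try it against a scratch copy of the directory before pointing it at a real puppetmaster.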
Ran |puppet agent --trace --test| on -3 and got:

  Error: Could not request certificate: Error 400 on SERVER: this master is not a CA
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:207:in `is_http_200?'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:100:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/certificate/rest.rb:12:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/indirection.rb:201:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:207:in `certificate'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:36:in `localhost'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator/default_validator.rb:27:in `initialize'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator.rb:27:in `new'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator.rb:27:in `default_validator'
  /usr/lib/ruby/site_ruby/1.8/puppet/network/http_pool.rb:35:in `http_instance'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:57:in `network'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:82:in `http_request'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:62:in `http_get'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:96:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:190:in `do_request'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/request.rb:264:in `do_request'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:190:in `do_request'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:90:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/certificate/rest.rb:12:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/indirector/indirection.rb:201:in `find'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:207:in `certificate'
  /usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:326:in `wait_for_cert'
  /usr/lib/ruby/site_ruby/1.8/puppet/application/agent.rb:478:in `wait_for_certificates'
  /usr/lib/ruby/site_ruby/1.8/puppet/application/agent.rb:319:in `run_command'
  /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
  /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:510:in `plugin_hook'
  /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
  /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:488:in `exit_on_fail'
  /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
  /usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:146:in `run'
  /usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:92:in `execute'
  /usr/bin/puppet:8
  Exiting; failed to retrieve certificate and waitforcert is disabled
Summary: sea-puppet/ sea-hp-linux64-3 are not talking to each other. → sea-puppet's puppet setup is horked.
Blocks: SM2.48b1
Severity: normal → blocker
Blocks: SM2.48
After wrestling with puppet and certificate chaining, I think I've managed to unhork the puppet infrastructure. The basic gist of fixing it I got from [1]:

1) generate a RootCA self-signed cert
   i) generate rootCA CRL
2) generate a puppetmaster ca csr
   - sign the puppetmaster ca csr with the RootCA cert
   - generate the puppetmaster CA CRL
   i) copy the ca-cert to /var/lib/puppetmaster/ssl/git/ca-certs (and rename to sea-puppet.community.scl3.mozilla.crt)
   ii) copy the puppetmaster's ca and rootca cert and crls to /var/lib/puppetmaster/ssl/git/certdir
   - then run the following script in that dir:

     for i in *.crl; do
       h=`openssl crl -hash -noout -in $i`
       fn=$h.r0
       echo " Linking ${fn} to $i..."
       [ ! -f $fn ] && ln -s $i $fn
     done
     for i in *.pem; do
       h=`openssl x509 -hash -noout -in $i`
       fn=$h.0
       echo " Linking ${fn} to $i..."
       [ ! -f $fn ] && ln -s $i $fn
     done

3) generate a puppetmaster (leaf) csr
   - sign puppetmaster (leaf) csr with the puppetmaster ca cert
   i) copy this puppetmaster leaf cert to /var/lib/puppetmaster/ssl/git/master-certs
4) generate the hosts (2->13 + sea-puppet + sea-master1) csrs
   - sign with the puppetmaster ca's cert
   - for each host:
     i) copy the crt to /var/lib/puppet/ssl/public_keys (rename to pem)
     ii) copy the crt also to /var/lib/puppet/ssl/certs (and rename to pem)
     iii) copy the key to /var/lib/puppet/ssl/private_keys (and rename to pem)
     iv) copy the rootca's crt to /var/lib/puppet/ssl/certs (and rename to ca.pem)
5) then run |puppet agent -t| on every host.

[1] - https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Certificate_Chaining

So I'm going to ask Callek for confirmation on whether I did it right.
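For reference, the root CA -> puppetmaster CA -> leaf part of the chain (steps 1 to 3 above) might look something like the sketch below with plain openssl. It is not the exact set of commands that was run here: the subject names, key sizes, validity periods and file names are all placeholders, everything is written to a scratch directory, and the CRL generation from steps 1 and 2 is left out (that needs an |openssl ca| setup with its own config and index).

```shell
# Scratch directory so nothing touches a real puppet ssl tree.
workdir=$(mktemp -d)

# 1) Self-signed root CA (placeholder subject, not the real one).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj "/CN=Example Root CA" \
    -keyout "$workdir/rootca.key" -out "$workdir/rootca.crt"

# Extensions marking the intermediate as a CA; without basicConstraints
# the chain check at the end is expected to reject it.
printf 'basicConstraints=critical,CA:TRUE\nkeyUsage=critical,keyCertSign,cRLSign\n' \
    > "$workdir/ca_ext.cnf"

# 2) Puppetmaster CA: CSR, then signed by the root CA.
openssl req -new -newkey rsa:2048 -nodes \
    -subj "/CN=Example Puppetmaster CA" \
    -keyout "$workdir/puppetmaster-ca.key" -out "$workdir/puppetmaster-ca.csr"
openssl x509 -req -days 30 -in "$workdir/puppetmaster-ca.csr" \
    -CA "$workdir/rootca.crt" -CAkey "$workdir/rootca.key" -CAcreateserial \
    -extfile "$workdir/ca_ext.cnf" -out "$workdir/puppetmaster-ca.crt"

# 3) Leaf cert: CSR, then signed by the puppetmaster CA.
openssl req -new -newkey rsa:2048 -nodes \
    -subj "/CN=puppetmaster.example.invalid" \
    -keyout "$workdir/leaf.key" -out "$workdir/leaf.csr"
openssl x509 -req -days 30 -in "$workdir/leaf.csr" \
    -CA "$workdir/puppetmaster-ca.crt" -CAkey "$workdir/puppetmaster-ca.key" \
    -CAcreateserial -out "$workdir/leaf.crt"

# Sanity check: the leaf should chain to the root via the intermediate.
verify_output=$(openssl verify -CAfile "$workdir/rootca.crt" \
    -untrusted "$workdir/puppetmaster-ca.crt" "$workdir/leaf.crt")
echo "$verify_output"
```

The |openssl verify| call at the end is a cheap way to catch a broken chain before copying anything into /var/lib/puppetmaster/ssl; the host certs from step 4 would be signed the same way as the leaf.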
Flags: needinfo?(bugspam.Callek)
I think it's unhorked, namely because puppet ran and overwrote my new mercurial installation with the old one. So I need to update the puppet mercurial module.
Attached patch [puppet] proposed patch (obsolete) (deleted) — Splinter Review
Attached patch [puppet] proposed patch (obsolete) (deleted) — Splinter Review
Attachment #8860731 - Attachment is obsolete: true
Attachment #8860734 - Flags: review?(bugspam.Callek)
Attached patch [puppet] proposed patch (v3) (obsolete) (deleted) — Splinter Review
realized I hadn't flushed the repo version.
Attachment #8860734 - Attachment is obsolete: true
Attachment #8860734 - Flags: review?(bugspam.Callek)
Attachment #8860735 - Flags: review?(bugspam.Callek)
Attached patch [puppet] proposed patch (v4) (deleted) — Splinter Review
Attachment #8860735 - Attachment is obsolete: true
Attachment #8860735 - Flags: review?(bugspam.Callek)
Then on the puppetmaster:

1) copied the mozilla-python27-mercurial-3.9.1-1.el6.x86_64.rpm to /data/repos/yum/releng/public/CentOS/6/x86_64
2) cd /data/repos/yum/releng/public/CentOS/6/x86_64
3) createrepo --update ./

Then on each puppet slave (as root):

1) yum clean all
2) puppet agent -t
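The puppetmaster half of the steps above could be wrapped up along these lines. `publish_rpm` is a made-up helper name, and the `CREATEREPO` override is only there so the function can be dry-run on a machine without createrepo installed; neither comes from this bug.

```shell
# Hypothetical helper (not from the bug): publish_rpm RPM_FILE REPO_DIR
# Copies the rpm into the yum repo directory and refreshes the repo metadata.
publish_rpm() {
    rpm_file=$1
    repo_dir=$2

    # 1) Copy the package into the repo directory.
    cp "$rpm_file" "$repo_dir/" || return 1

    # 2) + 3) Refresh metadata in place, as in "createrepo --update ./".
    # CREATEREPO is an assumed override for dry runs; it defaults to createrepo.
    "${CREATEREPO:-createrepo}" --update "$repo_dir"
}

# Example call, using the paths from the comment above:
# publish_rpm mozilla-python27-mercurial-3.9.1-1.el6.x86_64.rpm \
#     /data/repos/yum/releng/public/CentOS/6/x86_64
```

The slaves still need |yum clean all| before |puppet agent -t|, otherwise they may keep using cached repo metadata and not see the new package.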
FWIW, there has been (as of this writing) one successful L64 trunk hourly build since the outage which had started immediately after buildID=20170420023553. This new build is at http://ftp.mozilla.org/pub/seamonkey/tinderbox-builds/comm-central-trunk-linux64/1492959533/ ; it has buildID=20170423075853 which is after comment #9 (2 hours after if the build ID is in Mozilla time zone or 9 hours after if in UTC).
oops, got my arithmetic wrong: it is 5 hours earlier if in UTC.
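One way to take the guesswork out of the time zone arithmetic: the tinderbox directory name in that URL (1492959533) looks like a Unix epoch timestamp. That is an assumption on my part, nothing in this bug says so. If it is one, it can be rendered in UTC and at a fixed UTC-7 offset and compared with the build ID. Note that |date -d @N| is a GNU extension, so this may not work as-is elsewhere.

```shell
# Values quoted in the comment above.
dir_epoch=1492959533
build_id=20170423075853

# Assumption: the directory name is seconds since the Unix epoch.
# Render it in build-ID format (YYYYMMDDhhmmss) in UTC...
as_utc=$(TZ=UTC0 date -d "@$dir_epoch" +%Y%m%d%H%M%S)
# ...and at a fixed UTC-7 offset. In POSIX TZ syntax "PDT7" means a zone
# named PDT that is 7 hours behind UTC.
as_pdt=$(TZ=PDT7 date -d "@$dir_epoch" +%Y%m%d%H%M%S)

echo "UTC:   $as_utc"
echo "UTC-7: $as_pdt"

if [ "$as_pdt" = "$build_id" ]; then
    echo "build ID matches the UTC-7 rendering"
elif [ "$as_utc" = "$build_id" ]; then
    echo "build ID matches the UTC rendering"
else
    echo "build ID matches neither rendering"
fi
```

By my arithmetic the UTC-7 rendering comes out as 20170423075853, i.e. identical to the build ID, which would mean the build ID is in Mozilla (Pacific) time rather than UTC.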
fixed.
Assignee: nobody → ewong
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Flags: needinfo?(bugspam.Callek)