Closed
Bug 712398
Opened 13 years ago
Closed 13 years ago
Setup buildbot-master21 as a mac test master
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Assigned: bhearsum)
Details
(Whiteboard: [buildmasters][capacity][buildduty])
Attachments
(2 files)
(deleted), patch | armenzg: review+ | bhearsum: checked-in+
(deleted), patch | armenzg: review+ | bhearsum: checked-in+
In bug 712244 we're seeing the PB maximum connection limit being reached, along with CPU wio.
There are 2 masters for Linux slaves, 3 masters for Macosx and 2 masters for Windows.
Nevertheless, there are slightly more Windows machines per silo as their jobs take longer.
I am adding almost another 30 Windows slaves and would like to be ready for it.
Comment 1•13 years ago
|
||
ok, buildbot-master21 is up in scl1, with the root password changed, and has been added to nagios and inventory.
Comment 2•13 years ago
|
||
over to releng for puppetization/configuration.
Assignee: server-ops-releng → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Reporter | ||
Comment 4•13 years ago
|
||
I won't be able to touch it until the week of January 9th, when I come back from vacation.
Priority: -- → P3
Comment 5•13 years ago
|
||
14:16 < nagios-sjc1> buildbot-master21.build.scl1:MySQL connectivity is ACKNOWLEDGEMENT (CRITICAL): CHECK_NRPE: Socket timeout after 10 seconds.;dustin;not set up yet
So the MySQL ACL for this isn't in place yet. The new plan for such tasks is to file a "Server Ops: ACLs" request for the network change and a "Server Ops: Database" request for the MySQL change, so they can be done in parallel. Please do so on the 9th, or earlier if someone else picks this up.
Comment 6•13 years ago
|
||
I missed that I need to modify the number of CPUs *after* building the VM. Arr spotted it, and has modified the VM, so it will come up with 2 CPUs after it's rebooted. I'll leave it to you guys to schedule that (since I'm not sure if it's in prod yet).
Reporter | ||
Comment 7•13 years ago
|
||
I'll use bug 712244 to set it up.
Assignee: armenzg → server-ops-releng
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Component: Release Engineering → Server Operations: RelEng
Priority: P3 → P2
QA Contact: release → zandr
Resolution: --- → FIXED
Reporter | ||
Comment 8•13 years ago
|
||
We'll do the setup in here and leave bug 712244 to determine the way forward.
Assignee: server-ops-releng → nobody
No longer blocks: 712244
Status: RESOLVED → REOPENED
Component: Server Operations: RelEng → Release Engineering
Priority: P2 → --
QA Contact: zandr → release
Resolution: FIXED → ---
Summary: Please create buildbot-master21 → Setup buildbot-master21
Updated•13 years ago
|
Priority: -- → P3
Whiteboard: [buildmasters][capacity]
Comment 9•13 years ago
|
||
Armen: you switched from "I'll" in comment #7 to "We'll" in comment #8 -- who is actually going to do the work here?
Reporter | ||
Updated•13 years ago
|
Assignee: nobody → armenzg
Reporter | ||
Comment 10•13 years ago
|
||
I have not had time to work on this for the last month and I'm not seeing the light at the end of the tunnel, so I'm putting it back in the queue. This is not urgent. It might become more pressing when fewer r4 machines are out of the pool due to dongles.
We now have 3 masters for each testing OS group.
I believe we should have one more macosx test master, given the ratio of #slaves/#masters:
darwin10/darwin11 -> 170
darwin9 -> 61
win7 -> 73
xp -> 67
fed32 -> 73
fed64 -> 68
tests1-linux -> 73+68=141
tests1-windows -> 73+67=140
tests1-macosx -> 170+61=231
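The ratio argument above can be checked with a short sketch. The slave counts and the 3 masters per group are taken from this comment; the dictionary layout and the `slaves_per_master` helper are hypothetical names used only for illustration, not anything from the buildbot configs.

```python
# Slave counts per platform, copied from the comment above.
SLAVES = {
    "darwin10/darwin11": 170,
    "darwin9": 61,
    "win7": 73,
    "xp": 67,
    "fed32": 73,
    "fed64": 68,
}

# Which platforms feed each test master group (as summed in the comment).
GROUPS = {
    "tests1-linux": ["fed32", "fed64"],
    "tests1-windows": ["win7", "xp"],
    "tests1-macosx": ["darwin10/darwin11", "darwin9"],
}


def group_total(group):
    """Total number of slaves attached to one test master group."""
    return sum(SLAVES[platform] for platform in GROUPS[group])


def slaves_per_master(group, masters):
    """Average number of slaves each master in the group has to serve."""
    return group_total(group) / masters


if __name__ == "__main__":
    # Current state: 3 masters per group.
    for group in GROUPS:
        print(f"{group}: {group_total(group)} slaves, "
              f"{slaves_per_master(group, 3):.1f} per master")
    # Hypothetical state: a fourth macosx master (buildbot-master21).
    print(f"tests1-macosx with 4 masters: "
          f"{slaves_per_master('tests1-macosx', 4):.1f} per master")
```

With 3 masters each, Linux and Windows sit at roughly 47 slaves per master while macosx sits at 77. A fourth macosx master brings that down to about 58, which is still the highest of the three groups.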
Assignee: armenzg → nobody
Whiteboard: [buildmasters][capacity] → [buildmasters][capacity][buildduty]
Assignee | ||
Comment 11•13 years ago
|
||
Armen, is this bug simply about getting a master instance up and running on this machine, and putting it in the production pool?
Assignee: nobody → bhearsum
Reporter | ||
Comment 12•13 years ago
|
||
yes, that is correct.
Based on comment 10, having an extra macosx master would be the best use of it.
Thanks!
Assignee | ||
Comment 13•13 years ago
|
||
OK, thanks. I'll try to get this done this week.
Summary: Setup buildbot-master21 → Setup buildbot-master21 as a mac test master
Assignee | ||
Comment 14•13 years ago
|
||
Attachment #602914 -
Flags: review?(armenzg)
Assignee | ||
Comment 15•13 years ago
|
||
Attachment #602916 -
Flags: review?(armenzg)
Assignee | ||
Comment 16•13 years ago
|
||
I added this master to slavealloc and put the proper ssh keys on it. At this point I'm just waiting for reviews, then I can turn it on, I think. Once it's actually up and running we need to update Nagios to look for it, as this check is currently looking for 0 instances of buildbot ;) https://nagios.mozilla.org/nagios/cgi-bin/extinfo.cgi?type=2&host=buildbot-master21.build.scl1&service=buildbot
Reporter | ||
Updated•13 years ago
|
Attachment #602914 -
Flags: review?(armenzg) → review+
Reporter | ||
Updated•13 years ago
|
Attachment #602916 -
Flags: review?(armenzg) → review+
Assignee | ||
Updated•13 years ago
|
Attachment #602914 -
Flags: checked-in+
Assignee | ||
Comment 17•13 years ago
|
||
Comment on attachment 602916
puppet config update
This is landed, and the master has been created and turned on. I've pushed talos-r3-leopard-032 and talos-r4-{snow,lion}-032 to it, and after I see successful jobs from them, I'll enable the master fully.
Attachment #602916 -
Flags: checked-in+
Assignee | ||
Comment 18•13 years ago
|
||
(In reply to Ben Hearsum [:bhearsum] from comment #16)
> Once it's actually up and running we need to update Nagios to look for it, as
> this check is currently looking for 0 instances of buildbot ;)
> https://nagios.mozilla.org/nagios/cgi-bin/extinfo.cgi?type=2&host=buildbot-master21.build.scl1&service=buildbot
Apparently Puppet changes this somewhere, so it's OK now.
Assignee | ||
Comment 19•13 years ago
|
||
I've seen successful builds from all 3 slaves, so I just unlocked them and enabled the master fully. The slaves should rebalance themselves in the next couple of hours.
Assignee | ||
Comment 20•13 years ago
|
||
So, nothing else to do here, woo!
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering