Closed
Bug 858840
Opened 12 years ago
Closed 11 years ago
ns{1,2}.private.phx1 using dhcp
Categories
(Infrastructure & Operations :: Change Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: limed, Assigned: bhourigan)
References
Details
Attachments
(1 file)
(deleted), text/plain
This host is grabbing its IP via DHCP. That sounds like bad mojo, and we should change it to use a static address.
Assignee
Comment 1•12 years ago
Proposed patch to address network configuration
Assignee: server-ops-infra → bhourigan
Status: NEW → ASSIGNED
Reporter
Comment 3•12 years ago
Adding a note on what we talked about over IRC: unless I am going crazy, I had actually tried to change this to a static assignment the last time phx1 had issues with DHCP. For some reason it changed back to DHCP, which I suspect was caused by dhclient, so you might want to take a look at that too.
Assignee
Comment 4•12 years ago
I've changed ns2.private.phx1 to static IP assignment, and made sure dhclient isn't running.
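As a rough illustration of what a DHCP-to-static switch can look like on a Red Hat-style host, the sketch below rewrites the text of an ifcfg file so that `BOOTPROTO` no longer requests DHCP and the address is pinned. The function name `to_static` and all addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical; this is not the attached patch, only a minimal example of the idea.

```python
def to_static(ifcfg_text, ipaddr, netmask, gateway):
    """Return ifcfg text with DHCP replaced by a static address (sketch)."""
    managed = {"BOOTPROTO", "IPADDR", "NETMASK", "GATEWAY"}
    kept = []
    for line in ifcfg_text.splitlines():
        key = line.split("=", 1)[0].strip()
        # Drop existing addressing keys; every other line passes through.
        if key not in managed:
            kept.append(line)
    kept += [
        "BOOTPROTO=none",  # "none" means no boot-time protocol such as DHCP
        f"IPADDR={ipaddr}",
        f"NETMASK={netmask}",
        f"GATEWAY={gateway}",
    ]
    return "\n".join(kept) + "\n"


# Hypothetical input; the real interface file is not shown in this bug.
before = "DEVICE=bond0\nBOOTPROTO=dhcp\nONBOOT=yes\n"
after = to_static(before, "192.0.2.10", "255.255.255.0", "192.0.2.1")
print(after)
```

The rewrite only changes the file contents. A running dhclient process would still need to be stopped separately, as noted in the comment above.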
Assignee
Comment 5•12 years ago
The network changes to ns1 are staged, but I'll need some input on when this can be done. I expect 6 seconds of downtime to restart networking. The default libresolv timeout is 5 seconds, so production services could see query failures. :jabba, please advise how I should proceed.
Flags: needinfo?(jdow)
Comment 6•12 years ago
I think the best course of action would be to: 1) swap the /etc/resolv.conf entries on clients so that ns2 is preferred, which will minimize the impact during the cutover, and 2) do the cutover during the June 1st (or possibly June 2nd) releng maintenance window to further minimize possible impact.
Flags: needinfo?(jdow)
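The resolv.conf swap suggested in option 1 could be sketched roughly as follows: move one `nameserver` entry to the front so the resolver tries it first, leaving every other line alone. The function name `prefer_nameserver`, the search domain, and the addresses are hypothetical placeholders, not the real phx1 values.

```python
def prefer_nameserver(resolv_text, preferred):
    """Return resolv.conf text with `preferred` listed as the first nameserver."""
    lines = resolv_text.splitlines()
    # Indexes of well-formed "nameserver <address>" lines.
    ns_idx = [
        i for i, line in enumerate(lines)
        if len(line.split()) >= 2 and line.split()[0] == "nameserver"
    ]
    servers = [lines[i].split()[1] for i in ns_idx]
    if preferred not in servers:
        return resolv_text  # nothing to reorder
    rest = servers.copy()
    rest.remove(preferred)
    # Reuse the existing nameserver slots so other directives keep their place.
    for i, server in zip(ns_idx, [preferred] + rest):
        lines[i] = f"nameserver {server}"
    return "\n".join(lines) + "\n"


# Hypothetical client config: first entry stands in for ns1, second for ns2.
text = "search example.internal\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n"
swapped = prefer_nameserver(text, "192.0.2.2")
print(swapped)
```

Editing the file does not necessarily change the behaviour of processes that are already running, since some of them read the server list once at startup.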
Assignee
Comment 8•11 years ago
CAB NOTES: I'd like to make a simple change to ns1.private.phx1 so that the IP is configured statically rather than allocated by DHCP. It will involve an edit to /etc/sysconfig/network-scripts/ifcfg-bond0 and a 'service network restart', which will require ~6s of downtime on ns1.private.phx1. The default libresolv timeout is 5s and the default retry count is 2, so I estimate it would take 10s of downtime for applications to observe a failed query. I don't expect any problems, but there is a possibility that some DNS queries will fail during this time. I would prefer to avoid mucking with resolv.conf globally. While we can remove the IP from resolv.conf, many applications (such as Zeus) cache the server list, so we would have to restart many services twice: once to remove the IP, and again to re-add it. I'm flexible on timing. I'm thinking early some morning, like 5AM PST, or during another maintenance window when folks are already anticipating a blip.
Component: Server Operations: Infrastructure → Server Operations: Change Requests
QA Contact: jdow → shyam
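The downtime estimate in the CAB notes is simple arithmetic, and a small sketch makes the comparison explicit: a 5s timeout retried 2 times gives roughly 10s before a query against an unreachable server is reported as failed, which is longer than the expected ~6s restart. The function names are hypothetical, and the model is deliberately simplified; a real resolver also rotates between the configured servers and may adjust timeouts between attempts.

```python
def seconds_until_failure(timeout_s=5, attempts=2, servers=1):
    """Simplified worst case: every attempt against every server times out."""
    return timeout_s * attempts * servers


def outage_causes_failures(outage_s, timeout_s=5, attempts=2):
    """True if the outage lasts at least as long as the resolver keeps retrying."""
    return outage_s >= seconds_until_failure(timeout_s, attempts)


budget = seconds_until_failure()          # 5s timeout * 2 attempts
risky = outage_causes_failures(6)         # the ~6s network restart
print(budget, risky)
```

Under these assumptions the restart fits inside the retry budget, which matches the expectation in the notes that queries would mostly be delayed rather than fail outright.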
Assignee
Updated•11 years ago
Flags: cab-review?
Comment 9•11 years ago
Approved to ride along on the treeclosing window of June 1st.
Flags: cab-review? → cab-review+
Updated•11 years ago
Group: infra
Assignee
Comment 10•11 years ago
Changes applied, ns1 is no longer using dhcp. Is nice!
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
Product: mozilla.org → Infrastructure & Operations
Updated•9 years ago
Change Request: --- → approved
Flags: cab-review+