Closed Bug 625158 Opened 14 years ago Closed 13 years ago

please enable logging of all outbound requests from the build network

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Unassigned)

Details

(Whiteboard: [post fx4])

Attachments

(1 file, 1 obsolete file)

We hit an issue recently where a test depended on an external website and started failing strangely when that website had issues. To help weed out more of these sorts of things, we'd like to start logging all external requests made by buildslaves. That is:
bm-xserve*
moz2-*
linux-ix-*
w32-ix-*
mw32-ix-*
mv-moz2-linux-ix-*
try-*
n900-*
t-r3-w764-*
talos-r3-*
win32-*
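A minimal sketch of how the host patterns above could be applied when filtering log entries by source hostname. It assumes the patterns are shell-style globs and that hostnames are compared case-insensitively; the function name and the sample hostnames in the usage note are hypothetical and are not taken from this bug.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list above; treated here as shell-style globs (an assumption).
BUILD_SLAVE_PATTERNS = [
    "bm-xserve*",
    "moz2-*",
    "linux-ix-*",
    "w32-ix-*",
    "mw32-ix-*",
    "mv-moz2-linux-ix-*",
    "try-*",
    "n900-*",
    "t-r3-w764-*",
    "talos-r3-*",
    "win32-*",
]


def is_build_slave(hostname: str) -> bool:
    """Return True if the short hostname matches any build slave pattern."""
    name = hostname.lower()
    return any(fnmatchcase(name, pattern) for pattern in BUILD_SLAVE_PATTERNS)
```

For example, `is_build_slave("talos-r3-fed-001")` matches `talos-r3-*`, while a host such as `cruncher` matches none of the patterns.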
Joduinn just told me that he wants everything logged, to potentially help us catch an infected machine, so please disregard the list of hosts in comment #0.
Summary: please enable logging of all outbound requests from build slaves inside the build network → please enable logging of all outbound requests from the build network
Enabled logging on fw1.mtv1, fw1.scl1, and fw1.sjc1 for all connections from build to public addresses.

I'll leave this open for tracking and will review data on a daily basis for several days.
Status: NEW → ASSIGNED
OS: Linux → All
Hardware: x86_64 → All
Is it possible to publish the data somewhere public, so that releng and developers can analyze it too?
Logging access is something infrasec handles.
Assignee: network-operations → server-ops
Component: Server Operations: Netops → Server Operations: Security
QA Contact: mrz → clyon
Any update?
(In reply to comment #4)
> Logging access is something infrasec handles.

IT set up logging of all inbound traffic in bug 602741 and gave us the info from those logs. I don't mind whether it's IT or InfraSec who gives us those logs; whatever is fastest works for me.

We are looking for a list of which RelEng systems are attempting to connect to the outside world. We expect this to uncover which test suites are reaching out to external sites, so we can then file bugs with QA. It *might* uncover other surprises, but let's see.


(We need this in place before we can safely *block* access from build-vpn machines to the outside world.)
This should be in Server Operations and not in Security. I am OK with the logs.
Component: Server Operations: Security → Server Operations
QA Contact: clyon → mrz
I've provided Zandr with logs from SCL1, which he is parsing to get some useful data.
Component: Server Operations → Server Operations: Security
Attached file prelim grep-fu on firewall logs (obsolete) (deleted)
Thank you!
What period of time were these captured over, out of curiosity?
The clock on the firewall was messed up, but if my math is right, this is all but the first 9 minutes of 18 January 2011 (UTC).

That's a coincidence, btw; what I got was the current log plus the previous 3 files, which rotate at 10MB.
Speaking of clocks... I'll note again here that I filtered NTP out of the results, since it appears that some machines are using pool.ntp.org. As a result, we hit port 123 on 250 unique IP addresses in 24 hours, of which fewer than 10 are the usual suspects at MS and Apple.

I'll file a bug to use an internal NTP server instead as soon as I have a chat with netops to figure out which one it should be.
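A rough sketch of the kind of tally behind a figure like "250 unique IP addresses on port 123". The firewall log format is not shown in this bug, so the sketch assumes the log lines have already been parsed into `(source, destination, destination_port)` tuples; that record shape and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def unique_destinations_by_port(records):
    """Count distinct destination IPs seen for each destination port.

    `records` is an iterable of (src_ip, dst_ip, dst_port) tuples, which is a
    hypothetical pre-parsed form of the firewall session log.
    """
    seen = defaultdict(set)
    for _src_ip, dst_ip, dst_port in records:
        seen[dst_port].add(dst_ip)
    return {port: len(destinations) for port, destinations in seen.items()}
```

Looking up port 123 in the result gives the number of distinct NTP servers contacted, and dropping that key is one way to filter NTP out before reviewing the rest.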
Component: Server Operations: Security → Server Operations
We use ntp1.build.mozilla.org for build machines, but I bet the talos-r3 boxes never got configured to that level of detail.
Status: ASSIGNED → NEW
Assignee: server-ops → server-ops-releng
Component: Server Operations → Server Operations: RelEng
QA Contact: mrz → zandr
Comment on attachment 504925 [details]
prelim grep-fu on firewall logs

Port 5672 is for RabbitMQ, and we will need to enable inbound and outbound access between our build RabbitMQ server and the main Mozilla Pulse server. Currently our RMQ server is running on cruncher.
Whiteboard: [post fx4]
Marking post-fx4 because turning on logging increases firewall load. Don't want that right now.
(In reply to comment #16)
> Marking post-fx4 because turning on logging increases firewall load. Don't want
> that right now.

Did you turn it off? Otherwise, it's been on since comment 2.
I disabled logging on fw1.mtv1 for all sessions going from build.mtv1 -> *
(In reply to comment #18)
> I disabled logging on fw1.mtv1 for all sessions going from build.mtv1 -> *

Yeah, this is mostly about fw1.scl1. (Talos (10.12.48.0/22) -> internet)
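A small sketch of the "Talos (10.12.48.0/22) -> internet" selection using Python's standard `ipaddress` module. Only the subnet comes from the comment above; the function name is hypothetical, and treating "internet" as any globally routable destination is an assumption.

```python
import ipaddress

# Talos subnet as given in the comment above.
TALOS_NET = ipaddress.ip_network("10.12.48.0/22")


def is_talos_to_internet(src: str, dst: str) -> bool:
    """True if src is inside the Talos subnet and dst is a globally routable address."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return src_ip in TALOS_NET and dst_ip.is_global
```

The /22 covers 10.12.48.0 through 10.12.51.255, so a source such as 10.12.52.1 falls outside it, and private destinations are not counted as internet traffic.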
Oh!  I forget we have more than one releng fw point now.
Attached file Log snapshot from 17 Mar 11 (deleted)
More grep-fu on a recent log snapshot.

Per joduinn, this closes this bug; different bugs will be filed to fix tests that are making outbound connections.

Again, I've filtered out NTP.
Attachment #504925 - Attachment is obsolete: true
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations