Closed Bug 699582 Opened 13 years ago Closed 13 years ago

Doing virus scans shouldn't kill stage

Categories

(Release Engineering :: General, defect, P3)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Unassigned)

References

Details

(Whiteboard: [release][automation])

Attachments

(2 files, 1 obsolete file)

One virus scan job will slow down the ability to rsync out changes to mirrors, presumably by causing a lot of I/O on the netapp. We should figure out how to make the scanning less resource-intensive: bug 699579 is one approach to help with CPU, but it probably doesn't help with the network load (we still have to grab the files to extract them locally). We should also add locking so that we don't try to run more than one job at a time.
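A minimal sketch of the kind of locking suggested above, assuming an advisory `flock` on a local lock file is acceptable. The lock path and the names `single_instance` and `AlreadyRunning` are hypothetical and do not come from the tools repo.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location; a real job would pick its own path.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "virus_scan.lock")


class AlreadyRunning(Exception):
    """Raised when another scan job already holds the lock."""


@contextmanager
def single_instance(lock_path=LOCK_PATH):
    # Open (or create) the lock file, then try to take an exclusive,
    # non-blocking lock. The kernel drops the lock if the process dies.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise AlreadyRunning(lock_path)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

A scan job would wrap its work in `with single_instance():` and exit early on `AlreadyRunning`. Keeping the lock file on local disk rather than on the netapp avoids depending on how `flock` behaves over NFS.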
Whiteboard: [release][automation]
9.0b1 had
* the load on stage/surf up to about 30
* the rsync from there to pv-mirror01 degraded - it got more than 70 minutes behind
* consequently boxes syncing out from there (addons, other products) got behind too
* buildbot-master08 got behind on uploading logs (eg the update verifies)

I've niced off clamd by 10 to help. stage got rebooted 30 days ago so we may have had that in place and lost it. Might be necessary to nice extract_and_run_command.py too.
If scanning is CPU bound, you could try setting the CPU affinity on clamd so that it only maxes out one CPU core and leaves the others available for processing other requests. If I/O is the issue, you could investigate using ionice instead of just nice.
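For the CPU affinity idea, here is a rough sketch using Python's `os.sched_setaffinity`, which is Linux-only. The helper name `pin_to_one_core` is made up for illustration. Pinning clamd itself would need its pid and sufficient privileges, neither of which is shown here.

```python
import os


def pin_to_one_core(pid, core=None):
    """Restrict `pid` to a single CPU core (pid 0 means this process)."""
    allowed = os.sched_getaffinity(pid)
    if core is None:
        # Arbitrary choice: the lowest-numbered core the process may use.
        core = min(allowed)
    if core not in allowed:
        raise ValueError("core %d is not available to pid %d" % (core, pid))
    os.sched_setaffinity(pid, {core})
    return core
```

This limits how much CPU the scanner can take but does nothing for I/O load, which is where `ionice` would come in.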
Attached patch [tools] Have a nice day (obsolete) (deleted) — Splinter Review
This should nice off the decompression of mar and exe files, both CPU and I/O load (best-effort but lowest-priority), so as not to interfere with the other things going on on surf. clamd is going to run at whatever priority the system has set it up as, so not much we can do there. Is dev-stage01 close enough to stage in terms of the OS & setup to be a fair test of this?
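One way to express "best-effort but lowest-priority" is to prefix each decompression command with `nice` and `ionice`. This is an illustrative sketch, not the attached patch: the helper name `low_priority` and the example file name are assumptions.

```python
import shutil


def low_priority(cmd, niceness=19, io_level=7):
    """Return `cmd` prefixed so it runs at low CPU and I/O priority."""
    prefix = ["nice", "-n", str(niceness)]
    # ionice class 2 is best-effort; level 7 is the lowest priority in it.
    # Skip it where the binary is not installed.
    if shutil.which("ionice"):
        prefix += ["ionice", "-c", "2", "-n", str(io_level)]
    return prefix + list(cmd)


# Example: build the argument list for extracting a (hypothetical) installer.
extract_cmd = low_priority(["7z", "x", "installer.exe"])
```

The resulting list can be passed straight to `subprocess.Popen` without a shell, which sidesteps quoting problems.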
Attachment #581394 - Flags: feedback?(rail)
Attached patch [tools] Fix 7z call (deleted) — Splinter Review
Attachment #581394 - Attachment is obsolete: true
Attachment #581394 - Flags: feedback?(rail)
Attachment #581408 - Flags: feedback?(rail)
Comment on attachment 581408 [details] [diff] [review] [tools] Fix 7z call If you run that from bash it should work. However, I'm not sure how it may behave from Popen.
Attachment #581408 - Flags: feedback?(rail) → feedback+
Fair point about the call for 7zip; adding 'bash' at the front might be fine but would need testing. Here's the equivalent of what I've been doing on stage after jobs have started, which I know does mitigate the load quite a bit (helps mar and 7zip decompression, but not 7zip and clam calls). If this is OK please go ahead and land prior to 10.0b1.
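The manual mitigation mentioned above, lowering the priority of a job that is already running, could look roughly like the following. This is only a sketch of the renice step, and the thread does not show whether the landed patch works this way. The helper name `renice` is hypothetical.

```python
import os


def renice(pid, niceness):
    """Raise the nice value of a running process (pid 0 means this one)."""
    current = os.getpriority(os.PRIO_PROCESS, pid)
    # Only ever lower the priority; going the other way needs privileges.
    if niceness > current:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

Processes started after the change inherit the new nice value, so nicing the script early covers any subcalls it spawns later.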
Attachment #583087 - Flags: review?(rail)
Attachment #583087 - Flags: review?(rail) → review+
Comment on attachment 583087 [details] [diff] [review] [tools] Nice extract_and_run_command.py but not all the subcalls http://hg.mozilla.org/build/tools/rev/64997b0bc131
Attachment #583087 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering