Closed Bug 1272137 Opened 9 years ago Closed 8 years ago

Fuzzing builds failing due to missing requests module

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: coop, Assigned: gkw)

References

Details

Attachments

(1 file)

This is happening on Linux and Mac at least. Windows is hitting an unrelated permissions error before it reaches this point.

Cloning into 'funfuzz'...
Cloning into 'lithium'...
Cloning into 'FuzzManager'...
Cloning into 'funfuzz-private'...
Traceback (most recent call last):
  File "funfuzz/bot.py", line 23, in <module>
    import createCollector
  File "/builds/slave/fuzzer-macosx64-lion/funfuzz/util/createCollector.py", line 12, in <module>
    from Collector.Collector import Collector
  File "/builds/slave/fuzzer-macosx64-lion/FuzzManager/Collector/Collector.py", line 29, in <module>
    import requests
ImportError: No module named requests
If you install requests, you'll find it's also missing numpy. This might be the right juncture to "simply" move the fuzzing scripts to mozharness, which has easy support for virtualenvs.
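In the meantime, a quick way to see which modules a slave is actually missing (rough sketch; run it with whichever python the job uses):

for m in requests numpy; do
  python -c "import $m" 2>/dev/null || echo "missing: $m"
done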
We first need a way to get logs. I already mentioned in bug 1224333 that logs would be super useful before we attempt to fix this.
Blocks: 1284666
The buildbot logs are in pvtbuilds2.dmz.scl3.mozilla.com:/mnt/pvt_builds/fuzzing/tinderbox-builds/, you should still have ssh access to them. FWIW, I set up a virtualenv with the requirements for FuzzManager, but bot.py failed because --remote-host and --basedir aren't arguments any more. It might be worth getting a loaner machine to iterate on fixing things up, see https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave.
> The buildbot logs are in
> pvtbuilds2.dmz.scl3.mozilla.com:/mnt/pvt_builds/fuzzing/tinderbox-builds/,
> you should still have ssh access to them.

Yes, I verified that I have ssh access, thanks.

> FWIW, I set up a virtualenv with the requirements for FuzzManager, but
> bot.py failed because --remote-host and --basedir aren't arguments any more.
> It might be worth getting a loaner machine to iterate on fixing things up,
> see https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave.

Those were removed when we moved to FuzzManager[1]. If the releng machines can access fuzzmanager.fuzzing.mozilla.org and we change:

$PYBIN $REPO_NAME/bot.py --remote-host "$FUZZ_REMOTE_HOST" --basedir "$FUZZ_BASE_DIR"

to:

$PYBIN $REPO_NAME/loopBot.py -b "--random" -t "js" --target-time 28800

it might be baby steps to getting this working again. I'll have to take this on next quarter or something later, though.

Nick, can you attach your virtualenv changes here? Also, regarding private API keys to access FuzzManager, how should they be defined/retrieved? They cannot be committed to the public script. [2]

[1] https://github.com/MozillaSecurity/funfuzz/commit/fdce900cab8ba76f049357b4fbf8a884fa396725
[2] https://hg.mozilla.org/build/tools/file/default/scripts/fuzzing/fuzzer.sh#l40
Assignee: nobody → gary
Flags: needinfo?(nthomas)
Connectivity to fuzzmanager.fuzzing.mozilla.org:443 seems fine on all three platforms. I manually constructed a virtualenv; we'd need something like:

virtualenv venv    # probably some fun finding virtualenv on Windows
. venv/bin/activate
pip install -r FuzzManager/requirements.txt
python $REPO_NAME/loopBot.py -b "--random" -t "js" --target-time 28800

We'd have to put the required packages into our internal pypi instance for automation; normal pypi isn't allowed. Is the 28800 target-time consistent with the 20 minutes we normally run jobs for?

Let me reiterate the value of a loaner machine to iterate fast on this! Not something I have any plans to work on.
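If this does go ahead, the install step would point at the internal mirror rather than public pypi, along the lines of the following (the index URL is from memory of our mozharness configs, so double-check it):

pip install --no-index --find-links http://pypi.pub.build.mozilla.org/pub -r FuzzManager/requirements.txt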
Flags: needinfo?(nthomas)
> We'd have to put the required packages into our internal pypi instance for
> automation, normal pypi isn't allowed.

Got it. Should I file a new bug?

> Is the 28800 target-time consistent with the 20 minutes we normally run jobs for?

No, that value is in seconds, so 28800 = 8 hours. For 20 minutes it would be 1200.

What about the issue of private API keys to connect to FuzzManager? Where should they be stored? (I'm trying to ask questions now so future development will be easier; I do acknowledge the value of the loaner machine.)
A dependent bug would be fine for packages on our pypi. For the API credentials the normal method is to put them in a file on the machine (via puppet).
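For illustration, the deployed file would look something like the sketch below; the key names are from memory of the FuzzManager docs, so verify them against Collector's config handling before puppetising anything:

cat > $HOME/.fuzzmanagerconf << 'EOF'
[Main]
serverhost = fuzzmanager.fuzzing.mozilla.org
serverport = 443
serverproto = https
serverauthtoken = PLACEHOLDER-REAL-TOKEN-DEPLOYED-BY-PUPPET
EOF
chmod 600 $HOME/.fuzzmanagerconf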
Attached patch fuzzer1.patch (deleted) — Splinter Review
Here's a first iteration. The pip packages to be added seem to already be present (they were requested several months ago). Two questions:

* The .fuzzmanagerconf credentials are not present in this script. Do I file a (locked) dependent bug?
* I notice in the old script that the GitHub repos are cleared and then re-cloned. Should the virtualenv be deleted and then recreated? Right now only the symlinks inside are deleted; this circumvents a (separate) issue with the virtualenvs whenever Python is updated (rough sketch of the idea below).

I have made this work on a loaner Mac machine, and have landed patches in the main funfuzz repo to fix other issues.
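For reference, the rough shape of the virtualenv handling (simplified sketch only; the attached patch is the actual change):

# Sketch, not the literal patch contents.
if [ ! -d venv ]; then
  virtualenv venv
else
  # After a Python upgrade the interpreter symlinks go stale; drop just those
  # and let virtualenv recreate them instead of rebuilding the whole venv.
  find venv -type l -delete
  virtualenv venv
fi
. venv/bin/activate
pip install --upgrade -r FuzzManager/requirements.txt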
Attachment #8846160 - Flags: review?(nthomas)
CC'ing :decoder, ref the .fuzzmanagerconf credentials.
Also, we need to turn on coredumps for those Mac machines. Are the following lines allowed?

mkdir /cores/
ulimit -c unlimited
Hmmm, coredumps are optional if the js shells have debug information available (either as an unstripped binary or a separate debuginfo file, see bug 1347750).
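On the Mac slaves that would mean keeping symbols next to the shell, roughly like the line below (path is illustrative, and dsymutil needs to run right after the local shell build while the object files are still around):

dsymutil dist/bin/js -o dist/bin/js.dSYM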
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #10)
> Two questions:
> * The .fuzzmanagerconf credentials are not present in this script. Do I file
> a (locked) dependent bug?

On the whole you might be better off spending the time talking to #taskcluster about their secrets service, and whether that's appropriate for these privileges, since taskcluster is the future. If you are unwilling to cross that bridge now, we can ask buildduty to distribute credentials via puppet in a separate bug.

> * I notice in the old script that the GitHub repos are cleared then
> re-cloned. Should the virtualenv be deleted then recreated? Right now only
> the symlinks inside are deleted, this circumvents a (separate) issue with
> the virtualenvs whenever Python is updated.

I'm not sure why the script deletes repos and re-clones; it wouldn't be hard to make it better (but see above about spending time on buildbot vs. taskcluster, and the clones are quick anyway). Handling Python version upgrades is very likely a non-problem in buildbot, but if it does no harm then OK. You've got 'pip install --upgrade' on the requirements install, so that seems fine (ancient version of pip though!).
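For context on the secrets service: inside a task, the taskcluster proxy exposes it over plain HTTP, so retrieval is roughly the one-liner below (the secret name is invented for illustration; check with #taskcluster for the real layout and scopes):

curl -s http://taskcluster/secrets/v1/secret/project/fuzzing/fuzzmanagerconf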
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #12)
> Also, we need to turn on coredumps for those Mac machines. Are the following
> lines allowed?
>
> mkdir /cores/
> ulimit -c unlimited

Ideally we don't do this, but at the very least we should set some value for the limit. There is plenty of disk (>500GB), but there's no point setting yourself up to fill the disk; something should make sure /cores/ is cleaned up too.
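Something along these lines would be more palatable (the numbers are only suggestions):

mkdir -p /cores
ulimit -c 1000000                        # cap core size instead of unlimited (units per ulimit's man page on the slave)
find /cores -type f -mtime +3 -delete    # prune cores more than three days old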
Comment on attachment 8846160 [details] [diff] [review]
fuzzer1.patch

>+$VENV_BINDIR/python -u $FF_NAME/loopBot.py -t "js" --target-time 1200 2>&1 | tee log-loopBotPy.txt

We don't normally run scripts with this redirection, so I'm unsure if buildbot will like it or not.
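One thing to watch if the redirection stays (assuming the step runs under bash): without pipefail, the exit status buildbot sees is tee's rather than loopBot.py's, so failures could be masked. A minimal fix would be:

set -o pipefail
$VENV_BINDIR/python -u $FF_NAME/loopBot.py -t "js" --target-time 1200 2>&1 | tee log-loopBotPy.txt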
Attachment #8846160 - Flags: review?(nthomas) → review+
After discussions with the fuzzing team, we have decided not to proceed in this direction, especially since:

* These machines will go away in the short-to-medium term.
* There are still lots of technical hurdles to overcome; by the time these are fixed across multiple teams, perhaps the machines will have gone away already.
* The fuzzing team has access to EC2 instances for fuzzing.
** Historically we used to get lots of unused power from the build machines when they were not building, but the Windows/Linux ones have already moved to EC2.
* The cost/benefit value of technical investment in fixing these issues is no longer immediately apparent. Whether we move to TaskCluster or not is orthogonal to the discussion and an almost completely separate form of technical investment.

Note about TaskCluster cost/benefit value:

* It is presumed that releng instances will be optimised to use as much of a billed hour as possible, so the unused time we could use for fuzzing is measured in minutes (especially excluding the final 5 minutes, to prevent accidental overruns), rather than hours as in the old system.
* We might still get value out of these unused minutes, especially cumulatively across multiple VMs, but the odds are not in the fuzzing team's favour (not that it's a competition), since the goal of the team billed for the time will be to reduce or optimise the hour usage of these VMs as much as possible.

Resolving WONTFIX for now.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Component: General Automation → General