Closed Bug 602506 Opened 14 years ago Closed 14 years ago

Fix pymake process implementation to run shell commands without using process pool

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla2.0b7

People

(Reporter: ted, Assigned: khuey)

References

Details

Attachments

(2 files)

When I rewrote the main loop in Pymake to support native commands, I used multiprocessing.Pool to run all commands on background processes. This winds up being significantly slower for shell executions, since instead of just running a Popen() in the main process and using waitpid(), we have to serialize some data to a worker process, then run Popen(), signal a condvar, and wait on the condvar in the main process. I don't think we can quite get back to the same architecture, but some hybrid architecture might be better than the current implementation. After chatting with jorendorff, I was thinking: 1) Keep running native commands like we do currently 2) For shell commands, run Popen() in the main process 3) Add a background thread that just blocks on waitpid()/WaitForMultipleObjects() and signals the condvar when it sees a process exit 4) The main process continues to wait on the condvar to see when a job finishes
Assignee: nobody → ted.mielczarek
Attached patch Patch (deleted) — Splinter Review
My measurements indicate that this performs almost identically to the in-tree pymake on my machine.
Attachment #483843 - Flags: review?(ted.mielczarek)
Comment on attachment 483843 [details] [diff] [review] Patch >+ # Wait on local jobs first for perf This is a bogus comment. Ignore it.
Also, after you r+ this we should sync the in-tree and dev versions of pymake.
For enrichment: >>> p.sort_stats('time') <pstats.Stats instance at 0x02294BE8> >>> p.print_stats(25) Sun Oct 17 18:01:10 2010 pymake.log 174154127 function calls (169000697 primitive calls) in 1345.259 CPU seconds Ordered by: internal time List reduced from 943 to 25 due to restriction <25> ncalls tottime percall cumtime percall filename:lineno(function) 26479 1046.144 0.040 1046.144 0.040 {method 'acquire' of '_multiprocessing.SemLock' objects} 792492 28.836 0.000 28.836 0.000 {nt.stat} 263990 19.603 0.000 21.188 0.000 shlex.py:120(read_token) 565356 14.796 0.000 33.137 0.000 parser.py:615(parsemakesyntax) 3207 12.109 0.004 12.109 0.004 {method 'read' of 'file' objects} 6081011/4694053 11.123 0.000 31.710 0.000 data.py:244(get) 4118481 9.091 0.000 12.833 0.000 data.py:410(match) 16747 8.922 0.001 13.995 0.001 process.py:243(_docall) 781749 7.188 0.000 9.755 0.000 data.py:364(__init__) 1744380/951523 7.172 0.000 48.930 0.000 functions.py:61(resolve) 1759628 7.073 0.000 31.887 0.000 parserdata.py:237(execute) 312048 6.418 0.000 49.774 0.000 data.py:939(resolvevpath) 1236665 6.402 0.000 6.402 0.000 __init__.py:1222(getEffectiveLevel) 39201897/39093227 5.667 0.000 5.748 0.000 {len} 1928900 4.524 0.000 4.794 0.000 parser.py:113(iterdata) 251847 4.373 0.000 31.938 0.000 parserdata.py:124(execute) 50831/50246 4.212 0.000 75.422 0.002 data.py:797(resolveimplicitrule) 1577339/583111 4.119 0.000 72.251 0.000 data.py:186(resolve) 1060428 3.867 0.000 7.397 0.000 data.py:1467(gettarget) 736241 3.666 0.000 30.623 0.000 data.py:40(getmtime) 1728495 3.580 0.000 14.550 0.000 data.py:1310(matchesfor) 54535 3.405 0.000 4.075 0.000 data.py:1122(splitcommand) 1344128 3.221 0.000 5.168 0.000 ntpath.py:63(join) 425123/5388 3.219 0.000 111.977 0.021 parserdata.py:497(execute) 35019 2.992 0.000 20.300 0.001 parser.py:375(parsestring) <pstats.Stats instance at 0x02294BE8>
Assignee: ted.mielczarek → khuey
Comment on attachment 483843 [details] [diff] [review] Patch ># HG changeset patch ># Parent f7f475ddc5c0222734a0efd5736a39a0a39f0031 >diff --git a/pymake/process.py b/pymake/process.py >--- a/pymake/process.py >+++ b/pymake/process.py >@@ -1,15 +1,15 @@ > """ > Skipping shell invocations is good, when possible. This wrapper around subprocess does dirty work of > parsing command lines into argv and making sure that no shell magic is being used. > """ > > #TODO: ship pyprocessing? >-from multiprocessing import Pool, Condition >+import multiprocessing, multiprocessing.dummy As discussed on IRC, if you're going to use multiprocessing.dummy, which uses a thread pool, you'll need to take this Python bug into account: http://bugs.python.org/issue1731717 Looks like there's a workaround, at least. >- self.pool = Pool(processes=jcount) >+ self.remotepool = multiprocessing.Pool(processes=jcount) >+ self.localpool = multiprocessing.dummy.Pool(processes=jcount) I think I'd prefer if you explicitly named them "processpool" and "threadpool" since "remote" and "local" are sort of misnomers anyway. Looks fine otherwise, r=me with those changes.
Attachment #483843 - Flags: review?(ted.mielczarek) → review+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
Target Milestone: mozilla2.0b8 → mozilla2.0b7
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: