Closed
Bug 1155451
Opened 10 years ago
Closed 10 years ago
Treeherder log parser blocks on completed download before parsing
Categories
(Tree Management :: Treeherder: Data Ingestion, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: wlach, Assigned: wlach)
Details
Attachments
(2 files)
Currently the Treeherder log parser waits until it has downloaded the full log before unzipping and processing it. We could potentially make it slightly faster by starting to parse the log before the download is complete.
I got curious about how much this could help us, so I wrote something up. Unfortunately, my benchmarking suggests it's not particularly helpful (it usually shaves off between 0.1 and 0.4 seconds), but perhaps it's worth adding anyway.
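For readers who want a picture of the idea, here is a minimal sketch of incremental gzip decompression in Python 3. It is not the code from the attached patch: the helper name `iter_lines_from_gzip_chunks` is hypothetical, the chunks are assumed to come from whatever HTTP client is streaming the log, and a single-member gzip file is assumed.

```python
import gzip
import zlib


def iter_lines_from_gzip_chunks(chunks):
    """Yield text lines as gzip-compressed chunks arrive (hypothetical helper)."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    pending = b""
    for chunk in chunks:
        pending += decompressor.decompress(chunk)
        # Emit every complete line; keep the trailing partial line buffered.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode("utf-8", errors="replace")
    # Whatever is left once the stream ends may hold a final unterminated line.
    pending += decompressor.flush()
    lines = pending.split(b"\n")
    if lines[-1] == b"":
        lines.pop()
    for line in lines:
        yield line.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Simulate a download by slicing a compressed blob into small chunks.
    blob = gzip.compress(b"first line\nsecond line\n")
    fake_download = (blob[i:i + 8] for i in range(0, len(blob), 8))
    for text in iter_lines_from_gzip_chunks(fake_download):
        print(text)
```

The point of the generator is that a parser can consume lines while later chunks are still in flight, instead of waiting for the whole file.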
Assignee
Comment 1•10 years ago
On my workstation, I get this set of 10 results on a largish log without the "optimization":
(eideticker)wlach@eideticker:~/src/treeherder-service$ python t.py http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win64/1429215986/mozilla-central-win64-bm82-build1-build170.txt.gz
[0.8731870651245117, 0.9529199600219727, 0.8774969577789307, 0.9509341716766357, 0.8769950866699219, 0.9473130702972412, 0.863955020904541, 0.9525530338287354, 1.4248991012573242, 1.0506041049957275]
0.977085757256
And this set of results with the optimization:
(eideticker)wlach@eideticker:~/src/treeherder-service$ python t.py http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win64/1429215986/mozilla-central-win64-bm82-build1-build170.txt.gz
[1.0908939838409424, 0.8961830139160156, 0.7244958877563477, 0.8918678760528564, 0.7274820804595947, 0.9782431125640869, 1.1817591190338135, 0.9811978340148926, 0.727877140045166, 0.9805409908294678]
0.918054103851
Assignee
Comment 2•10 years ago
This is like the least urgent thing ever, since it doesn't improve performance that much (see above for benchmarks). I'm kind of on the fence about whether we should even commit it, as it makes the code slightly more complex. I mostly just wanted to get it out there so people would know that I tried it. Anyway, I wouldn't mind a second opinion.
Attachment #8593670 -
Flags: review?(mdoglio)
Assignee
Comment 3•10 years ago
Testing again against a web server on my local machine (i.e., a basically instantaneous download), I still get an average difference of about 0.1 seconds. I guess it partly depends on (1) how fast the machine is, (2) how saturated the CPU is already, and (3) how long the download takes.
This algorithm will show the most improvement on a slow machine with an unsaturated CPU and slow network performance (since we'll take advantage of the long time it takes to download the file to get a jump start on decompression). If the network is the dominating factor (which it seems to be, at least on my workstation), or the CPU is already saturated, expect little improvement from doing things this way.
Assignee
Comment 4•10 years ago
I updated the PR to include a more realistic test program (which we can now run any time) that actually uses Treeherder's log parser artifact builder classes. In this case, the difference in speed tends to be greater (about 0.2 seconds on average):
Before:
(venv)vagrant@local:~/treeherder-service$ ./manage.py test_parse_log --profile 10 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1429500689/mozilla-central-linux-bm74-build1-build125.txt.gz
Timings: [1.5258538722991943, 1.6284010410308838, 1.7903828620910645, 1.7481331825256348, 2.2356438636779785, 1.8331339359283447, 1.715242862701416, 2.031848907470703, 1.862015962600708, 1.8552160263061523]
Average: 1.82258725166
Total: 18.2258725166
After:
(venv)vagrant@local:~/treeherder-service$ ./manage.py test_parse_log --profile 10 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1429500689/mozilla-central-linux-bm74-build1-build125.txt.gz
Timings: [1.6193771362304688, 1.614927053451538, 1.5895788669586182, 1.4775769710540771, 1.4487788677215576, 1.5538148880004883, 1.507270097732544, 2.427928924560547, 1.3921799659729004, 1.4321610927581787]
Average: 1.60635938644
Total: 16.0635938644
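For anyone wanting to reproduce this kind of measurement outside of Treeherder, a rough sketch of a timing loop that reports the same three figures (Timings, Average, Total) could look like the following. The names `profile` and `report` are hypothetical and this is not the `test_parse_log` management command itself; it also assumes Python 3 and `time.perf_counter`, which may differ from what the real command uses.

```python
import time


def profile(func, runs):
    """Call func() `runs` times and return (timings, average, total) in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    total = sum(timings)
    return timings, total / runs, total


def report(timings, average, total):
    """Print results in the same layout as the output shown above."""
    print("Timings: %s" % timings)
    print("Average: %s" % average)
    print("Total: %s" % total)


if __name__ == "__main__":
    # Stand-in workload; a real run would download and parse a log here.
    report(*profile(lambda: sum(range(100000)), 10))
```

Repeating the run several times and looking at the whole list matters here, since single outliers (such as the 2.4 second run above) can move the average noticeably.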
Updated•10 years ago
Attachment #8593670 -
Flags: review?(mdoglio) → review+
Comment 5•10 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/ae995bf6fa2fdf6c6ff365525adfba0774daa5a7
Bug 1155451 - Don't block on full log download before parse
Assignee
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED