Closed
Bug 1165356
Opened 9 years ago
Closed 8 years ago
Use requests everywhere instead of urllib / urllib2 / httplib
Categories
(Tree Management :: Treeherder, defect, P3)
Tree Management
Treeherder
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
Attachments
(1 file, 1 obsolete file)
Bug 1144417 already switched treeherder-client over to using requests rather than httplib. There are no other uses of httplib in the repo, but we have urllib / urllib2 calls sprinkled all over the place that we should replace with requests. This will fix things like bug 1165335, since we can avoid using urllib.urlencode().
The admin node has requests installed, so we can even switch over the update.py deployment script - which currently uses not just urllib2 but urllib too.
Stage/prod venv have version 2.6.2 installed and the admin node has 2.7.0, so both support the new .post() json param - which means we can can simplify some of our existing requests usage too.
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 1•9 years ago
|
||
https://github.com/mozilla/treeherder/search?q=urllib
https://github.com/mozilla/treeherder/search?q=urllib2
(We won't want to touch the peep.py instances).
Assignee | ||
Comment 3•9 years ago
|
||
This will fix a few things (eg bug 1165335, possibly some of the ETL leaks), so let's do this sooner rather than later.
Priority: P3 → P2
Attachment #8623783 -
Flags: feedback?(emorley)
Assignee | ||
Comment 6•9 years ago
|
||
Comment on attachment 8623783 [details]
PR 640
Left a comment - re-request feedback/review when that change is made, so flake8 passes and the tests can run :-)
Attachment #8623783 -
Flags: feedback?(emorley)
Comment 7•9 years ago
|
||
:KWierso what's the current status of this bug? Is attachment 8623783 [details] ready for a r?
Updated•9 years ago
|
Flags: needinfo?(wkocher)
Assignee | ||
Comment 8•9 years ago
|
||
The PR hasn't been updated since the feedback request in comment 6.
(Side note: camd also has a PR open for this bug, from a while ago https://github.com/mozilla/treeherder/pull/637)
I'd rather someone else take over this.
Flags: needinfo?(wkocher)
Comment 11•9 years ago
|
||
Yeah, I'm happy to take this on. I'll revisit my PR and see if it can be(or needs to be) merged with wes' and made ready for review.
Flags: needinfo?(cdawson)
Updated•9 years ago
|
Assignee: nobody → cdawson
Blocks: 1188661
Assignee | ||
Comment 12•9 years ago
|
||
As part of this work, we'll end up having to revert bug 1155451 (streaming the gzipped log and parsing as we do so), unless we can find a way to do similar with requests. If that's not possible with requests, IMO we should file a feature request against requests and then still switch to it, since the simplification if offers is preferable (again IMO) to the perf optimisation from streaming.
Comment 13•9 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #12)
> As part of this work, we'll end up having to revert bug 1155451 (streaming
> the gzipped log and parsing as we do so), unless we can find a way to do
> similar with requests. If that's not possible with requests, IMO we should
> file a feature request against requests and then still switch to it, since
> the simplification if offers is preferable (again IMO) to the perf
> optimisation from streaming.
Requests offers an API to let you stream content:
http://docs.python-requests.org/en/latest/user/quickstart/#raw-response-content
I would push back against any patch which didn't preserve the optimization. Log parsing speed is still an issue when we're trying to recover from problems -- simplicity of implementation is less of a concern for a 20-line, well-isolated block of code.
Assignee | ||
Comment 14•9 years ago
|
||
The perf gain was pretty small in bug 1155451, and that was testing locally rather than on a datacenter that's both closer to where the logs are located (at least for FTP) and also likely with a faster connection. The streaming also obscured the problem with slow downloads recently, since it distorted the profiles shown in New Relic.
So I think it's not as straightforwards a tradeoff as it seems at first glance.
Though either way I think this comes down to benchmarks (like it did for Cython) - we can run them with and without - and on the actual treeherder stage (or Heroku) nodes to make a decision later :-)
Assignee | ||
Comment 15•9 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #14)
> The streaming also obscured the problem with slow downloads recently,
> since it distorted the profiles shown in New Relic.
There may also be more we can do to annotate the method so this is clearer in New Relic.
Comment 16•9 years ago
|
||
Sure, profiles (which is something that should be done regardless) and/or compelling arguments on maintainability could convince me otherwise. That said, replacing this chunk of code is pretty low priority unless there's something I'm missing...
Assignee | ||
Comment 17•9 years ago
|
||
I didn't say it was high priority :-)
James queried in IRC why we were using a chunk size of 4K, which reminded me about the implementation and how it might conflict with the desire to move to the requests library.
Now we know there is a way to implement similar with requests, deciding whether to stream or not stream is orthogonal to this bug and can be decided at another point.
Assignee | ||
Comment 18•9 years ago
|
||
I'm going to be touching parts of this for bug 1230610 et al.
Assignee: cdawson → emorley
Assignee | ||
Comment 19•9 years ago
|
||
This will be helped by the refactoring in bug 1230610.
Depends on: 1230610
Assignee | ||
Comment 20•9 years ago
|
||
I have a WIP for the ETL parts using urllib2, in bug 1240809. With that fixed, it just leaves the log parser iirc, which can be the focus of this bug.
Assignee | ||
Updated•9 years ago
|
Assignee: emorley → nobody
Assignee | ||
Updated•9 years ago
|
Priority: P2 → P3
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → emorley
Updated•8 years ago
|
Assignee | ||
Updated•8 years ago
|
Comment 21•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Attachment #8849352 -
Flags: review?(cdawson)
Assignee | ||
Updated•8 years ago
|
Attachment #8623783 -
Attachment is obsolete: true
Comment 22•8 years ago
|
||
Comment on attachment 8849352 [details]
[treeherder] mozilla:logparser-requests > mozilla:master
Awesome! Nice work! :)
Attachment #8849352 -
Flags: review?(cdawson) → review+
Comment 23•8 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/9c14899e0cfe781bb90be13f0534ee36672b9550
Bug 1165356 - Log parser: Improve readability of test_parse_log output
It's only ever consumed by humans, so readability takes priority.
https://github.com/mozilla/treeherder/commit/823189e4c7793131cc2b0491853a5234eaec8153
Bug 1165356 - Log parser: Simplify error messages in post_log_artifacts
Prior to bug 1284289, exceptions of type `urllib2.URLError` had
different retry behaviour, however there's now no need for the
conditional just for the sake of a different error message wording.
https://github.com/mozilla/treeherder/commit/6717d9f5fc71ff2477968b827a55344d765e13bd
Bug 1165356 - Log parser: Fetch logs using requests instead of urllib2
Doing so also fixes incorrect line numbers in logs that have Windows
line endings (bug 1328880), hence having to update the expected logview
output.
The tests have to be adjusted to use responses, since requests doesn't
support `file://` URLs.
https://github.com/mozilla/treeherder/commit/aff2152a810b5d68d17252669185aca9f34d28bd
Bug 1165356 - Log parser: Remove tests workaround for file:// log URLs
Now that we're using requests, the log URL being used to access the file
is consistent across environments (since it doesn't reference the local
directory structure), so we don't need to exclude it from comparisons.
Assignee | ||
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•