Block more default http library User Agents
Categories
(Tree Management :: Treeherder: API, enhancement, P1)
Tracking
(Not tracked)
People
(Reporter: emorley, Assigned: emorley)
Attachments
(1 file)
(deleted),
text/x-github-pull-request
For ~3 years we've asked that consumers of Treeherder's REST API set an appropriate User Agent when making requests to our API:
https://treeherder.readthedocs.io/rest_api.html#user-agents
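As a rough illustration of what is being asked of API consumers, here is a minimal sketch using only the Python standard library. The User Agent string, contact address and endpoint URL are placeholders invented for this example, not values taken from the Treeherder docs. The request is only built here, not sent.

```python
from urllib.request import Request

# Hypothetical identifier: name your tool and give a way to contact you.
USER_AGENT = "my-treeherder-client/1.0 (contact: you@example.com)"


def build_request(url: str) -> Request:
    """Return a request that carries an explicit User-Agent header."""
    # Without this header, urllib would send its default "Python-urllib/x.y".
    return Request(url, headers={"User-Agent": USER_AGENT})


# Placeholder URL, used only to construct the request object.
req = build_request("https://treeherder.mozilla.org/api/")
```

Passing `req` to `urllib.request.urlopen` would then send the custom header instead of the library default.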
This has been enforced by blocking the default User Agents of popular http libraries:
https://github.com/mozilla/treeherder/blob/9bc1da2c78f73c273c4a149a7a779f4d88ee7b1c/treeherder/config/settings.py#L247-L256
However, whilst looking at New Relic for something else today, I saw a few more User Agents that could do with blocking:
Go 1.1 package http
Go-http-client/*
node-fetch/*
-> https://github.com/mozilla/firefox-health-backend/issues/53
python-requests/*
Python-urllib/*
reqwest/*
-> https://github.com/jgraham/fetchlogs/issues/5
All but the ones marked with GitHub issues are for non-legitimate traffic (e.g. scraping robots.txt or looking for PHP/... vulnerabilities), so blocking them will have no impact.
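For illustration, the listed User Agents could be matched with a list of compiled regular expressions, which is the shape Django's `DISALLOWED_USER_AGENTS` setting expects. This is a self-contained sketch rather than the contents of the linked settings file: the exact patterns and the `is_disallowed` helper are assumptions made for this example.

```python
import re

# Assumed patterns, one per User Agent listed above. Anchoring at the start
# avoids matching custom User Agents that merely mention a library name.
DISALLOWED_USER_AGENTS = [
    re.compile(r"^Go 1\.1 package http"),
    re.compile(r"^Go-http-client/"),
    re.compile(r"^node-fetch/"),
    re.compile(r"^python-requests/"),
    re.compile(r"^Python-urllib/"),
    re.compile(r"^reqwest/"),
]


def is_disallowed(user_agent: str) -> bool:
    """Hypothetical helper: True if the User Agent matches a blocked pattern."""
    return any(pattern.search(user_agent) for pattern in DISALLOWED_USER_AGENTS)
```

A request from `Go-http-client/1.1` would be rejected under these patterns, while a descriptive custom value such as `my-tool/1.0` would pass.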
Comment 1•6 years ago
The two Python-related entries in comment 2 are already blocked.
The Go-related ones are only being used to scrape the homepage and robots.txt, so they are safe to block. This was determined using New Relic Insights and the query:
SELECT count(`request.uri`) FROM Transaction FACET `request.uri` WHERE user_agent LIKE 'Go%http%' SINCE 7 DAYS AGO LIMIT 200
Comment 2•6 years ago
Comment 3•6 years ago