Closed
Bug 1230222
Opened 9 years ago
Closed 8 years ago
[Meta] Encourage tools that interact with our API to set informative user agents
Categories
(Tree Management :: Treeherder: API, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
(Keywords: meta)
There are times when we're looking at New Relic or gunicorn logs and are trying to work out where a request originated from.
For submissions to us, we now have the hawk client_id to help inform this, however:
1) This doesn't help identify GETs
2) The client_id is in the auth header, which isn't present in the gunicorn logs or New Relic transaction traces (albeit the latter will be helped by bug 1124278), unlike the user agent
treeherder-client uses a user agent of eg:
treeherder-pyclient/1.8.0
TreeBot uses eg:
TreeBot/0.1
We should try to identify other tools that don't set a custom UA, and file bugs/open PRs to add one.
There are also places within Treeherder itself where we should be setting a UA but don't (eg the bugscache lookups, which don't use treeherder-client) - plus we should of course do the right thing with requests we make to third-party services too (like hg.mozilla.org).
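As a rough illustration of what "setting a custom UA" means for a script that talks to the API directly, here is a minimal sketch using only the Python standard library. The tool name `example-tool/0.1` and the URL path are placeholders chosen for this example, not values taken from any real client, and no request is actually sent.

```python
import urllib.request

# Hypothetical tool name and version, used only for illustration.
# The name/version form follows the examples above (e.g. TreeBot/0.1).
USER_AGENT = "example-tool/0.1"


def build_request(url: str) -> urllib.request.Request:
    """Return a request object that carries a descriptive User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


# Placeholder URL; building the Request does not touch the network.
req = build_request("https://treeherder.mozilla.org/api/")
# urllib stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```

Tools built on other HTTP libraries would set the same header through whatever mechanism that library provides.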
Assignee
Comment 1•9 years ago
Hi Saptarshi! I don't suppose you could set a custom user agent for the script that was mentioned in bug 1230179 comment 2? It will just allow us to more easily tell where requests are coming from in the case of API deprecation, or when requests are causing too much load etc (examples other tools use are in comment 0 here).
Thanks :-)
Flags: needinfo?(sguha)
Comment 2•9 years ago
Absolutely. I've changed everything and my requests ought to have
"SaptarshiGuhaTalos/1.0"
as the user agent.
If you'd like a more canonical string, I can change it easily.
Flags: needinfo?(sguha)
Assignee
Comment 3•9 years ago
That's great - thank you :-)
Assignee
Comment 4•9 years ago
Only use treeherder-client (which sets a UA):
https://github.com/mozilla/mozilla_ci_tools
https://github.com/adusca/try_extender
https://github.com/chmanchester/trigger-bot
https://github.com/mozilla/releasetasks
https://hg.mozilla.org/build/puppet
Already set a UA:
https://github.com/globau/treebot
Browser based, so browser UA + referrer is fine:
https://hg.mozilla.org/hgcustom/version-control-tools/
Have a PR open to add a UA:
https://github.com/mozilla/mozmill-ci
https://github.com/mozilla-raptor/post-to-treeherder
https://github.com/mozilla/autophone
https://github.com/jmaher/alert_manager
https://github.com/mozilla/pulse_actions
https://github.com/sydvicious/mozplatformqa-jenkins
https://github.com/mjzffr/treeherder-submission-example
https://github.com/mozilla/funsize
Left:
treeherder-node (bug 1191403)
Assignee
Comment 5•9 years ago
I keep on finding more - it's amazing how many projects are using our API now!
https://github.com/mnoorenberghe/mozscreenshots
https://github.com/h4writer/arewefastyet
https://github.com/dminor/ouija
https://github.com/klahnakoski/TestLog-ETL
https://github.com/jcranmer/m-c-tools-code-coverage
Assignee
Comment 6•9 years ago
Looking much more useful now (and some of the dependent bugs aren't deployed yet):
90261 treeherder-pyclient/2.0.1
73589 HTTP-Monitor/1.1
60359 ouija
46617 treeherder/treeherder.mozilla.org
15304 treeherder-nodeclient/0.7.0
2926 autophone
2307 SaptarshiGuhaTalos/1.0
817 NewRelicPinger/1.0 (677903)
425 TreeBot/0.1
416 mozscreenshots/0.3.1
410 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
133 curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.18 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
97 funsize
70 python-requests/2.9.1
26 Twitterbot/1.0
8 Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25
4 mozplatformqa-jenkins
4 -
1 Safari/11601.4.4 CFNetwork/760.2.6 Darwin/15.3.0 (x86_64)
1 Python-urllib/1.17
1 Goldfire Server
1 Go 1.1 package http
1 Flamingo_SearchEngine (+http://www.flamingosearch.com/bot)
The blank UA entries are:
IP-REDACTED - - [18/Feb/2016:06:31:56 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:08:23:54 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:08:57:40 +0000] "POST /api/project/gaia-master/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:11:40:47 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"
...guessing gaia-taskcluster perhaps? (I can't check whether it's been deployed due to a Heroku bug not letting me access the app since it's locked, even though admins are supposed to be able to do so; have filed https://help.heroku.com/tickets/336512).
The curl entries are all to /server-status?auto - and are due to the deploy script's drain/undrain feature.
The Python-urllib entry is to /revision.txt?cachescramble=1455818831.65 and is due to whatsdeployed:
https://github.com/peterbe/whatsdeployed/blob/21cdd8350ad074fd0c0573a6a61f611e52695325/app.py#L68
Assignee
Comment 7•9 years ago
I think we're virtually ready to block non-specific UAs (for non-browser clients only):
[emorley@treeherder1.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.mozilla.org/access_log |
grep -v 'Mozilla/' | sort | uniq -c | sort -nr
42058 treeherder-pyclient/2.0.1
33387 treeherder/treeherder.mozilla.org
32401 HTTP-Monitor/1.1
19662 ouija
7965 treeherder-nodeclient/0.7.0
6243 SaptarshiGuhaTalos/1.0
2178 autophone
364 NewRelicPinger/1.0 (677903)
172 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
168 TreeBot/0.1
32 mozscreenshots/0.3.1
18 funsize
9 mozmill-ci
6 -
3 Twitterbot/1.0
1 IrssiUrlLog/0.2
1 Flamingo_SearchEngine (+http://www.flamingosearch.com/bot)
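For anyone who wants to repeat this tally without shell access, the awk/grep/sort pipeline above can be approximated in Python. This is a sketch under the assumption that the log uses the quoted layout seen in the entries in this bug, where the user agent is the sixth double-quote-delimited field; the function name is made up for this example, and ordering among user agents with equal counts may differ from the shell output.

```python
from collections import Counter


def count_user_agents(lines):
    """Count non-browser user agents in access log lines.

    Mirrors: awk -F\" '{print $6}' | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
    """
    counts = Counter()
    for line in lines:
        fields = line.split('"')
        if len(fields) < 6:
            continue  # line does not have the expected quoted fields
        user_agent = fields[5]  # awk's $6 is index 5 when counting from zero
        if "Mozilla/" in user_agent:
            continue  # browser UAs are excluded, as with grep -v
        counts[user_agent] += 1
    # Highest count first, like the final `sort -nr`.
    return counts.most_common()


# Sample lines copied from the log excerpts quoted in this bug.
sample = [
    'IP-REDACTED - - [18/Feb/2016:06:31:56 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"',
    'IP-REDACTED - - [18/Feb/2016:08:23:54 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"',
    '[12/May/2016:12:14:56 +0000] "GET /revision.txt?cachescramble=1463055296.49 HTTP/1.0" 200 41 "-" "Python-urllib/1.17"',
]
for user_agent, count in count_user_agents(sample):
    print(count, user_agent)
```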
Assignee
Comment 8•9 years ago
Latest:
[emorley@treeherder1.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.mozilla.org/access_log | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
46894 treeherder/treeherder.mozilla.org
41783 ouija
41686 HTTP-Monitor/1.1
36042 treeherder-pyclient/2.1.0
11579 treeherder-nodeclient/0.7.0
6517 SaptarshiGuhaTalos/1.0
2684 autophone
1975 treeherder-pyclient/2.0.1
1115 Go-http-client/1.1
473 NewRelicPinger/1.0 (677903)
228 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
223 TreeBot/0.1
206 mozmill-ci
178 curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.18 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
167 funsize
117 mozscreenshots/0.3.1
26 Opera/9.80 (X11; Linux x86_64; Edition Linux Mint) Presto/2.12.388 Version/12.16
24 mozplatformqa-jenkins
20 python-requests/2.9.1
7 Twitterbot/1.0
6 wpt-fetchlogs
6 ltx71 - (http://ltx71.com/)
1 Scrapy/1.0.5 (+http://scrapy.org)
1 -
And for stage:
[emorley@treeherder1.stage.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.allizom.org/access_log | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
45639 treeherder/treeherder.allizom.org
41563 HTTP-Monitor/1.1
29195 treeherder-pyclient/2.1.0
10699 treeherder-nodeclient/0.7.0
1003 treeherder-pyclient/2.0.1
439 NewRelicPinger/1.0 (677903)
187 mozplatformqa-jenkins
111 arewefastyet
100 mozmill-ci
89 treeherder-pyclient/1.8.0
5 autophone
1 ltx71 - (http://ltx71.com/)
The Go UAs were of form:
GET /api/project/try/artifact/100032679/
The libcurl ones are for server-status, and so are not affected by DRF blacklisting:
/server-status
The python-requests ones:
//api/project/mozilla-aurora/jobs/?job_guid=79d27713-76c6-4aaa-a86c-c143851b2745
//api/project/mozilla-aurora/resultset/?revision=ca6ab5be342e
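To make the idea of blocking non-specific UAs concrete, the check could look something like the sketch below. This is not the code used by Treeherder: the prefix list is an assumption assembled from the generic library UAs seen in the logs above, and the function name is invented for this example. How such a check would be wired into DRF is left out.

```python
# Assumed list of generic library UA prefixes; the real blacklist contents
# are not given in this bug.
GENERIC_UA_PREFIXES = (
    "python-requests/",
    "Python-urllib/",
    "Go-http-client/",
    "curl/",
)


def is_generic_user_agent(user_agent: str) -> bool:
    """Return True if the UA looks like an unmodified HTTP library default."""
    # str.startswith accepts a tuple and matches any of its entries.
    return user_agent.strip().startswith(GENERIC_UA_PREFIXES)


for ua in ("python-requests/2.9.1", "treeherder-pyclient/2.1.0", "ouija"):
    print(ua, is_generic_user_agent(ua))
```

Matching on the library prefix rather than the full string means a version bump in the library (say, a newer python-requests) would still be caught.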
Assignee
Comment 9•8 years ago
On prod, the only remaining UA that matches the blacklist is:
python-requests/2.9.1
...which I believe to be leftover machines that didn't get the fix from bug 1248277 deployed.
On stage there was just:
[12/May/2016:12:14:56 +0000] "GET /revision.txt?cachescramble=1463055296.49 HTTP/1.0" 200 41 "-" "Python-urllib/1.17"
-> whatsdeployed; have filed:
https://github.com/peterbe/whatsdeployed/issues/13
[12/May/2016:06:07:36 +0000] "GET /a2billing/ HTTP/1.1" 400 26 "-" "python-requests/2.9.1"
-> Some spam / someone scanning for exploitable frameworks or similar.
Assignee
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED