Open Bug 1507454 Opened 6 years ago Updated 3 years ago

"Bad Request Error parsing headers: 'limit request headers fields size'" due to hundreds of com.auth0.auth cookies

Categories

(Tree Management :: Treeherder: Frontend, defect, P1)

defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: aryx, Unassigned)

References

Details

Calling treeherder, e.g. https://treeherder.mozilla.org/#/jobs?repo=autoland , fails at the moment with Bad Request Error parsing headers: 'limit request headers fields size' The Cookie request header has 8210 characters and looks like this: csrftoken _ga sessionid 52x com.auth0.auth.<random string>
Firefox's (Nightly) cookie manager claims there are 126 cookies for treeherder.mozilla.org
What are the expiry times for the auth0 cookies like? (eg all around the same time, in the future/past?)
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Priority: -- → P1
Summary: server response "Bad Request Error parsing headers: 'limit request headers fields size'" because many com.auth0.auth cookies in cookie header → "Bad Request Error parsing headers: 'limit request headers fields size'" due to hundreds of com.auth0.auth cookies
Prod is currently using auth0.js 9.8.1. Stage has 9.8.2, though it doesn't look like the two fixes it includes would help: https://github.com/auth0/auth0.js/blob/master/CHANGELOG.md#v982-2018-11-13 The auth0 library switched from localstorage to cookies in v9.8.0, which we picked up (via 9.8.1) in this PR: https://github.com/mozilla/treeherder/pull/4078 ...which was deployed to stage on 2018-10-25 and finally prod on 2018-11-13 (ignoring the brief spell on 2018-11-12 before the rollback). So I guess that would explain why we're only seeing it for the first time now. Though from IRC it seems like we haven't yet ruled out that this might be a "Firefox Multi Account Container" extension issue? https://mozilla.logbot.info/treeherder/20181115#c15614416
Looking at my own cookies for treeherder.mozilla.org, I had two: * `com.auth0.auth.<random string>` - Expires: "Mon, 12 Nov 2018 13:48:06 GMT" * `com.auth0.auth.<random string>` - Expires: "Tue, 13 Nov 2018 11:13:11 GMT" ...both of those dates are in the past - and Firefox Nightly had been restarted since then. So guessing either: 1) expired cookies are intentionally only lazily deleted 2) expired cookie deletion is actually broken The next question is "how is it possible to end up with so many cookies?". I found that if I deleted the cookie for `auth.mozilla.auth0.com` (so as to stop the automatic SSO signin/redirect), then every time I click "sign in" in Treeherder (but importantly don't complete the sign in process and close the new tab that was opened) a new `treeherder.mozilla.org` cookie with name `com.auth0.auth.<random string>` is created. Perhaps this also occurs as part of the "refresh tokens in the background" behaviour? ie: If the SSO session expired, but Treeherder's UI is polling every 15 minutes for a new token, then perhaps across multiple tabs these junk cookies could accumulate? (eg overnight if laptop left on) Though having just tried to emulate that locally with `yarn start` by setting the refresh interval to 5 seconds and then deleting the 4 `auth.mozilla.auth0.com` cookies, I only get one junk cookie before the refresh stops.
(In reply to Ed Morley [:emorley] from comment #4) > Though having just tried to emulate that locally with `yarn start` by > setting the refresh interval to 5 seconds and then deleting the 4 > `auth.mozilla.auth0.com` cookies, I only get one junk cookie before the > refresh stops. Ah but each time I refresh the Treeherder page it tries once more. So the STR is perhaps: 1) Sign into Treeherder 2) Open many Treeherder tabs and use during workday 3) For whatever reason the Auth0 SSO session expires/needs a non-silent login etc 4) Each open Treeherder tab then ends up creating one extra cookie straight after, plus any time those pages are refreshed (including when navigating/filtering in such a way that the jobs-view does a full page reload) or a new tab is opened, a new cookie is created 5) For whatever reason Firefox doesn't expire the cookies straight away or at all 6) When enough cookies accumulate, the requests to Treeherder itself (which include these cookies) fail, due to the header size limit here: https://devcenter.heroku.com/articles/http-routing#http-validation-and-restrictions
Oh but expired cookies are not sent with HTTP requests (I tested just now). So this means that the header size limit must have been reached using cookies that were not expired. So either expiry not set correctly by the auth0.js library or else all of them created within the 15 minute window. Hmm.
(In reply to Ed Morley [:emorley] from comment #2) > What are the expiry times for the auth0 cookies like? (eg all around the > same time, in the future/past?)
Flags: needinfo?(aryx.bugmail)
Sorry, have removed all cookies to get TH working. Only have the content of the Cookie header handy. Ping me if you need it. At the moment, I have 2 cookies for treeherder.mozilla.org. Sheriff Cosmin reports 6.
Flags: needinfo?(aryx.bugmail)
Totally understandable that they were deleted :-) However at the moment I don't think there's enough information to make significant progress on this. Could you keep an eye on the cookie counts and report back if they continue to grow?
Assignee: emorley → nobody
Status: ASSIGNED → NEW
Current status: 7 treeherder cookies, 5 of those are com.auth0.* and they all have expired between Friday and yesterday. See also bug 691973, not sure if related.
Thank you. So in comment 6 I had the realisation that expired cookies are not sent with the request. So that leaves: * cookies with no expiration (or at least an expiration massively far in the future) that are slowly accumulating * cookies with the expected 15 minute expiration but have all somehow been generated in a short space of time Each case would imply very different causes. It sounds like all of yours at the moment are expired, which is fine (ie wouldn't cause this bug).

Have you heard of this happening again since?

Flags: needinfo?(aryx.bugmail)

No.

Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(aryx.bugmail)
Resolution: --- → WORKSFORME

:whimboo was not able to retrigger tasks because of this issue (failure message Unable to retrigger/add job. Taskcluster: can't access property "id", c[e] is undefined). I have ~100 com.auth0.auth.... cookies (and hit the issue years ago).

Sarah, are you aware of a change which could explain this? Or is this bug 530594 (but still lacks an indicator why this started to become an issue again).

Foot note: Treeherder stopped logging me out overnight like it did in the past.

Status: RESOLVED → REOPENED
Flags: needinfo?(sclements)
Resolution: WORKSFORME → ---

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #14)

:whimboo was not able to retrigger tasks because of this issue (failure message Unable to retrigger/add job. Taskcluster: can't access property "id", c[e] is undefined). I have ~100 com.auth0.auth.... cookies (and hit the issue years ago).

I'm not seeing how the unable to retrigger a job error is related to this auth0 cookies issue. What would help to debug that is to have the specific job in question/steps to reproduce for that error.

Flags: needinfo?(sclements)

Henrik had tried to retrigger this task. Logging out and in hadn't resolved it but clearing the Treeherder cookies fixed it for him.

You need to log in before you can comment on or make changes to this bug.