760910 - Handle 401 responses correctly

Reporter

Description

•

12 years ago

We're refreshing tokens every 4 minutes when there are pending requests. We shouldn't refresh tokens unless they fail, and also not assume we know how long they're good for (this should be server-controlled). Don't fail on a 401 the first time, update token and try again (if we're still failing, die and log that the server is boned, and maybe even report it via some magic API!)

Jason Smith [:jsmith]

Updated

•

12 years ago

Whiteboard: [blocking-aitc]

Jason Smith [:jsmith]

Updated

•

12 years ago

Whiteboard: [blocking-aitc] → [blocking-aitc+]

Ryan Kelly [:rfkelly]

Comment 1

•

12 years ago

Also worth noting: when you refresh your token you may be assigned a different endpoint URL than you had previously. If the endpoint URL is persisted anywhere then it may need to be updated.

Anant Narayanan [:anant]

Reporter

Comment 2

•

12 years ago

Ryan, can we add a field to the token response that says how long it is valid for? If you agree, let's file a bug for it and block this one on that?

Tarek Ziadé (:tarek)

Comment 3

•

12 years ago

We'll add a 'duration' field in the mapping returned by the GET see http://docs.services.mozilla.com/token/apis.html

Tarek Ziadé (:tarek)

Updated

•

12 years ago

Assignee: nobody → alexis

Alexis Metaireau (:alexis)

Comment 4

•

12 years ago

Should be fixed in the last master (r+ by tarek), see: - https://github.com/mozilla-services/tokenserver/commit/78d07d876953863ccd1425d68ee2a75aa0b5e41d - https://github.com/mozilla-services/tokenserver/commit/9f2dab269db6eb6caa55700fe4f60eab9ca4406d - https://github.com/mozilla-services/tokenserver/commit/9c9ce4593d270538deca9b5a4f4b06f37c118e71 - https://github.com/mozilla-services/docs/commit/6deb4a5fafeab10a9edfea94982cfd865fa49d2b

Toby Elliott [:telliott]

Comment 5

•

12 years ago

While I can see merit to having a timestamp (I know it's gone in and out of the spec several times), I don't think this should be a blocking issue. If the client gets a 401, it should head back to the tokenserver, and if it gets another 401, sleep for some period of time (1/2 hour?). There may be scenarios where an unexpired token gets rejected - most notably if we have to pull a key. Relying on the timestamp to tell us the token remains legitimate seems dangerous.

Toby Elliott [:telliott]

Comment 6

•

12 years ago

Also, looking at the Duration field in the revised docs, it seems like that should return a Date rather than an integer. If I was building a client for the api, first thing I'd need to do would be to figure out a date from that, so that I could know without having to preserve context.

Alexis Metaireau (:alexis)

Comment 7

•

12 years ago

The problem with returning a date is that it is relative to the server and not to the client, and thus can be innacurate for the latter. Having a way to know what's the delay between two token generation calls might be of use (for instance to setup a callback on the client, which will update the token to a new one automatically)

Toby Elliott [:telliott]

Comment 8

•

12 years ago

(In reply to Toby Elliott [:telliott] from comment #5) > There may be scenarios where an unexpired token gets rejected - most notably > if we have to pull a key. Relying on the timestamp to tell us the token > remains legitimate seems dangerous. (for clarification, there is value to an expiry - the client may *know* they need to get a new one and save a 401 round-trip, but that's a separate issue from the one described here)

Toby Elliott [:telliott]

Comment 9

•

12 years ago

(In reply to Alexis Metaireau (:alexis) from comment #7) > The problem with returning a date is that it is relative to the server and > not to the client, and thus can be innacurate for the latter. > The date presumably has some sort of timezone affixed to it that can be parsed and converted into whatever relevant time we want. That's worked for HTTP for some time. If that's not sufficient, then unix timestamp would do. Also, if the date is inaccurate, that shouldn't be a problem. Two scenarios: 1) The client thinks the token has expired even though it has not, goes and gets a new one. All good (save a degenerate case where the client immediately thinks the token has expired. That seems extreme) 2) The client thinks the token has not expired where it has, it tries to use it, gets a 401 and goes and gets the new token. All good.

Tarek Ziadé (:tarek)

Comment 10

•

12 years ago

@Toby: The client just needs a duration so it optimizes the time between two token server calls -- to avoid extra 401s. Why do you want us to send the client a date that's specific to the server ? How will the client know when it should call again the TS server ? I am curious to see how you are going to implement this. With a duration, in pseudo-code it's: def get_token_from_server(): token = server.get('/token') callback(get_token_from_server, when=now+token['duration']) We don't bother with timezones, or complicated data parsing. We just know that the token gets deprecated around now + duration

Toby Elliott [:telliott]

Comment 11

•

12 years ago

You're correct if all we plan to use it for is the pseudocode, in which case, all good. I'm thinking about persistence across restarts.

James Bonacci [:jbonacci]

Updated

•

12 years ago

Whiteboard: [blocking-aitc+] → [blocking-aitc+][qa+]

Anant Narayanan [:anant]

Reporter

Comment 12

•

12 years ago

(In reply to Toby Elliott [:telliott] from comment #5) > While I can see merit to having a timestamp (I know it's gone in and out of > the spec several times), I don't think this should be a blocking issue. If > the client gets a 401, it should head back to the tokenserver, and if it > gets another 401, sleep for some period of time (1/2 hour?). > > There may be scenarios where an unexpired token gets rejected - most notably > if we have to pull a key. Relying on the timestamp to tell us the token > remains legitimate seems dangerous. The intention behind blocking on this was that it seemed beneficial to have token expiry time being controlled by the server. Ops could, for instance, tune the expiry time to manage load.

Anant Narayanan [:anant]

Reporter

Comment 13

•

12 years ago

(In reply to Tarek Ziadé (:tarek) from comment #10) > def get_token_from_server(): > token = server.get('/token') > callback(get_token_from_server, when=now+token['duration']) This is almost exactly what we were planning to implement, so it'd be better to have a duration as opposed to a date. We are currently not persisting the token across restarts.

Anant Narayanan [:anant]

Reporter

Comment 14

•

12 years ago

(In reply to Alexis Metaireau (:alexis) from comment #4) > Should be fixed in the last master (r+ by tarek), see: Is this available on the dev or staging server now? I'd like to test a client patch for this.

Alexis Metaireau (:alexis)

Comment 15

•

12 years ago

Anant, this is okay. We have 0.9 in staging and in production now. Please mark it as resolved if everything works for you.

Status: NEW → ASSIGNED

Anant Narayanan [:anant]

Reporter

Comment 16

•

12 years ago

Not closing this quite yet, since we also need to finish up the client side portion of this. Thanks guys!

Assignee: alexis → nobody

Status: ASSIGNED → NEW

Jason Smith [:jsmith]

Comment 17

•

12 years ago

I'd hold off on flagging qa+ until the patch is heading to land (cause that's the time when the parameter matters).

Whiteboard: [blocking-aitc+][qa+] → [blocking-aitc+]

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Updated

•

12 years ago

Assignee: nobody → ndesaulniers

Anant Narayanan [:anant]

Reporter

Updated

•

12 years ago

Status: NEW → ASSIGNED

OS: Linux → All

Hardware: x86 → All

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Updated

•

12 years ago

Depends on: 760909

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 18

•

12 years ago

Attached patch rev1 (obsolete) (deleted) — Details — Splinter Review

Attachment #640378 - Flags: review?(gps)

Gregory Szorc [:gps]

Comment 19

•

12 years ago

Comment on attachment 640378 [details] [diff] [review] rev1 Review of attachment 640378 [details] [diff] [review]: ----------------------------------------------------------------- Nit: Set proper summary line for commit message. Bug 760910 - Handle 401 responses in AITC client code ::: services/aitc/modules/client.js @@ +369,4 @@ > // Set values from X-Backoff and Retry-After headers, if present > _setBackoff: function _setBackoff(req) { > let backoff = 0; > + let successfulStatusCodes = [201, 204, 401]; 401 isn't really a "successful status code." Perhaps rename to indicate the intended purpose? statusCodesWithoutBackoff? @@ +405,4 @@ > // Fuzz backoff time so all client don't retry at the same time > backoff = Math.floor(Math.random() * backoff) + backoff; > this._state.set("backoff", "" + (time + backoff)); > + this._log.debug("CLIENT SETTING BACKOFF TO: " + backoff); No need to put this in caps. You may also want to crank up the log level to info, as this is more important than debug, IMO. ::: services/aitc/modules/manager.js @@ +30,4 @@ > * methods on this object until the callback has been invoked, doing so will > * lead to undefined behaviour. > */ > +function AitcManager(cb, premadeClient, premadeToken) { Add docs for premadeToken's purpose. You may also want to use default values to explicitly set this to null instead of undefined. @@ +34,5 @@ > this._client = null; > this._getTimer = null; > this._putTimer = null; > > this._lastToken = 0; Consider renaming to _lastTokenTime. @@ +35,5 @@ > this._getTimer = null; > this._putTimer = null; > > this._lastToken = 0; > + this._tokenDuration = 240000; // 4 minutes Why did you nuke the constant initial value? Magic numbers should almost always be constified. @@ +226,5 @@ > _validateToken: function _validateToken(func) { > + let timeSinceLastToken = Date.now() - this._lastToken; > + this._log.debug('TIME SINCE LAST TOKEN: ' + timeSinceLastToken); > + this._log.debug('TOKEN DURATION: ' + this._tokenDuration); > + this._log.debug('TOKEN FLAG: ' + this._invalidTokenFlag); Nit: Double quotes. Nit: No need for caps. Nit: These can all be in one log line. @@ +244,4 @@ > this._refreshToken(function(err, done) { > if (!done) { > this._log.warn("_checkServer could not refresh token, aborting"); > return; So if there is an error, func is never invoked. This seems wrong and likely indicates a bug and poor test coverage. Either fix it with the next patch (perhaps ask Anant first) or file a follow-up bug blocking release. @@ +245,5 @@ > if (!done) { > this._log.warn("_checkServer could not refresh token, aborting"); > return; > } > + self._log.debug("IN REFRESH TOKEN CALLBACK L243 WITH NEW TOKEN"); "Got refreshed token." @@ +266,4 @@ > return; > } > > + this._log.debug('IN CHECKSERVER'); Say what you are actually doing rather than dropping a debug label. @@ +278,5 @@ > this._client.getApps(function gotApps(err, apps) { > if (err) { > // Error was logged in client. > + if (err.authfailure) { > + self._log.debug('IN GETAPPS'); "Validating token due to client auth failure." @@ +279,5 @@ > if (err) { > // Error was logged in client. > + if (err.authfailure) { > + self._log.debug('IN GETAPPS'); > + self._validateToken(function (err) { Nit: always name functions. @@ +327,4 @@ > return; > } > > + this._log.debug('IN PROCESSQUEUE'); Nit: Better log message. @@ +342,4 @@ > // Send to end of queue if unsuccessful or err.removeFromQueue is false. > if (err && !err.removeFromQueue) { > self._log.info("PUT failed, re-adding to queue"); > + self._log.debug("ABOUT TO CHECK FOR AUTHFAILURE"); Is this necessary? @@ +344,5 @@ > self._log.info("PUT failed, re-adding to queue"); > + self._log.debug("ABOUT TO CHECK FOR AUTHFAILURE"); > + // Error was logged in client. > + if (err.authfailure) { > + self._log.debug(err.authfailure); Remove @@ +345,5 @@ > + self._log.debug("ABOUT TO CHECK FOR AUTHFAILURE"); > + // Error was logged in client. > + if (err.authfailure) { > + self._log.debug(err.authfailure); > + self._log.debug("IN PUT APPS, AUTHFAILURE OCCURRED"); "Auth failure when ..." @@ +352,5 @@ > + if (!err) { > + self._log.debug("SHOULD'VE GOTTEN A NEW TOKEN"); > + //self._log.debug("RESCHEDULING PUT"); > + //_reschedule(); > + //return; Eliminate dead code if it is really dead. @@ +501,5 @@ > + this._tokenDuration = parseInt(this._premadeToken.duration); > + this._log.debug("SETTING TOKEN DURATION TO: " + this._tokenDuration); > + this._lastToken = Date.now(); > + this._invalidTokenFlag = false; > + this._log.debug("UPDATED CLIENT TOKEN WITH PREMADE TOKEN"); Consolidate log messages. @@ +532,4 @@ > _gotToken: function _gotToken(err, tok, cb) { > if (!err) { > this._log.info("Got token from server: " + JSON.stringify(tok)); > + this._tokenDuration = parseInt(tok.duration); parseInt takes an optional radix argument. Always define it if you know what the integer representation is (you do here). ::: services/aitc/tests/unit/test_aitc_manager.js @@ +157,5 @@ > + mockRequestCount++; > + _('!!! MOCK REQUEST !!!'); > + _(manager._lastToken); > + _("Mock Request Count " + mockRequestCount); > + _('!!!!!!!!!!!!!!!!!!!!'); This seems excessive. ::: services/common/tokenserverclient.js @@ +114,4 @@ > * key (string) HTTP MAC shared symmetric key. > * endpoint (string) URL where service can be connected to. > * uid (string) user ID for requested service. > + * duration (string) the validity duration of the issued token. When did the token server change? We should split out this change into a separate bug and include unit test updates to reflect the server's new behavior. I filed bug 772261.

Attachment #640378 - Flags: review?(gps)

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 20

•

12 years ago

crap, forgot to qref the removal of my debug code. Now my noob-ness will forever be immortalized on bugzilla... patch rev2 inbound

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 21

•

12 years ago

Attached patch rev2 (obsolete) (deleted) — Details — Splinter Review

how embarrassing!

Attachment #640378 - Attachment is obsolete: true

Attachment #640406 - Flags: review?(gps)

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Updated

•

12 years ago

Attachment #640406 - Flags: review?(gps)

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 22

•

12 years ago

Sorry, I also need to make changes that Greg recommended.

Anant Narayanan [:anant]

Reporter

Comment 23

•

12 years ago

(In reply to Gregory Szorc [:gps] from comment #19) > When did the token server change? We should split out this change into a > separate bug and include unit test updates to reflect the server's new > behavior. > > I filed bug 772261. This was changed on the server (stage) about 3 weeks ago, see comment #4.

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 24

•

12 years ago

Attached patch rev3 (obsolete) (deleted) — Details — Splinter Review

Attachment #640406 - Attachment is obsolete: true

Attachment #640699 - Flags: review?(gps)

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Updated

•

12 years ago

Blocks: 760902

Gregory Szorc [:gps]

Comment 25

•

12 years ago

Comment on attachment 640699 [details] [diff] [review] rev3 Review of attachment 640699 [details] [diff] [review]: ----------------------------------------------------------------- What's the relationship between this and bug 760909? I would prefer to review them serially. I /think/ bug 760909 comes before this one?

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 26

•

12 years ago

Yes, I reuse test code from my previous patch in bug 760909.

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Updated

•

12 years ago

Attachment #640699 - Flags: review?(gps)

(not reading bugmail) Nick Desaulniers [:\n]

Assignee

Comment 27

•

12 years ago

Attached patch AITC Client Handle 401 Responses (deleted) — Details — Splinter Review

Attachment #640699 - Attachment is obsolete: true

Attachment #642042 - Flags: review?(gps)

Gregory Szorc [:gps]

Comment 28

•

12 years ago

Comment on attachment 642042 [details] [diff] [review] AITC Client Handle 401 Responses Review of attachment 642042 [details] [diff] [review]: ----------------------------------------------------------------- Overall I think this is good. There are some places where repetitive code could get consolidated. But, I think that could be the realm of a follow-up bug. ::: services/aitc/modules/client.js @@ +376,4 @@ > // Set values from X-Backoff and Retry-After headers, if present. > _setBackoff: function _setBackoff(req) { > let backoff = 0; > + let statusCodesWithoutBackoff = [200, 201, 204, 401]; Need 304. Bitrotted. ::: services/aitc/modules/manager.js @@ +39,5 @@ > this._putTimer = null; > > + this._lastTokenTime = 0; > + this._tokenDuration = INITIAL_TOKEN_DURATION; > + this._premadeToken = premadeToken || null; Use default argument values. @@ +339,5 @@ > + self._log.error(err); > + } else { > + self._putApps(); > + } > + }); I've seen this pattern before. You think it's worth uplifting into a generic function? function _onValidateAndRun(fn, err) { if (err) { this._log.error(error); } fn.call(this); } Usage: this._validateToken(this._onValidateAndRun.bind(this, this._getApps)); ::: services/aitc/tests/unit/test_aitc_manager.js @@ +100,5 @@ > + manager._lastTokenTime = Date.now(); > + let mockRequestCount = 0; > + let clientFirstToken = null; > + > + server.onRequest = function mockstatus () { Nit: no space before ( @@ +115,5 @@ > + case 2: > + // Check that the client obtained a different token. > + do_check_neq(client.token.id, clientFirstToken.id); > + do_check_neq(client.token.key, clientFirstToken.key); > + server.stop(run_next_test); This will kill the server as it is serving the request. Does the test shut down OK? If so, great. That being said, it would be preferred to wait for the response before moving on to the next test.

Attachment #642042 - Flags: review?(gps) → review+

Anant Narayanan [:anant]

Reporter

Comment 29

•

12 years ago

https://hg.mozilla.org/services/services-central/rev/1498357e3144

Whiteboard: [blocking-aitc+] → [blocking-aitc+], [fixed in services]

Jason Smith [:jsmith]

Comment 30

•

12 years ago

Is this possible to verify from an end-user perspective?

Whiteboard: [blocking-aitc+], [fixed in services] → [blocking-aitc+], [fixed in services], [qa?]

Anant Narayanan [:anant]

Reporter

Comment 31

•

12 years ago

Nope.

Whiteboard: [blocking-aitc+], [fixed in services], [qa?] → [blocking-aitc+], [fixed in services], [qa-]

Gregory Szorc [:gps]

Comment 32

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/1498357e3144

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Whiteboard: [blocking-aitc+], [fixed in services], [qa-] → [blocking-aitc+][qa-]

BMO Automation

Updated

•

6 years ago

Product: Web Apps → Web Apps Graveyard

rev1 12 years ago (not reading bugmail) Nick Desaulniers [:\n] (deleted), patch		Details \| Diff \| Splinter Review
rev2 12 years ago (not reading bugmail) Nick Desaulniers [:\n] (deleted), patch		Details \| Diff \| Splinter Review
rev3 12 years ago (not reading bugmail) Nick Desaulniers [:\n] (deleted), patch		Details \| Diff \| Splinter Review
AITC Client Handle 401 Responses 12 years ago (not reading bugmail) Nick Desaulniers [:\n] (deleted), patch	gps : review+	Details \| Diff \| Splinter Review