Closed Bug 1022678 Opened 10 years ago Closed 10 years ago

MSISDN — Lifetime of the sessionToken

Categories

(Cloud Services Graveyard :: MobileID, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rhubscher, Assigned: rhubscher)

Details

(Whiteboard: [qa?])

Attachments

(2 files)

(deleted), text/x-github-pull-request
rhubscher
: review+
Details
(deleted), text/x-github-pull-request
tarek
: review+
Details
We need to define how long we want to keep a sessionToken valid. With a valid sessionToken, you can generate a new certificate for the linked validated MSISDN. This is useful because you can generate a short-lived BrowserID certificate and regenerate a new one each time you need it, without asking the user to go through the full validation process again. At first I was thinking of building a long-term BrowserID certificate using the validation endpoint and getting rid of the sessionToken as soon as the certificate was generated. But this doesn't fit well with the lack of a CRL for BrowserID certificates, so short-lived BrowserID certificates are better. We then need to decide how the MSISDN server should handle that: how long do we want the sessionToken to be valid, and how do we want it to expire? One proposal is to touch the sessionToken expiry each time we build a new certificate. Is this the correct behavior? What should the certificate lifetime and the token lifetime be?
What we do on the FxA side is to declare the sessionToken to last "forever until revoked", so it has no expiration time, but if the user ever changes their password (or otherwise indicates that they want to revoke access from one of their devices), we delete the sessionToken from the server. That starts the revocation process. The process isn't complete until all issued certificates have expired, which we constrain by using 12-24 hour lifetimes in those certificates. This is a tradeoff between work (shorter certificates means more renewal network traffic and CPU costs for signing), latency of revocation (shorter certificates are incorrectly valid for less time), and behavior during unreachability of the server (shorter certificates mean downtime turns into inability-to-log-in). The Sync servers have some extra tricks to reduce the window during which a recently-revoked device can still get access, and we've kicked around some ideas to generalize these techniques so non-Sync services can use them. But in general, I'd plan to tolerate 12-24 hours of "leftover" access.
https://github.com/mozilla/fxa-auth-server/blob/master/routes/sign.js#L30 is where fxa-auth-server enforces a maximum limit of 24 hours on the "duration" parameter requested by the client. Clients can ask for shorter-lived certificates if they want. In the gecko tree, services/fxaccounts/FxAccounts.jsm (around line 465, in the getAssertion function) is where the client requests a signed certificate when necessary (also look at the various *_LIFETIME constants in FxAccountsCommon.js around line 75). We ask for a new cert unless we already have one which will still be valid in 5 minutes. We use any given keypair for 12 hours, and we ask for 6-hour-long certs. So in practice, the browser will hit fxa-auth-server's "sign" API four times a day, can tolerate 6-hour outages without losing Sync, and can continue making Sync calls for 6 hours after the sessionToken has been revoked (modulo the other tricks that the Sync server uses).
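The renewal rules described above can be sketched as follows. This is only an illustration of the numbers quoted in this comment (6-hour certs, 12-hour keypairs, a 5-minute validity margin, a 24-hour server cap); the function and constant names are hypothetical and are not the identifiers used in FxAccounts.jsm or fxa-auth-server.

```python
from dataclasses import dataclass
from typing import Optional

# Values quoted in the comment above; the names are made up for this sketch.
CERT_LIFETIME_S = 6 * 3600          # client asks for 6-hour certificates
KEYPAIR_LIFETIME_S = 12 * 3600      # client reuses a keypair for 12 hours
MIN_REMAINING_S = 5 * 60            # keep a cert only if valid 5 minutes from now
SERVER_MAX_DURATION_S = 24 * 3600   # server-side maximum for "duration"


@dataclass
class Cert:
    expires_at: float  # seconds since some epoch


def needs_new_cert(cert: Optional[Cert], now: float) -> bool:
    """True unless we hold a cert that will still be valid in 5 minutes."""
    return cert is None or cert.expires_at <= now + MIN_REMAINING_S


def needs_new_keypair(created_at: Optional[float], now: float) -> bool:
    """True when there is no keypair or it has been in use for 12 hours."""
    return created_at is None or now - created_at >= KEYPAIR_LIFETIME_S


def is_duration_allowed(requested_s: int) -> bool:
    """Server-side check: the requested lifetime must not exceed 24 hours."""
    return 0 < requested_s <= SERVER_MAX_DURATION_S
```

With 6-hour certificates the client ends up calling the signing endpoint roughly four times a day, which matches the figure given above.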
I support what warner has already said.

IMO, for this API, session tokens should probably last forever until revoked. When do we revoke them? One likely situation is:

* sessionToken1 is currently associated with a msisdn
* another device successfully verifies the msisdn (with sessionToken2)

In this case, sessionToken1 should probably be revoked because we have a strong signal that the msisdn is probably associated with a different device (with sessionToken2). Another situation is when a device calls /unregister.

The Loop server will also need to respond when a MSISDN changes hands. E.g.,

* device1 verifies a msisdn, creates an assertion, and registers itself with the Loop server
* device2 acquires the msisdn, verifies it, creates an assertion, and registers itself with the Loop server

In the first step, the Loop server probably has a msisdn -> device1 mapping of some kind, but after the second step it should likely destroy that mapping and create a new entry msisdn -> device2.

One problem with using expiry-based revocation is having to deal with what warner calls the "slap fight" problem. If device1 tries to re-assert ownership of the msisdn (e.g., because its signed assertion hasn't expired yet), the Loop server needs to decide who should maintain control and who should get an error. Luckily, we have the lastVerifiedAt property in the signed certificate. The server can allow the device that verified last to maintain control.
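A minimal sketch of the "last verifier wins" rule proposed in this comment, assuming a 1-1 msisdn -> device mapping on the consumer side. The `Claim` type, the `resolve_claim` function and the in-memory dict are hypothetical; only the lastVerifiedAt property comes from the comment.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Claim:
    device: str            # hypothetical device label
    last_verified_at: int  # lastVerifiedAt taken from the signed certificate


def resolve_claim(owners: Dict[str, Claim], msisdn: str, incoming: Claim) -> bool:
    """Accept the claim only if it was verified at least as recently as the
    current owner's; return False when the caller should get an error."""
    current = owners.get(msisdn)
    if current is None or incoming.last_verified_at >= current.last_verified_at:
        owners[msisdn] = incoming
        return True
    return False
```

Under this rule device1 cannot win the number back with an older, still-unexpired assertion: its lastVerifiedAt is lower than device2's, so the claim is refused.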
I'd like to have a way for the database to clean itself up for broken devices that didn't revoke their token. Also, if I want to connect from more than one device at the same time, or from two apps that use the same device and both ask for a MSISDN token, I don't want my sessionToken to be revoked automatically each time somebody validates a new token for that MSISDN. We don't have such a thing as a device ID on Loop; the only thing we have is the SimplePush URL, and you can have more than one for a given account, since you can be logged in on your mobile and desktop at the same time, and more generally on more than one device at the same time.

So we have several issues:

- How do we deal with database garbage collection?
- How do we deal with number owner changes?
- How do we deal with multiple apps and devices for a given MSISDN?

The first question is about defining what "forever" is in a finite world. Can we say that if the user didn't use their apps for ONE_YEAR it is OK to ask them to verify their number again? Is 6 months, or 1, 2, or 3 months, enough?

For the second question, on the client side, do we have a way to link the sessionToken to a SIM card and force its revocation on SIM card change?

And about the last question, I think we should let the user have many sessionTokens for the same MSISDN. This makes sense in a multiple-device world, and each time a new certificate is requested we touch the sessionToken expiry so that it is valid "forever" again.

The token is valid forever as long as it is used. If after a "forever" time period the token is not used anymore, we can just garbage collect it.
(In reply to Chris Karlof [:ckarlof] from comment #3)
> I support what warner has already said.
>
> IMO, for this API, session tokens should probably last forever until revoked. When do we revoke them? One likely situation is:
>
> * sessionToken1 is currently associated with a msisdn
> * another device successfully verifies the msisdn (with sessionToken2)
>
> In this case, sessionToken1 should probably be revoked because we have a strong signal that the msisdn is probably associated with a different device (with sessionToken2).

I wouldn't expire the token in this case, unless we want to go the Whatsapp way (only allow a MSISDN to be registered on one device at a time). That would mean we wouldn't be able to be logged in on Loop-FirefoxOS and Loop-Desktop with the same number at the same time (which I think is a use case we most definitely want to allow), or just use the same MSISDN on two devices (I have my SIM duped, in fact). I think that we could:

* give this as an option (when registering a new device for a number that already has a device registered, ask the user if he wants to unregister the other devices or not)
* automatically call /unregister on a device when the iccID of the SIM that was present when the registration was done changes.

Note that this is an inclusive or (we can do one, the other, or preferably both :)).

> * device1 verifies a msisdn, creates an assertion, and registers itself with the Loop server
> * device2 acquires the msisdn, verifies it, creates an assertion, and registers itself with the Loop server
>
> In the first step, the Loop server probably has a msisdn -> device1 mapping of some kind, but after the second step it should likely destroy that mapping and creates a new entry msisdn -> device2.

Again, I think this is a use case we actually want to support (the msisdn -> device relationship is 1-n, not 1-1).

> One problem with using expiry based revocation is having to deal with that warner calls the "slap fight" problem. If device1 tries to re-assert ownership of the msisdn (e.g., because its signed assertion hasn't expired yet), the Loop server needs to decide who should maintain control and who should get an error. Luckily, we have the lastVerifiedAt property in the signed certificate. The server can allow the device who verified last to maintain control.
I like the "verify my number and drop all existing tokens" option, because then the user knows what's going on.
I believe we are mixing two things here. On one side we have the Mobile ID API, which provides a way to obtain a verified phone number. On the other side we have the consumers of this API, which may use this verified phone number to build their own login system, for their carrier billing system, or to spam the user about Viagra if they want. Our only known consumer so far is Loop, but we may have others soon, for example the Marketplace. We are talking about the expiration of the Mobile ID API session token, not the Loop session token.

With that said, I believe that automatically expiring the Mobile ID session token on one device because another device uses the same identity might lead to a very confusing UX. I doubt that the user will understand why she is being logged out of the Marketplace in her Firefox Desktop when she logs in to Loop on her FxOS device, and vice versa. And even if she understands, it will probably still be quite frustrating for her. It would be for me at least, especially if the verification process costs me money (SMS).

(In reply to Rémy Hubscher (:natim) from comment #4)
> For the second question, on client side, do we have a way to link the sessionToken to a SIM card and force its revocation on SIM card change?

Yes, we can. In fact, we are already asking the user whether she wants to use the old SIM or the new one when the old SIM is removed from the device; we associate session tokens with ICC IDs as long as we know the ICC ID. Note that this is not always the case, because we allow the user to verify external phone numbers whose ICC ID is unknown to us.

> And about the last question, I think we should let the user have many sessionToken for the same MSISDN this makes sense in a multiple device world and each time a new certificate is asked we touch the sessionToken expiracy to be valid forever again.
>
> The token is valid forever as soon as it is used.
> If after a "forever" time period the token is not used anymore we can just garbage collect it.

I like this option. We have to define what "forever" means though.

(In reply to Antonio Manuel Amaya Calvo (:amac) from comment #5)
> I think that we could:
> * give this as an option (when registering a new device for a number that already has a device registered, ask the user if he wants to unregister the other devices or not)

We can do that, but we have to be very careful about the UX and how we tell the user about the side effects. Maybe the login step is not the best place to do this. The user will be in the context of "I want to log in to Loop" and may end up wondering "why the hell my login in Loop logged me out of the Marketplace!?".

> * Automatically call /unregister on a device when the iccID of the SIM that was present when the registration was done changes.

Yes, we can do that, and note that this won't affect other devices. Mobile ID session tokens are unique per device.

> Note that this is an inclusive or (we can do one, the other, or preferably both :)).

Indeed.
Whiteboard: [qa?]
I've mulled these issues over with :warner and here are my thoughts:

MSISDN session token expiration
-------------------------------

Session tokens can be created and associated with a verified MSISDN, and also created but never used to actually finish the verification flow. "Unverified" tokens can be pruned after some time (e.g., 1 day), but I think "verified" tokens should be good until we choose to revoke them (more on revocation below). Although we may want to expire them in the future, I don't think it's necessary to decide on that now. What we should do now is:

1) Ensure clients support the 401/110 error (invalid token), and
2) Ensure the server updates the "last used time" whenever a token is used.

This should give us the flexibility to expire/revoke tokens in the future, if necessary. I do think it's a good idea to ask users to periodically re-verify that they control the MSISDN, but this should be discussed at the product/UX level, because it's not free, both from an SMS-sending perspective and a UX perspective.

MSISDN session token revocation
-------------------------------

I'm reversing my previous opinion from above. I agree, we should allow multiple MSISDN verification session tokens to exist for the same MSISDN. I don't agree with/follow all the arguments made above regarding use cases, but I do agree that users might want to verify a MSISDN for sufficiently different purposes that a second verification should not necessarily revoke the session token resulting from a previous verification.

The problem of MSISDN re-use remains for Loop. The correct technical solution will be driven by product/UX requirements, which is another good reason to enforce this policy in Loop itself and not in shared infrastructure. IMO, policies for MSISDN reuse are important, but out of scope for this bug.

tl;dr: IMO, a MSISDN session token associated with a verified MSISDN should have an indefinite lifetime. We may implement expiration policies in the future, but I see no reason to do so now. A MSISDN session token that is not used to complete a verification flow can be pruned after a reasonable period of time. If we want to open a different bug to discuss the "multiple devices for a single MSISDN" use case, I'd be happy to participate.
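The policy summarized in this comment could look roughly like the sketch below: unverified tokens are pruned after a fixed period, verified tokens never expire, and every use records a "last used time" so that an expiration policy can be added later. This is an in-memory illustration only; the class and method names are hypothetical, and the 1-day value is the example figure from the comment rather than a decided setting.

```python
from dataclasses import dataclass
from typing import Dict, List

UNVERIFIED_TTL_S = 24 * 3600  # "e.g., 1 day"; the real value is still to be defined


class InvalidToken(Exception):
    """Unknown or pruned token; a server would map this to the 401/110 error."""


@dataclass
class SessionToken:
    verified: bool
    created_at: float
    last_used_at: float


class TokenStore:
    def __init__(self) -> None:
        self._tokens: Dict[str, SessionToken] = {}

    def create(self, token_id: str, now: float) -> None:
        self._tokens[token_id] = SessionToken(False, now, now)

    def use(self, token_id: str, now: float) -> SessionToken:
        """Look up a token and record the time of use."""
        token = self._tokens.get(token_id)
        if token is None:
            raise InvalidToken(token_id)
        token.last_used_at = now
        return token

    def mark_verified(self, token_id: str, now: float) -> None:
        self.use(token_id, now).verified = True

    def prune_unverified(self, now: float) -> List[str]:
        """Drop tokens that never completed verification; keep verified ones."""
        stale = [tid for tid, tok in self._tokens.items()
                 if not tok.verified and now - tok.created_at > UNVERIFIED_TTL_S]
        for tid in stale:
            del self._tokens[tid]
        return stale
```

Because `last_used_at` is always maintained, a later "expire after N months of inactivity" rule would only need one more pruning method and no schema change.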
> a MSISDN session token associated with a verified MSISDN should have an indefinite lifetime

I don't see any good reason from a UX point of view not to do this.

> An MSISDN session token that is not used to complete a verification flow can be pruned after a reasonable period of time.

Agreed. Let's define that time, I guess.

In any case, I think we've just ruled out using Redis alone to store the session tokens. We should move part of our data to a persistent database, so that we are not limited by RAM and can make the stack more future-proof: a "small" front end with Redis and a persistent DB in the back.

natim, what do you think? Should we try DynamoDB? ckarlof, what do you guys use in FxA for that?
> ckarlof, what do you guys use in FxA for that ? We just store everything in RDS/MySQL in AWS. We currently don't use redis and have no caching layer.
If the data model/schema fits within DynamoDB, that would be almost zero maintenance for us in operations. The nodejs AWS SDK is quite easy to use as well. RDS/MySQL is something we have decent operational experience with now. While it is quite expensive, it is very fast. For FxA we see IO latencies of about 2ms, and we can scale up vertically quite a bit. Failovers are essentially automatic and take about 1 minute when a server fails.
Attached file Link to github PR (deleted) —
Attachment #8439277 - Flags: review?(alexis+bugs)
Assignee: nobody → rhubscher
Status: NEW → ASSIGNED
I have also created a new Github issue to implement a DynamoDB persistent backend storage: https://github.com/mozilla-services/msisdn-gateway/issues/83
:mostlygeek how do you guys handle MySQL schema creation for FxA? For DynamoDB we need to set up a schema as well. Should I create a script to do that in the orchestration phase?
Flags: needinfo?(bwong)
:natim For FxA, DB and schema creation is currently manual. Patching is also manual, but done through code; see: https://github.com/mozilla/fxa-auth-db-server Yes, please create a script that we can run manually to create the initial DynamoDB tables.
Flags: needinfo?(bwong)
Attached file DynamoDB (deleted) —
Attachment #8441422 - Flags: review?(tarek)
Attachment #8441422 - Flags: feedback?(bwong)
I have created a setup() method for the storage as well as a check during the connection phase that everything is all right.
Attachment #8441422 - Flags: review?(tarek) → review+
Attachment #8441422 - Flags: feedback?(bwong)
:mostlygeek Should the STS API be handled on the MSISDN side? How do you get the temporary AccessKeyId and SecretKey? The dynamo-client is able to read them directly from ENV variables, so if we can provision them that way it could be enough. Do they change, or are they the same for the instance lifetime?
Flags: needinfo?(bwong)
What I understood is that you will define the EC2 instance to be part of the architecture so it will be able to ask for temporary tokens. It seems that https://github.com/teleportd/node-dynamodb is able to handle STS temporary tokens automatically.
:natim
> How do you get the temporary AccessKeyId and SecretKey?

The EC2 metadata API (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html)

> read from ENV variable

It needs to pull from the API since the credentials aren't static.
Flags: needinfo?(bwong)
So the flow is:

1. If STS is turned on, call the EC2 metadata API to ask for a temporary AccessKeyId and SecretKey
2. Start the connection using the temporary tokens
3. If the tokens expire, ask again

Is that right?
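The fetch-then-refresh flow asked about here can be sketched as a small cache. Everything in it is an assumption for illustration: `fetch` stands in for whatever actually reads the temporary credentials (for example from the EC2 metadata API), and the 5-minute refresh margin is an arbitrary choice. It is not the behaviour of aws-sdk or of any DynamoDB client.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TemporaryCredentials:
    access_key_id: str
    secret_key: str
    expires_at: float  # seconds since some epoch


class CredentialCache:
    """Return cached temporary credentials and refetch them shortly before expiry."""

    def __init__(self,
                 fetch: Callable[[], TemporaryCredentials],  # hypothetical fetcher
                 clock: Callable[[], float],
                 refresh_margin_s: float = 300.0) -> None:
        self._fetch = fetch
        self._clock = clock
        self._margin = refresh_margin_s
        self._current: Optional[TemporaryCredentials] = None

    def get(self) -> TemporaryCredentials:
        # Steps 1 and 3 of the flow: fetch on first use, and again when the
        # cached credentials are expired or about to expire.
        if (self._current is None
                or self._clock() >= self._current.expires_at - self._margin):
            self._current = self._fetch()
        return self._current
```

A storage backend would call `get()` before opening or reusing a connection (step 2), so that rotation stays invisible to the rest of the code.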
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
AFAIK there is no ON switch for STS. It should return you credentials. Whether those credentials give you access to anything is a totally different matter :) Our deployments always use STS for creds, though. Unless we have to, we never use fixed AWS creds for services.
Yes, I've got it. Thank you for pointing me to aws-sdk; I will work on that. Btw, the loop-server can work without DynamoDB.
Product: Cloud Services → Cloud Services Graveyard