
what's the point of refresh token?

i have to confess i've had this question for a very long time and never really understood it.

say an auth token is like a key to a safe: when it expires it's not usable anymore. now we're given a magic refresh token, which can be used to get another usable key, and another... until the magic key expires. so why not just give the auth token the same expiration as the refresh token? why bother at all?

what's the valid reason for it, maybe a historical one? really want to know. thanks

A refresh token is NOT about updating the user's role or revoking access, nor is it about asking for the username and password only the first time. You can achieve all of that with just an access token. It's mainly about reducing the attack surface. For more see here

Kiarash

I was reading an article the other day by Taiseer Joudeh and found it very useful. He said:

In my own opinion there are three main benefits to use refresh tokens which they are:

Updating access token content: as you know the access tokens are self contained tokens, they contain all the claims (Information) about the authenticated user once they are generated, now if we issue a long lived token (1 month for example) for a user named “Alex” and enrolled him in role “Users” then this information get contained on the token which the Authorization server generated. If you decided later on (2 days after he obtained the token) to add him to the “Admin” role then there is no way to update this information contained in the token generated, you need to ask him to re-authenticate him self again so the Authorization server add this information to this newly generated access token, and this not feasible on most of the cases. You might not be able to reach users who obtained long lived access tokens. So to overcome this issue we need to issue short lived access tokens (30 minutes for example) and use the refresh token to obtain new access token, once you obtain the new access token, the Authorization Server will be able to add new claim for user “Alex” which assigns him to “Admin” role once the new access token being generated.
Revoking access from authenticated users: Once the user obtains long lived access token he’ll be able to access the server resources as long as his access token is not expired, there is no standard way to revoke access tokens unless the Authorization Server implements custom logic which forces you to store generated access token in database and do database checks with each request. But with refresh tokens, a system admin can revoke access by simply deleting the refresh token identifier from the database so once the system requests new access token using the deleted refresh token, the Authorization Server will reject this request because the refresh token is no longer available (we’ll come into this with more details).
No need to store or ask for username and password: Using refresh tokens allows you to ask the user for his username and password only one time once he authenticates for the first time, then Authorization Server can issue very long lived refresh token (1 year for example) and the user will stay logged in all this period unless system admin tries to revoke the refresh token. You can think of this as a way to do offline access to server resources, this can be useful if you are building an API which will be consumed by front end application where it is not feasible to keep asking for username/password frequently.


Why not a short-lived access token where credentials are checked when it has expired? If the user is still valid, then reset the expiry on the access token. I don't understand the need for a refresh token?
"...they contain all the claims (Information) about the authenticated user once they are generated..." And here I thought putting any user related information, even if it's encrypted, into a token is considered bad practice.
Well, it depends on what user information you're talking about. It says "All claims information", not all user information.
@RickJolly : Access tokens usually include enough information to identify the user and their role, without needing to hit the database. This is usually achieved with a signed (and sometimes also encrypted) token, e.g. one carrying an HMAC signature. A single character added, removed or changed in the original content gives you a completely different signed token. So, if after signing and encrypting "userid:alex;role:user;client:foobar.com;ts:1552806134" the token looks like 'kE4ia6', it will look completely different for "userid:alex;role:admin;client:barbaz.net;ts:1552806565", even if it's the same user.
@RickJolly : A refresh token on the other hand is just an opaque key that points to a single user. You need to query the database to find who it belongs to each time (performance hit). It's in effect an obfuscated representation of the user's credentials. So rather than storing username:password on your client you can store it instead (best practice). On a new project you could use it the same way you use an access token, but because of the afore-mentioned limitations you might eventually want something more scalable. The refresh token / access token setup is a good option.
Stijn de Witt

I would like to add to this another perspective.

Stateless authentication without hitting the DB on each request

Let's suppose you want to create a stateless (no session) security mechanism that can do authentication of millions of users, without having to make a database call to do the authentication. With all the traffic your app is getting, saving a DB call on each request is worth a lot! And it needs to be stateless so it can be easily clustered and scaled up to hundreds or even thousands of servers.

With old-fashioned sessions, the user logs in, at which point we read their user info from the database. To avoid having to read it again and again we store it in a session (usually in memory or some clustered cache). We send the session ID to the client in a cookie, which is attached to all subsequent requests. On subsequent requests, we use the session ID to lookup the session, that in turn contains the user info.

Put the user info directly in the access token

But we don't want sessions. So instead of storing the user info in the session, let's just put it in an access token. We sign the token so no one can tamper with it and presto. We can authenticate requests without a session and without having to look up the user info from the DB for each request.
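As a rough illustration of the idea above, here is a minimal sketch of a self-contained signed token, assuming HMAC-SHA256 as the signing scheme. The names `SECRET_KEY`, `issue_access_token` and `verify_access_token` and the token layout are made up for this example; real systems typically use a standard format such as JWT rather than a hand-rolled one.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = b"demo-secret-change-me"  # hypothetical key, for illustration only


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(payload: str) -> str:
    # HMAC-SHA256 over the encoded payload, keyed with the server secret.
    return _b64(hmac.new(SECRET_KEY, payload.encode("ascii"), hashlib.sha256).digest())


def issue_access_token(username: str, roles: list, ttl_seconds: int,
                       now: Optional[float] = None) -> str:
    """Pack the user info (claims) into the token and sign it."""
    issued_at = time.time() if now is None else now
    claims = {"sub": username, "roles": roles, "exp": issued_at + ttl_seconds}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_access_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None.

    No session and no database lookup are involved.
    """
    current = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None  # not of the form "<payload>.<signature>"
    expected = _sign(payload)
    # Constant-time comparison on bytes, so odd input cannot raise.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] <= current:
        return None  # token has expired
    return claims
```

Calling `verify_access_token(issue_access_token("alex", ["Users"], 600))` hands back the claims without touching any store, while a token whose payload was edited after signing is rejected because the signature no longer matches.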

No session ... no way to ban users?

But not having a session has a big downside. What if this user is banned for example? In the old scenario we just remove his session. He then has to log in again, which he won't be able to do. Ban completed. But in the new scenario there is no session. So how can we ban him? We would have to ask him (very politely) to remove his access token. Check each incoming request against a ban list? Yes, would work, but now we again have to make that DB call we don't want.

Compromise with short-lived tokens

If we think it's acceptable that a user might still be able to use his account for, say, 10 minutes after being banned, we can create a situation that is a compromise between checking the DB every request and only on login. And that's where refresh tokens come in. They allow us to use a stateless mechanism with short-lived access tokens. We can't revoke these tokens as no database check is done for them. We only check their expiry date against the current time. But once they expire, the user will need to provide the refresh token to get a new access token. At this point we do check the DB and see that the user has been banned. So we deny the request for an access token and the ban takes effect.
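The compromise described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `banned_users` and `refresh_tokens` stand in for the database, the `AccessToken` class stands in for a signed token (the signature is left out to keep the example short), and all function names are hypothetical.

```python
import secrets
from dataclasses import dataclass
from typing import Optional, Tuple

ACCESS_TTL_SECONDS = 600  # 10 minutes, the window used in the text above

# Hypothetical in-memory stand-ins for the database.
banned_users: set = set()
refresh_tokens: dict = {}  # opaque refresh token -> username


@dataclass(frozen=True)
class AccessToken:
    username: str
    expires_at: float  # in a real system this would be a signed claim


def login(username: str, now: float) -> Tuple[str, AccessToken]:
    """Issue a refresh token plus a first access token (password check omitted)."""
    refresh = secrets.token_urlsafe(32)
    refresh_tokens[refresh] = username
    return refresh, AccessToken(username, now + ACCESS_TTL_SECONDS)


def handle_api_request(token: AccessToken, now: float) -> bool:
    """Stateless check: only the expiry is compared, no database lookup."""
    return now < token.expires_at


def refresh_access_token(refresh: str, now: float) -> Optional[AccessToken]:
    """The one place where the 'database' is consulted, so a ban takes effect here."""
    username = refresh_tokens.get(refresh)
    if username is None or username in banned_users:
        return None
    return AccessToken(username, now + ACCESS_TTL_SECONDS)
```

In this sketch a banned user's current access token keeps working until it expires, and the next call to `refresh_access_token` is refused, which is exactly the "up to 10 minutes" trade-off.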


"But once they expire, the user will need to provide the refresh token to get a new access token". Why a refresh token though? Why not re-validate the user based on the credentials in the original token, and then if valid, then reset the expiry date on that token?
@RickJolly Because the access token usually does not contain credentials (neither does the refresh token). The way it works is that the tokens are signed by the issuer. So the issuer knows the tokens are valid (until they expire). Usually the token contains stuff such as the username and roles/permissions the user has. Both sides can then use the token to display such info without having to do DB/network calls first.
Ok thanks, but if we disregard what I wrote about revalidating the user based on credentials in the access token (which could be simply a check that the user id is still permitted), I still see no reason for a refresh token. And that is my question: "Why a refresh token?"
@RickJolly Point taken. I guess that the designers of the protocol concluded that it's easiest to split the two tasks: providing the access token for all actions and providing a refresh token just to get an access token. This puts the responsibility of refreshing in the client's hands, reducing complexity for the server.
user7294900

The referenced answer (via @Anders) is helpful. It states:

In case of compromise, the time window it's valid for is limited, but the tokens are used over SSL, so unlikely to be compromised.

I think the important part is that access tokens will often get logged (especially when used as a query parameter, which is helpful for JSONP), so it's best for them to be short-lived.

There are a few additional reasons, with large-scale implementations of OAuth 2.0 by service providers:

API servers can securely validate access tokens without DB lookups or RPC calls if it's okay to not worry about revocation. This can have strong performance benefits and lessen complexity for the API servers. Best if you're okay with a token revocation taking 30m-60m (or whatever the length of the access token is). Of course, the API servers could also keep an in-memory list of tokens revoked in the last hour too.
Since tokens can have multiple scopes with access to multiple different API services, having short-lived access tokens prevents a developer of API service A from getting lifelong access to a user's data on API service B. Compartmentalization is good for security.
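The in-memory list of recently revoked tokens mentioned above could look something like this minimal sketch. It assumes each access token carries some unique ID and a known expiry time; the class name `RecentRevocations` and its methods are invented for the example. An entry only needs to be remembered until the token would have expired anyway, which is what keeps the list small.

```python
from typing import Dict


class RecentRevocations:
    """Remembers revoked token IDs only until the token would have expired anyway."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # token id -> token expiry time

    def revoke(self, token_id: str, token_expires_at: float) -> None:
        self._revoked[token_id] = token_expires_at

    def is_revoked(self, token_id: str, now: float) -> bool:
        self._purge(now)
        return token_id in self._revoked

    def _purge(self, now: float) -> None:
        # Entries for tokens that have expired are useless: the expiry
        # check rejects those tokens on its own.
        expired = [tid for tid, exp in self._revoked.items() if exp <= now]
        for tid in expired:
            del self._revoked[tid]

    def __len__(self) -> int:
        return len(self._revoked)
```

With one-hour access tokens, the list never holds more than the revocations of the last hour, so an API server can check it on every request without a DB lookup. Keeping several servers' lists in sync is a separate problem that this sketch does not address.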


thanks. however, i think both additional reasons are not strong enough. 1, for large sites that have millions of applications it's impossible to keep all access_tokens in memory, so db lookups or rpc calls are unavoidable. of course i don't have data to back my argument. 2, i think the scope of access tokens is the same as that of the refresh tokens, at least in google's implementation. therefore a cross-scope attack is not possible even if the refresh token is leaked.
i think keeping access tokens from getting logged is a better reason, but it's only strong if site A is authorized by user B to access info from issuer C, and somehow the token is sent to a fourth party. seems to me that happens if site A is hosted on the fourth party's server... but if the 4th party has full access to the server, it's still pointless, right?
@wangii - Re #1 - the access tokens can be self-validating using encryption or signatures, so no need for DB lookup or RPC calls. Refresh tokens can remain opaque and revokable at the OAuth provider. Re #2 - Yes, the scope of the access tokens are the same. However, giving a developer of service (A) lifelong access to a user's data on service (B) is much worse than giving them access only when the user is actively using service (A).
I don't get that. You need to verify a refresh token against the db / session as well, making it stateful. Aside, you need to encrypt the refresh token before storing it, and check against the blacklist, causing additional overhead. Given a short-lived access token, you might have to do this check a lot... can you clarify?
@KimGysen sorry, I'm not certain which part you don't get. A refresh token does need to either be checked against a DB or a revocation list, but the idea is that an access token will be used many times for every refresh (i.e., XX or XXX API requests).
B M

Shortest possible answer:

Refresh tokens allow for scoped / different decay times of tokens. Actual resource tokens are short lived, while the refresh token can remain valid for years (mobile apps). This comes with better security (resource tokens don't have to be protected) and performance (only the refresh token API has to check validity against DB).


This seems like the clearest and best answer to me... thank you.
Why don’t resource tokens have to be protected? Is it because they are short lived, and if they are compromised it’s less risk because they soon will be expired?
@QuinnComendant exactly. More accurate probably: resource tokens are less important to protect since short lived, not that they do not require any protection.
treecon

The following is an addition to the benefits of refresh tokens that are already mentioned.

Safety First!

Access tokens are short-lived. If someone steals an access token, he will have access to resources only until the access token expires.

"...But what if a refresh token is stolen?"

If an attacker steals the refresh token, he can obtain an access token. For this reason, it is recommended that a new refresh token is issued each time a new access token is obtained. If the same refresh token is used twice, it probably means that the refresh token has been stolen.

When the refresh token changes after each use, if the authorization server ever detects a refresh token was used twice, it means it has likely been copied and is being used by an attacker, and the authorization server can revoke all access tokens and refresh tokens associated with it immediately.

https://www.oauth.com/oauth2-servers/making-authenticated-requests/refreshing-an-access-token/

Of course, this is just another layer of security. The attacker can still have time to obtain access tokens, until the refresh token is used a second time (either by the attacker or the real user).
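The rotation scheme described above might be sketched like this. It is a simplified illustration, not a reference implementation: the class `RotatingRefreshStore`, the notion of a token "family" (all refresh tokens descended from one login) and the in-memory dictionaries are assumptions made for the example, and only refresh tokens are tracked here, not the access tokens issued alongside them.

```python
import secrets
from typing import Dict, Optional, Set


class RotatingRefreshStore:
    def __init__(self) -> None:
        self._active: Dict[str, str] = {}  # current refresh token -> family id
        self._used: Dict[str, str] = {}    # already-exchanged token -> family id
        self._revoked_families: Set[str] = set()

    def issue(self) -> str:
        """Start a new token family, e.g. at login."""
        family = secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self._active[token] = family
        return token

    def rotate(self, token: str) -> Optional[str]:
        """Exchange a refresh token for a fresh one; return None if refused."""
        if token in self._used:
            # The same refresh token was presented twice: treat it as stolen
            # and revoke every token descended from the same login.
            family = self._used[token]
            self._revoked_families.add(family)
            self._active = {t: f for t, f in self._active.items() if f != family}
            return None
        family = self._active.pop(token, None)
        if family is None or family in self._revoked_families:
            return None  # unknown or already revoked
        self._used[token] = family
        new_token = secrets.token_urlsafe(32)
        self._active[new_token] = family
        return new_token
```

If an attacker replays an old refresh token after the real user has already rotated it, the replay is refused and the user's current refresh token stops working as well, which forces a fresh login. As noted above, the attacker still has a window until that second use happens.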

Always keep in mind that the refresh token must be stored as securely as possible.