An Architectural Introduction to OAuth2 and OpenID Connect (OIDC)

Categories: Programming, Cryptography, Security

Introduction

OAuth2 is a protocol which defines how multiple separate processes can work together to give a client application access to a user’s data, stored on some other system (the resource server), while:

  • not exposing user credentials to the client application and
  • allowing the user to maintain control over what rights are granted over that data and for how long

With some non-standardised extensions to this protocol, OAuth2 can also be used to do generic “user authorization” (eg granting rights to admin-users) and to authorize server-to-server calls that are not operating on behalf of a specific user (“service accounts”).

The OAuth2 specification is relatively small; it deliberately limits itself to a minimum of features but adds a few “extension points”. Other standards then build on top of OAuth2 to solve specific use-cases; the most well-known is OpenID Connect aka OIDC.

OIDC adds the ability for a client application to verify that an interactive user is currently present (login) and to obtain limited information about the user - provided the user allows this.

Unfortunately, while extensions to OAuth2 (both standardised and not) solve a lot of interesting use-cases, almost all of the documentation available online covers just a single use-case: the use-case that OAuth1 was created to solve (delegated access to user data). This article tries to cover the architectural decisions behind OAuth2 and OIDC in more detail than typically done, and to look at these more advanced use-cases. In particular, this article looks at how to use OAuth2 and OIDC to provide access-control for a moderately complex set of interacting servers, as is typical in a modern company or organisation.

This information is intended to be helpful for software architects or developers who are designing software that needs to use authentication/authorization - whether as a “client” of an existing system, or as a “server” that needs to support clients. This article does not cover the exact endpoints, parameters, etc. required - there are enough tutorials for that, or simply see the specification.

This article concentrates on concepts rather than concrete code; the few short code sketches included are purely illustrative. Once the concepts are clear, the code is not hard to write - particularly given the number of supporting libraries available for various languages.

Warning: I am not an OAuth2/OIDC expert; I am a software architect and developer who found the existing information (both online and in books) very unclear and frustrating. This article is the result of a lot of research, consolidated into a presentation that I would have found helpful. Corrections, discussions and feedback in general are very welcome. References to other articles that cover the same topics better than this article does are also very welcome indeed.

Note: the obsolete “OpenID” specification is not related to OpenID Connect and is not derived from OAuth2. It is not addressed in this article.

If you know OAuth2 well, and just want to understand OIDC then Robert Broeckelmann’s Understanding OpenID Connect is probably what you need.

And finally: this article could certainly be better written; the structure needs improvement and some information is repeated. However it has taken me a very long time to get it to even its current state. As I am not being paid, you’ll just have to take it as it is. I hope that it is helpful nevertheless.

Assumptions

I assume you are familiar with HTTP, REST, JSON, public-key-cryptography, and software architecture in general.

I will be talking about “REST endpoints” in this article. OAuth2 and OIDC are not limited to authenticating REST calls, and in fact are not limited to HTTP, but that is the most common RPC protocol that OAuth2 tokens are used with. If you are using something else, most of the info here will still apply.

Some OAuth2 Use Cases

Use Case 1: Simple Delegated Authorization

OAuth version 1 was designed to solve a specific use-case. The OAuth1 specification gives an example of this use-case in sections 1.0 and 1.2:

  • a user Jane has an account at service photos.example.net
  • that service holds photos that Jane has uploaded
  • and Jane now wants to grant service printer.example.com temporary access to those photos

The simplest solution for this use-case is for the user to provide their username and password to the photo-printing service. However that is not a good approach for obvious reasons (see section 1 of the OAuth2 spec for details). What the user really wants is to provide a time-limited and rights-limited token to the photo-printing service, eg grant the right to read photos from their gallery for a few hours.

The OAuth specification (v1 and v2) covers how the developers of printer.example.com can write code to interact with an “authorization service” provided by photos.example.net in order to obtain an “access token” that allows their service to then retrieve the desired photos (from photos.example.net) - and how Jane can stay in control of the granted access.

In OAuth2 terminology:

  • user Jane is the resource owner
  • the photos are protected resources
  • host photos.example.net is the resource server
  • host printer.example.com is the client application
  • the service that issues access-tokens that the resource server will accept is called the authorization server
  • the application Jane uses to interact with the authorization server is called the user agent (and is usually a web-browser)

The OAuth1 specification uses slightly different terminology. It also appears to assume that the authorization server is integrated into the resource server. This article uses the expression “auth server” for this component; this makes clear that the component is also responsible for authentication (ie verifies user credentials). When the “auth server” supports OIDC then it is sometimes called an “identity provider”, “IdP”, “OpenID Provider” or “OP”. An authorization server provides the following endpoints (a short sketch of how a client application builds a request to the authorization endpoint follows the list):

  • an authorization endpoint (the one which a client application redirects a user to in order to get consent (a grant) to issue an access token)
  • a token endpoint (the one that actually returns the access-token on presentation of an auth-code or a refresh-token or client credentials)
  • and possibly other endpoints (eg the OIDC userinfo endpoint)
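
To make this concrete, here is a minimal sketch (in Python) of how a client application might construct the redirect to the authorization endpoint; the host name, client id, redirect URI and scope name are all hypothetical, while the parameter names come from the OAuth2 spec:

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://auth.photos.example.net/oauth2/authorize"  # hypothetical

def build_authorization_redirect(client_id, redirect_uri, scopes):
    # 'state' is an opaque value generated by the client and checked when the
    # user agent is redirected back, to protect against CSRF attacks.
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",       # request an auth-code ("authorization code" flow)
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match a URI registered for this client
        "scope": " ".join(scopes),     # scope names are passed as a space-separated list
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state

# Example: the photo-printing service asking for read access to the user's photos.
url, state = build_authorization_redirect(
    "printer-example-com", "https://printer.example.com/oauth/callback", ["read:photos"])
```

After the user authenticates and consents, the auth-server redirects the user agent back to the given redirect_uri with an auth-code and the same state value; the auth-code is then exchanged for a token via the token endpoint (see the “auth code flow” later).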

OAuth1 provides a protocol for solving exactly the above use-case, and only this use-case. OAuth1 is simple, and complete - the spec defines exactly what compliant implementations of all of the above components must do to make such use-cases work - and to work securely.

Sadly, almost all documentation on the internet regarding “OAuth2” addresses just the same “delegated authorization” use-case that OAuth1 deals with - although OIDC and various other extensions to OAuth2 support many other use-cases in addition to the one above.

From the point of view of a software developer, there are a few interesting points:

  • the organisation holding valuable data (the “photo gallery” in this example) needs to actively support the use-case, by providing an auth-server and by ensuring the resource-server accepts the access-tokens that it issues
  • the organisation using the data (the “photo printer” in this example) is dependent upon the resource-server
  • the user needs to explicitly agree, ie the auth-server needs to inform the user of the rights that the client-application is requesting and get their permission (a “consent”) to issue a corresponding token
  • an access-token has an expiry date; typically it is valid for about an hour (granting rights for longer time-periods is done by renewing the token - which does not necessarily require user interaction)

Note that the term “client application” is a little confusing; it might be an “app” on a mobile device in which case the term is natural. However it can also be a webserver hosted by some organisation or company which the user is interacting with via a web-browser; if this webserver fetches data from some other “resource server” and needs to include an “access token” in such requests to authorize the operation then that webserver is a “client application”.

In the real world, there are a number of important services that hold user data and allow third-parties to access it via appropriate tokens; obvious examples are Facebook, Twitter, and Google Docs - all of which provide an OAuth2-compatible “auth service” and which provide REST services that accept access-tokens from their auth-service. This makes it possible for users with an account at such a service to grant third-party websites or mobile apps the right to access their data “on their behalf”. These third-parties provide “added value services” to the users, while presumably benefiting themselves in some way. The data storage service (Facebook/Twitter/etc) benefits because the existence of these third-party services makes its users happy, and encourages people to sign up for the core service.

Note that when the “client application” is an app installed on a mobile phone, then the security implications of requesting and storing access-tokens are somewhat different than a webserver doing the same thing. The OAuth2 standard therefore defines different “authorization flows” for these two situations. See later for more information.

Use Case 2: Stateful (Session-based) Authentication/Login

Any large organisation has many internal IT systems. Some of these are “session-based” systems where all network requests include a “session id” (which is allocated by the server on the first request). Typically such systems allow a user to “log in”, after which requests that include that same session-id may be “personalized”, and access to user-related data may be possible. Other systems are “stateless” (no “session” is maintained by the software component); this is addressed in the next section.

Requiring a user to provide their username/password to each “session-based” system separately, and then having each such system verify the credentials and load the user’s permissions is a poor design; it is unpleasant for the user, hard work for the developers, and prone to security problems. Nevertheless, large IT systems worked like this for decades until effective “single sign on” systems were invented - things like Kerberos, SAML, or OAuth2-based systems.

Kerberos provides a mechanism for “signed authorization tokens” that can be used to efficiently implement both “SSO” (for stateful systems) and authorization-for-stateless-systems. However Kerberos was designed to work at “organisation scale”, eg a company or university linked via a relatively fast network; it does not work so well in the kinds of situations that “internet-scale” public facing services encounter. The SAML specification was developed to cover this use-case; unfortunately the SAML spec is highly complex and not very performant.

Extensions to the OAuth2 specification can be used to provide the same kind of features as Kerberos or SAML, but in a manner that is efficient at “internet scale”. Sadly this is seldom discussed in articles on OAuth2; this article does.

A “session-based” application can store session-state without knowing the user identity, eg to provide the concept of “current page”, “most recent search”, etc. However knowing the user can support nice things like personal config settings for the site. In addition, authentication (providing proof that the session is associated with a specific user-id) is sufficient to authorize all operations on data belonging to that user; a user typically has rights to read/write/create/delete their own data.

An application never actually wants user credentials - it just wants a trustworthy user-id that it can then use to look up additional user-related info in its database(s). In other words, it is convenient for authentication of the user to be performed by an external system and for some “trustworthy userid” to be provided.

An application sometimes also wants to know basic data about the user such as first-name, last-name, date-of-birth, profile-image. An application can of course store such data itself, keyed by the “user id” (see above); however when the app is using an external system for authentication then the user will typically want to update such “profile info” just in one place and have all applications that use that same authentication system see the latest data automatically.

The OpenID Connect (OIDC) extension to OAuth2 addresses this need specifically; it provides a verifiable user-id, and defines a set of “standard profile attributes” that applications can request access to. An application can request an ID token from an auth-server instance and receive a signed data-structure whose format is clearly defined by the OIDC standard. Alternatively, an application can request an access token from an auth-server instance which represents sufficient rights to read specific profile attributes; the application can then make a direct call to a “user info endpoint” within that auth-server instance providing the token as proof that it is permitted to do so.
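
As a rough sketch of the second option (the userinfo call), assuming the endpoint URL is already known and an access token with suitable scopes has been obtained - the URL below is hypothetical, while claim names such as sub, name and email are standard OIDC claims:

```python
import requests

USERINFO_URL = "https://auth.example.org/oidc/userinfo"  # hypothetical

def fetch_userinfo(access_token):
    # The token must have been issued with the relevant OIDC scopes
    # (eg "openid profile email") for the corresponding claims to be returned.
    resp = requests.get(USERINFO_URL,
                        headers={"Authorization": f"Bearer {access_token}"})
    resp.raise_for_status()
    return resp.json()  # eg {"sub": "user100", "name": "Jane Doe", "email": "...", ...}
```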

As already mentioned, in many cases “authentication” is equivalent to authorization; a user typically has all rights (read/write/create/delete) over their own data. In other words, any request received by the session-based application which is associated with a session that has a “logged in user” and manipulates data belonging to that user can be allowed.

However sometimes an application also supports “admin users” who have rights to access data associated with other users or to perform “system functions”. In this case, an ID token that simply proves the user identity is not sufficient; additional info about the user’s rights is needed - ie authorization. An application might just store this info in a local database, keyed by userid - ie implement authorization itself. Alternatively it might also require authorization to be done via OAuth2 - ie require an access-token.

When a session-based application only needs authentication and not authorization then it can delegate to any OIDC-compatible auth-server. A user simply selects whether they want to log in with Facebook, Twitter, or any other service supported by the application; the user is then redirected to that site to obtain an “ID token”. The application receives (via one of the OAuth2 “flows”) a trustworthy token holding a userid of form (auth-server-id, user-id-within-auth-server), eg (“facebook”, “user100”). OIDC in fact specifies a kind of “metadata-retrieval” system through which an application can potentially offer a “login with other” option; the user can enter a hostname like “auth.vonos.net”, and as long as that domain provides the right metadata-file at the appropriate “well-known URL” then the application can redirect its user appropriately, validate the returned token, and be sure that this user is indeed ("auth.vonos.net", "user1").

However many websites do not support arbitrary auth-servers and instead just limit the options to a few well-known ones; this does have the advantage that a “spammer” cannot host their own auth-server with an infinite number of bot-accounts. Note also that a client application must have a client account with the associated auth-server, and must specify its client id when redirecting the user to the auth-server to generate a token; this makes it rather difficult to dynamically accept any auth-server - though OIDC does define a protocol for dynamic client registration. OIDC also defines experimental support for “self-issued OpenID Providers”; see the spec for details.
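
A minimal sketch of the “well-known URL” metadata retrieval mentioned above; the issuer in the example usage is the one from the text, and the field names in the comment come from the OIDC discovery specification:

```python
import requests

def fetch_oidc_metadata(issuer):
    # Every OIDC-compliant issuer publishes its configuration at this well-known path.
    resp = requests.get(f"{issuer.rstrip('/')}/.well-known/openid-configuration")
    resp.raise_for_status()
    meta = resp.json()
    # Fields defined by the OIDC discovery spec include "authorization_endpoint",
    # "token_endpoint", "userinfo_endpoint" and "jwks_uri" (the token signing keys).
    return meta

# Example usage: meta = fetch_oidc_metadata("https://auth.vonos.net")
```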

An OIDC “ID token” includes a field which can be tested to ensure that the other end of the network socket is really an interactive application with a current user present; an “authorization token” aka “access token” does not necessarily mean that a user is present - and in fact is designed to allow client applications to act on behalf of users.

Although it is possible for an application to delegate to a “third party” auth-server for authentication (as long as you are willing to accept user-ids of form [auth-server-id, user-id]), it is not possible for an application to use access tokens issued by a “third party” auth-server to make security decisions within its own code-base. The only use for an access-token issued by a third-party auth-server is to pass it to a resource-server that is also run by that same third-party. If you want to use OAuth2 to implement fine-grained authorization in your own application then you need to run your own auth-server (self-hosted or at least an instance provided by some auth-as-a-service provider).

Often a large organisation will support login to its apps only via its own auth-server, ie all session-based apps only redirect to the organisation’s own auth-server. If an organisation wants to support “authenticate via external” but also support authorization then one option is “token exchange”; request an ID token from the external system then send this to your own auth-server to map (other-auth-server-id, user-id) to an access-token for the corresponding user in your own system. You still get delegated credentials and user-profile-data but the rights for the user are managed within your own system.

The OIDC standard defines 4 standard scopes that can be specified to fetch different “user-related data” after authentication:

  • “profile” - includes user attributes name, family_name, given_name, middle_name, nickname, preferred_username, profile, picture, website, gender
  • “email” - includes user attributes email, email_verified
  • “address” - includes user attribute address
  • “phone” - includes user attributes phone_number, phone_number_verified

Note that the first time an access-token with rights to such data is requested by an application, the user is presented with a list of the data requested (list of scopes) - and may decline the request if they feel that an unreasonable amount of information is being demanded. Only after the user “consents” to the release of this data is a corresponding token generated.

OAuth1 was actually used as a framework for “authentication” by a number of organisations, eg Facebook. However as it was not designed for this purpose, it had to be extended in ways not compliant with its specification. Each organisation “deriving” its authentication-framework from OAuth1 did so in its own way, leading to a lot of confusion. OAuth2 is instead deliberately defined as an “extensible framework” and OIDC is an extension that (among other things) officially supports “authentication” use-cases. It is important to note that OAuth2 alone is not an authentication protocol.

The topic of “scopes” is discussed in detail later in this article.

Use Case 3: Stateless Authorization

IT systems often provide a set of “services” aka “endpoints”, usually over a protocol such as REST. Typically these services are stateless, ie there is no “login” operation; instead each request needs to provide an appropriate authorization token. This token is commonly called a “bearer token” and for REST calls is provided in an HTTP header of form Authorization: Bearer {token}.
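
On the wire, a call to a hypothetical resource-server endpoint carrying such a bearer token looks roughly like this (the path and token value are invented for illustration):

```
GET /photos/album/42 HTTP/1.1
Host: photos.example.net
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6...   (token value truncated)
Accept: application/json
```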

Such a service does not want to receive user credentials in such a header; it would need to look up some database to verify them which is hard work. And more importantly, applications that call such an endpoint would not want to provide credentials here, as:

  • they could get lost or leaked, and
  • such credentials grant all rights on the user’s data to the recipient

Passing an OAuth2 access token here solves these problems:

  • the token has a relatively short expiry-time, so leaking it is not so critical
  • the token is a subset of the user’s rights, not all of them

Exactly what an “access token” is, is discussed later.

Note that the standard HTTP header is named Authorization, which fits: the token it carries is usually an access token representing authorization data (a set of rights) rather than authentication data.

Note that there are “stateless systems” that are only accessible from trusted systems, eg “back ends” for stateful front-ends which trust network firewalls to ensure only the approved internal front-end tools access them; these do not need to perform authorization at all as they just trust the caller.

Use Case 4: Application-level Authentication and Authorization

The previous use-cases talk about accessing data belonging to a specific user or logging in as a specific user, and that is indeed the primary use of OAuth2/OIDC. An OAuth2 “access token” is a combination of a user-id and a set of rights, effectively stating that the holder (bearer) of the token can perform specific operations on data associated with a specific user. A “resource server” receiving such a token then has sufficient information to check whether the requested operation is consistent with the provided token.

However sometimes in a complex set of interacting IT systems, a system needs to act on data associated with multiple users, or data associated with no specific user. OAuth2 supports this via “client accounts”.

Client accounts actually have two quite different usages:

  • authenticating the client application itself to the auth-server (the “back channel”)
  • application-to-application authorization as described above (sometimes called “system accounts” or “service accounts”)

A “client application” (in OAuth2 terminology) must be registered with every auth-server it obtains tokens from. When it redirects a user to an auth-server to obtain an ID token or access-token, the “client id” must be provided as a parameter. When using the “auth code flow”, the user agent provides just an “auth code” back to the client application; the client app then connects to the auth-server using its “client account” id and credentials in order to obtain the actual token. The client credentials are usually very simple (just a password auto-generated by the auth-server on client account creation).
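
A minimal sketch of that back-channel exchange; the token endpoint URL is hypothetical, while the parameter names (grant_type, code, redirect_uri) are those defined by the OAuth2 spec, and the client authenticates with its client account credentials:

```python
import requests

TOKEN_URL = "https://auth.photos.example.net/oauth2/token"  # hypothetical

def exchange_auth_code(code, client_id, client_secret, redirect_uri):
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,  # must match the value used in the redirect
        },
        auth=(client_id, client_secret),   # the client account credentials ("back channel")
    )
    resp.raise_for_status()
    # Typical response fields: access_token, token_type, expires_in, refresh_token, scope.
    return resp.json()
```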

A registered “client application” can also use a standard OAuth2 endpoint to get an “access token” for itself. It can then pass this token on to other services who can verify the token in order to be sure that the process at the other end of the network socket is indeed a specific client-application.

The registration process needed to create a “client account” with an OAuth2 “auth server” is often somewhat complex and bureaucratic as client accounts should be registered far less often than standard user accounts.

Trying to represent a specific internal service (application) as a “user” for the purposes of application-to-application authentication is extremely painful; OAuth2 authentication flows are not designed for that. The concept of “consents” also does not work well for non-human entities. It is better to use “client accounts” for services rather than user accounts.

Use Case 5: Single Signon

This is really just a variant of Use Case 2 (stateful system login).

When a client application directs a user to obtain an access-token from a specific auth-server, a web-browser is typically started. As standard with web-browsers, it sends all “cookies” associated with the target URL to that system. When the user is not logged-in then the auth-server prompts them for credentials and sets a cookie to indicate the logged-in state.

When the user is later redirected to the same auth-server, as long as this “logged-in” cookie still exists the auth-server does not need to prompt the user for credentials again. An access-token or ID token can therefore be immediately issued (possibly “consent” needs to be obtained interactively first).

The result is that many different client-applications can obtain ID-tokens or access-tokens for the user while the user is prompted only once to log in - ie OAuth2 provides “single signon”.

Session-Based Systems

Just to clarify what is meant by the expression “session-based” as used above: basically, this is any IT server that links together multiple network requests belonging to “the same user”.

A very simple client/server IT system consists of:

  • a database for data storage
  • a monolithic application which combines presentation-logic, business-logic and persistence-logic
  • and users accessing the monolithic app via a web browser

The simplest deployment is to have just one instance of the monolithic app (ie a single process) handling all users.

When a remote request arrives, it is inspected to see whether a “session id” is specified. If yes, then data associated with that session is loaded from an in-memory cache using that session-id as the “key”. If no session-id exists, then a new session-id is allocated and an (empty) session is added to the cache. Multiple requests that use the same “session id” can read or update “state” stored in the in-memory session. The allocated session-id is returned to the client and must be provided on all future calls; when the client app is a web-browser this is usually done by storing the session-id in a “cookie”.

Of course a single instance holding state for all “active sessions” in memory does not scale well. A simple extension is to run multiple instances but ensure all requests associated with a specific session are handled by the same instance each time - aka “sticky sessions”. This can be handled by routing incoming requests according to the “session id” property of the request, or by the originating IP address, or a few other approaches. However this also has limitations.

An alternative approach is for the “session state” to be stored externally to the primary application itself, eg in a key-value store such as redis/memcached - or even a relational DB. Requests can then be distributed across a pool of processes without needing “sticky sessions”; the receiving instance just fetches the data using the session-id at the start of each request and writes back the modified state at the end of each request. Of course there is a performance impact due to the session-handling.
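
As a small illustration of this approach, here is a sketch of externalised session state using Redis as the key-value store; the key layout, host name and expiry time are arbitrary choices made for the example:

```python
import json
import redis  # requires the 'redis' package

store = redis.Redis(host="session-store.internal", port=6379)  # hypothetical host
SESSION_TTL_SECONDS = 30 * 60

def load_session(session_id):
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw else {}

def save_session(session_id, session):
    # Write the (possibly modified) state back at the end of the request,
    # refreshing the expiry so that idle sessions eventually disappear.
    store.set(f"session:{session_id}", json.dumps(session), ex=SESSION_TTL_SECONDS)
```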

Despite the problems with session-based systems, they are extremely common. And typically they also support the concept of a “login”; a session initially has no “user details”, ie is for an “anonymous user”, and operations that need some level of “permission” trigger a mandatory “login flow”. In the case of OIDC, this means that the application causes some kind of interaction to happen on the remote system which results in an “authentication token” being made available to the application. This data is then verified (potentially a somewhat complex process) and the relevant data is then stored into the corresponding “session”. Because session-state is available for each request, this process only needs to be done once per session.

There are however IT systems which are NOT session-based; they simply provide “endpoints” that code can call to read or update data. Not dealing with sessions allows these to be far more scalable - but every call needs to provide a full set of access-rights, not just a “session-id” through which the (previously decoded, verified and cached) rights can be found. In REST systems, such access-rights are typically provided in an HTTP header of form “Authorization: Bearer {token}” and are called a “bearer token”. Such endpoints also typically expect an access token (proof the caller has a set of rights) rather than an authentication token (a description of the user).

More advanced IT systems can combine these approaches:

  • databases persist user data and are accessible only from “business service instances” within the organisation (ie not directly from the internet)
  • “business services” are implemented as stateless (sessionless) REST services that require a suitable access-token. These are accessible only from “backend-for-frontend” instances within the organisation (ie not directly from the internet)
  • a pool of “backend-for-frontend” (BFF) processes acts as a gateway for client-side code to access the business services. These maintain per-user session data, either via sticky-sessions or by externalising session state, and forward calls to the business services. These are accessible from the internet; to access the above business services they cache “access tokens” in user sessions for reuse. The BFF layer usually provides an endpoint for “login” which updates the session state with user-profile data. The BFF layer is also a good place to cache “presentation level” data.
  • a “presentation layer server” provides HTML and Javascript that users run in their web-browsers; the Javascript makes REST calls to endpoints within the “backend-for-frontend”, providing a “sessionid” via a cookie.

Rather than a “presentation layer server”, a native app can be used - calling the BFF.

Having separate BFF and “business service” layers means that the business services can be used in automated processes that are not associated with “users” - ie where session-state is not relevant. It also makes security rules clear, with each service requiring an access-token with specific rights rather than dealing with “logged in users”. The business-service layer can also be efficiently scaled without being concerned about access to state.

Native mobile apps, or single-page-applications running in a browser, can use OIDC for authenticating a user, but are not “session based”; see later for more info on such applications.

Access Tokens

The “access token” is the most fundamental concept in OAuth2. The most interesting thing about it is that the OAuth2 specification never defines what one is; the actual format is an implementation detail of the auth-server that issues it. To quote from the OAuth2 specification section 1.1 “Roles”:

The interaction between the authorization server and resource server is beyond the scope of this specification. The authorization server may be the same server as the resource server or a separate entity. A single authorization server may issue access tokens accepted by multiple resource servers.

This means that any “resource server” which wants to check whether an access-token represents some specific “permission” needs to know the details of this representation. Or in short, a resource-server accepts access-tokens only from a specific auth-server and is tightly coupled to that auth-server’s implementation.

Interestingly, the OAuth2 spec doesn’t even state how the recipient of a “token” issued by that auth-server can ensure it has not been tampered with in transit over the network. That is also an implementation detail.

In practice, a resource-server commonly uses a library provided by the maker of the auth-server in order to validate incoming access-tokens and uses an API provided by this library to also check for the presence of the required permissions. It therefore does not need to know the internal representation of the access-token.

Auth-servers commonly represent access-tokens using the JSON Web Token standard. This provides the following benefits:

  • the token embeds the user’s rights, so no DB lookup is required
  • the token is signed by the issuing auth-server, so verifying the token requires just a signature-check
  • resource-servers who do not wish to use the auth-server library to verify/decode an access-token can use any JWT library instead

Even when using JWT, different auth-servers may use different field-names and values to represent access-rights, ie a resource-server remains coupled to a specific auth-server implementation.
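
For a resource-server that takes the plain-JWT-library route, verification might look roughly like the following sketch; the JWKS URL, expected audience and issuer, and the name of the claim carrying the granted scopes are all assumptions that depend on the particular auth-server:

```python
import jwt                      # the PyJWT library
from jwt import PyJWKClient

JWKS_URL = "https://auth.example.org/oauth2/jwks"  # hypothetical; see the auth-server docs
jwks_client = PyJWKClient(JWKS_URL)

def verify_access_token(token):
    # Look up the signing key matching the token's key id, then verify
    # signature, expiry, audience and issuer in one step.
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience="photos-api",              # assumed identifier of this resource-server
        issuer="https://auth.example.org",  # assumed issuer
    )
    # 'claims' typically holds the subject (user id) and the granted scopes, under
    # implementation-specific names such as "scope" (space-separated) or "scp" (list).
    return claims
```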

Auth-servers can use other representations for access tokens, eg a Kerberos token.

An access-token could also simply be the key of a record in a database which represents the rights associated with that “token”; in that case the resource-server would need to perform a read operation to access the data (unlike the JWT approach where the data is embedded inline). That record would need to include an “expiry time” (or maybe the record is simply deleted at expiry time), as an access-token always has an expiry date.

Note that the OIDC specification requires that an ID token is always a JWT token.

Although access tokens need to be understood by a resource-server in order to check permissions, a client application does not need to read an access token; it just passes it along to the target resource-server REST endpoint as proof that it is permitted to perform the requested operation. In the case of an OIDC ID token, the client application is the user of the token and is coupled to the auth-server; however OIDC-compatible auth-servers use a standardised format for ID tokens.

Note that the interaction between the three parties in an OAuth1-style use-case is still unchanged and standardised; they all treat an authorization-token as an opaque block of chars. Only the “service” that evaluates user rights is coupled to the auth-server.

API Keys

API keys are not part of the OAuth2/OIDC standards, but it seems helpful to mention them here.

Some services (Google in particular) use the concept of an “API key” that callers of a REST endpoint need to provide, instead of an OAuth access token.

There is no standard for passing an “API key” from client to server; they can be passed as query-params, http-headers, or cookies. They can be used for authorization (required in order for the caller to invoke that endpoint) or for other purposes such as billing/quota-management.

An API key does not include any “user identifier”, ie does not control “whose data” may be manipulated. In general, an API key does not represent rights at all other than “permitted or not”. They also typically do not have an expiry-date - or at least have a very long lifetime.

Because an API key is a kind of “bearer token”, anyone who obtains a copy of the value has the same rights as the caller. This means that embedding them in apps running on user devices (eg mobile apps or Javascript running in a user’s browser) makes interception very easy. That isn’t necessarily a problem; an attacker pretending to be a mobile app won’t get much extra in the way of privileges. Only an attacker who gets access to an API key that grants special privileges would be a threat.

The primary purposes of an API key are:

  • to be able to revoke permission as a whole for a specific project/application (eg some third-party partner) (not per-user, just per-registered-system)
  • for traceability and billing

Benefits of OAuth2

Here is a quick look at what a user gains when using applications based on OAuth2 rather than “password sharing” or similar.

Readers of this article probably are aware of all this, so I’ll keep it short:

  • Only the auth-server ever prompts the user for login credentials
  • Only the auth-server ever stores user credentials
  • Recipients of access-tokens (or even refresh-tokens) that are buggy or compromised can cause only limited damage
    • If an access-token is leaked, it is valid only until its expiry-date (typically 1 hour)
    • If a refresh-token is leaked, it is valid until the “consent” is revoked for the client-application associated with that token
    • And the actual user credentials can never be leaked (except directly by the auth-server)
  • User is well informed of which client-applications are requesting which rights (scopes)
  • User can review their existing “consents” at any time, and revoke them
  • For tokens in JWT form (including OIDC tokens and access-tokens from most auth-server implementations), the token can be verified without contacting the auth-server (ie efficient network usage)
  • User credentials and profile data (eg email address) are stored only once; no need to update them per-site

OAuth1 vs OAuth2

Although some history of OAuth1 and OAuth2 has already been discussed, it is worth taking a closer look at how the standards developed and what differences exist.

OAuth1 is a concrete specification; it defines:

  • how client, auth-server and relying-party interact
  • the full set of parameters and allowed values for each interaction
  • how the recipient verifies that the token is trustworthy
  • how encryption is used to protect data in transit

Like OAuth2, it does not specify the format of an access token, ie how rights are represented. A resource-server is therefore coupled to a specific auth-server implementation. In addition, the scopes that a resource-server checks for in incoming requests must be defined in the auth-server; there are no “standard scopes”. A resource-server is therefore coupled to a specific auth server instance which has the appropriate scopes defined - and has access to the database of users whose data the resource-server manages.

The OAuth1 protocol is very limited in the set of functionality it can provide; the inventors of OAuth1 had a specific use-case in mind (see earlier) and the protocol supports exactly that.

The OAuth2 specification development process was a collaboration between the original developers of OAuth1 (web-centric pragmatists) and architects coming from the enterprise world of Kerberos and SAML. The web people wanted to make minor improvements and cleanups to OAuth1, while the enterprise people wanted something far more ambitious. The result was apparently a significant amount of frustration on both sides.

As noted earlier, the well-known and successful Kerberos protocol provides a mechanism for “signed authorization tokens” that can be used to efficiently implement both “SSO” and authorization-for-stateless-systems. However Kerberos was designed to work at “organisation scale”, eg a company or university linked via a relatively fast network; it does not work so well in the kinds of situations that “internet-scale” public facing services encounter. The SAML specification was developed to cover this use-case; unfortunately the SAML spec is highly complex and not very performant.

Obviously, trying to produce an OAuth2 specification that supported all of the desired use-cases would have made it very complex - and new requirements/use-cases can potentially be discovered. Instead, an OAuth2 specification was released that defines an extensible “framework” that standardises some things while:

  • defining a number of “extension points” - where the results of using these points are not defined in the spec, and
  • leaving a number of issues undefined (eg how to protect tokens from interception while in transit over the network)

As the OAuth2 specification itself states in section 1.8 “Interoperability”:

as a rich and highly extensible framework with many optional components, on its own, this specification is likely to produce a wide range of non-interoperable implementations.

The original “editor” of the OAuth2 spec, having come from the OAuth1 world, was unhappy with the “incomplete” nature of OAuth2. This is a fair complaint in some ways; there is far more “implementation specific” behaviour in a system based on OAuth2 than in one that uses OAuth1. However the OAuth2 specification is far simpler, and allows implementers to optimise for many different use-cases. It is also designed for extension, allowing other standards to build on it.

An OAuth2-compatible system using an auth-server whose access-tokens are represented in JWT format solves many of the complaints about OAuth2 in the previous link. It doesn’t provide quite the same level of security as OAuth1’s “signed requests”; in that approach each HTTP request not only includes a bearer-token for the user but also a “signature” field over (invoked-URL, timestamp) using the client’s key. Such signatures do block attacks related to intercepted bearer-tokens. Unfortunately (if I’ve understood this approach right) this requires the resource-server to have access to the “signing key” for each client - something practical in OAuth1 as the auth-server and resource-server are expected to be the same process. The signature approach also seems to be far slower; not a problem with the OAuth1 use-case but a no-go for many of the use-cases that OAuth2 can be applied to. Eran Hammer is correct in stating that many developers/organisations have trouble setting up TLS correctly; nevertheless it seems fair to require this in order to get the extra features that OAuth2 offers.

Scopes and Access Right Representation in OAuth2 and OIDC

Scopes

A request for an access-token from a client-application to an auth-server includes a list of “scopes”. These are plain strings that specify (in an abstract way) what “permissions” should be encoded into the returned access-token.

This section looks in detail at what “scopes” are and how they are used.

Scopes and Permissions

The standard request sent to an OAuth2 server when allocating an access-token includes one or more “scope” strings. Exactly what a “scope” does depends on the auth-server implementation and configuration; it is one of those details that tightly couple a process verifying access rights with the process creating the access token.

OAuth is not designed to answer the question “what rights does this user have?”. It assumes that a user has all rights to their data, and then creates an access-token which grants a subset of those rights to the holder. When an auth-server receives a (typical) request for an access-token, it does not need to consult a database of “user permissions”; the user is free to tell the auth-server to agree to any permissions at all because the access-token is used to access data for that user and therefore the token always represents a subset of permission “all”.

A resource server already has access to all data it manages for all users; it just needs to execute the appropriate query against a database. An access-token is not something that the resource server needs in order to do its job, but is instead something that the resource server voluntarily evaluates in order to determine whether it should process a command it has received from a remote process.

An organisation that runs one or more OAuth2-supporting “resource servers” must also manage its own auth-server instance. The auth-server admin defines a set of scopes for their organisation, with a nice human-readable description of each. Some organisations use URL-style names for scopes (particularly those with large numbers of resource-servers) in order to avoid naming conflicts. Other organisations use much simpler strings, eg “read:photos”. The exact format doesn’t matter.

Each resource-server accepts access-tokens from only one auth-server; that auth-server is (almost always) managed by the organisation to which the resource-server belongs. The auth-server that a resource server depends on should be clearly documented.

When a resource-server developer is creating a new REST endpoint, they need to decide which permissions to check for in order to best protect the owner of the data that the call will access. The developer should first see if an existing “scope” defined within their auth-server is appropriate; if not then they must ask their auth-server admin to define a new scope. The choice of scope doesn’t really affect the resource-server much; it is just a simple “present or not present” check. However all callers of the endpoint will need to get an access-token corresponding to that scope before they invoke the endpoint; if the scope the endpoint has chosen is a “generic” one then:

  • the caller will be required to ask the user for a “generic” access token which the user might not want to give, and
  • if the token is leaked then whoever gets hold of that token can perform potentially undesirable operations

It is therefore good practice for an endpoint developer to require “scopes” which represent the operations that the endpoint is carrying out. On the other hand, it is not desirable for every single endpoint to invent its own scope; the auth-server admin will not like that, the implementers of calling apps will not like that, and neither will the user when they get presented with either yet-another-token-request or a token request for dozens of scopes all at once (though see the comment later on “first party consents”). An endpoint can also accept multiple scopes, eg “read:photos” or “readwrite:photos”, giving calling applications a choice of which token to prompt the user for: a client application that only invokes read-only endpoints can ask for the smaller permission, while a client app that also calls endpoints that update data can request just the single scope “readwrite:photos”, which the read-only endpoints accept too.
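
A minimal sketch of such an endpoint-side check, assuming the common (but not universal) convention that the granted scopes appear in a claim named scope as a space-separated string:

```python
def extract_scopes(claims):
    # Assumes the granted scopes are in a claim named "scope", space-separated.
    return set(claims.get("scope", "").split())

def check_photo_read_allowed(claims):
    granted = extract_scopes(claims)
    if not granted & {"read:photos", "readwrite:photos"}:
        raise PermissionError("access token grants neither read:photos nor readwrite:photos")
```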

The endpoint developer needs to document which scopes are required for calls to the endpoint (as well as the auth-server to fetch it from).

Whichever scopes are requested by a client app, the user is presented with a consent-screen listing them (using the descriptions provided when the scopes were defined). The user simply says yes or no to the set. The returned token includes a list of the scopes that the user consented to, in some undefined form.

Some auth-servers also allow a user to say yes to some scopes but no to others - in which case, when the client-application passes that token to some endpoints it will be accepted while others will reject it. Because an access-token is “opaque” to client applications (ie its format is a private detail that is relevant only to the auth-server and resource server) it is not possible for a client-application to portably extract from the access-token the list of scopes which were actually agreed to; however if an auth-server issues a token which does not represent the requested set of scopes (for any reason) then the response that provides the access-token must include a field “scope” listing the granted scopes.
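
A client application that cares whether it got everything it asked for can therefore compare that scope field against its request; a small sketch, assuming the token response has already been parsed into a dict:

```python
def granted_all_requested(token_response, requested_scopes):
    scope_field = token_response.get("scope")
    if scope_field is None:
        return True  # no "scope" field: the granted scopes match what was requested
    return set(requested_scopes) <= set(scope_field.split())
```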

Although an auth-server is allowed to react to requested scopes in any way, and encode the rights represented by an access-token in any way it desires, in most cases things are very simple. The most common behaviour is that the returned token is in JWT format and has a field “scopes” (or similar) which is an array of strings that includes every requested scope that the user agreed to (in most cases, all of the requested scopes).

The OAuth scope mechanism cannot represent a right to a specific target entity; for example it is not possible for a user to grant read-access to an arbitrary folder of their photo-gallery. It is possible that some auth-servers support this as a custom extension, but I am not aware of any.

Resource server code typically uses a library provided by the auth-server implementer to decode and verify access-tokens. It therefore doesn’t actually know or care about the format of the access-token, instead using a library API to check that the incoming access-token has the required permission within it (typically by asking whether a specific “scope” string is present).

Note that although traditional user-management systems such as LDAP represent rights in various ways (eg object + read/write/execute flags), resource-server endpoints always document their requirements in terms of “scopes” and users are prompted by the auth-server to “consent” to these scopes using the descriptions associated with the scopes by the auth-server admin. The returned access-token also usually represents the rights embedded in the access-token as a list of the “scopes” that the user agreed to.

User Authentication

An auth-server always needs to authenticate a user before issuing a token. It therefore must have a database of user credentials - whether simple passwords, public keys, ids of identity tokens, or other. An auth-server might run its own database for this (with associated interface for creating/deleting/administering users) or it might use a back-end system such as LDAP or ActiveDirectory to obtain user/credential data.

Some auth-server implementations can act as a front-end for multiple user data stores. Some can even act as a front-end for multiple other auth-server instances. The company auth0.com provides a hosted auth-server that acts as a front-end for many of the large internet resource servers, including Facebook, Google, etc.

An auth-server needs to keep track of:

  • scopes and their human-readable-descriptions
  • registered client applications, their human-readable-descriptions, their redirect-URL(s) and their credentials
  • user-ids and their credentials (often implemented by delegating to some backing system such as LDAP)
  • user profile data (if it supports OIDC)
  • user consents - ie (user, client, scope) tuples

However with at least basic OAuth2, the auth-server does not need to manage user permissions, roles or groups. As noted above, none of these are needed for basic OAuth2 as a user has all rights over their data and a basic access-token represents a subset of those rights.

The auth-server keeps track of (user, client, scope) tuples that the user has agreed to; these are often called a “consent”. If the same tuple is requested later, the user does not need to approve again; the auth-server just generates the access-token automatically. However a user can use an auth-server admin interface to inspect their “consents” and to revoke them. This doesn’t invalidate any existing access-tokens but will block future requests. An access-token is typically valid for a few minutes to a few hours, depending on auth-server config.

Some auth-server implementations partition the client/scope/consent records they keep into namespaces (sometimes called “APIs”). This can be useful when an organisation has many resource-servers; each group of related servers can have its own scope-names without worrying about naming conflicts. It also means that an “API” can be deleted, tidily removing all associated client application registrations, consents, etc.

Querying User Rights

The scope approach described above works well for issuing access-tokens that represent a subset of rights over the data belonging to a specific user. However sometimes users should have rights over things other than their own data. In particular, “admin users” may have rights to perform various operations that affect data of other users, or rights to perform “system functions”.

This kind of “access right” is not really part of the OAuth1 or OAuth2 use-case, and so doing this kind of thing is only possible via auth-server-specific “extensions” outside of the OAuth2 standard.

Many auth-server implementations solve this by providing “magic scope names” which do query user rights in some user-database, and store those rights in some auth-server-specific format in the access-token. Some auth-servers even have a global setting that adds this info for every access-token. Relevant resource-server endpoints then need to check the auth-server-specific fields in the access-token to verify that the access-token permits the requested operation.

Assuming the auth-server is connected to some kind of user-datastore that holds user rights (such as an LDAP server or ActiveDirectory system) then it might embed any of the following in the returned token:

  • a list of all the user’s permissions (flattened)
  • a list of all the user’s roles
  • a subset of the user’s permissions which match some kind of “filter expression” stored in a requested scope string
  • a subset of the user’s permissions which match some kind of “filter expression” stored in a scope-definition (ie set up by the auth-server admin)

Embedding a user’s complete set of rights or roles in a token will of course make it much larger.
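
As an illustration of how such implementation-specific encodings can look, here is a made-up decoded access-token payload carrying roles; the realm_access structure is loosely modelled on what Keycloak emits, and other auth-servers use entirely different layouts:

```python
# Illustrative only: a decoded access-token payload carrying user roles. The
# "realm_access" structure is loosely modelled on Keycloak output; other
# auth-servers encode rights quite differently.
claims = {
    "iss": "https://auth.example.org",
    "sub": "user100",
    "aud": "photos-api",
    "exp": 1735689600,
    "scope": "read:photos",
    "realm_access": {"roles": ["photo-admin", "billing-viewer"]},
}
```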

System/Service Accounts

Sometimes one application needs to call an endpoint in another to perform an operation that is not associated with a specific user. As noted earlier, “applications” should have a “client account”. These are sometimes called System accounts, Service accounts or “Machine to Machine” accounts.

Like user rights, this is somewhat outside of the OAuth2 spec and therefore into auth-server-implementation-specific behaviour.

As noted earlier, a “client account” is required by each “client application” in order to be able to establish a “back channel” with an auth-server to exchange an auth-code for an access-token or to fetch user profile info (OIDC “userinfo endpoint”).

A client application can also obtain an “access token” for itself which proves the identity of the client application. Sometimes this is enough, ie REST endpoints can be implemented that check for a specific “client id” in incoming access tokens. However sometimes a REST endpoint wants to do checks based on rights rather than on identity.

In some auth-servers, it is possible to assign “scopes” to the client account. When the client application requests an access-token these scopes are then added to the token directly (no “consent” required of course). In some auth-servers it might be possible to link the client-account to an identity in an LDAP/ActiveDirectory instance and then configure the auth-server to add the rights from that account into the access-token in an implementation-specific manner. As noted earlier, resource-servers often use a library provided by the corresponding auth-server to validate and parse access-tokens and so do not need to know the details of this implementation-specific encoding.
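
A minimal sketch of a service obtaining a token for itself via the client-credentials grant; the token endpoint URL, client id, secret and scope name below are all hypothetical:

```python
import requests

TOKEN_URL = "https://auth.example.org/oauth2/token"  # hypothetical

def fetch_service_token(client_id, client_secret, scopes):
    resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials", "scope": " ".join(scopes)},
        auth=(client_id, client_secret),
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

# Example usage: token = fetch_service_token("billing-batch-job", "s3cr3t", ["read:invoices"])
```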

Development Processes (Quick Overview)

If you are writing a “client application” that invokes business-services provided by some external party, you just need to read the docs for the business-services you are invoking. Those docs should state which “scopes” are needed for each endpoint; your client app then needs to redirect the user to the auth-server belonging to that same external party and specify the required scopes. You will get back an “opaque” token that you pass in to the business service - and internally that business service implementation will then unpack that token and should find the permissions it needs.

If you are writing business services yourself, then you need to decide first what permissions you want. Then you need to look in your organisation’s auth-server to see if a scope exists which will trigger creation of an access-token with the permissions you need; if not you need to define a new scope that somehow pulls that data out of the user’s rights configuration and adds it to the access-token. Exactly how that is configured is very auth-server-dependent. Then you need to update your docs so users of your business services know what scopes to request from your organisation’s auth-server before calling your business service. The callers of your service don’t need to know what these scopes mean; they are just ‘magic names’ to the caller. Of course you need to ensure your scope-name does not collide with other definitions in your auth-server!

Each scope defined in the auth-server has an associated text description which is shown to the user when a client-application requests an access-token with that scope. The text description should clearly describe what the set of permissions that the scope returns allow the recipient of the token to do.

The data in the returned access token is not a list of scopes but rather some undefined representation of the rights that the requested scopes grant. The app that requests the token (the “client application”) just needs to know the scope-names to ask for, and gets back an “opaque” token in return that it does not need to interpret; it just passes it to the resource server that actually maintains the data. That resource-server does need to validate that the granted rights match the requested operation - ie does need to know how the rights are represented and is thus tightly coupled to the auth-server. However often the rights are a list of strings identical to the originally requested “scope” strings (at least for rights associated with access to “data for a specific user”).

First Party Consents

The “consent” feature of OAuth is very nice when the “resource server” that holds the data and the “client application” that reads it are different organisations. In this case the user has control over what data is accessed by whom.

However often a large organisation will run an auth-server, a resource-server and one or more client applications that use that resource-server. In this case the user is already very aware that they have given their data to this organisation (have uploaded it to the resource-server); it is therefore probably unnecessary to prompt them for consent when a client-application belonging to the same organisation accesses user data via that resource-server. Many auth-servers therefore offer the option to skip the consent step when the access-token request is from such “first party” client applications. Any external client application which wants an access token still has to get interactive user consent.

In the case of Keycloak, each client account has a simple boolean flag Consent Required; when this is not enabled then the user is not prompted for consent to scopes requested by this client. Obviously this should be ticked for any “third party” client application, in order to give users control over their data.

OAuth1, OAuth2, and Undefined Behaviour

The OAuth2 specification is more extensible and applicable to more use-cases than OAuth1. However, this flexibility comes at a cost.

OAuth1 is very tightly defined, and thus compliant implementations are pretty much compatible. OAuth2 however leaves a couple of features completely (and explicitly) undefined. In particular:

  • a client application cannot specify what “format” of token it desires; the auth-server decides what format to issue tokens in and indicates this in field token_type of its response. If the client application does not recognise the value in field token_type then it must not use the token. Normally this value will be Bearer, in which case the token can be passed in REST requests as an HTTP header of form Authorization: Bearer ${token} (see the sketch after this list).
  • even when token_type=Bearer is specified, the token value is simply defined to be a string (with some constraints). This value somehow encodes a set of access-rights, but how that is done is implementation-specific.
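
As a minimal illustration of the rules above, the following sketch (in Python, using the requests library) checks the token_type field before using the token and then passes it to a resource-server as a bearer token; the resource-server URL is a placeholder and the response values are taken from the example later in this article.

    # Minimal sketch: honour token_type before using the token (placeholder URL and values).
    import requests

    token_response = {"access_token": "mF_9.B5f-4.1JqM", "token_type": "Bearer", "expires_in": 3600}

    if token_response["token_type"].lower() != "bearer":
        # per the spec: if the client does not recognise token_type, it must not use the token
        raise ValueError("unsupported token_type: " + token_response["token_type"])

    # a bearer token is passed to the resource-server in the HTTP Authorization header
    resp = requests.get(
        "https://resource.example.com/api/some-endpoint",  # hypothetical resource-server endpoint
        headers={"Authorization": "Bearer " + token_response["access_token"]},
    )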

The fact that the format of a Bearer token is undefined means that every resource-server is tightly coupled to a specific auth-server implementation; each REST endpoint in a resource-server needs to check that the access-token provided along with the request is valid then check that the operation it has been requested to perform is consistent with (permitted by) that token. That means the resource-server needs to understand the access-token format (though the code that calls the resource-server, ie the client-application, does not). And that means direct coupling of the two implementations.

One of the primary developers of OAuth1 was involved in much of the development of OAuth2, but finally resigned in frustration. He believed the involvement of the “enterprise guys” in OAuth2 development led to a spec that was more of an “authorization framework” than an authorization protocol. This is actually a fair claim; anyone implementing a new auth-server has a lot of places where they can pretty much do whatever they like while still being “an OAuth2 implementation”. In particular, the lack of detail around bearer tokens (their format, and how they can be verified) was an issue.

OIDC does not change anything related to OAuth2 access-tokens, ie does not affect authorization. It does clearly specify the format of OIDC ID tokens, which must be JSON Web tokens with specific fields present.

Well-Known Auth Server Implementations

If you are a software architect wanting to implement “resource servers” which provide REST endpoints that accept OAuth2 access tokens as a way to authorize callers, then you will also need to have an OAuth2-compliant auth-server. There are two choices: host your own or pay for someone else to provide an instance for you to configure (“OAuth as a service”).

Here are some popular options:

Keycloak is an open-source project led by RedHat. RedHat also offers RedHat SSO, which is Keycloak with associated tech-support and official security patches. Okta and Auth0 are well-known hosted (“OAuth2 as a service”) offerings.

The Keycloak, Okta and Auth0 websites all provide excellent documentation on the details of using OAuth2; this article therefore does not try to replicate this. What is provided here is the “architectural context”.

Grant Types and Authorization Flows

Introduction

This section looks at how a client application obtains an access token or ID token.

The interaction between client and auth-server is in two parts:

  • the “token endpoint” performs “grants”: the caller provides some kind of (non-interactive) “authorization” and the server returns tokens
  • the “authorization endpoint” performs (sometimes interactive) “user login” and consent-validation, establishes a “login session”, and can return various output values

The term “grant” means “the thing that proves to the auth-server that it is allowed to return an access token”. The auth-server token endpoint supports multiple “grant types” which tells the auth-server:

  • what input parameters to expect in the incoming request, and
  • what output-parameters the caller expects

In some use-cases, a client uses only the token endpoint (eg “client credentials grant”, “refresh token grant”, “resource owner credentials grant”), as it has all the “proof” it needs to convince the auth-server that a token may be issued. In this situation, a “login session” is not created or needed.

For other use-cases, a client must request the user to visit the “authorization endpoint” to obtain a token (typically via an http-redirect); this results in a login-session being created. The authorization endpoint may return a token directly, but the commonly-used “authorization code flow” returns a one-time code that the client then uses in a call to the “token endpoint”.

The standard grant types and the “authorization code flow” are described below.

In practice, most client applications use some OAuth2 library to implement interactions with the user-agent and auth-server. The information below is useful for understanding exactly what that library is doing, but it is usually not necessary to actually write code at this level. These grant-types are identical across auth-server implementations (ie they are standardized) and therefore OAuth2 libraries that support such grant-types/flows are not auth-server-specific. When implementing a resource-server, this is not the case; the access-token associated with a request needs to be parsed and validated, and access-tokens are not standardized.

The Authorization Endpoint and Login Sessions

When a client application causes a user to interact with the authorization endpoint (typically via an http-redirect) this creates a “login session”. When the interface through which the user logs in is a web-browser (typical case) then cookies are returned which identify this login session; this allows later requests for authorization to complete without prompting the user for credentials again (single-signon). Login sessions typically have a lifetime which is configurable in the auth-server via both a max-lifetime and a max-inactive-lifetime. When a request is received by the auth-server at the authorization endpoint and the associated login session has expired then the user is required to log in again. A user may also explicitly “log out” in which case the session is also invalidated.

Invoking the token endpoint never creates a “login session” for a user. However when a client uses a standard refresh token as parameter to the token endpoint, this will be rejected when the session is no longer valid (ie a standard refresh token is linked to the login session through which it was allocated). An “offline refresh token” does not have this limitation; however users are typically warned when an application requests an offline refresh token (precisely because this gives long-lived access to user data).

Login sessions and associated issues are discussed in more detail later.

Client Credentials Grant

The “client credentials grant” allows a process which is registered as a “client application” with the auth-server to get an access-token for “itself”.

Such tokens embed the client-id, and can be passed to REST endpoints which explicitly check the incoming access-token for specific client-ids.

In particular, the token can be used with calls to endpoints offered by the auth-server itself. In addition to the standard OAuth2 endpoints a specific auth-server implementation may offer services such as “update the application icon” or “get statistics about this app” (eg number of users who have consented to access by this particular app).

Some auth-servers allow “rights” to be embedded into the access-token returned from a client login. Auth0, for example, allows an arbitrary set of scopes to be automatically embedded in the returned token. It may also be possible to automatically insert permissions/roles from an associated LDAP or ActiveDirectory account (in some auth-server-specific representation).

See this article from Okta for more details.

The “flow” required to obtain an access-token for a client (“client credentials flow”) is extremely simple; the request just includes client-id and client-secret and the token is returned as the response. Unlike the authorization-code-flow (see below) there is only one request/response interaction required to obtain the token.
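				
As a sketch of how simple this is - hedged: the token endpoint URL, client credentials and scope name below are placeholders, and exact parameter handling varies slightly between auth-servers:

    # Client credentials grant: a single POST to the token endpoint, authenticated with
    # the client-id/client-secret via HTTP basic auth; the response contains the access-token.
    import requests

    resp = requests.post(
        "https://auth.example.com/oauth2/token",                 # hypothetical token endpoint
        auth=("my-client-id", "my-client-secret"),               # placeholder client account credentials
        data={"grant_type": "client_credentials", "scope": "stats:read"},  # scope names are auth-server-specific
    )
    resp.raise_for_status()
    access_token = resp.json()["access_token"]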

The OAuth2 standard includes a “resource owner password credentials grant type” so it may initially seem tempting to use special user accounts as “system accounts”; here an application provides (userid, userpassword, scopes, clientid, clientpassword) to the auth-server and gets back an access-token for the specified user. However an auth-server has certain expectations about “user accounts” that don’t entirely work for system accounts, and it is best not to do this. Among other things, a user should “consent” to scopes before an access-token is issued; this can be worked around for “first party applications” (see earlier) but is still not elegant. An auth-server admin might also have a policy that all user accounts use multi-factor authentication; something that would then break “password grant” approaches. Instead, for cases of server-to-server calls where the operation is not specifically on behalf of a specific user, register the calling application as a separate client (ie create a client account) and use the Client Credentials grant.

Note that the “client secret” is usually a dynamically-generated password that the auth-server generates when the client account is created, and is transferred to the auth-server using “http basic authentication”. However an auth-server may choose to use a more secure authentication solution if desired (eg public/private key). If using the simple “http basic auth” approach, ensure that it is transferred to the auth-server over HTTPS!

Note also that HTTP basic authentication encodes both id and credentials as a single string which is base64-encoded and placed in the HTTP Authorization header. The OAuth2 specification examples show only a single string being passed for client authentication, but this string is internally a (client-id, client-secret) pair.
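
For illustration only (placeholder credentials), this is what that header contains:

    # HTTP basic auth: the pair "client-id:client-secret" is base64-encoded into one string.
    import base64

    raw = "my-client-id:my-client-secret".encode("utf-8")
    print("Authorization: Basic " + base64.b64encode(raw).decode("ascii"))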

Standard Authorization Flow for OIDC Authentication

The following process describes the interactions between user-agent, auth-server and “client application” when that client-application is a session-based webserver and the webserver wants to allow users to “log on” by providing their credentials to an OAuth2-compatible auth server (a very common configuration).

An OIDC authentication is not actually one of the core “grant types” defined by OAuth2; OIDC is an “extension” to OAuth2. It is very similar to the “OAuth2 Authorization Code Flow”, with just a few different parameters. This sequence is being described before the Authorization flow because it is commonly performed first; a session-based server application generally requires a user to “log in” before giving them options to interact with resource servers that require access-tokens.

To quote from the OIDC specification (section 3.1.2.1):

An Authentication Request is an OAuth 2.0 Authorization Request that requests that the End-User be authenticated by the Authorization Server.

The user accesses the website:

  • user accesses a session-based webserver, establishing an anonymous session
  • user performs some action (eg visits a specific page) on that webserver which requires identity information
  • webserver caches the action that the user tried to perform in the user session then redirects to the “authentication handler” code within the webserver code (typically a simple wrapper around a library that provides OAuth2 client support)

The (potentially interactive) “Authorization Request” now starts, in order to obtain an “auth code”:

  • webserver redirects user to any auth-server that the webserver has a valid client-account for (eg user gets to choose); this redirect is typically done via an HTTP redirect response whose location-header points to the selected auth-server
  • the auth-server (redirect target) URL includes
    • response-type = “code” (ie specifying the “authorization code flow”)
    • client-id of the webserver
    • audience-id = client-id
    • one or more “scopes” to specify what access-rights and profile-properties are needed – including string “openid” to trigger OIDC behaviour
    • a redirect-URL that points back to the “authorization endpoint” in the webserver
    • state (optional)
    • prompt (optional)
  • auth-server checks whether the client-id in the URL is valid (known account)
  • auth-server checks whether user is already logged-in (valid session cookie for auth-server provided); if not user is prompted to log in
  • auth-server checks whether user already has a “consent” for (clientid, requested-scopes); if not user is prompted to agree
  • auth-server checks whether the provided redirect-URL matches one of the values associated with the client-account
  • auth-server generates a temporary “auth code”, caches the “grant” result status in memory using key (clientid, auth-code) and sends a response to the user which includes:
    • the specified redirect-URL
    • the auth-code
    • and a few additional properties
  • user follows the redirect-URL, thus passing the “auth code” to the webserver
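
The following sketch shows how a webserver might build the redirect URL for this Authorization Request; the auth-server address, client-id and redirect URI are placeholders, and the parameter names used (response_type, client_id, redirect_uri, scope, state) are those defined by the specifications.

    # Sketch: building the authorization-endpoint URL for the OIDC authorization code flow.
    import secrets
    from urllib.parse import urlencode

    state = secrets.token_urlsafe(32)   # stored in the user's session for later comparison (see below)
    params = {
        "response_type": "code",
        "client_id": "my-webserver-client-id",                      # placeholder
        "redirect_uri": "https://app.example.com/oauth2/callback",  # must match the client account config
        "scope": "openid profile email",                            # "openid" triggers OIDC behaviour
        "state": state,
    }
    login_url = "https://auth.example.com/oauth2/authorize?" + urlencode(params)
    # the webserver now sends the user-agent an HTTP redirect to login_url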

From here, an additional call is needed to exchange the code (a grant) for tokens:

  • webserver connects to auth-server specifying:
    • client-id and client-secret
    • grant-type = “authorization-code”
    • code = (code from auth-server response above)
  • auth-server responds with all of the following:
    • an OIDC “identity token” (JSON Web token)
    • an OAuth2 “access token” (opaque token with undefined format)
    • a “refresh token” (opaque token with undefined format) - provided scope “offline_access” was requested
  • webserver extracts data from the “identity token” (whose format is defined in the OIDC standard) and stores this info in the user’s session together with a marker to indicate “loggedin=true”.
  • webserver then fetches the original operation that the user wanted to do from their session and performs it or does an internal redirect
  • and the operation that the user originally tried to perform is finally run - now as a logged-in user
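
A sketch of the back-channel call that exchanges the auth-code for tokens (placeholder URLs and credentials; the response fields are those defined by OAuth2/OIDC):

    # Sketch: exchanging the auth-code for an ID token, access-token and (optionally) refresh-token.
    import requests

    code_from_redirect = "..."   # placeholder: the auth-code delivered via the redirect-URL

    resp = requests.post(
        "https://auth.example.com/oauth2/token",                   # hypothetical token endpoint
        auth=("my-webserver-client-id", "my-client-secret"),       # the client authenticates itself here
        data={
            "grant_type": "authorization_code",
            "code": code_from_redirect,
            "redirect_uri": "https://app.example.com/oauth2/callback",
        },
    )
    resp.raise_for_status()
    tokens = resp.json()
    id_token = tokens["id_token"]             # JWT, format defined by OIDC
    access_token = tokens["access_token"]     # opaque to the client
    refresh_token = tokens.get("refresh_token")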

The “scope” parameter passed to the auth-server in a standard “OAuth2 authorization code flow” lists the permissions that the client-application desires to have in the returned access token. These are normally just arbitrary strings that have meaning to the resource-server and maybe the auth-server. The OIDC standard defines some “magic” scope strings that all OIDC-compliant auth-servers will recognise and respond to.

The most significant “magic scope” is “openid”; when this is present, the response returned when the “auth code” is exchanged for tokens (usually) includes three tokens:

  • an OIDC identity token
  • an OAuth2 access token
  • and an OAuth2 refresh token (if scope offline_access was requested)

The OIDC specification includes additional magic strings which indicate which “profile data items” the client application wishes to have access to. The user is prompted by the auth-server to agree to exposing these data items (“consent”). The scope “openid” is needed in order to obtain an “ID token”; as the OpenID Connect specification states:

If no openid scope value is present, the request may still be a valid OAuth 2.0 request, but is not an OpenID Connect request

The user is prompted for a “consent” only the first time a particular (clientid, scopes) set is presented. The information that the user is given looks something like “Application ${client-desc} is requesting the following rights: ${desc for scope1}, ${desc for scope2}, ..”. The client-desc is of course relatively important, and therefore auth-servers need to be somewhat careful when accepting requests for new client-accounts; they should validate that the description associated with the account is true and helpful. Some scopes are “built in” to the auth-server and have standard descriptions (which can usually be customised); other scopes can be added by auth-server admins with sufficient rights.

The auth-server’s authorization endpoint does not directly return any tokens, but instead a code which is then exchanged for tokens. The primary reason for this is that the call to do the exchange is performed directly from client-application to auth-server and requires the client-application to provide its credentials. The benefits are:

  • checking client credentials prevents many potential attacks but the user-agent cannot provide these credentials
  • the tokens never pass through the user’s system (browser or mobile device) and therefore are very difficult to intercept

The connection from the webserver to the auth-server is often called a “back channel”.

Although issuing an access-token without checking any client-credentials is not secure, the authorization-code-flow-with-pkce solution described below for mobile apps does do exactly this. That flow should only be used where absolutely necessary.

Having the auth-server redirect the user to a URL that is configured in the “client account” whose client-id was specified in the “auth grant” request is another safety-measure against applications using a false client-id; a website that a user is interacting with can claim to be a “client application” that it is not, but it never gets control returned to it after authentication!

The second safety measure against webservers using a false client-id is of course that they cannot exchange the auth-code for an access-token as that requires access to the (clientid, clientsecret) for the client account.

The redirect-URL provided in the original request (or the one in the client-account) points at an “authorization endpoint” in the webserver; it is this code that makes the direct call to the auth-server to exchange the auth-code for auth-token/access-token/refresh-token. After this is done, the code then forwards or redirects the request to the (protected) location the user originally requested - ie the URL that triggered authentication to occur. This location may have been stored in the user session before the authentication flow started, or might have been encoded in an optional “state” parameter included in the request to the auth-server; the auth-server appends any provided state parameter to the redirect-URL before redirecting back.

The state parameter can also be used to prevent “replay attacks” (eg via XSRF); before redirecting the user to the auth-server, a random value is generated and stored in the session as well as being placed in the state parameter. When the post-auth redirect is received by the webserver, the value from the state param is compared to the value in the session and then the value in the session is removed. Resending the same response again will fail as there is no state-value in the session to compare against (or there is one with a different value). When using an OAuth2 library to implement OAuth2 interactions (which is highly recommended), such safety-checks will be applied automatically.
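
A minimal sketch of that state check; the session object here is just a placeholder for whatever session storage the webserver uses:

    # Sketch: generating and verifying the "state" value to detect CSRF/replay.
    import secrets

    def begin_login(session):
        state = secrets.token_urlsafe(32)
        session["oauth_state"] = state          # remember what we sent
        return state                            # include this as the "state" request parameter

    def handle_callback(session, returned_state):
        expected = session.pop("oauth_state", None)   # remove it so a replayed response cannot match
        if expected is None or not secrets.compare_digest(expected, returned_state):
            raise PermissionError("state mismatch - possible CSRF or replay")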

The prompt parameter controls whether the auth-server should ask for user credentials or not; the default is to ask if-and-only-if the user is not already logged in (does not already have a cookie issued by the auth-server).

The access-token that is issued here provides rights to call the auth-server’s “userinfo” endpoint and get the items of data listed in the original scope list. If the scopes list also included other non-OIDC scopes then the access-token could also potentially be used to call other resource-servers which use the same auth-server as their token-issuer. However if the resource-servers being invoked require “audiences” to be set (not common) then a separate access-token must be issued. It may be good from a security point of view to issue a separate access-token anyway; the one issued along with the ID token has user-profile-related rights that might not be appropriate for passing on to other systems.

Note that the interactions described above are at “pseudo-code” level; there are some additional less-important params and the actual param-names are slightly different than written here. See the specification for the exact details - or better yet use an appropriate library.

Standard Authorization Code Flow For Access-Token Only

The following process describes the interactions between user-agent, auth-server and “client application” when that client-application is a session-based webserver and the webserver wants to access data belonging to the user which is stored in some other system. This description does not cover the case when the client-application is an app on a mobile device.

  • a user (who might be “logged in” or not) is interacting with a session-based webserver
  • the user performs some action that requires the webserver to access user data in some third-party resource-server
  • webserver checks user’s session and detects that no access-token is available to use with that call
  • webserver caches the action the user tried to perform in the user’s session
  • webserver redirects user to the auth-server associated with the target third-party (no other auth-server will do)
  • the auth-server URL specified includes
    • client-id of the webserver
    • audience-id (optional auth-server extension to OAuth2)
    • response-type=”code” (ie authorization-code flow)
    • one or more “scopes” to specify what access-rights are needed
    • redirect-URL (optional) that user is directed back to when auth is complete (ie authorization endpoint in webserver)
  • most of the following steps are the same as above - up until the point where the webserver has retrieved an access-token from the auth-server
  • webserver optionally stores the access-token in the user’s session (for later calls)
  • webserver then fetches the original operation that the user wanted to do from their session and performs it or does an internal redirect
  • and the call to the third-party resource server can now be performed; the necessary access-token is now in the user session and can be included in the call as a “bearer token”.

An access-token has a limited lifespan (typically less than 1 hour); the exact lifetime is chosen by the auth-server that issues it. It can therefore happen that a webserver that has cached an access-token in a user session has used it for multiple calls to the associated target resource-server but then eventually gets an authorization-failure indicating “token expired”. In that case the server can redirect the user again to the auth-server in order to obtain a new “auth code” which can be exchanged for a new access-token. Given that the user probably still has a valid login-cookie for the auth-server, they will not be prompted for their credentials. And unless the user has explicitly revoked their earlier “consent” for this specific (clientid, scopes) set, then the user will not be prompted for consent again. The new access-token is therefore usually issued automatically with just a few almost invisible http-redirects from the point of the user. Alternatively, the client application can use the “refresh token” which is usually issued together with each access-token; see later for more info on refresh tokens.

The fact that access-tokens can expire means that a “client application” using them must either:

  • check before each call that the access-token has at least N seconds before expiry, and renew it if necessary or
  • wrap every call to an external system in a retry loop, handling “token expired” by fetching a new token and retrying the call

Given that the value “N” above can be hard to estimate, and that it is embedded in the token which is theoretically “opaque”, the retry-loop option is often the best approach.
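
A sketch of the retry-loop approach (the resource-server URL and the token-renewal callback are placeholders; real code would also distinguish “token expired” from other authorization failures):

    # Sketch: wrap each resource-server call in a single retry that renews the access-token on 401.
    import requests

    def call_resource_server(url, access_token, get_new_access_token):
        resp = requests.get(url, headers={"Authorization": "Bearer " + access_token})
        if resp.status_code == 401:                  # token rejected - possibly expired
            access_token = get_new_access_token()    # eg via a refresh-token grant or a new redirect
            resp = requests.get(url, headers={"Authorization": "Bearer " + access_token})
        resp.raise_for_status()
        return resp.json()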

Note that an access-token issued by a third-party auth-server is opaque; its format is not defined and might change at any time. A client application that holds an access-token for a user should therefore not “peek into it” but instead just pass it along in calls to a resource-server associated with that auth-server.

Unlike the OIDC flow above, a refresh-token is normally delivered along with the access-token; no special scope needs to be requested.

Other Standard Flows

Of the four standard OAuth2 “grant types” (authorization flows) the two most important (client credential and authorization code) have been covered above, plus the OIDC variant for authentication.

The remaining flows are:

  • implicit flow (deprecated; mobile apps should use “authorization code flow” plus PKCE - see later)
  • refresh token flow (see later)
  • resource owner password flow in which the request for an access-token just directly includes:
    • client credentials (client-id/client-secret) and
    • user credentials (user-name/user-password)

The resource-owner-password-flow is useful only in specific cases and not discussed in this article. As with the client credentials flow, only a single request/response is required (not multiple phases as in the authorization code flow).

General Comments on Grant Types and Authorization Flows

The “authorization code flow” (and the OIDC variant) is a two-phase protocol; in this case the desired scopes are passed in the first phase (Authorization Request) and are not permitted in the second phase (token retrieval). The other flows/grant-types have only a single phase (token retrieval), and in these the desired scopes are passed in that step.

As noted later in the section on refresh-tokens, a refresh-token can be used to obtain an access-token with a subset of the originally-requested scopes.

Mobile Devices and OAuth2

Now that OAuth2 “authentication flows” for webservers have been covered, it might be helpful to briefly look at how mobile devices integrate with this.

In general, documentation on OAuth2 assumes that the user is using a web-browser to interact with some session-based system, and that when an OAuth2 token is required then the browser can simply be “redirected” to the website of the relevant auth-server. However the user might be using a web-browser on a mobile device or even a native client on a mobile device; in that case things might work slightly differently. There are two different issues to look at, addressed below.

See also RFC 8252 which describes “best practice” approaches for using OAuth2 in native client applications.

Interacting with the Auth Server from a Mobile Device

On most (all?) mobile device operating systems, an installed application can request the OS to “open a URL”. The OS then checks whether any “app” installed on the device has registered as being a “handler” for that URL. If so, then that app is opened and given the complete URL. If not, then a standard web-browser is opened. This means that an installed app that wishes to obtain an OAuth2 token can simply pass the appropriate URL to the OS; if (for example) this URL references Facebook’s auth-server and the user has a native Facebook login app installed, then that app opens. That login app then communicates with the corresponding back-end using some dedicated protocol (not the standard OAuth2 URLs) but eventually returns an OAuth2-compliant HTTP response to the calling app. If no dedicated login app matching that URL exists, then the OS instead opens a web-browser window for that URL. The app needing the OAuth2 token sees the same response either way.

Not being a mobile-app developer, I am not sure what happens if the user is using a web-browser to interact with a webserver that then requests a redirect to a URL for which the user has a dedicated login app - ie whether this “detect app handling URL” feature works for redirects within a browser or not. But at worst, the browser interface for the referenced auth-server is available.

Implementing a Client App on a Mobile Device (Authorization Code Flow With PKCE)

A native app installed on a mobile device may want to itself act as a “client application” with respect to some auth-server, ie to obtain an access-token for the user’s data on some resource-server and then contact that resource-server to perform reads or updates. An example might be a “twitter bot” that updates a user’s account following user-defined rules.

The problem is that the standard “authorization code flow” requires the client-application to present its client-id and client-secret when exchanging the auth-code for the actual tokens. However:

  • this client-secret will be identical for every installation of the app
  • and there is nowhere safe that an app can store that secret on a mobile device

As access to a client-secret allows the holder to impersonate the client application, a “safe” place for the client secret would need to be somewhere that not even the owner of the device on which the app is running can access - which is impossible. In the OAuth2 specification, client applications are therefore divided into two “types” - confidential and public - depending on whether they can safely store their client-secret. A webserver is an example of a “confidential” client-application while mobile-apps and single-page-apps are examples of the “public” type.

Native apps (being of the “public” type) should use a variant of the standard “Authorization Code Flow” called “Authorization Code Flow with PKCE”. PKCE doesn’t provide the client-secret, but instead effectively generates a temporary password whose hash is passed in the first “get auth code” step. The complete (temporary) password is then provided in the second “exchange code for token” step; the auth-server verifies that hash(password) matches the original hash. This doesn’t solve all the problems related to not having a client-secret available, but does at least fix some (see below). The auth-server typically allows the PKCE flow (ie omitting the client-secret) only for clients whose client-account is marked as “type=public”, ie the server has been notified that applications using this client-id are “not properly secure”. Various constraints are then applied to the operations such clients can perform.
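
A sketch of generating the PKCE values as defined in RFC 7636 (the “S256” method); only the Python standard library is needed:

    # Sketch: PKCE code_verifier ("temporary password") and code_challenge (its hash).
    import base64
    import hashlib
    import secrets

    code_verifier = secrets.token_urlsafe(64)        # random, kept by the app until the second step
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

    # step 1 ("get auth code"):           send code_challenge and code_challenge_method=S256
    # step 2 ("exchange code for token"): send the code_verifier; the auth-server re-hashes and compares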

The OAuth2 specification originally defined the “implicit grant flow” for this scenario (a client application which does not (cannot) have access to its client secret). However the “authorization code flow with PKCE” approach is significantly more secure (though not as secure as the standard flow).

Although the application does not use its client-secret, it must still have a “client account” with an auth-server in order to request and obtain access tokens. Note also that a single application can be a “client application” with respect to multiple auth-servers (if it accesses resources from multiple distinct organisations), and will require a separate client account with each one.

One threat the PKCE protocol protects against is interception of the HTTP requests. An attacker would need to intercept both requests and correlate them together. In addition, the auth-code that is issued as the first response is a one time code so the attacker would need to block the second request in order to be able to send their own request first.

Another threat that the PKCE protocol protects against on mobile devices is an evil local app. As noted above, an app on a mobile device can register itself as a “handler” for specific URLs. A bad app might register itself as the handler for URLs referencing popular auth-server addresses and perform a “man-in-the-middle” attack. The PKCE protocol relies on the fact that the native app will request the OS to handle the first “get auth code” step (ie launch a browser or login-app to handle login), but will handle the second step (exchange code for token) via a direct REST call that it does not ask the OS to handle.

Nevertheless, the lack of a client secret is a security problem. For any client-account which is marked as being usable via PKCE, any arbitrary application can pretend to be that client. The application cannot intercept user credentials (they are entered in a web-browser or native login app) but does receive the associated token. It is the responsibility of the user to know which application they are interacting with on their device in order to not give bad apps access to ID and access tokens that allow access to the user’s data.

Using ID tokens for Authentication

As described earlier, server-side client applications typically obtain an ID token then mark the user’s HTTP session as “logged in” (typically also storing the ID token in the HTTP session).

For a native mobile app, things are significantly simpler. To log a user on, the app opens the auth-server “authentication grant” URL - ie a URL that references the desired auth-server and provides parameters such as “client-id” and “response-type”. The OS will either open a native authentication app associated with that URL, or just open a web-browser window with that location. The user interacts with the auth-server, and eventually the native app receives an auth-code as response. It exchanges this auth-code for an ID token (via the authorization-code-grant-with-PKCE) and then just stores the ID token in memory.

There is a significant difference to server-side authentication however. A server-side app can use the “logon state” as implicit authorization to perform any operation on data associated with that user. A client-side app typically cannot perform any useful work without calling some services provided by remote systems; these will typically require an access token. The ID token is therefore actually of little use to a native client app; it does provide a way to get user data from the centralised OIDC profile, but accessing any remote services requires a redirect to the auth-server associated with that remote service in order to obtain a suitable access-token.

External Agents and Embedded Web Views

Mobile device OSes typically provide two ways for a native client application to display HTML content to its user:

  • by starting an instance of “the default web browser” in a separate “window” (aka an “external agent”), or
  • by embedding an OS-provided “web view” component within the native application’s frame (just like a button, slider, or other native component)

A native app has very little control over an “external agent”; it cannot intercept data-flows, modify HTML, watch keystrokes, inject Javascript, read/write cookies, etc. In contrast, an app typically has a lot of control over an HTML-capable component that it has embedded into its own “window”.

As described in the “best practice” link above, sending a user to an auth-server authorization endpoint to “log in” should be done via a web-browser in a separate window. This is for the user’s own security; the whole point of OAuth2 is that a client application doesn’t get to see the user’s credentials, just the access tokens. The lack of control that a native app has over an external agent ensures this is true even for an “evil” native app. With an embedded HTML component, however, credentials can be intercepted directly or the cookies that the auth-server sets can be read. Unfortunately this means that when a user has logged in to an auth-server via the external agent, any embedded web view component will lack the “session cookies” representing this login session and therefore cannot perform “single signon”. There is no simple solution for this - displaying OAuth2-protected websites within a web-view is basically impossible without significant hacks to the auth-server. A system I am currently working on resolves this by extending the (self-hosted) auth-server to support an access-token (with appropriate scope) as a valid credential; the native app allocates a suitable access-token and injects it into the web-view as a cookie. When the web-view redirects to the auth-server authentication endpoint this then causes a login session to be created within the web-view. Extensions to OAuth2/OpenID Connect should be viewed with caution, and I do not guarantee that this approach is safe with respect to security.

Single Page Applications and OAuth2

A single-page-app is where an application is implemented using Javascript (or similar) running within a web-browser on a user-owned system, and wishes to act as a “client application”.

Like a “native app” on a mobile device that is acting as a “client application”, a single-page-app has nowhere that a “client secret” can be safely stored - and the “client secret” will be identical for every instance of the app (every browser in which the code runs). The recommendation here is to use the “authorization code flow with PKCE” to obtain tokens from the auth-server; this does not protect the access-tokens from interception as well as the standard authorization-code-flow does, but it is usually considered sufficient.

Auth Servers

Most of the responsibilities of an “auth server” have already been described.

It must have access to a database of (user-id, credentials). It might maintain this database itself, or might just be a front-end that relies on an external store such as LDAP or ActiveDirectory.

If it maintains a user database itself, it needs to provide an interface for users to create accounts, delete accounts, manage their credentials, and manage their user profile information.

It needs to provide an interface for users to enter their username and credentials; it might verify these directly or pass them to some back-end system to validate. Ideally the auth-server would support credentials more secure than just passwords, eg hardware-token-based authentication. This might require a moderately complex user interface for login.

It definitely needs to maintain a database of “consents”. It also needs a user-interface for users to view and revoke their consents. Once a consent has been granted, a client application that uses the “redirect approach” to request the same (clientid, scopes) set that has already been consented to will receive the token without prompting the user for consent again. A client application which has a refresh-token is able to obtain a new “access token” as long as the consent has not been revoked.

Note that there is no way to invalidate an “access token” until its expiry time has been reached. This is somewhat inconvenient, but greatly reduces network traffic and load on servers as the recipient of an access-token does not need to contact the associated server to validate it. The lifetime of an access-token is typically around 1 hour.

The most significant benefit of using an auth-server to issue access-tokens is that a client-application may act “on behalf of a user” without ever having received the user’s credentials.

When the auth-server supports OIDC then the tokens it issues are JWT tokens which are signed with the auth-server’s private key. The auth-server must make its public keys available so that resource-servers which need to validate access-tokens and session-based applications that need to validate authentication-tokens can download the necessary keys. Keys rarely change, and thus they can be cached effectively. JWT tokens specify the “name” of the key they are signed with (the kid header) in order to implement “key rollover”, where the auth-server starts using a new key (while still providing the old key in order to validate old tokens).
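
A sketch of validating such a JWT against the auth-server’s published keys, here using the PyJWT library; the JWKS URL, audience and token value are placeholders (many auth-servers publish the real JWKS URL in their openid-configuration document, see below):

    # Sketch: verify a JWT (eg an OIDC ID token) against the auth-server's public keys.
    import jwt  # pip install "pyjwt[crypto]"

    id_token = "..."   # placeholder: the JWT received from the auth-server

    jwks_client = jwt.PyJWKClient("https://auth.example.com/.well-known/jwks.json")  # hypothetical JWKS URL
    signing_key = jwks_client.get_signing_key_from_jwt(id_token)  # selects the key named in the token's "kid" header
    claims = jwt.decode(
        id_token,
        signing_key.key,
        algorithms=["RS256"],
        audience="my-client-id",   # OIDC requires the client to check it is in the token's audience
    )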

Hub Providers

An auth-server can act as a front-end (proxy) to multiple other auth-servers for the purpose of OIDC authentication. The company Auth0 provides such a hub, delegating to around 50 other providers as needed. Any session-based client app that wants to allow users to “log in” from a range of servers can therefore just redirect to such a hub and let that hub delegate to the auth-server of the user’s choice.

Delegation makes no sense for authorization as the resource server that requires an access-token is tightly coupled to a specific auth server.

Auth-Server Login Sessions and Remember Me

When an auth-server receives credentials from a user and confirms they are valid, the response it returns includes a set-cookie command that indicates the user is “currently logged in” to the auth-server. It is this cookie that allows future requests to be successful immediately without requiring user interaction (as long as there is an existing consent on all the requested scopes for the specified client-application).

Normally this cookie has session-scope, and the auth-server session is kept alive for a few hours. However after an extended period of inactivity, the “session” held by the authserver can be terminated, meaning that the next time a client-application redirects to the auth-server for a token, the user is prompted for credentials again. If the user closes their browser, the session-scoped cookie is deleted (standard behaviour) and therefore even when the server-side session still exists, the browser is no longer “connected to it” and therefore the user is also prompted for credentials.

Some auth-servers provide a checkbox on the login page labelled “Remember Me” or similar. In this case, the server gives the “login cookie” a lifetime longer than just the current browser session, and extends the timeout on the server-side session. Exactly how long the cookie and corresponding server-side session lifetimes are is usually configurable in the auth-server (eg from a few hours to a few months).

If you intend to rely on this functionality within your application, make sure that the auth-server you are using scales appropriately. At the current time, the Keycloak auth-server (also known as RedHat SSO) stores all active sessions in memory (replicated between server instances), which means cluster restarts typically lose “rememberme” sessions, and scalability is limited (to a few thousand concurrent rememberme sessions).

The fact that rememberme typically relies on cookies has security implications; anyone with access to the browser, or the ability to steal cookies, can then log in as that user. In general, a user should not select this option on important websites such as e-banking. Despite this potential weakness, the cookie issued when “remember me” is selected is still more secure than the “save my password” option provided by many web-browsers as the “remember me cookie” is a token and not the actual credentials.

A user should also avoid enabling this option on a browser which is shared with other users. Note however that public shared computers typically have a “logout” option for the whole user session and this ensures all browser details (cookies, history, cached data) are purged.

Bootstrapping provider support in OIDC

By convention, an OIDC provider (auth-server) serves a json document from GET “https://{hostname}/.well-known/openid-configuration” which contains:

  • authorization_endpoint: The URL where end-users will authenticate
  • claims_supported: An array containing the claims supported (more on that later)
  • issuer: The identifier of the OIDC provider (usually its domain)
  • jwks_uri: Where the provider exposes public keys that can be used to validate tokens
  • token_endpoint: The URL that apps can use to fetch tokens
  • userinfo_endpoint: The URL where apps can learn more about a particular user

While this info potentially makes it possible for an OIDC client app to dynamically support any identity provider (eg one that I run on my own private infrastructure), in practice most OIDC-supporting applications (servers) instead just support a hard-wired list of providers. The main barrier to dynamically supporting identity providers is that each client-application requires a “client account” with an auth-server; the OIDC spec does describe how dynamic client account registration can work but this seems difficult to apply in practice.
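
A sketch of fetching and using the discovery document (the issuer URL is a placeholder):

    # Sketch: discover an OIDC provider's endpoints from its well-known configuration document.
    import requests

    issuer = "https://auth.example.com"   # hypothetical OIDC provider
    config = requests.get(issuer + "/.well-known/openid-configuration").json()

    authorization_endpoint = config["authorization_endpoint"]   # where users are sent to log in
    token_endpoint = config["token_endpoint"]                   # where codes/refresh-tokens are exchanged
    jwks_uri = config["jwks_uri"]                               # public keys for validating tokens
    userinfo_endpoint = config["userinfo_endpoint"]             # profile data for the logged-in user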

The set of scopes an identity-provider supports is listed in its openid-configuration file, but typically is:

  • profile: A scope that requests access to claims like name, family_name, and birthdate
  • email: A scope that requests access to the email and email_verified claims
  • address: A scope that requests access to the address claim
  • phone: A scope that requests access to the phone_number and phone_number_verified claims

The Audience Attribute

There is a potential security issue with OAuth2 that the specification does not address: what if the following occurs?

  • multiple resource-servers use the same auth-server;
  • a client application requests a token from the auth-server intending to use it with resource-server-1
  • but some attacker intercepts the token and uses it in calls to resource-server-2

One solution would be for the client-application to include one or more “audience ids” in its request to the auth-server which would be embedded in the resulting access-token. If each resource-server verifies that its id (in whatever form it chooses) is contained in the “audience list” for the token, then such an attack would be blocked.

A draft RFC for such a solution for OAuth2 was created, but never approved - perhaps because such attacks are unlikely to be common for OAuth2, given that the attack only works within the set of resource-servers that share a common auth-server - ie at most the set of resource-servers within a single organisation. However the OAuth2 specification does state:

Authenticating resource owners to clients is out of scope for this specification. Any specification that uses the authorization process as a form of delegated end-user authentication to the client (e.g., third-party sign-in service) MUST NOT use the implicit flow without additional security mechanisms that would enable the client to determine if the access token was issued for its use (e.g., audience-restricting the access token).

Some OAuth2-compliant auth-server implementations do support “audiences” as an auth-server-specific extension, and some resource-servers require tokens with an embedded audience. This appears to be the case with the endpoints that auth0.com offers its customers for example.

A resource-server that “manages user music preferences” and works with an auth-server that supports audiences can define an “audience string” for itself such as http://acme.com/musicprefs, and document that any client app wanting to communicate with the server would need to ensure that that string is included as the “audience” of the access-token that it requests before calling that server. The URL is just a unique string (it uses URL form just to ensure uniqueness); the server can run at any address. And a server can accept tokens for any audience it wishes to - or ignore the audience completely. However validating the audience string limits the damage that a misused token can do; an attacker might get a “read user data” token for service X (audience http://acme.com/X) but cannot pass it to any other server which expects a different audience.
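
A sketch of such an audience check in a resource-server, under the assumption that this particular auth-server issues JWT access-tokens (the format is implementation-specific, so this is illustrative only; the signing-key handling is omitted):

    # Sketch: a resource-server rejecting tokens that were not issued for its audience.
    import jwt

    EXPECTED_AUDIENCE = "http://acme.com/musicprefs"   # the audience string this server documents

    def check_token(access_token, signing_key):
        # jwt.decode raises an exception if EXPECTED_AUDIENCE is not in the token's "aud" claim
        return jwt.decode(access_token, signing_key, algorithms=["RS256"], audience=EXPECTED_AUDIENCE)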

OIDC specifies that ID tokens always include an “audience” field, and that any client application using that ID token must verify that the audience field of the ID token contains its own client-id value. Presumably this solves some OIDC-specific security problem - perhaps related to the fact that OIDC is explicitly designed to allow ID tokens from arbitrary auth-servers to be used as proof of user identity.

An application which is in possession of a token, but which is not the “audience” for the token, should not peek into it or evaluate it; the only valid use of that token is to pass it to a system which is a valid “audience” (or which in turn passes it on). In particular, what each “claim” actually means is something that only the audience of the token is able to correctly interpret; for all others the access-token is just a block of data without meaning - ie “opaque”.

JSON Web tokens support an audience attribute (the aud claim).

Refresh Tokens

When an OAuth2 access-token is issued, a refresh-token is usually delivered with it. An auth-server does not include a refresh-token for certain insecure “flows” (eg the obsolete implicit-auth flow), and might be configured to not include one under other circumstances. However in general it can be assumed that a refresh-token will be included when using the standard authorization-code-flow.

Example response received after exchanging an authorization code for a token:

     {
       "access_token":"mF_9.B5f-4.1JqM",
       "token_type":"Bearer",
       "expires_in":3600,
       "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA"
     }

The access-token expires in a relatively short time (eg 1 hour). The refresh-token can be stored and allows the server to retrieve a new access-token using server-to-server calls without involving the user interactively. Refresh tokens typically have a much longer lifetime, and in some configurations do not expire at all.

A client application holding a refresh-token can call the auth-server to obtain a new access-token; the call specifies a set of scopes which can be the same as the original set, or a subset of them. The auth-server response provides the new access-token.
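
A sketch of that refresh-token grant (placeholder endpoint, credentials and scope; the scope parameter may be omitted, in which case the originally granted scopes are used):

    # Sketch: exchange a stored refresh-token for a new access-token.
    import requests

    stored_refresh_token = "..."   # placeholder: the refresh-token previously issued to this client

    resp = requests.post(
        "https://auth.example.com/oauth2/token",       # hypothetical token endpoint
        auth=("my-client-id", "my-client-secret"),     # confidential clients must authenticate here
        data={
            "grant_type": "refresh_token",
            "refresh_token": stored_refresh_token,
            "scope": "musicprefs:read",                # optional; must be a subset of the original scopes
        },
    )
    resp.raise_for_status()
    new_access_token = resp.json()["access_token"]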

If a user revokes their consent, the access-token continues to be valid until it expires. However when the refresh-token is used to retrieve a new access-token, the refresh will fail. The step of “refreshing” a token is effectively the point at which a user can “regain control” of their tokens; using access-tokens with a longer expiry time does not provide the same level of control.

A refresh token should be kept relatively secret, and only access-tokens should be forwarded. Leaking a refresh-token is a relatively serious security hole - though not as serious as leaking a user credential. Some auth-servers increase the safety of refresh tokens by tying each token to the client-account through which it was first issued; when that client is a “confidential client” (one with a client secret) then an intercepted refresh-token cannot be used without the corresponding client-secret. Note however that if the refresh-token is issued via a “public client” then it can be used by anyone who simply knows the client-id (no secret required). This security feature is also not part of the official standard as far as I know. In either case, it is best to keep refresh tokens only in memory (non-persistent storage) if possible.

There are two ways to use refresh tokens:

  • keep track of the “expiry time” associated with each access-token (included in the auth-server response that issues the token) and verify before each call that uses the access-token that the token still has “enough time to live” (fetching a new token if not), or
  • wrap each call that uses an access-token in a retry loop; if an error is returned indicating “token expired” then fetch a new access-token and repeat the call.

Exchanging a refresh-token for a new access-token is actually a “grant type” similar to the authorization-code-grant or client-credentials-grant; the refresh-token is the “proof” that the access-token should be issued (the grant).

Refresh tokens are not so important when working with authentication (OIDC), as a “login” typically lasts until the session expires. However OIDC does support refresh-tokens, though with some minor differences from core OAuth2:

  • when the auth-server provides an ID token it includes a refresh-token only if the Authorization Request also included scope offline_access
  • When requesting new tokens using the refresh-token, a new access-token (one providing access to the userinfo endpoint) is always generated; a new ID token is generated only if the “scope” list includes “openid”.

Because new ID tokens can be generated via a refresh-token, an id-token does not necessarily represent a “recent interactive login”. If the consumer of an id-token cares about when the user was last authenticated, it can check for a field auth_time in the id-token. This field is present only when the issuing request included param max_age or require_auth_time. Even without “offline access”, the refresh token issued along with an ID token can be used to check whether the user is currently logged in (and if so, their user profile data can be fetched).

A refresh token can only be used to allocate new access-tokens with the same scopes that were originally requested when the refresh-token was issued (or a subset thereof).

Refresh Tokens, Sessions, and Lifetimes

An auth-server keeps track of all “currently logged in users”, keyed by their user-id. This session typically expires after an idle-time of 30 minutes or so. Performing an explicit “logout” at the auth-server terminates the session immediately. When a user authenticates via a browser, a cookie is also set (with the domain of the auth-server) which identifies the login-session but the login-session is not 1:1 with the browser session; logging in with multiple browsers will connect each to the same session while logging out terminates the auth-server session even though the browser is active.

Although a refresh-token has a long lifetime (eg 60 days by default for Keycloak), it can be used only when the user has a valid login session with the auth-server. Any attempt to get a new access-token or id-token using a refresh-token will fail (error-code returned) if the auth-server does not consider that user “currently logged in”.

For security reasons, some auth-servers support “rolling refresh tokens”: when a refresh-token is used in a call to the auth-server, the response includes not only a new access-token, but also a new refresh-token and the old refresh-token is marked (server-side) as invalid. In effect, with this feature enabled, refresh-tokens become “one-time tokens”.

WARNING: storing refresh-tokens in cookies, in files, or in any other form, is a potential security risk. Offline refresh tokens are an even higher risk, given that the user does not need to be online and that the lifetime is even longer. Therefore do this only when necessary. On the other hand, storing such tokens is still better than the old approach of storing user credentials, as:

  • refresh tokens can only obtain an access-token with the scopes that were specified in the initial request (no privilege escalation)
  • the user credentials (password etc) are never exposed
  • and the user can revoke tokens

See here for further details on refresh tokens in general.

See here for details of various session and token timeout settings in the Keycloak auth-server (other servers will likely have similar options). Note that:

  • default lifetime for a login session is 30 minutes after last interaction with the auth-server (SSO Session Idle) - and the same default is used when the user selects “remember me”.
  • default lifetime for an offline-refresh-token is 30 days after last interaction with the auth-server (Offline Session Idle). Note that this isn’t actually an expiry-time, but rather that the token will be auto-blacklisted if not used within the offline-session-idle period.
  • default lifetime for an access-token is 1 minute (Access Token Lifespan)
  • default lifetime for a standard refresh-token is the same as the login session (Client Session Idle) and is settable per-client-account

Offline Refresh Tokens

An “offline refresh-token” is a variant of the refresh-token which can be used to obtain user-profile-data and user-access-tokens regardless of whether the user currently has a login-session with the auth-server. This can be used to support “long-term login” behaviour, but at a price: the refresh-token needs to be stored somewhere long-term and an attacker who obtains access to that token can perform actions as the user regardless of whether the user is online or not. This kind of token is commonly used for applications that need to run “in the background” on behalf of a user, but can also potentially be used for an interactive application, saving the user from having to enter their credentials on app startup. To obtain an offline refresh-token, the client application must specify scope offline_access when requesting the initial tokens. Often this scope requires the user-account to be “enabled for offline access” (eg with a corresponding role). Users are also typically presented with a warning message during the consent-screen, informing them that the client application is requesting such an offline-token. Offline refresh tokens typically have an even longer lifetime than standard refresh tokens. As with normal consents and refresh-tokens, a user can use the auth-server user interface to list all offline refresh tokens and revoke them - in which case the next use of that refresh-token to obtain a new access-token will fail.

Refresh Tokens and Permanent Login

Some sites/applications with relatively low security concerns wish to leave users “signed in” on a specific device for long periods of time - ie once logged in, the user remains logged in and doesn’t need to provide their credentials again.

The simplest solution is just to configure your auth-server to have a very long lifetime for user sessions. However before relying on this, ensure that the auth-server scales to the number of concurrent login sessions that you are expecting to support. The Keycloak (RedHat SSO) server currently does not scale to large numbers of concurrent login sessions and other auth-server implementations might have a similar limitation.

As described in the “Offline Refresh Tokens” section above, an offline refresh-token can be used to obtain user-profile-data and access-tokens even when the user has no current login-session with the auth-server, which supports exactly this “long-term login” behaviour - at the price of having to store the token somewhere long-term, with the corresponding risk if it is stolen. Users still retain control over their data as they can use the native web interface of any auth-server to list and revoke offline refresh tokens.

Mobile app platforms typically provide “secure storage” APIs that can be appropriate for storing a user’s offline refresh token.

For server-side web applications, storage of refresh-tokens is far more complex. Simply placing a refresh-token in a cookie associated with the application’s domain is fairly risky; cookies can be leaked in various ways, and a stolen refresh-token gives an attacker a lot of possibilities. Storing the refresh-token server-side and issuing a cookie referencing that “long-lived authentication” seems like a somewhat more secure solution. If you have any better ideas, please comment on this article!

One limitation of the offline-refresh-token approach is that although the client application can use the refresh-token to obtain access-tokens and invoke resource-servers, there is no proper “login session” for the user. This in turn means that redirecting a user to other websites will not result in an elegant “single signon” experience; the user will be prompted for their password. In a system I am currently working on, we have resolved this by extending our auth-server (which has a plugin api) to implement a custom authentication flow which accepts an access-token (with a specific scope) and creates a user login session. Extending the standard OAuth2/OpenID Connect flows with custom logic should of course always be done with great caution and I do not guarantee that this approach is sound with regards to security.

Rest-with-Session and Hybrid Authorization

There are two basic approaches to building servers, each with a different authorization approach:

  • session-based systems where each request contains a session-id that references a session datastructure
    • if the URL is for a “protected resource” but the session does not hold a “login context” then
      • the user is redirected to login
      • on redirect back to server, an “auth code” is provided; the server exchanges this code for a token then inserts a “login context” into the session
  • stateless Rest-based systems where each request optionally contains a bearer-token as an http header
    • if the URL is for a “protected resource” but no bearer-token is present, or it is invalid, or it has insufficient permissions, then
      • a simple error is returned (not a redirect)

The stateless system does not need to “redirect”; the author of the calling application is expected to have read the documentation for the Rest-endpoint that is being called (ie know which auth-server and scopes to use), and the calling application should have already obtained the necessary token.

Normally session-based systems do not provide “Rest service endpoints”, ie requests which send JSON and receive JSON in a “remote procedure call” approach; instead they support posting of HTML forms and receive back either an HTML page or at least an HTML fragment (AJAX).

However a “Rest with sessions” approach can also be used - sometimes deliberately, but quite possibly also just by accident. Here, when a client calls a Rest endpoint (ie passes JSON or at least expects JSON back) and the call requires authorization, a redirect is issued to the auth-server. And on return, an “auth code” is processed, a session established, and a session-id cookie returned. Then on future Rest-calls, the session-cookie is provided and so the Rest calls are authorized via the session.

Traditional Rest interfaces do not expect cookies as input, do not set cookies as output, do not send redirects, and are stateless. This “Rest with sessions” approach is therefore somewhat unusual. Most significantly, this approach only works when the caller supports cookies like a web-browser does. However this can be true in some circumstances:

  • a server which handles an initial GET by returning HTML and javascript resources that form a “rich client application” (Javascript in the client then calls the server making Rest APIs)
  • some native app frameworks process “set-cookie” headers returned by Rest calls

The Rest-with-sessions mode has some behaviours that you should be very wary of:

  • although the server provides “Rest APIs” that are normally stateless, you will need an external session store (unless running on one node or using sticky sessions)
  • server-to-server calls of the Rest endpoints typically will not support cookies.

To support server-to-server calls, your security layer could be configured to support bearer-tokens as well as sessions (“hybrid” authorization). And in fact some OAuth2/OIDC libraries do this almost automatically (eg the spring-security-keycloak integration). This does however have a disadvantage: having some callers be authorized via a cookie referencing an HTTP session while others are authorized via a bearer-token is complex and just plain odd. And that is never a good thing when designing a security-sensitive system.

One issue with this “hybrid” authorization is that server-to-server calls which fail to provide a valid bearer-token will receive a redirect rather than an error-code.

One further inconsistency is that if the session-based authorization is requesting scope “openid” when redirecting on missing authorization (ie requiring “login”) then the grant flow is an OIDC flow returning id-token and access-token, and the id-token is guaranteed to have an “audience” matching the application. However the bearer-token authorization will typically not check for audience (as it is an OAuth2 process and not an OIDC process). If you are relying on the fact that the ID token is valid only for a specific audience (not common but possible) then you have a security hole.

My personal opinion: although I can’t identify a specific flaw in the Rest-with-session or hybrid approaches (except the audience issue), having Rest calls depend on cookies holding http-session-ids just doesn’t feel like a good idea to me. Having mixed authentication approaches (ie supporting both browser-to-rest calls with sessions and server-to-rest calls with bearer tokens) feels even less good.

Client Applications Using Multiple Resource Servers

An application that calls multiple resource-servers (APIs) associated with the same auth-server can either request a single access-token that contains permissions for all APIs, or obtain multiple tokens.

The multiple-token approach is more effort, but more secure: an evil (or simply vulnerable) implementation of an API that receives a broad-capability access token can do more damage than one that receives a narrowly-scoped token. The identity-provider should be able to issue the additional tokens without user interaction, so the user experience is similar.

A refresh token can be used to obtain additional access-tokens which represent a subset of the scopes that were specified when the refresh token was generated; this provides an easy way to generate tokens with different scopes for different resource-server-endpoints that share a common auth-server. See this article for further info on this topic.
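
A sketch of such a “down-scoped” token request (Python; same placeholders as the earlier refresh example - only the scope parameter differs):

```python
# Request a narrowly-scoped access-token using an existing refresh-token.
import requests

TOKEN_ENDPOINT = "https://auth.example.com/realms/demo/protocol/openid-connect/token"
refresh_token = "...refresh token obtained earlier..."

resp = requests.post(
    TOKEN_ENDPOINT,
    data={
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": "my-client",
        "client_secret": "my-client-secret",
        "scope": "photos:read",  # must be a subset of the originally granted scopes
    },
    timeout=10,
)
photos_access_token = resp.json()["access_token"]  # usable only against the photos API
```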

OIDC ID Tokens

When a client-application makes a call to the auth-server “token endpoint” to exchange a grant (an auth-code or inline credentials) for a set of tokens, the response is a JSON structure including a field id_token holding a (base64url-encoded, signed) JSON Web Token whose fields are specified in the OIDC specification.

Within the ID token, all the standard JWT fields are present:

  • iss: the identifier of the auth-server instance
  • sub: the identifier of the associated user (unique for that auth-server)
  • aud: an array of values which must include the client-id
  • iat: timestamp at which the ID token was created (issued at)
  • exp: timestamp after which the ID token should no longer be accepted (expiry)

OIDC adds several additional fields to the returned token, including:

  • auth_time: the time at which the user last interactively authenticated (ie how recent the actual login was)

The ID token is signed in the usual manner for JWT tokens. When the token is retrieved from the auth-server token endpoint (the usual case) then the token does not need to be encrypted as it is already transported over HTTPS. The client-application also does not need to verify the signature when it has received the token directly from the auth-server.
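
For the cases where an ID token is received by some less-trusted route and does need checking, a minimal verification sketch (assuming the Python PyJWT library; the JWKS URL, issuer and client-id are placeholders - in practice they come from the auth-server’s discovery document and your client registration):

```python
# Verify an ID token's signature and its iss/aud/exp claims against the
# auth-server's published keys (JWKS).
import jwt  # PyJWT

jwks_client = jwt.PyJWKClient(
    "https://auth.example.com/realms/demo/protocol/openid-connect/certs"
)

def verify_id_token(id_token: str, client_id: str) -> dict:
    signing_key = jwks_client.get_signing_key_from_jwt(id_token)
    return jwt.decode(
        id_token,
        signing_key.key,
        algorithms=["RS256"],
        audience=client_id,                             # aud must include our client-id
        issuer="https://auth.example.com/realms/demo",  # iss must be the expected auth-server
    )  # raises an exception if the signature, audience, issuer or expiry is invalid
```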

As noted here:

  • when a server-side app X receives an OAuth2 access token that grants access to a user’s profile info from identity-provider Y, this does not mean that X should consider the caller to be “logged in” for the purposes of performing other operations on site X. An access token grants access; that is all it was designed to do. The caller is an application which has been granted that right at some point in time - which may have been a long time ago, and the right to access profile-info from Y is what is granted, not the right to access user data from X.

  • the solution is for X to require an id token whose “audience” is set to the “client id” of X; that guarantees that the token has been issued for the purpose of “authenticating to site X as user U” - with a side-effect of also granting access to some profile info about U. This is why OIDC exists as a layer on top of OAuth.

When the “scope” parameter for the token request includes some of the OIDC magic “user data” scopes (eg “email”) then normally the returned ID token does not include that data directly; instead the access-token that is issued provides rights to query that data from the OIDC userinfo endpoint. User data is embedded in the ID token only in the following situations:

  • when a grant-type is used that returns an ID token but not an access-token (rarely used feature)
  • or when the authorization request (phase 1 of an authorization code flow) specifies optional parameter “claims”, listing the fields that should be embedded in the ID token

The “claims” property of an authorization request is not a simple list of scopes, but instead a moderately complicated nested structure. This field can be used to request non-standard user-related data, or non-standard combinations of user-data fields. The claims property also makes it possible to fetch user-related data in a specific language or script (eg fetch the Katakana version of a person’s name). See the OIDC spec for full details.
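
As an illustration (not a complete authorization request), the claims structure might look like this - the claim names used are standard OIDC claims, and the locale-tagged variant is just an example:

```python
# Build the "claims" query parameter: a nested JSON structure selecting which
# claims should appear in the ID token vs the userinfo response.
import json
from urllib.parse import urlencode

claims = {
    "id_token": {
        "email": {"essential": True},   # embed the email claim directly in the ID token
        "name#ja-Kana-JP": None,        # eg request the Katakana rendering of the name
    },
    "userinfo": {
        "picture": None,                # make the picture claim available via userinfo
    },
}
query_fragment = urlencode({"claims": json.dumps(claims)})
# ...this is appended to the usual authorization-request parameters
# (response_type, client_id, redirect_uri, scope, ...).
```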

Like all tokens, ID tokens have an “expiry date”. However unlike access-token expiry dates, it is not entirely clear how an application should respond when the id-token that it holds for a user has expired:

  • should an app check for each request that the id-token in the session is still “valid”, and if so fetch a new one via refresh?
  • should an app check for expiry when the user requests an operation that requires a logged-in user and if so fetch a new one?
  • what do standard OAuth2 libraries do in this area?

It isn’t clear whether any checks for id-token expiry are needed at all; a “log-on” effectively authorizes a session. As long as the session lives, it doesn’t seem important to check whether the user is still “signed on” as far as the auth-server is concerned. See the section on “OIDC Single Sign Off” for other relevant issues.

An ID token is not usually passed to another application; a client application retrieves one from an OIDC-enabled auth-server (aka an OIDC Identity Provider, IdP, or OP) and then marks the user’s session as “logged in”. An ID token cannot be passed as a “bearer token” in an “Authorization:” HTTP header; only access tokens can be used in that way. This includes calls to the auth-server “userinfo” endpoint; those require an access token to be passed - the one that was issued at the same time as the ID token.

An ID token can potentially be passed to some other process in a different HTTP header, or simply in the message body, but there is not a lot a recipient can do with an ID token; it effectively just proves that some auth-server (the issuer) states that some user (the subject) agreed (explicitly or implicitly) to log in to some client application (the audience) at some specific time. And if the “claims” parameter was used to explicitly embed profile data into the ID token, then the token also asserts that that user has specific profile attributes, eg email-address. Note however that the recipient of an ID token must still be able to validate the signature in order for the token to be used in even this minimal way. The client-application which retrieves the token directly from the auth-server (ie exchanges code for tokens) does not need this step as that path is already very secure.

For native apps that act as a client application (eg on a mobile device), an ID token is even less useful. Such an app knows directly whether an interactive user is present, and does not normally need to verify the identity of the user (require username/password) on app startup. Retrieving user profile information from an auth-server might be useful. More important is obtaining “access tokens” that allow interaction with remote systems, but that is not related to obtaining ID tokens.

The access token that was issued along with the ID token might be slightly more useful when passed to another app; it is at least valid as a bearer-token in an “Authorization:” header. The token can be used in calls to the auth-server “userinfo” endpoint to check whether the user is still logged in (scope “openid” grants that permission to the holder of the access-token), and to fetch additional profile data (depending on what scopes were requested).

If generic scopes were requested as well, eg “photos:read”, then the access-token can also be used as authorization for calls to resource-servers other than the auth-server (as long as they use that same auth-server as their access-token-issuer).

The OIDC userinfo Endpoint

When a client application makes a request for tokens specifying scope “openid” then the response includes both an “ID token” and an “access token”.

The client application can then make calls to the “userinfo” endpoint of the auth-server, providing the “access token” and requesting any of the user-related profile information that was listed in addition to scope “openid” (eg “email”). In effect, the access-token grants the holder the right to read specific user-profile-related data.
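
A sketch of such a userinfo call (Python with requests; the endpoint URL is a placeholder):

```python
# Fetch user profile data using the access-token issued alongside the ID token.
import requests

access_token = "...access token obtained together with the ID token..."

resp = requests.get(
    "https://auth.example.com/realms/demo/protocol/openid-connect/userinfo",
    headers={"Authorization": f"Bearer {access_token}"},
    timeout=10,
)
resp.raise_for_status()
profile = resp.json()  # eg {"sub": "...", "email": "..."} - depending on the granted scopes
```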

OIDC Terminology: RPs and OPs

The OIDC specification invents some new aliases for words already defined in the OAuth2 spec. Sigh.

  • the Relying Party (RP) is a “client application” that supports OIDC (requests ID tokens etc)
  • the OpenID Provider (OP) aka “Identity Provider” (IdP) is an “auth server” that supports OIDC (issues ID tokens, supports “userinfo” endpoint, etc)

OIDC Authorization Request Param Extension

The OAuth2 authorization request format is fairly ugly; it is an HTTP request with all input values represented as http-query-parameters appended to the URL.

OIDC defines an alternate method of passing the necessary params: as a JWT (optionally signed and optionally encrypted).

This JWT can be passed directly in the authorization request (base64-encoded), or the request can instead include just a URI from which the auth-server fetches the corresponding JWT specifying the desired params. This is particularly useful when the parameters to be passed are large; the parameters then flow directly from client-application to auth-server without passing through the user-agent.
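
A rough sketch of building such a request object (Python with PyJWT; for brevity it is signed with a shared secret, whereas in practice the client’s registered asymmetric key and algorithm would normally be used):

```python
# Pack the authorization parameters into a signed JWT ("request object").
import time
import jwt  # PyJWT

request_object = jwt.encode(
    {
        "iss": "my-client",
        "aud": "https://auth.example.com/realms/demo",
        "response_type": "code",
        "client_id": "my-client",
        "redirect_uri": "https://app.example.com/oauth/callback",
        "scope": "openid email",
        "iat": int(time.time()),
    },
    "my-client-secret",   # placeholder; usually an RSA/EC private key with eg RS256
    algorithm="HS256",
)
# The authorization request then just carries client_id, response_type and either
# request=<request_object> or request_uri=<URL where the JWT can be fetched>.
```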

Implementing a Client Application which uses Access Tokens

This section assumes you are a developer creating a new session-based webserver that accesses user data in some other system which accepts OAuth2 access tokens as authorization.

Your first step as developer is to create an account for your new server application (“app”) at the auth-server that the resource-server uses. It cannot be any other auth-server: the resource-server only accepts tokens issued by its own auth-server, and other auth-servers won’t have the right scope definitions available anyway.

When creating such an account, you typically provide a name for your app, an icon, and a brief description. This is so users who are redirected there see which application is requesting access to their data.

You also provide a “redirect URL” that tells the auth-server where to redirect web-based users after they have granted your app the right to access their data (a “consent”). This must point to some handler within your application that then calls the auth-server to exchange the auth-code for tokens.

And after registration is complete, you get back a “client id” and “client secret”. The client-id is used in calls to the auth-server to identify this “account” that you have just created. The “client secret” is a dynamically-allocated credential for the account that you need when making calls to the auth-server later.
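
A sketch of the handler behind that redirect URL, ie the code-for-token exchange (Python with requests; the endpoint, redirect URI and credentials are placeholders):

```python
# Exchange the auth-code received on the redirect URL for tokens.
import requests

def handle_oauth_callback(auth_code: str) -> dict:
    resp = requests.post(
        "https://auth.example.com/realms/demo/protocol/openid-connect/token",
        data={
            "grant_type": "authorization_code",
            "code": auth_code,
            "redirect_uri": "https://app.example.com/oauth/callback",  # must match the registered value
            "client_id": "my-client",
            "client_secret": "my-client-secret",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()  # access_token, refresh_token, expires_in - plus id_token if scope included "openid"
```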

You then add code to your app that wraps calls to the remote resource-server. The wrapper code should look in the user’s session to see if an appropriate access-token exists and if not redirect the user to get one. You can either check for token expiration before using it, or handle “token expired” errors via a retry-loop.
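
A framework-agnostic sketch of that wrapper logic (here `session` is whatever per-user store your web framework provides, and `refresh_fn` is a refresh call such as the one sketched earlier):

```python
# Return a usable access-token from the user's session, refreshing if needed,
# or signal that the user must be sent through the authorization flow again.
import time
from typing import Callable

class LoginRequired(Exception):
    """Raised when the user must be redirected to the auth-server."""

def get_valid_access_token(session: dict, refresh_fn: Callable[[str], dict]) -> str:
    token = session.get("access_token")
    if token and time.time() < session.get("access_token_expires_at", 0) - 30:
        return token                              # still valid (30s clock-skew margin)
    refresh_token = session.get("refresh_token")
    if refresh_token:
        tokens = refresh_fn(refresh_token)        # eg the refresh-token call sketched earlier
        session["access_token"] = tokens["access_token"]
        session["access_token_expires_at"] = time.time() + tokens.get("expires_in", 60)
        session["refresh_token"] = tokens.get("refresh_token", refresh_token)
        return session["access_token"]
    raise LoginRequired()                         # caller redirects the user to the auth-server
```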

You can ask for such a token either when the user logs in, or when the token is needed. Doing it on login is nice in some respects, as the user is already in “login mood”. If you are using the same auth-server for authentication (login) as is needed for the remote service authorization then things are also simplified for the user. However if this access-token is needed only sometimes, then this might not be best.

On first redirect, the user will be asked for consent. Later fetches of the token will not present a consent page.

If you choose to ask for a refresh-token, you might store that somewhere longer-term than the user session. Note however that this is sensitive data; leaking it is nearly as bad as leaking user credentials.

Implementing a Resource Server

This section assumes you are a developer creating a new application that offers REST endpoints through which user-specific data is manipulated. It then describes the set of steps you would need to follow to get authorization working.

This does repeat much of the information already available above, but from a different viewpoint.

First, you need to decide which auth-server you are going to rely on. This auth-server needs to hold the users whose data you are protecting, and you need to be able to define new “scopes” in this auth-server (or be ok with reusing existing scopes).

At runtime, your REST endpoint:

  • extracts the “bearer token” from the Authorization http header
  • validates the token (checks the signature and various other fields)
  • checks that the user whose data is being manipulated matches the user who is the “subject” of the token
  • checks that the token grants permission to perform the operations that this endpoint performs (your chosen scopes)

If the above tests don’t pass then return an error (see section “Rejected Requests”).

The calling application is then responsible for obtaining a suitable access-token.

Typically these checks and the scope-validation are done via a library provided by your auth-server vendor, as the token format is auth-server-specific.
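
For illustration only, a stripped-down version of those checks (Python with PyJWT; real deployments normally use the auth-server vendor’s library, and the JWKS URL, issuer and scope layout are placeholders):

```python
# Validate a bearer token and check that it grants the scope this endpoint needs.
import jwt  # PyJWT

jwks_client = jwt.PyJWKClient(
    "https://auth.example.com/realms/demo/protocol/openid-connect/certs"
)

def authorize_request(authorization_header: str, required_scope: str) -> dict:
    if not authorization_header or not authorization_header.startswith("Bearer "):
        raise PermissionError("missing bearer token")          # map to 401
    token = authorization_header[len("Bearer "):]
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    claims = jwt.decode(                                       # raises on bad signature or expiry -> 401
        token,
        signing_key.key,
        algorithms=["RS256"],
        issuer="https://auth.example.com/realms/demo",
        options={"verify_aud": False},  # plain OAuth2 access-tokens are often not audience-checked
    )
    # How scopes are encoded is auth-server-specific; a space-separated "scope" claim is common.
    if required_scope not in claims.get("scope", "").split():
        raise PermissionError("insufficient scope")            # map to 403
    return claims  # claims["sub"] identifies the user whose data may be manipulated
```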

Rejected Requests

According to the IETF oauth-v2-bearer spec, a server can respond to a request with an expired or missing token with:

  • 401 Unauthorized
  • WWW-Authenticate: Bearer scope="neededscope1 neededscope2 ..."

and a token with insufficient scope using

  • 403 Forbidden
  • WWW-Authenticate: Bearer scope="..."

Site Personalisation and Auto-single-signon

Important: this section contains a lot of speculation about possible solutions. Any suggestions here should be treated with caution; I am not an expert in this area. If you have successfully implemented “personalisation” of a website based upon user profile data retrieved from an OIDC “identity provider”, please let me know how you chose to do it!

The Problem

It is useful for a site to greet a user by name (eg place their name in the menu-bar or status-bar) - optionally along with their “avatar” image. It is also useful to display the site using configurable options such as custom colours, number-of-items-per-page, etc. Name and avatar-image are available from the user’s central profile managed by the OIDC auth-server, along with other data that is not normally used for “personalisation” (eg phone, email). The other examples given here (colours, number-of-items-per-page) are not standard OIDC attributes.

The problem is that users should not be forced to enter credentials when just browsing the site; that’s not a nice user experience.

Or otherwise expressed: there are two scenarios to consider:

  • rendering the site for users who have proved their identity
  • rendering the site for anonymous users, or users who have not yet provided proof of identity

Sending a visitor to an OIDC identity provider to get an ID token, and then obtaining user profile info from that same identity provider is relatively easy; that is a standard feature of OIDC. However sending every user to an OIDC identity provider as soon as they arrive at the site, even though they may not be doing any security-sensitive operations, may not make users happy.

Allowing anonymous users to personalise their view of the site, and have that persist across visits, is nice but not absolutely necessary. It is relatively common for customisation to require an account.

Requiring users who do have an account to be “logged in” to get a personalised site is also acceptable, under the same convention.

However forcing users to create an account just to browse the non-sensitive parts of a site is not acceptable. Similarly, forcing users who do have an account to enter their credentials to browse non-sensitive parts of a site is not good; they should be allowed to be “anonymous” if they wish and not be bothered with a credential prompt.

OIDC defines an option “prompt” which can be passed on a “login redirect”; when set to “none”, the request immediately redirects back without any user interaction. If the user is already logged in at that IdP then an ID-token is returned (via the usual auth-code exchange); if not, an error indication (eg login_required) is returned instead. And if the user is not registered with that IdP at all, it also returns immediately. A userid does not need to be specified; users with no account and users with an account but no current session and no “remember me” are treated identically: a redirect back with no ID token.
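
A sketch of such a “silent” authorization request (Python; URLs and client-id are placeholders):

```python
# Build an authorization request with prompt=none: the auth-server redirects
# back immediately, either with a code or with an error such as login_required.
from urllib.parse import urlencode

silent_login_url = (
    "https://auth.example.com/realms/demo/protocol/openid-connect/auth?"
    + urlencode({
        "response_type": "code",
        "client_id": "my-site",
        "redirect_uri": "https://site.example.com/silent-callback",
        "scope": "openid profile",
        "prompt": "none",
        "state": "random-csrf-token",
    })
)
# The /silent-callback handler looks at the query string: a "code" parameter
# means the user can be auto-logged-in; an "error" parameter means render the
# anonymous (non-personalised) view instead.
```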

Note that regardless of the solution chosen for personalisation, it is still necessary for a client application to protect secure operations by checking that a valid login has been done before executing the associated code.

The topic of “Permanent Login” (or at least long-lived login sessions) was discussed earlier. That is, however, not quite the same problem. A user may well have an active login session within an auth-server, but if a website does not have an HTTP session for that user (eg because they haven’t visited for a while) then it is still necessary to somehow determine that fact (that they are logged in to the SSO server) without making the site unusable for visitors who do not have such a login session.

Possible Solutions

Options that occur to me are:

  • store user profile data in a cookie, separate from authentication
  • store a “profile” of the user in the app’s local database and store the userid in a cookie
  • redirect every user to the auth-server with prompt=none and see if an id comes back
  • store a simple boolean “is-user-logged-in” cookie

In all cases, the presence of personalisation data must not be considered authorization to perform protected operations.

Storing user preferences separately seems the easiest; when a user first logs on (via OIDC) the relevant parts of their profile can be stored into a cookie that is sent to the user browser. On later visits, their “personalisation info” is immediately available. This does have a few disadvantages:

  • a cookie has limited storage
  • the cookie is sent with every request (wasting bandwidth)
  • the cookie has personally sensitive data (could be leaked)
  • there needs to be some way to “refresh” the data from the user’s profile on their auth-server when they are “logged in”

User Profile in a local DB (or other local storage)

As a variant on the above, user profile data can be stored in a database, and just a key (eg a userid) can be stored in a cookie client-side.

When establishing a new http session for a user, the data is fetched and stored in the user’s session.

When the user really “logs in”, the server refreshes the local database with the user’s profile info.

A possible security issue is that an attacker could guess a valid user-id and would get a website personalised for that user which might reveal some personal details. Encrypting the cookie might be a solution, as would setting a second cookie with a random “credential” that must match a value in the user’s stored data. Note that this is not about protecting the cookie from theft/interception, but simply about blocking user-id-guessing.

In the case of a “native app”, the data does not need to be stored in a database, but simply cached locally using any available mechanism.

In the case of a “single page app”, modern browsers support “client side storage” (Web Storage and IndexedDB).

Always Redirect to Auth-server on New Session

When a request without an http-session-id arrives, it is possible to redirect the caller immediately to an auth-server with “prompt=none” specified. If the user has an SSO session with that auth-server then the response will provide an auth-code that can be exchanged for an ID-token and the user is “auto-logged-in”. When a negative response is returned (user not logged in to this auth-server), the site simply treats the visitor as anonymous.

This redirect should not be performed on every request; either it is done only when no session-id is provided, or a “not-logged-in” flag can be stored in the session to skip later redirects for requests with the same session. The disadvantages of this approach are:

  • only one auth-server can be supported
  • when the user is known to the auth-server but not “currently logged in to the auth-server” then no info about the user is available (not a major issue)
  • the first visit to the site always results in a redirect to an auth-server, wasting bandwidth and slowing that important first page render. It also requires the site to then exchange the auth-code for the id-token, ie yet more http requests. And potentially then make calls to the auth-server “userinfo” endpoint to obtain the data needed to personalise the site.
  • when the auth-server is not accessible then the site is not accessible as the return-redirect never happens

A variant would be: on first login to a site, store a cookie holding the (auth-server-id, user-id) but no other data. This at least supports the auth-server of the user’s choice.

Note that a user without an account, and a user with an account but who is not currently logged in to the auth-server, are treated identically. In each case, the auth-server sees no “auth-server login session cookie” and returns with a failure status.

When a user has used “remember me” on a previous login to the auth-server then they will automatically be logged in by this redirect.

Unlike the previous options, this really is an “automatic single-signon” that proves to the website that the HTTP session is associated with a particular user.

Is-User-Logged-In Cookie

This is a variant of the above “always redirect” solution which works when all of the sites that you wish to provide “auto-single-signon” for are subsites of a single domain.

Ensure that when the user logs in to the SSO system, a cookie is set which indicates that the user is logged in. Any site which receives a request without an existing HTTP session checks for this cookie and, if it is set, redirects the user to the SSO system to obtain an ID token, specifying prompt=none. This should return without user interaction and the ID token provides the needed user information for personalisation. If it should fail, then clear the cookie.

This ensures that the redirect to the SSO system occurs only for users where there is a very high probability that the user is indeed logged in.

This logic can also be performed within an IFrame in the rendered page; when the user really is logged in then the IFrame can invoke the “login endpoint” of the website (often “/ssoLogin”) to update the user’s HTTP session with the necessary data. This avoids ever sending a redirect directly from the viewed website which can give a nicer user experience.

Single Sign Off (aka “Global Logout”)

Types of Logout

When a user is logged in to a client application, there are actually two concurrent login sessions, one with each of:

  • the client application (native or session-based), and
  • the OIDC auth server

There are therefore two different types of “log out” (aka “sign off”):

  • log out from the client application (relying-party-initiated logoff)
  • log out from the OIDC auth server (oidc-provider-initiated logoff)

Logging out from a specific client application is easy. In a native app, the app just responds to a click on a “log out” button by clearing any internal “logged in” state in its memory - including deleting (freeing memory for) the user’s ID token if it has cached that. In a server-side app, the user’s HTTP session should be deleted; ideally an HTTP response should also be returned which deletes the session-id cookie but as the session no longer exists this is not critical. The auth-server does not need to be informed; the user is still “logged in” there, and can select “log in” on the same or other apps and successfully log in without needing to enter their credentials again.

An OIDC auth server typically provides an admin interface for users which a user can visit to perform various tasks such as updating their profile, or viewing and updating consents. This admin interface usually also offers a “log off” (aka sign out) button which clears the user’s auth-server-login-session-cookie - and their “remember me” cookie too. This also affects the way that OIDC refresh tokens work; a client application which tries to use a refresh-token issued along with an ID token where “offline access” was not included will now receive an error response. Access to the “userinfo” endpoint will also fail unless “offline access” was requested.

However logging out of the OIDC auth server does NOT necessarily affect any client applications where the user is currently logged in. Some client applications choose to regularly poll the auth-server to see if the user is “still logged in” (via auth-code-grant calls with prompt=none or via a refresh-token), and consider the user “logged out” at the client app when they are no longer logged in at the auth-server. However such behaviour is optional. It is really a matter of perspective: does the client application see the auth-server as a “session manager” or simply as an “identity confirmer” which it relies on for initial login only.

Some auth-servers offer a URL that client applications can call to perform a user logout - ie programmatically trigger both logout from the client application and logout from the auth-server.

When client applications react to logout at the “auth server” level then this is called “single sign off” or “global logout”.

Single Sign Off

Single sign off is actually a rather complex topic.

As described above, an application which accepts an ID-token as proof of a user identity and then considers the user “logged in” can choose if it:

  • regularly checks with the auth-server that the user is “still logged in”, or
  • just treats the user as logged-in until the session terminates

If it does check regularly, there is no requirement that it does so at the ID token expiry time. An access-token must be valid in order to pass it to another system, but an ID token is usually never passed to another system and so its expiry-time is not really relevant. The access-token that was issued along with the ID token, and which grants access to the userinfo endpoint, has an expiry time and can no longer be used to fetch user profile data after that time; however normally all relevant user profile data is fetched immediately after login and stored in the local session. Applications which periodically “reload” user profile data (just in case it has changed) are probably very rare; at most there might be a “refresh my profile” option somewhere - though simply logging out and back in might be easier.

One possible option is to check the ID token expiry time only when the user performs an operation that requires authorization, ie after a user has “logged out” via one site, they still appear logged in at another until they perform a protected operation at which point they are prompted to log in again. The spring-security library with keycloak integration works this way for example.

Note that “signoff” applies only to “login sessions”, and these are related to ID tokens. Signoff is not related to access-tokens; they remain valid until they expire, regardless of whether the associated user is “signed in” or not (except the access-token linked to the ID token and used to access the “userinfo” endpoint).

When a user clicks “logout” in a web-based application, the application typically deletes the local http-session and sends an http-response which includes commands to delete the “sessionid” cookie at the client end. However if “single sign out” is desired for a group of applications, this is hard to achieve. Only the app in which “logout” was actually clicked is in a situation where it can send a response to delete its cookies. If the group of applications that wish to “sign out together” can be placed under a single hostname, and can share a single session-id-cookie then that at least solves the cookie-deletion issue. However it does not allow all applications to delete the user http-session immediately. Sessions do time-out, but leaving a session around is inefficient. Possibly more significantly, it is also a potential security risk; if someone obtains the (old) session-id before the session expires then they can “continue” the user’s session - as the logged-in user.

The OAuth2 specification does not address logout at all. Neither does the OIDC core specification, but some additional OIDC specs (currently in advanced draft form) have been created to address this: see the OIDC Session Management, Front-Channel Logout and Back-Channel Logout specifications.

Various auth-server implementations provide their own solutions.

The Keycloak server supports “back-channel logout” in which each client-account can have an associated “admin url”; keycloak knows which client-applications it has issued tokens for during a specific user-session, and therefore on logout can invoke the “admin url” on exactly the relevant client applications. This does of course require network connectivity from the auth-server to the client application which is otherwise not needed. The API is auth-server-specific.

Sign Off Per Device vs Per Session

A user may be “signed on” to a client-application (and an auth-server) on various devices. When “sign off” is selected on one device, what should happen to the others?

For any client relying on “cookies” to indicate that it is “logged in” (eg a session cookie), then there simply is no way to “push” the deletion of cookies to devices that were not involved in the sign-off. As noted earlier, however, as long as the server-side HTTP session that a session-id cookie refers to has been deleted, it does no harm to leave the cookie in place.

However in general, users do not expect or want a sign-off on one device to affect login sessions on other devices.

The “auth-server login session” is separate for each device; the user needs to enter their credentials separately on each device, and receives a different “login session” cookie on each device. Login/logout is therefore naturally per-device.

ID Token Expiry Date and Single Sign Off

An ID token has an expiry date, but a user’s login session with the auth-server may be terminated before that expiry date is reached. The ID token therefore represents a “login” which is actually “stale”. Does this provide an “attack vector”? In general, no. A client application does not usually accept ID tokens as input; it only accepts an “auth code” and then exchanges this for an ID token via a “back channel” ie communicates directly with the auth-server. Therefore although an ID token may be “stale”, it cannot usually be “injected” into any client application. A user might interact with a client application that is holding an ID token that was issued before the most recent logoff, but that is not necessarily a security problem; it just means that client application is not “participating in” the “single sign off”.

Front-Channel Logout

The OIDC spec for front-channel logout requires that a web application that wants to log its user out when the user logs out via another tab:

  • defines an iframe with some custom Javascript
  • periodically polls this iframe to see if the user is “still logged in”
  • when a not-logged-in state is detected, calls a “logout” URL on the webserver to trigger the necessary server-side processing.

The iframe just checks if the “auth-server session cookie” still exists; if not then some other tab has sent a “logout” message to the auth-server, received a “delete cookie” command in response, and removed the cookie.

This works without any additional network traffic, just some trickery to make it possible to test for the existence of a cookie associated with some other domain.

This of course does not work when a user is using some other browser - or non-browser client. It also does not detect when the auth-server has “timed out” the user login session. However it is efficient and works for some use-cases.

This partially works when a set of cooperating server-side web applications are redirecting a user between the various sites within a single window. Each app has its own iframe, but there is only one cookie so when logout occurs via one site, then visiting any of the other sites (via a link, back-button, or other) will cause a “logout” callback to the associated server at that time. The user sees “single logout”, but the servers don’t get a chance to actually do cleanup unless/until the user visits them.

Back-Channel Logout

In this approach, when the auth-server executes a “log out” it makes direct calls back to other servers to inform them of the status-change.

This does not work for native apps or single-page-applications; it is applicable only for “server-side” client applications.

For server-side applications, how the change in status is propagated to a web-browser is not defined. For a browser that is connected to a server via “websockets” or similar, the answer is reasonably obvious. For others, the UI must either poll for status or wait until the next call to the server to detect that logout has occurred.

The main benefit of this approach is that each server at which the user is logged-in can immediately clean up its server-side state as soon as logout has occurred.

The OIDC spec states that when a “logout back-channel” call is performed, a security event token (ie a JSON Web Token) is provided that identifies the user who is logging out.
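
A sketch of handling such a back-channel logout call (Python with PyJWT; the JWKS URL, issuer and client-id are placeholders, and how sessions are actually deleted is application-specific):

```python
# Validate a back-channel "logout token" and identify which user/session to end.
import jwt  # PyJWT

jwks_client = jwt.PyJWKClient(
    "https://auth.example.com/realms/demo/protocol/openid-connect/certs"
)

def handle_backchannel_logout(logout_token: str) -> None:
    signing_key = jwks_client.get_signing_key_from_jwt(logout_token)
    claims = jwt.decode(
        logout_token,
        signing_key.key,
        algorithms=["RS256"],
        audience="my-client",
        issuer="https://auth.example.com/realms/demo",
    )
    # The spec requires an "events" claim naming the back-channel-logout event type.
    if "http://schemas.openid.net/event/backchannel-logout" not in claims.get("events", {}):
        raise ValueError("not a logout token")
    user_id = claims.get("sub")   # and/or claims.get("sid") for one specific session
    # ...delete all server-side sessions associated with user_id / sid here...
```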

A variant of this approach is for the auth-server to send a “logout” message via a message-bus rather than via url-based callbacks. This is not part of the OIDC spec, but may be available in some auth-servers.

Using a Custom Logout Application

A client-account typically specifies a “post-logout redirect url”; when the client application redirects the client to the auth-server to perform logout, then the auth-server sends the user back to this configured url afterwards. Typically this is the client application’s “home page”. However it is possible to set this to point to some “logout application” that then calls other specific applications that should take action on logout.

This “logout application” can execute code server-side, eg make rest-calls to other applications. Alternatively it can just render javascript which then executes code client-side, eg make rest calls to other applications. The advantage of having javascript execute a set of logout-calls to other applications is that delete-cookie commands returned by those calls can be used (assuming appropriate CORS rules have been defined). The disadvantage is that the logout process is somewhat unreliable; the client is not totally under the control of the systems needing logout.

The “logout application” can be part of an existing application if desired.

References and Further Reading