Spring Security Session Management

Categories: Java

Overview

Spring Security is an excellent framework. However, its documentation’s advice on tracking authentication sessions is dangerously vague: it applies only when the application is a non-clustered presentation-tier app, and leads to a memory leak otherwise. And Keycloak’s advice on Spring setup is actively wrong.

Work colleagues discovered this recently while integrating Keycloak (OAuth2) support into our Spring-based applications.

This article specifically looks at when to enable RegisterSessionAuthenticationStrategy or NullAuthenticatedSessionStrategy.

Summary

For those just wanting the answer, it is:

  • when you are using stateless authorization (eg OAuth2 bearer tokens), use NullAuthenticatedSessionStrategy
  • when using a servlet-engine session manager which does not support session-expiry callbacks, use NullAuthenticatedSessionStrategy
  • for other cases, you probably want NullAuthenticatedSessionStrategy anyway

Use RegisterSessionAuthenticationStrategy only when:

  • you are running a single instance of your application (ie not clustered), AND
  • you need the features that Spring’s SessionRegistryImpl offers

SessionRegistryImpl offers the following features (which IMO aren’t really that useful):

  • the ability to limit the number of concurrent logins (eg different PCs or private tabs) for a specific userid
  • the ability to build a “user admin page” where logged-in users can be listed and potentially force-logged-off

The particular setup that we were using and which triggered a memory leak (before we switched to NullAuthenticatedSessionStrategy) was:

However the problem is not limited to this specific servlet-engine or session-store.

Keycloak and Spring

The Keycloak docs state:

You must provide a session authentication strategy bean which should be of type RegisterSessionAuthenticationStrategy for public or confidential applications and NullAuthenticatedSessionStrategy for bearer-only applications.

While this is right for bearer-only applications, it is dangerously wrong otherwise. And while Spring’s config defaults to using neither of these strategies, the Keycloak extension does force a choice of some strategy and recommends only these two.

Spring Security Config and SessionAuthenticationStrategies

Normally a Spring application defines a subclass of WebSecurityConfigurerAdapter and overrides method configure(HttpSecurity) to set up its security rules.

This HttpSecurity object has a method sessionManagement which returns an object which performs various actions when a user’s login is successful. One of the things it does is invoke its configured “authentication strategy”; there is a default but a Spring application can customise this. See this diagram of how the SessionAuthenticationStrategy fits into the overall login process.

The default SessionAuthenticationStrategy implementation is a CompositeSessionAuthenticationStrategy wrapping some other strategies. By default, this list contains just ChangeSessionIdAuthenticationStrategy, ie forces the sessionId cookie to be updated after successful login (see SessionManagementConfigurer).

However if the subclass of WebSecurityConfigurerAdapter calls maximumSessions on the object returned by sessionManagement, then the CompositeSessionAuthenticationStrategy automatically includes RegisterSessionAuthenticationStrategy, which leads to a memory leak in many circumstances.

The Keycloak library encourages customisation of the strategy; a keycloak-enabled Spring application should subclass KeycloakWebSecurityConfigurerAdapter instead of WebSecurityConfigurerAdapter, and this parent class defines an abstract method sessionAuthenticationStrategy() - thus forcing subclasses to explicitly choose a SessionAuthenticationStrategy to use. As noted, the Keycloak docs then provide bad advice which leads to a memory leak in many situations.
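As a minimal sketch of what that forced choice looks like, assuming the Keycloak Spring Security adapter and a Spring Security version that still ships WebSecurityConfigurerAdapter are on the classpath, a configuration that opts out of session registration might look like the following. The class name SecurityConfig and the URL pattern are illustrative assumptions, not taken from our application.

```java
import org.keycloak.adapters.springsecurity.KeycloakConfiguration;
import org.keycloak.adapters.springsecurity.config.KeycloakWebSecurityConfigurerAdapter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.security.config.annotation.authentication.builders.AuthenticationManagerBuilder;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.authentication.session.NullAuthenticatedSessionStrategy;
import org.springframework.security.web.authentication.session.SessionAuthenticationStrategy;

// Illustrative configuration class; the name and URL rules are assumptions.
@KeycloakConfiguration
public class SecurityConfig extends KeycloakWebSecurityConfigurerAdapter {

    @Autowired
    public void configureGlobal(AuthenticationManagerBuilder auth) throws Exception {
        auth.authenticationProvider(keycloakAuthenticationProvider());
    }

    // The abstract method that the Keycloak parent class forces us to implement.
    // Returning the "null" strategy means nothing is recorded on login.
    @Bean
    @Override
    protected SessionAuthenticationStrategy sessionAuthenticationStrategy() {
        return new NullAuthenticatedSessionStrategy();
    }

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        super.configure(http);
        http.authorizeRequests()
            .antMatchers("/admin/**").hasRole("ADMIN") // hypothetical rule
            .anyRequest().authenticated();
    }
}
```

The only line that matters for this article is the body of sessionAuthenticationStrategy(); everything else is the usual adapter boilerplate.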

How the Spring SessionRegistry Works

As noted above, the SessionAuthenticationStrategy is notified on each successful login. The Concurrency Control section of the Spring Security docs discusses management of “security sessions” and mentions setting up RegisterSessionAuthenticationStrategy only in passing. That’s a little odd as the whole approach doesn’t work without it - so let’s look at how the pieces fit together.

Spring provides a class SessionRegistryImpl which keeps an in-memory map of (userid -> list of security-sessions). Class RegisterSessionAuthenticationStrategy ensures that on successful authentication, the user’s security-session is added to this map. Class HttpSessionEventPublisher ensures that on user logout (either explicit or via http-session-timeout) the corresponding map entry is cleared (necessary to avoid a memory leak).
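The bookkeeping described above can be sketched in plain Java. This is a simplified, hypothetical model and not Spring's actual SessionRegistryImpl; the class and method names are my own. It only illustrates that entries are added on login and can only disappear when somebody delivers a "session destroyed" notification.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of an in-memory session registry (NOT the Spring class).
class ToySessionRegistry {
    private final Map<String, Set<String>> sessionsByUser = new ConcurrentHashMap<>();
    private final Map<String, String> userBySession = new ConcurrentHashMap<>();

    // What a "register" strategy would do on successful authentication.
    void registerNewSession(String sessionId, String userId) {
        userBySession.put(sessionId, userId);
        sessionsByUser.computeIfAbsent(userId, k -> ConcurrentHashMap.newKeySet()).add(sessionId);
    }

    // What a session-destroyed event (logout or timeout) would trigger.
    void removeSession(String sessionId) {
        String userId = userBySession.remove(sessionId);
        if (userId == null) {
            return; // unknown session: nothing to clean up
        }
        sessionsByUser.computeIfPresent(userId, (k, ids) -> {
            ids.remove(sessionId);
            return ids.isEmpty() ? null : ids; // drop the user entry when empty
        });
    }

    int sessionCount(String userId) {
        Set<String> ids = sessionsByUser.get(userId);
        return ids == null ? 0 : ids.size();
    }

    int totalSessions() {
        return userBySession.size();
    }
}

class ToySessionRegistryDemo {
    public static void main(String[] args) {
        ToySessionRegistry registry = new ToySessionRegistry();
        registry.registerNewSession("s1", "alice");
        registry.registerNewSession("s2", "alice");
        registry.removeSession("s1"); // event delivered: entry is freed
        // No event ever arrives for "s2", so its entry stays in the map.
        System.out.println(registry.totalSessions()); // prints 1
    }
}
```

Everything that follows in this article comes down to the question of whether removeSession is reliably called.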

As mentioned in the Spring docs, there are two primary uses for the SessionRegistry:

  • Class ConcurrentSessionControlAuthenticationStrategy can use it to check whether a particular userid already has a session (on the same host) and if so potentially reject the login with reason “too many sessions”.
  • Code can use this to implement an admin-page that lists all users logged-in on a particular host (and potentially log them out).

The login-limits can be used to:

  • enforce licensing limits
  • prevent users from sharing logins (at least, only N users at a time can use an account)
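A login limit of this kind can be sketched as follows. This is a hypothetical, single-node illustration with names of my own choosing; it simply rejects a login once a user already holds the maximum number of sessions, and it is not a description of how Spring's ConcurrentSessionControlAuthenticationStrategy is implemented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical per-user login limiter; state lives in this JVM only.
class LoginLimiter {
    private final int maxSessionsPerUser;
    private final Map<String, Set<String>> sessionsByUser = new HashMap<>();

    LoginLimiter(int maxSessionsPerUser) {
        this.maxSessionsPerUser = maxSessionsPerUser;
    }

    // Returns true if the login is accepted and recorded, false if rejected.
    synchronized boolean tryLogin(String userId, String sessionId) {
        Set<String> ids = sessionsByUser.computeIfAbsent(userId, k -> new HashSet<>());
        if (ids.contains(sessionId)) {
            return true; // already registered
        }
        if (ids.size() >= maxSessionsPerUser) {
            return false; // "too many sessions"
        }
        ids.add(sessionId);
        return true;
    }

    // Must be called on logout or timeout, otherwise the slot is never freed.
    synchronized void logout(String userId, String sessionId) {
        Set<String> ids = sessionsByUser.get(userId);
        if (ids != null) {
            ids.remove(sessionId);
            if (ids.isEmpty()) {
                sessionsByUser.remove(userId);
            }
        }
    }
}
```

Because the counts are held per JVM, a limiter like this only enforces the limit on the node it runs on.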

The Problems with The Spring SessionRegistry

Provided things are correctly set up (in particular, session timeout detection), this works ok on a single web application instance.

But if the system is clustered:

  • the SessionRegistry on any single host no longer makes sense
  • and entries in the SessionRegistry are likely to be leaked when http sessions time out

The “list of logged-in users” that SessionRegistry offers doesn’t seem to make sense in a clustered system. The list will contain only the users logged in on that specific node, so as an admin user you:

  • get different data from each node of your cluster;
  • need to somehow combine data from all nodes to get an “overall view” of your cluster;
  • and to force-logoff a specific user you need to find which node they are logged in on

The above is assuming “sticky sessions” are being used. If you are not using sticky sessions, ie requests are being distributed evenly across the cluster, then you have some kind of central session store anyway - and the SessionRegistry makes even less sense as each node tracks data for users whose http-session was first created on that node. But the session-creation-node is not related to any other operations the user is performing.

And depending on how the clustering system manages http sessions, session deletion events (whether expiry or explicit delete) often don’t work or are triggered on a single node which might not be the node on which the session was created. This leads to incorrect results being reported by the SessionRegistry - and to a memory leak.

Pure Rest Clients and the Spring Session Registry

When an application provides a rest endpoint, there are two kinds of callers:

  • callers which handle responses that include set-cookie headers by storing these and including the cookies in future requests
  • callers which expect rest endpoints to be stateless, and ignore any set-cookie headers in the response

The second (cookie-ignoring) approach is the usual approach for “pure rest clients”. In particular, apps for mobile devices often don’t expect or support cookies.

Unfortunately Spring’s RegisterSessionAuthenticationStrategy always creates a new http session when a request is authenticated - even when the authentication is due to an OAuth2 bearer-token. This means that for each request from a pure-rest-client to a secure endpoint using an OAuth2 bearer token, a new http session object is created and registered with the SessionRegistryImpl - objects that will never be used again. These session objects will eventually expire and the memory be recovered (assuming http session expiry is working) but that can take many minutes for each session.
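The effect is easy to reproduce with a toy model. The sketch below is hypothetical (none of these classes exist in Spring or in any servlet engine); it mimics a container whose getSession-style call reuses the session named by the request cookie and otherwise creates a new one, and compares a cookie-keeping client with a cookie-ignoring one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy container: sessionId -> userId. Purely illustrative.
class ToyContainer {
    final Map<String, String> sessions = new HashMap<>();

    // Reuse the session named by the cookie if it exists, else create one.
    String getOrCreateSession(String cookieSessionId, String userId) {
        if (cookieSessionId != null && sessions.containsKey(cookieSessionId)) {
            return cookieSessionId;
        }
        String id = UUID.randomUUID().toString();
        sessions.put(id, userId);
        return id;
    }
}

class BearerClientDemo {
    // Simulates one client making several authenticated requests and
    // returns how many sessions exist on the server afterwards.
    static int run(int requests, boolean clientKeepsCookies) {
        ToyContainer container = new ToyContainer();
        String cookie = null;
        for (int i = 0; i < requests; i++) {
            String issued = container.getOrCreateSession(cookie, "alice");
            if (clientKeepsCookies) {
                cookie = issued; // a browser-like client sends it back next time
            }
        }
        return container.sessions.size();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, true));  // prints 1
        System.out.println(run(1000, false)); // prints 1000
    }
}
```

One client, a thousand requests, a thousand sessions: each of them sits in memory until the session timeout removes it.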

In addition, the features the SessionRegistry provides don’t make much sense for clients using OAuth2 bearer tokens; a list of “logged in users” or attempting to limit concurrent sessions for “a logged in user” doesn’t apply to such clients. And finally, “sticky sessions” don’t work with such clients (as they ignore the routing cookies) so calls are distributed across the cluster.

Obviously RegisterSessionAuthenticationStrategy is a very bad choice for applications with such clients.

Session Management on Distributed Systems

The following discussion applies only to applications which use http sessions - which is not usually the case for servers which only provide Rest endpoints.

Managing “user session data” is relatively simple on a system with just one node. When a user request does not provide a session-cookie, then

  • a random session-id is allocated
  • an entry is added to a local in-memory map with the session-id as key and user-session-related data as value
  • the response includes a cookie that holds the value of the session-id

Obviously, on later requests to the same node the cookie is provided and the existing map entry is used to retrieve user session data (state). The map entry needs to have a “last used” timestamp on it, and some background process is needed to periodically scan the map entries and delete entries which “have expired”; without this users who visit the site then just “go away” will continue to consume memory. A ServletEngine emits a “session deleted” event when a session is explicitly deleted or deleted due to timeout - see below.
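The single-node scheme just described can be sketched in a few dozen lines. This is a hypothetical illustration, not how any particular servlet engine is written; the class name and methods are my own, and the current time is passed in explicitly so the behaviour is easy to follow (a real implementation would run the scan from a background thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Toy single-node session store with idle timeout and "deleted" events.
class LocalSessionStore {
    private static final class Entry {
        final Map<String, Object> data = new ConcurrentHashMap<>();
        volatile long lastUsedMillis;
        Entry(long now) { lastUsedMillis = now; }
    }

    private final Map<String, Entry> sessions = new ConcurrentHashMap<>();
    private final List<Consumer<String>> deletedListeners = new ArrayList<>();
    private final long timeoutMillis;

    LocalSessionStore(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Code that keeps its own per-session data registers here to free it.
    void onSessionDeleted(Consumer<String> listener) {
        deletedListeners.add(listener);
    }

    // Request without a session cookie: allocate a random id and an entry.
    String create(long now) {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new Entry(now));
        return id;
    }

    // Request with a cookie: look up the state and refresh "last used".
    Map<String, Object> access(String id, long now) {
        Entry e = sessions.get(id);
        if (e == null) {
            return null;
        }
        e.lastUsedMillis = now;
        return e.data;
    }

    // Background scan: drop idle entries and notify listeners for each one.
    int expireIdle(long now) {
        int removed = 0;
        for (Map.Entry<String, Entry> me : sessions.entrySet()) {
            boolean idle = now - me.getValue().lastUsedMillis >= timeoutMillis;
            if (idle && sessions.remove(me.getKey(), me.getValue())) {
                deletedListeners.forEach(l -> l.accept(me.getKey()));
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return sessions.size();
    }
}
```

The listener list is the important part for this article: anything that stores its own data keyed by session-id depends on those callbacks being delivered.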

Normally code within the application which needs to store user-state just stores it into the value in the session-map. If for some reason the code wishes to store the data in its own map keyed by the same session-id then it must listen for “session deleted” events in order to free memory associated with the deleted session.

Of course a server can store session data in a local file, or remote database, etc. rather than just store it in memory. However when storing sessions in a remote database, the behaviour of “session timeout” changes significantly.

A single server has an upper limit to the load it can handle. Eventually scaling needs to be done by running multiple instances with a load-balancer distributing calls across the instances. The load-balancing can be done with “sticky sessions” in which each user is directed to the same back-end server instance, or by distributing requests evenly across the set of instances.

With “sticky sessions” http-session-management can still be done with a simple in-memory or in-local-file storage, and where session deletion/timeout occurs on the node that the user is assigned to. However this does have the disadvantage that failure of that node causes loss of user session data; this is not only a problem for “crashes” but also a problem when rolling out new versions of software. In the “old days” software releases were done “out of business hours” and therefore session loss was not a problem. However modern software development processes recommend frequent releases, and performing these releases during normal business hours. Rollout strategies such as blue/green deployment, and rollouts on clusters such as Kubernetes do not install the “new version” on the same host as the old one, and often leave the old instance up until the new one is running properly. This means that “user session loss” becomes more common and even “store session in file” does not really help. In short: even when using “sticky sessions”, scalable and modern systems require that session data be stored in an external database. And that brings the “session timeout” issue into play.

So when sessions are stored in an external database (often a key-value store such as Redis), how does “http session deletion” and “http session timeout” work? And particularly when “sticky sessions” are not used? This is discussed in the following sections.

Apache Tomcat’s Session Management Options

The applications I am currently working on use Apache Tomcat; the following information describes how session-management works there. However I expect other servlet engine implementations work similarly.

Tomcat has support for “cluster management”, but the options are relatively limited. The “DeltaManager”:

  • relies on multicast networking which does not work in cloud environments
  • and replicates the session to every member of the cluster - ie is not highly scalable

Tomcat also provides a “Backup Manager” which replicates each session to one other Tomcat node. This relies on sticky sessions to work, configured so that on unavailability of the “primary node”, the user is directed to the backup node. This does work better in cloud environments, and is more scalable. However sticky sessions are still somewhat undesirable. And this “push” approach to session distribution is just ugly and inelegant: a nicer architecture is to store session data in a central place, and for any server which needs that data to pull it. This still supports sticky session routing (the server can pull each time, or just assume it has the latest data for any session it recognises), while also elegantly supporting failover (the new server just pulls the data when failover occurs) and non-sticky setups (the session is pulled on each request). This is described well in the memcached-session-manager wiki page.

Note that the whole concept of “sessions” is not really applicable to REST servers. Yes, it is possible to implement “stateful rest apis”, but IMO this is an anti-pattern; if a server is providing REST apis then those APIs should be stateless:

  • they should not load data from a user-session, or store data in one
  • they should not set session-cookies
  • they should use stateless OAuth2 authentication rather than relying on having a session with “authentication state”.

Unfortunately, it is quite tempting to write hybrid servers which return HTML from some URLs, and handle REST calls (return JSON) at other URLs. In this case, the HTML stuff will use session-based authentication. Ideally there would be 2 separate Spring security configs - one for the HTML (desktop) stuff and one for the rest. However that means that HTML served from the server needs to then obtain OAuth tokens in order to call the REST endpoints - an extra complication when the user is already logged-in via their session. The Keycloak docs assume your server is one or the other (“public” or “confidential” client):

You must provide a session authentication strategy bean which should be of type RegisterSessionAuthenticationStrategy for public or confidential applications and NullAuthenticatedSessionStrategy for bearer-only applications.

It seems possible for one server node in the cluster to be responsible for periodically scanning the complete set of sessions in that remote DB and removing expired items. But there are some significant problems with this:

  • the set can be large
  • there needs to be some kind of “leader election” to determine who does the scanning (not impossible, but complex)
  • and the “session deletion notification event” will only be processed on that scanning node unless some additional complex system exists for notifying all elements in the cluster of the event

Instead, the common solution is to have the database automatically expire data; Redis entries have an optional expiry-time field for this. And in this approach, “http session timeout” events are just not available at all, on any node.

This in turn means that any code which relies on receiving “http session timeout” events in order to avoid memory leaks is broken in a clustered environment.

And this brings us back to Spring’s SessionRegistryImpl; it keeps data about sessions in its own map which needs “cleanup” when sessions are deleted - and thus the registry leaks memory in a distributed environment.
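The leak can be demonstrated with a small simulation. Both classes below are hypothetical stand-ins: TtlSessionStore plays the role of a central store with per-entry expiry (as Redis offers), and NodeLocalRegistry plays the role of any node-local map that relies on "session destroyed" notifications for cleanup.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for a central key-value store with per-entry expiry.
class TtlSessionStore {
    private final Map<String, Long> expiresAt = new HashMap<>();

    void put(String sessionId, long now, long ttlMillis) {
        expiresAt.put(sessionId, now + ttlMillis);
    }

    // Expired entries simply vanish; no application node is notified.
    boolean exists(String sessionId, long now) {
        Long deadline = expiresAt.get(sessionId);
        if (deadline == null) {
            return false;
        }
        if (now >= deadline) {
            expiresAt.remove(sessionId);
            return false;
        }
        return true;
    }
}

// Stand-in for a node-local registry that shrinks only when notified.
class NodeLocalRegistry {
    private final Set<String> sessionIds = new HashSet<>();

    void register(String sessionId) { sessionIds.add(sessionId); }
    void onSessionDestroyed(String sessionId) { sessionIds.remove(sessionId); }
    int size() { return sessionIds.size(); }
}

class ClusterLeakDemo {
    // Returns how many registry entries remain after every session has
    // already expired in the central store.
    static int leakedEntries(int logins) {
        TtlSessionStore central = new TtlSessionStore();
        NodeLocalRegistry registry = new NodeLocalRegistry();
        for (int i = 0; i < logins; i++) {
            String id = "session-" + i;
            central.put(id, 0, 1_000);
            registry.register(id); // what happens on each successful login
        }
        long later = 5_000;
        for (int i = 0; i < logins; i++) {
            if (central.exists("session-" + i, later)) {
                throw new IllegalStateException("session should have expired");
            }
        }
        return registry.size(); // nothing ever called onSessionDestroyed
    }

    public static void main(String[] args) {
        System.out.println(leakedEntries(3)); // prints 3
    }
}
```

The central store is empty, yet the node still holds one entry per login, and that number only ever grows.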

It might be possible for SessionRegistryImpl to implement its own “session timeout scanning” by tagging the entries in its map with a timestamp and removing entries that are “too old”. However such timestamps should be updated on each request so the timeout applies “from last request” - but in a system where requests are distributed across a cluster, that doesn’t work. Periodic scanning could check with the central session-store to see if each session held locally still exists or not - seems possible but rather network-intensive. And the current SessionRegistryImpl does not do this.
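The "check with the central store" variant could look roughly like this. It is a sketch of the idea only, with hypothetical names; the existence check is passed in as a predicate, which in a real system would be a network call to the session store.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical registry that can reconcile itself against a central store.
class PrunableRegistry {
    private final Map<String, String> userBySession = new ConcurrentHashMap<>();

    void register(String sessionId, String userId) {
        userBySession.put(sessionId, userId);
    }

    // Removes every locally known session that the central store no longer
    // has. Costs one lookup per local entry, hence "network-intensive".
    int prune(Predicate<String> stillExistsInCentralStore) {
        int removed = 0;
        Iterator<String> it = userBySession.keySet().iterator();
        while (it.hasNext()) {
            if (!stillExistsInCentralStore.test(it.next())) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return userBySession.size();
    }
}
```

A scheduled task on each node would call prune periodically; the interval trades memory held by stale entries against load on the session store.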

Session events

HttpSessionListener.sessionDestroyed is called from tomcat StandardSession.expire(boolean) which is called from StandardSession.isValid(). That’s an odd decision - an “isFoo” method doesn’t normally have side-effects but that’s the way it currently works. The isValid check is done when a request is received which specifies that session - ie a callback will occur if the session is used between reaching expiry time and being discarded. However that’s not relevant here.

Tomcat has interface org.apache.catalina.Store which allows sessions to be persisted externally - and this can help preserve sessions over server restarts. However it is not intended for clustering, ie sharing sessions between servers. In particular, expiry of sessions must be done by a tomcat instance which fetches all sessions, iterates over them, and triggers expiry.

Tomcat method ManagerBase.processExpires iterates over all sessions, calling isValid on each one. To work, it needs an array of all sessions to be expired - which obviously does not scale well. This is called from method backgroundProcess which:

  • is called from StandardContext.backgroundProcess
  • which is called from StandardWrapper.backgroundProcess
  • which is called from ContainerBase.ContainerBackgroundProcessor.processChildren
  • which is called from ContainerBase.ContainerBackgroundProcessor.run
  • which is a thread started on tomcat startup

Oddly, method MemcachedBackupSessionManager.backgroundProcess calls updateExpirationInMemcached which calls _manager.findSessions() which returns an array of all sessions. That doesn’t seem like something that memcached can do.

This manager object is passed to the constructor of MemcachedSessionService and will be a MemcachedBackupSessionManager instance.

Externalized Sessions

When running a servlet engine in a cluster, http-sessions are typically stored in a central place such as memcached. Requests can then be distributed evenly across all servers; the session-id provided in the request is used to fetch the http-session from the central store on request begin, and it is written back out at request end if modified.
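The fetch-on-begin, write-back-if-modified cycle can be sketched as below. This is a hypothetical illustration: the central store is represented by a plain map, the class name is my own, and real session managers detect modification in more refined ways.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Toy request wrapper around a central session store (a Map stands in for
// memcached or a similar service).
class ExternalSessionFilter {
    private final Map<String, Map<String, String>> centralStore;
    int writes = 0; // counts write-backs, for illustration

    ExternalSessionFilter(Map<String, Map<String, String>> centralStore) {
        this.centralStore = centralStore;
    }

    void handle(String sessionId, Consumer<Map<String, String>> requestHandler) {
        // Request begin: fetch the session from the central store.
        Map<String, String> original =
                centralStore.getOrDefault(sessionId, Collections.emptyMap());
        Map<String, String> working = new HashMap<>(original);

        requestHandler.accept(working);

        // Request end: write back only if the handler changed something.
        if (!working.equals(original)) {
            centralStore.put(sessionId, new HashMap<>(working));
            writes++;
        }
    }
}
```

Note that no node holds the session between requests, which is why any node can serve the next request and why no node is in a position to notice the session timing out.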

Http session timeout is managed by the central store. This means that individual servlet engine instances do not get http-session-timeout events for such sessions. This leads directly to memory-leaks in classes such as SessionRegistryImpl which store user-related data related to login sessions.

Summary

For a clustered system, the correct solution is therefore to use the NullAuthenticatedSessionStrategy instead. This is notified on each successful login, and does absolutely nothing. The SessionRegistryImpl map remains empty - meaning of course that the concurrent-login limits won’t work, but that’s no big loss.

It might be possible to implement a cluster-aware SessionRegistry implementation, but none is provided by Spring as far as I know.

In addition, if the application is a rest server using bearer-token authorization then NullAuthenticatedSessionStrategy is very important; RegisterSessionAuthenticationStrategy calls request.getSession() which in turn triggers creation of an http-session and embeds the session-id in a set-cookie command in the response, then stores the (userid, sessionid) pair. This is of course totally pointless when the request is a REST call authenticated via bearer-token, as the caller will likely ignore the set-cookie command - thus leading to a new session per request call and yet another memory leak.

References and Further Reading