If you need an unguessable random string (for a session cookie or access token, for example), it can be tempting to reach for a random UUID, which looks like this:
88cf3e49-e28e-4c0e-b95f-6a68a785a89d
This is a 128-bit value formatted as 36 hexadecimal digits separated by hyphens. In Java and most other programming languages, these are very simple to generate:
    import java.util.UUID;

    String id = UUID.randomUUID().toString();
Under the hood this uses a cryptographically secure pseudorandom number generator (CSPRNG), so the IDs generated are unpredictable and effectively unique. However, there are some downsides to using random UUIDs that make them less useful than they first appear. In this note I will describe them and suggest what to use instead.
How random is a UUID anyway?
As stated on the Wikipedia entry, of the 128 bits in a random UUID, 6 are fixed variant and version bits, leaving 122 bits of actual randomness. 122 bits is still quite a lot of randomness, but is it actually enough? If you are generating OAuth 2 access tokens, then the spec says no:
The probability of an attacker guessing generated tokens (and other
credentials not intended for handling by end-users) MUST be less than
or equal to 2^(-128) and SHOULD be less than or equal to 2^(-160).
Well, even if the attacker only makes a single guess, the probability of guessing a 122-bit random value can never be less than 2^(-122), so strictly speaking a random UUID is in violation of the spec. But does it really matter?
To work out how long it would take an attacker to guess a valid token/cookie, we need to consider a number of factors:
- How quickly can the attacker make guesses? An attacker that can try millions of candidate tokens per second can find a match much faster than somebody who can only try a hundred. We will call this rate (in guesses per second) A.
- How many bits of randomness are in each token? A 128-bit random token is more difficult to guess than a 64-bit token. We will label the number of random bits B.
- How many tokens are valid at any given time? If you have issued a million active session cookies then an attacker can try and guess any of them, making their job easier than if there was just one. Such batch attacks are often overlooked. We will label the number of valid tokens in circulation at any one time S.
Given these parameters, OWASP give a formula for calculating the expected time for an attacker to guess a valid token: (2^B + 1) / (2 × A × S) seconds.
Let’s plug in some numbers and see what we get. But what are reasonable numbers? Well, for security we usually want to push the numbers well beyond what we think is actually possible to be really sure. So what is actually possible now?
When we consider how fast a well resourced attacker could guess a token, we can use the Bitcoin hash rate as a reasonable upper-bound approximation. A lot of people are investing a lot of time and money into generating random hashes, and we can view this as roughly equivalent to our attacker’s task. When I looked back in February (you can see how long my blog queue is!), the maximum rate was around 24,293,141,000,000,000,000 hashes per second, or around 2^64.
That’s a pretty extreme number. It’s fairly unlikely that anyone would direct that amount of resource at breaking your site’s session tokens, and you can use rate limiting and other tactics. But it is worth considering the extremes. After all, this is clearly possible with current technology and will only improve over time.
How many tokens might we have in circulation at any time? Again, it is helpful to consider extremes. Let’s say your wildly successful service issues access tokens to every IoT (Internet of Things) device on the planet, at a rate of 1 million tokens per second. As you have a deep instinctive trust in the security of these devices (what could go wrong?), you give each token a 2-year validity period. At peak you will then have around 63 trillion (about 2^46) tokens in circulation.
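As a sanity check, that peak figure is easy to reproduce. This is a quick sketch using the hypothetical numbers from above (1 million tokens per second, 2-year lifetime); the class name is mine:

```java
public class TokensInCirculation {
    public static void main(String[] args) {
        long perSecond = 1_000_000L;                 // issuance rate
        long validitySeconds = 2L * 365 * 24 * 3600; // 2-year token lifetime
        long peak = perSecond * validitySeconds;     // tokens valid at once
        System.out.println(peak);                                     // 63072000000000
        System.out.println(Math.round(Math.log(peak) / Math.log(2))); // 46
    }
}
```

That is about 63 trillion tokens, and log2 of it rounds to 46, which is where the 2^46 figure comes from.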
If we plug these figures into the equation from before, we can see how long our 122-bit UUID will hold out:
A = 2^64
B = 122
S = 2^46
That comes out as … 2048 seconds. Or a bit less than 35 minutes. Oh.
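To see where that number comes from, here is the calculation spelled out using the OWASP-style estimate (2^B + 1) / (2 × A × S). This is a sketch with names of my own; BigInteger keeps the 122-bit arithmetic exact:

```java
import java.math.BigInteger;

public class GuessTime {
    // Expected seconds for an attacker making A guesses per second to find
    // one of S valid tokens, each with B bits of entropy: (2^B + 1) / (2*A*S)
    static BigInteger expectedSeconds(int b, BigInteger a, BigInteger s) {
        return BigInteger.TWO.pow(b).add(BigInteger.ONE)
                .divide(BigInteger.TWO.multiply(a).multiply(s));
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.TWO.pow(64); // Bitcoin-scale guessing rate
        BigInteger s = BigInteger.TWO.pow(46); // tokens in circulation
        System.out.println(expectedSeconds(122, a, s) + " seconds"); // 2048 seconds
    }
}
```

With everything a power of two the division is just exponent arithmetic: 2^123 / 2^111 = 2^12, halved by the factor of 2 in the denominator's derivation, giving 2^11 = 2048 seconds.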
OK, so those extreme numbers look pretty terrifying, but they are also quite extreme. The Bitcoin community spend enormous sums of money (certainly in the tens of millions of dollars) annually to produce that kind of output. Also, testing each guess most likely requires actually making a request to one of your servers, so you are quite likely to notice that level of traffic – say by your servers melting a hole through the bottom of your datacentre. If you think you are likely to attract this kind of attention then you might want to carefully consider which side of the Mossad/not-Mossad threat divide you live on and maybe check your phone isn’t a piece of Uranium.
All this is to say that if you have deployed random UUIDs in production, don’t panic! While I would recommend that you move to something better (see below) at some point, plugging more likely numbers into the equation should reassure you that you are unlikely to be at risk immediately. An attacker would still have to invest considerable time and money into launching such an attack.
Other nits
The borderline acceptable level of entropy in a random UUID is my main concern with them, but there are others too. In the standard string form, they are quite inefficient. The dash-separated hexadecimal format takes 36 characters to represent 16 bytes of data. That’s a 125% expansion, which is pretty terrible. Base64-encoding would instead use just 24 characters, and just 22 if we remove the padding, resulting in just 37.5% expansion.
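The size difference is easy to demonstrate by encoding the same 16 bytes both ways. A small sketch (the class name is mine) that packs a UUID's bits into raw bytes and compares the two encodings:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

public class EncodingSizes {
    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        // Pack the UUID's 128 bits into 16 raw bytes
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        String hex = uuid.toString(); // hyphenated hex: 36 characters
        String b64 = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw);  // unpadded base64url: 22 characters
        System.out.println(hex.length() + " vs " + b64.length()); // 36 vs 22
    }
}
```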
Finally, a specific criticism of Java’s random UUID implementation is that internally it uses a single shared SecureRandom instance to generate the random bits. Depending on the backend configured, this can acquire a lock, which can become heavily contended if you are generating large numbers of random tokens, especially if you are using the system blocking entropy source (don’t do that, use /dev/urandom). By rolling your own token generation you can use a thread-local or a pool of SecureRandom instances to avoid such contention. (NB: the NativePRNG uses a shared static lock internally, so this doesn’t help in that case, but it also holds the lock for shorter critical sections, so is less prone to the problem.)
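The thread-local variant might look like this. A sketch (the class and method names are mine); each thread gets its own SecureRandom, so no shared lock is contended:

```java
import java.security.SecureRandom;

public class TokenRandom {
    // One SecureRandom per thread, so threads never contend on a shared instance
    private static final ThreadLocal<SecureRandom> RANDOM =
            ThreadLocal.withInitial(SecureRandom::new);

    public static byte[] randomBytes(int length) {
        byte[] buffer = new byte[length];
        RANDOM.get().nextBytes(buffer);
        return buffer;
    }
}
```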
What should we use instead?
My recommendation is to use a 160-bit (20 byte) random value that is then URL-safe base64-encoded. The URL-safe base64 variant can be used pretty much anywhere, and is reasonably compact. In Java:
    import java.security.SecureRandom;
    import java.util.Base64;

    public class SecureRandomString {
        private static final SecureRandom random = new SecureRandom();
        private static final Base64.Encoder encoder =
                Base64.getUrlEncoder().withoutPadding();

        public static String generate() {
            byte[] buffer = new byte[20];
            random.nextBytes(buffer);
            return encoder.encodeToString(buffer);
        }
    }
This produces output values like the following:
Xl3S2itovd5CDS7cKSNvml4_ODA
This is both shorter than a UUID and more secure, having 160 bits of entropy. I can also make the SecureRandom a ThreadLocal if I want.
So how long would it take our extreme attacker to find a 160-bit random token? Around 17.9 million years. By tweaking the format of our tokens just a little we can move from worrying about attacker capabilities and resources to inner peace and happiness. It is so far beyond the realm of possible that we can stop worrying.
Why not go further? Why not 256 bits? That would push the attack costs into even more absurd territory. I find that 160 bits is a sweet spot of excellent security while still having a compact string representation.
I received a couple of pieces of feedback indirectly. I’m not sure if they want to be named, so I will post their comments anonymously. To summarise, they make two points:
1. The comparison to Bitcoin hash rate is irrelevant as the attacker must make an online attack, and we can apply rate limiting to thwart that.
2. A more likely threat is a timing attack when the token/session id is looked up in a prefix tree index. (See https://blog.ircmaxell.com/2014/11/its-all-about-time.html for a good background on timing attacks).
My responses to these would be:
1. Firstly, there are potential offline attack vectors. For instance, some applications or frameworks log a hash of the session id in access logs to correlate requests from the same session. Anyone with read-only access to the logs can then mount an offline brute force attack to recover session ids.
Secondly, I would make a defence-in-depth argument. You can easily eliminate the threat entirely for basically no cost by increasing the entropy, without needing to rely on rate limiting, so just do it. Then add rate limiting as well.
2. I’m not sure how many databases actually use prefix tree (trie) indexes rather than the more common B-trees, but it is certainly possible. Even without a trie index, timing attacks are possible due to the common use of memcmp for string comparisons in databases. Such timing attacks themselves require tens or hundreds of thousands of requests (to eliminate noise), so are also somewhat mitigated by rate limiting.
Paragon Initiative Enterprises have a great write up of how to address these timing attacks using “split tokens”: https://paragonie.com/blog/2017/02/split-tokens-token-based-authentication-protocols-without-side-channels#after-fold . Another approach is to append a MAC tag to each token and to verify it (in constant time!) before hitting the database at all – an approach I recommended in https://dev.to/neilmadden/7-best-practices-for-json-web-tokens
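To illustrate the MAC approach, here is one possible shape for such tokens. This is a sketch of the general idea, not the exact scheme from either of those posts; the class and method names are my own. The random id is tagged with HMAC-SHA256 at issue time, and the tag is checked in constant time before any database lookup:

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class MacToken {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    // Issue a token: a 160-bit random id plus an HMAC-SHA256 tag over it.
    public static String issue(SecretKeySpec key) throws Exception {
        byte[] id = new byte[20];
        RANDOM.nextBytes(id);
        return ENC.encodeToString(id) + "." + ENC.encodeToString(hmac(key, id));
    }

    // Check the tag before the database is touched at all; forged tokens
    // are rejected here, so the database lookup timing is never observable.
    public static boolean verify(SecretKeySpec key, String token) throws Exception {
        String[] parts = token.split("\\.", 2);
        if (parts.length != 2) return false;
        byte[] id = DEC.decode(parts[0]);
        byte[] tag = DEC.decode(parts[1]);
        // MessageDigest.isEqual performs a constant-time comparison
        return MessageDigest.isEqual(tag, hmac(key, id));
    }

    private static byte[] hmac(SecretKeySpec key, byte[] data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(key);
        return mac.doFinal(data);
    }
}
```

In a real deployment the key would come from a key management system rather than being held in memory like this, and the decode calls would need error handling for malformed input.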