If you need an unguessable random string (for a session cookie or access token, for example), it can be tempting to reach for a random UUID, which looks like this:
88cf3e49-e28e-4c0e-b95f-6a68a785a89d
This is a 128-bit value formatted as 36 hexadecimal digits separated by hyphens. In Java and most other programming languages, these are very simple to generate:
import java.util.UUID; String id = UUID.randomUUID().toString();
Under the hood this uses a cryptographically secure pseudorandom number generator (CSPRNG), so the IDs generated are pretty unique. However, there are some downsides to using random UUIDs that make them less useful than they first appear. In this note I will describe them and what I suggest using instead.
How random is a UUID anyway?
As stated on the Wikipedia entry, of the 128 bits in a random UUID, 6 are fixed variant and version bits leaving 122 bits of actual random. 122 bits is still quite a lot of random, but is it actually enough? If you are generating OAuth 2 access tokens, then the spec says no:
The probability of an attacker guessing generated tokens (and other
credentials not intended for handling by end-users) MUST be less than
or equal to 2^(-128) and SHOULD be less than or equal to 2^(-160).
Well, even if the attacker only makes a single guess, the probability of guessing a 122-bit random value can never be less than 2-122, so strictly speaking a random UUID is in violation of the spec. But does it really matter?
To work out how long it would take an attacker to guess a valid token/cookie, we need to consider a number of factors:
- How quickly can the attacker make guesses? An attacker that can try millions of candidate tokens per second can find a match much faster than somebody who can only try a hundred. We will call this rate (in guesses per second) A.
- How many bits of randomness are in each token? A 128-bit random token is more difficult to guess than a 64-bit token. We will label the number of random bits B.
- How many tokens are valid at any given time? If you have issued a million active session cookies then an attacker can try and guess any of them, making their job easier than if there was just one. Such batch attacks are often overlooked. We will label the number of valid tokens in circulation at any one time S.
Given these parameters, OWASP give a formula for calculating how long it will take an attacker to guess a random token as:
Let’s plug in some numbers and see what we get. But what are reasonable numbers? Well, for security we usually want to push the numbers well beyond what we think is actually possible to be really sure. So what is actually possible now?
When we consider how fast a well resourced attacker could guess a token, we can use the Bitcoin hash rate as a reasonable upper-bound approximation. A lot of people are investing a lot of time