Parse, don’t type-check

There’s a fantastic article from last year titled Parse, don’t validate. I’d highly recommend it to any programmer (along with the more recent follow-up Names are not type safety). The basic idea is that there are two ways to check that some input to a function is valid:

  • A validator checks that the input is valid and throws an error if not. It doesn’t return anything. For example, checking that a list is not empty.
  • A parser does the same as a validator, but returns a more specific representation of the input that ensures that the required property is satisfied. For example, checking that a list is not empty and returning a NonEmptyList type.
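
To make the distinction concrete, here is a minimal sketch in JavaScript. The names validateNonEmpty, parseNonEmpty, and NonEmptyList are illustrative only and are not taken from the article, which relies on a static type system; plain JavaScript can only enforce the property at runtime through the constructor.

```javascript
// Validator: checks the property and throws if it does not hold, but
// returns nothing, so the result of the check is lost straight away.
function validateNonEmpty(list) {
    if (!Array.isArray(list) || list.length === 0) {
        throw new TypeError('expected a non-empty array');
    }
}

// A list that always has a first element, by construction.
class NonEmptyList {
    constructor(head, tail) {
        this.head = head;
        this.tail = Object.freeze([...tail]);
        Object.freeze(this);
    }

    toArray() {
        return [this.head, ...this.tail];
    }
}

// Parser: performs the same check, but hands back a more specific
// representation that later code can rely on without re-checking.
function parseNonEmpty(list) {
    if (!Array.isArray(list) || list.length === 0) {
        throw new TypeError('expected a non-empty array');
    }
    const [head, ...tail] = list;
    return new NonEmptyList(head, tail);
}

const kittens = parseNonEmpty(['Tom', 'Felix']);
console.log(kittens.head); // safe: a NonEmptyList always has a head
```

Code that accepts a NonEmptyList never needs to ask whether the list is empty, whereas code that runs after validateNonEmpty simply has to trust that somebody called it.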

The thesis of the article is that parsers are preferable to validators. If you’ve not read the original article, please do so – it’s very well written and makes the case much better than I can summarise it. The essential message is to make illegal states unrepresentable. In the article, this is done by making use of the type system. This is a philosophy I entirely agree with, but I want to point out and expand upon one ironic aspect of the argument:

A type checker is a paradigmatic example of a validator!


API Security in Action is published!

I wasn’t expecting it so quickly, so it caught me a little off guard, but API Security in Action is now finally published. PDF copies are available now, with printed copies shipping by the end of the month. Kindle/ePub take a little bit longer but should be out in a few weeks’ time.

My own print copies will take a few weeks to ship to the UK, and I can’t wait to finally hold it in my hands. That’s a brighter ending to 2020.

At some point I’ll try and collect some thoughts about the process of writing it and my feelings about the finished product. But tonight I’ll settle for a glass (or two) of a nice red. Cheers!

Some incomplete thoughts about Gödel

I saw another article on Gödel’s incompleteness theorems linked from Reddit today. It’s a topic I’ve wanted to write about for some time. Although many articles do a decent job in giving an idea of what the big deal is (and this one is pretty good), they can sometimes give a misleading impression of what the theorems actually imply. I’m by no means an expert, but hopefully these notes are useful.


Macaroon access tokens for OAuth: Part 2 – transactional auth

In part 1, I showed how Macaroon access tokens in ForgeRock Access Management 7.0 can be used as a lightweight and easy-to-deploy alternative to proof of possession (PoP) schemes for securing tokens in browser-based apps. The same techniques can be adapted to secure tokens in microservice architectures and IoT applications, and I hope to expand on some of the patterns they enable in future blog posts. But in this post, I want to look at third-party caveats and their application to transactional authorization.


API Security in Action handed over to production

After a flurry of last-minute corrections and updates in response to review feedback, my book has now been handed over to Manning’s production team. That means a few weeks of copy editing and graphics polish, then indexing and typesetting to produce the final version around October time at a guess. I’m not sure how long it then takes to print and ship, but it’s getting close!

The latest edits will be pushed out to the online early-access (MEAP) copy in the next few days, so you can read essentially the finished book online if you wish. Use the code fccmadden at checkout to get 37% off if you want to check it out. The revised material includes improving the presentation of some of the longer chapters. The material on capabilities and macaroons in chapter 9 has been significantly improved, as has chapter 11 on service-to-service API calls. Chapter 12 has been improved after expert feedback from Jean-Philippe Aumasson and his colleagues at Teserakt. Exercises have been added to chapters 6, 7, 12, and 13 too. I think these changes have really made the book much better. I hope you agree.

Least privilege with less effort: Macaroon access tokens in AM 7.0

Part 1: Lightweight PoP

One of the major changes between OAuth 1 and OAuth 2 was the decision to drop the requirement that requests had to be signed using a secret key associated with each access token. This signing process was notoriously hard to get right and dropping it from OAuth 2 enormously simplified the life of client and resource server developers. In OAuth 2, access tokens are pure bearer tokens, a bit like passwords: if you know the value of the token you can use it, without needing any other credentials. This is very easy for developers, but is definitely less secure. The rise of single-page apps (SPAs) and other browser-based clients leaves access tokens more exposed to theft or accidental leakage compared to traditional server-side clients.

Drinking the PoP

Signed requests in OAuth 1 are an example of proof-of-possession (PoP) tokens rather than bearer tokens. If bearer tokens are like cash, PoP tokens are like Chip-and-PIN. Even if you steal an OAuth 1 token you can’t use it without the associated secret key used to sign requests. The only problem was it turned out that even if you did have the secret key you often couldn’t use the token either, because it was just too hard to get request signing to work reliably. Some brave souls periodically try and revive this idea.

To get around the problems with request signing, various alternative schemes have been proposed. Brian Campbell gave a nice overview of the history of OAuth PoP schemes at Identiverse recently, called The Burden of Proof. It’s a good talk and worth watching. The latest such scheme is DPoP, which tries to keep things as simple as possible: rather than signing the whole request, you just sign enough basic details of the request (the URI and HTTP method) to be able to stop the most likely attacks and rely on TLS to protect the content of the request.
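
As a rough illustration of how little is actually signed, the sketch below builds a DPoP-style proof with Node’s built-in crypto module. The claim names (htm, htu, jti, iat) and the dpop+jwt type follow the DPoP specification (RFC 9449); the createDpopProof helper, the choice of ES256, and the example URL are assumptions made for illustration, not a complete client.

```javascript
const { generateKeyPairSync, randomUUID, sign } = require('node:crypto');

// Base64url-encode a string or Buffer, as used throughout JWS.
const b64url = (value) => Buffer.from(value).toString('base64url');

// Hypothetical helper: signs only the HTTP method and URI of the request,
// leaving TLS to protect the headers and body.
function createDpopProof(privateKey, publicKey, method, uri) {
    const header = {
        typ: 'dpop+jwt',
        alg: 'ES256',
        jwk: publicKey.export({ format: 'jwk' }), // public key travels with the proof
    };
    const payload = {
        jti: randomUUID(),                  // unique ID so proofs cannot be replayed
        htm: method,                        // HTTP method of the request
        htu: uri,                           // target URI of the request
        iat: Math.floor(Date.now() / 1000), // issued-at time in seconds
    };
    const signingInput = b64url(JSON.stringify(header)) + '.' + b64url(JSON.stringify(payload));
    // JWS expects the raw (r || s) ECDSA signature rather than DER.
    const signature = sign('sha256', Buffer.from(signingInput), {
        key: privateKey,
        dsaEncoding: 'ieee-p1363',
    });
    return signingInput + '.' + signature.toString('base64url');
}

// The client has to generate and look after this key pair itself.
const { privateKey, publicKey } = generateKeyPairSync('ec', { namedCurve: 'P-256' });
const proof = createDpopProof(privateKey, publicKey, 'POST', 'https://api.example.com/v1/kittens');
```

Note that a fresh public key signature is computed for every request, and the server has to verify each one.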

DPoP is good, and for some use-cases I think it’s really good, but it still has some downsides:

  • Although the signature process itself is simplified, the client still has to generate and manage this secret key securely, which adds complexity that many client developers will avoid unless they have to.
  • The client, authorization server, and resource servers all have to know about DPoP. It’s quite hard to incrementally deploy.
  • It relies on fresh public key signatures on every request. Public key crypto is slow and expensive, so you generally want to do it as little as possible. I could imagine a rollout of DPoP requiring a significant increase in front-line servers to handle the extra load.
  • DPoP is based on JWS, which supports a wide range of signature algorithms, with no guidance as to which one to use. This means the resource servers have to potentially support them all (including some really expensive ones like ES512), or risk rejecting some clients. Clients will also have to guess which algorithms the server supports when they generate the private key. In practice this means that DPoP will only be used in closed deployments, or else everyone will gravitate to a lowest common denominator option like 2048-bit RSA with PKCS#1 v1.5 padding.
  • Public key signature algorithms are fragile. The point of a signature is that it can be verified by anybody, but in the case of an OAuth request the DPoP token only needs to be verified by exactly one party: the server that you’re sending the request to. There are much simpler and safer ways to achieve request authentication in this case.

Taking the biscuit

OK, so if DPoP is still too complex, what would a better solution look like? The risk that PoP solutions are trying to prevent is an access token being stolen from one request and used to authorize completely different requests that the client didn’t intend. A different way to solve this problem would be to have an individual access token for every request, one that authorizes just that specific request and no others. For example, it could have an audience restriction limiting it to only be used for requests to one specific server, and a single scope that authorizes just the specific type of API call that is being made. The expiry time could be limited to just a few seconds in the future, and so on. In current OAuth implementations it’s pretty hard to restrict an access token to exactly one request, but you can get pretty close.

Of course, this all sounds like a nightmare. The client can’t keep going back to the user to authorize access tokens for every little request! 

It turns out there is a way to get lots of little access tokens for individual requests. It’s easy for developers, relatively simple to deploy, and cheap (much cheaper than digital signatures). It’s also quite delicious.

Image of chocolate-coated macaroons

Enter macaroons

Apart from being a tasty treat, macaroons are also a new token format invented by the boffins at Google. At first glance, a macaroon is an authenticated token a bit like a JWT using the HS256 algorithm. But the additional yummy goodness that macaroons add is the ability to add caveats. A caveat is a restriction on how the token can be used. For example, you might limit the scope of a token, or reduce its expiry time. They always reduce, never expand, the authority of the token. The really cool thing is that adding a caveat gives you a new token, leaving the original token unchanged. It’s also really cheap: a few microseconds typically. This means you can have a single powerful access token and then derive multiple more restricted access tokens from it. This lets us have our cake and eat it (sorry, my metaphors are all over the place today): you can get a single token approved by the user but then derive individual access tokens for every single API call you make, with just the perfect amount of privilege for that one call.

A picture of macaroon access tokens, showing how new tokens can be derived with reduced scope, expiry, or audience.

This gives us the benefits of having lots of tiny individual access tokens for each request, but we can generate these tokens on the fly from the original token. This is what the original macaroons paper refers to as contextual caveats: specific details of the context in which an API call is taking place that we add to a token to restrict its use. That way, if the token does get stolen or misused, the scope of what you can do with it is severely limited by the caveats attached to it. The genuine client still has the original access token though, so they can still do whatever they like within the scope of the original grant.
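
To show why adding a caveat is so cheap, here is a simplified sketch of the chained-HMAC construction described in the macaroons paper, using Node’s built-in crypto module. It covers only first-party caveats, and the object layout, the helper names (mint, addCaveat, verify), and the caveat strings are assumptions for illustration; real libraries and AM use a proper serialization format.

```javascript
const { createHmac, randomBytes, timingSafeEqual } = require('node:crypto');

const hmac = (key, message) => createHmac('sha256', key).update(message).digest();

// Only the issuer (the authorization server) knows rootKey.
function mint(rootKey, identifier) {
    return { identifier, caveats: [], tag: hmac(rootKey, identifier) };
}

// Anyone holding a macaroon can append a caveat: the old tag is used as the
// HMAC key for the new tag. This returns a new token and leaves the original
// untouched. The old tag cannot be recovered from the new one, so a caveat
// cannot be removed again.
function addCaveat(macaroon, caveat) {
    return {
        identifier: macaroon.identifier,
        caveats: [...macaroon.caveats, caveat],
        tag: hmac(macaroon.tag, caveat),
    };
}

// The issuer recomputes the chain from rootKey, compares tags, and then
// checks that every caveat is satisfied by the current request.
function verify(rootKey, macaroon, isSatisfied) {
    let tag = hmac(rootKey, macaroon.identifier);
    for (const caveat of macaroon.caveats) {
        tag = hmac(tag, caveat);
    }
    return tag.length === macaroon.tag.length
        && timingSafeEqual(tag, macaroon.tag)
        && macaroon.caveats.every(isSatisfied);
}

const rootKey = randomBytes(32);
const original = mint(rootKey, 'access-token-1');
const restricted = addCaveat(
    addCaveat(original, 'scope = upload_cute_kitten_picture'),
    'aud = api.example.com'
);
```

Each caveat costs one HMAC computation, with no call back to the authorization server, which is where the "few microseconds" figure comes from.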

If we restrict the tokens enough then they become pretty hard to misuse. We get many of the same benefits as a PoP approach but with much less complexity. Where you might have had code like the following to call an API:

fetch('https://api.example.com/v1/kittens', {
    method: 'POST',
    headers: {
        'Authorization': 'Bearer ' + token
    },
    body: JSON.stringify(kittenDetails)
});

The new macaroon-enhanced version becomes something as simple as the following with an appropriate library:

fetch('https://api.example.com/v1/kittens', {
    method: 'POST',
    headers: {
        'Authorization': 'Bearer ' + token.restrict({
            aud: 'api.example.com',
            exp: nowPlus5Seconds(),
            scope: 'upload_cute_kitten_picture'
        })
    },
    body: JSON.stringify(kittenDetails)
});

Macaroon OAuth tokens in ForgeRock AM 7.0

At ForgeRock we love OAuth and we also love macaroons. We think they make a great combination. OAuth benefits from the simple but flexible approach to least privilege that macaroons enable, and macaroons benefit from the framework and tooling that exists around OAuth. I strongly believe that macaroons provide about 90% of the benefit of PoP for about 20% of the effort.

That’s why we’ve included support for issuing OAuth access and refresh tokens as macaroons in the upcoming 7.0 release of ForgeRock Access Management. So how does it work? Well, mostly you just go into the OAuth2 Provider Settings and turn it on:

Image of toggle button to turn on "Use Macaroon Access and Refresh Tokens"

Now when you complete an OAuth flow you get issued with a macaroon access token and optional refresh token. They look just like normal access tokens, but are a little bit longer:

Screenshot of Postman showing macaroon access and refresh tokens

You can call the standard token introspection endpoint to see the details of the token, just like you could with a normal access token:

Screenshot showing JSON response from the OAuth token introspection endpoint, showing the scope and other token details.

You can then add a bunch of caveats just like in the example earlier, to reduce the scope, expiry time, and audience and introspect it again to see the difference:

The updated JSON response from the introspection endpoint, showing the reduced scope, expiry, and audience due to the caveats added to the token.

Repeating the token introspection request a few seconds later, the restricted access token has expired:

Screenshot showing the active: false response

But the original access token is still valid. In fact, all the details of the original token in the database (AM’s Core Token Service) are exactly as they were when the token was first issued.

Everything is handled transparently through the token introspection endpoint, making it really easy to deploy this incrementally:

  1. In step 1, you simply turn on macaroon access tokens at the Authorization Server. So long as your resource servers are using token introspection, everything can carry on as normal. Clients and resource servers need no changes.
  2. In step 2, your clients start adding caveats to tokens using a macaroon library. Resource servers carry on introspecting the tokens, but now the introspection responses take into account the caveats on the tokens.
  3. There is no step 3.

That’s a little taster of macaroon access tokens in ForgeRock AM 7.0 and why I think they will be a game changer. Least privilege with less effort. In part 2, I’ll show you some more exotic things you can do with macaroon access tokens using 3rd-party caveats.

What’s the Curve25519 clamping all about?

Note: this post assumes some familiarity with elliptic curve cryptography (ECC). There are numerous tutorials online, such as this one.

If you know a little about ECC, you will almost certainly have come across Daniel Bernstein’s Curve25519. This curve, so called because all arithmetic is carried out modulo the prime 2^255 – 19, has become widely adopted because it allows very fast implementations of typical cryptographic functions that are also secure against some kinds of side-channel attacks. But one aspect of the definition of the curve causes some confusion: the “clamping” done to private keys before they are used. It confused me at first too, but it’s actually not too complicated once you realise what it is doing. This article will attempt to explain it for anyone else that finds themselves puzzled.
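
For reference, the clamping operation itself is only three lines of bit-twiddling. The sketch below follows the scalar decoding step for X25519 in RFC 7748; the clamp function name is just for illustration.

```javascript
const { randomBytes } = require('node:crypto');

// Clamp a 32-byte little-endian private scalar as done for X25519.
// Returns a clamped copy and leaves the input untouched.
function clamp(scalarBytes) {
    if (scalarBytes.length !== 32) {
        throw new RangeError('expected exactly 32 bytes');
    }
    const k = Uint8Array.from(scalarBytes);
    k[0] &= 248;  // clear the lowest 3 bits: the scalar becomes a multiple of 8
    k[31] &= 127; // clear the highest bit (bit 255)
    k[31] |= 64;  // set the second-highest bit (bit 254)
    return k;
}

const privateScalar = clamp(randomBytes(32));
```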


A flowchart for Cache-Control headers

I always struggle to remember what all the HTTP Cache-Control directives mean and when they are used, so I made this little flow chart to remind me. I think it’s roughly correct, but no doubt there are details that I’ve missed. It’s on GitHub so please send corrections.

Image of the Cache-Control flowchart

Convergence in web and API security

Every now and then technologies that initially appear to be distinct end up converging on a common approach from opposite directions. I believe that something like that is happening right now in approaches to web and API authentication around the use of tokens and cookies.


A few comments on ‘age’

Update: I’ve updated the section on Cryptographic Doom at the end of the article after clarifications from the age author. That specific criticism was based on my misreading of the age spec.

Age is a new tool for encrypting files, intended to be a more modern successor to PGP/GPG for file encryption. This is a welcome development, as PGP has definitely been showing its age recently. On the face of it, age looks like a good replacement using modern algorithms. But I have a few concerns about its design.

Authenticated encryption

One of the innovations of age is that it aims to support streaming authenticated encryption. The spec (such as it is) links to Adam Langley’s blog post about streaming encryption, which mentions use-cases like the following:

gpg -d your_archive.tgz.gpg | tar xz

The comments on Hacker News also mentioned cases where people pipe a decrypted file into a shell. Given that age links to this blog post and has implemented a secure streaming AEAD mode, it seems reasonable to suppose that it is intended to be secure in these kinds of use-cases. Each chunk (segment) of ciphertext is authenticated before being decrypted and output, so tar or the shell never sees unauthenticated plaintext and so an attacker can’t tamper with the ciphertext to influence the data being fed into the downstream process. The worst the attacker can do is to truncate the output by corrupting one or more segments causing the decryption to abort halfway (this might still allow significant mischief).
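
The general shape of chunked authenticated decryption can be sketched as follows with Node’s built-in crypto module. This is not age’s file format: the use of AES-256-GCM, the nonce layout (a chunk counter plus a final-chunk flag), and the function names are all assumptions chosen to keep the example short.

```javascript
const { createCipheriv, createDecipheriv, randomBytes } = require('node:crypto');

const TAG_LENGTH = 16;

// 12-byte nonce: a big-endian chunk counter followed by a "last chunk" flag.
// Binding the position and the flag into the nonce means chunks cannot be
// reordered, and dropping the final chunk is detected.
function chunkNonce(index, isLast) {
    const nonce = Buffer.alloc(12);
    nonce.writeUIntBE(index, 5, 6);
    nonce[11] = isLast ? 1 : 0;
    return nonce;
}

// The key must be fresh for every file, otherwise nonces would repeat.
function encryptChunks(key, chunks) {
    return chunks.map((chunk, i) => {
        const cipher = createCipheriv('aes-256-gcm', key, chunkNonce(i, i === chunks.length - 1));
        const body = Buffer.concat([cipher.update(chunk), cipher.final()]);
        return Buffer.concat([body, cipher.getAuthTag()]);
    });
}

// Yields each plaintext chunk only after its authentication tag has verified.
function* decryptChunks(key, sealedChunks) {
    for (let i = 0; i < sealedChunks.length; i++) {
        const sealed = sealedChunks[i];
        const nonce = chunkNonce(i, i === sealedChunks.length - 1);
        const decipher = createDecipheriv('aes-256-gcm', key, nonce);
        decipher.setAuthTag(sealed.subarray(sealed.length - TAG_LENGTH));
        // final() throws if the chunk was modified, moved, or is not really last.
        const plaintext = Buffer.concat([
            decipher.update(sealed.subarray(0, sealed.length - TAG_LENGTH)),
            decipher.final(),
        ]);
        yield plaintext;
    }
}

const fileKey = randomBytes(32);
const sealed = encryptChunks(fileKey, [Buffer.from('echo one\n'), Buffer.from('echo two\n')]);
for (const chunk of decryptChunks(fileKey, sealed)) {
    process.stdout.write(chunk);
}
```

If an attacker truncates the stream, the earlier chunks have already been released by the time the failure is noticed, which is the "abort halfway" behaviour described above.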

What is authenticated encryption?

Age supports a small number of algorithms. You can encrypt with a password using scrypt. In this case a symmetric key is derived from the password and you get symmetric authenticated encryption as the security goal. This doesn’t just mean that the ciphertext is protected from tampering, it also means that the encrypted file must have come from somebody who knows the password. Assuming you chose a strong password and only shared it with people you trust, then successful authenticated decryption provides strong evidence that the file came from a trusted source.

In terms of threat models you could say that authenticated encryption is intended to protect against spoofing threats as well as tampering threats. (The S and T in STRIDE at a very basic level). Unfortunately, the age spec doesn’t document its threat model or the security goals it is intended to achieve so I’m having to read between the lines to work out what was intended.

For public key cryptography, the notion of authenticated encryption becomes more complicated. I wrote a three-part blog post about it. There are public key authenticated encryption modes, such as NaCl’s box, but age doesn’t use one of those and instead opts for unauthenticated ECIES encryption (like JOSE’s ECDH-ES algorithm) using X25519. So while age uses symmetric authenticated encryption for the file contents, the symmetric file key is itself encrypted using an unauthenticated mode. This means an attacker cannot tamper with the ciphertext, but they can completely replace it with one of their own choosing. What this means is that age is secure against chosen ciphertext attacks from a confidentiality point of view, but it doesn’t actually provide any origin authentication. You can’t establish that the encrypted file came from a trusted party, so it is completely insecure to pipe the output of age into another tool without independently establishing the authenticity of the file.
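
A toy ephemeral-static scheme makes the missing origin authentication easy to see. The sketch below uses Node’s built-in crypto module; it is not age’s construction, and the seal and open helpers, the HKDF label, and the use of AES-256-GCM are assumptions for illustration only.

```javascript
const {
    createCipheriv, createDecipheriv, diffieHellman, generateKeyPairSync, hkdfSync,
} = require('node:crypto');

// Derive a symmetric key from an X25519 shared secret.
function deriveKey(privateKey, publicKey) {
    const shared = diffieHellman({ privateKey, publicKey });
    return Buffer.from(hkdfSync('sha256', shared, Buffer.alloc(0), 'ecies-sketch', 32));
}

// Anyone who knows the recipient's PUBLIC key can run this.
function seal(recipientPublicKey, plaintext) {
    const ephemeral = generateKeyPairSync('x25519');
    const key = deriveKey(ephemeral.privateKey, recipientPublicKey);
    // A fixed nonce is tolerable here only because the key is fresh per message.
    const cipher = createCipheriv('aes-256-gcm', key, Buffer.alloc(12));
    const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    return { epk: ephemeral.publicKey, body, tag: cipher.getAuthTag() };
}

// Succeeds for any well-formed message, whoever created it.
function open(recipientPrivateKey, message) {
    const key = deriveKey(recipientPrivateKey, message.epk);
    const decipher = createDecipheriv('aes-256-gcm', key, Buffer.alloc(12));
    decipher.setAuthTag(message.tag);
    return Buffer.concat([decipher.update(message.body), decipher.final()]);
}

const recipient = generateKeyPairSync('x25519');
// The "sender" here is an attacker who has never been trusted with anything.
const forged = seal(recipient.publicKey, Buffer.from('rm -rf ~\n'));
const accepted = open(recipient.privateKey, forged);
```

The authentication tag verifies, because the attacker legitimately derived the key; it says nothing about who that was.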

The age spec explicitly lists signing and support for web of trust as out of scope. Instead it suggests using separate tools such as signify/minisign if you want to be sure of where a file came from. After all, this is the Unix philosophy: small composable tools that do just one thing. But if you click through the NaCl link above, djb links to a definition of the public key authenticated encryption security goal that he uses for box. This paper (which is old enough to drink in the UK) makes it clear that achieving authenticated encryption by generic composition of signatures and public key encryption is surprisingly difficult; more so than in the symmetric setting as there is no case like Encrypt-then-MAC which is generally always secure. For symmetric crypto we long ago gave up pretending that generic composition was something end users can get right on their own and opted for dedicated AE modes. For some reason, djb seems to be the only person to realise the same applies to public key cryptography.

A second problem with requiring a generic composition of signing and encryption is that it totally kills the streaming use-cases. Either I have to verify a signature over the entire encrypted file before decrypting it, or I have to decrypt to a temporary file that I then verify, or I need to define my own chunked streaming signature verification tool and combine that with age (and hope the chunk sizes line up). Users won’t get this right, so we’ll be right back at streaming unverified plaintext into shell commands.

So how could age fix this? Most importantly, the spec should define its security goals. I believe the correct security goal is authenticated encryption for both symmetric and public key use cases. The simplest way to achieve this would be to swap out the ephemeral X25519 code in favour of encrypting the file key with NaCl’s crypto_box. The sender must supply their own private key during encryption (which could be read out of ~/.age automatically). The recipient either supplies the expected sender’s public key on the command line, or else has a file of trusted senders in their age config directory – perhaps populated by TOFU if you don’t want to get into something like web of trust, but this problem is going to have to be solved somehow.

Unfortunately, as I pointed out on HN, you can’t simply use a static key pair with age to achieve authenticated encryption as age’s key-wrapping algorithm is completely insecure when used in this way (it uses a fixed nonce but isn’t resistant to nonce reuse). My overall impression of age is that it uses good algorithms in non-standard ways and then justifies this with ad-hoc reasoning about why it’s safe in this specific implementation. I’d be much happier if it used existing mechanisms from libsodium, which appear to be sufficient to cover all its use-cases.

Cryptographic Doom

The age spec defines a header that lists ways to derive the file decryption key for each recipient. Here are some examples from the spec:

-> X25519 8hWaIUmk67IuRZ41zMk2V9f/w3f5qUnXLL7MGPA+zE8
tXgpAxKgqyu1jl9I/ATwFgV42ZbNgeAlvCTJ0WgvfEo
-> scrypt GixTkc7+InSPLzPNGU6cFw 18
kC4zjzi7LRutdBfOlGHCgox8SXgfYxRYhWM1qPs0ca8
-> ssh-rsa SkdmSg
SW+xNSybDWTCkWx20FnCcxlfGC889s2hRxT8+giPH2DQMMFV6DyZpveqXtNwI3ts
5rVkW/7hCBSqEPQwabC6O5ls75uNjeSURwHAaIwtQ6riL9arjVpHMl8O7GWSRnx3
NltQt08ZpBAUkBqq5JKAr20t46ZinEIsD1LsDa2EnJrn0t8Truo2beGwZGkwkE2Y
j8mC2GaqR0gUcpGwIk6QZMxOdxNSOO7jhIC32nt1w2Ep1ftk9wV1sFyQo+YYrzOx
yCDdUwQAu9oM3Ez6AWkmFyG6AvKIny8I4xgJcBt1DEYZcD5PIAt51nRJQcs2/ANP
+Y1rKeTsskMHnlRpOnMlXqoeN6A3xS+EWxFTyg1GREQeaVztuhaL6DVBB22sLskw
XBHq/XlkLWkqoLrQtNOPvLoDO80TKUORVsP1y7OyUPHqUumxj9Mn/QtsZjNCPyKN
ds7P2OLD/Jxq1o1ckzG3uzv8Vb6sqYUPmRvlXyD7/s/FURA1GetBiQEdRM34xbrB

In each case we have an algorithm identifier followed by algorithm-specific parameters. For example, in the X25519 case we have an ephemeral public key and then an encrypted file key.

But apart from syntax this header is incredibly close to JOSE! You might as well write it as follows:

{ "alg": "X25519",
  "epk": { ... } }

JOSE even supports multiple recipients (in the lesser used JSON Serialization format) and ECIES with key wrapping. But the cryptographic community has rightly beaten up JOSE for requiring this algorithm header and it led to catastrophic attacks.

Edit: Filippo Valsorda (the author of age) has pointed out that age only uses the algorithm identifier (key type) to match the recipient’s key, not to determine which algorithm to use. Age keys are uniquely linked to an algorithm in exactly the manner I suggest.

I’m fairly sure that age is not vulnerable to the same kind of attacks, but I’m not convinced it never will be. Even if there is no immediate attack, it still violates Moxie’s Cryptographic Doom Principle. Although the age header is protected by a MAC, age cannot verify that MAC until it has decrypted the file key, and in order to decrypt the file key it trusts the header to tell it what algorithm to use.

Why does this mistake keep being made? As I’ve written before, there are better ways to handle this that systematically avoid these issues.
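
One way to picture the key-driven approach is the sketch below, in which every key is permanently bound to a single algorithm and the header type is used only to find the stanzas meant for that key. The RecipientKey class, the stanza shape, and the stand-in unwrap function are all hypothetical; no real cryptography is performed.

```javascript
// A key is created for exactly one algorithm and carries its own unwrap
// implementation, so nothing in an untrusted header can switch algorithms.
class RecipientKey {
    constructor(algorithm, unwrap) {
        this.algorithm = algorithm;
        this.unwrap = unwrap; // (stanza) => file key, or null if not for this key
        Object.freeze(this);
    }
}

function unwrapFileKey(key, stanzas) {
    for (const stanza of stanzas) {
        // The header type is only a filter: stanzas for other key types are skipped.
        if (stanza.type !== key.algorithm) continue;
        const fileKey = key.unwrap(stanza);
        if (fileKey !== null) return fileKey;
    }
    throw new Error('no stanza could be unwrapped with this key');
}

// Stand-in unwrap function for demonstration purposes only.
const demoKey = new RecipientKey('X25519', (stanza) => 'file-key-from-' + stanza.body);
const demoStanzas = [
    { type: 'scrypt', body: 'stanza-1' },
    { type: 'X25519', body: 'stanza-2' },
];
const demoFileKey = unwrapFileKey(demoKey, demoStanzas);
```

Contrast this with a design that looks up an implementation from the header’s algorithm field and then applies it to whatever key the recipient happens to hold.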

In summary I think age is interesting and solving a genuine problem. But I think the design could still be improved from where it is today to provide clearer security goals and avoid potential pitfalls in the future.