I still don’t really get “hash shucking”

If you want to learn how to store passwords securely, you could do a lot worse than looking at the OWASP Password Storage Cheat Sheet. These cheat sheets are generally pretty good, and the password storage one is particularly good. The editors do a great job of keeping it up to date and incorporating the latest research from experts. (Just bear in mind that the recommendations there are when using password for authentication. If you’re using a password to encrypt sensitive data then you should be aware of some limitations).

One of the hash functions that OWASP recommend is bcrypt, which should be familiar to anyone who’s ever looked at password hashing. Bcrypt is generally an ok choice, but it has some quirks that make it hard to love. As pointed out in the cheat sheet, many implementations cannot handle input passwords longer than 72 bytes. (And some implementations are not binary safe either). To get around this, it was common advice at one point to “pre-hash” the input using some other fast hash function like SHA-256. That is, rather than the stored password hash being bcrypt(password) it was bcrypt(sha256(password)) or something similar. This was also sometimes done when an old insecure password database using something like unsalted MD5 was upgraded by simply re-hashing the existing hashes with bcrypt: md5(password) -> bcrypt(md5(password)).

On the face of it, this seems like a reasonable and safe thing to do. After all, if someone gets a copy of your password database they will be faced with a list of hard-to-crack bcrypt hashes, rather than raw unsalted MD5 or SHA-1 or whatever.

Continue reading “I still don’t really get “hash shucking””

Entity authentication with a KEM

In cryptography, the process of authenticating a user (or app/service) is known as entity authentication or identification (to distinguish it from message authentication or data origin authentication). There are lots of ways to do this. In this post I’m going to talk about authentication schemes based on public key cryptography. It turns out that the notion of Key Encapsulation Mechanism (KEM), originally invented for (unauthenticated) public key encryption, also provides a unifying framework for thinking about public key entity authentication.

Continue reading “Entity authentication with a KEM”

Book review: The Joy of Cryptography

Mike Rosulek, Oregon State University. Draft of January 3, 2021. Online: The Joy of Cryptography.

This is a freely-available book covering introductory material on cryptography. It’s suitable for anyone with undergraduate-level computer science knowledge. As is often the case in cryptography textbooks, there is a brief review of mathematical background in the first (or zeroth in this case) chapter: modular arithmetic, probability, and Big-O notation. This chapter is brief but clear and establishes the sense of humour that is present throughout the book.

Overall, the book is very well-presented and -written for the target audience—undergraduates and practitioners. I have read several other textbooks on cryptography, but most are pitched more at a level for post-graduate students (even if apparently meant for undergrads). The writing here is always clear and focused, without leaving out essential details. In particular, I appreciated that security definitions are immediately followed by a discussion on the implications of that definition, and what it leaves out.

On a more basic note, the book is beautifully presented, and clearly a lot of thought has gone into the typography, layout, and diagrams (which have a pleasing “tightness” in their use of space).

It is largely a book about symmetric cryptography—the first 12 chapters cover symmetric material in depth, and only the last 3 chapters cover public key crypto. I actually quite like this split, because I believe many topics that are discussed in the context of public key cryptography have simpler illustrations in a symmetric setting. Still, you may find the public key material a little sparse. There is no discussion of elliptic curves, for example, or the developing field of post-quantum cryptography. There is also no material on hybrid encryption (i.e., using a combination of public key and symmetric primitives) or the KEM-DEM paradigm, which I would expect in any modern textbook.

Update: Mike Rosulek pointed out that the book does actually cover hybrid encryption in the very last section: 15.4. Mea culpa.

It is a draft though, and the roadmap makes clear that some subjects that I would expect to be covered are coming at some point. On the other hand, by the time you finish the book you will know enough to tackle some of the more comprehensive textbooks, such as Smart’s Cryptography Made Simple (lol) or Boneh & Shoup’s Graduate Course in Applied Cryptography.

The presentation is also quite slow, taking its time to develop topics and foster a real understanding. This is both a blessing and a curse. The clarity is welcome, but you have to get to chapter 12 before authenticated encryption is covered. If you are looking to find out “what should I be using?”, waiting 12 chapters to find out might be testing your patience a little. But if you want to really understand the topics and what modern cryptography is all about, your patience is paid off.

The approach used in the book to proving security of schemes is that of code-based games. The idea here is that a cryptographic scheme is developed as a kind of abstract data type or API, defining the operations that can be performed: generating a key, encrypting a message, decrypting a message, and so on. An adversary is then given access to two implementations of that API: one that is our real encryption scheme, and another that is some idealised (typically random) implementation of it. The API the attacker sees is restricted in some ways compared to what a real user would see (e.g., the attacker doesn’t get to see the key used for encryption). A scheme is secure if we can prove that no “reasonable” adversary can tell which implementation they have been given any better than a blind guess.

A proof in this scheme proceeds by replacing parts of the real implementation with idealised versions, until we can show that an attack on the real implementation is equivalent to an attack on the idealised version. Such proofs are always relative to some basic set of primitives that we believe to be close enough to the ideal versions already (generally because a lot of smart people have tried to break them for a long time and failed).

This is a different approach than a lot of textbooks, which use this indistinguishability proof style only for encryption, and then switch to other types of games for other security notions (such as message authentication). I found this uniform presentation very helpful in understanding the material and in seeing connections between security notions that I hadn’t appreciated before. For example, this book is the first time I’ve appreciated the importance of the Decisional Diffie-Hellman assumption to proving pseudorandomness of key agreement, which then motivates the use of quadratic residues and safe primes. (Although it’s missing a follow-up discussion on the use of hashing/random oracles to relax this assumption, which is what you would do in practice—a chapter on hybrid encryption would help clarify this).

An another example, the book introduces secret sharing early on and then points out that a one-time pad can be viewed as a simple 2-of-2 secret sharing technique, a connection I had never made before. Secret sharing can be viewed as a type of multi-party computation (MPC) and the author is an MPC researcher and co-author of an MPC textbook. This specialist knowledge provides a welcome fresh perspective on otherwise familiar material.

Overall, this is an excellent introduction to cryptography, with a novel and engaging approach to security proofs. The draft status and some lacking topic areas make me hesitant to recommend it as your sole introduction to cryptography just yet, but as a companion to another textbook it has a lot to recommend it. Even if you believe you already know all this material, you may still gain a lot from reading it: I know I did.

A few programming language features I’d like to see

I enjoyed Hillel Wayne’s recent newsletter about microfeatures they’d like to see in programming languages. A “microfeature” is essentially a small convenience that makes programming in that language a bit easier without fundamentally changing it. I love this idea. I’m partial to a bit of syntactic sugar, even if it can cause cancer of the semicolon. I have a few features that I’d love to see more languages adopt, so I thought I’d join in the fun. Unlike Hillel’s, however, most of mine are much larger changes. I’ve long thought it was time for a bit more experimentalism to return to programming language design, and not just for type systems! Maybe these ideas will inspire you to think up some whacky new programming languages. Let me know in the comments.

Continue reading “A few programming language features I’d like to see”

On PBKDF2 iterations

There has been a lot of discussion recently around the LastPass breach, especially with regards to the number of PBKDF2 iterations applied to the master password to derive the vault encryption key. Other people have already dissected this particular breach, but I want to more generally talk about PBKDF2 iterations and security models. (I’m not going to talk about Argon2 or Bcrypt or any other algorithms).

Continue reading “On PBKDF2 iterations”

A few clarifications about CVE-2022-21449

Just a few quick notes/updates to correct some potentially inaccurate statements that are floating around on Reddit/Twitter etc:

  • The bug only impacts Java 15 and above. The original advisory from Oracle incorrectly listed earlier versions (like 7, 8 and 11) as being impacted. They have since corrected this. Note that they now only list 17 and 18, because 15 and 16 are no longer supported.
  • Bouncy Castle is not impacted by this vulnerability. They have their own ECDSA implementation, and it performs the relevant check to prevent this bug.
  • Although an all-zero signature value is the simplest way to exploit this, there are several alternative values that exhibit the same bug. As previously mentioned, Project Wycheproof has an excellent selection of test vectors for this bug and many variations on it, in different signature formats, and for different elliptic curves.
  • On a related note, some JWT libraries were initially assumed to be unaffected because a quirk of re-encoding raw (IEEE P1363) format signatures into ASN.1 format rejected zero values. But, as pointed out above, there are other invalid values that are not rejected by this conversion that still trigger the bug. Either upgrade your JVM, or your JWT library, and ideally both.
  • Some JWT libraries also apparently accept signature values in several alternative encodings, so if you are checking for bad signatures in a pre-processing step then you have even more values to check. Again, best to update to get the patches rather than trying to fix this yourself.

CVE-2022-21449: Psychic Signatures in Java

The long-running BBC sci-fi show Doctor Who has a recurring plot device where the Doctor manages to get out of trouble by showing an identity card which is actually completely blank. Of course, this being Doctor Who, the card is really made out of a special “psychic paper“, which causes the person looking at it to see whatever the Doctor wants them to see: a security pass, a warrant, or whatever.

Doctor Who holding up a blank ID card with a badly superimposed image of Duke (the Java mascot) holding a glass of wine.
“Looks legit to me. Hic!

It turns out that some recent releases of Java were vulnerable to a similar kind of trick, in the implementation of widely-used ECDSA signatures. If you are running one of the vulnerable versions then an attacker can easily forge some types of SSL certificates and handshakes (allowing interception and modification of communications), signed JWTs, SAML assertions or OIDC id tokens, and even WebAuthn authentication messages. All using the digital equivalent of a blank piece of paper.

It’s hard to overstate the severity of this bug. If you are using ECDSA signatures for any of these security mechanisms, then an attacker can trivially and completely bypass them if your server is running any Java 15, 16, 17, or 18 version before the April 2022 Critical Patch Update (CPU). For context, almost all WebAuthn/FIDO devices in the real world (including Yubikeys*) use ECDSA signatures and many OIDC providers use ECDSA-signed JWTs.

If you have deployed Java 15, Java 16, Java 17, or Java 18 in production then you should stop what you are doing and immediately update to install the fixes in the April 2022 Critical Patch Update.

Update: the official announcement from Oracle also lists older versions of Java, including 7, 8 and 11. Although I’m not aware of the bug impacting those older implementations they did fix a similar bug in the (non-EC) DSA implementation at the same time, so it’s possible older versions are also impacted. There are also other security vulnerabilities reported in the same CPU, so (as always) it is worth upgrading even if you are running an older Java version. The OpenJDK advisory on the other hand lists only versions 15, 17, and 18 as affected by this specific issue (CVE-2022-21449).

Update 2: Oracle have informed me they are in the process of correcting the advisory to state that only versions 15-18 are impacted. The CVE has already been updated. Note that 15 and 16 are no longer supported, so it will only list 17 and 18 as impacted.

Continue reading “CVE-2022-21449: Psychic Signatures in Java”

Is Datalog a good language for authorization?

Datalog is a logic programming language, based on Prolog, which is seeing something of a resurgence in interest in recent years. In particular, several recent approaches to authorization (working out who can do what) have used Datalog as the logical basis for access control decisions. On the face of it, this seems like a perfect fit, and there’s a lot to recommend it. I myself have been a fan of Datalog since first coming across it at the start of my PhD studies back in 2003, and have even written papers advocating for it. However, although I think it has a lot of benefits, I think there is some confusion about some of its complexity results that means it is not always as good a fit as you may be led to believe.

Continue reading “Is Datalog a good language for authorization?”

Why the OAuth mTLS spec is more interesting than you might think

I was catching up on the always excellent Security. Cryptography. Whatever. podcast, and enjoyed the episode with Colm MacCárthaigh about a bunch of topics around TLS. It’s a great episode that touches a lot of subjects I’m interested in, so go ahead and listen to it if you haven’t already, and definitely subscribe. I want to pick up on one of the topics in the podcast in this article, and discuss part of the OAuth specs that I think deserves to be better known.

Continue reading “Why the OAuth mTLS spec is more interesting than you might think”

Multiple input MACs

When working with Message Authentication Codes (MACs), you often need to authenticate not just a single string, but multiple fields of data. For example, when creating an authenticated encryption mode by composing a cipher and a MAC (like AES-CBC and HMAC), you need to ensure the MAC covers the IV, associated data, and the ciphertext. But you can’t just throw these three fields into the MAC one after the other and hope for the best, as this often leads to attacks due to ambiguity of where one field ends and the next begins.

A meme in which Anakin Skywalker is telling Padme that he "encrypted the message with CBC-then-HMAC". Padme replies "You included the IV in the MAC, right?" and then repeats the question looking more concerned.
Here, have a tangentially-related meme I made.

One way to solve this is to encode the fields into a single string that is unambiguously formatted. The excellent blog post I just linked to describes how to do that using the encoding defined by PASETO:

tag = hmac(key, encode(iv, associated_data, ciphertext))

In this blog post I’ll describe an alternative approach in which we adjust the MAC interface to natively accept multiple input arguments, so that the API ensures they are processed unambiguously. This new API not only better reflects how MACs are used in practice, making it harder to misuse, but can also result in better efficiency.

Continue reading “Multiple input MACs”
%d bloggers like this: