A quick crypto lesson – why “MAC then encrypt” is a bad choice

In light of the numerous recent attacks against SSL, I thought I’d offer up a quick and simple crypto lesson about why MAC-then-encrypt schemes are bad. This post will require only a minimum of knowledge about cryptography, so hopefully it’ll be useful to a wide range of people.

This is not designed to be a full and detailed description of how SSL works, or of how the various attacks against it work, but rather a short primer on the subject for those who know a bit about crypto but don’t really understand how something as seemingly strong as SSL might be broken. Some parts have been generalised or simplified for brevity and ease of understanding, so please don’t take anything I say here as a literal description of how it all works.

Anyway, let’s get started…

A secure network protocol has two main jobs:

  1. Keep the information in the conversation completely confidential.
  2. Prevent an attacker from tampering with the conversation.

The first part, as you probably already know, is performed by encryption. This usually involves exchanging one or more secret session keys between two endpoints, then using them with a cipher of some kind in order to provide safety against eavesdroppers.

The second part is a little more involved. In this case, when I say “tampering with the conversation”, I mean forging packets that look like they came from a legitimate endpoint, in such a way that they have a meaningful effect on the security of the conversation. This part is often implemented via a Message Authentication Code (MAC), which verifies that all data received was in fact sent by an authorised endpoint. Usually an HMAC will be used, which is a keyed construction built on a cryptographic hash function. By using the session key as the key for the HMAC, it is possible to produce a tag for the payload that cannot be forged by anyone who does not know the session key. By computing the same HMAC on the receiving end, using the same session key, it is possible to verify the authenticity of the data.
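As a minimal sketch of the idea, here’s what computing and verifying an HMAC looks like using Python’s standard `hmac` module. The session key here is just a hypothetical placeholder; in a real protocol it would come from the key exchange:

```python
import hashlib
import hmac

# Hypothetical session key shared by both endpoints (illustrative only;
# a real protocol derives this during the handshake).
session_key = b"0123456789abcdef"

def mac(payload: bytes) -> bytes:
    # HMAC-SHA-256 over the payload, keyed with the session key.
    return hmac.new(session_key, payload, hashlib.sha256).digest()

def verify(payload: bytes, tag: bytes) -> bool:
    # Recompute the tag and compare in constant time to avoid timing leaks.
    return hmac.compare_digest(mac(payload), tag)
```

Without the key, an attacker cannot produce a tag that `verify` will accept for a message of their choosing.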

However, there’s a catch. One implementation option, called MAC-then-encrypt, is to compute the MAC on the plaintext data, then encrypt the data. The receiving endpoint then decrypts the data using the session key, and verifies its authenticity. Unfortunately, this means that an unauthenticated attacker can send arbitrary messages, and the receiving endpoint must decrypt them first in order to verify the MAC. Without knowing the session key, the attacker will likely produce garbage data after decryption, and the MAC will not match.
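The MAC-then-encrypt ordering can be sketched as follows. A toy XOR “cipher” stands in for a real block cipher here (it is emphatically not secure) — the point is only the order of operations: the receiver has no choice but to run attacker-controlled ciphertext through the decryption code before it can check anything.

```python
import hashlib
import hmac
import itertools

session_key = b"0123456789abcdef"  # hypothetical shared secret

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Stand-in for a real cipher: XOR with a repeating key (NOT secure).
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(key)))

toy_decrypt = toy_encrypt  # XOR is its own inverse

def mac_then_encrypt(plaintext: bytes) -> bytes:
    # The MAC is computed over the *plaintext*, then everything is encrypted.
    tag = hmac.new(session_key, plaintext, hashlib.sha256).digest()
    return toy_encrypt(session_key, plaintext + tag)

def receive(message: bytes) -> bytes:
    # Note the order: we must decrypt *before* we can check the MAC, so
    # unauthenticated ciphertext always reaches the decryption code first.
    data = toy_decrypt(session_key, message)
    plaintext, tag = data[:-32], data[-32:]
    expected = hmac.new(session_key, plaintext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC check failed")
    return plaintext
```

A forged or bit-flipped message is still rejected in the end — but only after the decryption (and, in a real protocol, padding removal) has already run on the attacker’s input.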

There is, however, an interesting trick that can be done here. Block ciphers require the length of all plaintext messages to be a multiple of the cipher’s block size. Since there is often a length discrepancy, padding is used to ensure that the message length is extended to fit. There are many different algorithms for generating padding data, but the padding is usually dependent on the plaintext length in some way. This padding is checked during the decryption phase, and invalid padding results in an error. An attacker can flip certain bits in the ciphertext to modify this padding, and identify changes in behaviour and timing based on these altered bits. This is called a padding oracle attack, and can lead to full discovery of the plaintext.
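A sketch of the kind of padding check involved (this is the generic PKCS#7-style variant; TLS uses a similar CBC padding scheme) shows where the oracle comes from — whether or not the error below is raised, and how quickly, is exactly the signal the attacker observes:

```python
def unpad(data: bytes, block_size: int = 16) -> bytes:
    # PKCS#7-style check: the last byte gives the padding length, and
    # every padding byte must equal that length.
    if not data or len(data) % block_size != 0:
        raise ValueError("bad length")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        # In a MAC-then-encrypt design, this error (or its timing) is
        # observable to the attacker: it is the "padding oracle".
        raise ValueError("bad padding")
    return data[:-n]
```

For example, `unpad(b"hello world" + bytes([5]) * 5)` strips the five `0x05` bytes, while a decryption that garbles even one padding byte raises the error instead.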

A better solution, called encrypt-then-MAC, is to encrypt the data first, then compute the MAC of the ciphertext. This leads to a situation where the receiving endpoint checks the MAC first, before performing decryption, and drops the connection if the MAC is incorrect. Since the attacker can’t forge the MAC without knowing the session key, this completely negates the padding oracle attack.
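The encrypt-then-MAC ordering looks like this, again with a toy XOR cipher standing in for a real one. The key difference from the earlier sketch is that `receive` verifies the tag over the ciphertext and drops the message before any decryption or padding logic ever runs:

```python
import hashlib
import hmac
import itertools

session_key = b"0123456789abcdef"  # hypothetical shared secret

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Stand-in for a real cipher: XOR with a repeating key (NOT secure).
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(key)))

def encrypt_then_mac(plaintext: bytes) -> bytes:
    ciphertext = toy_encrypt(session_key, plaintext)
    # The MAC covers the *ciphertext*, so it can be checked before decryption.
    tag = hmac.new(session_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag

def receive(message: bytes) -> bytes:
    ciphertext, tag = message[:-32], message[-32:]
    expected = hmac.new(session_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC check failed")  # drop before ever decrypting
    return toy_encrypt(session_key, ciphertext)
```

(Real designs also derive separate keys for encryption and for the MAC rather than reusing one key for both; a single key is used here purely to keep the sketch short.)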

How is all of this relevant to SSL? Well, SSL and TLS (up to and including TLS 1.2) use a MAC-then-encrypt scheme for their CBC cipher suites, and this contributed to various attacks, including BEAST and Lucky 13. The explicit per-record IVs introduced in TLS 1.1 prevent BEAST, but Lucky 13 showed that CBC padding in a MAC-then-encrypt design remained exploitable even in TLS 1.1 and 1.2, and had to be mitigated with careful constant-time implementations.

Hopefully this has given you some insight into one of the ways that SSL can be vulnerable.


4 thoughts on “A quick crypto lesson – why “MAC then encrypt” is a bad choice”

  1. Thanks for the write-up on this. I’ve heard Moxie Marlinspike refer to this as the doomsday principle, but couldn’t find any additional written information from him about it.

  2. Your post makes a good justification for encrypt-then-MAC to reduce the CPU utilization for rejecting spoofed packets.

    But wouldn’t using padding not based on the plaintext be just as secure and prevent a padding oracle attack?

    1. The padding data has to contain the length of the padding, in order to remove it later. This automatically makes the padding reliant on the size of the message in relation to the block size, i.e. padLength = blockSize - (msgLength % blockSize).

      One mechanism of padding is to always force at least one padding byte, with the length of the padding encoded into the last byte of the padding. So, if you need 3 bytes of padding, your padding might look like 03 03 03. If you need only one byte of padding, your padding is just a single 01 byte. If you don’t need any padding (i.e. the message fits into blocks perfectly) then you add an entire block of padding, e.g. sixteen 0x10 bytes for a 128-bit block.

      When you strip the padding, you just look at the last byte, then remove that many bytes from the total message, checking that they all have the same value. The padding is invalid if that length byte exceeds the maximum possible padding length (i.e. the block size in bytes), or if the padding bytes preceding it do not match it. This makes for pretty robust padding.

      The trick here is that the padding gets encrypted too. By flipping bits in the ciphertext and observing whether the padding check still passes, you can deduce which ciphertext bits affect the padding and which affect the data: a single bit flipped in the ciphertext will either cause a padding failure, in which case you can deduce that it maps to a plaintext padding bit, or leave the padding valid, in which case you can deduce that it maps to a plaintext data bit. There’s a lot more to it than this (each bit doesn’t have a 1:1 relationship) but that’s the general idea.

      The problem with your suggestion is that the padding has to be reliant on the data in some way, in order to be validated and stripped. If you make it totally independent of the data, how do you know how many bytes to strip? The only trick I can think of is to pick a random byte that is not equal to the last byte of plaintext, and use that as padding. However, that mechanism would fail to catch invalid padding under a wrong key, as it would just assume that the last byte of the garbled plaintext was a single padding byte.
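The pad/strip mechanism described in this reply (essentially PKCS#7 padding) can be sketched in Python; the block size and function names here are just illustrative:

```python
def pad(msg: bytes, block_size: int = 16) -> bytes:
    # Always add at least one byte; a perfectly aligned message gets a
    # whole extra block of padding (e.g. sixteen 0x10 bytes).
    n = block_size - (len(msg) % block_size)
    return msg + bytes([n]) * n

def strip(padded: bytes, block_size: int = 16) -> bytes:
    # The last byte encodes the padding length; every padding byte must
    # equal that length, and it must not exceed the block size.
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

For example, `pad(b"abc")` appends thirteen `0x0D` bytes to fill the block, and `strip` removes them again while rejecting anything malformed.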
