This is a note of week 3 of the Coursera course Cryptography I. The topic is data integrity (not confidentiality) using **Message Authentication Code (MAC)** and hash functions.

## 1 MAC Definition

MAC is a pair of a signing algorithm and a verificatin algorithm `(S, V)`

defined over `(K, M, T)`

that `S(k, m)`

outputs `T`

and `V(k, m, t)`

outputs `yes`

when the result is `V(k, m, S(k, m)) = V(k, m, t)`

for any pair of `(k, m)`

.

Cyclic Redundancy Check (CRC) checksum algorithm is used to detect **random errors**, not malicious errors.

The threat model of MAC is using the chosen message attack (attacker’s power) to produce new valid message/tag pair (existential forgery). A secure MAC means:

- attacker cannot produce a valid tag for a new message
- given
`(m, t)`

, attacker cannot even produce a new tag for the same message`(m, t')`

for`t != t'`

A tag cannot be too short, usually it is at least 64-bit.

A secure PRF is a secure MAC if the tag is long enough, say `2^80`

. AES is a secure MAC because its tag is 16-bytes. Two constructions used in practice to covert small-MAC into a big-Mac are: CBC-MAC for banking and HMAC for Internet protocols such as SSL, IPsec, and SSH. Both convert a small-PRF into a big-PRF. Truncated MAC is secure as long as the truncated length is long enough.

## 2 ECBC, NMAC and Other Constructions

The goal that given a PRF for short messages such as AES (`n = 128bit`

), construct a PRF for long messages.

**Encrypted CBC MAC (ECBC-MAC)** takes two independent keys and a long message. It generates a tag using raw CBC chain with a key and the last block is encrypted by another independent key.

Another construction is **Nested MAC (NMAC)**. The PRF uses two keys is defined as `F: K * X -> K`

. It encrypts each block with the first key, the output becomes the input of the key of the next block. The last block output is padded and encrypted by another independent key.

Both ECBC and NMAC need the last encryption step with an independent key to make them secure. They are secure for `2^64`

messages for `128-bit`

AES. ECBC is often used as an AES-based MAC. NMAC is not usually used with AES or 3DES because it needs to change AES key on very block. NMAC is the basis for HMAC.

We cannot pad a message with all `0`

because it is insecure: `100`

and `1000`

may have the same tag. Therefore, if `m0 != m1`

, then `pad(m0) != pad(m1)`

. The ISO standard is to pad `1000..00`

and add dummy block is necessary.

There is an NIST stadard CMAC that doesn’t have dummy block. It generates two keys from a random key and at the final step, XOR padded with one key and XOR unpadded with another key.

There are parallel MACs, one-time MACs and many-time MACs.

## 3 Collision Resistence

Let `H: M -> T`

be a hash functin for `|M| >> |T|`

. A **collision** for `H`

is a pair of `m0, m1 in M and m0 != m1`

such that `H(m0) = H(m1)`

. A hash function is collision resistant if for all explicit efficient algorithms, the probability of finding collision for `H`

is negligible. `SHA-256`

is such a collisioin resistent hash function.

A secure MAC for short messages and a collision resistant hash function can be used to build a secure MAC for big messages.

Due to the Birday paradox, a generic Birthday attack has more than `50%`

success rate for `2^(n/2)`

tries for `n-bit`

tag.

The **Merkle-Damgard** iterated construction is often used to build collision large resistance function from a **compression function** for small data: from `h: T * X -> T`

we obtain `H: X^L -> T`

by iteratively apply `t = h(t, m)`

with an initial `t=IV`

to `m0, m1, ,,, ml || PB`

where `PB`

is a padding block that has the padding data and a 64-bit message length. If `h`

is collision resistant, so is `H`

.

A compression function `h`

can be built from a block cipher. If `E`

is an ideal cipher `E: K * {0, 1}^n -> {0, 1}^n`

, the **Davies-Meyer** compression function is `h(Hi, mi) = E(mi, Hi) XOR Hi`

where `i`

is a block number. **Ideal** means that `E`

is an collection of `|K|`

random permutations. Finding a collision takes `O(2^(n/2))`

evaluation of `(E, D)`

. There are many variants of this function.

The **SHA-256** uses the Merkle-Dagard construction and the Davies-Meyer compression function. The block cipher is **SHACAL-2** with a 512-bit key – the size of each message. The block size is 256-bit.

## 4 HMAC

Building a MAC out of a hash function: `S(k, m) = H(k xor opad, H(k xor ipad || m))`

. The `ipad`

and `opad`

are 512-bit constant defined by HMAC.