4.7 * NMAC and HMAC - II Private-Key (Symmetric) Cryptography 45

Until now we have seen constructions of message authentication codes that are based on pseudorandom functions (or block ciphers). A completely dif-ferent approach is taken in the NMAC and HMAC constructions which are based on collision-resistant hash functions. Loosely speaking (and not being entirely accurate), the constructions of NMAC and HMAC rely on the follow-ing assumptions regardfollow-ing the collision-resistant hash function befollow-ing used:

1. The hash function is constructed using the Merkle-Damg˚ard transform, and

2. The fixed-length collision-resistant hash function – typically called a compression function– that lies at the heart of the hash function, has certain pseudorandom or MAC-like properties.

The first assumption is true of most known collision-resistant hash functions.

The second assumption is believed to be true of the hash functions that are used in practice and believed to be collision resistant. We remark that the security of NMAC and HMAC actually rests on a weaker assumption than that described above, as we will discuss below.

We first present NMAC and then HMAC, since as we will see, HMAC can be cast as a special case of NMAC. Furthermore, it is easier to first analyze the security of NMAC and then derive the security of HMAC.

Notation – the IV in Merkle-Damg˚ard. In this section we will explicitly refer to theIV used in the Merkle-Damg˚ard transform of Construction 4.11;

recall that this is the value given to z0. In the standard Merkle-Damg˚ard construction, the IV is fixed. However, here we will wish to vary it. We denote byH_IV^s (x) the computation of Construction 4.11 on inputx, with key sandz0 set to the valueIV ∈ {0,1}^`.

4.7.1 Nested MAC (NMAC)

Let H be a hash function of the Merkle-Damg˚ard type, and let h be the compression function used inside H. For simplicity, we denote the output length of H and h by n (rather than by `(n)). The first step in the con-struction of NMAC is to consider secretly keyed versions of the compression and hash functions. This is achieved via the initialization vectorIV. Specifi-cally, for a given hash functionH^sconstructed according to Merkle-Damg˚ard and using (the non-secret) key s, define H_k^s to be the keyed hash function obtained by setting IV = k, wherek is secret. Likewise, for a compression function h^s, define a secretly keyed version by h^s_k(x) = h^s(kkx). That is, the keyed compression function works by applying the unkeyed compression function to the concatenation of the key and the message. Note that in the

Merkle-Damg˚ard construction, the use of theIV is such that the first iteration computesz1=h^s(IVkx1). HereIV =kand so we obtain thatz1=h^s(kkx) which is the same ash^s_k(x). This implies that the secret key forH andhcan be of the same size. By the construction, this equals the size of the output ofh, and so the keys are of lengthn. To summarize, we define secretly keyed compression and hash functions as follows:

• For a compression functionhwith non-secret keys, we defineh^s_k(x)^def= h^s(kkx) for a secret keyk.

• For a Merkle-Damg˚ard hash functionH with non-secret keys, we define H_k^s(x) to equal the output ofH^son input x, when theIV is set to the secret keyk. This is consistent with notation above of H_IV^s (x).

We are now ready to define the NMAC function:

CONSTRUCTION 4.13 Nested MAC (NMAC).

The NMAC construction is as follows:

• Gen(1ⁿ): upon input 1ⁿ, run the key-generation for the hash func-tion obtainings, and choosek1, k2← {0,1}ⁿ.

• Mac_k(m): upon input (s, k1, k2) and x ∈ {0,1}^∗, compute N M AC_k^s₁_,k₂(x) =h^s_k₁(H_k^s₂(x))

• Vrfy_k(m, t): Output 1 if and only ift=Mack(m)

In words, NMAC works by first applying a keyed collision-resistant hash function H^s to the inputx, and then applying a keyed compression function to the result. Notice that the input/output sizes are all appropriate (you should convince yourself of this fact). H_k^s₂ is called the inner function and h^s_k₁ theouter function.

The security of NMAC relies on the assumption thatH_k^swith a secretkis collision-resistant, and thath^s_k with a secretkconstitutes a secure MAC. In order to state the security claim formally, we define the following:

• Given a hash function (Gen, H) generated via the Merkle-Damg˚ard trans-form, define ( ˆGen,Hˆ) to be a modification where ˆGenrunsGento obtain sand in addition choosesk← {0,1}ⁿ. Furthermore, ˆH is computed ac-cording to Construction 4.11 usingIV =k(that is, ˆHs,k(x) =H_k^s(x)).

• Given a fixed-length hash function (Gen, h) mapping 2nbits ton bits, define ( ˆGen,ˆh) to be a modification where Genˆ runs Gen to obtain s and in addition choosesk← {0,1}ⁿ. Furthermore, ˆhs,k(x) =h^s(kkx).

Define a message authentication code based on ( ˆGen,ˆh) by computing Macs,k(x) = ˆhs,k(x) andVrfy in the natural way.

We have the following theorem:

THEOREM 4.14 Let (Gen, h) be a fixed-length hash function and let (Gen, H)be a hash function derived from it using the Merkle-Damg˚ard trans-form. Assume that the secretly-keyed ( ˆGen,H)ˆ defined as above is collision-resistant, and that the secretly keyed ( ˆGen,ˆh) as defined above is a secure fixed-length message authentication code. Then, NMAC described in Con-struction 4.13 is a secure(arbitrary-length) message authentication code.

We will not present a formal proof of this theorem here. Rather, an almost identical proof is given later in Section 12.4 in the context of digital signatures (see Theorem 12.5). It is a straightforward exercise to translate that proof into one that works here as well.

For now, we will be content to sketch the idea behind the proof of The-orem 4.14. Assume, by contradiction, that a polynomial-time adversary A manages to forge a MAC. Recall thatAis given an oracle and can ask for a MAC on any message it wishes. Then, it is considered to successfully forge if it outputs a valid MAC tag on any message that it did not ask its oracle. Let m^∗ denote the message for whichAproduces a forgery and letQdenote the set of queries it made to its oracle (i.e., the set of messages that for which it obtained a MAC tag). There are two possible cases:

1. Case 1 – there exists a messagem∈ Q such that H_k^s₂(m^∗) = H_k^s₂(m):

in this case, the MAC tag for m equals the MAC tag for m^∗ and so clearlyAcan successfully forge. However, this case directly contradicts the assumption that Hk is collision resistant because Afound distinct mandm^∗ for whichH_k^s₂(m^∗) =H_k^s₂(m).

2. Case 2 – for every message m ∈ Q it holds that H_k^s₂(m^∗) 6= H_k^s₂(m):

Define Q⁰ = {H_k^s₂(m) | m ∈ Q}. The important observation here is that the messagem^∗ for which the MAC forgery is formed is such that H_k^s₂(m^∗)∈ Q/ ⁰. Thus, we can consider the attack by A to be a MAC attack onh^s_k₁ where the messages are all formed asH_k^s₂(m) for somem, and the message for which the forgery is generated isH_k^s₂(m^∗). In this light, we have thatAsuccessfully generates a MAC forgery in the fixed-lengthmessage authentication codeh^s_k. This contradicts the assumption thath^s_k is a secure MAC.

Once again, for a formal proof of this fact, see the proof of Theorem 12.5 (the only modifications necessary are to replace “signature scheme” with “message authentication code”).

Security assumptions. We remark that if the inner hash function is as-sumed to be collision resistant (in the standard sense), then no secret keyk2

is needed. The reason why NMAC (and HMAC below) were designed with secret keying for the inner function is that it allows us to make a weaker

assumption on the security of H^s. Specifically, even if it is possible to find collisions in H^s, it does not mean that it is possible to find collisions in the secretly keyed versionH_k^s. Furthermore, even if it is possible to find collisions, notice that the attacker cannot compute millions of hashes by itself but must query its MAC oracle (in real life, this means that it must obtain these values from the legitimately communicating parties). To make its life even harder, the attacker does not even receiveH_k^s₂(x); rather it receivesh^s_k₁(H_k^s₂(x)) that at the very least hides much ofH_k^s₂(x).

Despite the above, it is not clear that the assumption that H_k^sis collision resistant is significantly “better” than the assumption that H^s is collision resistant. For example, the new attacks on MD5 and other hash functions that we mentioned above work for any IV. Furthermore, the prefix of any message in a Merkle-Damg˚ard hash function can essentially be looked at as an IV. Thus, if it is possible to obtain a single value H_k^s(x) for some x, then one can define IV = H_k^s(x). Then, a collision in H_IV^s can be used to derive a collision inH_k^s, even though a secret keykis used. We reiterate that in NMAC, the adversary actually only receivesh^s_k₁(H_k^s₂(x)) and notH_k^s₂(x).

Nevertheless, the statement of security in Theorem 4.14 does not use this fact (and currently it is not known how to utilize it).

4.7.2 HMAC

The only disadvantage of NMAC is that the IV of the underlying hash function must be modified. In practice this can be annoying because theIV is fixed by the function specification and so existing cryptographic libraries do not enable an externalIV input. Essentially, HMAC solves this problem by keying the compression and hash functions by concatenating the input to the key. Thus, the specification of HMAC refers to an unkeyed hash function H. Another difference is that HMAC uses a single secret key, rather than two secret keys.

As in NMAC, letH be a hash function of the Merkle-Damg˚ard type, and lethbe the compression function used insideH. For the sake of concreteness here, we will assume thathcompresses its input by exactly one half and so its input is of length 2n (the description of HMAC depends on this length and so must be modified for the general case; we leave this for an exercise). The HMAC construction uses two (rather arbitrarily chosen) fixed constantsopad andipad. These are two strings of lengthn(i.e., the length of a single block of the input to H) and are defined as follows. The stringopad is formed by repeating the byte 36 in hexadecimal as many times as needed; the stringipad is formed in the same way using the byte 5C. The HMAC construction, with an arbitrary fixed IV, is as follows (recall that ‘k’ denotes concatenation):

We stress that “k⊕ipad k x” means that k is exclusively-ored with ipad and the result is concatenated withx. Construction 4.15 looks very different from Construction 4.13. However, as we will see, it is possible to view HMAC as a special case of NMAC. First, assume thath^s_kwith a secretkbehaves like

CONSTRUCTION 4.15 HMAC.

The HMAC construction is as follows:

• Gen(1ⁿ): upon input 1ⁿ, run the key-generation for the hash func-tion obtainings, and choosek← {0,1}ⁿ.

• Mack(m): upon input (s, k) andx∈ {0,1}^∗, compute HM AC_k^s(x) =H_IV^s

k⊕opadkHIV

k⊕ipadkx

and output the result.

• Vrfy_k(m, t): output 1 if and only ift=Mack(m).

a pseudorandom function (note that SHA-1 is often used as a pseudorandom generator in practice and so this is believed to be the case). Now, the first step in the computation of the inner hashH_IV^s (k⊕ipad kx) isz1=h^s(IV kk⊕ ipad). Likewise, the first step in the computation of the outer hash is z₁⁰ = h^s(IV kk⊕opad). Looking athas a pseudorandom function, this implies that h^s(IV kk⊕ipad) andh^s(IV kk⊕opad) are essential different pseudorandom keysk1andk2. Thus, denotingk1=h^s(IV kk⊕ipad) andk2=h^s(IV kk⊕ opad) we have

HM AC_k^s(x) =H_IV^s k⊕opadkH k⊕ipadkx

=H_k^s₁(H_k^s₂(x)).

Notice now that the outer application ofH is to the output ofH_k^s₂(x) and so is just a single computation ofh^s. We therefore conclude that

HM ACk(x) =h^s_k₁(H_k^s₂(x))

which is exactly NMAC. In order to formally restate the theorem of NMAC for HMAC, we just need to assume that the key derivation ofk1 andk2using ipadandopadyields two pseudorandom keys, and everything else remains the same. We formally state this by defining the functionGby

G(k) =h^s(IVkk⊕opad)kh^s(IVkk⊕ipad)

and requiring that it be a pseudorandom generator with expansion factor 2n (see Definition 3.15 in Chapter 3). Observe that the generator doubles its length and so produces two pseudorandom keys each of lengthnfrom a single random key of length n. Keying the hash functions as we have described above, we have the following:

THEOREM 4.16 Assume that(Gen, H), ( ˆGen,Hˆ), (Gen, h)and( ˆGen,ˆh) are all as in Theorem 4.14. In addition, assume that Gas defined above is a pseudorandom generator. Then, HMAC described in Construction 4.15 is a secure(arbitrary-length) message authentication code.

HMAC in practice. HMAC is an industry standard and is widely used in practice. It is highly efficient and easy to implement, and is supported by a proof of security (based on assumptions that are believed to hold for all hash functions in practice that are considered collision resistant). The importance of HMAC is partially due to the timeliness of its appearance. When HMAC was presented, many practitioners refused to use CBC-MAC (with the claim that is is “too slow”) and instead used heuristic constructions that were often insecure. For example, a MAC was defined asH(kkx) whereH is a collision-resistant hash function. It is not difficult to show that whenH is constructed using Merkle-Damg˚ard, this is not a secure MAC at all.

Dans le document II Private-Key (Symmetric) Cryptography 45 (Page 146-151)