Educational reference explaining hash length extension attacks — which algorithms are vulnerable, how the Merkle-Damgård construction enables the attack, and how to defend against it.
| Algorithm | Family | Block Size | Status | Recommendation |
|---|---|---|---|---|
| MD5 | Merkle-Damgård | 64B | Vulnerable | Do not use MD5 for any security purpose. Use HMAC-SHA256 or SHA-3. |
| SHA-1 | Merkle-Damgård | 64B | Vulnerable | SHA-1 is deprecated. Use SHA-256 with HMAC for authentication. |
| SHA-256 | Merkle-Damgård | 64B | Vulnerable | Use HMAC-SHA256 instead of bare SHA-256 for MACs. Never use H(key || message) as a MAC. |
| SHA-512 | Merkle-Damgård | 128B | Vulnerable | Use HMAC-SHA512 or SHA-3 with proper key handling. |
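| SHA-3 (Keccak) | Sponge | 136B (SHA3-256 rate) | Safe | SHA-3 is not vulnerable to length extension, so H(key || message) is safe as a MAC. A standard keyed construction such as KMAC or HMAC-SHA3 is still recommended. |
| BLAKE2 | HAIFA | 64B (BLAKE2s) / 128B (BLAKE2b) | Safe | BLAKE2b/BLAKE2s are excellent choices for fast, secure hashing. Use their built-in keyed mode for MACs. |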
| BLAKE3 | Merkle tree | 64B | Safe | BLAKE3 is an excellent modern choice for both hashing and keyed MACs. |
| HMAC-SHA256 | HMAC | 64B | Safe | HMAC with any of the SHA-2 family is the standard solution. This is the correct way to use SHA-2 as a MAC. |
Merkle-Damgård (MD) is a design pattern used by MD5, SHA-1, SHA-256, and SHA-512. A message is divided into fixed-size blocks. An iterative compression function f processes each block together with the current state H:

H₀ = IV (initialization vector)
H₁ = f(H₀, block₁)
H₂ = f(H₁, block₂)
...
Hₙ = f(Hₙ₋₁, blockₙ)
output = Hₙ

The crucial property: the output IS the internal state. This means that if you have the hash output, you have the state needed to continue hashing.
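The chaining above can be sketched with a toy construction. This is a minimal illustration, not a real hash: the 16-byte block size, the all-zero IV, and the compression function `f` (a truncated SHA-256 call) are assumptions chosen only to make the "output is the state" property visible. Padding is left out by using block-aligned input.

```python
import hashlib

BLOCK = 16          # toy block size in bytes (assumption for illustration)
IV = bytes(16)      # toy initialization vector (assumption)

def f(state: bytes, block: bytes) -> bytes:
    """Toy compression function: mixes the current state with one block."""
    return hashlib.sha256(state + block).digest()[:16]

def md_iterate(blocks, state=IV):
    """H_i = f(H_{i-1}, block_i); the final state is returned as the digest."""
    for block in blocks:
        state = f(state, block)
    return state

blocks = [b"A" * BLOCK, b"B" * BLOCK]
digest = md_iterate(blocks)

# Anyone holding only `digest` can keep hashing as if they knew `blocks`.
extra = b"C" * BLOCK
continued = md_iterate([extra], state=digest)
assert continued == md_iterate(blocks + [extra])
```

The final assertion holds because the digest of the first two blocks is exactly the state the third call to `f` would have received.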
Suppose a server computes MAC = SHA-256(secret_key || message) and sends it with the message. An attacker who knows:

1. The hash value (MAC)
2. The length of secret_key (or can guess it)
3. The message content

can compute a valid MAC for a forged message:

message || padding || extra_data

Steps:

1. Load the known hash as the starting internal state
2. Continue hashing from that state with the extra data
3. The result is a valid hash for (key || message || padding || extra_data)
4. The server verifies SHA-256(key || forged_message), and it matches

Tools like hashpump and hash_extender automate this attack.
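The steps above can be reproduced in a short, self-contained sketch. Python's `hashlib` does not let a caller set the internal state, so the sketch carries its own SHA-256 compression function. The names `compress`, `md_padding`, and `extend`, and the sample key and message, are illustrative assumptions. The round constants are derived from cube roots of the first 64 primes rather than typed in.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _primes(n):
    """First n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def _icbrt(n):
    """Exact integer cube root via binary search."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants: first 32 fractional bits of the cube roots of the primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """One SHA-256 compression step over a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def md_padding(total_len):
    """SHA-256 padding for a message of total_len bytes."""
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def extend(known_mac_hex, known_len, extra):
    """Forge a MAC given only the old MAC and len(key) + len(message)."""
    glue = md_padding(known_len)
    # Step 1: load the known hash as the starting internal state.
    state = list(struct.unpack(">8I", bytes.fromhex(known_mac_hex)))
    # Step 2: continue hashing; the final padding covers the full forged length.
    tail = extra + md_padding(known_len + len(glue) + len(extra))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state).hex()

# Server side (the attacker never sees `key`, only its length).
key = b"sixteen byte key"
message = b"user=alice&role=user"
mac = hashlib.sha256(key + message).hexdigest()

# Attacker side.
glue, forged_mac = extend(mac, len(key) + len(message), b"&role=admin")
forged_message = message + glue + b"&role=admin"

# Step 4: the server's own check accepts the forgery.
assert hashlib.sha256(key + forged_message).hexdigest() == forged_mac
```

If the key length is unknown, the attacker side can be wrapped in a loop over candidate lengths, since only `md_padding` depends on it.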
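Before hashing, MD functions append padding to the end of the message (once, not per block) so that the total length is a multiple of the block size:

1. Append a single 0x80 byte
2. Append zero bytes until length ≡ (blockSize - lengthFieldSize) mod blockSize, where lengthFieldSize is 8 bytes for MD5, SHA-1, and SHA-256, and 16 bytes for SHA-512
3. Append the original message length in bits as a fixed-size integer: 64-bit big-endian for SHA-1 and SHA-256, 64-bit little-endian for MD5, 128-bit big-endian for SHA-512

This padding becomes part of the forged message. The attacker must know the key length to compute it correctly. If the length is unknown, they can try each plausible value until the server accepts a forgery.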
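The padding rules can be written as one small function. This is a sketch: the function name `md_padding` and the parameter table are illustrative, with block sizes and length-field formats taken from the algorithm definitions for MD5, SHA-1, SHA-256, and SHA-512.

```python
def md_padding(message_len: int, algorithm: str = "sha256") -> bytes:
    """Padding appended to a message of message_len bytes.

    For a length extension attack, message_len is
    len(secret_key) + len(message).
    """
    # (block size, length-field size in bytes, byte order of the length field)
    params = {
        "md5": (64, 8, "little"),
        "sha1": (64, 8, "big"),
        "sha256": (64, 8, "big"),
        "sha512": (128, 16, "big"),
    }
    block, field, order = params[algorithm]
    # Zero bytes needed after 0x80 so that the length field ends a block.
    zeros = (block - field - 1 - message_len) % block
    bit_length = (message_len * 8).to_bytes(field, order)
    return b"\x80" + b"\x00" * zeros + bit_length

# 16-byte key plus 32-byte message, hashed with SHA-256.
glue = md_padding(16 + 32)
print(glue.hex(" "))  # 80 00 00 00 00 00 00 00 00 00 00 00 00 00 01 80
```

The trailing `01 80` is 384, the bit length of the 48-byte input. The same call with `"md5"` yields the length field in reverse byte order, and with `"sha512"` yields an 80-byte padding that fills a 128-byte block.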
Flickr API (2009): request signatures were computed as MD5(secret || parameters) rather than with HMAC, which made them forgeable by length extension.

Multiple web applications have used H(secret || data) as a MAC.

Affected pattern: any code using hash(secret || user_controlled_data) as an authentication or integrity check.

Secure alternatives:
• HMAC, computed as H((K XOR opad) || H((K XOR ipad) || msg)): standard and well-studied
• SHA-3 / BLAKE2 / BLAKE3: not vulnerable by design
• Authenticated Encryption (AES-GCM): preferred when both encryption and integrity are needed
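As a sketch of the first two alternatives, the example below spells out the HMAC formula by hand and checks it against Python's standard `hmac` module, then shows BLAKE2b's keyed mode from `hashlib`. The helper names `hmac_sha256_manual` and `verify`, and the sample key and message, are assumptions for illustration. In practice the library call is the one to use.

```python
import hashlib
import hmac

def hmac_sha256_manual(key: bytes, msg: bytes) -> bytes:
    """H((K XOR opad) || H((K XOR ipad) || msg)) written out step by step."""
    block_size = 64                       # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")  # pad the key to one full block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()

def verify(key: bytes, msg: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_tag)

key = b"sixteen byte key"
msg = b"user=alice&role=user"

tag = hmac.new(key, msg, hashlib.sha256).digest()
assert hmac_sha256_manual(key, msg) == tag
assert verify(key, msg, tag)
assert not verify(key, msg + b"&role=admin", tag)

# BLAKE2 has a built-in keyed mode, so no HMAC wrapper is needed.
blake_tag = hashlib.blake2b(msg, key=key, digest_size=32).digest()
```

The outer hash is what blocks length extension: an attacker who extends the inner hash still cannot produce the outer digest without the key.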
Calculate the Merkle-Damgård padding bytes (glue) an attacker must include in their forged message extension.
Padding (hex): 80 00 00 00 00 00 00 00 00 00 00 00 00 00 01 80

Original data: key (16 bytes) || message (32 bytes) = 48 bytes
Block size: 64 bytes
Padding formula: 0x80 || (zero bytes to align) || 64-bit big-endian bit-length
Zero padding bytes: 7
Total padding: 16 bytes (1 + 7 + 8)
Padded total: 64 bytes (1 block)

Attack: the attacker appends \x80\x00\x00\x00\x00\x00\x00\x00[8-byte length] || extra_data to the message, then forges the MAC by loading the original hash as the internal state and hashing extra_data (plus the final padding for the new total length) from there.
marduc812
2026