Hash Length Extension Attack

Educational reference explaining hash length extension attacks — which algorithms are vulnerable, how the Merkle-Damgård construction enables the attack, and how to defend against it.

This is a purely educational reference. No exploit code is included. Understanding the attack helps implement correct defenses (HMAC, SHA-3).

Algorithm Vulnerability Status

Algorithm	Family	Block Size	Status	Recommendation
MD5	Merkle-Damgård	64B	Vulnerable	Do not use MD5 for any security purpose. Use HMAC-SHA256 or SHA-3.
SHA-1	Merkle-Damgård	64B	Vulnerable	SHA-1 is deprecated. Use SHA-256 with HMAC for authentication.
SHA-256	Merkle-Damgård	64B	Vulnerable	Use HMAC-SHA256 instead of bare SHA-256 for MACs. Never use H(key || message) as a MAC.
SHA-512	Merkle-Damgård	128B	Vulnerable	Use HMAC-SHA512 or SHA-3 with proper key handling.
SHA-3 (Keccak)	Sponge	136B (SHA3-256 rate)	Safe	SHA-3 resists length extension even when used as H(key || message), but HMAC-SHA3 is still recommended for clarity.
BLAKE2	HAIFA	64B (BLAKE2s) / 128B (BLAKE2b)	Safe	BLAKE2b/BLAKE2s are excellent choices for fast, secure hashing and MACs.
BLAKE3	Merkle tree	64B	Safe	BLAKE3 is an excellent modern choice for both hashing and keyed MACs.
HMAC-SHA256	HMAC	64B	Safe	HMAC with any of the SHA-2 family is the standard solution. This is the correct way to use SHA-2 as a MAC.

What is the Merkle-Damgård Construction?

Merkle-Damgård (MD) is a design pattern used by MD5, SHA-1, SHA-256, and SHA-512.
A message is divided into fixed-size blocks. An iterative compression function f processes each block together with the current state H:

  H₀ = IV (initialization vector)
  H₁ = f(H₀, block₁)
  H₂ = f(H₁, block₂)
  ...
  Hₙ = f(Hₙ₋₁, blockₙ)
  output = Hₙ

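The crucial property: the output IS the internal state. Anyone who has the hash output therefore has the state needed to continue hashing. (Truncated variants such as SHA-224 and SHA-384 reveal only part of the final state, which is why they are not extendable in the same way.)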
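
The iteration above can be sketched with a toy compression function. This is a minimal illustration, not a real hash: the 8-byte block and state sizes, the all-zero IV, and the `compress` function (a truncated SHA-256 call) are assumptions made up for this example, and padding is left out entirely. It only shows that resuming from a published digest gives the same value as hashing the longer input from the start.

```python
import hashlib

BLOCK_SIZE = 8          # toy block size in bytes (real hashes use 64 or 128)
STATE_SIZE = 8          # toy state size in bytes
IV = bytes(STATE_SIZE)  # hypothetical all-zero initialization vector


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function f(H, block); illustrative only."""
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]


def md_iterate(blocks, state: bytes = IV) -> bytes:
    """Fold the compression function over the blocks, starting from `state`."""
    for block in blocks:
        if len(block) != BLOCK_SIZE:
            raise ValueError("every block must be exactly BLOCK_SIZE bytes")
        state = compress(state, block)
    return state  # the output is simply the final internal state


first = [b"block--1", b"block--2"]
extra = [b"block--3"]

digest = md_iterate(first)                 # what a verifier would publish
resumed = md_iterate(extra, state=digest)  # continue from the digest alone
full = md_iterate(first + extra)           # hash everything from the IV

print(resumed == full)
```

A sponge construction such as SHA-3 avoids this because part of the internal state is never output.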

The Length Extension Attack

Suppose a server computes MAC = SHA-256(secret_key || message) and sends it with the message.

An attacker who knows the following:
  1. The hash value (MAC)
  2. The length of secret_key (or can guess it)
  3. The message content

can compute a valid MAC for a forged message, message || padding || extra_data, without ever learning the key.

Steps:
  1. Load the known hash as the starting internal state
  2. Continue hashing from that state with the extra data, followed by the final padding for the new total length
  3. The result is a valid hash for (key || message || padding || extra_data)
  4. The server computes SHA-256(key || forged_message) and it matches the forged MAC

Tools like hashpump and hash_extender automate this attack.

The Padding (Glue)

Before processing, MD functions pad the input once so that the total length is a multiple of the block size:

  1. Append a single 0x80 byte
  2. Append zero bytes until length ≡ (blockSize - 8) mod blockSize (blockSize - 16 for SHA-512, whose length field is 16 bytes)
  3. Append the original input length in bits as a 64-bit integer: big-endian for SHA-1 and SHA-256, little-endian for MD5 (SHA-512 uses a 128-bit big-endian field)

This padding becomes part of the forged message. Because the padding covers key || message, the attacker must know the key length to compute it correctly (they can try multiple lengths if it is unknown).
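
The padding rules can be sketched as a small function. This is a minimal sketch: the name `md_padding` and its parameters are illustrative, and the defaults assume a 64-byte block with a 64-bit big-endian length field (SHA-1, SHA-256). The example call uses a 16-byte key and a 32-byte message, 48 bytes in total.

```python
def md_padding(total_len: int, block_size: int = 64,
               length_bytes: int = 8, byteorder: str = "big") -> bytes:
    """Return the Merkle-Damgård padding for an input of `total_len` bytes.

    Defaults fit SHA-1 and SHA-256. For other hashes pass, for example,
    byteorder="little" (MD5) or block_size=128, length_bytes=16 (SHA-512).
    """
    # Zero bytes needed so that 0x80 + zeros + length field ends on a block boundary.
    zeros = (block_size - length_bytes - 1 - total_len) % block_size
    # The length field stores the input length in bits, not bytes.
    bit_length = (total_len * 8).to_bytes(length_bytes, byteorder)
    return b"\x80" + bytes(zeros) + bit_length


key_len, msg_len = 16, 32
glue = md_padding(key_len + msg_len)
print(glue.hex(" "))  # 80 00 00 00 00 00 00 00 00 00 00 00 00 00 01 80
```

When the key length is unknown, the same function can be called once per candidate length. The `bytes.hex(" ")` separator argument requires Python 3.8 or later.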

Real-World Impact

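Flickr API (2009): request signatures were computed as MD5(secret || parameters) rather than with HMAC, so valid signatures could be forged by length extension (reported by Thai Duong and Juliano Rizzo).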
CVE-2008-2380: Multiple web applications that used H(secret || data) as a MAC.

Affected pattern: Any code that uses hash(secret || user_controlled_data) as an authentication or integrity check.

Secure alternatives:
  • HMAC: H((K XOR opad) || H((K XOR ipad) || msg)), where K is the key padded to the block size; standard and well-studied
  • SHA-3 / BLAKE2 / BLAKE3: not vulnerable by design
  • Authenticated Encryption (AES-GCM): preferred when both encryption and integrity are needed
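
As a defensive counterpart, the following sketch shows how a tag might be created and verified with Python's standard `hmac` and `hashlib` modules. The helper names `make_tag` and `verify_tag`, the placeholder key, and the sample message are assumptions for illustration; a real key should come from a secure random source.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"\x00" * 16                 # placeholder key, for illustration only
message = b"user=alice&role=user"  # hypothetical message
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                   # True
print(verify_tag(key, message + b"&role=admin", tag))  # False

# BLAKE2 offers a built-in keyed mode as an alternative to HMAC.
blake_tag = hashlib.blake2b(message, key=key, digest_size=32).digest()
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading bytes of a guessed tag are correct.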

Padding Calculator

Calculate the Merkle-Damgård padding bytes (glue) an attacker must include in their forged message extension.

Example inputs: key length 16 bytes, message length 32 bytes, for an algorithm with a 64-byte block size.

Glue padding (hex):

80 00 00 00 00 00 00 00 00 00 00 00 00 00 01 80

Original data: key (16 bytes) || message (32 bytes) = 48 bytes
Block size: 64 bytes
Padding formula: 0x80 || (zero bytes to align) || 64-bit big-endian bit-length
Zero padding bytes: 7
Length field: 48 × 8 = 384 bits = 0x0000000000000180
Total padding: 16 bytes (1 + 7 + 8)
Padded total: 64 bytes (1 block)

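Attack: the attacker appends \x80\x00\x00\x00\x00\x00\x00\x00[8-byte length] || extra_data to the original message,
then computes the forged MAC by resuming the hash from the original hash value as the internal state and processing extra_data followed by the final padding for the new total length.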