Why XOR Is a Terrible Way to Combine Hashes (And How It Cost Panoptic $4M in Risk)

There is a tempting idea in cryptographic design: take individual hash outputs and XOR them together. It is simple, fast, and incremental — if one input changes, you just XOR out the old hash and XOR in the new one. No need to recompute the whole thing.

The problem is that XOR hashing is broken. Not theoretically shaky. Broken. As in: an attacker can forge any target hash value in polynomial time with probability at least $\frac{1}{2}$.

This was proven in 1997 by Bellare and Micciancio. In August 2025, Panoptic learned this the hard way when a researcher used exactly this technique to spoof position fingerprints and nearly drain their protocol.

This post walks through the math of why XOR hashing fails, then shows how it played out in practice.

Part 1: The XHASH Construction and Its Collapse

Setup

Consider a message $x = x_1 \dots x_n$ split into $b$-bit blocks. Let $h: {0,1}^l \to {0,1}^k$ be an ideal hash function (random oracle). The XHASH function is defined as:

$\text{XHASH}^h(x_1 \dots x_n) = \bigoplus_{i=1}^{n} h(\langle i \rangle | x_i)$

where $\langle i \rangle$ is a binary encoding of the block index and $|$ denotes concatenation. Each block is tagged with its position before hashing, then all results are XORed together.

This construction comes from the randomize-then-combine paradigm of Bellare and Micciancio. The paradigm works well with other combining operations (multiplication in a group, modular addition). But with XOR, it collapses entirely.

The Attack: From Hashing to Linear Algebra

The key insight is that XOR over $\text{GF}(2)$ is linear. This means finding a preimage for XHASH reduces to solving a system of linear equations — something Gaussian elimination does in $O(k^3)$ time.

Here is how the attack works.

Goal: Given any target $z \in {0,1}^k$, find a message $x$ such that $\text{XHASH}^h(x) = z$.

Step 1. Fix two messages $x^0 = x^0_1 \dots x^0_n$ and $x^1 = x^1_1 \dots x^1_n$ where $x^0_i \neq x^1_i$ for all $i$. Set $n = k + 1$.

Step 2. Compute all $2n$ hash values:

$\alpha^j_i = h(\langle i \rangle | x^j_i) \quad \text{for } j \in {0,1}, ; i = 1, \dots, n$

Step 3. For any selector string $y = y[1] \dots y[n] \in {0,1}^n$, define the composite message $x^y = x^{y[1]}_1 \dots x^{y[n]}_n$. We want to find $y$ such that:

$\bigoplus_{i=1}^{n} \alpha^{y[i]}_i = z$

Step 4. Introduce auxiliary variables $\bar{y}[i] = 1 – y[i]$ over $\text{GF}(2)$. The equation becomes a system of $n + k$ linear equations in $2n$ unknowns:

$y[i] \oplus \bar{y}[i] = 1 \quad (i = 1, \dots, n)$

$\bigoplus_{i=1}^{n} \alpha^0_i[j] \cdot y[i] \oplus \alpha^1_i[j] \cdot \bar{y}[i] = z[j] \quad (j = 1, \dots, k)$

With $n = k + 1$, this gives $2k + 1$ equations in $2k + 2$ unknowns — a slightly underdetermined system. Solve via Gaussian elimination. Set one free variable arbitrarily.

That is the entire attack. It requires only $2n$ hash evaluations and one round of Gaussian elimination. No brute force. No birthday paradox. Just linear algebra.

Why It Works: Pairwise Independence

The natural question is: does a solution even exist? Bellare and Micciancio prove the following.

Lemma (Bellare-Micciancio). Fix $z \in {0,1}^k$ and two messages $x^0, x^1$ as above. Then:

$\Pr\left[\exists , y \in {0,1}^n : \text{XHASH}^h(x^y) = z\right] \geq 1 – \frac{2^k}{2^n}$

The probability is over the random choice of $h$.

Proof sketch. Define indicator variables $X_y = \mathbf{1}[\text{XHASH}^h(x^y) = z]$ for each $y \in {0,1}^n$. Let $X = \sum_{y} X_y$.

Each $X_y$ has expectation $E[X_y] = 2^{-k}$ since $\text{XHASH}^h(x^y)$ is uniformly distributed over ${0,1}^k$ for any fixed $y$. So $E[X] = 2^n \cdot 2^{-k}$.

The crucial observation: the collection ${X_y}_{y \in {0,1}^n}$ is pairwise independent. For any $y \neq y’$, there exists some index $i$ where $y[i] \neq y'[i]$, meaning $\text{XHASH}^h(x^y)$ depends on a hash evaluation that $\text{XHASH}^h(x^{y’})$ does not. Since $h$ is a random oracle, these outputs are independent.

Pairwise independence lets us apply Chebyshev’s inequality:

$\Pr[X = 0] \leq \frac{\text{Var}[X]}{(E[X])^2} = \frac{2^n \cdot 2^{-k}}{2^{2(n-k)}} = \frac{2^k}{2^n}$

Setting $n = k + 1$ gives $\Pr[X = 0] \leq \frac{1}{2}$. A solution exists with probability at least $\frac{1}{2}$.

What This Means

XHASH is not just collision-vulnerable. It is not even one-way. An attacker can compute a preimage for any target value. Collision-freeness fails as a corollary: hash any random message $y$ to get $z = \text{XHASH}^h(y)$, then run the attack to find $x$ mapping to the same $z$. With high probability $x \neq y$, and you have a collision.

The total cost is $2n$ hash evaluations plus $O(k^3)$ for Gaussian elimination. For $k = 248$, this is trivial on a laptop.

Part 2: XHASH in the Wild — The Panoptic Incident

What Panoptic Built

Panoptic is a DeFi options protocol on Ethereum. Each user can hold up to 25 option positions, each represented as a packed uint256. For gas efficiency, Panoptic did not store position lists directly. Instead, it stored a fingerprint: the XOR of the keccak256 hashes of each position ID, truncated to 248 bits.

In Solidity, the update looked like this:

uint256 updatedHash = uint248(existingHash) ^ 
    uint248(uint256(keccak256(abi.encode(tokenId))));

When a user needed to interact with the protocol, they supplied their claimed position list. The contract recomputed the fingerprint from that list and compared it against the stored one. If the fingerprints matched, the contract trusted the list.

This is textbook XHASH with $k = 248$.

How the Attack Worked

The attacker’s goal was to produce a fake position list that generates the same fingerprint as the user’s real positions, but with zero collateral requirements. Two flaws made this possible:

No validation of position IDs. The contract did not check whether submitted position IDs corresponded to real, minted positions. Position IDs with a leg count of zero were accepted. This removed the constraint on list length — attackers could use lists of arbitrary size.
XOR is linear over GF(2). The hash outputs can be treated as 248-dimensional vectors over $\text{GF}(2)$. Finding a subset of candidate vectors that XOR to a target fingerprint is exactly the system of linear equations from the Bellare-Micciancio attack.

The practical attack was:

Generate a large pool of candidate uint256 values (they don’t need to be valid positions)
Compute their keccak256 hashes, truncated to 248 bits
Arrange these as column vectors in a matrix over $\text{GF}(2)$
Use Gaussian elimination to find a subset that XORs to the target fingerprint
Use this spoofed list to bypass collateral checks and withdraw borrowed funds

The Panoptic team confirmed they could mine spoof lists on low-budget laptops using a Python script, because Gaussian elimination runs in polynomial time.

The Damage and Recovery

On August 25, 2025, a Cantina researcher reported this vulnerability. Over $4M in user funds were at risk. The Panoptic team coordinated voluntary withdrawals and executed whitehat rescue operations using the same spoofing technique, recovering over 98% of at-risk funds.

The fix for V2 is straightforward: verify that users actually own the positions in their supplied list, rather than relying solely on fingerprint matching. And more fundamentally — do not use XOR as a combining operation for hashes in any security-critical context.

Takeaway

The Bellare-Micciancio result from 1997 is not obscure. It is a clean, provable statement: XOR hashing over $\text{GF}(2)$ reduces to linear algebra, and linear algebra is easy. The existence probability is bounded by a simple Chebyshev argument.

If you need an incremental, order-independent hash, the same paper offers alternatives that work: MuHASH (multiply in a group where discrete log is hard) and AdHASH (add modulo a large integer). Both are incremental, parallelizable, and provably collision-resistant under standard assumptions.

XOR is fast and elegant. It is also the wrong tool for this job.

References:

Bellare, M. & Micciancio, D. (1997). “A New Paradigm for Collision-free Hashing: Incrementality at Reduced Cost.” Eurocrypt 97.
Panoptic. (2025). “Position Spoofing Post Mortem.” panoptic.xyz/blog/position-spoofing-post-mortem.