We Found a Collision for SHA-512

Hash functions are at the core of many, many technologies, including ours. We rely on SHA-512 to create a hash chain to guarantee the integrity of logs.
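As a rough illustration of the idea (not the production implementation), a hash chain over log entries can be sketched in Python as below. The all-zero genesis value, the function names, and the way the previous digest is concatenated with the entry are assumptions made for the example.

```python
import hashlib

# Assumed starting value for the chain: 64 zero bytes (the size of a SHA-512 digest).
GENESIS = b"\x00" * 64


def chain_hashes(entries):
    """Return one SHA-512 hex digest per entry, each covering the previous digest."""
    previous = GENESIS
    digests = []
    for entry in entries:
        # Each link hashes the previous digest together with the current entry,
        # so changing any entry changes every digest after it.
        digest = hashlib.sha512(previous + entry.encode("utf-8")).digest()
        digests.append(digest.hex())
        previous = digest
    return digests


def verify_chain(entries, digests):
    """Recompute the chain and compare it with the stored digests."""
    return chain_hashes(entries) == digests


logs = ["user alice logged in", "user alice deleted file 42", "user alice logged out"]
stored = chain_hashes(logs)

# Tampering with the second entry breaks verification from that point on.
tampered = ["user alice logged in", "user alice deleted file 7", "user alice logged out"]
print(verify_chain(logs, stored))
print(verify_chain(tampered, stored))
```

Because every digest depends on all earlier entries, an attacker who edits one log line would have to recompute the rest of the chain, which is detectable if any later digest has been stored or published elsewhere.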

Cryptographic hash functions have one very important property: collision resistance. If two different inputs generate the same hash, it may mean the hash function is broken. It may also mean that someone got lucky and found an accidental collision, in which case the hash function is probably fine. Collision resistance does not mean that no collisions exist; it only means they are extremely hard to find, whether by accident or on purpose.
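To put "extremely hard to find" into numbers, the standard birthday approximation estimates the chance of an accidental collision among k random inputs to an n-bit hash as 1 - exp(-k(k-1) / 2^(n+1)). The sketch below evaluates it in Python. The function name and the figure of one trillion entries are assumptions chosen for illustration, and the formula treats the hash as an ideal random function.

```python
import math


def collision_probability(num_inputs, hash_bits):
    """Birthday approximation: chance that any two of num_inputs random digests match."""
    pairs = num_inputs * (num_inputs - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# Hypothetical workload: one trillion log entries hashed with SHA-512.
print(collision_probability(10**12, 512))

# For contrast, a 32-bit hash reaches roughly a 39% chance after only 65,536 inputs.
print(collision_probability(2**16, 32))
```

Under these assumptions the probability for SHA-512 is far below 1e-120, which is why an accidental collision in a real log store should be treated with suspicion.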

Today we were investigating an issue with one of our clients – they were trying to verify whether a hash is present in the hash chain, but got an error instead. We tracked the error to our core data access layer and realized that the getSingleResult(..) method deliberately fails if more than one result is fetched.
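The failure mode is easy to reproduce in miniature. The sketch below is a hypothetical Python stand-in, not the actual data access layer: the table name, column names, exception class, and helper functions are all invented for the example, and the stored hashes are short placeholders rather than real digests.

```python
import sqlite3


class NonUniqueResult(Exception):
    """Raised when a lookup expected exactly one row but found several."""


def get_single_result(rows):
    # Hypothetical analogue of a single-result query method:
    # it refuses to pick a row when the result is ambiguous.
    if not rows:
        raise LookupError("no result")
    if len(rows) > 1:
        raise NonUniqueResult(f"expected 1 row, got {len(rows)}")
    return rows[0]


def find_by_hash(conn, digest):
    rows = conn.execute(
        "SELECT id, message FROM log_entries WHERE hash = ?", (digest,)
    ).fetchall()
    return get_single_result(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_entries (id INTEGER PRIMARY KEY, message TEXT, hash TEXT)")
conn.executemany(
    "INSERT INTO log_entries (message, hash) VALUES (?, ?)",
    [("first", "aa"), ("second", "bb"), ("third", "bb")],  # two rows share hash "bb"
)

print(find_by_hash(conn, "aa"))
try:
    find_by_hash(conn, "bb")
except NonUniqueResult as error:
    print("lookup failed:", error)
```

Failing loudly here is a reasonable design choice: silently returning the first of several matching rows would hide exactly the kind of anomaly that a log-integrity system is supposed to surface.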

We used the getSingleResult(..) method to find entries by their hash because, even though we ingest millions of log entries, we expected hashes to always be unique.

It turns out that wasn’t the case. Two records were found to have exactly the same hash. An important note: this doesn’t seem to be reproducible, but it still needs to be looked into, as every collision is a potential problem.

The hash in question was:

3c9fd692bb64ea49cee44ce6a6fd4c8cbe4f9c5c4c60f9efdd7343d48e69710ee23c446d6ce9a4
55330aeec4bb038460eacaf78808d51144402f0a27153a14ed

The two messages that produced it were:

April Fool’s day joke

and

No, we didn’t find a collision

We hope our findings will be used by the cryptographic community to further improve the security of hashing algorithms.