Collisions, a secure hash function killer (MD5, SHA1, SHA2)

The weakness in MD5-based digital signatures recently demonstrated by Sotirov et al. is not unique to MD5: any hash function for which collisions can be found in practice is exposed to the same problem.

NIST has discouraged the use of MD5, and even SHA-1, for many years. A good account of this was posted by Dustin Trammell here.

Because the output of a hash function has a fixed length, usually smaller than the input, collisions necessarily exist. The collision-free property of a hash function is therefore defined as follows:

A function H that maps an arbitrary length message M to a fixed length message digest MD is a collision-free hash function if:

1. It is a one-way hash function.

2. It is hard to find two distinct messages (M', M) that hash to the same result H(M')=H(M).
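To see why collisions must exist and why digest length matters, here is a minimal sketch that truncates SHA-256 to 16 bits and searches for two distinct messages with the same truncated digest. The helper names truncated_digest and find_collision are made up for this illustration, and the truncation is an assumption chosen only to make the search finish quickly. It says nothing about how the real MD5 collisions were found.

```python
import hashlib
from itertools import count


def truncated_digest(message: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (a toy, weak hash)."""
    return hashlib.sha256(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Hash the messages b"0", b"1", b"2", ... until two share a digest.

    With n_bytes = 2 there are only 65536 possible digests, so a
    collision is guaranteed after at most 65537 distinct messages.
    """
    seen = {}  # truncated digest -> first message that produced it
    for i in count():
        message = str(i).encode()
        digest = truncated_digest(message, n_bytes)
        if digest in seen:
            return seen[digest], message, digest
        seen[digest] = message


m1, m2, shared = find_collision()
print(m1, m2, shared.hex())
```

Each extra byte of digest makes this search roughly 16 times longer on average, which is why a full-length digest is out of reach for this kind of brute force.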

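Since collisions must exist, cryptographers talk about “relatively collision free” hash functions: collisions are there, but finding one should be computationally infeasible. A good hash function should be designed with the Avalanche Criterion in mind.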

The Avalanche Criterion (AC) is used in the analysis of S-boxes or substitution boxes. S-boxes take a string as input and produce an encoded string as output.

The avalanche criterion requires that if any one bit of the input to an S-box is changed, about half of the output bits should change their values. The goal for a hash function built this way is that, even though collisions are unavoidable, there should be no practical way to generate two strings with the same hash value other than brute force.
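The avalanche effect can be observed directly on a hash function. The sketch below flips each input bit of a short message in turn and measures the fraction of SHA-256 output bits that change. The text above states the criterion for S-boxes; applying it to a whole hash function, the choice of SHA-256, and the helper names are all assumptions made for this illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    flipped = bytearray(data)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(flipped)


def avalanche_fraction(message: bytes, bit_index: int) -> float:
    """Fraction of digest bits that change when one input bit is flipped."""
    original = hashlib.sha256(message).digest()
    modified = hashlib.sha256(flip_bit(message, bit_index)).digest()
    return bit_difference(original, modified) / (len(original) * 8)


message = b"The quick brown fox"
fractions = [avalanche_fraction(message, i) for i in range(len(message) * 8)]
average = sum(fractions) / len(fractions)
print(f"average fraction of output bits changed: {average:.3f}")
```

An average close to one half is what the criterion asks for. A good average is only a necessary sign of a well-designed function; it does not by itself prove that collisions are hard to find.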

 

The end of the road for MD5 signed SSL Certificates

X.509 certificates signed by Certificate Authorities that use the MD5 function are certain to disappear from the Internet, as flaws in MD5 were successfully exploited to generate a rogue certificate that all browsers would accept as valid.

The proof of concept was recently published by A. Sotirov et al., although the basis for the hack has been known for a few years now. The researchers exploited collisions (two different strings that hash to the same value) in MD5 and the fact that CAs use sequential numbering of certificates upon issuance.
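Sequential numbering matters because an attacker who must prepare colliding data in advance needs to know the certificate fields the CA will fill in. The following sketch contrasts a counter-based serial number, which can be guessed from the last one observed, with a random one. SequentialSerials, predict_next and random_serial are hypothetical names, and the 64-bit size is an assumption; this is not the issuance code of any real CA.

```python
import secrets


class SequentialSerials:
    """Toy issuer that hands out serial numbers from a simple counter."""

    def __init__(self, start: int) -> None:
        self._next = start

    def issue(self) -> int:
        serial = self._next
        self._next += 1
        return serial


def predict_next(observed_serial: int) -> int:
    """Attacker's guess for the next serial of a sequential issuer."""
    return observed_serial + 1


def random_serial(bits: int = 64) -> int:
    """Serial drawn from the operating system's secure random source."""
    return secrets.randbits(bits)


ca = SequentialSerials(start=1000)
observed = ca.issue()            # attacker obtains one legitimate certificate
guess = predict_next(observed)   # and guesses the serial of the next one
print("sequential guess correct:", guess == ca.issue())
print("random serial example:", random_serial())
```

With a random serial the attacker cannot know the value ahead of time, so data prepared in advance around a guessed serial would very likely not match what the CA actually signs.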

Reports that SSL is broken are exaggerated: many CAs already use SHA-1 (a stronger hash function), and those that were using MD5 are switching quickly following publication of the flaw.

See also: