Attacking Hash Functions
by Poisoned Messages
"The Story of Alice and her Boss"
"I don't Know Much About Cryptography
 what is a Hash Function?"
One of the main workhorses of modern cryptography are
collision resistant hash functions.
Given an (almost) arbitrarily long input M, they produce a
fixedsize output H(M). Collision resistance means that it is
infeasible to find two different inputs M and M' with the
same hash H(M)=H(M'). Note that many collisions exist, but it has
to be infeasible to actually find even a single collision!
Hash functions are almost omnipresent in today's cryptography, e.g. in
digital signatures. Instead of signing a long message
M, you simply sign its hash H(M). This is useful and
simplifies many issues ... but if H(M) is identical to
H(M') then the signature is also valid for M'.
The Story of Alice and her Boss
Alice has been an intern, working some weeks in Rome at the office of,
say, Julius Caesar. Depending on the point of view, the
story develops quite differently.
Caesar's View
At the day Alice is supposed to leave, Caesar writes a letter of
recommendation for Alice  on paper.
The same day, she asks Caesar to digitally sign the letter. For his
convenience she presents an electronic copy of the document. Caesar
opens the document  it looks exactly like the original document.
So he signs the document.
Months later, Caesar discovers
that there has been a breach of secrecy
with his French affair files.
Will he ever find out who tricked him and how?
Alice's View
Being an intern, Alice does not have any access to secret documents. Not
enough for her ...
... tricky Alice decides to fool Caesar.
Because Caesar is still relying on the widely used MD5 hash function,
she implements the attack from Wang and Yu [WY05] to
find MD5 collisions. When she receives her letter of recommendation (on
paper), she prepares two postscript files with the same MD5
hash:
a25f7f0b 29ee0b39 68c86073 8533a4b9
Now she asks Caesar to sign the letter ... who has no obvious
reason to decline.
Due to the hash collision,
Caesar's signature for the letter of recommendation is valid for the
order, as well. She presents order and digital signature to the person
in charge of Caesar's files ... and finally gains
access to Caesar's secret documents!
Technical Background: How did you do it?
Based on [WY05] and the analysis described in [Da], we implemented an attack to find
random collisions for the MD5 compression function. It took just a few
hours on a customary PC. Using the length
extension property present in MD5 and all other hash functions following
the (almost omnipresent) MerkleDamgaard design principle, we
constructed a pair of documents to display either the letter or the
order.
Do you have more details? What else is possible?
We are preparing a paper on this.
If you look into postscript documents, e.g. by opening them in a text or
hex editor, you'll get an idea about what is going on.
(We did not try to obfuscate the postscript Code.)
Oh, you can also look at the
Slides of our presentation
at the Eurocrypt 2005 rump session.
Summary  why did you do this?
Recently, the world of cryptographic hash functions has turned into a mess.
A lot of researchers announced algorithms ("attacks") to find collisions for
common hash functions such as MD5 and SHA1 (see [B+, WFLY, WY, WYYa, WYYb]).
For cryptographers, these results are exciting  but many socalled
"practitioners" turned them down as "practically irrelevant".
The point is that while it is possible to
find colliding messages M and M', these messages appear
to be more or less random  or rather, contain a random string of some
fixed length (e.g., 1024 bit in the case of MD5). If you cannot exercise
control over colliding messages, these collisions are theoretically
interesting but harmless, right? In the past few weeks, we have met quite
a few people who thought so.
With this page, we want to demonstrate
how badly wrong this kind of reasoning is! We hope to
provide convincing evidence even for people without much technical or
cryptographical background.
There have already been a few exploits of the collisionfinding attacks
against MD5:
Kaminski [Ka] and Mikle [Mi] presented different executables with the
same MD5 hash. One of Kaminski's executables was quite harmless, the
other one very harmful. Mikle's executables were selfextracting
archives, extracting different stuff.
Lenstra, Wang and de Weger [LWdW,LdW] constructed colliding X.509
certificates.
These exploits seemed to be rather technical to us. We wanted
real documents to collide, which you just open with
some standard browser or viewer. Also, we wanted to present some
use case for such collisions  the story of Alice
and her boss. We hope, you liked our little story. :)
Related Work and References

[B+]
E. Biham, R. Chen, A. Joux, P. Carribault, C. Lemuet, W. Jalby:
Collisions of SHA0 and Reduced SHA1. Eurocrypt 2005.

[Ka]
D. Kaminski:
MD5 to be considered harmful someday

[Da]
M. Daum: Cryptanalysis of Hash Functions of the MD4Family,
PhD thesis, RuhrUniversity of Bochum, submitted.

[LWdW] A. Lenstra, X. Wang, B. de Weger:
Colliding X.509 Certificates.

[LdW] A. Lenstra, B. de Weger:
On the possibility of constructing meaningful hash collisions for
public keys

[Mi] O. Mikle:
Practical Attacks on Digital Signatures using MD5 message digest

[WFLY]
X. Wang, D. Feng, X. Lai, H. Yu:
Cryptanalysis of the hash functions MD4 and RIPEMD. Eurocrypt 2005.

[WY]
X. Wang, H. Yu:
How to break MD5 and other hash functions. Eurocrypt 2005.

[WYYa] X. Wang, H. Yu, Y.L. Yin:
Efficient collision search attacks on SHA0. Crypto 2005.

[WYYb] X. Wang, Y.L. Yin, H. Yu:
Finding collisions in the full SHA1. Crypto 2005.
Acknowledgements
Many thanks to Frederik Armknecht and Dirk Stegemann
for useful comments and discussions on
this work and for helping with the computations to find the necessary
collision. Many thanks also to Benne de Weger, who provided some
interesting comments and links to related work, and to Ernst
Giessmann, who revealed Caesar's real address in (Via Appia instead
of Main Road) and gave us a hint how to improve the robustness of our
postscript code. Finally, many thanks to Andreas Lucks for
his advice on this presentation and the
drawing.
Authors
Magnus Daum
(CITS Research Group, RUB)
Stefan Lucks
(University of Mannheim)
