
Attacks on Cryptographic Hash Functions with Special Focus on MD5 and SHA-0

HENRIK YGGE

Master of Science Thesis Stockholm, Sweden 2008


Master’s Thesis in Computer Science (30 ECTS credits)
at the School of Computer Science and Engineering
Royal Institute of Technology, year 2008
Supervisor at CSC was Johan Håstad
Examiner was Johan Håstad

TRITA-CSC-E 2008:111
ISRN-KTH/CSC/E--08/111--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.csc.kth.se


Abstract

Attacks on Cryptographic Hash Functions

With Special Focus on MD5 and SHA-0

A cryptographic hash function is designed to be secure so that given an output from the function, it is computationally infeasible to find a message that results in that output, or to find two different messages that produce the same output.

MD5 has been one of the most used cryptographic hash functions during the last decades. It has been used in several security applications and protocols and for the verification of file transfers over the internet and is still in use today. In 2004 the first collision, i.e. two messages that produced the same output, was found by Wang et al. [14]. Several theoretical attacks had been published but this was the first practical attack that produced a result.

SHA-0 was designed to be the successor to MD5 but was soon replaced by the very similar SHA-1 after NSA found a flaw in the design of SHA-0. SHA-1 is still used today in several security protocols and applications, even though NIST recommends that it should be replaced by SHA-2 by 2010. Several attacks have been published against SHA-0 which resulted in a successful collision search, but none of the attacks had been extendible to SHA-1 until Wang et al. [1] published an attack on SHA-0 with much lower complexity than any previous attack.

In this thesis we have studied the attacks by Wang et al. on MD5 and SHA-0 to try to understand what in the designs of the cryptographic hash functions made the attacks against them possible. We describe in detail the steps required to produce a collision on both cryptographic hash functions, something that we feel is lacking in the papers describing the attacks. During this process, we have discovered shortcomings in the attack by Wang et al. on SHA-0 that caused the later improvements by Naito et al. [5] to be less effective than described.

We have also improved upon the attack by Naito et al. by replacing one of their modifications that we found to be erroneous with a new modification that our experiments have shown to work better. Our new version of the attack now finds a collision in about 50 hours.


Referat

Attacks on Cryptographic Hash Functions

With Special Focus on MD5 and SHA-0

A cryptographic hash function is designed to be secure so that, given an output from the function, it should be very hard to find any input that produces the same output. It should also be very hard to find a collision, i.e. two different inputs that produce the same output.

MD5 has been one of the most widely used cryptographic hash functions in recent years. It has been used in numerous applications and protocols that needed to be secure. It has also been used extensively to check that files sent over the network have arrived without errors. In some of these settings it is still in use. The first collision on MD5 was found in 2004 by a Chinese research team led by Xiaoyun Wang [14]. Several theoretical attacks had been published before then, but this was the first that was practically feasible and resulted in a collision.

SHA-0 was designed as a successor to MD5 but was itself soon replaced by SHA-1, a very similar cryptographic hash function, after NSA found weaknesses in the construction of SHA-0. SHA-1 is widely used today and is one of the cryptographic hash functions that NIST recommends for use today, even though NIST also recommends that it be replaced by SHA-2 by 2010 at the latest. Several attacks have been published on SHA-0, a few of which have resulted in collisions, but none of the attacks could be applied to SHA-1 until Wang et al. [1] published an attack on SHA-0 with much lower complexity than any previous attack. This attack could be applied to SHA-1, and in this way they were able to publish the first attack on SHA-1.

In this thesis we have studied Wang's attacks on MD5 and SHA-0 in detail to try to understand what in their design made the attacks possible. We describe in detail, step by step, how to produce collisions on both cryptographic hash functions, something we feel is missing in the papers describing the attacks. During this process we have discovered errors in the attacks by Wang et al. on SHA-0 which have resulted in the improvements of the attack by Naito et al. [5] not working to the extent described in their papers.

We have also improved the attack by Naito et al. on SHA-0 by replacing an erroneous modification with a new modification that works better according to our experiments. Our new version of the attack finds a collision on SHA-0 after about 50 hours of running time.


Foreword

This report is my master’s thesis, conducted at CSC, the School of Computer Science and Communication, at KTH, the Royal Institute of Technology, in Stockholm, Sweden.

I would like to thank Martin Ekerå, with whom I have had several interesting discussions over the course of the work for this thesis. He has written a similar thesis with a different focus and I am certain that neither of us would have gotten as far as we have without the help of each other.

I would especially like to thank my supervisor and examiner, professor Johan Håstad at CSC, for the time he has invested and the interest he has shown in this project.

Henrik Ygge, Stockholm, June 2008.


Contents

1 Preliminaries
    1.1 Introduction
        1.1.1 History
        1.1.2 Future
        1.1.3 This Thesis
        1.1.4 Outline

2 Definitions
    2.1 Cryptographic Hash Functions
        2.1.1 Birthday Paradox
        2.1.2 Complexity of Attacks
        2.1.3 Notation
        2.1.4 BSDR
    2.2 Terminology
    2.3 Commonly Used Construction Schemes
        2.3.1 Davies-Meyer
        2.3.2 Merkle-Damgård
        2.3.3 Message Padding

3 MD5
    3.1 Introduction
    3.2 Structure
    3.3 Cryptanalysis
    3.4 Details of Wang’s Attack
        3.4.1 Searching for a Collision
        3.4.2 Calculating a Path
        3.4.3 Optimizations
    3.5 Our Contributions
        3.5.1 One-block Collision
        3.5.2 New Two-block Collision
    3.6 Conclusions

4 SHA-0
    4.1 Introduction
    4.2 Structure
    4.3 Cryptanalysis
    4.4 Details of Wang’s Attack
        4.4.1 Calculating a Path
        4.4.2 Searching for a Collision
        4.4.3 Optimizations
    4.5 Our Contributions
        4.5.1 Wang’s Sufficient Conditions are not Sufficient
        4.5.2 Generating First Fourteen Message Words of Block Two
        4.5.3 A Condition for the XOR-rounds is Insufficient
        4.5.4 One Submarine Modification is Incorrect
        4.5.5 Collisions on SHA-0
    4.6 Conclusions

5 Conclusions and Results
    5.1 Our Contributions
    5.2 The Status of Cryptographic Hash Functions

Bibliography

A MD5
    A.1 MD5 Step by Step
        A.1.1 Round 1
        A.1.2 Round 2
        A.1.3 Round 3
        A.1.4 Round 4
    A.2 Wang’s Path Through MD5
        A.2.1 Block 1
        A.2.2 Block 2

B SHA-0
    B.1 Wang’s Path Through SHA-0
    B.2 Final Conditions for SHA-0
    B.3 A Step by Step Look at a Collision
        B.3.1 Round 1
        B.3.2 Round 2
        B.3.3 Round 3
        B.3.4 Round 4


Chapter 1

Preliminaries

1.1 Introduction

A hash function, $H$, is an algorithm that calculates a value based on a possibly very large input and thus maps that input to a value from a much smaller domain. The output from the hash function is called the hash digest. An example of a hash function is

\[ H_{16}(x) = x \bmod 16, \quad x \geq 0. \]

This function takes as input an integer and outputs the integer modulo sixteen. Clearly, $H_{16}$ can be applied to $x$ of arbitrary size, but the result of the hash function can only take sixteen different values. Thus, $H_{16}$ is a hash function that maps elements from an infinite domain, the domain of non-negative integers, into a domain containing sixteen elements.

Hash functions are used frequently in computer science, both to generate short identifiers, also called fingerprints, from very large objects and in hash tables. A hash table associates keys to values for easy lookup. One application is to save a list of contacts, i.e. pairs of names and phone numbers. The phone numbers can be saved in a hash table with the names as the keys. To find a phone number quickly, the name of the requested person is hashed to find the index in the table of the corresponding phone number.

Several requirements are made on the hash functions for the hash tables to be efficient:

• The hash function needs to have a sufficiently large output domain. If the number of possible outputs is too small, several keys may be mapped to the same index, which reduces performance.

• It is good if the hash function spreads the hash digests so that a small change in an input results in several differences in the hash digests. This is known as an avalanche effect.

• A good hash function is fast to compute, so that large amounts of data may be processed in a short amount of time. Most hash functions therefore use basic operations that computers can perform very efficiently, like addition, subtraction and rotation. More complicated operations like multiplication and division are avoided.
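As a minimal sketch of the contact-list example above, the following Python code uses $H_{16}$ to index a toy hash table with sixteen buckets. The names `h16`, `name_to_int` and `PhoneBook` are hypothetical, and converting a name to an integer via its UTF-8 bytes is an assumption made only for illustration.

```python
def h16(x: int) -> int:
    """The example hash function H16(x) = x mod 16 for non-negative x."""
    if x < 0:
        raise ValueError("H16 is only defined for x >= 0")
    return x % 16


def name_to_int(name: str) -> int:
    """Hypothetical helper: read the UTF-8 bytes of a name as one big integer."""
    return int.from_bytes(name.encode("utf-8"), "big")


class PhoneBook:
    """Toy hash table with sixteen buckets; keys with equal digests share a bucket."""

    def __init__(self) -> None:
        self.buckets = [[] for _ in range(16)]

    def put(self, name: str, number: str) -> None:
        bucket = self.buckets[h16(name_to_int(name))]
        for i, (key, _) in enumerate(bucket):
            if key == name:
                bucket[i] = (name, number)  # overwrite an existing entry
                return
        bucket.append((name, number))

    def get(self, name: str):
        # Only the single bucket selected by the digest has to be searched.
        for key, number in self.buckets[h16(name_to_int(name))]:
            if key == name:
                return number
        return None
```

With this encoding the digest depends only on the low four bits of the last character, so names such as "Alice" and "Dave" end up in the same bucket. This illustrates why a small output domain and poor spreading hurt performance.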

Besides the ability to spread the hash digests evenly, there are also applications where it is necessary that the hash function used is one-way, i.e. that given an output from a hash function, it is computationally infeasible to find an input that hashes to that output. Hash functions that are designed to have this property are called cryptographic hash functions. Formally, a cryptographic hash function is a hash function which is designed to be both one-way and collision-resistant, i.e. it should be computationally infeasible both to find a corresponding input for a given output and to find two different inputs which hash to the same output.

Because of this property it is possible to use hash digests of data from cryptographic hash functions instead of the data itself in security applications. This is advantageous for two reasons. One is that the hash digest is short compared to the original data. In the case of digital signatures, this means an increase in the speed of encryption if the hash digest is encrypted instead of the data itself. Instead of signing the entire document, only the hash digest of the document is signed.

The second reason for using cryptographic hash functions in security applications is that the data itself is not sent, and from the hash digest there is no way to know what the original message was without having access to the message. The hash digest uniquely identifies the message and does not reveal any information about it. The word “uniquely” refers to the fact that we assume that it is not practically possible to find a message that hashes to a given hash digest. The hash digest defines the message uniquely in a computational sense but not in an information-theoretic sense.

Cryptographic hash functions are also used in commitments, which are used to prove the knowledge of certain information without revealing the actual information until later. This uses the fact that hash digests from cryptographic hash functions identify messages without revealing them. To prove possession of some information, person A can hash that information along with some random data and give the hash value to person B. Person B can then verify the information at a later stage, when person A also gives him the random data appended to the information. Person B can then be sure that person A knew about the information at the time he delivered the hash result.
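The commitment procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not prescribe a hash function or a length for the random data, so SHA-256 from Python's `hashlib` and a 32-byte random value are choices made here, and the function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(information: bytes) -> Tuple[bytes, bytes]:
    """Person A: return (commitment, random_data).

    Only the commitment is handed to person B at this point.
    SHA-256 and the 32-byte length are assumptions of this sketch.
    """
    random_data = secrets.token_bytes(32)
    commitment = hashlib.sha256(information + random_data).digest()
    return commitment, random_data


def verify(commitment: bytes, information: bytes, random_data: bytes) -> bool:
    """Person B: check the revealed information against the earlier commitment."""
    recomputed = hashlib.sha256(information + random_data).digest()
    return hmac.compare_digest(commitment, recomputed)
```

Person A later reveals both `information` and `random_data`; person B accepts only if `verify` returns `True`. The random data prevents person B from simply hashing a short list of likely statements to guess the committed one.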

There are several other applications for cryptographic hash functions, such as:

• Password protection. Passwords for users of a computer can be saved as hash digests instead of saving the passwords themselves to increase security. When a user attempts to log in, the computer hashes the password the user entered and compares that to the saved digest. If they match, the user is allowed to log in. To increase security, random data called salt is added to the password before it is hashed. The random data is saved along with the hash digest in clear text.

• Integrity checks for file transfers. When a file is sent over the Internet, a hash digest of the file can be sent along or published for reference on the Internet. When the file is received, the hash digest of that file is computed and then compared with the original hash digest to check if the files are identical.
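Both applications in the list above reduce to computing a digest and comparing it with a stored or published one. The sketch below assumes SHA-256 and a 16-byte salt, neither of which is specified in the text, and all function names are hypothetical. It hashes the password only once to mirror the description; deployed systems typically use a deliberately slow password hashing function instead.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def store_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); both are saved, the password itself is not."""
    salt = secrets.token_bytes(16)  # random salt, stored in clear text
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def check_password(attempt: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the login attempt with the saved salt and compare the digests."""
    candidate = hashlib.sha256(salt + attempt.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, stored_digest)


def file_matches(data: bytes, published_hex_digest: str) -> bool:
    """Integrity check: compare the digest of received data with a published one."""
    return hashlib.sha256(data).hexdigest() == published_hex_digest.lower()
```

Because the salt differs per user, two users with the same password get different stored digests.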

Cryptographic hash functions are also designed to be collision resistant. A collision is when two different messages hash to the same digest. If a collision is found for a cryptographic hash function, the digests cannot be considered unique and the hash function is considered broken.

Over the years, several cryptographic hash functions have been used, broken and replaced. The next section is a short history of some of the most important cryptographic hash functions of the last few decades.

1.1.1 History

The Message Digest Algorithm 4, MD4, was designed by Professor Ronald Rivest in 1990 as a successor to MD2, which he invented in 1989. This new cryptographic hash function had a totally new design and was much faster than its predecessor. MD2 consists of eighteen steps, in each of which it uses a lookup table in the calculations. This seems advantageous for the security of the hash function but makes computation of MD2 slow. It was also optimized for 8-bit processors. Even though MD2 is today almost 20 years old, no collision has yet been found on it. Collisions have instead been found on the cryptographic hash functions MD5, SHA-0 and RIPE-MD, which were designed to replace MD2. There are however several theoretical attacks [21], [22] against MD2 and it is considered too insecure to be used today.

The successor MD4 consists of 48 steps, almost three times as many as MD2, divided into three rounds, but each step uses only the very basic operations addition and rotation. These operations can be performed quickly by a computer and thus MD4 could process more data than MD2. The construction scheme of MD4 was so popular that its design would influence most major cryptographic hash functions following it. Among these are MD5, which Ronald Rivest designed in 1992, the SHA family of hash functions from NSA, the National Security Agency in the United States, and the RIPE-MD family of hash functions from the European Union project RIPE.

The popularity of MD4 caused it to be intensively studied, and in 1991 Bert den Boer and Antoon Bosselaers [18] found the first weakness of the hash function. Since then several weaknesses have been found [16], [17] and today a collision on MD4 can be found by hand calculation [15].

Ronald Rivest designed MD5 in 1992 as a successor to MD4 and the designs of the two cryptographic hash functions are very similar. Compared to MD4, MD5 uses an extra round of sixteen steps, for a total of 64 steps, plus some changes in each step. MD5 was designed to be a more secure version of MD4 and the similarities between them made it easy for companies using MD4 to switch to MD5. This made MD5 one of the most popular cryptographic hash functions and it, like MD4, came under intensive scrutiny.

One year after the release of MD5, NIST, the National Institute of Standards and Technology, a United States federal agency, released its own hash function called SHA, the Secure Hash Algorithm. It was designed by NSA because they were concerned about the security of MD5, as they felt that the hash digest was too small. Two years later this was followed by a revised version, and this new version was named SHA-1 while the previous version was referred to as SHA-0. NSA claimed to have found weaknesses in the design of SHA-0, but they never officially detailed what the weaknesses were. The difference between SHA-0 and SHA-1 is only one extra rotation in the message expansion; this is described in more detail in chapter 4. SHA-0 and SHA-1 also use the iterative structure of MD4 as a base for their designs.

1.1.2 Future

NIST has detailed the Secure Hash Standard [7], in which five different cryptographic hash algorithms are specified. They are SHA-1 and the four members of the SHA-2 family of cryptographic hash functions: SHA-224, SHA-256, SHA-384 and SHA-512. These hash functions are improved versions of MD5 and SHA-1.

Over the last four years a series of attacks have been published against MD5 and SHA-0, and these two hash functions are now considered to be too insecure to use in security applications. Because the structure of especially SHA-1 is very similar to MD5 and SHA-0, concerns have been raised that attacks similar to the ones launched against those two hash functions can be used to attack the newer hash functions as well. In response to this, NIST held two public workshops which resulted in an announcement [19] of a public competition for one or more new hash functions. The development process will be similar to that of AES, and a preliminary time line for the competition as well as policies for the use of the new hash functions have been published. The deadline for submissions is in the fall of 2008, after which three candidate conferences are planned, with the set of submissions narrowed down at each event. The specification of the new hash function, called SHA-3, is planned to be determined by 2012.

1.1.3 This Thesis

In light of the attacks against MD5, SHA-0 and SHA-1 and the request from NIST for submissions for a new hash algorithm, a study of the weaknesses of MD5 and SHA-0 is needed, and this is the goal of this thesis. Understanding the attacks by Wang et al. is the focus of this thesis, as the details of their attacks are vague. Their papers describe the attacks only briefly, and a lot of information about the attacks is omitted altogether or exists only in previous papers available only in Chinese. A major part of the contribution of this thesis is a detailed study of these attacks.


This thesis is written in close cooperation with a similar thesis written by Martin Ekerå, also at KTH. The focus of this thesis is towards SHA-0 while his focus is towards MD5. This thesis will describe most parts of the attack on MD5 in detail, but for information about all parts of the attacks on MD5 we refer the reader to his thesis.

Throughout this thesis the attacks by Xiaoyun Wang, Hongbo Yu, and Yiqun Lisa Yin on SHA-0 and Xiaoyun Wang and Hongbo Yu on MD5 will both be referenced as "the attack by Wang et al." or simply as "Wang’s attack". It will hopefully be clear from the context which of the two attacks we are referring to.

1.1.4 Outline

The next chapter will outline the basics needed to understand the cryptographic hash functions and the attacks against them. The third chapter describes MD5 and the attacks against it in detail, as well as our own experiments with that cryptographic hash function. The fourth chapter deals with SHA-0 in much the same way as the third chapter deals with MD5, and in the last chapter we summarize the information in the thesis and discuss the results of our studies.


Chapter 2

Definitions

In this chapter we introduce the definitions needed to understand the attack and the notations and terminology used to describe the cryptographic hash functions and the attacks against them.

2.1 Cryptographic Hash Functions

A hash function that is designed to be a one-way operation, so that there is no practical way to calculate an input to the hash function given an output, is called a cryptographic hash function. Attempts to attack a hash function can be divided into three different categories:

• First pre-image attacks. Given a hash digest $h$, find a message $m$ such that $H(m) = h$.

• Second pre-image attacks. Given a message $m$, find a message $m' \neq m$ such that $H(m) = H(m')$.

• Collision attacks. Attempting to find two messages $m$ and $m' \neq m$ so that $H(m) = H(m')$.

First and second pre-image attacks are usually very similar, as it is often difficult to make use of $m$. They are also much more difficult to succeed at compared to collision attacks, since the complexity of a brute force first or second pre-image attack is $2^n$ while the complexity of a brute force collision attack is $2^{n/2}$, where $n$ is the number of bits in the hash digest. A collision attack has the lower complexity because of the birthday paradox, which is described briefly in the next section. Brute force attacks are also described later in section 2.1.2.

Cryptographic hash functions are widely used in security applications as detailed in the introduction. Successful attacks against them could be used in the following ways:


• Digital signatures. A successful collision attack against the hash function means that an attacker can generate two documents to sign, both with the same hash digest but with different meanings. Since signing a document means using $H(m)$ instead of $m$, and the collision attack can generate $m' \neq m$ so that $H(m) = H(m')$, the signature on $H(m)$ is also a valid signature on $H(m')$. Thus, both $m$ and $m'$ are signed. A successful first or second pre-image attack means that an attacker could, given a signed document, generate another document with another meaning but with the same hash digest, which would then give the correct signature when the document is verified.

• Password protection. A successful first pre-image attack would mean that the attacker could easily find a password that would allow him/her to log in.

• Commitments. A successful attack of any sort would in this case mean that an attacker could generate several different statements that have the same hash digest, and when the time has come to reveal the right one, the attacker can simply choose the best one. Marc Stevens, Arjen Lenstra and Benne de Weger have published a commitment of their correct prediction of the outcome of the 2008 US presidential election¹. They have used a Sony Playstation 3 to generate several documents, one for each of the candidates running for the presidential election, along with a few others, Paris Hilton for example. The documents all hash to the same value by MD5.

• Integrity checks for file transfers. In this case a successful collision attack could mean that an attacker could generate two files, one harmless and one harmful, and a person who downloads either of the files cannot know which of them it is by verifying the hash digest. A successful second pre-image attack would make it possible for an attacker to distribute their own harmful file and claim it to be another, popular, harmless file.

The rest of this thesis deals only with cryptographic hash functions and the term “hash function” is therefore used for “cryptographic hash function” from now on.

¹ http://www.win.tue.nl/hashclash/Nostradamus/

2.1.1 Birthday Paradox

The birthday paradox is a well known problem in probability theory that refers to the probability that in a room of $k$ randomly chosen people, at least two of them share the same birthday. It is called a paradox because this probability is much higher than most people expect when first confronted with the problem. In fact, with only 23 people, the probability is over 50%.

This is easy to show by calculating $p_k$, the probability that no two people in a room of $k$ people share the same birthday, which gives us the probability that at least two people do share the same birthday, $1 - p_k$. In fact

\[ p_k = \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{365 - (k - 1)}{365}. \]

Note that this expression is only valid for $k \geq 1$. If $k \geq 366$ then $p_k = 0$, because then there are more people than unique birthdays, which means that the probability that no two people share a birthday is 0. Let us motivate the above expression. The first person has no one to compare with, so the probability that he does not share a birthday is $\frac{365}{365} = 1$. The second person cannot have a birthday on the same day as the first person, which leaves 364 days, and thus the probability is $\frac{364}{365}$ that the first and the second person do not share a birthday. This continues in the same fashion and thus we obtain the equality:

\[ p_k = \frac{365 \cdot 364 \cdot 363 \cdots (365 - (k - 1))}{365^k} = \frac{365!}{365^k \cdot (365 - k)!} \]

The smallest $k$ where $p_k$ is less than 50% is $k = 23$, when $p_k \approx 0.493$, which implies that we have a 50.7% probability of two people sharing a birthday.
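The value $k = 23$ can be checked numerically. The following sketch evaluates the product for $p_k$ factor by factor; the function names are hypothetical and floating-point arithmetic is used, so the result is only accurate to rounding error.

```python
def p_no_shared_birthday(k: int, days: int = 365) -> float:
    """Probability p_k that k people all have different birthdays."""
    p = 1.0
    for i in range(k):
        # Person number i + 1 must avoid the i birthdays already taken.
        p *= (days - i) / days
    return p


def smallest_k_below(threshold: float = 0.5, days: int = 365) -> int:
    """Smallest k for which p_k drops below the given threshold."""
    k = 1
    while p_no_shared_birthday(k, days) >= threshold:
        k += 1
    return k
```

Calling `smallest_k_below()` reproduces the threshold of 23 people stated above, and `1 - p_no_shared_birthday(23)` gives the corresponding probability of a shared birthday.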

The birthday paradox can be generalized to sets of any size, and it is not hard to prove that for a set of size $N$, the number of randomly chosen elements from that set that needs to be included to get a probability of at least 50% that two of them are the same is about $\sqrt{2N \ln 2} \approx \sqrt{N}$.

This can be shown by generalizing the expression above to pk,N where N is thenumber of available values. We want to find k given N such that pk,N ≈ 50%,

pk,N =N

N·(

1− 1N

)·(

1− 2N

)· ... ·

(1− (k − 1)

N

)≈ 1

2⇒

⇒ 1− pk,N = 1− 1 ·(

1− 1N

)·(

1− 2N

)· ... ·

(1− (k − 1)

N

)≈ 1

2.

By the Taylor expansion:

ex = 1 +x1

1!+

x2

2!... ≈ 1 + x ⇒

1− pk,N ≈ e−1N · e−

2N · ... · e−

k−1N =

= e−Pk−1

i=1iN = e−

k(k−1)2N

8

Page 17: Attacks on Cryptographic Hash Functions with Special Focus ... · Referat Attacker på kryptografiska hashfunktioner Med särskilt fokus på MD5 och SHA-0 ... In the case of digital

2.1. CRYPTOGRAPHIC HASH FUNCTIONS

$$-\frac{k(k-1)}{2N} \approx \ln\left(\frac{1}{2}\right) \Rightarrow k \approx \sqrt{2N \ln 2} \approx \sqrt{N}$$

This means that if a cryptographic hash function has a hash digest of 128 bits, i.e., there are $2^{128}$ possible outputs, and if we can generate $2^{64}$ different hash digests, the probability that there is at least one collision is around 50%.
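The numbers above can be checked with a few lines of Python. This is only an illustrative sketch; the function names `prob_no_shared` and `smallest_k_for_half` are our own and do not come from the thesis.

```python
import math


def prob_no_shared(k: int, n_values: int = 365) -> float:
    """p_k: probability that k uniformly random values out of n_values are all distinct."""
    p = 1.0
    for i in range(k):
        p *= (n_values - i) / n_values
    return p


def smallest_k_for_half(n_values: int) -> int:
    """Smallest k for which the probability of no shared value drops below 50%."""
    k = 1
    while prob_no_shared(k, n_values) >= 0.5:
        k += 1
    return k


if __name__ == "__main__":
    print(smallest_k_for_half(365), prob_no_shared(23))
    # Estimate from the approximation k ~ sqrt(2 N ln 2)
    print(math.sqrt(2 * 365 * math.log(2)))
```

For $N = 365$ the exact search and the approximation $\sqrt{2N \ln 2}$ agree to within one person.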

2.1.2 Complexity of Attacks

As seen in the previous section, the complexity of a successful collision attack is $O(\sqrt{N})$, where $N$ is the total number of possible hash digests.

For a cryptographic hash function with a digest size of $n$ bits, i.e. there are $N = 2^n$ different digests, we have a probability of $2^{-n}$ of getting the correct digest after each hashing in a first or second pre-image attack. For a decent probability of finding a message with the correct digest, about $2^n$ messages have to be tried, so the complexity of a first or second pre-image attack is linear in $N$, i.e. $O(N)$.

When studying the security of a hash function, the collision attack is therefore the natural choice to study first because it is the easiest of the three. A cryptographic hash function is considered broken if there exists an attack that has a lower complexity than $O(\sqrt{N})$ for a collision attack or $O(N)$ for a first or second pre-image attack. A successful implementation is of course of additional value but usually not necessary. For example, SHA-1 is considered broken even though no collision has yet been found, because a collision attack with a complexity of $2^{63}$ iterations through the hash function has been published, which is better than the brute force complexity of $2^{80}$ iterations.
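The $O(\sqrt{N})$ behaviour of a brute force collision search can be observed on a deliberately weakened hash. The sketch below is our own illustration: it truncates MD5 to a small number of bits (an assumption made only to keep the search fast) and stores every digest seen so far until two messages collide.

```python
import hashlib


def truncated_md5(data: bytes, bits: int) -> int:
    """Top `bits` bits of the MD5 digest, used here as a toy hash with N = 2**bits outputs."""
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def find_collision(bits: int):
    """Hash the messages 0, 1, 2, ... until two of them share a truncated digest."""
    seen = {}
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1


if __name__ == "__main__":
    m1, m2, tries = find_collision(24)
    print(m1.hex(), m2.hex(), tries)
```

With a 24-bit digest the search typically ends after a few thousand messages, on the order of $\sqrt{2^{24}} = 2^{12}$, rather than the $2^{24}$ a pre-image search would need.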

To describe these attacks we need to set up some notation.

2.1.3 Notation

We mostly use standard notation. We let $l$ be the number of bits used in an integer for the hash function and hereafter refer to $l$-bit integers as words. Many operations are modulo $2^l$, because that is the most efficient way for computers to perform the operations.

The hash functions detailed in this thesis, MD5 and SHA-0, both use 32-bit unsigned integers and the word length $l$ will therefore usually be 32 in this paper. The 32 bits of a word are numbered starting from 0, which is the least significant bit, to 31, which is the most significant. The message words used will be labeled $m$ with an index describing which message word it refers to, with $m_0$ being the first message word. Message blocks, $M$, will also be used with an index when appropriate. A message block is 512 bits in MD5 and SHA-0, so each message block consists of sixteen message words.

Integers are given either in hexadecimal or binary form; examples are $\text{789abc}_{16}$ and $10101_2$.


Table 2.1. Notations

| Symbol | Meaning |
|---|---|
| x ⊞ y | Addition modulo $2^l$. |
| x ⊟ y | Subtraction modulo $2^l$. |
| x ⊕ y | Bitwise exclusive OR. |
| x ∧ y | Bitwise AND. |
| x ∨ y | Bitwise OR. |
| ¬x | Bitwise negation. |
| x << s | Shift of x by s steps to the left. |
| x >> s | Shift of x by s steps to the right. |
| x ≪ s | Rotation of x by s steps to the left. |
| x ≫ s | Rotation of x by s steps to the right. |

To produce a collision from a hash function, two different messages need to be constructed that produce the same hash digest. Although it is often sufficient to work only with one of the messages, sometimes conditions on the other message or hash digest are needed as well. To distinguish between the original message that is dealt with most of the time and the second message, message words of the second message are labeled $m'_i$.

2.1.4 BSDR

A binary signed digit representation, BSDR, is a convenient, non-unique representation of an integer, $t$. It is a sequence of $l$ digits $k_i \in \{-1, 0, 1\}$, $0 \leq i < l$, where

$$\sum_{i=0}^{l-1} 2^i \cdot k_i = t \bmod 2^{32}.$$

For example, the BSDR $(1 \cdot 2^{13}) + (-1 \cdot 2^{5})$ is evaluated to $\text{1fe0}_{16}$.

Note that there are $3^{32}$ possible BSDRs for a 32-bit word, but only $2^{32}$ possible values, and hence there exist several BSDRs representing the same value. In fact, all words, except 0, can be represented as several different BSDRs.

BSDRs are important in studying MD5 and SHA-0 in that they give an excellent representation for describing differences between two words. Given two words, $x = \text{abc}_{16} = 101010111100_2$ and $y = \text{cba}_{16} = 110010111010_2$, we want to describe the difference $z = y - x = \text{1fe}_{16}$ as a BSDR. By inspecting $x$ and $y$ bit-by-bit as in table 2.2 we can create the following BSDR:

$$(1 \cdot 2^{10}) + (-1 \cdot 2^{9}) + (-1 \cdot 2^{2}) + (1 \cdot 2^{1}).$$

In this way we get a nice representation of the bits that differ between x and y.


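Table 2.2. BSDR

| | $2^{11}$ | $2^{10}$ | $2^{9}$ | $2^{8}$ | $2^{7}$ | $2^{6}$ | $2^{5}$ | $2^{4}$ | $2^{3}$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $y$ | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 |
| $x$ | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
| $y - x$ | 0 | 1 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | 1 | 0 |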
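The bit-by-bit construction in table 2.2 is easy to reproduce. The following is a small sketch under our own naming (`bsdr_from_pair` and `bsdr_value` are hypothetical helpers, not taken from the thesis): each digit is simply the difference of the corresponding bits of $y$ and $x$.

```python
def bsdr_from_pair(x: int, y: int, l: int = 32) -> list:
    """Digits [k_0, ..., k_{l-1}] with k_i = (bit i of y) - (bit i of x)."""
    return [((y >> i) & 1) - ((x >> i) & 1) for i in range(l)]


def bsdr_value(digits, l: int = 32) -> int:
    """Evaluate a BSDR: sum of 2^i * k_i, reduced modulo 2^l."""
    return sum(k * (1 << i) for i, k in enumerate(digits)) % (1 << l)


if __name__ == "__main__":
    digits = bsdr_from_pair(0xABC, 0xCBA)
    print({i: k for i, k in enumerate(digits) if k != 0})
    print(hex(bsdr_value(digits)))
```

The nonzero digits sit exactly at the bit positions where $x$ and $y$ differ, and the sign tells which of the two words holds the one.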

2.2 Terminology

As defined earlier, most cryptographic hash functions divide the message to be hashed into blocks of a fixed size that are then hashed separately. A message block, $M$, is 512 bits in the cryptographic hash functions detailed in this thesis.

Both MD5 and SHA-0 use the iterative structure of MD4 where a simple step function is iterated several times with the result of previous steps as input to the next step. The cryptographic hash functions use internal registers to hold the result of the calculations in each step. In MD5 the calculation of one step requires the result of the last four steps, and in SHA-0 the last five steps are required. This means that only four and five internal registers are needed in MD5 and SHA-0 respectively, in some papers labeled a, b, c, d and e. We have however chosen to use $Q$ as a vector of the results of each step so that $Q_i$ is the result of the calculation for step $i$. This makes it easier to describe the attacks against the hash functions, as the contents of one index do not change once they have been calculated, as they do if only a-e are used.

The steps of MD5 and SHA-0 are divided into rounds. The rounds control which function is used in the step calculations. This is described in more detail in the two chapters dedicated to MD5 and SHA-0.

Both attacks by Wang et al., on MD5 and SHA-0 respectively, use the additive difference between two message words or partial results of the current hash function. This additive difference is denoted by $\delta$, i.e. $\delta m_i = m'_i \boxminus m_i$. Differences in BSDR form are also used frequently; these are denoted by $\Delta$ in this thesis.

To understand the attacks on the cryptographic hash functions in this thesis we define a path through multiple steps of the computation of a cryptographic hash function as a sequence of BSDRs, one for each step that the path covers. For each step $i$ covered by a path, $\Delta Q_i$ is specified, i.e. a BSDR exists that specifies the required bit differences between $Q'_i$ and $Q_i$. A path is usually derived for all steps of a cryptographic hash function.

Both MD5 and SHA-0 contain bitwise functions that are used in each step. The functions all take three words as input and output one word. Some of these functions do not depend on all the bits of the input all the time. The output from the function $IF(a, b, c) = (a \wedge b) \vee (\neg a \wedge c)$, for example, is not dependent on any bit $i$ in $b$ where the corresponding bit in $a$ is zero. We say that the function IF has the ability to hide bits: a change in any bit $b_i$ or $c_j$, where $a_i = 0$ or $a_j = 1$ respectively, will not change the output of the function. The changed bit is "hidden" by the function. This ability to hide bits is used in the attacks by Wang et al. on both MD5 and SHA-0.
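The hiding property can be demonstrated directly on 32-bit words. The sketch below uses arbitrary example values of our own choosing; only the definition of IF is taken from the text.

```python
MASK = 0xFFFFFFFF


def IF(a: int, b: int, c: int) -> int:
    """Bitwise IF: where a bit of a is 1 take the bit of b, otherwise the bit of c."""
    return ((a & b) | (~a & c)) & MASK


def flip_hidden_bits(a: int, b: int, c: int):
    """Flip every bit of b where a is 0 and every bit of c where a is 1."""
    return b ^ (~a & MASK), c ^ a


if __name__ == "__main__":
    a, b, c = 0x0F0F0F0F, 0x12345678, 0x9ABCDEF0
    b2, c2 = flip_hidden_bits(a, b, c)
    print(hex(IF(a, b, c)), hex(IF(a, b2, c2)))  # identical outputs
    print(hex(IF(a, b ^ 1, c)))                  # bit 0 of a is 1, so this change is visible
```

Half of all input bits of $b$ and $c$ are changed in the first comparison, yet the output is unaffected, while a single flipped bit in a non-hidden position changes it.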

2.3 Commonly Used Construction Schemes

Most modern hash functions are built on two schemes by Davies-Meyer and Merkle-Damgård respectively.

2.3.1 Davies-Meyer

The Davies-Meyer scheme uses an encryption function, $E$, to build a compression function, $C$. It takes as input a message block $M_i$ and a state $s_i$ and returns a new state $s_{i+1}$ that is the sum of the output from the encryption function and the state $s_i$.

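Figure 1: The Davies-Meyer Scheme (diagram: inside $C$, the encryption function $E$ takes $M_i$ and $s_i$, and its output is added to $s_i$ to give $s_{i+1}$)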

The idea of adding the input state to the result of the encryption function is to make it more difficult for an attacker to determine $s_i$ given $E$ and $s_{i+1}$.

The Davies-Meyer scheme is sometimes described with an XOR-operation instead of an addition. The cryptographic hash functions described in this thesis both use the Davies-Meyer scheme with addition.

2.3.2 Merkle-Damgård

The Merkle-Damgård scheme uses a compression function, $C$, to build a hash function, $H$. It takes as input a message and an initial state $s_0$. The message is divided into blocks $M_i$, usually of 512 bits, and the first block, $M_0$, is used as input along with $s_0$ to the compression function in the first step. The output from the first step, $s_1$, is then used as input along with $M_1$ to the compression function again, and this continues until the last message block, $M_k$, has been used. The result of the compression function after the last step, $s_{k+1}$, is the output from the hash function.


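Figure 2: The Merkle-Damgård Scheme (diagram: inside $H$, a chain of compression functions $C$ takes the blocks $M_0, M_1, \ldots, M_i, \ldots, M_k$ and passes the states $s_0, s_1, \ldots, s_i, s_{i+1}, \ldots, s_{k+1}$ from one to the next)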

The Merkle-Damgård construction scheme can be proven to be collision resistant if the compression function is collision resistant [20], provided we have a suitable message padding.

The Merkle-Damgård scheme is often used together with the Davies-Meyer scheme to create a hash function. By using the compression function built by the Davies-Meyer scheme in the hash function constructed by the Merkle-Damgård scheme, a hash function can be created from an encryption function. MD5 and SHA-0 both use these two schemes.
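How the two schemes fit together can be sketched with a toy example. Everything below is hypothetical: `toy_encrypt` is a made-up, insecure stand-in for $E$ on a single 32-bit state word, the 8-byte block size is chosen for brevity, and padding is left out, so the message length is assumed to be a multiple of the block size.

```python
MASK = 0xFFFFFFFF


def toy_encrypt(key_block: bytes, state: int) -> int:
    """Hypothetical stand-in for E: add-rotate-xor rounds keyed by the message block (not secure)."""
    x = state
    for i in range(0, len(key_block), 4):
        word = int.from_bytes(key_block[i:i + 4], "little")
        x = (x + word) & MASK
        x = ((x << 7) | (x >> 25)) & MASK
        x ^= 0x9E3779B9
    return x


def davies_meyer(state: int, block: bytes) -> int:
    """C(s_i, M_i) = E(M_i, s_i) + s_i (mod 2^32)."""
    return (toy_encrypt(block, state) + state) & MASK


def merkle_damgard(message: bytes, s0: int = 0x67452301, block_size: int = 8) -> int:
    """Iterate the compression function over all blocks, chaining the state."""
    assert len(message) % block_size == 0, "padding is omitted in this sketch"
    state = s0
    for off in range(0, len(message), block_size):
        state = davies_meyer(state, message[off:off + block_size])
    return state


if __name__ == "__main__":
    print(hex(merkle_damgard(b"ABCDEFGHabcdefgh")))
```

The message block plays the role of the key and the chaining state plays the role of the plaintext, which is the arrangement the text describes for MD5 and SHA-0.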

2.3.3 Message Padding

Hash functions using the Merkle-Damgård scheme divide the message to be hashed into blocks, and each block is used in one step of the scheme. Message padding is appended to the final block to ensure that the length of the message is a whole multiple of the block length. The most common block size is 512 bits, used by most hash functions in the MD-SHA hash function family, including MD5, SHA-0 and SHA-1.

The padding works in the following way. First, one bit set to one is added to the message. Then a number of zeros are added to the message to make the length of the new message 448 bits modulo 512 bits. The last two words are set to the length of the original message. This padding is done for all messages, even those that originally have a length that is evenly divisible by 512. The message length is therefore limited to $2^{64}$ bits for cryptographic hash functions that use 32-bit words. Two of the cryptographic hash functions of the SHA-2 family, SHA-384 and SHA-512, use a word size of 64 bits and can hash messages up to a size of $2^{128}$ bits.
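The padding rule can be written down compactly for byte-aligned messages. The sketch below is an assumption-laden illustration: it works on whole bytes only and stores the length in little-endian order as MD5 does (SHA-0 and SHA-1 store it big-endian); `md_pad` is our own name.

```python
import struct


def md_pad(message: bytes) -> bytes:
    """Append a one bit, zeros up to 448 bits mod 512, then the 64-bit message length in bits."""
    bit_len = 8 * len(message)
    padded = message + b"\x80"                     # the single one bit, followed by seven zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)  # 56 bytes = 448 bits
    padded += struct.pack("<Q", bit_len % 2**64)   # two 32-bit words holding the length
    return padded


if __name__ == "__main__":
    for n in (0, 55, 56, 64):
        print(n, len(md_pad(b"a" * n)))
```

A 55-byte message still fits in one block, whereas 56 bytes or a full 64-byte block force an additional block consisting mostly of padding.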


Chapter 3

MD5

3.1 Introduction

MD5 was designed by Ron Rivest at MIT in 1991 as a successor to the MD4 hash algorithm and it became an Internet standard in 1992 as RFC 1321. It was very popular and used in several security applications and protocols, among others IPsec and SSH. It is still frequently used to verify the integrity of files sent over the Internet.

MD5 creates a 128 bit hash-digest and uses a 512 bit message block.

3.2 Structure

MD5 uses the Merkle-Damgård and Davies-Meyer structures with the padding described in the previous section.

The compression function of MD5 consists of four rounds where each round is sixteen steps. It uses four internal registers of which one is updated in each step, which pictorially looks as follows:

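Figure 3: The MD5 step function (diagram: $Q_{i-4}$, $F_i$ applied to $Q_{i-1}$, $Q_{i-2}$ and $Q_{i-3}$, $m_{\sigma(i-1)}$ and $k_i$ are added, rotated left by $r_i$ and added to $Q_{i-1}$ to give $Q_i$)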


The values from the four previous steps, $Q_{i-1}$, $Q_{i-2}$, $Q_{i-3}$ and $Q_{i-4}$, are used to calculate the current step, $Q_i$. We have

$$Q_i = Q_{i-1} \boxplus ((Q_{i-4} \boxplus F_i(Q_{i-1}, Q_{i-2}, Q_{i-3}) \boxplus m_{\sigma(i-1)} \boxplus k_i) \ll r_i) \tag{3.1}$$

This expression is central for us, and as we need to discuss it in detail we define $T_i$ as

$$T_i = Q_{i-4} \boxplus F_i(Q_{i-1}, Q_{i-2}, Q_{i-3}) \boxplus m_{\sigma(i-1)} \boxplus k_i$$

and thus

$$Q_i = Q_{i-1} \boxplus (T_i \ll r_i).$$

The words $Q_{-3}$ to $Q_0$ are defined as the input states to the compression function, which according to the Merkle-Damgård structure are the result of the previous compression function. For the first block, these are defined as:

$$Q_{-3} = \text{67452301}_{16}$$

$$Q_{-2} = \text{10325476}_{16}$$

$$Q_{-1} = \text{98badcfe}_{16}$$

$$Q_{0} = \text{efcdab89}_{16}$$

MD5 uses several round- and/or step-dependent variables and functions.

• The function Fi is a bitwise function that depends on the current round:

Table 3.1. MD5 round functions

| Round | F(a,b,c) |
|---|---|
| 1 | IF(a,b,c) = (a∧b)∨(¬a∧c) |
| 2 | IF(a,b,c) = (a∧c)∨(b∧¬c) |
| 3 | XOR(a,b,c) = a⊕b⊕c |
| 4 | XNO(a,b,c) = b⊕(a∨¬c) |

• The message word $m_{\sigma(i-1)}$ used in the current step is derived by a simple message expansion. The 512 bit message block is split into sixteen words and the message expansion is designed to make every round use each message word exactly once. The message expansion used in MD5 is detailed in table 3.2.

• The rotational constants $r_i$ are given by a fixed list of values that are detailed in table 3.3.


Table 3.2. MD5 message expansion

| Round | Step $i$ | $\sigma(i-1)$ |
|---|---|---|
| 1 | 1-16 | $i - 1$ |
| 2 | 17-32 | $(5(i-1) + 1) \bmod 16$ |
| 3 | 33-48 | $(3(i-1) + 5) \bmod 16$ |
| 4 | 49-64 | $7(i-1) \bmod 16$ |

Table 3.3. MD5 rotation constants

| Round | Step $i$ | $r_i$ |
|---|---|---|
| 1 | 1-16 | 7, 12, 17, 22, 7, 12, 17, 22, ... |
| 2 | 17-32 | 5, 9, 14, 20, 5, 9, 14, 20, ... |
| 3 | 33-48 | 4, 11, 16, 23, 4, 11, 16, 23, ... |
| 4 | 49-64 | 6, 10, 15, 21, 6, 10, 15, 21, ... |

• The value $k_i$ is a constant that is unique in each step. It is constructed from the sine function as follows:

$$k_i = \lfloor |2^{32} \sin(i)| \rfloor$$

For future reference, MD5 is given in complete detail in section A.2 in the appendix.
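The step function and the tables above can be put together into a compact reference implementation in the $Q$ notation. This is a sketch of our own, not the listing from the appendix. It assumes the RFC 1321 conventions: little-endian words, the initial state stored as $(Q_{-3}, Q_{-2}, Q_{-1}, Q_0)$, $\sigma$ evaluated on the 0-based index $i - 1$, and the four state words updated by adding $Q_{61}, \ldots, Q_{64}$. The result is compared against Python's `hashlib`.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
ROT = [[7, 12, 17, 22], [5, 9, 14, 20], [4, 11, 16, 23], [6, 10, 15, 21]]
# Initial state as (Q_-3, Q_-2, Q_-1, Q_0), i.e. RFC 1321's (A, D, C, B).
IV = (0x67452301, 0x10325476, 0x98BADCFE, 0xEFCDAB89)


def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK


def F(i, a, b, c):
    """Round function for step i (Table 3.1)."""
    if i <= 16:
        return (a & b) | (~a & c)
    if i <= 32:
        return (a & c) | (b & ~c)
    if i <= 48:
        return a ^ b ^ c
    return b ^ (a | (~c & MASK))


def sigma(j):
    """Message word index for 0-based step index j = i - 1 (Table 3.2)."""
    return [j, (5 * j + 1) % 16, (3 * j + 5) % 16, (7 * j) % 16][j // 16]


def compress(state, block):
    m = struct.unpack("<16I", block)
    Q = dict(zip(range(-3, 1), state))
    for i in range(1, 65):
        k = int(abs(math.sin(i)) * 2**32)
        r = ROT[(i - 1) // 16][(i - 1) % 4]
        T = (Q[i - 4] + F(i, Q[i - 1], Q[i - 2], Q[i - 3]) + m[sigma(i - 1)] + k) & MASK
        Q[i] = (Q[i - 1] + rotl(T, r)) & MASK
    # Davies-Meyer feed-forward: add the input state to Q_61 .. Q_64.
    return tuple((state[n] + Q[61 + n]) & MASK for n in range(4))


def md5_hex(message: bytes) -> str:
    padded = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) % 2**64)
    state = IV
    for off in range(0, len(padded), 64):
        state = compress(state, padded[off:off + 64])
    a, d, c, b = state
    return struct.pack("<4I", a, b, c, d).hex()


if __name__ == "__main__":
    print(md5_hex(b"abc"), hashlib.md5(b"abc").hexdigest())
```

Keeping every $Q_i$ in a dictionary instead of four rotating registers mirrors the notation of this chapter, where conditions are placed on individual $Q_i$.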

3.3 Cryptanalysis

MD5 has been widely studied due to its popularity. The first major result came in 1996 when Dobbertin found a collision for the compression function. He was, however, unable to extend the collision to the full MD5 [16].

The complexity of a brute force collision attack on MD5 is only about $2^{64}$ calls to the cryptographic hash function, and on March 1, 2004, the company CertainKey Cryptosystems launched such an attack. They used a distributed network to try to find the first collision on MD5. This project was cancelled in August of 2004 when the first collision was found by Xiaoyun Wang, Dengguo Feng, Xuejia Lai and Hongbo Yu [15] and published in a paper with collisions for several cryptographic hash functions. This paper consisted of only four pages, with the collisions listed without any explanations. This led to several papers [13], [11] from people speculating about the attack until Wang et al. [14] published another paper describing the attack on MD5 in more detail.

The first attack had a complexity of $2^{39}$ operations, which was improved in a sequence of results. The best result so far is by Vlastimil Klima, who in 2006 published an attack [12] which could find collisions on MD5 in less than a minute on a regular notebook using a concept he called tunnels.

The first attack by Wang et al. and the improvements are the main focus of the rest of this chapter, and we next describe the details of these attacks.

3.4 Details of Wang’s Attack

The attack by Wang et al. [14] uses a method called differential cryptanalysis, where the differences of the partial results between two messages are tracked as they are hashed. They also used the additive difference along with the exclusive-or difference when describing the registers as the messages were hashed. This allowed them to specify the current state more exactly, since the additive difference and the exclusive-or difference can be combined to give a BSDR.

A key component of the attack is the path, which we remind the reader is a series of BSDRs for the intermediate results of the hash function. These BSDRs are the differences required to achieve a collision.

There are basically three steps to finding a collision on MD5 with Wang's method.

1. Calculate a path of differentials that the two colliding messages should follow. This path specifies, in each of the 64 steps of the hash function, the BSDR-difference of the partial results for the two messages.

2. Generate conditions that satisfy this path, i.e. conditions on $Q_i$ for $i = -3 \ldots 64$, that lead to the desired differences in each step. These conditions are used to help the attack follow the path.

3. Search for a collision by generating messages that satisfy the conditions for the first 16 steps and then checking if the messages satisfy the conditions in rounds two to four.

The attack by Wang et al. uses two blocks to produce a collision. The first block is used to produce a specific difference that will be used as input to the second block. This second block then produces an output with a difference which cancels the difference in the input. This will produce a collision since the input and the output of each block are added together by the Davies-Meyer scheme. Wang et al. have therefore derived two different paths, one for each block.


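Figure 4: Two block collision using the Davies-Meyer scheme (diagram: $\Delta s_0 = 0$; block $M_0$ gives $\Delta E(s_0, M_0) = x$ and thus $\Delta s_1 = x$; block $M_1$ gives $\Delta E(s_1, M_1) = -x$ and thus $\Delta s_2 = 0$)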

In their paper, Wang et al. do not mention why they decided to use two blocks instead of one, but possible reasons are discussed later in this paper. The paper also lacks information about how the paths are used, and no indication is given of how the paths were constructed. Some references to earlier work are made, but the papers referred to are only available in Chinese. A major part of the contribution of this paper is therefore to clarify the details of the attack further. The following three sections deal with our and other investigators' discoveries about the attack.

The three sections are divided as follows. The first section deals with finding a collision given a path. This is placed first although it is chronologically the last step in the collision search. This is a deliberate choice because it is a good introduction to the other parts, which require more detail.

The second section, “Calculating a path”, describes in detail how a path can be constructed. This part is quite technical. Deriving conditions from a path is also part of this section.

The third section, “Optimizations”, deals with the concept of multi-message modification, tunnels and other methods for decreasing the time needed to search for a collision given a path.

3.4.1 Searching for a Collision

Conditions are always on one bit of a word and can specify a value for that bit or a relation to another bit in another, earlier word. Let $Q_{i,j}$ denote the $j$-th bit of $Q_i$. Recall that $Q'_i$ is the result of the calculation for step $i$ of the second message.

Conditions in round one can be fulfilled automatically by modifying the message word used in that step, since there is a direct relation between the result of each step, $Q_i$, and the message word used in that step, $m_{\sigma(i-1)}$, which in fact equals $m_{i-1}$.

Each step in round one looks like this:

$$Q_i = Q_{i-1} \boxplus ((Q_{i-4} \boxplus IF(Q_{i-1}, Q_{i-2}, Q_{i-3}) \boxplus m_{\sigma(i-1)} \boxplus k_i) \ll r_i) \tag{3.2}$$


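Table 3.4. Conditions on bits

| Condition | Meaning |
|---|---|
| . | $Q_{i,j} = Q'_{i,j}$ |
| 0 | $Q_{i,j} = 0$ |
| 1 | $Q_{i,j} = 1$ |
| + | $Q_{i,j} = 0$, $Q'_{i,j} = 1$ |
| - | $Q_{i,j} = 1$, $Q'_{i,j} = 0$ |
| ^ | $Q_{i,j} = Q_{i-1,j}$, $Q'_{i,j} = Q'_{i-1,j}$ |
| ! | $Q_{i,j} \neq Q_{i-1,j}$, $Q'_{i,j} \neq Q'_{i-1,j}$ |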

and can be rewritten as this:

$$m_{\sigma(i-1)} = ((Q_i \boxminus Q_{i-1}) \gg r_i) \boxminus IF(Q_{i-1}, Q_{i-2}, Q_{i-3}) \boxminus Q_{i-4} \boxminus k_i \tag{3.3}$$

In the second, third and fourth rounds, the message words are fixed as the first round has already been calculated. In the first round the message words have not been used before, and therefore we can select $Q_i$ to match the conditions we want and then calculate the appropriate message word $m_{\sigma(i-1)}$ from it. Conditions in rounds two to four are therefore much more relevant for the complexity of the attack, as they have to be fulfilled probabilistically.

The first step in searching for a collision is to generate $Q_i$ for $i = 1 \ldots 16$ that satisfy all the conditions on those variables. When all $Q_i$ in the first round are fixed, all message words have also been fixed and are given by equation 3.3 above.
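Equation 3.3 translates directly into code. The sketch below is our own illustration: it picks random $Q_1, \ldots, Q_{16}$, forces two purely hypothetical bit conditions (they are not taken from Wang's path), derives the message words, and then recomputes round one forwards to confirm that the chosen states are reproduced. The initial state and constants follow RFC 1321.

```python
import math
import random

MASK = 0xFFFFFFFF
ROT1 = [7, 12, 17, 22]  # round-one rotation constants (Table 3.3)
IV = {-3: 0x67452301, -2: 0x10325476, -1: 0x98BADCFE, 0: 0xEFCDAB89}


def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK


def rotr(x, s):
    return ((x >> s) | (x << (32 - s))) & MASK


def IF(a, b, c):
    return (a & b) | (~a & c)


def k(i):
    return int(abs(math.sin(i)) * 2**32)


def message_from_states(Q):
    """Equation (3.3): recover m_0 .. m_15 from Q_-3 .. Q_16."""
    m = []
    for i in range(1, 17):
        t = rotr((Q[i] - Q[i - 1]) & MASK, ROT1[(i - 1) % 4])
        m.append((t - IF(Q[i - 1], Q[i - 2], Q[i - 3]) - Q[i - 4] - k(i)) & MASK)
    return m


def round_one(m, init=IV):
    """Equation (3.2): compute Q_1 .. Q_16 forwards from the message words."""
    Q = dict(init)
    for i in range(1, 17):
        t = (Q[i - 4] + IF(Q[i - 1], Q[i - 2], Q[i - 3]) + m[i - 1] + k(i)) & MASK
        Q[i] = (Q[i - 1] + rotl(t, ROT1[(i - 1) % 4])) & MASK
    return Q


def choose_states(rng):
    """Random round-one states with two hypothetical example conditions enforced."""
    Q = dict(IV)
    for i in range(1, 17):
        Q[i] = rng.getrandbits(32)
    Q[5] &= ~(1 << 31) & MASK  # hypothetical condition Q_{5,31} = 0
    Q[6] |= 1 << 31            # hypothetical condition Q_{6,31} = 1
    return Q


if __name__ == "__main__":
    Q = choose_states(random.Random(1))
    m = message_from_states(Q)
    print(round_one(m) == Q)
```

Because every condition in round one is met by construction, no probabilistic search is needed until step 17.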

There are 43 conditions in later rounds that have to be fulfilled, and because all sixteen message words have been fixed, we can check them by calculating $Q_i$, $i = 17 \ldots 64$, until we find a condition that is not satisfied or we have calculated all 64 steps. If all conditions are satisfied, we have found a good first block and can start the search for a second block. If we find a condition that is not satisfied, we start from the beginning again.

This is a very naive way of searching for a collision. It can be done more efficiently by several optimizations. This is discussed in section 3.4.3, Optimizations, below.

3.4.2 Calculating a Path

We can make several observations regarding the round functions that are used in MD5. The IF function used in rounds one and two uses $\wedge$ and $\vee$, which means that it is possible to hide bits, i.e. the output of the function doesn't depend on all the input bits all the time, as described earlier in the paper. If for example we know that $a_j = 1$, it doesn't matter what $c_j$ is since that bit will not matter for the output from the round function IF. The function XNO in round four can also hide bits for the same reason.


The XOR function of the third round does not have this ability and thus any change in the input will affect the output from the function. This makes the third round the most predictable round, but also the hardest since we have no help from the round function to get the bit differences we want.

Selecting Message Differences

A collision found with Wang's attack will only have differences in bits $m_{4,31}$, $m_{11,15}$ and $m_{14,31}$ between the two messages in each of the two blocks. The reason that these message differences were chosen is never described, but we think the key reason is to be found in the third round.

Since MD5 uses a word length of 32 bits, any carries from the 31st bit are discarded. This is important since a difference in the 31st bit will never propagate to other bits.

We observed that if we assume that the differences, $\Delta Q_i$, in steps $i-4$ to $i-1$ are zero in all bits except the 31st, and step $i$ is in the third round, then step $i$ will also have difference $2^{31}$. This is easy to prove. Since $\Delta Q_{i-1} = \Delta Q_{i-2} = \Delta Q_{i-3} = 2^{31}$, the output of the round function will also differ in the 31st bit. This difference will be cancelled by the bit difference of $Q_{i-4}$, since they are added together and the 31st bit produces no carries. Therefore $T_i$ will not contain any differences. The word $Q_i$ is the result of the addition of $T_i$ and $Q_{i-1}$, and the differences in $Q_{i-1}$ will be directly transferred to $Q_i$, with possible carries. But since the difference is only in the 31st bit, there is no carry.
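This observation can be checked experimentally. The sketch below is our own: it evaluates one round-three step for random inputs and again with bit 31 flipped in all four previous results, using the same message word and constants for both messages.

```python
import random

MASK = 0xFFFFFFFF
BIT31 = 1 << 31


def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK


def step_round_three(q1, q2, q3, q4, m, k, r):
    """Q_i from Q_{i-1}, Q_{i-2}, Q_{i-3}, Q_{i-4} with the XOR round function."""
    T = (q4 + (q1 ^ q2 ^ q3) + m + k) & MASK
    return (q1 + rotl(T, r)) & MASK


def difference_after_step(rng):
    """XOR difference of Q_i when all four inputs differ only in bit 31."""
    q1, q2, q3, q4, m, k = (rng.getrandbits(32) for _ in range(6))
    r = rng.choice([4, 11, 16, 23])
    out = step_round_three(q1, q2, q3, q4, m, k, r)
    out2 = step_round_three(q1 ^ BIT31, q2 ^ BIT31, q3 ^ BIT31, q4 ^ BIT31, m, k, r)
    return out ^ out2


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(difference_after_step(rng) == BIT31 for _ in range(1000)))
```

The property holds for every input, not just with high probability, since flipping bit 31 is the same as adding $2^{31}$ modulo $2^{32}$ and the two contributions to $T_i$ cancel.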

This gives us an idea for an easy way to control the third round. If the message words that contain differences are used in the third round in such a way that they create the difference $2^{31}$ four steps in a row, we know what the output from the third round will be without requiring any more conditions. It is not hard to show that this can be done quite easily.

If we assume that we have no differences coming out of round two, we can make the result of any step, $i$, in round three have a difference of $2^{31}$ by selecting the difference of the message word in that step to be $2^{31 - r_i}$. This guarantees $\Delta T_i = 2^{31 - r_i}$ as long as no carry occurs in $T_i$, which in turn guarantees $\Delta Q_i = 2^{31}$.

Since $\Delta Q_i = 2^{31}$, we want $\Delta T_{i+1} = 0$ to guarantee that $\Delta Q_{i+1} = 2^{31}$. This can be achieved by setting $\Delta m_{\sigma(i)} = 2^{31}$, since this will cancel the difference from $F_{i+1}$ caused by the difference in $Q_i$.

In step $i + 2$, $\Delta Q_{i+1} = 2^{31}$, so again $\Delta T_{i+2} = 0$ is required. Since we already have two differences, from steps $Q_i$ and $Q_{i+1}$, that will cancel each other in the round function, we can select $\Delta m_{\sigma(i+1)} = 0$.

There are now three consecutive steps, all with differences only in bit 31. For the fourth step, $\Delta T_{i+3} = 0$ is required as in the previous steps. Three of the intermediate results used to calculate $T_{i+3}$ have differences in bit 31, and two of them cancel each other inside the round function. We need $\Delta m_{\sigma(i+2)} = 2^{31}$ to cancel the last one.


By using these three message differences:

$$\Delta m_{\sigma(i-1)} = 2^{31 - r_i}$$

$$\Delta m_{\sigma(i)} = 2^{31}$$

$$\Delta m_{\sigma(i+1)} = 0$$

$$\Delta m_{\sigma(i+2)} = 2^{31},$$

we can guarantee that we get a difference of $2^{31}$ after round three if we have no difference coming into round three and no carry occurs because of the difference in $T_i$. All that is left to do before the message difference is decided is to find a suitable $i$ where the sequence above should start. There are 13 choices, since we can select $i$ from 33, the first step in round three, to 45, the last step where $i + 3$ is still in round three. Wang's choice for $i$ is 35, a choice not motivated in their paper. This is, however, a good choice for an efficient attack; all other values for $i$ that we have tried lead to an attack with a worse expected runtime. This is discussed in more detail below in section 3.5.2.
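The message words and differences implied by a given starting step can be computed mechanically. The sketch below is our own and assumes the round-three message expansion of RFC 1321, $\sigma(j) = (3j + 5) \bmod 16$ for the 0-based step index $j = i - 1$, together with the round-three rotation constants of table 3.3.

```python
ROT3 = [4, 11, 16, 23]  # round-three rotation constants, steps 33-48


def sigma3(j: int) -> int:
    """Round-three message word index for 0-based step index j (32 <= j <= 47)."""
    return (3 * j + 5) % 16


def message_differences(i: int) -> dict:
    """Map message word index -> additive difference for a sequence starting at step i."""
    assert 33 <= i <= 45, "steps i .. i+3 must all lie in round three"
    r_i = ROT3[(i - 33) % 4]
    return {
        sigma3(i - 1): 1 << (31 - r_i),
        sigma3(i): 1 << 31,
        sigma3(i + 1): 0,
        sigma3(i + 2): 1 << 31,
    }


if __name__ == "__main__":
    for word, diff in sorted(message_differences(35).items()):
        print(f"m{word}: {diff:#010x}")
```

For $i = 35$ this gives a difference of $2^{15}$ in $m_{11}$ and $2^{31}$ in $m_4$ and $m_{14}$, which matches the bit positions $m_{4,31}$, $m_{11,15}$ and $m_{14,31}$ named at the start of this subsection.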

Dealing With Round Four

Round four uses the round function XNO, which has the ability to hide bits. This is used to control the propagation of the bit differences. We know that the last four steps of round three will all have differences $2^{31}$, and there will also be three more differences from message words that have to be controlled. Two of the message differences are also $2^{31}$, and these are easy to control since there are no carries to worry about and we can use the round function to make them part of the $2^{31}$-difference from the third round as it propagates through round four.

We cannot, however, do anything about the difference from the third message word, the difference that is not at the 31st bit. Since the rotational constants are different in this round compared to the ones used in round three, we cannot even rotate it to bit 31. Thus, this difference will create a difference in $Q_i$ for the step in which it is used. To minimize the avalanche effect it seems like a good idea to let this difference enter late in the fourth round. This minimizes the number of steps that will be affected by it, and the number of possibilities it will have to spread through the carry effect. When Wang et al. designed their path, they chose to have the message word enter in step 62, almost at the very end. If we only concern ourselves with the fourth round, we can choose another message difference that will let this difference enter in step 64, the very last step. This message difference will unfortunately result in many more conditions in the second round, as we will show later.

The First Two Rounds

When we analyzed round three we assumed that the last four steps of round two had no differences. We would therefore like to find a path through the first two rounds that leads to no differences in the last four steps of round two. As described above, this will lead to few conditions in round three, and we know that we can deal with round four. We also know the beginning of round one since the initial steps, $Q_{-3}$ to $Q_0$, are fixed, either from a previous block or as the initial values for the hash function.

As described earlier, we want as many of the conditions as possible to fall in round one, since they can be fulfilled directly. We would therefore like to join the two parts of the path in round one, since connecting two parts requires a lot of conditions. We refer the reader to Martin Ekerå's master's thesis, in which he has invested considerable time in researching this area.

3.4.3 Optimizations

The method described in section 3.4.1 for searching for collisions is very simple. Since there are a total of 43 conditions in rounds two, three and four together, and each has a probability of $\frac{1}{2}$ of being fulfilled, the straightforward search would require about $2^{43}$ tests, i.e. generations of suitable message words.

In Wang's paper they mention that the complexity of their attack is at most $2^{39}$ hash operations, and this is due to the fact that they use a concept they call multi-message modification. Compared with the straightforward search, multi-message modification is much more complicated and involves more steps, but the principle is quite simple. If a condition in the first few steps of round two is not fulfilled, we use multi-message modification to fulfill it by changing one of the message words that affect it, and then change other message words to make sure that the first change doesn't propagate in round one.

An example: According to Wang's path, $Q_{17,31} = 0$ is one of the conditions on $Q_{17}$. If this is not satisfied, we can change $m_1$ so that $Q_{17,31} = 0$ and then change $m_2$, $m_3$, $m_4$ and $m_5$ in such a way that the change in $m_1$ doesn't propagate further in the first round. Since none of $m_2$, $m_3$, $m_4$ and $m_5$ have been used in round two yet, we can still modify them to cancel this difference.

By using multi-message modification, Wang et al. claim that their algorithm to find a collision has a complexity of $2^{39}$ and that they could use it to find a collision in about an hour on a supercomputer.

Multi-message modification became obsolete, however, when Klima released a paper in 2006 [12] in which he introduced the concept of tunnels, which provide a way to vary the bits of a message word slightly without recomputing most steps in round two. They can be seen as a more powerful version of Wang's multi-message modification.

The idea is that we can vary bits of a message word and only recompute some partial results, $Q_i$, of the hash function. This is useful when a condition is found in the second round that is not satisfied. By varying bits of message words that are used before that step we have a good chance of satisfying the condition after all. By using tunnels we can do this without having to compute all the steps again up to the problematic step.


The principle is easiest to explain by an example. Let us consider the computa-tion of Q9 −Q13:

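\begin{align*}
Q_{9} &= Q_{8} \boxplus ((Q_{5} \boxplus \mathrm{IF}(Q_{8}, Q_{7}, Q_{6}) \boxplus m_{8} + k_{9}) \ll r_{9}) \\
Q_{10} &= Q_{9} \boxplus ((Q_{6} \boxplus \mathrm{IF}(Q_{9}, Q_{8}, Q_{7}) \boxplus m_{9} + k_{10}) \ll r_{10}) \\
Q_{11} &= Q_{10} \boxplus ((Q_{7} \boxplus \mathrm{IF}(Q_{10}, Q_{9}, Q_{8}) \boxplus m_{10} + k_{11}) \ll r_{11}) \\
Q_{12} &= Q_{11} \boxplus ((Q_{8} \boxplus \mathrm{IF}(Q_{11}, Q_{10}, Q_{9}) \boxplus m_{11} + k_{12}) \ll r_{12}) \\
Q_{13} &= Q_{12} \boxplus ((Q_{9} \boxplus \mathrm{IF}(Q_{12}, Q_{11}, Q_{10}) \boxplus m_{12} + k_{13}) \ll r_{13})
\end{align*}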

Suppose we want to vary $m_8$, which is used late in the second round, to satisfy some condition in that round. If we modify it directly, we also have to modify $m_9$ to $m_{12}$ to cancel the modifications. The idea with tunnels is to realize that if we only vary bits $Q_{9,j}$ where $Q_{10,j} = 0$ and $Q_{11,j} = 1$, then $Q_{11}$ and $Q_{12}$ will not be affected because the IF-function hides those changes.
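The hiding effect of the IF-function can be checked directly. The following is a minimal sketch, not Klima's implementation: the helper name `tunnel_mask` and the random register values are assumptions made for illustration. It flips only those bits of $Q_9$ where $Q_{10,j} = 0$ and $Q_{11,j} = 1$ and confirms that the two IF outputs that feed $Q_{11}$ and $Q_{12}$ do not change.

```python
import random

MASK32 = 0xFFFFFFFF

def IF(x, y, z):
    """Bitwise 'if x then y else z', the round-one function of MD5."""
    return ((x & y) | (~x & z)) & MASK32

def tunnel_mask(q10, q11):
    """Bits j with Q10[j] = 0 and Q11[j] = 1 (hypothetical helper name)."""
    return (~q10 & q11) & MASK32

rng = random.Random(1)  # fixed seed so the sketch is repeatable
q8, q9, q10, q11 = (rng.getrandbits(32) for _ in range(4))

mask = tunnel_mask(q10, q11)
q9_varied = q9 ^ (rng.getrandbits(32) & mask)  # flip tunnel bits only

# Q11 is computed from IF(Q10, Q9, Q8): where Q10[j] = 0 it selects Q8.
same_for_q11 = IF(q10, q9, q8) == IF(q10, q9_varied, q8)
# Q12 is computed from IF(Q11, Q10, Q9): where Q11[j] = 1 it selects Q10.
same_for_q12 = IF(q11, q10, q9) == IF(q11, q10, q9_varied)

print(same_for_q11, same_for_q12)  # prints: True True
```

In a real attack the register values are not random but constrained by the path, so the number of usable tunnel bits depends on which bits of $Q_{10}$ and $Q_{11}$ are still free.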

The first nine steps of round two look like this:

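\begin{align*}
Q_{17} &= Q_{16} \boxplus ((Q_{13} \boxplus \mathrm{IF}(Q_{14}, Q_{16}, Q_{15}) \boxplus m_{1} + k_{17}) \ll r_{17}) \\
Q_{18} &= Q_{17} \boxplus ((Q_{14} \boxplus \mathrm{IF}(Q_{15}, Q_{17}, Q_{16}) \boxplus m_{6} + k_{18}) \ll r_{18}) \\
Q_{19} &= Q_{18} \boxplus ((Q_{15} \boxplus \mathrm{IF}(Q_{16}, Q_{18}, Q_{17}) \boxplus m_{11} + k_{19}) \ll r_{19}) \\
Q_{20} &= Q_{19} \boxplus ((Q_{16} \boxplus \mathrm{IF}(Q_{17}, Q_{19}, Q_{18}) \boxplus m_{0} + k_{20}) \ll r_{20}) \\
Q_{21} &= Q_{20} \boxplus ((Q_{17} \boxplus \mathrm{IF}(Q_{18}, Q_{20}, Q_{19}) \boxplus m_{5} + k_{21}) \ll r_{21}) \\
Q_{22} &= Q_{21} \boxplus ((Q_{18} \boxplus \mathrm{IF}(Q_{19}, Q_{21}, Q_{20}) \boxplus m_{10} + k_{22}) \ll r_{22}) \\
Q_{23} &= Q_{22} \boxplus ((Q_{19} \boxplus \mathrm{IF}(Q_{20}, Q_{22}, Q_{21}) \boxplus m_{15} + k_{23}) \ll r_{23}) \\
Q_{24} &= Q_{23} \boxplus ((Q_{20} \boxplus \mathrm{IF}(Q_{21}, Q_{23}, Q_{22}) \boxplus m_{4} + k_{24}) \ll r_{24}) \\
Q_{25} &= Q_{24} \boxplus ((Q_{21} \boxplus \mathrm{IF}(Q_{22}, Q_{24}, Q_{23}) \boxplus m_{9} + k_{25}) \ll r_{25})
\end{align*}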

Because $m_{11}$ is not affected by the change in $m_8$, we only have to recompute from step $Q_{25}$, which depends on $m_9$, instead of from $Q_{19}$, which depends on $m_{11}$. Thus any conditions on $Q_{19}$ to $Q_{24}$ remain satisfied.

There are several tunnels that can be used together to reduce the search complexity of an attack dramatically. Klima states that by using the tunnels he could find a collision on MD5 within a minute using a regular notebook.

3.5 Our Contributions

For several parts of the attack on MD5 by Wang et al. there was little or no explanation as to why a certain solution was selected. Two questions we had were why they chose a two-block collision instead of a collision with one block, and why their specific message difference was chosen. To understand why, we tried to derive an attack using only one block, and one attack with another message difference. The results of these experiments are detailed in the next two sections.

3.5.1 One-block Collision

Wang et al. chose to use two blocks to generate a collision: one block to offset the input values to the second block, which then cancels this offset. We assumed this had to do with reducing the complexity of the attack but decided to try to derive a path that led to a collision in one block. We remark that all collisions found on MD5, that we know of, have used Wang’s two-block approach.

To achieve a one-block collision, a new path through the fourth round had to be derived that led to no differences in the last four steps of the round. It is natural to still use Wang’s message difference, possibly with a different value for $i$, to keep the number of conditions in the third round to a minimum. To find a suitable path through the fourth round, we wrote a program that tested all possible paths given a message difference and an output from the third round.

In each step a difference can be hidden, flipped or remain unchanged through the round function and can also spread through the carry effect. This results in a very large search space for the sixteen steps of round four, so more restrictions had to be used. We decided to set limits on both the number of bits in each step in round four that were allowed to contain differences and, in the same way, the number of bits in the output from the third round. The reason for this is that the number of conditions is directly related to the number of bits with differences in them. More bit differences require more conditions to get exactly the difference wanted in each step, to keep the differences from spreading further or cancelling too soon, and finally to cancel the differences at the appropriate step.

By doing this we could reduce the search for the path to a number of possibilities that could be evaluated by computer. Although we ran the program with as many as nine allowed bit differences in each word and 20 bit differences out of round three, the program found no suitable path through the fourth round.

We can make several remarks on this. First of all, this does not mean that there does not exist a one-block collision for MD5; on the contrary, such collisions have to exist by the pigeonhole principle.

However, to achieve a collision using only one block, we draw the conclusion that either a completely new message difference has to be found, or the path through the third and fourth round will have too many conditions to be practical for an attack. We can also draw the conclusion that Wang et al. chose the two-block collision attack because it reduces the complexity significantly when using their type of message difference.


3.5.2 New Two-block Collision

All attacks against MD5 after Wang’s first attack have been improvements of that attack. Some have used new paths, some have used other improvements, but every attack has used the same message difference. There are however, as described in section 3.4.2, thirteen different choices for message differences that require only two conditions in round three. One of these choices even has the message word with the difference that is not in bit 31 come in as late as the very last step of the hash function. A collision with that message difference would lead to fewer differences after the first block. When using the message difference selected by Wang et al., there are eight bits that differ after the first block: $Q_{0,31}$, $Q_{-1,25}$, $Q_{-1,31}$, $Q_{-2,25}$, $Q_{-2,26}$, $Q_{-2,31}$, $Q_{-3,25}$ and $Q_{-3,31}$.

With the message differences in words $m_2$, $m_9$ and $m_{12}$, instead of in words $m_4$, $m_{11}$ and $m_{14}$ as Wang et al. use, the differences after the first block would only be in the following five bits: $Q_{0,31}$, $Q_{-1,31}$, $Q_{-2,31}$, $Q_{-3,25}$ and $Q_{-3,31}$. This is because the difficult message word, the one that has a difference in another bit than bit 31, will now only be used in step 64 instead of in step 62, where it would influence the output from steps 63 and 64 as well.

In round two, a problem with this choice of message difference becomes apparent. For Wang’s choice, the message words with differences are used relatively early in the second round: all three are used before step 27 and the problematic message word is used already in step nineteen. This makes it much easier to construct a path that has no differences in the last four steps of round two and also reduces the number of conditions needed significantly.

For our alternative choice, differences in message words $m_2$, $m_9$ and $m_{12}$, one of the message words with differences is used in step 30, and another is used in the very last step of round two, in step 32. Because we do not want any differences in the last four steps of round two, these differences in message words have to be used to cancel existing differences, which means that we have to construct a path through all steps of round two that contains a difference until the last few steps. With Wang’s choice of message differences the path only has to be derived until step 26 since no differences are used after that.

We constructed a path through the second round and parts of the first round and found that the number of conditions required was substantially higher than for Wang’s path.

3.6 Conclusions

MD5 has been used in several security applications and is still in use in some today. It is, however, recommended that this cryptographic hash function should not be used anymore because of the recent attacks published against it. The very simple message expansion, where each message word is only used four times, is one weakness, along with its relatively few steps and short message digest. A brute-force search for a collision is only about $2^{64}$ calls to the cryptographic hash function since the digest is only 128 bits, which NIST judges to be too small.

Even though MD5 had been studied intensively for over ten years, no collision had been found when Wang et al. published their paper in 2004 with the first collision. The method they used, differential cryptanalysis, proved to be very effective against cryptographic hash functions that were built upon an iterative structure similar to MD4. The key to the collision attack by Wang et al. is the path that they derived through MD5 with great skill and patience. The predictability of the third round and the simple message expansion of MD5 were used to create a path with very low search complexity compared to previous attempts.

Our experiments showed the problem of using the predictability of the third round and deriving a path with a collision using one block. A second block had to be used to cancel the differences that were kept to a minimum through the fourth round. To cancel the differences with a second block, the Davies-Meyer scheme proved useful for an effective attack since it adds the input to the encryption function to the output, and Wang et al. could derive a second path, very similar to the first, that produced an output that was equal to the result of negating the input.

After Wang et al. published their attack, several improvements have been made. The most significant was made by Vlastimil Klima, who introduced the concept of tunnels and could thereby reduce the time to search for a collision from about an hour to under a minute.


Chapter 4

SHA-0

4.1 Introduction

SHA-0 was published by NIST, the National Institute of Standards and Technology, in 1993 as the Secure Hash Algorithm. It was designed by NSA, the National Security Agency, because of concerns that the hash digest length of MD5 was too small [10]. Like MD5, SHA-0 uses a 512-bit message block and is built upon the iterative design of MD4, but uses five internal registers and outputs a 160-bit hash digest instead. Due to the birthday paradox, this raises the complexity of a brute-force collision search from about $2^{64}$ hash function computations to about $2^{80}$.
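The figures $2^{64}$ and $2^{80}$ are instances of the birthday bound: for an $n$-bit digest a collision is expected after roughly $2^{n/2}$ hash computations. The sketch below illustrates this on a digest truncated to a few bits. It uses SHA-1 from Python's `hashlib` purely as a stand-in (SHA-0 is not available there), and the function names `truncated_digest` and `birthday_collision` are assumptions made for this example.

```python
import hashlib
import itertools

def truncated_digest(data: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-1(data); SHA-1 is only a stand-in hash here."""
    full = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return full >> (160 - bits)

def birthday_collision(bits: int):
    """Hash counters 0, 1, 2, ... until two truncated digests coincide.

    Returns (first_message, second_message, number_of_messages_hashed).
    """
    seen = {}
    for counter in itertools.count():
        message = counter.to_bytes(8, "big")
        tag = truncated_digest(message, bits)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message

m1, m2, trials = birthday_collision(24)
# For a 24-bit digest the birthday bound suggests on the order of 2**12 trials.
print(m1.hex(), m2.hex(), trials)
```

The same generic search applied to the full 128-bit or 160-bit digest is what the attacks in this thesis improve upon.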

4.2 Structure

SHA-0, like MD5, uses the Merkle-Damgård and Davies-Meyer structures.

The compression function consists of four rounds where each round is 20 steps. It uses five internal registers of which one is updated in each step. One step is described as follows:

Figure 5: The SHA step function


The results of the five previous steps, $Q_{i-1}$, $Q_{i-2}$, $Q_{i-3}$, $Q_{i-4}$ and $Q_{i-5}$, are used to calculate the current step, $Q_i$. A step can also be expressed as:

\[
Q_i = (Q_{i-1} \ll 5) \boxplus F_i(Q_{i-2}, (Q_{i-3} \ll 30), (Q_{i-4} \ll 30)) \boxplus (Q_{i-5} \ll 30) \boxplus w_{i-1} \boxplus k_i
\]

$Q_{-4}$ to $Q_0$ are defined as the input state of the compression function, which according to the Merkle-Damgård structure is the output from the previous block added to the input to the previous block. For the first block, these are defined as:

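\begin{align*}
Q_{-4} &= \texttt{c3d2e1f0}_{16} \\
Q_{-3} &= \texttt{efcdab89}_{16} \\
Q_{-2} &= \texttt{98badcfe}_{16} \\
Q_{-1} &= \texttt{10325476}_{16} \\
Q_{0} &= \texttt{67452301}_{16}
\end{align*}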

SHA-0 uses a number of round or step dependent variables and functions.

• The round function Fi is a bitwise function that depends on the current round:

Table 4.1. SHA-0 round functions

Round   F(a,b,c)
1       IF(a,b,c)  = (a∧b)∨(¬a∧c)
2       XOR(a,b,c) = a⊕b⊕c
3       MAJ(a,b,c) = (a∧b)∨(a∧c)∨(b∧c)
4       XOR(a,b,c) = a⊕b⊕c

• The message word used in each step is derived through a message expansion. This message expansion is designed to make sure that any single message word influences more steps in the computation compared to MD5 and is defined as

\[
w_i = \begin{cases}
m_i & i \le 15 \\
w_{i-3} \oplus w_{i-8} \oplus w_{i-14} \oplus w_{i-16} & i > 15.
\end{cases}
\]

With this construction, each of the sixteen message words directly influences between 27 and 35 of the extended message words, $w_i$. The only difference between SHA-0 and SHA-1 is in this message expansion. SHA-1 has an added rotation of one bit, so the message expansion of SHA-1 looks as follows:

\[
w_i = \begin{cases}
m_i & i \le 15 \\
(w_{i-3} \oplus w_{i-8} \oplus w_{i-14} \oplus w_{i-16}) \ll 1 & i > 15.
\end{cases}
\]


We will use the term message word for $m_i$ and expanded message word when referring to $w_i$ through the rest of this chapter.

• In each step a constant $k_i$ is used that depends only on the current round.

Table 4.2. SHA-0 round constants

Round   Step i   k_i (hexadecimal)
1       1-20     5a827999
2       21-40    6ed9eba1
3       41-60    8f1bbcdc
4       61-80    ca62c1d6
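The pieces above (round functions, message expansion, round constants and the step update) can be put together in a short sketch. This is an illustrative implementation rather than a reference one. It uses the conventional five registers `a` to `e` and zero-based step numbers instead of the $Q_i$ indexing of this chapter, and the `rotate` parameter is an assumption added here so that the SHA-1 variant of the expansion can be compared against Python's `hashlib`.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF
ROUND_CONSTANTS = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)
INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def round_function(step, b, c, d):
    """IF, XOR, MAJ, XOR for zero-based steps 0-19, 20-39, 40-59, 60-79."""
    if step < 20:
        return (b & c) | (~b & d & MASK32)
    if 40 <= step < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d

def expand(block, rotate):
    """Message expansion; rotate=0 gives SHA-0, rotate=1 gives SHA-1."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], rotate))
    return w

def compress(state, block, rotate=0):
    """One compression-function call with the Davies-Meyer feed-forward."""
    a, b, c, d, e = state
    for step, word in enumerate(expand(block, rotate)):
        k = ROUND_CONSTANTS[step // 20]
        t = (rotl(a, 5) + round_function(step, b, c, d) + e + word + k) & MASK32
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return tuple((x + y) & MASK32 for x, y in zip(state, (a, b, c, d, e)))

def digest(message, rotate=0):
    """Hex digest of `message` using the standard SHA padding."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    state = INITIAL_STATE
    for offset in range(0, len(padded), 64):
        state = compress(state, padded[offset:offset + 64], rotate)
    return struct.pack(">5I", *state).hex()

# Sanity check of the shared structure: with the one-bit rotation enabled
# the sketch should agree with hashlib's SHA-1.
print(digest(b"abc", rotate=1) == hashlib.sha1(b"abc").hexdigest())
print(digest(b"abc"))  # SHA-0 digest of "abc"
```

Keeping the rotation as a parameter makes the single difference between SHA-0 and SHA-1 explicit and gives an independent check of everything else in the sketch.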

4.3 Cryptanalysis

First SHA-0 and then SHA-1 have been considered the successors to MD5 and as such they have both been studied thoroughly. SHA-0 is barely used since SHA-1 is so similar and considered more secure. It has, however, remained of interest to cryptographers since it is an easier version of SHA-1. The change from SHA-0 to SHA-1 was puzzling since no theoretical attack had been found on SHA-0 when SHA-1 was released. NSA has not officially explained why the change to SHA-1 was necessary. SHA-1 is currently used in a large number of security protocols and applications, for example PGP, SSH and SSL.

The first attack against SHA-0 was published by Chabaud and Joux [2] in 1998 and had a complexity of about $2^{61}$ hash function computations, which is better than the brute-force method of about $2^{80}$. Later, Biham and Chen [3] improved upon the attack and reduced the complexity to about $2^{51}$ hashes, and using this attack, Joux found the first collision in 2004 after 80 000 CPU hours on a cluster of 256 Itanium processors. This attack was, however, not extendible to SHA-1.

In 2005 Xiaoyun Wang, Hongbo Yu and Yiqun Lisa Yin [1] published an attack on SHA-0 with a claimed complexity of about $2^{39}$ computations of the hash function. A year later Yusuke Naito, Yu Sasaki, Takeshi Shimoyama, Jun Yajima, Noboru Kunihiro and Kazuo Ohta [5] improved upon the attack by reducing the complexity to an estimated $2^{36}$ hashes by using a technique they call submarine modifications. Their implementation of the attack still took about 100 hours to find a collision.

The latest result is a paper by Stéphane Manuel and Thomas Peyrin that was presented at FSE08, Fast Software Encryption, a conference in Lausanne, Switzerland, in February 2008. They have developed an attack called a boomerang attack that can find a collision on SHA-0 in about one hour [8].

Wang’s attack on SHA-0 is interesting because they managed to extend the attack to SHA-1 and could publish an attack [6] with a complexity of about $2^{69}$. This was the first attack on SHA-1 with better complexity than a brute-force attack.


The attack was later improved to reduce the complexity to about $2^{63}$ iterations [9], but no collision on SHA-1 has yet been found. Still, because of the attack, SHA-1 is now considered broken and is not recommended by NIST to be used in new applications. NIST also concludes that existing applications and protocols that use SHA-1 should transition to the SHA-2 group of hash functions by 2010.

We describe the details of the attack by Wang et al. on SHA-0 in the next section.

4.4 Details of Wang’s Attack

Wang’s attack on SHA-0 uses the same principle as the attack on MD5. By using differential cryptanalysis, differences between two messages being hashed can be studied.

The general procedure follows the same three steps as the attack on MD5:

1. Calculate a path of differentials that the two messages that collide should follow. This path specifies the difference in the intermediate results in the calculations between the two messages in each of the 80 steps of the hash function.

2. Generate conditions that satisfy this path, i.e. conditions on $Q_i$ for $i = -4, \dots, 80$ that give the desired differences in each step.

3. Search for a collision using the path generated in step one.

The above steps are described in detail in the next three sections. The first section deals with step one and two, calculating a good path and generating conditions that help satisfy the path. The second section deals with searching for a collision given a path and the third section describes submarine modifications and other optimizations that can speed up the collision search.

We have successfully implemented an attack by following the steps above and we have listed two collisions in tables 4.12 and 4.13. One of these collisions is also listed with all intermediate values in section B.3 in the appendix.

4.4.1 Calculating a Path

All successful attacks against SHA-0 have made use of a concept called local collisions.

Local Collisions

Local collisions start when an expanded message word $w_i$ is used that differs in one or several bits, i.e. if a local collision starts in step $i$, $\Delta w_i \neq 0$. This difference will lead to $\Delta Q_i \neq 0$ but by introducing other differences to cancel it, we can make sure that $Q_{i+1}$ to $Q_{i+5}$ remain unaffected. So a difference can then be controlled and cancelled within six steps from its insertion independently of other local collisions. The six steps are as follows:

Figure 6: A local collision. A difference is introduced in $w_{i-1,j_0}$ and corrected by differences in $w_{i,j_0+5}$, $w_{i+1,j_0}$, $w_{i+2,j_0+30}$, $w_{i+3,j_0+30}$ and $w_{i+4,j_0+30}$ during the steps that compute $Q_i$ to $Q_{i+5}$.

• Step $i$. The expanded message word used at step $i$ differs in one bit $j_0$, so that $Q_{i,j} = Q'_{i,j}$ except for $j = j_0$. In this first step, we need to make sure that there is no carry effect, i.e. $Q_i$ and $Q'_i$ should not differ in bits other than those that $w_{i-1}$ and $w'_{i-1}$ differ in.

• Step $i+1$. The newly added difference in $Q_i$ should not spread to $Q_{i+1}$; this can be achieved by cancelling it with a difference in $w_{i,j_0+5}$. Note that bit positions shifted by a rotation are taken mod 32, so bit $j_0 + 5$ is still a bit from 0 to 31.

• Step $i+2$. The difference should not be cancelled by the round function; instead it should be cancelled with a difference in $w_{i+1,j_0}$.

• Step $i+3$. Same as the previous step, but in this step we need to cancel it with a difference in $w_{i+2,j_0+30}$ because the register containing the difference is now rotated 30 steps to the left.

• Step $i+4$. Same as the previous step except the difference is cancelled by $w_{i+3,j_0+30}$.

• Step $i+5$. The difference in $Q_i$ should be cancelled by a message difference, this time in $w_{i+4,j_0+30}$.

To summarize, a local collision consists of six differences in six consecutive expanded message words: bit $j_0$ in expanded message word $w_{i-1}$, bit $j_0 + 5$ in expanded message word $w_i$, bit $j_0$ in expanded message word $w_{i+1}$ and bit $j_0 + 30$ in expanded message words $w_{i+2}$, $w_{i+3}$ and $w_{i+4}$. If we control the round function and the carry effect, this will lead to a local collision, and combining these six-step collisions in an efficient way is the key to Wang’s attack.
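The six-word pattern can be tried out experimentally. The sketch below is an illustration under stated assumptions, not part of Wang's attack: it uses $j_0 = 1$, an XOR-round step with the round-two constant, conventional registers `a` to `e`, and random states and message words. It applies the six bit flips to one copy of the message words and counts how often the two states agree again after six steps. The flag `force_message_condition` is a hypothetical name; it forces bit 6 of the second word to be the complement of bit 1 of the first word, so that the sign of the correction matches.

```python
import random

MASK32 = 0xFFFFFFFF
K_ROUND2 = 0x6ED9EBA1
# Flipped bits in six consecutive expanded message words for j0 = 1:
# j0, j0 + 5, j0, j0 + 30, j0 + 30, j0 + 30.
PERTURBATION = (1 << 1, 1 << 6, 1 << 1, 1 << 31, 1 << 31, 1 << 31)

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def six_xor_steps(state, words):
    """Apply six consecutive XOR-round steps to a five-register state."""
    a, b, c, d, e = state
    for word in words:
        t = (rotl(a, 5) + (b ^ c ^ d) + e + word + K_ROUND2) & MASK32
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return a, b, c, d, e

def success_rate(trials, force_message_condition, seed=0):
    """Fraction of random trials in which the local collision closes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        state = tuple(rng.getrandbits(32) for _ in range(5))
        words = [rng.getrandbits(32) for _ in range(6)]
        if force_message_condition:
            bit1 = (words[0] >> 1) & 1
            words[1] = (words[1] & ~(1 << 6) & MASK32) | ((bit1 ^ 1) << 6)
        other = [w ^ m for w, m in zip(words, PERTURBATION)]
        if six_xor_steps(state, words) == six_xor_steps(state, other):
            hits += 1
    return hits / trials

print(success_rate(8000, False), success_rate(8000, True))
```

Without any control only a fraction of the trials succeed, because the carry and the signs of the corrections must come out right; fixing the relation between the two message bits removes one of these chance events, which is the kind of saving that precomputed message bits provide.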


Finding a Good Disturbance Vector

A disturbance vector is a vector of bits, numbered from 1 to 80, containing a one at position $k$ if a local collision should start at step $k$. It contains 80 elements, but because of the message expansion, any consecutive sixteen elements determine the other 64.

We need to define what properties make a certain disturbance vector better than another, to be able to find the best one for an efficient collision search. The vector we choose should lead to as few conditions as possible. However, conditions in the first 16 steps do not affect the complexity of the attack and there are efficient ways of dealing with conditions in the rest of round one, so conditions on bits in the first round do not affect the complexity of the search because they will be fulfilled automatically.

Local collisions in other rounds will lead to conditions that have to be fulfilled probabilistically. Table 4.3 summarizes the conditions necessary to deal with a local collision after step 20.

Table 4.3. SHA-0 First look at conditions in round 2-4

Step    Condition                               Rounds   Description
i       $Q_{i,j} = w_{i-1,j}$                   All      No carry
i + 1   $w_{i,j+5} \neq Q_{i,j}$                All      Cancellation
i + 2   $Q_{i-1,j+2} \neq Q_{i-2,j+2}$          MAJ      Difference not cancelled by round function
i + 2   $F_{i+2,j} \neq w_{i+1,j}$              All      Cancellation
i + 3   $Q_{i+1,j+30} \neq Q_{i-1,j}$           MAJ      Difference not cancelled by round function
i + 3   $F_{i+3,j+30} \neq w_{i+2,j+30}$        All      Cancellation
i + 4   $Q_{i+2,j+30} \neq Q_{i+1,j}$           MAJ      Difference not cancelled by round function
i + 4   $F_{i+4,j+30} \neq w_{i+3,j+30}$        All      Cancellation
i + 5   $Q_{i,j+30} \neq w_{i+4,j+30}$          All      Cancellation

Each local collision starting in round two or four, the XOR-rounds, requires six conditions, and those starting in round three require nine conditions, as can be seen in table 4.3. More conditions are required for each local collision in round three than in round two and four because the round function in round three, MAJ, is less predictable and we need to control the output. The number of conditions can be reduced by noting that several conditions are required to make sure that a cancellation occurs in bit $j_0 + 30$. These can be removed by fixing $j_0$, the bit we introduce a difference in, to bit one, which gives $j_0 + 30 = 31$. Since we are working mod $2^{32}$, there is no need to check for cancellation as there is no carry effect in bit 31. This reduces the number of conditions to three for the XOR-rounds, and six for the third round using the MAJ-round function.
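The claim that bit 31 needs no carry condition follows from arithmetic mod $2^{32}$: adding or subtracting $2^{31}$ flips exactly the top bit whatever the other bits are. A minimal sketch of this check, with the helper name `flipped_bits` chosen for this example:

```python
import random

MASK32 = 0xFFFFFFFF
BIT31 = 1 << 31

rng = random.Random(0)
for _ in range(1000):
    x = rng.getrandbits(32)
    # Adding or subtracting 2^31 modulo 2^32 flips bit 31 and nothing else.
    assert ((x + BIT31) & MASK32) == (x ^ BIT31)
    assert ((x - BIT31) & MASK32) == (x ^ BIT31)

def flipped_bits(x, bit):
    """Mask of the bits of x that change when 2^bit is added modulo 2^32."""
    return x ^ ((x + (1 << bit)) & MASK32)

# A lower bit can carry: adding 2^1 to a word whose bit 1 is already set
# changes bit 2 as well, whereas bit 31 never propagates further.
print(hex(flipped_bits(0x00000002, 1)), hex(flipped_bits(0xFFFFFFFF, 31)))
```

This is why a difference in bit 31 is cancelled by a message difference in bit 31 regardless of its sign.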

This can be reduced even further. The condition for step $i+1$ is to make sure that the difference introduced in $w_{i,j_0+5}$ cancels the difference in $Q_{i,j_0}$. But $Q_{i,j_0} = w_{i-1,j_0}$ from the condition for step $i$, so we are really interested in the relation between the two bits $w_{i-1,j_0}$ and $w_{i,j_0+5}$. We can precompute certain bits of message words that are not varied in the search for a collision, so adding conditions on expanded message words does not affect the complexity of the search.

In step $i+2$ two conditions are still needed for every local collision in round three, one to make sure that the difference is not hidden in the round function, and one more to make sure that the differences are cancelled by the difference in the expanded message word. These two conditions can be reduced to one condition by precomputation as in the paragraph above. We can change the condition $Q_{i-1,j_0+2} \neq Q_{i-2,j_0+2}$, which makes sure that a difference is not cancelled by the round function, to a condition that makes sure that the difference is not changed by the round function. This means that the output from the round function will differ in the same bits as $Q_{i,j_0}$ differs in, which differs in the same bits as $w_{i-1,j_0}$ from the condition in step $i$. The cancellation condition $F_{i+2,j_0} \neq w_{i+1,j_0}$ then becomes a relation between message bits, and we can precompute $w_{i-1,j_0} \neq w_{i+1,j_0}$ for every local collision in round three. By doing this we do not need the cancellation condition, $F_{i+2,j_0} \neq w_{i+1,j_0}$, for the local collisions in the third round.

The cancellation condition for the XOR-rounds cannot be removed by the same method because one condition is always required to make sure that the difference inserted to start the local collision is not flipped by the round function.

These three improvements have limited the number of conditions to four for each local collision in round three and two each for those in rounds two and four. These are detailed in table 4.4.

Table 4.4. SHA-0 Final conditions in round 2-4

Step    Condition                        Rounds   Description
i       $Q_{i,1} = w_{i-1,1}$            All      No carry
i + 2   $Q_{i-1,3} \neq Q_{i-2,3}$       MAJ      Output from round function same as input
i + 2   $F_{i+2,1} \neq w_{i+1,1}$       XOR      Cancellation
i + 3   $Q_{i+1,31} \neq Q_{i-1,1}$      MAJ      Difference not cancelled by round function
i + 4   $Q_{i+2,31} \neq Q_{i+1,1}$      MAJ      Difference not cancelled by round function

A good disturbance vector is a vector that minimizes the total number of conditions in rounds two, three and four. As we have shown above, local collisions starting in round two and four require two conditions and four conditions are required for the local collisions starting in round three. Thus we want to minimize $2a + 4b$ where $a$ is the number of local collisions starting in round two and four and $b$ is the number of local collisions starting in round three. In the end, we want a collision on the hash function, so there is also the requirement that no local collision should start after step 75 because $Q_{76}$, $Q_{77}$, $Q_{78}$, $Q_{79}$ and $Q_{80}$ are used as the output from the encryption function and those should not contain any differences for a collision.

To minimize the number of conditions, we have decided that all local collisions should only start in bit one of the expanded message words, and there are sixteen message words in total for each block. This means that there are only $2^{16}$ different disturbance vectors and we can easily search them all and find the best one according to our requirements. The disturbance vector used by Wang et al. can be seen in table 4.5.

Table 4.5. The disturbance vector used by Wang et al.

Round   Vector
1       0 1 1 1 1 0 0 1 0 1 0 0 1 0 1 0 1 0 0 0
2       0 1 1 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0
3       0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0
4       1 1 0 0 1 1 0 1 0 1 1 1 0 1 0 0 0 0 0 0

Each element in the vector that contains a one corresponds to starting a local collision, which requires six message differences. A local collision should start at step $i_0$ of the hash function if element $i_0$ of the disturbance vector is a one. For this to happen, we need a difference in $w_{i_0-1,1}$, a bit of the expanded message word used in step $i_0$. As described earlier, we also need five differences to cancel this induced difference so that a local collision can occur. This is done by inserting differences in bits $w_{i_0,6}$, $w_{i_0+1,1}$, $w_{i_0+2,31}$, $w_{i_0+3,31}$ and $w_{i_0+4,31}$. Because we are inserting a difference in both $w_{i_0-1,1}$ and $w_{i_0+1,1}$, these differences could cancel each other if there are two local collisions starting two steps apart. This is the case for steps 31 and 33, 33 and 35, 66 and 68, 68 and 70, 70 and 72, and 72 and 74. This means that there is no difference in bit one of the expanded message words $w_{32}$, $w_{34}$, $w_{67}$, $w_{69}$, $w_{71}$ and $w_{73}$; instead, the local collisions in those steps are started by not cancelling the differences from the local collision started two steps earlier. The conditions have to be altered slightly for these local collisions.

For example, there is a condition $F_{33,1} \neq w_{32,1}$ to ensure that the difference in the output from the round function, due to the local collision in step 31, is cancelled with the message difference in step 33. But since another local collision should start in step 33, there is no difference in $w_{32,1}$ and we need to make sure that no carry occurs instead. Thus we need to replace one of the two conditions in this step, $F_{33,1} \neq w_{32,1}$, with another condition $Q_{33,1} = F_{33,1}$. If this is fulfilled, no carry will occur in step 33. The reason that the other condition, $Q_{33,1} = w_{32,1}$, is still needed is that we have precomputed bits of certain expanded message words to be able to meet earlier conditions, and for this to work $Q_i$ must not differ in other bits than $w_{i-1}$.
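The pairs of local collisions that start two steps apart can be read off the disturbance vector mechanically. The following sketch regenerates the vector of table 4.5 from its first sixteen elements using the message-expansion recurrence and lists the overlapping pairs after round one. The function names are assumptions made for this example.

```python
# First sixteen elements of the disturbance vector in table 4.5.
FIRST_SIXTEEN = [0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]

def expand_vector(first_sixteen):
    """Extend 16 bits to 80 with the SHA-0 message-expansion recurrence."""
    x = list(first_sixteen)
    for k in range(16, 80):
        x.append(x[k - 3] ^ x[k - 8] ^ x[k - 14] ^ x[k - 16])
    return x  # x[k - 1] is element k of the vector (steps are 1-based)

def overlapping_starts(vector, first_step=21):
    """Pairs (k, k + 2) of local collisions that both start at or after first_step."""
    starts = {k for k in range(1, 81) if vector[k - 1]}
    return [(k, k + 2) for k in sorted(starts)
            if k >= first_step and k + 2 in starts]

def words_without_bit1_difference(vector, first_step=21):
    """Indices i of the words w_i whose bit-one differences cancel out."""
    return [second - 1 for _, second in overlapping_starts(vector, first_step)]

vector = expand_vector(FIRST_SIXTEEN)
print(overlapping_starts(vector))
print(words_without_bit1_difference(vector))
```

For each listed pair, the correction in bit one of the first collision and the starting difference of the second collision fall on the same expanded message word, so that word carries no difference in bit one.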

The local collision that starts in step 58 is special. Two of the conditions normally placed on local collisions starting in the third round can be removed since the collision starts so late in the third round that when the conditions are normally applied the round function of the fourth round is used instead. The two last conditions placed on local collisions that start in the third round are $Q_{i_0+1,31} \neq Q_{i_0-1,1}$ and $Q_{i_0+2,31} \neq Q_{i_0+1,1}$; these can be removed since in steps 58+3 and 58+4 the round function used is XOR. Note that no local collision in the second round begins late enough to be affected by the change of round function in step 40.

From the disturbance vector in table 4.5 we can see that there are only three local collisions in the third round, and one of them requires two fewer conditions than the others. There are also sixteen local collisions in round two and four. This gives a total of $16 \cdot 2 + 3 \cdot 4 - 2 = 42$ conditions. Each of these conditions will have to be fulfilled probabilistically. All conditions are listed in tables B.2 and B.3 in the appendix.
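The count of 42 conditions can be reproduced from the first sixteen elements of the vector. The sketch below is an illustration of the cost model described above, not the search program used for this thesis: it expands the vector, rejects vectors with a local collision starting after step 75, and evaluates $2a + 4b$ with the two-condition saving for a round-three collision starting at step 58 or later. The function names are assumptions, and applying the same saving of two to steps 59 and 60 is a simplification made here.

```python
# First sixteen elements of the disturbance vector in table 4.5.
FIRST_SIXTEEN = [0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]

def expand_vector(first_sixteen):
    """Extend 16 bits to 80 with the SHA-0 message-expansion recurrence."""
    x = list(first_sixteen)
    for k in range(16, 80):
        x.append(x[k - 3] ^ x[k - 8] ^ x[k - 14] ^ x[k - 16])
    return x  # x[k - 1] is element k of the vector (steps are 1-based)

def count_conditions(vector):
    """Number of probabilistic conditions in rounds two to four, or None."""
    starts = [k for k in range(21, 81) if vector[k - 1]]
    if any(k > 75 for k in starts):
        return None  # a local collision would still be open at step 80
    maj = [k for k in starts if 41 <= k <= 60]   # round three
    xor = [k for k in starts if k not in maj]    # rounds two and four
    total = 2 * len(xor) + 4 * len(maj)
    # Steps i+3 and i+4 fall in round four when i >= 58, saving two conditions.
    total -= 2 * sum(1 for k in maj if k >= 58)
    return total

vector = expand_vector(FIRST_SIXTEEN)
for round_index in range(4):
    row = vector[20 * round_index:20 * round_index + 20]
    print(round_index + 1, " ".join(str(bit) for bit in row))
print(count_conditions(vector))  # prints: 42
```

Running `count_conditions` over all $2^{16}$ choices of the first sixteen elements would give a ranking of candidate vectors under this simplified cost model.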

The First Round

As mentioned earlier with MD5, the first round is different from the later rounds. Conditions on the first sixteen steps do not affect the complexity of the search, and most of the conditions in the later steps of the round can be dealt with efficiently as we describe later in section 4.4.3. The conditions in round two, three and four are based on the assumption that there are no other differences propagating through the hash function, so we need to find a path through the first round that has cancelled all differences by the end of the round.

Wang’s path with necessary conditions is in table B.1 in the appendix. The path has, as far as we have understood, been derived by hand and required great skill and patience to complete. The carry effect and the round function have been used to propagate or hide bits when needed. Wang et al. note in their conclusion that

[...] because of the simple step update structure, certain properties of the Boolean function (x∧y)∨(¬x∧z) combined with the carry effect actually facilitate, rather than inhibit, differential attacks. [1]

Precomputing Bits of Message Words

With the chosen disturbance vector, there are a total of nineteen local collisions, starting in steps 22, 23, 27, 31, 33, 34 and 35 in round two, steps 42, 48 and 58 in round three and steps 61, 62, 65, 66, 68, 70, 71, 72 and 74 in round four. For each of these collisions the following condition is required on the expanded message words:

\[
w_{i_0-1,1} \neq w_{i_0,6}.
\]

Each of the three local collisions in the third round also has the following condition, which replaces the cancellation condition $F_{i_0+2,1} \neq w_{i_0+1,1}$ once the output of the round function follows $Q_{i_0,1} = w_{i_0-1,1}$:

\[
w_{i_0-1,1} \neq w_{i_0+1,1}.
\]

There are even more conditions due to the path from the first round that specify the desired difference for some message bits. For example, the difference in bit 1 of $w_2$ needs to be positive, i.e. $w_{2,1} = 0$, $w'_{2,1} = 1$. There are sixteen such conditions; these are detailed in table 4.6.

Table 4.6. Conditions on message words for pre-computation

        bit 1   bit 6
m0      1       0
m1      .       .
m2      0       0
m3      .       1
m4      .       1
m5      0       0
m6      1       .
m7      0       .
m8      .       1
m9      .       .
m10     .       0
m11     0       .
m12     0       .
m13     .       1
m14     .       .
m15     .       1

There are sixteen bits that are not fixed by the path from the first round. It is easy to compute the 2^16 alternatives that are valid and check which satisfy all the other requirements from the local collisions in rounds two, three and four. It turns out that only two sets of message words satisfy all the conditions, and they are detailed in table 4.7.

Wang et al. used option one while Naito et al. used option two.
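The enumeration described above can be sketched as follows. This is a minimal illustration, not the program used for the thesis. It assumes the standard SHA-0 message expansion (XOR of four earlier words, no rotation), so that bit 1 and bit 6 expand independently; it numbers steps from 1, so that step i0 uses w(i0−1); and it codes the round-three relation as an inequality, which is the form both options of table 4.7 satisfy under this indexing. The names BIT1, BIT6, expand, fillings and search are ours.

```python
from itertools import product

# Fixed bits from table 4.6, one entry per message word m0..m15 (None = free).
BIT1 = [1, None, 0, None, None, 0, 1, 0, None, None, None, 0, 0, None, None, None]
BIT6 = [0, None, 0, 1, 1, 0, None, None, 1, None, 0, None, None, 1, None, 1]

# Steps in which the nineteen local collisions start, as listed in the text.
ALL_STARTS = [22, 23, 27, 31, 33, 34, 35, 42, 48, 58,
              61, 62, 65, 66, 68, 70, 71, 72, 74]
ROUND3_STARTS = [42, 48, 58]


def expand(bits):
    """Expand one bit position of m0..m15 to w0..w79 (assumed SHA-0 rule)."""
    w = list(bits)
    for i in range(16, 80):
        w.append(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16])
    return w


def fillings(column):
    """Yield every way of filling the free (None) entries of a column."""
    free = [i for i, v in enumerate(column) if v is None]
    for values in product((0, 1), repeat=len(free)):
        filled = list(column)
        for i, v in zip(free, values):
            filled[i] = v
        yield tuple(filled)


def search():
    """Return all (bit-1 column, bit-6 column) pairs meeting the conditions."""
    # The round-three relation only involves bit 1, so filter on it first.
    bit1_ok = [(c, expand(c)) for c in fillings(BIT1)]
    bit1_ok = [(c, a) for c, a in bit1_ok
               if all(a[s - 1] != a[s + 1] for s in ROUND3_STARTS)]
    bit6_all = [(c, expand(c)) for c in fillings(BIT6)]
    return [(c1, c6) for c1, a in bit1_ok for c6, b in bit6_all
            if all(a[s - 1] != b[s] for s in ALL_STARTS)]
```

Under these assumptions, both options of table 4.7 are among the patterns returned by search(). Splitting the sixteen free bits into a bit-1 part and a bit-6 part keeps the number of expansions small, since each column is expanded only once.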

4.4.2 Searching for a Collision

Unlike the attack on MD5, the attack on SHA-0 requires only one path through one block. The attack still uses two blocks because the path through the first round requires fourteen bits of Q0 to take certain values, and the initial value for Q0 for the first block does not satisfy the requirements for all the bits. We therefore require a first block that is identical for both messages, which is used to produce an input to the second block that does satisfy the fourteen conditions. Since there are only fourteen conditions, one in 2^14 randomly selected message blocks, M0, should generate an output state that satisfies the conditions, and it is easy to find one by trying random values for the first block.
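A random search for the first block could look like the sketch below. The fourteen conditions on Q0 are listed in table B.1 and are not repeated here, so the mask and value arguments are placeholders. We also assume that Q0 corresponds to the first word of the chaining value. The function names are ours.

```python
import random
import struct

MASK32 = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha0_compress(state, block):
    """One SHA-0 compression: like SHA-1 but without rotation in the expansion."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16])
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        t = (rotl(a, 5) + f + e + k + w[i]) & MASK32
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return tuple((s + x) & MASK32 for s, x in zip(state, (a, b, c, d, e)))


def find_first_block(mask, value, rng):
    """Try random blocks until the first chaining word equals value on mask."""
    trials = 0
    while True:
        trials += 1
        block = rng.getrandbits(512).to_bytes(64, "big")
        out = sha0_compress(IV, block)
        if out[0] & mask == value:
            return block, out, trials
```

With a mask that selects fourteen bits, the loop is expected to need about 2^14 = 16384 trials, which matches the estimate in the text.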

Table 4.7. The two options for message words

       Option 1        Option 2
word   bit 1   bit 6   bit 1   bit 6
m0     1       0       1       0
m1     1       1       1       1
m2     0       0       0       0
m3     0       1       0       1
m4     1       1       1       1
m5     0       0       0       0
m6     1       0       1       0
m7     0       1       0       1
m8     0       1       1       1
m9     1       0       1       1
m10    0       0       0       0
m11    0       0       0       0
m12    0       0       0       0
m13    0       1       0       1
m14    0       0       0       0
m15    0       1       0       1

After having found a suitable message block M0, we can search for a second block, M1, of message words that will lead to a collision. From the message precomputation in the last section, we have two bits already decided in each message word except m0, which also has a condition on bit 31 from the path. The number of conditions on Q1 to Q16 is higher than the number of conditions on m0 to m15, which can be seen in table B.1 in the appendix where all conditions for the first round are given. The most efficient way of finding suitable message words is therefore to generate Q1 to Q16 one at a time, see to it that each word satisfies all the conditions on it, and then calculate the corresponding message word and check if the two or three bits that are precomputed are also satisfied. This works because, just like for MD5, there is a one-to-one relation between the message word used in each step and the resulting value, Qi.

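To see this let us recall the step update function for SHA-0:

$$Q_i = (Q_{i-1} \lll 5) \boxplus F_i(Q_{i-2}, (Q_{i-3} \lll 30), (Q_{i-4} \lll 30)) \boxplus (Q_{i-5} \lll 30) \boxplus w_{i-1} \boxplus k_i,$$

which can be rewritten as:

$$w_{i-1} = Q_i \boxminus (Q_{i-1} \lll 5) \boxminus F_i(Q_{i-2}, (Q_{i-3} \lll 30), (Q_{i-4} \lll 30)) \boxminus (Q_{i-5} \lll 30) \boxminus k_i.$$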
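The one-to-one relation can be illustrated with a short sketch of the step update and its inverse for the first sixteen steps, where the round function is IF and the constant is 5a827999 (hexadecimal). The function names step and message_word are ours, and the tuple layout of the previous state words is an arbitrary choice for this example.

```python
import random

MASK32 = 0xFFFFFFFF
K_ROUND1 = 0x5A827999


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def f_if(x, y, z):
    """Round-one function: bitwise 'if x then y else z'."""
    return ((x & y) | (~x & z)) & MASK32


def step(prev, w):
    """prev = (Q[i-1], Q[i-2], Q[i-3], Q[i-4], Q[i-5]); returns Q[i]."""
    q1, q2, q3, q4, q5 = prev
    return (rotl(q1, 5) + f_if(q2, rotl(q3, 30), rotl(q4, 30))
            + rotl(q5, 30) + w + K_ROUND1) & MASK32


def message_word(q_new, prev):
    """Invert step(): the unique message word that maps prev to q_new."""
    q1, q2, q3, q4, q5 = prev
    return (q_new - rotl(q1, 5) - f_if(q2, rotl(q3, 30), rotl(q4, 30))
            - rotl(q5, 30) - K_ROUND1) & MASK32


# Choose the state word first, then derive the message word that produces it.
prev = tuple(random.getrandbits(32) for _ in range(5))
target_q = random.getrandbits(32)
w = message_word(target_q, prev)
assert step(prev, w) == target_q
```

This is the order of work described above: a value of Qi that satisfies its conditions is chosen first, and the message word follows from it.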

After a suitable message block has been found that satisfies all conditions in the first sixteen steps, we check if the expanded message words also satisfy all conditions for the rest of the path. This will most likely not be the case, and another suitable message block will have to be generated. Since there is a 50% chance that each condition in rounds two, three and four is satisfied, most message blocks that do not satisfy all of the conditions will fail relatively fast. Only one message block in sixteen will satisfy all conditions in steps 22 and 23, for example. To find a collision, many message blocks will probably have to be generated; the good news is that it is a fairly quick process to eliminate most of the blocks that do not lead to a collision.
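To make "fail relatively fast" concrete, here is a small worked estimate. It assumes, as the text does, that each remaining condition holds independently with probability 1/2, and that four conditions fall in steps 22 and 23, as the one-in-sixteen figure implies. The symbols n and T are introduced only for this illustration.

```latex
\Pr[\text{the first } n \text{ conditions hold}] = 2^{-n},
\qquad
\Pr[\text{all conditions in steps 22 and 23 hold}] = 2^{-4} = \tfrac{1}{16}.

% T = position of the first failing condition, so \Pr[T = t] = 2^{-t}
% (the negligible case in which every condition holds is ignored):
\mathbb{E}[T] \approx \sum_{t \ge 1} t \, 2^{-t} = 2 .
```

On average, a rejected block is therefore discarded after only about two condition checks.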

4.4.3 Optimizations

There are optimizations that can be done to speed up the search. Wang et al. proposed that only the first fourteen message words should be generated directly; the last two can be varied to speed up the search, so that if the current configuration fails we do not have to generate a whole new message block but only the last one or two message words.

There are three conditions on the steps after step sixteen in the first round that can be skipped with very high probability by easy modifications. They are conditions on bits Q17,1, Q17,31 and Q18,31. The idea is that since w15 has still only been used once, in step sixteen, when these conditions are tested, we can modify that message word and only recompute from step sixteen. If for example the condition on Q17,31 is not satisfied, instead of changing w16, which is computed from four message words, we change w15 in a way that flips Q16,26, which in turn flips Q17,31. This will not always work for all three conditions, but the success rate is very high because there are few other conditions that we can affect with this change.

Without the modification, at least one of the three conditions will fail in about seven out of eight cases, which is 7/8 = 87.5%. With the modifications we have found that at least one of the conditions fails about 3% of the time, which is a huge improvement, and we can basically ignore these conditions when calculating the complexity of the attack. The reason that the modifications do not always work is that the changes can cause a carry effect that cancels the changes in some situations.
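The two percentages can be related by a short calculation, assuming the three conditions are independent and each holds with probability 1/2 before any modification. The ratio in the second line is our own back-of-the-envelope reading of the measured 3% failure rate.

```latex
\Pr[\text{at least one of the three conditions fails}]
  = 1 - \left(\tfrac{1}{2}\right)^{3} = \tfrac{7}{8} = 87.5\% .

\frac{\Pr[\text{all three hold, with modifications}]}
     {\Pr[\text{all three hold, without}]}
  \approx \frac{0.97}{0.125} \approx 7.8 \approx 2^{3} .
```

The modifications thus recover almost the full factor of eight that the three conditions would otherwise cost.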

Submarine Modifications

Naito et al. improved upon the attack by adding submarine modifications that allowed them to skip the conditions for the first two local collisions in round two with almost 100% accuracy. The idea is to find a set of message bits that, when flipped, change one of the bits that are used to validate the condition that failed without changing any earlier bits that are affected by conditions. This requires that every condition is dealt with separately. The message expansion limits us to doing this only for conditions in the first few steps of the second round.

When a suitable set of message bits has been found, conditions have to be calculated that guarantee that the changes in the message bits from the first sixteen steps are cancelled by the end of the first round. This is because if they were allowed to propagate further, we would require conditions after the first sixteen steps to deal with them. The goal is to have the flips of the message bits cause as little change as possible in the first sixteen steps, and when the message words are used again due to the message expansion, they will cause changes in bits that caused the current condition to fail.

Submarine modifications are similar to the tunnels of MD5 in that they allow us to change expanded message words if a condition fails so that the conditions are valid without recomputing from the beginning of the hash function. The difference is that a submarine modification is derived specifically for a certain condition, while a tunnel is not and can be used for several conditions. Submarine modifications have a very high success rate compared to tunnels. This is because they are derived specifically for a certain condition. Using a tunnel does not automatically mean that a condition will suddenly be valid, but most tunnels have several bits to vary, and thus they will often find good expanded message words eventually. Klima states in his paper [12] that there probably are tunnels in other cryptographic hash functions besides MD5, among which he mentions SHA-0. Because of the more complicated message expansion, we assume that they are harder to find and may not be as effective as in MD5, and the submarine modifications can be seen as a specialized version of tunnels.

Submarine modifications are best explained by an example. One of the four submarine modifications deals with the condition Q23,1 = w22,1. The extra conditions needed for the submarine modification of this condition are Q11,15 = w10,15, w11,20 ≠ w10,15, w12,15 ≠ w10,15, Q12,13 = 0, Q13,13 = 1, w15,13 ≠ w10,15 and w19,20 ≠ w18,15. If these are satisfied and the condition fails, we can flip bits w10,15, w11,20, w12,15 and w15,13, and this will, with very high probability, flip the bits needed to satisfy the condition while keeping bits that are used for other conditions fixed. Let us see the mechanism behind the submarine modification by detailing each step that is affected by this modification.

• Step 11. The first step affected by the four message bits that are now flipped is step eleven, which uses w10. In this step, we would like to avoid a carry to make it easier to cancel this change as soon as possible. The extra condition Q11,15 = w10,15 makes sure that we do not have a carry.

• Step 12. We would like Q12 to remain unchanged since otherwise we have to cancel the change in Q12 later as well. To ensure this we also change bit w11,20, and to make sure that the two changed bits cancel each other instead of producing carries, we set the extra condition w11,20 ≠ w10,15.

• Step 13. In the same way as in step twelve, we cancel the flipped bit by introducing another flipped bit, this time w12,15, and make sure that they cancel each other by an extra condition, in this case w12,15 ≠ w10,15. Q11,15 is the flipped bit; it is used in the round function with Q10,17 and Q9,17. From the path we know that Q10,17 = 1 and Q9,17 = 0, so the result of the round function will be Q11,15, which is w10,15. Thus w12,15 ≠ w10,15 ensures that Q13 is not affected by the flipped bit in w10.

• Step 14. In this step we use the round function to hide the flipped bit. The round function for this step uses the words Q12, Q11 and Q10, of which the last two are rotated 30 bits to the left. The flipped bit that we wish to hide is Q11,15, and by using the extra condition Q12,13 = 0 the changed bit will not affect the output. For the bit of Q11 which is flipped, the round function looks like F14,13 = (Q12,13 ∧ Q11,15) ∨ (¬Q12,13 ∧ Q10,15), and since Q12,13 = 0 we see that Q11,15 will not matter for the output of the function.

• Step 15. This step is handled in the same way as the previous step; Q13,13 = 1 makes sure that the flipped bit is hidden by the round function.

• Step 16. This step is handled in the same way as steps twelve and thirteen. We flip another bit, bit w15,13 in this case, and make sure that the differences cancel each other by an extra condition, w15,13 ≠ w10,15. After this step, the flipped bit in w10 has been cancelled in round one and none of the flipped bits will be used again until step nineteen.

• From step 19. The message expansion will spread the flipped bits to other message words, and one of the changed bits will be w18,13, from w15,13. This leads to a difference in w18 of ±2^13 compared to before the submarine modification. This difference will spread to Q19,13, which in turn will affect Q20,18 → Q21,23 → Q22,28 → Q23,1, which is what we want. We can see that the other bit in the condition, w22,1, is not affected since we do not flip bit one in any word. Since Q23,1 will be changed, but not w22,1, we can see that the condition will pass if it failed before the modification.
The extra condition w19,20 ≠ w18,15 guarantees that bit Q20,20 is not affected by the submarine modification. This is done to increase the probability that the difference in Q20,18 is not cancelled before it affects Q23,1. It is still possible that this modification will not work: a carry from the other changed bits might interfere before Q23,1 is changed. Through experimentation, we have estimated this probability to be about 2.5%, which is equal to the probability that Naito et al. estimate in their paper.

Table 4.8. Submarine modifications

Used condition     Extra conditions     Flipped bits

Q22,1 = w21,1      Q11,21 = w10,21      w10,20
                   w11,25 ≠ w10,20      w11,25
                   Q10,22 = Q9,22       w15,18
                   Q12,18 = 0
                   Q13,18 = 1
                   w15,18 ≠ w10,20
                   w19,25 ≠ w18,20

Q23,1 = w22,1      Q11,15 = w10,15      w10,15
                   w11,20 ≠ w10,15      w11,20
                   w12,15 ≠ w10,15      w12,15
                   Q12,13 = 0           w15,13
                   Q13,13 = 1
                   w15,13 ≠ w10,15
                   w19,20 ≠ w18,15

F24,1 ≠ w23,1      Q6,5 = w5,5          w5,5
                   w6,10 ≠ w5,5         w6,10
                   w7,5 = w5,5          w7,5
                   Q7,3 = 0             w10,3
                   Q8,3 = 1
                   w10,3 ≠ w5,5

F25,1 ≠ w24,1      Q11,7 = w10,7        w10,7
                   w11,12 ≠ w10,7       w11,12
                   Q10,9 = Q9,9         w15,5
                   Q12,5 = 0
                   Q13,5 = 1
                   w15,5 ≠ w10,7
                   w19,12 ≠ w18,7
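Two of the mechanisms used in the walkthrough above can be checked mechanically: cancelling a flipped state bit with an oppositely valued message bit (step 12), and hiding a flipped bit behind the IF function (step 14). The sketch below isolates just those terms of the step update; it is an illustration with helper names of our own choosing, not a full implementation of the modification.

```python
import random

MASK32 = 0xFFFFFFFF


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def f_if(x, y, z):
    """Round-one function: bitwise 'if x then y else z'."""
    return ((x & y) | (~x & z)) & MASK32


def flip(x, n):
    """Return x with bit n inverted."""
    return x ^ (1 << n)


def bit(x, n):
    """Return bit n of x."""
    return (x >> n) & 1


def step12_sum_unchanged(q11, w11):
    """Flip Q11 bit 15 and w11 bit 20: is (Q11 <<< 5) + w11 unchanged?"""
    before = (rotl(q11, 5) + w11) & MASK32
    after = (rotl(flip(q11, 15), 5) + flip(w11, 20)) & MASK32
    return before == after


def step14_hides_flip(q12, q11, q10):
    """Flip Q11 bit 15: is IF(Q12, Q11 <<< 30, Q10 <<< 30) unchanged?"""
    before = f_if(q12, rotl(q11, 30), rotl(q10, 30))
    after = f_if(q12, rotl(flip(q11, 15), 30), rotl(q10, 30))
    return before == after


q12, q11, q10, w11 = (random.getrandbits(32) for _ in range(4))
# Cancellation needs opposite bit values; hiding needs Q12 bit 13 to be zero.
assert step12_sum_unchanged(q11, w11) == (bit(q11, 15) != bit(w11, 20))
assert step14_hides_flip(q12, q11, q10) == (bit(q12, 13) == 0)
```

The first check corresponds to the pair of conditions Q11,15 = w10,15 and w11,20 ≠ w10,15, which together force Q11,15 and w11,20 to differ; the second corresponds to Q12,13 = 0.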

As shown in the example, a submarine modification requires recomputation from the step where the earliest message word that has a changed bit in it is used. This is about ten to fifteen steps.

There are three other submarine modifications that can be used, and the principle is the same for all of them. The extra conditions are also very similar. To avoid repetition, detailed step-by-step guides to the other three modifications are omitted in this thesis. Instead, the four conditions are summarized in table 4.8.

Two conditions, Q22,1 = w21,1 and Q23,1 = w22,1, are only dependent on one step variable, Q22 and Q23 respectively. The other two are dependent on the output from the round function, which in turn is dependent on three step variables. This makes it harder to derive submarine modifications for the later conditions, since one or all of the bits that the condition depends on have to change value. If two or none of the bits change value, the changes will cancel each other. We have found that one of the modifications does not work because the wrong number of bits is usually changed; see section 4.5, Our Contributions, for more details.

Even with these optimizations, finding a collision takes a long time. Naito et al. have estimated the runtime of their algorithm to be about 100 hours to find a collision on a regular desktop computer.

4.5 Our Contributions

As with MD5, we feel that a large part of our contribution for SHA-0 is the details about the attacks that we have described in this thesis. Even though the attacks against SHA-0 are better explained than the attacks against MD5, we feel that there were still several details about the attacks that were missing.

While studying the attacks on SHA-0, we found aspects of the attack that could be improved. We developed a new version of the attack that, by our experiments, can find a collision faster than the attacks by Wang et al. and Naito et al.

4.5.1 Wang’s Sufficient Conditions are not Sufficient

The conditions derived for the path through the first round are claimed to be sufficient by Wang et al. in their paper. This is not the case. For the first sixteen steps, there are conditions on both the message words, mi−1, and the step variables, Qi. The message words are dependent on the step variables and vice versa, i.e. if either Qi or mi−1 is given, the other is also determined. Because of this, an attack has to generate candidates that satisfy the conditions for either the message words or the step variables for each step, then compute the other and check if the conditions for that word are satisfied as well. For most steps of the first round there are more conditions on Qi than on the corresponding message word, mi−1. The word Q3 for example has conditions on 25 of 32 bits, while w2 only has conditions on two bits. For step three it is therefore better to generate a Q3 that satisfies the 25 conditions and then compute w2 and check if both its conditions are satisfied. Each message word has at least two conditions on bits, since we have precomputed bits one and six, but some have many more. Submarine modifications add conditions on message words.
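The preference for generating Q3 first can be quantified with a rough estimate. It assumes that the bits of the derived word behave like independent fair coin flips, which is only an approximation, as the next paragraphs show.

```latex
\Pr[\,w_2 \text{ passes} \mid Q_3 \text{ chosen to satisfy its 25 conditions}\,] \approx 2^{-2},
\qquad
\Pr[\,Q_3 \text{ passes} \mid w_2 \text{ chosen to satisfy its 2 conditions}\,] \approx 2^{-25}.
```

Fixing the word with more conditions and testing the other one therefore needs about four candidates per step on average, instead of roughly 2^25.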

Through experimentation we have found that some randomized first blocks, while they satisfy all the conditions on the first block, cannot be used for generating a collision. This is because the conditions are insufficient for both the first and the second block.

Table 4.9. An example of a bad first block

M0   61346aef₁₆  53202d40₁₆  7ce62c03₁₆  63194809₁₆
     45004f5d₁₆  3a683871₁₆  0df52440₁₆  109211bd₁₆
     7d8e50ab₁₆  3f8c6bb4₁₆  2d4e6155₁₆  5a0508fc₁₆
     07a65d69₁₆  5a790734₁₆  4bdd5a01₁₆  36f674d0₁₆

An example of this is in table 4.9, where M0 satisfies all the conditions imposed on the first block, but there is no second block that can generate a collision since all 2^13 variations of Q1 for the second block fail the conditions for w0. In this case, the condition w0,1 = 1 is never fulfilled.

Recall that wi−1 can be calculated from Qi, Qi−1, Qi−2, Qi−3, Qi−4 and Qi−5 in the first sixteen steps by:

$$w_{i-1} = Q_i \boxminus (Q_{i-1} \lll 5) \boxminus \mathrm{IF}(Q_{i-2}, (Q_{i-3} \lll 30), (Q_{i-4} \lll 30)) \boxminus (Q_{i-5} \lll 30) \boxminus k_i.$$

For w0:

$$w_0 = Q_1 \boxminus (Q_0 \lll 5) \boxminus \mathrm{IF}(Q_{-1}, (Q_{-2} \lll 30), (Q_{-3} \lll 30)) \boxminus (Q_{-4} \lll 30) \boxminus \mathtt{5a827999}_{16}.$$

Given M0, only Q1 is not constant in the expression above; Q0, Q−1, Q−2, Q−3 and Q−4 have been determined by the first block:

$$w_0 = Q_1 \boxminus \mathtt{3e75b3b1}_{16} \boxminus \mathtt{c6c666ff}_{16} \boxminus \mathtt{01c3c6ad}_{16} \boxminus \mathtt{5a827999}_{16}.$$

Condition w0,1 = 1 is never fulfilled. We can see why by examining only the two least significant bits in each word:

$$w_{0,0\text{--}1} = Q_{1,0\text{--}1} \boxminus 01_2 \boxminus 11_2 \boxminus 01_2 \boxminus 01_2.$$

The four combinations of the two bits of Q1 give the result for w0 detailed in table 4.10.

Table 4.10. w0 as a result of variations of Q1

Q1   00₂   01₂   10₂   11₂
w0   10₂   11₂   00₂   01₂

The problem is that there is a condition Q1,1 = 1, which implies that only the last two columns are possible, and both of them result in w0,1 = 0, so the condition on w0,1 is not satisfied. The condition Q1,1 = 1 is required to make sure that the difference in bit one of w0 does not change when it is transferred to Q1,1.
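The small table above can be reproduced directly from the four constants given for this first block. The sketch relies on the fact that borrows in a subtraction only travel towards more significant bits, so the two lowest bits of w0 depend only on the two lowest bits of Q1 and of the constants. The names CONSTANTS and w0_low_bits are ours.

```python
# Constants determined by the first block M0 of table 4.9, as given in the text.
CONSTANTS = (0x3E75B3B1, 0xC6C666FF, 0x01C3C6AD, 0x5A827999)


def w0_low_bits(q1_low):
    """Two least significant bits of w0 for given low bits of Q1."""
    return (q1_low - sum(CONSTANTS)) & 0b11


table = {q1_low: w0_low_bits(q1_low) for q1_low in range(4)}

for q1_low, w0_low in table.items():
    print(f"Q1 = {q1_low:02b}  ->  w0 = {w0_low:02b}")

# With Q1,1 = 1 (Q1 low bits 10 or 11), bit 1 of w0 is always 0.
assert all((table[q] >> 1) & 1 == 0 for q in (0b10, 0b11))
```

The same two-bit reduction can be used as a cheap test to reject a first block before any search over the second block is started.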

This is an example of a situation where the conditions on message words and the conditions on Qi are mutually exclusive. It is not always this simple to find that the first block cannot lead to a collision; the problematic conditions can occur in later steps, which increases the search time dramatically. Often it is a combination of different conditions that, together with a bad first block, cannot all be satisfied, and thus it is very hard to calculate extra conditions to avoid these cases.

4.5.2 Generating First Fourteen Message Words of Block Two

Wang et al. recommend that to find a collision, one should start a search for a good second block by generating fourteen message words that satisfy the conditions for the first fourteen steps and then use the message words in steps fifteen and sixteen as free variables that are varied through the search. This is a good idea and improves performance over the naive method, which is to generate all sixteen message words, test if they lead to a collision and, if they do not, generate sixteen new words and so on.

We feel that the description by Wang et al. of how this should be done is lacking in detail. One approach is an exhaustive search through all possible values of Q1...Q14 that satisfy their respective conditions. This is a bad idea in situations where there are no possible messages due to a badly chosen first block, as described above.

Our solution for this problem is to not do an exhaustive search but instead generate message words one step at a time, and if we at some step fail to generate a message word, we restart the search by generating a new first block. This means that we will try at most one configuration of the first fourteen message words for every first block we generate. This is not an ideal solution, but our experiments have shown that it works better than an exhaustive search.

An exhaustive search can be optimized by noting an interesting property of the conditions on message words. As detailed earlier, there is a possibility that the first message block cannot lead to a collision. These situations can be discovered more efficiently using the following ideas. Since most message words only have two conditions, on bit one and bit six from the precomputation, it is possible to speed up the exhaustive search by first testing each combination of the bits up to and including bit six, and for those configurations for which this fails, there is no need to test the other bits.

For example, Q9 has eighteen free bits that we can vary. That is 2^18 combinations that have to be tried for an exhaustive search. But the only conditions on the message word m8 are on bits one and six, which are not affected by the bits of Q9 in more significant positions than six. We can therefore test the 2^6 combinations of the first seven bits (one condition, Q9,3 = 0, prevents us from varying bit three) and then check for which of these the two conditions are satisfied. For those combinations that satisfy both conditions on w8, we can vary the other thirteen bits, and skip the rest. This does reduce the complexity of the search, but not by enough to make it a better option than our previous suggestion, where we restart with a new first block if a step fails.
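The low-bits-first filter can be sketched as follows. The message word is obtained from Q by subtracting terms that are already fixed, so only the seven lowest bits matter for the conditions on bits one and six. The argument other_terms stands for the sum of those fixed terms and is a placeholder here; fixed_mask and fixed_value describe conditions on the low bits of Q itself, such as Q9,3 = 0. All names are ours.

```python
def low_patterns(other_terms, fixed_mask, fixed_value, want_bit1, want_bit6):
    """Low-7-bit patterns of Q whose message word has the wanted bits 1 and 6.

    The message word is w = (Q - other_terms) mod 2^32, so bits 0..6 of w
    depend only on bits 0..6 of Q and of other_terms.
    """
    keep = []
    for low in range(1 << 7):
        if low & fixed_mask != fixed_value:
            continue  # violates a condition on Q itself, e.g. Q9,3 = 0
        w_low = (low - other_terms) & 0x7F
        if (w_low >> 1) & 1 == want_bit1 and (w_low >> 6) & 1 == want_bit6:
            keep.append(low)
    return keep


# Example with an arbitrary placeholder constant and the condition Q9,3 = 0;
# the wanted bits follow table 4.6 only in the sense that m8 has bit 6 = 1.
survivors = low_patterns(0x12345678, 0b1000, 0, want_bit1=0, want_bit6=1)
print(len(survivors), "of 64 low-bit patterns remain")
```

Only the surviving patterns need to be extended with the remaining free bits; if the list is empty, the step cannot succeed for this first block at all.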

It is interesting to note that although the submarine modifications do speed up the search when the first message words have been found, they actually make the search for the first message words slower by adding more conditions in round one.

4.5.3 A Condition for the XOR-rounds is Insufficient

The conditions for local collisions in rounds two, three and four are given without any detail by Wang et al. We derived the conditions ourselves and found that one condition differed in their version compared to ours.

The local collisions in rounds two and four have one condition that guarantees that when the perturbation is used in the round function, XOR, for the first time, it is cancelled by the message difference in that step, and not by the round function. Our condition for this is Fi+2,1 ≠ wi+1,1, because two steps after a local collision is started, the difference is used in the round function for the first time, hence i + 2 as the index to the round function. The bit output from the round function should not be equal to the bit of the expanded message word because otherwise the result of the addition will be a carry instead of cancellation. Wang et al. use the condition Qi−1,3 = Qi−2,3 or Qi−1,3 ≠ Qi−2,3.

This is almost the same condition as is used for round three, when we want the bit to remain unchanged by the round function. Wang et al. clearly state in their paper that they only precompute the expanded message words to make the cancellation occur in round three. In rounds two and four the expanded message words are not precomputed to cancel the differences automatically, which is why we have extended Wang’s condition to make sure that a cancellation occurs.

We interpret Wang’s condition to mean that the condition is either Qi−1,3 = Qi−2,3 or Qi−1,3 ≠ Qi−2,3 depending on the values of the third bit in the round function, Q22,1, and the corresponding bit in the expanded message word, w23,1. The condition cannot be used by itself since one of the two statements is always true, so the correct one has to be chosen for each situation, and that choice depends on the values of Q22,1 and w23,1.

Since Wang’s condition in itself cannot be used directly, we prefer our version of the condition, which does not depend on other bits.
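The reason the two bits must differ can be written out as a small case analysis. Here ΔF and Δw denote the signed change in the sum when bit 1 of the round function output and bit 1 of the expanded message word are both flipped; this notation is introduced only for the illustration. A bit that goes from 0 to 1 contributes +2, and a bit that goes from 1 to 0 contributes −2.

```latex
\Delta F + \Delta w =
\begin{cases}
(+2) + (-2) = 0, & F_{i+2,1} = 0,\; w_{i+1,1} = 1,\\[2pt]
(-2) + (+2) = 0, & F_{i+2,1} = 1,\; w_{i+1,1} = 0,\\[2pt]
\pm 4, & F_{i+2,1} = w_{i+1,1}.
\end{cases}
```

In the last case the difference is not cancelled but moves into bit 2 as a carry, which is exactly what the condition rules out.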

4.5.4 One Submarine Modification is Incorrect

The submarine modifications are detailed in section 4.4.3, Optimizations, and are claimed to reduce the complexity of a collision search by a factor of 2^4 since four conditions can be skipped with very high probability. This seems to be incorrect. This has to do with the condition in rounds two and four that we found to be flawed. The submarine modifications are derived from Wang’s conditions, and two of them from the condition Qi−1,3 = Qi−2,3 or Qi−1,3 ≠ Qi−2,3.

This condition is incomplete, as we showed in the previous section, since it is dependent on other bits as well. The local collision starting in step 22 requires the condition Q21,3 = Q20,3 or Q21,3 ≠ Q20,3 according to Wang et al. to guarantee that the difference in bit Q22,1 does not affect Q24,1 in step 24.

The condition depends on Q22,1 and w23,1, as Q22,1 is used in the round function with Q21,3 and Q20,3, and w23,1 contains a difference that should cancel the difference in Q22,1. This is why we have derived our own version of the condition, F24,1 ≠ w23,1.

When Naito et al. derived their submarine modifications, they used the conditions of Wang et al., and their proofs that the modifications work are based upon those conditions. The submarine modification for condition Q21,3 = Q20,3 or Q21,3 ≠ Q20,3 is proven to only change Q21,3 and not Q20,3 with very high probability. This is because w18,3 is changed, which consequently flips Q19,3, Q20,8 and Q21,3. In the next step however, Q22,1 will also be affected by the change in Q19,3 since Q22,1 is calculated from

$$Q_{22,1} = Q_{21,28} \boxplus \mathrm{XOR}(Q_{20,1}, Q_{19,3}, Q_{18,3}) \boxplus Q_{17,3} \boxplus w_{21,1} \boxplus k_{22,1} \boxplus (\text{possible carry}).$$

This causes the modification for condition F24,1 ≠ w23,1 to fail almost every time: w23,1 is not affected, and the changes in Q21,3 and Q22,1 cancel each other.
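The cancellation is a parity effect of the XOR round function: the output bit changes only when an odd number of its three input bits are flipped. The following sketch checks this for every input combination; the function name is ours.

```python
from itertools import product


def xor_output_changes(flips):
    """True if flipping the marked inputs changes x ^ y ^ z.

    flips is a triple of 0/1 flags, one per input bit. The answer turns out
    to be the same for every choice of x, y and z, which the assert verifies.
    """
    results = set()
    for x, y, z in product((0, 1), repeat=3):
        fx, fy, fz = (v ^ f for v, f in zip((x, y, z), flips))
        results.add((x ^ y ^ z) != (fx ^ fy ^ fz))
    assert len(results) == 1
    return results.pop()


# Inputs of F24,1 in the order (Q22,1, Q21,3, Q20,3).
print(xor_output_changes((1, 1, 0)))  # both Q22,1 and Q21,3 flipped
print(xor_output_changes((0, 1, 0)))  # only Q21,3 flipped
```

Flipping both Q22,1 and Q21,3 leaves F24,1 unchanged, so a failed condition F24,1 ≠ w23,1 stays failed, whereas flipping a single input would have repaired it.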

There are two submarine modifications that are derived from the condition that we found to be insufficient in the previous section. One is the modification that we found to be flawed and have described in this section. The other is the submarine modification derived to deal with the condition F25,1 ≠ w24,1. In this case the change in bit w15,5 will result in a change in bit w18,5 by the message expansion, which in turn will change Q19,5, which will change Q22,3 in the same way that the change in Q19,3 changed Q22,1 with the previous modification. The change in Q19,5 will not change Q23,1 and neither will any of the other changes in message bits, and thus this message modification works as expected. Only one of the words used to evaluate the condition is changed, which means that if the condition failed the first time it will now be satisfied.

Our experiments have confirmed that three of the modifications work as expected, and the modification that we judged to be erroneous almost never works. We interpret this result not only as evidence that the modification is flawed, but also that the condition that Wang et al. have used, from which these submarine modifications are derived, is not correct, as we have stated earlier. In their paper, Naito et al. have proven that the modification does change the condition if it is expressed according to Wang et al. and an unlikely carry effect does not occur. If the condition is expressed as we have derived it, the proof does not work anymore since we also use Q22, which is changed as well, as described above.

We derived a new submarine modification as a replacement. Using the modification for condition F25,1 ≠ w24,1 as reference, we found that if the extra conditions Q5,7 = w4,7, w5,12 ≠ w4,7, Q4,9 = Q3,9, Q6,5 = 0, Q7,5 = 1 and w9,5 ≠ w4,7 are imposed, we can flip bits w4,7, w5,12 and w9,5 if the condition fails and thus fulfil the condition. As with the other submarine modifications, the extra conditions are derived to ensure that the changes in the bits of the message words are cancelled in the first sixteen steps.

• Step 5. The changed bit w4,7 causes a change in bit Q5,7. This change must not propagate to other bits through the carry effect, and this is controlled by the extra condition Q5,7 = w4,7. This condition ensures that a change in bit w4,7 only changes bit Q5,7.

• Step 6. In this step the changed bit Q5,7 is rotated to bit twelve for the calculation of Q6. To ensure that Q6 is not changed, bit w5,12 is flipped to cancel the change in Q5,7 in this step. The extra condition w5,12 ≠ w4,7 guarantees that the changes in the bits cancel each other.

• Step 7. The changed bit in Q5,7 is used in the round function in this step. The round function is IF and Q5 is used as the deciding word. Therefore, the extra condition Q4,9 = Q3,9 guarantees that no matter what the bit Q5,7 is, the result of the round function for that bit will be the same. Thus Q7,7 will not be affected by the change in Q5,7.

• Step 8 and 9. These two steps are handled in the same way. By making sure, with extra conditions Q6,5 = 0 and Q7,5 = 1, that Q5,7 will not influence the result of the round function in these steps, Q8 and Q9 will not be affected by the change in Q5,7.

• Step 10. This is the last step that the changed bit, Q5,7, can influence. The extra condition w9,5 ≠ w4,7 guarantees that the changes in Q5,7 and w9,5 cancel each other.

• After step 16. Because of the message expansion, the changed bits w4,7, w5,12 and w9,5 will affect bits after step sixteen. They will not affect w23,1, so to make sure that the condition F24,1 ≠ w23,1 is fulfilled, it is required that the changes affect either only one of the bits in F24,1 = Q22,1 ⊕ Q21,3 ⊕ Q20,3 or all three. Figure 7 shows the bits affected by the changed message bits. Note that more bits are probably affected due to the carry effect, but the bits shown in the figure are the ones changed directly. Several bits are not changed due to cancellation; these bits are shown with a line through them in the figure.

Figure 7: Changed bits by the new submarine modification
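The first level of this propagation, from a flipped message word to the expanded words that depend on it, can be computed directly. The sketch assumes the standard SHA-0 expansion, which XORs four earlier words without rotation, so a flipped bit keeps its bit position in every expanded word it reaches. It does not model the later spread through the step variables or any carries. The function name affected_words is ours.

```python
def affected_words(source, last=79):
    """Indices i >= 16 of expanded words w_i that depend on message word source.

    Assumes the SHA-0 rule w_i = w_(i-3) ^ w_(i-8) ^ w_(i-14) ^ w_(i-16).
    """
    dep = [i == source for i in range(16)]
    for i in range(16, last + 1):
        dep.append(dep[i - 3] ^ dep[i - 8] ^ dep[i - 14] ^ dep[i - 16])
    return [i for i in range(16, last + 1) if dep[i]]


# The three message words flipped by the new modification, up to w21.
for source, position in ((4, 7), (5, 12), (9, 5)):
    reached = affected_words(source, last=21)
    print(f"w{source},{position} reaches",
          ", ".join(f"w{i},{position}" for i in reached))
```

The same helper applied to message word 15 gives w18 as the first expanded word reached, which is the w15,13 to w18,13 link used in the example in section 4.4.3.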

The new submarine modification with necessary conditions is detailed along with the other submarine modifications in table 4.11. We have experimentally confirmed the correctness of all these modifications, and the estimated time to find a collision is considerably lower with this new modification.

The attack by Naito et al. is described better than the attack by Wang et al., but we still found details missing that we had to derive ourselves. It is possible that we have misunderstood something during this process that would make the modification work nonetheless. We do not think so, however, as our new version of Naito’s attack is faster on the tests we have run compared to our implementation of their attack.

In total, we have generated eight collisions with our new version of the attack.

4.5.5 Collisions on SHA-0

Using the knowledge acquired through the work described in this thesis, we wrote a program that could find a collision on SHA-0 in about 50 hours on our desktop computer. Our attack is about twice as fast as our implementation of the attack by Naito et al. Several collisions were found, and one for each of the two possibilities of precomputed message words is listed in tables 4.12 and 4.13. The collision using Wang’s message option is detailed step by step in table B.5 in the appendix.

Table 4.11. New Submarine modifications

Used condition     Extra conditions     Flipped bits

Q22,1 = w21,1      Q11,21 = w10,21      w10,20
                   w11,25 ≠ w10,20      w11,25
                   Q10,22 = Q9,22       w15,18
                   Q12,18 = 0
                   Q13,18 = 1
                   w15,18 ≠ w10,20
                   w19,25 ≠ w18,20

Q23,1 = w22,1      Q11,15 = w10,15      w10,15
                   w11,20 ≠ w10,15      w11,20
                   w12,15 ≠ w10,15      w12,15
                   Q12,13 = 0           w15,13
                   Q13,13 = 1
                   w15,13 ≠ w10,15
                   w19,20 ≠ w18,15

F24,1 ≠ w23,1      Q5,7 = w4,7          w4,7
                   w5,12 ≠ w4,7         w5,12
                   Q4,9 = Q3,9          w9,5
                   Q6,5 = 0
                   Q7,5 = 1
                   w9,5 ≠ w4,7

F25,1 ≠ w24,1      Q11,7 = w10,7        w10,7
                   w11,12 ≠ w10,7       w11,12
                   Q10,9 = Q9,9         w15,5
                   Q12,5 = 0
                   Q13,5 = 1
                   w15,5 ≠ w10,7
                   w19,12 ≠ w18,7

Table 4.12. A collision using Wang's message option (all words are hexadecimal)

$M_0$   f5fb42d2 427f1b28 3025a85d fcca64ca
        3c10f0d5 d97f6a7a 22f71832 4779c052
        a4bb3eb4 d97f6a7a 68da6137 c402d48d
        6d69d509 02924972 9100050f 07ba452d

$M_1$   0047af83 14f93df6 76cea104 4f2c1858
        583e5ac3 ac3b9c9d 44bede0b a6cd6fc0
        a387b460 497d5e02 609fd618 c1c3fc8c
        c9355925 beabe261 1c2c8a0d e2f0586d

$M'_1$  8047afc1 14f93df6 f6cea146 4f2c1818
        583e5a83 ac3b9cdf c4bede09 26cd6fc2
        a387b420 c97d5e02 e09fd658 41c3fc8e
        c9355927 3eabe221 9c2c8a0d 62f0582d

Table 4.13. A collision using Naito's message option (all words are hexadecimal)

$M_0$   c3112b4c dffc0f7d 5c59003d 56aeeb15
        d61d14dd 061b7801 68759105 fccdc5ea
        e3fad7fa fea4842a 6acb0cb9 a3c5cf9b
        64e74f87 c1874cd9 92967dd7 9ca164a8

$M_1$   00146d02 43ce3d63 fa0b3638 23b3df75
        92ae73cb 250c3bb4 3adeda0a dc5dece9
        a9e7a742 1c5d675e 909e4fb8 895e4e95
        b1e0f31c 4097d4c5 b86e38b1 780abfc1

$M'_1$  80146d40 43ce3d63 7a0b367a 23b3df35
        92ae738b 250c3bf6 badeda08 5c5deceb
        a9e7a702 9c5d675e 109e4ff8 095e4e97
        b1e0f31e c097d485 386e38b1 f80abf81

4.6 Conclusions

SHA-0 is structurally very similar to MD5, but its more complicated message expansion was believed to make it harder to attack. Even though the complexity of the attacks against SHA-0 is higher than that of the attacks against MD5, the first collision on SHA-0 was actually found before the first collision on MD5. The discovery of local collisions meant that attackers could focus on finding good disturbance vectors and paths through the first round and could in some sense ignore the message expansion.

When NIST released SHA-1, the new version of SHA-0, in 1995 after the NSA had found a weakness in its structure, it was again the message expansion that was targeted. The only difference between SHA-0 and SHA-1 is that the message expansion of SHA-1 also has a rotation of one bit to the left. This results in a much larger domain of possible perturbation vectors, as all bits in the message words have to be considered and not just bits one and six as in SHA-0.
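To make the difference between the two message expansions concrete, the following is a minimal Python sketch (it is not the program used in this thesis). It implements the SHA compression function with a flag, here called `rotate`, that switches the one-bit rotation in the message expansion on or off. All function names, including the helper `expansion_difference`, are our own illustrative choices. With the rotation enabled the result can be compared against `hashlib.sha1`; with it disabled, a single flipped message bit stays in the same bit position throughout the expansion.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF


def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def expand(block, rotate):
    """Expand a 64-byte block to 80 words; rotate=True gives SHA-1, False gives SHA-0."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        x = w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]
        w.append(rotl(x, 1) if rotate else x)
    return w


def compress(state, block, rotate):
    """One application of the compression function to a 64-byte block."""
    w = expand(block, rotate)
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = (rotl(a, 5) + f + e + k + w[i]) & MASK, a, rotl(b, 30), c, d
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))


def sha(data, rotate):
    """Hash `data` (bytes) and return a hex digest; rotate selects SHA-1 or SHA-0."""
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
    padded = data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
    padded += struct.pack(">Q", 8 * len(data))
    for off in range(0, len(padded), 64):
        state = compress(state, padded[off:off + 64], rotate)
    return struct.pack(">5I", *state).hex()


def expansion_difference(bit, rotate):
    """OR of all expanded-word differences caused by flipping one bit of w_0."""
    base = bytes(64)
    flipped = struct.pack(">I", 1 << bit) + bytes(60)
    acc = 0
    for x, y in zip(expand(base, rotate), expand(flipped, rotate)):
        acc |= x ^ y
    return acc
```

For example, `expansion_difference(1, False)` only ever touches bit 1, whereas `expansion_difference(1, True)` also touches other bit positions, which illustrates why the search space for perturbation vectors grows in SHA-1.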

No collision on SHA-1 has yet been found, but a theoretical attack with a heuristic complexity of $2^{63}$ has been described by Wang et al., and because of this neither SHA-0 nor SHA-1 is recommended for use in new applications. NIST also suggests that protocols and applications that use SHA-1 should migrate to the newer and safer SHA-2 by 2010 at the latest.

Even though the attack published by Wang et al. on SHA-0 was not the first to result in a collision, it is important for at least two reasons: it reduced the time to search for a collision from 80 000 to 800 hours, and the attack was extendible to SHA-1. The attack is, just like the attack on MD5, based on differential cryptanalysis and the construction of a path through the cryptographic hash function. In this case, the path was needed only for the first 20 steps, and the rest of the steps were handled by conditions on local collisions. Once the conditions for a general local collision have been established, as we did in section 4.4.1, all conditions for rounds two, three and four are given, while the conditions for all rounds of MD5 had to be derived separately.

Local collisions were first introduced by Florent Chabaud and Antoine Joux [2] and made it possible to produce the first collision on SHA-0. The key to the attack was the idea that each difference introduced by an expanded message word could be cancelled within six steps, almost independently, using other differences in other expanded message words. The difference between the attack by Chabaud and Joux, the first attack to use local collisions, and the attack by Wang et al. is a different disturbance vector and path through the first round. This allowed Wang et al. to reduce the complexity of the search from about $2^{61}$ calls to about $2^{39}$ calls to the cryptographic hash function.
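As a small sanity check of this structure, the sketch below XORs the second message blocks $M_1$ and $M'_1$ of table 4.12 word by word. The hexadecimal words are copied from that table; the variable names and the mask are our own. Under the usual description of a local collision in SHA-0 we would expect differences only in bit 1 (the disturbance and one correction), bit 6 (the correction after the rotation by 5) and bit 31 (the corrections after the rotation by 30), counting bits from 0.

```python
# Second message blocks of the collision in table 4.12 (Wang's message option).
M1 = [
    0x0047AF83, 0x14F93DF6, 0x76CEA104, 0x4F2C1858,
    0x583E5AC3, 0xAC3B9C9D, 0x44BEDE0B, 0xA6CD6FC0,
    0xA387B460, 0x497D5E02, 0x609FD618, 0xC1C3FC8C,
    0xC9355925, 0xBEABE261, 0x1C2C8A0D, 0xE2F0586D,
]
M1_PRIME = [
    0x8047AFC1, 0x14F93DF6, 0xF6CEA146, 0x4F2C1818,
    0x583E5A83, 0xAC3B9CDF, 0xC4BEDE09, 0x26CD6FC2,
    0xA387B420, 0xC97D5E02, 0xE09FD658, 0x41C3FC8E,
    0xC9355927, 0x3EABE221, 0x9C2C8A0D, 0x62F0582D,
]

# Bits 1, 6 and 31 (counting from 0): the only positions we expect to differ.
LOCAL_COLLISION_MASK = (1 << 1) | (1 << 6) | (1 << 31)

# Word-by-word XOR difference between the two blocks.
differences = [a ^ b for a, b in zip(M1, M1_PRIME)]

# Every difference must lie inside the mask.
assert all(d & ~LOCAL_COLLISION_MASK == 0 for d in differences)

for index, d in enumerate(differences):
    print(f"word {index:2d}: difference {d:08x}")
```

The check only looks at the message difference; it does not recompute the hash values, so it does not by itself confirm that the two messages collide.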

Improvements on the attack by Wang et al. have been made by Naito et al. [5], who published an attack with a claimed complexity of $2^{36}$ in which four of the conditions used in the first attack were removed. This last attack still requires about 100 hours of CPU time on a regular desktop computer, and the accuracy of the claimed complexity has been questioned. Thomas Peyrin [8] has estimated the complexity of their attack to be $2^{40.3}$ calls to the encryption function.

During our study of the attacks published by Wang et al. and Naito et al. we have found that one of the conditions derived by Wang et al. is not sufficient. This caused one of the submarine modifications derived by Naito et al., which are used to skip conditions, to fail. The attack by Naito et al. claims to be $2^4$ times faster than the attack by Wang et al. since four conditions have been removed. We believe that their attack is only $2^3$ times as fast, and with our new modification our program finds a collision faster than our implementation of the attack by Naito et al. On a regular desktop computer our program finds a collision in about 50 hours, which is twice as fast as Naito et al. report in their paper.
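The two speed-up factors can be read as a simple counting argument. As a sketch, assume (our simplification for illustration) that each skipped condition is an independent one-bit condition that a random candidate satisfies with probability $1/2$. Skipping $k$ such conditions then reduces the expected number of candidates by a factor of

```latex
\[
  \frac{1}{(1/2)^{k}} = 2^{k},
  \qquad\text{so}\qquad
  k = 4 \;\Rightarrow\; 2^{4} = 16,
  \qquad
  k = 3 \;\Rightarrow\; 2^{3} = 8 .
\]
```

Under this reading, a submarine modification that fails leaves only three conditions effectively skipped, which would account for the factor $2^3$ instead of the claimed $2^4$.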


Chapter 5

Conclusions and Results

5.1 Our Contributions

The contributions of this thesis have mostly been to explain the attacks launched by Wang et al. on MD5 and SHA-0. The result of the attack on MD5 was first published in 2004 [15] with no explanation of how the attack was carried out. Almost a year later, after requests had been made that Wang et al. describe the details of the attack, a second paper [14] was published describing the attack in some detail. This paper included the paths that had been derived through MD5 and some details about how a collision was found using the paths.

The attack Wang et al. published against SHA-0 was described in more detail but still lacked useful information about how the path was constructed, how the conditions were derived and how to best search for a collision. As we have shown in section 4.4.3, one of Wang's conditions on the local collisions can be expressed in a better way that can be used directly instead of requiring other bits besides the ones used in the condition.

In this thesis we have looked at how the paths through both hash functions have been derived and studied the details of the attacks thoroughly. One of the aims of this thesis has been to understand the attacks and describe them in more detail. On top of this we feel that we have made contributions by improving one of the conditions derived by Wang et al. for the local collisions in the second and fourth round of SHA-0. We have also shown that because the condition was not clearly expressed, one of the four submarine modifications introduced by Naito et al. does not work as well as expected. This modification was replaced by another that we have shown to work, and thereby we have lowered the estimated time to search for a collision to about 50 hours.

5.2 The Status of Cryptographic Hash Functions

Wang et al. managed to extend their attack on SHA-0 to SHA-1 and thereby publish the first attack on SHA-1 that is better than the brute force attack. This prompted NIST to initiate an open contest for the new cryptographic hash function, SHA-3, which will replace the SHA-2 family of hash functions.

The recommended cryptographic hash functions for use today are SHA-1 and the SHA-2 family. NIST recommends, however, that SHA-1 should not be used in new applications and protocols, as it should be replaced by SHA-2 by 2010.

No major theoretical attack has yet been published on SHA-2, but its structure is quite similar to SHA-0 and SHA-1. It uses the same message padding and is also built upon the Merkle-Damgård and Davies-Meyer construction schemes. The encryption function is also an iterative function, but two internal variables are updated in each step, and each step uses two round functions and two functions that rotate and XOR bits of the same word. At this time, NIST does not foresee any imminent attack on the SHA-2 family, but since they are the only cryptographic hash functions in the Secure Hash Standard that are not broken

[...] a successful collision attack on an algorithm in the SHA-2 family could have catastrophic effects for digital signatures. [19]

The SHA-2 family is considered secure, as the only attacks against these cryptographic hash functions are the brute-force attacks, but because of the relatively many calculations that are done in each step of the hash functions, they are quite slow compared to MD5 and SHA-1. Highly optimized reference implementations of SHA-1 and SHA-2 found at http://www.xyssl.org show that SHA-1 is about twice as fast as SHA-256. This is one of the reasons that SHA-1 is still used in several security applications, and one of the requirements of SHA-3 is that it should be faster than SHA-2.

The attacks on SHA-1 are not very useful in practice. No publicly known collision has yet been found, and even when one is found, it is very unlikely that it can be used in practice. When collisions on hash functions can be generated as quickly as they can be for MD5, then it is possible within a reasonable time to generate real documents that hash to the same value. Theoretically, SHA-1 has been broken, since the attack by Wang et al. is faster than the brute force attack, but practically it must still be considered secure.

The MD and SHA families have been the most frequently used cryptographic hash functions for many years, but there are other alternatives. Two of them are RIPEMD-160, designed as a project by the European Union, and Whirlpool. Neither of these is broken, so there are not even theoretical attacks against them, but they have also not been used as frequently as MD5 and SHA-1 and therefore have not been studied to the same extent.

Bibliography

[1] Xiaoyun Wang, Hongbo Yu and Yiqun Lisa Yin. Efficient Collision Search Attacks on SHA-0. CRYPTO 2005, Lecture Notes in Computer Science. 2005. Volume 3621. Pages 1-16.

[2] Florent Chabaud and Antoine Joux. Differential Collisions in SHA-0. CRYPTO '98, Lecture Notes in Computer Science. 1998. Volume 1462. Pages 56-71.

[3] Eli Biham and Rafi Chen. Near-Collisions of SHA-0. CRYPTO 2004, Lecture Notes in Computer Science. 2004. Volume 3152. Pages 290-305.

[4] Antoine Joux and Thomas Peyrin. Hash Functions and the (Amplified) Boomerang Attack. CRYPTO 2007, Lecture Notes in Computer Science. 2007. Volume 4622. Pages 244-263.

[5] Yusuke Naito, Yu Sasaki, Takeshi Shimoyama, Jun Yajima, Noboru Kunihiro and Kazuo Ohta. Improved Collision Search for SHA-0. ASIACRYPT 2006, Lecture Notes in Computer Science. 2006. Volume 4284. Pages 21-36.

[6] Xiaoyun Wang, Yiqun Lisa Yin and Hongbo Yu. Finding Collisions in the Full SHA-1. CRYPTO 2005, Lecture Notes in Computer Science. 2005. Volume 3621. Pages 17-36.

[7] Federal Information Processing Standards (FIPS) Publication 180-2. Secure Hash Standard (SHS). U.S. DoC/NIST. 2002.

[8] Thomas Peyrin and Stéphane Manuel. Collisions on SHA-0 in one hour. IPA Cryptographic Workshop. 2007. http://tpeyrin.no-ip.org/PresentationIPAWORKSHOP2007.pdf

[9] X. Wang, A. Yao and F. Yao. New Collision Search for SHA-1. Rump Session, CRYPTO 2005.

[10] Bill Burr. NIST Hash Function Standards, Status and Plans. 2005. http://csrc.nist.gov/groups/SMA/ispab/documents/minutes/2005-12/B_Burr-Dec2005-ISPAB.pdf

[11] Philip Hawkes, Michael Paddon and Gregory G. Rose. Musings on the Wang et al. MD5 Collision. Cryptology ePrint Archive, Report 2004/264. 2004. http://eprint.iacr.org/.

[12] Vlastimil Klima. Tunnels in Hash Functions: MD5 Collisions Within a Minute. Cryptology ePrint Archive, Report 2006/105. 2006. http://eprint.iacr.org/.

[13] John Black, Martin Cochran and Trevor Highland. A Study of the MD5 Attacks: Insights and Improvements. Fast Software Encryption. 2006. Pages 262-277. ISBN: 978-3-540-36597-6.

[14] Xiaoyun Wang and Hongbo Yu. How to Break MD5 and Other Hash Functions. EUROCRYPT 2005, Lecture Notes in Computer Science. 2005. Volume 3494. Pages 19-35.

[15] Xiaoyun Wang, Dengguo Feng, Xuejia Lai and Hongbo Yu. Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD. Cryptology ePrint Archive, Report 2004/199. 2004. http://eprint.iacr.org/.

[16] Hans Dobbertin. Cryptanalysis of MD4. Fast Software Encryption. 1996. Pages 53-69.

[17] Xiaoyun Wang, Xuejia Lai, Hui Chen and Xiuyuan Yu. Cryptanalysis of the Hash Functions MD4 and RIPEMD. EUROCRYPT 2005, Lecture Notes in Computer Science. 2005. Volume 3494. Pages 1-18.

[18] Bert den Boer and Antoon Bosselaers. An Attack on the Last Two Rounds of MD4. Lecture Notes in Computer Science. 1991. Volume 576. Pages 194-203.

[19] NIST. Announcing Request for Candidate Algorithm Nominations for a New Cryptographic Hash Algorithm (SHA-3) Family. Federal Register. 2007. Volume 72. Nr 212. http://csrc.nist.gov/groups/ST/hash/documents/FR_Notice_Nov07.pdf.

[20] Ivan Damgård. A Design Principle for Hash Functions. Lecture Notes in Computer Science. 1989. Volume 435. Pages 416-427.

[21] Frederic Muller. The MD2 Hash Function Is Not One-Way. Lecture Notes in Computer Science. 2004. Volume 3329. Pages 214-229.

[22] Lars R. Knudsen and John E. Mathiassen. Preimage and Collision Attacks on MD2. Lecture Notes in Computer Science. 2005. Volume 3557. Pages 255-267.

Appendix A

MD5

A.1 MD5 Step by Step

$Q_{-3} = \mathtt{67452301}_{16}$, $Q_{-2} = \mathtt{10325476}_{16}$, $Q_{-1} = \mathtt{98badcfe}_{16}$, $Q_{0} = \mathtt{efcdab89}_{16}$

A.1.1 Round 1

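\begin{align*}
Q_{1} &= Q_{0} \boxplus ((Q_{-3} \boxplus ((Q_{0} \wedge Q_{-1}) \vee (\neg Q_{0} \wedge Q_{-2})) \boxplus m_{0} \boxplus \mathtt{d76aa478}_{16}) \lll 7) \\
Q_{2} &= Q_{1} \boxplus ((Q_{-2} \boxplus ((Q_{1} \wedge Q_{0}) \vee (\neg Q_{1} \wedge Q_{-1})) \boxplus m_{1} \boxplus \mathtt{e8c7b756}_{16}) \lll 12) \\
Q_{3} &= Q_{2} \boxplus ((Q_{-1} \boxplus ((Q_{2} \wedge Q_{1}) \vee (\neg Q_{2} \wedge Q_{0})) \boxplus m_{2} \boxplus \mathtt{242070db}_{16}) \lll 17) \\
Q_{4} &= Q_{3} \boxplus ((Q_{0} \boxplus ((Q_{3} \wedge Q_{2}) \vee (\neg Q_{3} \wedge Q_{1})) \boxplus m_{3} \boxplus \mathtt{c1bdceee}_{16}) \lll 22) \\[1ex]
Q_{5} &= Q_{4} \boxplus ((Q_{1} \boxplus ((Q_{4} \wedge Q_{3}) \vee (\neg Q_{4} \wedge Q_{2})) \boxplus m_{4} \boxplus \mathtt{f57c0faf}_{16}) \lll 7) \\
Q_{6} &= Q_{5} \boxplus ((Q_{2} \boxplus ((Q_{5} \wedge Q_{4}) \vee (\neg Q_{5} \wedge Q_{3})) \boxplus m_{5} \boxplus \mathtt{4787c62a}_{16}) \lll 12) \\
Q_{7} &= Q_{6} \boxplus ((Q_{3} \boxplus ((Q_{6} \wedge Q_{5}) \vee (\neg Q_{6} \wedge Q_{4})) \boxplus m_{6} \boxplus \mathtt{a8304613}_{16}) \lll 17) \\
Q_{8} &= Q_{7} \boxplus ((Q_{4} \boxplus ((Q_{7} \wedge Q_{6}) \vee (\neg Q_{7} \wedge Q_{5})) \boxplus m_{7} \boxplus \mathtt{fd469501}_{16}) \lll 22) \\[1ex]
Q_{9} &= Q_{8} \boxplus ((Q_{5} \boxplus ((Q_{8} \wedge Q_{7}) \vee (\neg Q_{8} \wedge Q_{6})) \boxplus m_{8} \boxplus \mathtt{698098d8}_{16}) \lll 7) \\
Q_{10} &= Q_{9} \boxplus ((Q_{6} \boxplus ((Q_{9} \wedge Q_{8}) \vee (\neg Q_{9} \wedge Q_{7})) \boxplus m_{9} \boxplus \mathtt{8b44f7af}_{16}) \lll 12) \\
Q_{11} &= Q_{10} \boxplus ((Q_{7} \boxplus ((Q_{10} \wedge Q_{9}) \vee (\neg Q_{10} \wedge Q_{8})) \boxplus m_{10} \boxplus \mathtt{ffff5bb1}_{16}) \lll 17) \\
Q_{12} &= Q_{11} \boxplus ((Q_{8} \boxplus ((Q_{11} \wedge Q_{10}) \vee (\neg Q_{11} \wedge Q_{9})) \boxplus m_{11} \boxplus \mathtt{895cd7be}_{16}) \lll 22) \\[1ex]
Q_{13} &= Q_{12} \boxplus ((Q_{9} \boxplus ((Q_{12} \wedge Q_{11}) \vee (\neg Q_{12} \wedge Q_{10})) \boxplus m_{12} \boxplus \mathtt{6b901122}_{16}) \lll 7) \\
Q_{14} &= Q_{13} \boxplus ((Q_{10} \boxplus ((Q_{13} \wedge Q_{12}) \vee (\neg Q_{13} \wedge Q_{11})) \boxplus m_{13} \boxplus \mathtt{fd987193}_{16}) \lll 12) \\
Q_{15} &= Q_{14} \boxplus ((Q_{11} \boxplus ((Q_{14} \wedge Q_{13}) \vee (\neg Q_{14} \wedge Q_{12})) \boxplus m_{14} \boxplus \mathtt{a679438e}_{16}) \lll 17) \\
Q_{16} &= Q_{15} \boxplus ((Q_{12} \boxplus ((Q_{15} \wedge Q_{14}) \vee (\neg Q_{15} \wedge Q_{13})) \boxplus m_{15} \boxplus \mathtt{49b40821}_{16}) \lll 22)
\end{align*}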

A.1.2 Round 2

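\begin{align*}
Q_{17} &= Q_{16} \boxplus ((Q_{13} \boxplus ((Q_{16} \wedge Q_{14}) \vee (Q_{15} \wedge \neg Q_{14})) \boxplus m_{1} \boxplus \mathtt{f61e2562}_{16}) \lll 5) \\
Q_{18} &= Q_{17} \boxplus ((Q_{14} \boxplus ((Q_{17} \wedge Q_{15}) \vee (Q_{16} \wedge \neg Q_{15})) \boxplus m_{6} \boxplus \mathtt{c040b340}_{16}) \lll 9) \\
Q_{19} &= Q_{18} \boxplus ((Q_{15} \boxplus ((Q_{18} \wedge Q_{16}) \vee (Q_{17} \wedge \neg Q_{16})) \boxplus m_{11} \boxplus \mathtt{265e5a51}_{16}) \lll 14) \\
Q_{20} &= Q_{19} \boxplus ((Q_{16} \boxplus ((Q_{19} \wedge Q_{17}) \vee (Q_{18} \wedge \neg Q_{17})) \boxplus m_{0} \boxplus \mathtt{e9b6c7aa}_{16}) \lll 20) \\[1ex]
Q_{21} &= Q_{20} \boxplus ((Q_{17} \boxplus ((Q_{20} \wedge Q_{18}) \vee (Q_{19} \wedge \neg Q_{18})) \boxplus m_{5} \boxplus \mathtt{d62f105d}_{16}) \lll 5) \\
Q_{22} &= Q_{21} \boxplus ((Q_{18} \boxplus ((Q_{21} \wedge Q_{19}) \vee (Q_{20} \wedge \neg Q_{19})) \boxplus m_{10} \boxplus \mathtt{02441453}_{16}) \lll 9) \\
Q_{23} &= Q_{22} \boxplus ((Q_{19} \boxplus ((Q_{22} \wedge Q_{20}) \vee (Q_{21} \wedge \neg Q_{20})) \boxplus m_{15} \boxplus \mathtt{d8a1e681}_{16}) \lll 14) \\
Q_{24} &= Q_{23} \boxplus ((Q_{20} \boxplus ((Q_{23} \wedge Q_{21}) \vee (Q_{22} \wedge \neg Q_{21})) \boxplus m_{4} \boxplus \mathtt{e7d3fbc8}_{16}) \lll 20) \\[1ex]
Q_{25} &= Q_{24} \boxplus ((Q_{21} \boxplus ((Q_{24} \wedge Q_{22}) \vee (Q_{23} \wedge \neg Q_{22})) \boxplus m_{9} \boxplus \mathtt{21e1cde6}_{16}) \lll 5) \\
Q_{26} &= Q_{25} \boxplus ((Q_{22} \boxplus ((Q_{25} \wedge Q_{23}) \vee (Q_{24} \wedge \neg Q_{23})) \boxplus m_{14} \boxplus \mathtt{c33707d6}_{16}) \lll 9) \\
Q_{27} &= Q_{26} \boxplus ((Q_{23} \boxplus ((Q_{26} \wedge Q_{24}) \vee (Q_{25} \wedge \neg Q_{24})) \boxplus m_{3} \boxplus \mathtt{f4d50d87}_{16}) \lll 14) \\
Q_{28} &= Q_{27} \boxplus ((Q_{24} \boxplus ((Q_{27} \wedge Q_{25}) \vee (Q_{26} \wedge \neg Q_{25})) \boxplus m_{8} \boxplus \mathtt{455a14ed}_{16}) \lll 20) \\[1ex]
Q_{29} &= Q_{28} \boxplus ((Q_{25} \boxplus ((Q_{28} \wedge Q_{26}) \vee (Q_{27} \wedge \neg Q_{26})) \boxplus m_{13} \boxplus \mathtt{a9e3e905}_{16}) \lll 5) \\
Q_{30} &= Q_{29} \boxplus ((Q_{26} \boxplus ((Q_{29} \wedge Q_{27}) \vee (Q_{28} \wedge \neg Q_{27})) \boxplus m_{2} \boxplus \mathtt{fcefa3f8}_{16}) \lll 9) \\
Q_{31} &= Q_{30} \boxplus ((Q_{27} \boxplus ((Q_{30} \wedge Q_{28}) \vee (Q_{29} \wedge \neg Q_{28})) \boxplus m_{7} \boxplus \mathtt{676f02d9}_{16}) \lll 14) \\
Q_{32} &= Q_{31} \boxplus ((Q_{28} \boxplus ((Q_{31} \wedge Q_{29}) \vee (Q_{30} \wedge \neg Q_{29})) \boxplus m_{12} \boxplus \mathtt{8d2a4c8a}_{16}) \lll 20)
\end{align*}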

A.1.3 Round 3

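\begin{align*}
Q_{33} &= Q_{32} \boxplus ((Q_{29} \boxplus (Q_{32} \oplus Q_{31} \oplus Q_{30}) \boxplus m_{5} \boxplus \mathtt{fffa3942}_{16}) \lll 4) \\
Q_{34} &= Q_{33} \boxplus ((Q_{30} \boxplus (Q_{33} \oplus Q_{32} \oplus Q_{31}) \boxplus m_{8} \boxplus \mathtt{8771f681}_{16}) \lll 11) \\
Q_{35} &= Q_{34} \boxplus ((Q_{31} \boxplus (Q_{34} \oplus Q_{33} \oplus Q_{32}) \boxplus m_{11} \boxplus \mathtt{6d9d6122}_{16}) \lll 16) \\
Q_{36} &= Q_{35} \boxplus ((Q_{32} \boxplus (Q_{35} \oplus Q_{34} \oplus Q_{33}) \boxplus m_{14} \boxplus \mathtt{fde5380c}_{16}) \lll 23) \\[1ex]
Q_{37} &= Q_{36} \boxplus ((Q_{33} \boxplus (Q_{36} \oplus Q_{35} \oplus Q_{34}) \boxplus m_{1} \boxplus \mathtt{a4beea44}_{16}) \lll 4) \\
Q_{38} &= Q_{37} \boxplus ((Q_{34} \boxplus (Q_{37} \oplus Q_{36} \oplus Q_{35}) \boxplus m_{4} \boxplus \mathtt{4bdecfa9}_{16}) \lll 11) \\
Q_{39} &= Q_{38} \boxplus ((Q_{35} \boxplus (Q_{38} \oplus Q_{37} \oplus Q_{36}) \boxplus m_{7} \boxplus \mathtt{f6bb4b60}_{16}) \lll 16) \\
Q_{40} &= Q_{39} \boxplus ((Q_{36} \boxplus (Q_{39} \oplus Q_{38} \oplus Q_{37}) \boxplus m_{10} \boxplus \mathtt{bebfbc70}_{16}) \lll 23) \\[1ex]
Q_{41} &= Q_{40} \boxplus ((Q_{37} \boxplus (Q_{40} \oplus Q_{39} \oplus Q_{38}) \boxplus m_{13} \boxplus \mathtt{289b7ec6}_{16}) \lll 4) \\
Q_{42} &= Q_{41} \boxplus ((Q_{38} \boxplus (Q_{41} \oplus Q_{40} \oplus Q_{39}) \boxplus m_{0} \boxplus \mathtt{eaa127fa}_{16}) \lll 11) \\
Q_{43} &= Q_{42} \boxplus ((Q_{39} \boxplus (Q_{42} \oplus Q_{41} \oplus Q_{40}) \boxplus m_{3} \boxplus \mathtt{d4ef3085}_{16}) \lll 16) \\
Q_{44} &= Q_{43} \boxplus ((Q_{40} \boxplus (Q_{43} \oplus Q_{42} \oplus Q_{41}) \boxplus m_{6} \boxplus \mathtt{04881d05}_{16}) \lll 23) \\[1ex]
Q_{45} &= Q_{44} \boxplus ((Q_{41} \boxplus (Q_{44} \oplus Q_{43} \oplus Q_{42}) \boxplus m_{9} \boxplus \mathtt{d9d4d039}_{16}) \lll 4) \\
Q_{46} &= Q_{45} \boxplus ((Q_{42} \boxplus (Q_{45} \oplus Q_{44} \oplus Q_{43}) \boxplus m_{12} \boxplus \mathtt{e6db99e5}_{16}) \lll 11) \\
Q_{47} &= Q_{46} \boxplus ((Q_{43} \boxplus (Q_{46} \oplus Q_{45} \oplus Q_{44}) \boxplus m_{15} \boxplus \mathtt{1fa27cf8}_{16}) \lll 16) \\
Q_{48} &= Q_{47} \boxplus ((Q_{44} \boxplus (Q_{47} \oplus Q_{46} \oplus Q_{45}) \boxplus m_{2} \boxplus \mathtt{c4ac5665}_{16}) \lll 23)
\end{align*}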

A.1.4 Round 4

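\begin{align*}
Q_{49} &= Q_{48} \boxplus ((Q_{45} \boxplus (Q_{47} \oplus (Q_{48} \vee \neg Q_{46})) \boxplus m_{0} \boxplus \mathtt{f4292244}_{16}) \lll 6) \\
Q_{50} &= Q_{49} \boxplus ((Q_{46} \boxplus (Q_{48} \oplus (Q_{49} \vee \neg Q_{47})) \boxplus m_{7} \boxplus \mathtt{432aff97}_{16}) \lll 10) \\
Q_{51} &= Q_{50} \boxplus ((Q_{47} \boxplus (Q_{49} \oplus (Q_{50} \vee \neg Q_{48})) \boxplus m_{14} \boxplus \mathtt{ab9423a7}_{16}) \lll 15) \\
Q_{52} &= Q_{51} \boxplus ((Q_{48} \boxplus (Q_{50} \oplus (Q_{51} \vee \neg Q_{49})) \boxplus m_{5} \boxplus \mathtt{fc93a039}_{16}) \lll 21) \\[1ex]
Q_{53} &= Q_{52} \boxplus ((Q_{49} \boxplus (Q_{51} \oplus (Q_{52} \vee \neg Q_{50})) \boxplus m_{12} \boxplus \mathtt{655b59c3}_{16}) \lll 6) \\
Q_{54} &= Q_{53} \boxplus ((Q_{50} \boxplus (Q_{52} \oplus (Q_{53} \vee \neg Q_{51})) \boxplus m_{3} \boxplus \mathtt{8f0ccc92}_{16}) \lll 10) \\
Q_{55} &= Q_{54} \boxplus ((Q_{51} \boxplus (Q_{53} \oplus (Q_{54} \vee \neg Q_{52})) \boxplus m_{10} \boxplus \mathtt{ffeff47d}_{16}) \lll 15) \\
Q_{56} &= Q_{55} \boxplus ((Q_{52} \boxplus (Q_{54} \oplus (Q_{55} \vee \neg Q_{53})) \boxplus m_{1} \boxplus \mathtt{85845dd1}_{16}) \lll 21) \\[1ex]
Q_{57} &= Q_{56} \boxplus ((Q_{53} \boxplus (Q_{55} \oplus (Q_{56} \vee \neg Q_{54})) \boxplus m_{8} \boxplus \mathtt{6fa87e4f}_{16}) \lll 6) \\
Q_{58} &= Q_{57} \boxplus ((Q_{54} \boxplus (Q_{56} \oplus (Q_{57} \vee \neg Q_{55})) \boxplus m_{15} \boxplus \mathtt{fe2ce6e0}_{16}) \lll 10) \\
Q_{59} &= Q_{58} \boxplus ((Q_{55} \boxplus (Q_{57} \oplus (Q_{58} \vee \neg Q_{56})) \boxplus m_{6} \boxplus \mathtt{a3014314}_{16}) \lll 15) \\
Q_{60} &= Q_{59} \boxplus ((Q_{56} \boxplus (Q_{58} \oplus (Q_{59} \vee \neg Q_{57})) \boxplus m_{13} \boxplus \mathtt{4e0811a1}_{16}) \lll 21) \\[1ex]
Q_{61} &= Q_{60} \boxplus ((Q_{57} \boxplus (Q_{59} \oplus (Q_{60} \vee \neg Q_{58})) \boxplus m_{4} \boxplus \mathtt{f7537e82}_{16}) \lll 6) \\
Q_{62} &= Q_{61} \boxplus ((Q_{58} \boxplus (Q_{60} \oplus (Q_{61} \vee \neg Q_{59})) \boxplus m_{11} \boxplus \mathtt{bd3af235}_{16}) \lll 10) \\
Q_{63} &= Q_{62} \boxplus ((Q_{59} \boxplus (Q_{61} \oplus (Q_{62} \vee \neg Q_{60})) \boxplus m_{2} \boxplus \mathtt{2ad7d2bb}_{16}) \lll 15) \\
Q_{64} &= Q_{63} \boxplus ((Q_{60} \boxplus (Q_{62} \oplus (Q_{63} \vee \neg Q_{61})) \boxplus m_{9} \boxplus \mathtt{eb86d391}_{16}) \lll 21)
\end{align*}

Output:

\[
(Q_{0} \boxplus Q_{64},\; Q_{-1} \boxplus Q_{63},\; Q_{-2} \boxplus Q_{62},\; Q_{-3} \boxplus Q_{61})
\]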
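The step equations of this appendix translate almost line by line into code. The following Python sketch is an illustration only and not the implementation used for the attacks. It keeps the indexing $Q_{-3}, \dots, Q_{64}$ used above, stores the values in a list `q`, and compares the result with `hashlib.md5`. Two choices are ours: the step constants are generated from the sine function instead of being typed in, and the message word order is computed with index formulas instead of being listed. All function names are hypothetical.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

# Step constants floor(2^32 * |sin(i)|), i = 1..64 (generated, not copied from the listing).
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4


def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def msg_index(step):
    """Index of the message word used in step 0..63 (one formula per round)."""
    if step < 16:
        return step
    if step < 32:
        return (5 * step + 1) % 16
    if step < 48:
        return (3 * step + 5) % 16
    return (7 * step) % 16


def round_function(step, x, y, z):
    """Boolean function of the round, with x = Q_i, y = Q_{i-1}, z = Q_{i-2}."""
    if step < 16:
        return ((x & y) | (~x & z)) & MASK
    if step < 32:
        return ((x & z) | (y & ~z)) & MASK
    if step < 48:
        return x ^ y ^ z
    return (y ^ (x | ~z)) & MASK


def md5_compress(state, block):
    """Process one 64-byte block; state is the tuple (A, B, C, D)."""
    m = struct.unpack("<16I", block)
    a, b, c, d = state
    q = [a, d, c, b]  # Q_{-3}, Q_{-2}, Q_{-1}, Q_0
    for step in range(64):
        f = round_function(step, q[-1], q[-2], q[-3])
        t = (q[-4] + f + m[msg_index(step)] + K[step]) & MASK
        q.append((q[-1] + rotl(t, SHIFTS[step])) & MASK)
    # Feed-forward: Q_61, Q_64, Q_63, Q_62 are added to A, B, C, D.
    return ((a + q[-4]) & MASK, (b + q[-1]) & MASK, (c + q[-2]) & MASK, (d + q[-3]) & MASK)


def md5(data):
    """Hash `data` (bytes) and return the hex digest."""
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    padded = data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
    padded += struct.pack("<Q", 8 * len(data))
    for off in range(0, len(padded), 64):
        state = md5_compress(state, padded[off:off + 64])
    return struct.pack("<4I", *state).hex()
```

Keeping all intermediate values in one list makes it straightforward to inspect a particular $Q_i$ when checking bit conditions such as those in the tables of section A.2.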


A.2 Wang’s Path Through MD5

A.2.1 Block 1

Round 1

Table A.1. Wang’s path through MD5 - block 1 - round 1

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
1     $m_0$     7                    ........  ........  ........  ........
2     $m_1$     12                   ........  ........  ........  ........
3     $m_2$     17                   ........  ....0...  ....0...  .0......
4     $m_3$     22                   1.......  0^^^1^^^  ^^^^1^^^  ^0......
5     $m_4$     7      $2^{31}$      1...1.0.  01000000  00000000  001..1.1
6     $m_5$     12                   0^^^0^1^  01111111  10111100  010^^0^1
7     $m_6$     17                   00000011  11111110  11111000  00100000
8     $m_7$     22                   00000001  1..10001  0.0.0101  01000000
9     $m_8$     7                    11111011  ...10000  0.1^1111  00111101
10    $m_9$     12                   01......  0..11111  1.01...0  01....00
11    $m_{10}$  17                   00......  ....0001  1^01...0  11....10
12    $m_{11}$  22     $2^{15}$      00....^^  ....1000  0001...1  0.......
13    $m_{12}$  7                    01....01  ....1111  111....0  0...1...
14    $m_{13}$  12                   0.0...00  ....1011  111....1  1...1...
15    $m_{14}$  17     $2^{31}$      0.1...01  ........  1.......  ....0...
16    $m_{15}$  22                   0.1.....  ........  ........  ........
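Rows of these tables can be checked mechanically against computed values of $Q_i$. The sketch below is an illustration under an assumed reading of the symbols, which is not restated in this appendix: `.` places no condition on the bit, `0` and `1` fix its value, `^` requires the bit to equal the same bit of $Q_{i-1}$, and `!` requires it to differ from that bit. The function name `check_conditions` is hypothetical, and the letters I, J and K used in some tables are not handled.

```python
def check_conditions(pattern, q, q_prev):
    """Return True if the 32-bit word q satisfies the condition pattern.

    pattern: 32 symbols, optionally separated by spaces, leftmost symbol = bit 31.
    q_prev:  the previous chaining value, used by the '^' and '!' symbols.
    """
    symbols = pattern.replace(" ", "")
    if len(symbols) != 32:
        raise ValueError("pattern must describe exactly 32 bits")
    for pos, symbol in enumerate(symbols):
        shift = 31 - pos
        bit = (q >> shift) & 1
        prev = (q_prev >> shift) & 1
        if symbol == "." :
            continue  # no condition on this bit
        if symbol == "0" and bit != 0:
            return False
        if symbol == "1" and bit != 1:
            return False
        if symbol == "^" and bit != prev:  # assumed meaning: equal to previous word
            return False
        if symbol == "!" and bit == prev:  # assumed meaning: differs from previous word
            return False
        if symbol not in ".01^!":
            raise ValueError(f"unsupported symbol {symbol!r}")
    return True


# Row for step 4 of table A.1, checked against an example value with Q_3 = 0.
ROW_4 = "1....... 0^^^1^^^ ^^^^1^^^ ^0......"
print(check_conditions(ROW_4, 0x80080800, 0x00000000))
```

In a collision search such a check would typically be replaced by directly setting the conditioned bits, but a checker of this kind is a convenient way to validate a transcribed path.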


Round 2

Table A.2. Wang’s path through MD5 - block 1 - round 2

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
17    $m_1$     5                    0.......  ......0.  ^.......  ....^...
18    $m_6$     9                    0.^.....  ......1.  ........  ........
19    $m_{11}$  14     $2^{15}$      0.......  ......0.  ........  ........
20    $m_0$     20                   0.......  ........  ........  ........
21    $m_5$     5                    0.......  ......^.  ........  ........
22    $m_{10}$  9                    0.......  ........  ........  ........
23    $m_{15}$  14                   0.......  ........  ........  ........
24    $m_4$     20     $2^{31}$      !.......  ........  ........  ........
25    $m_9$     5                    ........  ........  ........  ........
26    $m_{14}$  9      $2^{31}$      ........  ........  ........  ........
27    $m_3$     14                   ........  ........  ........  ........
28    $m_8$     20                   ........  ........  ........  ........
29    $m_{13}$  5                    ........  ........  ........  ........
30    $m_2$     9                    ........  ........  ........  ........
31    $m_7$     14                   ........  ........  ........  ........
32    $m_{12}$  20                   ........  ........  ........  ........

Round 3

Table A.3. Wang’s path through MD5 - block 1 - round 3

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
33    $m_5$     4                    ........  ........  ........  ........
34    $m_8$     11                   ........  ........  ........  ........
35    $m_{11}$  16     $2^{15}$      ........  ........  ........  ........
36    $m_{14}$  23     $2^{31}$      ........  ........  ........  ........
37    $m_1$     4                    ........  ........  ........  ........
38    $m_4$     11     $2^{31}$      ........  ........  ........  ........
39    $m_7$     16                   ........  ........  ........  ........
40    $m_{10}$  23                   ........  ........  ........  ........
41    $m_{13}$  4                    ........  ........  ........  ........
42    $m_0$     11                   ........  ........  ........  ........
43    $m_3$     16                   ........  ........  ........  ........
44    $m_6$     23                   ........  ........  ........  ........
45    $m_9$     4                    ........  ........  ........  ........
46    $m_{12}$  11                   I.......  ........  ........  ........
47    $m_{15}$  16                   J.......  ........  ........  ........
48    $m_2$     23                   I.......  ........  ........  ........

Extra condition: No carry in step 35 when the message difference is used.

Round 4

Table A.4. Wang’s path through MD5 - block 1 - round 4

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
49    $m_0$     6                    J.......  ........  ........  ........
50    $m_7$     10                   K.......  ........  ........  ........
51    $m_{14}$  15     $2^{31}$      J.......  ........  ........  ........
52    $m_5$     21                   K.......  ........  ........  ........
53    $m_{12}$  6                    J.......  ........  ........  ........
54    $m_3$     10                   K.......  ........  ........  ........
55    $m_{10}$  15                   J.......  ........  ........  ........
56    $m_1$     21                   K.......  ........  ........  ........
57    $m_8$     6                    J.......  ........  ........  ........
58    $m_{15}$  10                   K.......  ........  ........  ........
59    $m_6$     15                   J.......  ........  ........  ........
60    $m_{13}$  21                   I.....0.  ........  ........  ........
61    $m_4$     6      $2^{31}$      J....01.  ........  ........  ........
62    $m_{11}$  10     $2^{15}$      I.......  ........  ........  ........
63    $m_2$     15                   J.......  ........  ........  ........
64    $m_9$     21                   ........  ........  ........  ........

Note: $I, J, K \in \{0, 1\}$, $I \neq K$.

Extra conditions:

Table A.5. Wang's path through MD5 - block 1 - round 4 - extra conditions

Conditions on bits of $Q_i$

                          31 - 24   23 - 16   15 - 8    7 - 0
$Q_{62} \boxplus Q_{-2}$  ......0.  ........  ........  ........
$Q_{63} \boxplus Q_{-1}$  ^....01.  ........  ........  ........
$Q_{64} \boxplus Q_{0}$   ^....00.  ........  ........  ..0.....

A.2.2 Block 2

Round 1

Table A.6. Wang’s path through MD5 - block 2 - round 1

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
1     $m_0$     7                    1...0.1.  0...1...  ....0...  ..0.....
2     $m_1$     12                   1^^^110.  ..0^^^^^  0..^1...  ^^0..0.0
3     $m_2$     17                   1011111.  ..011111  ...01..1  011^^11.
4     $m_3$     22                   1011101.  ..000000  ...00^^0  0001000^
5     $m_4$     7      $2^{31}$      010010..  ..101111  ...01110  01010000
6     $m_5$     12                   0..0010.  ..10..10  ...01100  01010110
7     $m_6$     17                   1..1011^  ^.00..01  ^..11110  00.....1
8     $m_7$     22                   1..00100  0.11..10  1.....11  11....^0
9     $m_8$     7                    1..11100  0.....01  0..^..011 11....01
10    $m_9$     12                   1....111  1....011  1..0..11  11....00
11    $m_{10}$  17                   1.......  ....^101  1^^0..11  11....11
12    $m_{11}$  22     $-2^{15}$     1^^^^^^^  ....1000  0001....  1.......
13    $m_{12}$  7                    00111111  ....1111  111.....  0...1...
14    $m_{13}$  12                   01000000  ....1011  111.....  1...1...
15    $m_{14}$  17     $2^{31}$      00111101  ........  0.......  ....0...
16    $m_{15}$  22                   0.1.....  ........  ........  ........

Round 2

Table A.7. Wang’s path through MD5 - block 2 - round 2

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
17    $m_1$     5                    0.......  ......0.  ^.......  ....^...
18    $m_6$     9                    0.^.....  ......1.  ........  ........
19    $m_{11}$  14     $-2^{15}$     0.......  ......0.  ........  ........
20    $m_0$     20                   0.......  ........  ........  ........
21    $m_5$     5                    0.......  ......^.  ........  ........
22    $m_{10}$  9                    0.......  ........  ........  ........
23    $m_{15}$  14                   0.......  ........  ........  ........
24    $m_4$     20     $2^{31}$      !.......  ........  ........  ........
25    $m_9$     5                    ........  ........  ........  ........
26    $m_{14}$  9      $2^{31}$      ........  ........  ........  ........
27    $m_3$     14                   ........  ........  ........  ........
28    $m_8$     20                   ........  ........  ........  ........
29    $m_{13}$  5                    ........  ........  ........  ........
30    $m_2$     9                    ........  ........  ........  ........
31    $m_7$     14                   ........  ........  ........  ........
32    $m_{12}$  20                   ........  ........  ........  ........

Round 3

Table A.8. Wang’s path through MD5 - block 2 - round 3

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
33    $m_5$     4                    ........  ........  ........  ........
34    $m_8$     11                   ........  ........  ........  ........
35    $m_{11}$  16     $-2^{15}$     ........  ........  ........  ........
36    $m_{14}$  23     $2^{31}$      ........  ........  ........  ........
37    $m_1$     4                    ........  ........  ........  ........
38    $m_4$     11     $2^{31}$      ........  ........  ........  ........
39    $m_7$     16                   ........  ........  ........  ........
40    $m_{10}$  23                   ........  ........  ........  ........
41    $m_{13}$  4                    ........  ........  ........  ........
42    $m_0$     11                   ........  ........  ........  ........
43    $m_3$     16                   ........  ........  ........  ........
44    $m_6$     23                   ........  ........  ........  ........
45    $m_9$     4                    ........  ........  ........  ........
46    $m_{12}$  11                   I.......  ........  ........  ........
47    $m_{15}$  16                   J.......  ........  ........  ........
48    $m_2$     23                   I.......  ........  ........  ........

Extra condition: No carry in step 35 when the message difference is used.

Round 4

Table A.9. Wang’s path through MD5 - block 2 - round 4

Conditions on bits of $Q_i$

Step  $m_i$     $r_i$  $\Delta m_i$  31 - 24   23 - 16   15 - 8    7 - 0
49    $m_0$     6                    J.......  ........  ........  ........
50    $m_7$     10                   K.......  ........  ........  ........
51    $m_{14}$  15     $2^{31}$      J.......  ........  ........  ........
52    $m_5$     21                   K.......  ........  ........  ........
53    $m_{12}$  6                    J.......  ........  ........  ........
54    $m_3$     10                   K.......  ........  ........  ........
55    $m_{10}$  15                   J.......  ........  ........  ........
56    $m_1$     21                   K.......  ........  ........  ........
57    $m_8$     6                    J.......  ........  ........  ........
58    $m_{15}$  10                   K.......  ........  ........  ........
59    $m_6$     15                   J.......  ........  ........  ........
60    $m_{13}$  21                   I.......  ........  ........  ........
61    $m_4$     6      $2^{31}$      J.....1.  ........  ........  ........
62    $m_{11}$  10     $-2^{15}$     I.....1.  ........  ........  ........
63    $m_2$     15                   J.....1.  ........  ........  ........
64    $m_9$     21                   ......1.  ........  ........  ........

Note: $I, J, K \in \{0, 1\}$, $I \neq K$.

Appendix B

SHA-0

B.1 Wang’s Path Through SHA-0

Table B.1. Wang’s Path Through SHA-0

Conditions on bits of $Q_i$

$Q_i$     31 - 24   23 - 16   15 - 8    7 - 0
$Q_0$     ........  1.1100.1  ..1..1^1  10..^.!.
$Q_1$     1.......  0^0011^0  ^^1.10^0  11....1.
$Q_2$     0.0.....  ..011111  111111.1  0010^...
$Q_3$     1.1..^^^  ^^0.0011  0010101.  .010100.
$Q_4$     1....^.0  11111000  ..1111.0  110011..
$Q_5$     0......0  .0001001  00100.0.  01.01...
$Q_6$     .......0  .1011110  010.100.  ...100.
$Q_7$     0......1  ^1011111  0100..00  0....10.
$Q_8$     ........  .1000000  00000.11  1...1...
$Q_9$     1.......  ...00000  0011001.  ....0...
$Q_{10}$  ........  ...11111  1111111.  ......0.
$Q_{11}$  0.......  ........  ......0.  ....1...
$Q_{12}$  0.......  ........  ........  0...0...
$Q_{13}$  ........  ........  ........  1...I.0.
$Q_{14}$  1.......  ........  ........  ....K...
$Q_{15}$  0.......  ........  ........  ....K.1.
$Q_{16}$  1.......  ........  ........  ....I...
$Q_{17}$  0.......  ........  ........  ......K.
$Q_{18}$  1.......  ........  ........  ........

Note: $I = 0$ if message option one is used and $I = 1$ in case of message option two. See section 3.4.1, Calculating a path, and table 4.7. $K \neq I$.

B.2 Final Conditions for SHA-0

Table B.2. Final conditions for SHA-0, round two and three

Nr Step Local collision Condition Description1 22 22 Q22,1 = w21,1 No carry2 23 23 Q23,1 = w22,1 No carry3 24 22 F24,1 6= w23,1 Cancellation4 25 23 F25,1 6= w24,1 Cancellation5 27 27 Q27,1 = w26,1 No carry6 29 27 F29,1 6= w28,1 Cancellation7 31 31 Q31,1 = w30,1 No carry8 33 31,33 Q33,1 = F33,1 No carry9 33 33 Q33,1 = w32,1 Fulfill precomputed message bits10 34 34 Q34,1 = w33,1 No carry11 35 33,35 Q35,1 = F35,1 No carry12 35 35 Q35,1 = w34,1 Fulfill precomputed message bits13 36 34 F36,1 6= w35,1 Cancellation14 37 35 F37,1 6= w36,1 Cancellation15 41 42 Q41,3 6= Q40,3 Output from MAJ same as input16 42 42 Q42,1 = w41,1 No carry17 43 42 Q43,31 6= Q41,1 Diff from MAJ18 44 42 Q44,31 6= Q43,1 Diff from MAJ19 47 48 Q47,3 6= Q46,3 Output from MAJ same as input20 48 48 Q48,1 = w47,1 No carry21 49 48 Q49,31 6= Q47,1 Diff from MAJ22 50 48 Q50,31 6= Q49,1 Diff from MAJ23 57 58 Q57,3 6= Q56,3 Output from MAJ same as input24 58 58 Q58,1 = w57,1 No carry


Table B.3. Final conditions for SHA-0, round four

Nr  Step  Local collision  Condition              Description
25  61    61               Q_{61,1} = w_{60,1}    No carry
26  62    62               Q_{62,1} = w_{61,1}    No carry
27  63    61               F_{63,1} ≠ w_{62,1}    Cancellation
28  64    62               F_{64,1} ≠ w_{63,1}    Cancellation
29  65    65               Q_{65,1} = w_{64,1}    No carry
30  66    66               Q_{66,1} = w_{65,1}    No carry
31  67    65               F_{67,1} ≠ w_{66,1}    Cancellation
32  68    66,68            Q_{68,1} = F_{68,1}    No carry
33  68    68               Q_{68,1} = w_{67,1}    Fulfill precomputed message bits
34  70    68,70            Q_{70,1} = F_{70,1}    No carry
35  70    70               Q_{70,1} = w_{69,1}    Fulfill precomputed message bits
36  71    71               Q_{71,1} = w_{70,1}    No carry
37  72    70,72            Q_{72,1} = F_{72,1}    No carry
38  72    72               Q_{72,1} = w_{71,1}    Fulfill precomputed message bits
39  73    71               F_{73,1} ≠ w_{72,1}    Cancellation
40  74    72,74            Q_{74,1} = F_{74,1}    No carry
41  74    74               Q_{74,1} = w_{74,1}    Fulfill precomputed message bits
42  76    74               F_{76,1} ≠ w_{76,1}    Cancellation
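Most of the "No carry" and "Fulfill precomputed message bits" rows compare bit 1 of a state word with bit 1 of a message word. A minimal sketch of checking a few such rows against the M1 values in section B.3 follows. It rests on two readings inferred from the listed values rather than stated in the text: the second subscript is a bit position counted from 0 at the least significant bit, and the w index in the conditions is one lower than the step number used in tables B.5 to B.8 (so w_{21} is the word listed at step 22). The names `bit`, `ROWS` and `no_carry_holds` are hypothetical.

```python
def bit(x: int, j: int) -> int:
    """Return bit j of x, with bit 0 the least significant bit."""
    return (x >> j) & 1


# step -> (Q at that step, w listed at that step) for M1, tables B.6 to B.8
ROWS = {
    22: (0x4A60D552, 0x18C2648E),
    23: (0x59D1B019, 0xB264C0A9),
    27: (0x1F84CE9D, 0xD344BF6C),
    42: (0x3254EB34, 0xF4C5195C),
    61: (0x9C70CA3E, 0xBFD0594A),
    71: (0xA546885A, 0xF7558AEA),
}


def no_carry_holds(step: int) -> bool:
    """Check Q_{step,1} = w_{step-1,1} under the indexing assumed above."""
    q, w = ROWS[step]
    return bit(q, 1) == bit(w, 1)


print(all(no_carry_holds(s) for s in ROWS))  # True
```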


B.3 A Step by Step Look at a Collision

This section details a collision on SHA-0, with intermediate results for every step of the hash function for M1. M0 and M1 are given in the table below.

Table B.4. The message blocks for the collision (32-bit words, hexadecimal).

M0  f5fb42d2 427f1b28 3025a85d fcca64ca
    3c10f0d5 d97f6a7a 22f71832 4779c052
    a4bb3eb4 d97f6a7a 68da6137 c402d48d
    6d69d509 02924972 9100050f 07ba452d

M1  0047af83 14f93df6 76cea104 4f2c1858
    583e5ac3 ac3b9c9d 44bede0b a6cd6fc0
    a387b460 497d5e02 609fd618 c1c3fc8c
    c9355925 beabe261 1c2c8a0d e2f0586d
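The 16 words of M1 are the message words of steps 1 to 16; the words for steps 17 to 80 follow from the SHA-0 message expansion, which XORs four earlier words and, unlike SHA-1, applies no rotation. A minimal sketch, in which the function name `expand` is hypothetical and list index 0 corresponds to step 1:

```python
M1 = [
    0x0047AF83, 0x14F93DF6, 0x76CEA104, 0x4F2C1858,
    0x583E5AC3, 0xAC3B9C9D, 0x44BEDE0B, 0xA6CD6FC0,
    0xA387B460, 0x497D5E02, 0x609FD618, 0xC1C3FC8C,
    0xC9355925, 0xBEABE261, 0x1C2C8A0D, 0xE2F0586D,
]


def expand(block):
    """Return the 80 expanded words; index 0 holds the word for step 1."""
    w = list(block)
    for t in range(16, 80):
        # SHA-0 expansion: plain XOR of four earlier words, no rotation.
        w.append(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16])
    return w


w = expand(M1)
print(f"{w[16]:08x}")  # word for step 17: 6ba55886
```

The printed value agrees with the entry for step 17 in the round 1 table of section B.3.1.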


B.3.1 Round 1

Table B.5. A collision using Wang’s message option, round 1 (all values hexadecimal)

Step  Q_i       Q′_i      δQ_i      w_i       w′_i      δw_i
1     ea4ee8ea  6a4ee928  8000003e  0047af83  8047afc1  8000003e
2     599fff2c  59a006dc  000007b0  14f93df6  14f93df6  00000000
3     f1932a29  f194202b  0000f602  76cea104  f6cea146  80000042
4     80f83ccc  81003dcc  00080100  4f2c1858  4f2c1818  ffffffc0
5     6e092568  6f092570  01000008  583e5ac3  583e5a83  ffffffc0
6     98de4908  98de870a  00003e02  ac3b9c9d  ac3b9cdf  00000042
7     01df4824  01de4826  ffff0002  44bede0b  c4bede09  7ffffffe
8     f240079e  f23fff9e  fffff800  a6cd6fc0  26cd6fc2  80000002
9     80603264  80603264  00000000  a387b460  a387b420  ffffffc0
10    cfffffcd  cfffffcf  00000002  497d5e02  c97d5e02  80000000
11    61719c19  61719e19  00000200  609fd618  e09fd658  80000040
12    7b09d805  7b09d805  00000000  c1c3fc8c  41c3fc8e  80000002
13    a2fc71e5  a2fc71e7  00000002  c9355925  c9355927  00000002
14    f1d30d3d  f1d30d3d  00000000  beabe261  3eabe221  7fffffc0
15    7fd1215a  7fd12158  fffffffe  1c2c8a0d  9c2c8a0d  80000000
16    fe86e294  fe86e294  00000000  e2f0586d  62f0582d  7fffffc0
17    7244b82a  7244b828  fffffffe  6ba55886  eba55886  80000000
18    b951ce60  b951ce60  00000000  0e84f1a1  0e84f1e1  00000040
19    6d817725  6d817725  00000000  ac9f75b2  2c9f75b0  7ffffffe
20    92c805f0  92c805f0  00000000  497120cf  497120cf  00000000
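Each Q_i in these tables can be recomputed from the five preceding values and the message word of the step. The sketch below assumes the standard SHA-0 step update with the state kept unrotated, so the 30-bit rotations are applied when older values are read. The names `rotl`, `f`, `K` and `step` are hypothetical. The example reproduces Q_6 from Q_1 to Q_5 and w_6 as listed in Table B.5.

```python
MASK = 0xFFFFFFFF
K = [0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6]  # one constant per round


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def f(i: int, b: int, c: int, d: int) -> int:
    """Round function for the 1-based step number i."""
    if i <= 20:
        return (b & c) | ((~b & MASK) & d)      # IF
    if 41 <= i <= 60:
        return (b & c) | (b & d) | (c & d)      # MAJ
    return b ^ c ^ d                            # XOR, rounds 2 and 4


def step(i: int, prev, w: int) -> int:
    """Return Q_i given prev = [Q_{i-5}, ..., Q_{i-1}] and message word w."""
    e, d, c, b, a = prev
    total = rotl(a, 5) + f(i, b, rotl(c, 30), rotl(d, 30)) + rotl(e, 30)
    return (total + w + K[(i - 1) // 20]) & MASK


q1_to_q5 = [0xEA4EE8EA, 0x599FFF2C, 0xF1932A29, 0x80F83CCC, 0x6E092568]
print(f"{step(6, q1_to_q5, 0xAC3B9C9D):08x}")  # 98de4908
```

The same function applies to the later rounds; only the round function and constant selected by the step number change.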


B.3.2 Round 2

Table B.6. A collision using Wang’s message option, round 2 (all values hexadecimal)

Step  Q_i       Q′_i      δQ_i      w_i       w′_i      δw_i
21    c1f1b95b  c1f1b95b  00000000  db312c4c  5b312c4c  80000000
22    4a60d552  4a60d550  fffffffe  18c2648e  98c2648c  7ffffffe
23    59d1b019  59d1b01b  00000002  b264c0a9  b264c0eb  00000042
24    79904cf0  79904cf0  00000000  d67145e3  d67145a1  ffffffbe
25    b14ad4b7  b14ad4b7  00000000  b07f5e70  307f5e72  80000002
26    7a8b9a15  7a8b9a15  00000000  345e9386  345e9386  00000000
27    1f84ce9d  1f84ce9f  00000002  d344bf6c  d344bf6e  00000002
28    c50bc7a0  c50bc7a0  00000000  86666052  06666012  7fffffc0
29    16a1fd0c  16a1fd0c  00000000  3a766ce2  3a766ce0  fffffffe
30    a391b6c0  a391b6c0  00000000  97dd61ee  17dd61ee  80000000
31    e3415bed  e3415bef  00000002  438b7270  c38b7272  80000002
32    b6d5563c  b6d5563c  00000000  007380cd  8073808d  7fffffc0
33    296d00e7  296d00e5  fffffffe  e09812aa  e09812aa  00000000
34    b92425d8  b92425da  00000002  30203098  b02030da  80000042
35    dce47e5a  dce47e58  fffffffe  a499665f  2499661f  7fffffc0
36    1a51751e  1a51751e  00000000  374d36b9  374d36fb  00000042
37    82b7cca7  82b7cca7  00000000  6303b09f  6303b09d  fffffffe
38    11474ba4  11474ba4  00000000  fdf726dc  7df726dc  80000000
39    f004d21f  f004d21f  00000000  76ddda10  76ddda10  00000000
40    1f8c1b36  1f8c1b36  00000000  815fe637  015fe637  80000000
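The δ columns hold differences modulo 2^32, δx = x′ − x, not XOR differences. This is why a flipped top bit combined with a decrease of 2 appears as 7ffffffe. A short sketch with hypothetical helper names that recomputes two entries of step 22 and shows the signed reading of a difference:

```python
MASK = 0xFFFFFFFF


def delta(x: int, x_prime: int) -> int:
    """Modular difference x' - x as an unsigned 32-bit value."""
    return (x_prime - x) & MASK


def signed(d: int) -> int:
    """Interpret a 32-bit difference as a signed integer."""
    return d - (1 << 32) if d & 0x80000000 else d


print(f"{delta(0x4A60D552, 0x4A60D550):08x}")  # δQ_22: fffffffe
print(f"{delta(0x18C2648E, 0x98C2648C):08x}")  # δw_22: 7ffffffe
print(signed(0xFFFFFFFE))                      # -2
```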


B.3.3 Round 3

Table B.7. A collision using Wang’s message option, round 3 (all values hexadecimal)

Step  Q_i       Q′_i      δQ_i      w_i       w′_i      δw_i
41    658e2879  658e2879  00000000  7e54d56a  7e54d56a  00000000
42    3254eb34  3254eb36  00000002  f4c5195c  f4c5195e  00000002
43    90826efe  90826efe  00000000  ccf453e6  ccf453a6  ffffffc0
44    07713dc0  07713dc0  00000000  58a2e26f  58a2e26d  fffffffe
45    0be55dd8  0be55dd8  00000000  ee3bb751  6e3bb751  80000000
46    1009d1e1  1009d1e1  00000000  a6ad9419  26ad9419  80000000
47    2c3bec68  2c3bec68  00000000  8d6c58a5  0d6c58a5  80000000
48    1acb1e43  1acb1e41  fffffffe  5f37e133  5f37e131  fffffffe
49    8b8f7ead  8b8f7ead  00000000  9cf83586  9cf835c6  00000040
50    8cd3b035  8cd3b035  00000000  7ec447d8  7ec447da  00000002
51    4d7c9bb2  4d7c9bb2  00000000  54596415  d4596415  80000000
52    1f92cd7c  1f92cd7c  00000000  0ee0c78c  8ee0c78c  80000000
53    30beada0  30beada0  00000000  85219a06  05219a06  80000000
54    ab486de0  ab486de0  00000000  8e5c30e7  8e5c30e7  00000000
55    fa629d3d  fa629d3d  00000000  8b059053  8b059053  00000000
56    2dc7bb4d  2dc7bb4d  00000000  af8c845e  af8c845e  00000000
57    1a5ef945  1a5ef945  00000000  a00483ed  a00483ed  00000000
58    6fa378ec  6fa378ee  00000002  59a62cb8  59a62cba  00000002
59    e1cfea24  e1cfea24  00000000  d91a04fc  d91a04bc  ffffffc0
60    e850d9af  e850d9af  00000000  50eb3217  50eb3215  fffffffe
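The message differences in steps 42 to 53 are two back-to-back instances of the six-step local collision pattern: a disturbance in bit 1, followed by corrections in bit 6, bit 1, and then bit 31 three times. The sketch below rebuilds that XOR pattern for the local collisions starting at steps 42 and 48 and compares it with w_i XOR w′_i from the round 3 table. The correction offsets are the standard ones for SHA-0 local collisions and are assumed here rather than taken from this appendix; `PATTERN`, `combine` and `TABLE` are hypothetical names.

```python
# XOR masks of one local collision, relative to its starting step.
PATTERN = [1 << 1, 1 << 6, 1 << 1, 1 << 31, 1 << 31, 1 << 31]


def combine(starts):
    """Superpose local collisions starting at the given steps (XOR masks)."""
    masks = {}
    for start in starts:
        for offset, mask in enumerate(PATTERN):
            masks[start + offset] = masks.get(start + offset, 0) ^ mask
    return masks


# step -> (w, w') for steps 42 to 53, copied from the round 3 table
TABLE = {
    42: (0xF4C5195C, 0xF4C5195E), 43: (0xCCF453E6, 0xCCF453A6),
    44: (0x58A2E26F, 0x58A2E26D), 45: (0xEE3BB751, 0x6E3BB751),
    46: (0xA6AD9419, 0x26AD9419), 47: (0x8D6C58A5, 0x0D6C58A5),
    48: (0x5F37E133, 0x5F37E131), 49: (0x9CF83586, 0x9CF835C6),
    50: (0x7EC447D8, 0x7EC447DA), 51: (0x54596415, 0xD4596415),
    52: (0x0EE0C78C, 0x8EE0C78C), 53: (0x85219A06, 0x05219A06),
}

observed = {s: w ^ wp for s, (w, wp) in TABLE.items()}
print(observed == combine([42, 48]))  # True
```

When local collisions overlap, as in round 4, the masks of the individual patterns are XORed together at the shared steps, which `combine` already handles.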


B.3.4 Round 4

Table B.8. A collision using Wang’s message option, round 4 (all values hexadecimal)

Step  Q_i       Q′_i      δQ_i      w_i       w′_i      δw_i
61    9c70ca3e  9c70ca3c  fffffffe  bfd0594a  3fd05948  7ffffffe
62    19bc0648  19bc064a  00000002  aedc4131  2edc4173  80000042
63    465e3f57  465e3f57  00000000  ca7acf67  4a7acf25  7fffffbe
64    450225f2  450225f2  00000000  31af7bff  b1af7bfd  7ffffffe
65    12625622  12625620  fffffffe  c679934f  c679934d  fffffffe
66    34bcc7e7  34bcc7e5  fffffffe  e3f8638b  e3f863c9  0000003e
67    f4ed91e9  f4ed91e9  00000000  39cd8110  b9cd8152  80000042
68    713fc02f  713fc02d  fffffffe  162e5633  962e5633  80000000
69    9301b76c  9301b76c  00000000  520c3094  520c30d4  00000040
70    271f4f34  271f4f36  00000002  b6c17498  b6c17498  00000000
71    a546885a  a5468858  fffffffe  f7558aea  f7558aa8  ffffffbe
72    6589e815  6589e817  00000002  9589e38d  1589e3cd  80000040
73    e9d74b9f  e9d74b9f  00000000  09a660c6  09a66084  ffffffbe
74    0d0d354a  0d0d3548  fffffffe  1de0f7ce  1de0f7ce  00000000
75    5a44119b  5a44119b  00000000  ca8e3f2b  4a8e3f6b  80000040
76    93041eb4  93041eb4  00000000  e1bf45d3  e1bf45d1  fffffffe
77    40e35d72  40e35d72  00000000  3a465177  3a465177  00000000
78    0ce17cd0  0ce17cd0  00000000  e33c717d  633c717d  80000000
79    1f66795a  1f66795a  00000000  1ae99311  9ae99311  80000000
80    ebbe4c93  ebbe4c93  00000000  7d98aa8e  7d98aa8e  00000000


TRITA-CSC-E 2008:111 ISRN-KTH/CSC/E--08/111--SE

ISSN-1653-5715

www.kth.se

