
UNIVERSITY OF NOVI SAD

DEPARTMENT OF POWER, ELECTRONICS AND TELECOMMUNICATIONS

MASTER’S THESIS

Provable Security Analysis of SHA-3 Candidates

Marjan Skrobot

Promoters :

Prof. dr. ir. Bart Preneel

Prof. dr. ir. Vincent Rijmen

Supervisors :

Elena Andreeva, PhD

Bart Mennink

June, 2012

Abstract

Hash functions are fundamental cryptographic primitives that compress messages of arbi-

trary length into message digests of a fixed length. They are used as the building block

in many important security applications such as digital signatures, message authentication

codes, password protection, etc. The three main security properties of hash functions are

collision, second preimage and preimage resistance.

In 2005, a significant breakthrough was made in the cryptanalysis of hash functions. Namely,

attacks on SHA-1 and MD5 raised concerns about the security of the widely used hash

function standards. In response to this hash function crisis, the US National Institute of

Standards and Technology (NIST) announced a call for the design of a new cryptographic

hash algorithm in 2007. NIST received 64 submissions. At this moment, 5 candidates are

in the final round of competition: BLAKE, Grøstl, JH, Keccak and Skein.

An important criterion for the evaluation of hash functions is their security. A common technique to assess the security of hash functions is via reductionist proofs of security. Within

this provable framework, Andreeva et al. provided a summary of all known security reduc-

tion results in the ideal model for the 14 second round SHA-3 candidates. Furthermore,

they identified several open problems. In this thesis, we investigate the existing proof tech-

niques for the second preimage analysis and resolve remaining open problems regarding the

second preimage resistance of Grøstl and Skein. More precisely, these two hash functions

are proved optimally second preimage resistant in the ideal model within the concrete secu-

rity provable framework. Finally, we provide an overview of the current security reduction

and performance results on the five finalists.

Acknowledgements

I would like to show my gratitude to the people without whose help and guidance the

accomplishment of this thesis would not have been possible.

In the first place I am very grateful to my supervisors Elena Andreeva and Bart Mennink

who introduced me to the field of cryptology and whose sincerity and encouragement I will

never forget. Above all, it would have been next to impossible to write this thesis without

their supervision and advice from the very beginning to the end of my work. I greatly appreciate Bart’s positive spirit and the precious time he put into reading and giving critical comments about my thesis. I gratefully acknowledge Elena for introducing me to the

area of provable security, and for guiding me to the literature that sparked and sustained

my interest in cryptology. The cooperation with both of them was very important and

educational to me.

I gratefully thank Vojin Senk and Zeljen Trpovski for their great support and active in-

volvement as coordinators in the exchange process. I was privileged to have them as my

professors and I am grateful for the help they have given me.

A special word of gratitude to my parents, Pavle and Ruza, who have been a constant

source of support (emotional, moral and, of course, financial) during my postgraduate years,

and this thesis would certainly not have existed without them. Also, I would like to thank

my family and friends for their support throughout my studies. Finally, I want to give a

special thanks to my girlfriend Ljiljana for her great support and for producing the figures

used in this thesis.


Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables

1 Introduction

2 Preliminaries
  2.1 Mathematical Background
    2.1.1 Notation
    2.1.2 Graph Theory
    2.1.3 Probability Theory
    2.1.4 Complexity Theory
  2.2 Provable Security
    2.2.1 Basic Definitions
    2.2.2 The Provable Security Paradigm
    2.2.3 Assumptions
    2.2.4 Standard and Ideal Model
    2.2.5 Complexity Theory Techniques
  2.3 Hash Functions
    2.3.1 Basic Definitions
      2.3.1.1 Merkle-Damgard Mode of Operation
      2.3.1.2 Random Oracles
    2.3.2 Security Properties
      2.3.2.1 Formal Security Notions
      2.3.2.2 Expected Security
    2.3.3 Generic Attacks Against Merkle-Damgard Mode of Operation
    2.3.4 Compression Function Building Strategies
    2.3.5 Other Modes of Operation
      2.3.5.1 Wide-pipe and Narrow-pipe Design
      2.3.5.2 HAIFA
      2.3.5.3 Sponge
    2.3.6 Establishing Security of Hash Functions
      2.3.6.1 Property Preservation
      2.3.6.2 Indifferentiability Results
      2.3.6.3 Idealized Proof Model
    2.3.7 Security Model

3 NIST’s SHA-3 Hash Function Competition
  3.1 The History of SHA Family
  3.2 SHA-3 Security Requirements and Evaluation Criteria
  3.3 The Competition Finalists
    3.3.1 BLAKE
    3.3.2 Grøstl
    3.3.3 JH
    3.3.4 Keccak
    3.3.5 Skein
  3.4 A Summary of the Existing Results
    3.4.1 Factors of Favorability
    3.4.2 A Summary of the Security and Performance Results

4 Second Preimage Resistance of Grøstl and Skein
  4.1 Security Analysis of Grøstl
    4.1.1 Assessing Second Preimage Resistance of Grøstl
    4.1.2 Proof of Security
  4.2 Security Analysis of Skein
    4.2.1 Assessing Second Preimage Resistance of Skein
    4.2.2 Proof of Security

5 Conclusions and Remarks
  5.1 Conclusions
  5.2 Summary of Contributions
  5.3 Future Research

Bibliography

A Mathematical Derivations
  A.1 Security Bound on Second Preimage of Grøstl

List of Figures

2.1 The Merkle-Damgard construction.
2.2 The HAsh Iterative FrAmework - HAIFA construction.
2.3 The sponge construction.
3.1 The BLAKE’s compression function.
3.2 The Grøstl hash function.
3.3 The JH’s compression function.
3.4 The UBI mode of operation.
4.1 The Grøstl’s compression function.
4.2 The Skein hash function.

List of Tables

3.1 A schematic summary of hardware and software results.
3.2 A schematic summary of security reduction results of the five finalists.

Chapter 1

Introduction

This thesis deals with provable security properties of cryptographic hash functions.

Cryptographic hash functions are fundamental cryptographic primitives. They are used

as a building block in many higher-level primitives in cryptography. The hash functions

compress message inputs of arbitrary length and return a hash value of fixed length. They are

employed in many practical applications such as digital signatures, message authentication

codes, password protection, pseudorandom string generation, derivation of cryptographic

keys, etc.

One of the first uses of hash functions was presented in 1976, in the famous paper by Diffie and Hellman [DH76] on public-key cryptography, where they were proposed as a building block of digital signatures. A practical hash function must be efficiently computable and uniformly distributed, but in order to protect data integrity and to provide message authentication, hash functions must also satisfy specific security requirements. In his PhD thesis [Mer79], Merkle defined the three main security properties of hash functions: collision, preimage and second preimage resistance. Which of these security properties are relevant depends on the application. In practical signature schemes hash functions are used to: 1) make the signing of messages of arbitrary length more efficient; 2) provide secure authentication. The usual way to employ a hash function in a signature scheme is to first hash a message input $M$ and then to sign the hashed message $H(M)$ with the secret key of the signer, $\sigma(M) = H(M)^d \bmod N$, where $0 \le M \le N$. Later, the verifier receives the pair $(M, \sigma(M))$. This approach is known as the hash-and-sign paradigm. An undesired event happens if an adversary finds two distinct messages with the same hash output, $H(M_1) = H(M_2)$. Such messages are called colliding messages, and the event is a collision event. If a collision event occurs, the adversary can trick an honest party A by first asking him to sign a harmless message $M_1$. If the honest party A signs the message, the adversary can counterfeit the signature, since the signature is the same for a potentially harmful message $M_2$. A similar scenario can happen if, for a previously chosen specific message $M$, the adversary finds another message $M'$ with the same hash output, $H(M) = H(M')$. The corresponding security property is known as second preimage resistance (or weak collision resistance). Another practical use of hash functions is for commitment. A commitment scheme allows a prover to commit to data without revealing it. A possible approach to create a commitment is to apply a hash function to the data and disclose only the hash value. Later, the prover can open the commitment by revealing the data. The hash value is the only guarantee to a verifier, who checks its correctness. An adversary, typically the verifier, may try to retrieve information about the data from the commitment. The commitment scheme is broken if the adversary succeeds in retrieving a message $M$ (the data) from a hash value $Y = H(M)$. Therefore, hash functions used in commitment schemes need to be first preimage resistant. These examples show that the use of an insecure hash function as a building block would endanger higher-level primitives.
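
The forgery scenario described above can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration and not part of any real scheme: the RSA parameters are the usual textbook toy values, `toy_hash` is a deliberately weak 8-bit hash (so that a collision can be found instantly), and the function names are hypothetical.

```python
import hashlib
from itertools import count

# Toy textbook-RSA parameters (insecure, for illustration only).
P, Q = 61, 53
N = P * Q          # modulus, 3233
E = 17             # public exponent
D = 2753           # private exponent, E * D = 1 mod (P - 1)(Q - 1)


def toy_hash(message: bytes) -> int:
    """Deliberately weak 8-bit hash: the first byte of SHA-256."""
    return hashlib.sha256(message).digest()[0]


def sign(message: bytes) -> int:
    """Hash-and-sign: sigma(M) = H(M)^d mod N."""
    return pow(toy_hash(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Accept if signature^e mod N equals H(M)."""
    return pow(signature, E, N) == toy_hash(message)


def find_collision(prefix_a: bytes, prefix_b: bytes):
    """Vary a counter suffix until two different messages share a hash value."""
    seen = {}
    for i in count():
        m1 = prefix_a + str(i).encode()
        seen.setdefault(toy_hash(m1), m1)
        m2 = prefix_b + str(i).encode()
        if toy_hash(m2) in seen:
            return seen[toy_hash(m2)], m2


# The adversary prepares a harmless and a harmful message with equal hashes,
# obtains a signature on the harmless one and reuses it for the harmful one.
m1, m2 = find_collision(b"pay 1 EUR #", b"pay 1000 EUR #")
sigma = sign(m1)
print(m1, m2, verify(m2, sigma))
```

Since the signature depends on the message only through its hash value, the signature obtained on `m1` also verifies for `m2`. With a collision resistant hash function the search in `find_collision` would be infeasible.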

In his proposal of the digital signature scheme, Rabin [Rab78] described an iterative hash function based on the block cipher DES with a message block $m_i$ used as a key. However, this design turned out to be trivially insecure (cf. Section 2.3.1.1). A significant breakthrough in the design of hash functions was due to Merkle [Mer90] and Damgard [Dam90], who independently showed how to iterate a compression function so that the collision resistance of the compression function carries over to the hash function¹. This iteration principle, known as Merkle-Damgard, is used in the most popular hash functions today. The most prominent hash functions of the previous two decades are the MDx family (MD5 being the most important), the SHA family, RIPEMD, HAVAL, Tiger, GOST and Whirlpool, all of which rely on the Merkle-Damgard iterative principle. MD5 was designed by Ron Rivest in 1991, based on his earlier hash function design MD4, and has been employed in a wide variety of security applications. In 1995, the US National Institute of Standards and Technology (NIST) issued the Secure Hash Standard with a specification of the SHA-1 algorithm. This algorithm has become the most widely used hash function standard. A new SHA-2 algorithm was published in 2001. After the breakthrough in cryptanalysis by Wang et al. [WYY05, WY05] in 2005, security flaws were identified in MD5 and SHA-1. Moreover, other results emerged [Jou04, Dea99, KS05, KK06] that raised questions about the security of the Merkle-Damgard construction and of hash functions in general. This hash function crisis initiated the ongoing NIST hash function competition [NIS07] with the aim of developing a new hash function standard, SHA-3. The end of the selection process is scheduled for late 2012. NIST specified a number of requirements that the future SHA-3 function should meet. A hash function with an $n$-bit hash value is required to provide collision resistance of approximately $n/2$ bits, preimage resistance of approximately $n$ bits and second preimage resistance of approximately $n - L$ bits, where the length of the first preimage is at most $2^L$ blocks. We also point to the indifferentiability

¹ This property is known as collision-resistance preservation.


framework introduced by Maurer et al. [MRH04]. This framework was further developed in the context of hash functions by Coron et al. [CDMP05]. Indifferentiability is important because it guarantees resistance against all generic attacks. The hash functions submitted to the SHA-3 competition claim security, but only a limited number of them are actually backed by security proofs. Many of these security results are obtainable by means of provable security.
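
To make the NIST requirements quoted in this paragraph tangible, the following minimal sketch evaluates them for a given digest size. The function name and the dictionary keys are hypothetical, and the values are only the approximate targets stated in the text, not exact bounds.

```python
def sha3_security_targets(n: int, L: int) -> dict:
    """Approximate required security levels, in bits.

    n: length of the hash value in bits.
    L: the first preimage is at most 2^L blocks long.
    """
    if not 0 <= L <= n:
        raise ValueError("L must lie between 0 and n")
    return {
        "collision": n // 2,         # approximately n/2 bits
        "preimage": n,               # approximately n bits
        "second_preimage": n - L,    # approximately n - L bits
    }


print(sha3_security_targets(256, 64))
```

For a 256-bit hash value and first preimages of at most $2^{64}$ blocks, the targets are 128, 256 and 192 bits respectively.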

The concept of provable security was introduced by Goldwasser and Micali [GM84]. Orig-

inally, they developed it in the context of asymmetric encryption. From this preliminary

work, several lines of research emerged. Fundamentally, the goal of provable security is

to provide a mathematical guarantee that a cryptographic scheme cannot be broken by a

class of attackers in a specified mathematical model of reality. Cryptographic schemes are

usually based on some mathematical problem. Those schemes that can be proven secure

under the assumption that the underlying mathematical problem is computationally hard

are said to be secure in the standard model. Since it is usually difficult to assess the security in the standard model, the ideal model is often used in practice. Within this model,

underlying cryptographic primitives are replaced by their idealized versions.

Practically, the provable security approach allows us to prove the security of a higher-level scheme (e.g. a digital signature) under some assumption on the security of the hash function. In this context, an important line of research was initiated by Fiat and Shamir [FS86], who suggested the random oracle methodology. Later, Bellare and Rogaway [BR93] formally introduced the random oracle model in order to allow the design of more practice-oriented provably secure cryptographic schemes. They described the random oracle model as a “bridge between theory and practice”. Within this model, the hash function is replaced by an ideal primitive (a random oracle). Likewise, the provable security approach also allows us to conduct the security analysis of hash functions themselves, which can be carried out in both the standard and the ideal model. A typical approach in this context is to argue a security property of the hash function under some assumption on a security property of the underlying compression function. In the ideal model, adversaries have oracle access to the ideal version of the compression function or its underlying building blocks (e.g. a block cipher or permutation(s)). During the second round of NIST’s competition, Andreeva et al. [AMP10c] provided a summary of all known security reduction results for all 14 second round SHA-3 candidates. Moreover, they identified open problems regarding the security reduction results and, as the main concern, they indicated the lack of optimal security bounds on the second preimage resistance. These results have been revisited in [AMPS12], a part of which is based on results of the work presented in this thesis. In addition, we refer to [ABM+12, ALM11].
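
A common way to picture the oracle access mentioned above is lazy sampling: the ideal primitive answers every fresh query with an independent uniform value and repeats its answer on repeated queries. The class below is a minimal sketch of this idea; the class name, the `queries` counter and the default output length are assumptions made for illustration.

```python
import secrets


class RandomOracle:
    """Lazily sampled random function from byte strings to n-bit values."""

    def __init__(self, n_bits: int = 256):
        self.n_bits = n_bits
        self._table = {}     # remembers answers so repeated queries agree
        self.queries = 0     # number of oracle calls made so far

    def query(self, x: bytes) -> int:
        self.queries += 1
        if x not in self._table:
            # Fresh input: answer with an independent uniform n-bit value.
            self._table[x] = secrets.randbits(self.n_bits)
        return self._table[x]


oracle = RandomOracle(n_bits=16)
first = oracle.query(b"message")
second = oracle.query(b"message")
print(first == second, oracle.queries)
```

In proofs in the ideal model, the number of oracle queries (here `queries`) is the usual measure of the adversary's effort.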


Besides this aforementioned goal, another substantial aspect of the provable security approach is associated with the introduction of notions and their definitions. The lack of proper definitions for the basic notions of security encouraged Rogaway and Shrimpton [RS04] to revisit and formalize seven security notions of keyed hash functions. They also considered all of the implications and separations among them within the provable security framework. Subsequently, Andreeva et al. [ANPS07, AMP10b] determined, by proof or counterexample, the security property preservation² of seventeen different iterations in the standard model.

Our Contribution

In this thesis we analyze the security of the final round candidates in the competition for the

new SHA-3 hashing algorithm. We give a concise survey of the five finalists together with

their security reductions and performance results. The main contribution of this thesis is

the analysis of the second preimage resistance of the competition finalists Grøstl and Skein. More precisely, within the concrete-security framework of provable security, we provide a lower bound on the second preimage resistance of Grøstl in the ideal permutation model and Skein in the ideal cipher model and prove them both optimally second preimage

resistant.

Outline of the Thesis

In Chapter 2 we introduce the mathematical and cryptographic prerequisites for our proofs.

In Chapter 3 we present the timeline of the SHA-3 hash function competition as well as NIST’s requirements and evaluation criteria for the SHA-3 hash function. Additionally, we provide a brief introduction to the five finalists of the competition and their security and performance properties.

In Chapter 4 we present proofs for second preimage resistance of the Grøstl and Skein

hash functions.

Chapter 5 offers concluding remarks, in which we discuss the obtained security results, highlight some limitations of our approach and provide some directions for future research.

² The preservation of the seven security properties defined in [RS04].

Chapter 2

Preliminaries

In this chapter we introduce basic background knowledge, which includes some concepts

from mathematics as well as cryptography. In Section 2.1 we introduce the basic mathe-

matical definitions. In Section 2.2 concepts of provable security are discussed. Section 2.3

offers an introduction to cryptographic hash functions and their security properties.

2.1 Mathematical Background

In this section we first give the mathematical notation used in our work. Then we offer a brief summary of basic definitions from graph theory (Section 2.1.2), probability theory (Section 2.1.3) and complexity theory (Section 2.1.4). Definitions and notation for this section are taken from the literature [Die10, MvOV97].

2.1.1 Notation

Let $\mathbb{N}$ denote the set of all natural numbers and $\mathbb{Z}$ denote the set of integers. Let $n \in \mathbb{N}$, then $\{0,1\}^n$ denotes the set of all $n$-bit strings. We denote the set of all bit strings of arbitrary length by $\{0,1\}^*$. The concatenation of two bit strings $x$ and $y$ is denoted by $x \| y$. The message blocks of any message $M$ are denoted by $m_1 \| m_2 \| \ldots \| m_k$, where $k$ denotes the number of message blocks. Furthermore, $x \xleftarrow{\$} X$ corresponds to selecting $x$ uniformly at random from the set $X$.
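
The notation can be mirrored by a few helper functions. This is only an illustrative sketch: bit strings are represented as Python strings of the characters `0` and `1`, and all function names are hypothetical.

```python
import secrets
from itertools import product


def all_bit_strings(n: int) -> list:
    """Enumerate {0,1}^n as strings of '0' and '1' characters."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def concat(x: str, y: str) -> str:
    """Concatenation x || y of two bit strings."""
    return x + y


def split_blocks(message: str, block_len: int) -> list:
    """Split M into blocks m_1 || m_2 || ... || m_k of block_len bits each.

    Assumes the length of M is a multiple of block_len (no padding is applied).
    """
    if len(message) % block_len != 0:
        raise ValueError("message length must be a multiple of block_len")
    return [message[i:i + block_len] for i in range(0, len(message), block_len)]


def sample_uniform(xs: list):
    """Select x uniformly at random from the finite set X, given as a list."""
    return secrets.choice(xs)


print(all_bit_strings(2))
print(split_blocks(concat("1100", "10"), 2))
print(sample_uniform(all_bit_strings(3)))
```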


2.1.2 Graph Theory

Definition 2.1. A graph is a pair $G = (V, E)$ of sets satisfying $E \subseteq [V]^2$; thus, the elements of $E$ are 2-element subsets of $V$. The elements of $V$ are the vertices (or nodes, or points) of the graph $G$, the elements of $E$ are its edges (or lines).

The number of vertices of a graph $G$ is its order, written as $|G|$; its number of edges is denoted by $\|G\|$. Two vertices $x, y$ of $G$ are adjacent (or neighbours) if $e = \{x, y\}$ is an edge of $G$. Two edges $e \neq f$ are adjacent if they have an end in common. The vertex set of a graph $G$ is referred to as $V(G)$, its edge set as $E(G)$. Graphs are finite or infinite according to their order; unless otherwise stated, the graphs we consider are all finite. For the empty graph $(\emptyset, \emptyset)$ we simply write $\emptyset$. A graph of order 0 or 1 is called trivial.

Definition 2.2. A path in narrow sense is a non-empty graph $P = (V, E)$ of the form

$$V = \{x_0, x_1, \ldots, x_k\}, \qquad E = \{e_1, e_2, \ldots, e_k\},$$

where the $x_i$ are all distinct and $e_i = \{x_{i-1}, x_i\}$ for all $1 \le i \le k$.

Definition 2.3. A path in wider sense¹ of length $k$ in a graph $G$ is a non-empty sequence

$$x_0 \xrightarrow{e_1} x_1 \xrightarrow{e_2} x_2 \cdots \xrightarrow{e_{k-1}} x_{k-1} \xrightarrow{e_k} x_k$$

of vertices and edges in $G$ such that $e_i = \{x_{i-1}, x_i\}$ for all $1 \le i \le k$. A path is closed if $x_0 = x_k$ and open if they are different. If the vertices in a path in wider sense are all distinct, it defines an obvious path in narrow sense in $G$.

In a path the first vertex $x_0$ is called the start vertex and the last vertex $x_k$ is called the end vertex. These two vertices are linked by the path and jointly they are called the terminal vertices of the path; the vertices $x_1, \ldots, x_{k-1}$ are the inner vertices of the path. The number of edges of a path is its length.

Definition 2.4. A directed graph (or digraph) is a pair (V,E) of disjoint sets (of directed

graph vertices and edges) together with two maps init: E → V and ter: E → V assigning

to every edge e an initial vertex init(e) and a terminal vertex ter(e). The edge e is said to

be directed from init(e) to ter(e).

A directed graph may have several edges between the same two vertices x, y. Such edges

are called multiple edges; if they have the same direction (say from x to y), they are parallel.

If init(e) = ter(e), the edge e is called a loop.

¹ The term “walk” is used by some authors [Die10] for a path in wider sense $p$ (a path in which vertices or edges may be repeated), while the terms “path” and “simple path” are used for what is in our work called a path in narrow sense $P$.


Notice that we use directed graphs in our work. Also, under the term path we refer to a path in wider sense, and we often denote it by the natural sequence of its edges $p = (e_1, e_2, \ldots, e_k)$.
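
The definitions of a directed graph and of a path in wider sense can be sketched as a small data structure. The class and method names are hypothetical; edges are identified by names (kept separate from the vertex names, in line with Definition 2.4), and the two dictionaries play the role of the maps init and ter.

```python
from dataclasses import dataclass, field


@dataclass
class Digraph:
    vertices: set = field(default_factory=set)
    init: dict = field(default_factory=dict)   # edge name -> initial vertex
    ter: dict = field(default_factory=dict)    # edge name -> terminal vertex

    def add_edge(self, e, x, y):
        """Add an edge e directed from x to y; multiple edges and loops are allowed."""
        self.vertices |= {x, y}
        self.init[e] = x
        self.ter[e] = y

    def is_path(self, edges) -> bool:
        """True if p = (e_1, ..., e_k) is a path in wider sense."""
        if not edges or any(e not in self.init for e in edges):
            return False
        # Consecutive edges must meet: ter(e_i) = init(e_{i+1}).
        return all(self.ter[a] == self.init[b] for a, b in zip(edges, edges[1:]))

    def is_closed(self, edges) -> bool:
        """True if the path starts and ends at the same vertex."""
        return self.is_path(edges) and self.init[edges[0]] == self.ter[edges[-1]]


g = Digraph()
g.add_edge("e1", "a", "b")
g.add_edge("e2", "b", "c")
g.add_edge("e3", "b", "c")   # parallel to e2
g.add_edge("e4", "c", "a")
g.add_edge("e5", "c", "c")   # a loop

print(g.is_path(["e1", "e3", "e5", "e5", "e4"]), g.is_closed(["e1", "e2", "e4"]))
```

The first path repeats the vertex c and the edge e5, which is allowed for a path in wider sense but not for a path in narrow sense.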

2.1.3 Probability Theory

In this section we consider sample spaces with only finitely many possible outcomes. Let the simple events of a sample space $S$ be labeled $s_1, s_2, \ldots, s_n$.

Basic Definitions

Definition 2.5. An experiment is a procedure that yields one of a given set of outcomes.

The individual possible outcomes are called simple events. The set of all possible outcomes

is called the sample space.

Definition 2.6. A probability distribution $P$ on $S$ is a sequence of numbers $p_1, p_2, \ldots, p_n$ that are all non-negative and sum to 1. The number $p_i$ is interpreted as the probability of $s_i$ being the outcome of the experiment.

Definition 2.7. An event $E$ is a subset of the sample space $S$. The probability that event $E$ occurs, denoted $P(E)$, is the sum of the probabilities $p_i$ of all simple events $s_i$ which belong to $E$. If $s_i \in S$, $P(\{s_i\})$ is simply denoted by $P(s_i)$.

Fact 2.8. Let $E \subseteq S$ be an event.

i) $0 \le P(E) \le 1$. Furthermore, $P(S) = 1$ and $P(\emptyset) = 0$. ($\emptyset$ is the empty set.)

ii) If the outcomes in $S$ are equally likely, then $P(E) = \frac{|E|}{|S|}$.

Definition 2.9. Two events $E_1$ and $E_2$ are called mutually exclusive if $P(E_1 \cap E_2) = 0$. That is, the occurrence of one of the two events excludes the possibility that the other occurs.

Fact 2.10. Let $E_1$ and $E_2$ be two events.

i) If $E_1 \subseteq E_2$, then $P(E_1) \le P(E_2)$.

ii) $P(E_1 \cup E_2) + P(E_1 \cap E_2) = P(E_1) + P(E_2)$. Hence, if $E_1$ and $E_2$ are mutually exclusive, then $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
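
These facts can be checked on a small example. The sketch below assumes a single roll of a fair die as the experiment; the event names are chosen for illustration only, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

# Sample space of one fair die roll; every simple event has probability 1/6.
S = {1, 2, 3, 4, 5, 6}
p = {s: Fraction(1, len(S)) for s in S}


def prob(event: set) -> Fraction:
    """P(E): the sum of the probabilities of the simple events in E."""
    return sum((p[s] for s in event), Fraction(0))


even = {2, 4, 6}
at_least_five = {5, 6}
odd = {1, 3, 5}

# Fact 2.8: P(S) = 1, P(empty set) = 0, and P(E) = |E| / |S| for equally likely outcomes.
print(prob(S), prob(set()), prob(even))

# Fact 2.10 ii): P(E1 u E2) + P(E1 n E2) = P(E1) + P(E2).
print(prob(even | at_least_five) + prob(even & at_least_five),
      prob(even) + prob(at_least_five))

# Mutually exclusive events: the probabilities simply add up.
print(prob(even | odd) == prob(even) + prob(odd))
```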


Conditional Probability

Definition 2.11. Let $E_1$ and $E_2$ be two events with $P(E_2) > 0$. The conditional probability of $E_1$ given $E_2$, denoted $P(E_1 | E_2)$, is

$$P(E_1 | E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)}.$$

$P(E_1 | E_2)$ measures the probability of event $E_1$ occurring, given that $E_2$ has occurred.

Definition 2.12. Events $E_1$ and $E_2$ are independent if $P(E_1 \cap E_2) = P(E_1) P(E_2)$.

Observe that if $E_1$ and $E_2$ are independent, then $P(E_1 | E_2) = P(E_1)$ and $P(E_2 | E_1) = P(E_2)$. That is, the occurrence of one event does not influence the likelihood of occurrence of the other.

Fact 2.13. (Bayes’ theorem) If $E_1$ and $E_2$ are events with $P(E_2) > 0$, then

$$P(E_1 | E_2) = \frac{P(E_1) P(E_2 | E_1)}{P(E_2)}.$$
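
Bayes’ theorem follows in one step from the definition of conditional probability. The short derivation below additionally assumes $P(E_1) > 0$, so that $P(E_2 | E_1)$ is defined; the numerical check uses a fair die as an illustrative example, with $E_1 = \{2, 4, 6\}$ and $E_2 = \{4, 5, 6\}$.

```latex
\begin{align*}
P(E_1 \mid E_2)
  &= \frac{P(E_1 \cap E_2)}{P(E_2)}
     && \text{(definition of conditional probability)} \\
  &= \frac{P(E_1)\, P(E_2 \mid E_1)}{P(E_2)}
     && \text{since } P(E_1 \cap E_2) = P(E_2 \mid E_1)\, P(E_1). \\[1ex]
\text{Fair die:}\quad
P(E_1 \mid E_2)
  &= \frac{P(\{4, 6\})}{P(\{4, 5, 6\})} = \frac{2/6}{3/6} = \frac{2}{3},
  &
\frac{P(E_1)\, P(E_2 \mid E_1)}{P(E_2)}
  &= \frac{\tfrac{1}{2} \cdot \tfrac{2}{3}}{\tfrac{1}{2}} = \frac{2}{3}.
\end{align*}
```

Both computations agree, as the theorem predicts.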

2.1.4 Complexity Theory

The main goal of complexity theory is to provide mechanisms for classifying computational

problems according to the resources needed to solve them. The classification should not

depend on a particular computational model, but rather should measure the intrinsic dif-

ficulty of the problem. The resources measured may include time, storage space, random

bits, number of processors, etc., but typically the main focus is time, and sometimes space.

Basic Definitions

Definition 2.14. An algorithm is a well-defined computational procedure that takes a

variable input and halts with an output.

It is usually of interest to find the most efficient algorithm for solving a given computational

problem. The time that an algorithm takes to halt depends on the “size” of the problem

instance. Also, the unit of time used should be made precise, especially when comparing

the performance of two algorithms.

Definition 2.15. The size of the input is the total number of bits needed to represent the

input in ordinary binary notation using an appropriate encoding scheme. Occasionally, the

size of the input will be the number of items in the input.


Definition 2.16. The running time of an algorithm on a particular input is the number

of primitive operations or “steps” executed.

Often a step is taken to mean a bit operation. For some algorithms it will be more convenient

to take step to mean something else such as a comparison, a machine instruction, a machine

clock cycle, a modular multiplication, etc.

Definition 2.17. The worst-case running time of an algorithm is an upper bound on the

running time for any input, expressed as a function of the input size.

Definition 2.18. The average-case running time of an algorithm is the average running

time over all inputs of a fixed size, expressed as a function of the input size.
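
As an illustration of these definitions, the following sketch counts the steps of a linear search, where one step is taken to be one comparison. The setting is an assumption made for the example: the inputs of size $n$ are a fixed list of $n$ distinct items together with a target that is present in the list, each position being equally likely.

```python
def linear_search_steps(items, target) -> int:
    """Return the number of comparisons ("steps") linear search performs."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def worst_case(n: int) -> int:
    """Largest step count over all targets in a list of n distinct items."""
    items = list(range(n))
    return max(linear_search_steps(items, t) for t in items)


def average_case(n: int) -> float:
    """Average step count when every position of the target is equally likely."""
    items = list(range(n))
    return sum(linear_search_steps(items, t) for t in items) / n


for n in (10, 100, 1000):
    print(n, worst_case(n), average_case(n))
```

Under these assumptions the worst-case running time is $n$ steps and the average-case running time is $(n + 1)/2$ steps, both expressed as functions of the input size.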

Asymptotic notation

It is often difficult to derive the exact running time of an algorithm. In such situations

one is forced to settle for approximations of the running time, and usually may only derive

the asymptotic running time. That is, one studies how the running time of the algorithm

increases as the size of the input increases without bound.

In what follows, the only functions considered are those which are defined on the positive

integers and take on real values that are always positive from some point onwards. Let f

and g be two such functions.

Definition 2.19. (order notation)

i) (asymptotic upper bound) $f(n) = O(g(n))$ if there exists a positive constant $c$ and a positive integer $n_0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$.

ii) (asymptotic lower bound) $f(n) = \Omega(g(n))$ if there exists a positive constant $c$ and a positive integer $n_0$ such that $0 \le c\,g(n) \le f(n)$ for all $n \ge n_0$.

iii) (asymptotic tight bound) $f(n) = \Theta(g(n))$ if there exist positive constants $c_1$ and $c_2$ and a positive integer $n_0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$.

iv) (o-notation) $f(n) = o(g(n))$ if for any positive constant $c > 0$ there exists a constant $n_0 > 0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$.

Intuitively, f(n) = O(g(n)) means that f grows no faster asymptotically than g(n) to

within a constant multiple, while f(n) = Ω(g(n)) means that f(n) grows at least as fast

asymptotically as g(n) to within a constant multiple. f(n) = o(g(n)) means that g(n) is

an upper bound for f(n) that is not asymptotically tight, or in other words, the function

f(n) becomes insignificant relative to g(n) as n gets larger. The expression o(1) is often

used to signify a function f(n) whose limit as n approaches ∞ is 0.
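
A short worked example may help to see how the constants in Definition 2.19 are chosen. The function $f(n) = 3n^2 + 5n$ is picked purely for illustration.

```latex
\begin{align*}
&\text{Upper bound:} & 3n^2 + 5n &\le 4n^2 \iff 5n \le n^2 \iff n \ge 5,
  & &\text{so } f(n) = O(n^2) \text{ with } c = 4,\ n_0 = 5. \\
&\text{Lower bound:} & 3n^2 &\le 3n^2 + 5n \text{ for all } n \ge 1,
  & &\text{so } f(n) = \Omega(n^2) \text{ with } c = 3,\ n_0 = 1. \\
&\text{Tight bound:} & 3n^2 \le f(n) &\le 4n^2 \text{ for all } n \ge 5,
  & &\text{so } f(n) = \Theta(n^2) \text{ with } c_1 = 3,\ c_2 = 4,\ n_0 = 5. \\
&\text{o-notation:} & 5n &\le c\,n^2 \iff n \ge 5/c,
  & &\text{so } 5n = o(n^2) \text{ with } n_0 = \lceil 5/c \rceil \text{ for each } c > 0.
\end{align*}
```

In contrast, $f(n) \neq o(n^2)$: for $c = 1$ we have $f(n) > n^2$ for every $n \ge 1$, so no suitable $n_0$ exists.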


2.2 Provable Security

The first part of this section focuses on the basic definitions and concepts of provable

security. Then in Section 2.2.5 we present two approaches taken from complexity theory

used to evaluate the level of security of cryptographic schemes.

2.2.1 Basic Definitions

The main goal of cryptography is to enable secure communication by using cryptographic schemes (or protocols). A common way to design a scheme is to choose or build secure atomic² primitives and then to design the scheme on top of them in such a way that it can “inherit” security from these atomic primitives. By an atomic primitive we mean either a problem which is considered to be computationally hard (e.g. the discrete log problem, the integer factorization problem) or a secure cryptographic construction such as a block cipher, permutation, compression function, etc. A problem that can arise in the design of a cryptographic scheme is that even if a good underlying atomic primitive is used, a poor design can result in an insecure scheme. The usual way to investigate whether a scheme inherits the desired security properties from the underlying primitives is by means of provable security. The idea of provable security was introduced in 1984 by Goldwasser and Micali [GM84] in the context of asymmetric encryption. Theoreticians often point out that the term “provable security” is in some way misleading. The reason for this is that we do not actually provide an absolute proof of security. We simply provide a reduction of the security of the scheme to the security of some underlying atomic primitive. A term that better reflects the essence of this approach is the reductionist approach.

2.2.2 The Provable Security Paradigm

In order to provide a security proof we need to:

1. Introduce a formal adversarial model for a concrete security goal.

2. Formally define the security notion we want to achieve in the chosen adversarial model.

3. Exhibit a security reduction which shows that the only practical way to defeat the scheme is to break the underlying atomic primitive.

This practically means that if we find some weakness in the scheme, we will definitely find

a weakness in the underlying atomic primitive as well. Vice versa, if we believe that the

² In this context the term “atomic” means that the primitive in question cannot be used alone to solve a specific cryptographic problem. Commonly, it is used as a building block of higher-level primitives.


atomic primitive is secure, then we will know that the scheme must be secure with respect

to the desired security notion. From the point of view of cryptanalysis, this implies that

its focus should be on the atomic primitive. To summarize, there are two principal aims of

the provable security approach. The first is associated with the introduction of notions and

their definitions which practically entails classification of protocols and atomic primitives,

while the second is related to the actual reduction.

2.2.3 Assumptions

When the provable security approach is used, one needs to be aware that proven security

does not exclude the possibility of attack. Crucially, the scheme is proven secure under a

certain assumption. In the case that the assumption is not satisfied, results obtained by

the proof become irrelevant. This does not necessarily lead to a practical attack; it only means

that the proof of security is no longer useful. This further implies that a proof of security

is more valuable when the assumption is weaker. With the strength of the assumption as a

basis for comparison, we can compare security reduction results (e.g. if two schemes

are proven secure, the one making weaker assumptions is preferable). However, it is not

always possible to compare the strength of the assumptions.

2.2.4 Standard and Ideal Model

In cryptography the standard model is the model of computation in which the adversary

is only limited by the amount of time and computational power available. As pointed out

above, cryptographic schemes are often based on complexity assumptions³. Schemes

whose security reduction is possible using only complexity assumptions are said to be

secure in the standard model. Although a proof in the standard model brings stronger security

guarantees than other techniques, it is quite difficult to complete this type of proof in

practice. Therefore, in many proofs, an ideal model is used where cryptographic primitives (e.g.

block cipher, permutation, compression function) are replaced by their idealized versions.

Probably the best known technique of this kind is the random oracle model.

³ By a complexity assumption we mean an assumption on the hardness of the underlying problem (e.g. the discrete log problem, the integer factorization problem).


2.2.5 Complexity Theory Techniques

Asymptotic Security

In the theoretical literature, complexity theory is widely used. There one talks about

polynomial-time adversaries and negligible success probabilities. In this setting, a scheme

needs to be designed with polynomial-time algorithms. Then polynomial-time reductions

can be exhibited from the assumption on the computational hardness of the underlying

problem to an attack on the security notion. Generally speaking, by exhibiting a polynomial-time

reduction from A to B, we can show that problem B is at least as hard as A. A

polynomial security result claims that a scheme is secure for sufficiently large values of the

security parameter, without suggesting any specific values for it.

Concrete Security

The polynomial-time approach is quite favorable in the theoretical domain, but in practice

the more desirable approach is to provide a concrete number for the security parameter.

Such a number needs to suggest, for example, how large the security parameter should be

so that a polynomial-time adversary that makes a certain number of queries to the public

algorithms of the scheme succeeds with only a small probability. This framework is called

the concrete security framework and it captures the quantitative nature of security. Another

aspect of the concrete security approach is associated with possible preservation of the

strength of the underlying atomic primitive in its transformation to the scheme.

Security Resistance and Attacks

In the provable security framework, attacks and security resistance complement

each other. Attacks measure the degree of insecurity while quantitative bounds measure the

degree of security. More precisely, while the proof of security provides a lower bound,

cryptographic attacks provide an upper bound on the complexity of breaking the scheme

under some assumption. When these two bounds meet, the security level of the scheme

is determined and the bound is said to be tight.

2.3 Hash Functions

Firstly, we briefly introduce basic definitions and notions of hash functions together with

their design strategies. Later, in Section 2.3.2 we present the main security properties


of hash functions. Generic second preimage attacks against the Merkle-Damgard mode

of operation are discussed in Section 2.3.3. In Section 2.3.4 we introduce existing design

strategies of compression functions, while in Section 2.3.5 the most significant new modes of

operation are discussed. Finally, Section 2.3.6 and Section 2.3.7 offer the proof techniques

and the security model, which is used later in our security analysis.

2.3.1 Basic Definitions

Definition 2.20. A hash function is a deterministic function that maps an input of finite

arbitrary size to an output of finite fixed size. Formally, $H : \{0,1\}^* \to \{0,1\}^n$.

Definition 2.21. A compression function is a deterministic function that maps an input of

finite fixed size s to an output of finite fixed size p where s > p.

The procedure that describes how a compression function should be used in order to allow

a secure hashing of arbitrarily long inputs is called a mode of operation. One of the first

proposals which included an iteration of a compression function was made by Rabin [Rab78].

The advantages of this iterative approach are linear time complexity in the message size and

the modest memory requirements. Later, Merkle defined the three main security notions

of hash functions in his PhD thesis [Mer79]. These basic notions of security were revisited

and formalized in a wider context in [RS04, AS11]. At this point, commonly used informal

definitions of the three main security notions are provided:

• collision resistance (Coll) - it is hard to find any two distinct inputs M and M′

which hash to the same output, i.e., such that H(M) = H(M′).

• second-preimage resistance (Sec) - it is hard to find any second input which has

the same output as any specified input, i.e., given M, to find a second-preimage

M′ ≠ M such that H(M) = H(M′).

• preimage resistance (Pre) - it is hard to find any input which hashes to a given output,

i.e., given Y, to find any preimage M′ such that H(M′) = Y.

2.3.1.1 Merkle-Damgard Mode of Operation

As indicated previously, Rabin [Rab78] introduced an iterative hash function design based

on the DES block cipher. The algorithm goes as follows: first, a message M is divided into

message blocks $M = m_1 \| m_2 \| \ldots \| m_{k-1} \| m_k$ of fixed size. Then the hash function is

computed in an iterative manner: $h_i \leftarrow f(h_{i-1}, m_i)$, where $f(h_{i-1}, m_i) = \mathrm{DES}_{m_i}(h_{i-1})$

and $h_0 = IV$. Finally, the hash function returns the hash value $H(M) = h_k$. It was later

shown that the use of an unspecified IV leads to trivial second preimage and collision attacks:

a colliding message is obtained by removing the first message block and selecting $h_1$ as the IV.

In addition, trivial preimage attacks are possible under the assumption that IV can be

chosen by the adversary. Merkle and Damgard independently offered a solution to address

these problems [Mer90, Dam90]. Their idea was to fix a default value for IV and to use

a padding scheme with the message length appended at the end. Each of them offered a

different padding scheme. Merkle’s padding scheme emerged as the standard due to its higher

efficiency, as a smaller number of padding bits is needed in the case of large messages.

Figure 2.1. The Merkle-Damgard construction.

The Merkle-Damgard mode of operation constructs a hash function $H^f : \{0,1\}^* \to \{0,1\}^n$ by iterating a compression function $f : \{0,1\}^n \times \{0,1\}^m \to \{0,1\}^n$. Padding is

achieved by appending to the original message a single ’1’ bit followed by as many ’0’

bits as needed to complete an m-bit block after embedding the message length at the

end. Adding the message length in the last block and using a fixed IV, the so-called

strengthening, is the crucial ingredient in establishing the collision-resistance preservation

of Merkle-Damgard. This iterated design, known as the Merkle-Damgard construction, is

the most commonly used mode of operation in hash functions.
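The iteration and the strengthening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the compression function f is a stand-in built from truncated SHA-256, and the sizes n = 64 bits and m = 128 bits are toy values chosen here, far too small for real use.

```python
import hashlib

N_BYTES = 8     # toy chaining-value size n = 64 bits (assumption)
M_BYTES = 16    # toy message-block size m = 128 bits (assumption)
LEN_BYTES = 8   # bytes reserved for the message length in the last block
IV = bytes(N_BYTES)  # fixed, publicly known initial value


def f(h: bytes, block: bytes) -> bytes:
    """Toy compression function {0,1}^n x {0,1}^m -> {0,1}^n (truncated SHA-256 stand-in)."""
    assert len(h) == N_BYTES and len(block) == M_BYTES
    return hashlib.sha256(h + block).digest()[:N_BYTES]


def pad(message: bytes) -> bytes:
    """Append a '1' bit, then '0' bits, then the message length in bits."""
    bit_len = 8 * len(message)
    padded = message + b"\x80"  # a single '1' bit followed by seven '0' bits
    while (len(padded) + LEN_BYTES) % M_BYTES != 0:
        padded += b"\x00"
    return padded + bit_len.to_bytes(LEN_BYTES, "big")


def md_hash(message: bytes) -> bytes:
    """Iterate h_i = f(h_{i-1}, m_i) over the padded message, starting from the fixed IV."""
    padded = pad(message)
    h = IV
    for i in range(0, len(padded), M_BYTES):
        h = f(h, padded[i:i + M_BYTES])
    return h
```

Calling md_hash(b"abc") returns an 8-byte digest. Because the length is embedded in the final block, two messages of different lengths always differ in their last padded block.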

2.3.1.2 Random Oracles

The difficulty of exhibiting a proof under complexity assumptions has forced cryptographers

to introduce a construction with well-understood properties that can be used every

time a cryptographic hash function is required. The first choice for such a construction is the

random function. Fiat and Shamir [FS86] first suggested the random oracle framework,

which was later formally introduced by Bellare and Rogaway [BR93].

Definition 2.22. A random oracle is a public hash function that maps inputs of arbitrary

size to outputs of finite size, or $R : \{0,1\}^* \to \{0,1\}^n$, where the outputs are drawn uniformly

at random from the range space and accessible by all algorithms in a black-box manner.

In a reductionist security proof an underlying hash function can be replaced with the

random oracle. The random oracle model allows us to prove efficient-in-practice cryptographic

schemes secure, which can sometimes be provably impossible in the standard model. Still,

when using the random oracle model one needs to be aware that the random oracle

assumption is the strongest assumption possible for hash functions. As a consequence, the security

guarantees provided by the random oracle model are not as strong as those obtained in

the standard model. What are the advantages of this approach? Firstly, it allows the building

of efficient schemes. Furthermore, even though the random oracle assumption is strong,

the results obtained in the random oracle model provide valuable security guarantees (e.g.

they provably exclude certain generic attacks and indicate the absence of security flaws in the design).
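A random oracle in the sense of Definition 2.22 is commonly simulated by lazy sampling: an output is drawn only when an input is queried for the first time and is remembered afterwards. The class below is a minimal sketch of that idea; the name RandomOracle and the byte-oriented interface are assumptions made here for illustration.

```python
import secrets


class RandomOracle:
    """Lazily sampled random function from byte strings to n-byte outputs."""

    def __init__(self, n_bytes: int = 32):
        self.n_bytes = n_bytes
        self.table = {}  # remembers every answer given so far

    def query(self, message: bytes) -> bytes:
        # A fresh input gets a uniformly random output; a repeated input
        # gets the stored one, so the oracle behaves like a fixed function.
        if message not in self.table:
            self.table[message] = secrets.token_bytes(self.n_bytes)
        return self.table[message]
```

All parties in a proof would share one such object, which reflects the black-box access mentioned in the definition.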

2.3.2 Security Properties

In this section we formally introduce basic security notions of hash functions as well as

their expected security levels. We take notations and terminology from [And10, Bou11].

2.3.2.1 Formal Security Notions

Recall that the goal of this work is to obtain reduction results on the security of

particular hash functions. As indicated in Section 2.2, in order to make a reduction possible

we need to introduce a formal adversarial model in which the security notion of the scheme (a

hash function in this case) is defined.

The formal definitions for hash functions are so-called attack-based

definitions. Typically, attacks are defined through a game between a challenger and the

attacker, where the challenger’s task is to simulate the environment of the adversary A and

to generate the secret system parameters. Usually the adversarial advantage is measured by

the success probability of the adversary. For the analysis of a security property $xxx \in \{\mathrm{Coll}, \mathrm{Pre}, \mathrm{Sec}\}$ of the hash function H we denote the adversarial advantage in breaking

that property by $\mathbf{Adv}^{xxx}_H(A)$. We write $\mathbf{Adv}^{xxx}_H(t)$ to denote the maximum advantage of

any adversary with time complexity at most t. While the length of the first preimage M

is $2^L$ blocks following NIST’s security requirements, throughout this thesis the length is

denoted by λ (in bits) and k (in blocks), where $\lambda/m \approx k = 2^L$.

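Definition 2.23. Let λ, n ∈ N and let $H : \{0,1\}^* \to \{0,1\}^n$ be a hash function. Then, the

advantage of the adversary A against collision is

$$\mathbf{Adv}^{\mathrm{Coll}}_H(A) = \Pr\left[ (M, M') \xleftarrow{\$} A(\cdot) \,:\, M \neq M' \text{ and } H(M) = H(M') \right].$$

The advantage of the second preimage adversary A is defined as

$$\mathbf{Adv}^{\mathrm{Sec}[\lambda]}_H(A) = \Pr\left[ M \xleftarrow{\$} \{0,1\}^{\lambda};\ M' \xleftarrow{\$} A(M) \,:\, M \neq M' \text{ and } H(M) = H(M') \right].$$

The advantage of the preimage adversary A is defined as

$$\mathbf{Adv}^{\mathrm{Pre}}_H(A) = \Pr\left[ M \xleftarrow{\$} \{0,1\}^{\lambda};\ Y \leftarrow H(M);\ M' \xleftarrow{\$} A(Y) \,:\, H(M') = Y \right].$$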

These are commonly used formal definitions for the keyless second preimage and preimage

notions. An attempt to formalize collision resistance in a similar fashion faces fundamental

difficulties. The problem lies in the fact that for any hash function there always exists an

efficient collision finding algorithm, but we humans are simply not able to find it. One

solution to formalize collision resistance in the standard model was offered by Rogaway

[Rog06]. The main idea behind his proposal was to provide a security reduction for the case

when a hash function is used as a building block of a higher level primitive. This reduction

means that as long as humans are not able to find a collision for the hash function, the

higher level primitive cannot be broken by hash function collisions.
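The second preimage game of Definition 2.23 can be written out as an experiment. The sketch below is illustrative only: toy_hash truncates SHA-256 to N_BITS = 12 bits (an assumption made so that brute force terminates quickly), and brute_force_adversary is a hypothetical adversary that simply tries candidate messages in order.

```python
import hashlib
import secrets

N_BITS = 12  # toy digest length (assumption), small enough for brute force


def toy_hash(message: bytes) -> int:
    """SHA-256 truncated to its N_BITS most significant bits."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - N_BITS)


def brute_force_adversary(target: bytes, max_queries: int):
    """Try at most max_queries candidates; return a second preimage or None."""
    goal = toy_hash(target)
    for counter in range(max_queries):
        candidate = counter.to_bytes(8, "big")
        if candidate != target and toy_hash(candidate) == goal:
            return candidate
    return None


def sec_experiment(lambda_bytes: int, max_queries: int) -> bool:
    """One run of the Sec game: sample M, run A(M), check the winning condition."""
    m = secrets.token_bytes(lambda_bytes)
    m_prime = brute_force_adversary(m, max_queries)
    return m_prime is not None and m_prime != m and toy_hash(m_prime) == toy_hash(m)
```

Averaging sec_experiment over many runs gives an empirical estimate of the advantage of this particular adversary for a given query budget.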

2.3.2.2 Expected Security

Now that we have defined the relevant security notions, we need to see what their security level is.

We want to show security results for hash functions in general, which means that we do

not want to focus on any particular hash design. In order to achieve this, we consider a

hash function which acts as a random oracle.

Preimage and Second Preimage Resistance. It is easy to show that any adversary who

is trying to find a (second) preimage succeeds with probability at most about $q/2^n$ after sending q

queries to the random oracle. Each query to the random oracle returns a uniformly

random output of n bits. This implies that each query has probability $2^{-n}$ to yield a

(second) preimage. This means that when we model a hash function as a random

oracle, the problem of finding a second preimage is just as hard as the problem of inverting

the hash function.
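The per-query probability can be turned into a bound for q queries with a short calculation. As a sketch, assume that the q queries are distinct and that each response is independent and uniform over the n-bit range; then the probability that at least one response equals a fixed target value is

```latex
\Pr[\text{success within } q \text{ queries}]
  = 1 - \left(1 - 2^{-n}\right)^{q}
  \le q \cdot 2^{-n} .
```

The inequality is the union bound over the q queries, and it is close to an equality as long as q is much smaller than $2^n$.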

Collision Resistance. Results for collision resistance are a bit different due to the

birthday paradox. Intuitively, it is much easier to find any pair of two inputs which hash to the

same output, than to find an input which hashes to the same output as one particular input

selected before. The birthday problem estimates the probability that in a set of randomly

chosen people (fewer than 365) a pair shares the same birthday under the assumption that all

birthday dates are equally probable. The probability that such a pair is found is higher than

50% if there are 23 persons in the set. If we compare the number of possible dates (365)

and the number of people required (23) we can see that 23 is roughly of the order of the square root

of 365. If we map the birthday paradox to our collision problem, such

that our range consists of $2^n$ possible values, it follows that after about $\sqrt{2^n} = 2^{n/2}$ queries to the hash

function, a collision is found with constant probability (about $1.18 \cdot 2^{n/2}$ queries suffice for a probability of 50%). We can also look

at this problem from a different angle. If an adversary is trying to find a collision, then after sending

q queries to the random oracle he knows $q(q-1)/2$ pairs, and each pair results in a collision with

probability $2^{-n}$. This implies that a collision is expected after about $2^{n/2}$ queries [Wag02].
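The birthday figures can be reproduced numerically. The sketch below computes the exact probability of a repeated value among uniform draws, together with the usual exponential approximation for an n-bit range; the function names are chosen here for illustration.

```python
import math


def birthday_collision_probability(num_samples: int, num_values: int) -> float:
    """Exact probability that num_samples uniform draws from num_values values contain a repeat."""
    if num_samples > num_values:
        return 1.0
    p_no_collision = 1.0
    for i in range(num_samples):
        p_no_collision *= (num_values - i) / num_values
    return 1.0 - p_no_collision


def birthday_approximation(q: float, n_bits: int) -> float:
    """Approximation 1 - exp(-q(q-1) / 2^(n+1)) for q queries to an n-bit random oracle."""
    return 1.0 - math.exp(-q * (q - 1) / 2 ** (n_bits + 1))
```

For example, birthday_collision_probability(23, 365) exceeds one half while the value for 22 people does not, and birthday_approximation can be used to see how the collision probability grows around $q = 2^{n/2}$.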

2.3.3 Generic Attacks Against Merkle-Damgard Mode of Operation

Cryptanalysis of modes of operation has increased significantly over the years. As a

result, several generic⁴ attacks against the Merkle-Damgard mode of operation were introduced

(e.g. the length extension attack, Joux’s multicollision attack, etc.). In our work, we are

interested in second preimage generic attacks.

We defined second preimage resistance as the security notion which captures the difficulty

of finding any second message input which has the same output as any previously specified

message input. For a long time it was thought that a Merkle-Damgard based hash

function with strengthening preserved second preimage resistance and that it took about

$2^n$ steps (queries) to find a second preimage for a secure hash function [LM92]. However,

in 1999, Dean showed in his PhD thesis [Dea99] that this security level could not be

achieved by hash functions whose compression function allowed the easy finding of fixed

points⁵. He found a way to circumvent the strengthening by finding second preimages of the same

size as the target message. Surprisingly, this important result went unnoticed until

2005, when Kelsey and Schneier [KS05] generalized Dean’s attack by using the

multicollision result of Joux [Jou04]. More precisely, they introduced a generic second preimage

attack on the Merkle-Damgard hash function that requires at most approximately $2^{n-L}$

queries, where the length of the first preimage is at most $2^L$ blocks. Later, a more

flexible generic second preimage attack was described by Andreeva et al. [ABF+08], with the

same complexity as the two mentioned before. Bouillaguet and Fouque [BF09] showed

within the provable security framework that these generic second preimage attacks against

the Merkle-Damgard construction are optimal under the assumption that the compression

function is random.
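To give a feeling for the size of this gap, the following arithmetic uses illustrative parameters chosen here (they are not taken from the text): a digest of n = 256 bits and a first preimage of $2^{32}$ blocks, i.e. L = 32.

```latex
2^{n} = 2^{256}
\qquad\text{versus}\qquad
2^{n-L} = 2^{256-32} = 2^{224} .
```

Under these assumed parameters the generic attack saves a factor of $2^{L} = 2^{32}$ compared with the $2^n$ queries that were originally expected.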

2.3.4 Compression Function Building Strategies

A compression function is commonly built on top of a block cipher or a limited number of

permutations. Although block ciphers are primarily designed for encryption, they are used

⁴ Attacks which are applicable to all hash functions based on a single construction design or mode of operation are called generic attacks.

⁵ A fixed point of a function is a point that is mapped to itself by the function. In the context of hash functions, a fixed point for a compression function would mean that f(h,m) = h.


as a building block of compression functions, because of their well-understood properties

and design.

Block Cipher Based Compression Functions

A detailed analysis of block cipher based compression functions was conducted by Preneel

et al. [PGV93]. More precisely, they analyzed the 64 most basic ways to construct a hash

function from a block cipher⁶. Furthermore, Black et al. [BRS02] proved 12 of

these 64 PGV schemes secure in an oracle model where the underlying block cipher is treated as a random

primitive. In 2009, Stam [Sta09] revisited the rate-1⁷ block cipher based hash functions

and analyzed them in a wider context. The most widely known types of block

cipher based compression functions are the Matyas-Meyer-Oseas (PGV1), the Miyaguchi-Preneel

(PGV3) and the Davies-Meyer (PGV5). The main drawback of this type of design is its

inefficiency.

Permutation Based Compression Functions

In order to address problems with a weak key schedule and to make compression

functions more efficient, a limited number of permutations can be used instead of a block cipher. In

their paper, Black et al. [BCS05] analyzed all 2n-bit to n-bit compression functions based

on one n-bit permutation, and proved them insecure against collision and (second)

preimage attacks. Later, Rogaway and Steinberger [RS08b, RS08a] together with Stam [Sta08]

extended these results to compression functions with arbitrary input and output sizes, and

an arbitrary number of underlying permutations. Moreover, they provided security bounds

which indicate the expected number of queries required to find collisions or preimages for

permutation based compression functions.

2.3.5 Other Modes of Operation

In 2004, the attacks of [WYY05, WY05] shook the confidence of the cryptographic community

in the security of the widely employed hash functions MD5 and SHA-1. This has led to an

increased interest in the field of hash functions. As a result of the research on design

strategies of hash functions, new modes of operation emerged, with different design and

security characteristics. In this section we present some of the most important modes of

operation.

⁶ These constructions are usually called PGV, which is an acronym for Preneel, Govaerts and Vandewalle.

⁷ A compression function based on a single call to a block cipher.


2.3.5.1 Wide-pipe and Narrow-pipe Design

One important aspect of hash function design is the size of the internal state with regard to

the size of final hash output. The Merkle-Damgard construction is a so-called narrow-pipe

design where the size of the internal state is the same as the size of the final hash output

(l = n). In 2005, Lucks introduced the Wide-pipe Hash [Luc05]. The main idea behind

this design was to use an internal state of the hash function considerably larger than the hash

output (l ≫ n). More precisely, the size of the internal state is about twice as big as the

final hash output, which is obtained by chopping at the end of the iteration. As a consequence, Lucks

was able to provide a proof that generic second preimage attacks could not be faster than

exhaustive search. A drawback of this design is its slightly higher memory

requirements. This wide-pipe strategy has been employed in several SHA-3 competition

finalists, namely Grøstl, JH, Keccak and Skein.

2.3.5.2 HAIFA

The HAsh Iterative FrAmework was introduced by Biham and Dunkelman [BD07]. HAIFA

mode is basically a modified version of Merkle-Damgard mode where slight tweaks are

employed. In order to address the problem of generic second preimage attacks against

Merkle-Damgard, the designers of HAIFA accompanied each message block in the iteration

with a counter that tracks the number of message bits hashed so far and a fixed optional

salt⁸. The security property preservation of the HAIFA design, among others, was investigated

by Andreeva et al. [ANPS07]. Bouillaguet and Fouque proved HAIFA to be optimally

second preimage resistant if the underlying compression function is assumed to behave like

an ideal primitive [BF09]. The HAIFA design strategy was followed by the designers of

one SHA-3 competition finalist, namely BLAKE. Consequently, security results of HAIFA for

preimage, second preimage, collision, and indifferentiability, while assuming ideality of the

underlying compression function, are applicable for the BLAKE hash function.

Figure 2.2. The HAsh Iterative FrAmework - HAIFA construction.
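The role of the counter and the salt can be seen in a small sketch of a HAIFA-style iteration. Everything concrete here is an assumption made for illustration: the compression function is a truncated SHA-256 stand-in, the sizes are toy values, and the padding is a simplified length-strengthening rule rather than HAIFA's exact padding.

```python
import hashlib

N_BYTES = 8    # toy chaining-value size (assumption)
M_BYTES = 16   # toy message-block size (assumption)


def f(h: bytes, block: bytes, bits_so_far: int, salt: bytes) -> bytes:
    """Toy compression function that also takes the bit counter and the salt."""
    data = h + block + bits_so_far.to_bytes(8, "big") + salt
    return hashlib.sha256(data).digest()[:N_BYTES]


def haifa_style_hash(message: bytes, salt: bytes = bytes(8)) -> bytes:
    bit_len = 8 * len(message)
    # Simplified padding (assumption): '1' bit, '0' bits, 64-bit message length.
    padded = message + b"\x80"
    while (len(padded) + 8) % M_BYTES != 0:
        padded += b"\x00"
    padded += bit_len.to_bytes(8, "big")

    h = bytes(N_BYTES)  # fixed initial value
    for i in range(0, len(padded), M_BYTES):
        # Number of message bits hashed up to and including this block.
        bits_so_far = min(bit_len, 8 * (i + M_BYTES))
        h = f(h, padded[i:i + M_BYTES], bits_so_far, salt)
    return h
```

Because the counter value differs from block to block, a chaining value that occurs at one position cannot simply be reused at another position, which is what the generic second preimage attacks on plain Merkle-Damgard rely on.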

⁸ An input parameter for the compression function, which can be either public or secret.


2.3.5.3 Sponge

As an alternative to the Merkle-Damgard design, sponge functions were introduced by

Bertoni et al. [BDPA07]. Instead of iterating a secure compression function in order to

preserve security properties and to obtain a secure hash function, designers of sponge

functions considered a different approach where they iterate a possibly insecure compression

function a sufficient number of times to obtain a secure hash function. The internal state

iterated by sponge functions is r + c bits wide, where c is the so-called capacity. The hash

value is obtained after two phases: absorbing and squeezing. Sponge functions iteratively

“absorb” r-bit message blocks per compression function call and this process is called the

absorbing phase. Once the message is processed, the squeezing phase occurs and the first r

bits of the internal state are returned as output block in a possibly iterative manner. The

number of output blocks can be chosen by the user. The security guarantees for

most sponge-like constructions⁹ are typically based on indifferentiability results, which can be

seen in Section 3.3.3 and Section 3.3.4. The SHA-3 competition finalist based on the original

sponge function design is Keccak, while JH is regarded as a sponge-like hash function.

Figure 2.3. The sponge construction.
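The absorbing and squeezing phases can be sketched as follows. The sketch makes several assumptions that are not part of the text: the state is r + c = 64 + 128 bits, the fixed-width transformation is a truncated SHA-256 stand-in (so it is not a permutation), and the padding is a simplified rule chosen here.

```python
import hashlib

R_BYTES = 8    # rate r = 64 bits (assumption)
C_BYTES = 16   # capacity c = 128 bits (assumption)


def transform(state: bytes) -> bytes:
    """Toy fixed-width transformation on r + c bits (truncated SHA-256, not a permutation)."""
    return hashlib.sha256(state).digest()[:R_BYTES + C_BYTES]


def sponge(message: bytes, out_len: int) -> bytes:
    # Simplified padding (assumption): a '1' bit, then '0' bits up to a multiple of r.
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % R_BYTES)

    state = bytes(R_BYTES + C_BYTES)

    # Absorbing phase: XOR each r-bit block into the outer part, then transform.
    for i in range(0, len(padded), R_BYTES):
        block = padded[i:i + R_BYTES]
        outer = bytes(s ^ b for s, b in zip(state[:R_BYTES], block))
        state = transform(outer + state[R_BYTES:])

    # Squeezing phase: output the first r bits, transforming between output blocks.
    output = b""
    while len(output) < out_len:
        output += state[:R_BYTES]
        if len(output) < out_len:
            state = transform(state)
    return output[:out_len]
```

The c capacity bits are never output directly and never touched by message blocks, and the caller chooses the output length through out_len.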

2.3.6 Establishing Security of Hash Functions

In this section we analyze possible techniques from the provable security aspect that can be

used to obtain security reduction results. Throughout further analysis emphasis is placed

on the second preimage resistance.

⁹ By a sponge-like hash function we mean a hash function which employs a permutation based compression function and iterates a wide internal state.


2.3.6.1 Property Preservation

In Section 2.3.1.1 we showed how a hash function should be built in order to preserve the

collision resistance from the compression function to the complete hash function. Also,

generic security is discussed in Section 2.3.3, where it is pointed out that the Merkle-Damgard

construction does not preserve second preimage resistance. Furthermore, Andreeva et

al. [ANPS07, AMP10b] analyzed, among other things, the preservation of second preimage

resistance by various constructions. Unfortunately, only two of these constructions actually

preserve second preimage resistance, one of which is the ROX construction [ANPS07] while

the other one is BCM [AP09]. The reason why typically used constructions do not preserve

second preimage resistance is believed to be the introduction of fixed bits through

the state input by the initialization vector and possibly through the message input.

Another reason for non-preservation can be the presence of fixed padding bits in the message. As a

consequence, the second preimage resistance of the compression function does not directly

translate to the second preimage security of hash function based on the Merkle-Damgard

construction with final chopping.

2.3.6.2 Indifferentiability Results

In recent years, important progress in security analysis was made with the

introduction of the indifferentiability framework by Maurer et al. [MRH04]. In addition, this framework

was further developed in the context of hash functions by Coron et al. [CDMP05]. The

main principle behind this framework is as follows: in order to investigate the security of a

particular mode of operation, one can replace the underlying primitive (usually compression

function or even underlying building block of compression function such as permutation or

block cipher) with an ideal version of itself (a random function, a random permutation, an

ideal block cipher) and then compare the combination of the ideal primitive and the mode

of operation in question with the random oracle. Following this approach we can

determine whether this design is indifferentiable from a random oracle or not. A positive answer

would mean that the design behaves ideally up to a certain level. The level of resemblance

(typically expressed in the number of queries) between the concrete design and the random oracle is

regarded as an important security indicator. For us, the importance of this framework lies

in the fact that the result obtained within this framework indirectly provides bounds on

the (second) preimage and collision resistance of the hash function in question [AMP10c].


2.3.6.3 Idealized Proof Model

A reduction in the ideal model considers an information-theoretic adversary who

has only query access to the idealized underlying primitive (the compression function in this

case). Bouillaguet and Fouque [BF09] followed this approach to provide an optimal security

bound on the second preimage resistance in the ideal compression function model for the

Merkle-Damgard and HAIFA constructions. A benefit of a successfully conducted security reduction

is the guarantee that the hash function has no severe structural weaknesses, unless one

can detect a possible deviation from the random behavior in the underlying compression

function. In the latter case, the security results obtained by the reduction are invalid. Also, one

needs to be aware that an ideal compression function is quite a strong assumption. In the

problematic case, when the compression function exhibits non-random behavior, the level

of modularity can be refined in order to revalidate or improve security guarantees. In this

case, one needs to assume the ideal behavior of underlying building blocks of compression

function (e.g. the underlying block cipher or permutation(s)). In [BCC+08], the designers

of Shabal suggested an idealized proof model to assess the collision, preimage and second

preimage resistance of Shabal. More concretely, they proved Shabal secure in the ideal

cipher model by using the graph based simulation approach. Subsequently, Fouque et

al. [FSZ09] analyzed collision and preimage resistance of the construction identical to the

compression function of Grøstl. This analysis was performed in the ideal permutation

model. A summary of all known security reduction results for all 14 second round SHA-3

candidates in the ideal model was provided by Andreeva et al. [AMP10c]. Subsequently,

these results were revisited and updated in [AMPS12, ABM+12].

2.3.7 Security Model

As explained in Section 2.2 after the decision has been made on what to achieve within

the provable security framework, a formal adversarial model needs to be introduced, where

the security notion of the scheme in question has to be defined. In Section 2.3.2.1 formal

definitions of the three main security notions of hash functions are provided in the standard

model. However, in the idealized proof model where an underlying primitive of the compression

function is assumed to be ideal, these formal definitions differ slightly. Therefore, in order

to carry out a meaningful reduction we introduce the formal adversarial model which will

be used in our analysis. This setting is very similar to the analysis conducted in [BRS02,

FSZ09, AMP10c, AMPS12].

Let us assume that the underlying primitive of the compression function is an ideal primitive

(e.g. a random permutation, an ideal block cipher). In this model, the adversary A is

a probabilistic algorithm with oracle access to a primitive sampled uniformly at random,

$P \xleftarrow{\$} \mathrm{Prim}(H)$. The set Prim(H) depends on the chosen hash function (e.g. in the case of

a permutation-based hash function H1, the primitive P is chosen independently and uniformly

at random from the set of all permutations Prim(H1)).

We consider information-theoretic adversaries only. Hence, the adversary has unbounded

computational power and its only obstacle to succeed in an attack is the randomness of

the query response. The complexity is measured by the number of queries made to the

oracle. In this ideal model the adversary A is allowed to make at most q forward and

inverse queries to the oracle. All these queries are stored in a query history L as indexed

elements. Without loss of generality, we assume that L always contains the queries required

for the attack and that the adversary does not ask any oracle query for which the response

is already known. The definitions of preimage and second preimage that we use in the ideal

model correspond to the everywhere¹⁰ preimage and second preimage notions of [RS04].

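Definition 2.24. Let λ, n ∈ N, let $\mathcal{Y} = \{0,1\}^n$, $\mathcal{M} = \{0,1\}^{\lambda}$ and let $H : \{0,1\}^* \to \{0,1\}^n$

be a hash function. Then, the advantage of the adversary A against collision is

$$\mathbf{Adv}^{\mathrm{Col}}_H(A) = \Pr\left[ (M, M') \xleftarrow{\$} A(P) \,:\, M \neq M' \text{ and } H(M) = H(M') \right].$$

The advantage of the everywhere second preimage adversary A is defined as

$$\mathbf{Adv}^{\mathrm{eSec}[\lambda]}_H(A) = \max_{M \in \mathcal{M}} \Pr\left[ P \xleftarrow{\$} \mathrm{Prim}(H);\ M' \xleftarrow{\$} A(P) \,:\, M \neq M' \text{ and } H(M) = H(M') \right].$$

The advantage of an everywhere preimage adversary A is defined as

$$\mathbf{Adv}^{\mathrm{ePre}}_H(A) = \max_{Y \in \mathcal{Y}} \Pr\left[ P \xleftarrow{\$} \mathrm{Prim}(H);\ M' \xleftarrow{\$} A(P) \,:\, H(M') = Y \right].$$

For q ≥ 1 we write $\mathbf{Adv}^{xxx}_H(q) = \max_{A} \mathbf{Adv}^{xxx}_H(A)$, where the maximum is taken over all

adversaries that ask at most q oracle queries and $xxx \in \{\mathrm{ePre}, \mathrm{eSec}, \mathrm{Col}\}$.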

Above we defined the security notions of the hash function H in the formal adversarial

model. In addition, similar definitions can be used to define security notions of the compression

function f. The security analysis conducted in Chapter 3 and Chapter 4 is realized in this

adversarial model.
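The oracle of this adversarial model can be simulated by lazily sampling a random permutation that answers forward and inverse queries and records them in a query history. The class below is a minimal sketch; its name, the integer encoding of inputs and the explicit query budget are assumptions made here for illustration.

```python
import secrets


class LazyRandomPermutation:
    """Lazily sampled random permutation on n-bit integers with a query history."""

    def __init__(self, n_bits: int, max_queries: int):
        self.size = 1 << n_bits
        self.max_queries = max_queries  # the budget q
        self.forward = {}               # x -> P(x)
        self.inverse = {}               # y -> P^{-1}(y)
        self.history = []               # query history L as indexed pairs (x, y)

    def _fresh(self, used) -> int:
        # Rejection-sample a value that has not been assigned yet.
        while True:
            v = secrets.randbelow(self.size)
            if v not in used:
                return v

    def _record(self, x: int, y: int) -> None:
        if len(self.history) >= self.max_queries:
            raise RuntimeError("query budget q exhausted")
        self.forward[x] = y
        self.inverse[y] = x
        self.history.append((x, y))

    def query(self, x: int) -> int:
        """Forward query P(x); repeated queries are answered from the history."""
        if x not in self.forward:
            self._record(x, self._fresh(self.inverse))
        return self.forward[x]

    def query_inverse(self, y: int) -> int:
        """Inverse query P^{-1}(y)."""
        if y not in self.inverse:
            self._record(self._fresh(self.forward), y)
        return self.inverse[y]
```

Only fresh queries consume the budget, which matches the assumption that the adversary never asks a query whose response it already knows.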

¹⁰ Notice that the ePre and eSec notions of [RS04] rely (w.r.t. randomness) on the key generation, while in the keyless and ideal model setting they rely (w.r.t. randomness) on the random underlying primitive.

Chapter 3

NIST’s SHA-3 Hash Function Competition

This chapter briefly reviews the history of the SHA family, including NIST’s

SHA-3 hash function competition. Section 3.2 presents NIST’s requirements and evaluation

criteria for the SHA-3 hash function. Additionally, Section 3.3 provides a brief introduction

to the five finalists of the competition and their security and performance properties. Finally,

security and performance results are summarized in Section 3.4.

3.1 The History of SHA Family

In 1993, the US National Institute of Standards and Technology (NIST) published the first

Secure Hash Standard. Soon after having been published it was withdrawn due to flaws

in the design of Secure Hash Algorithm which was described in the Federal Information

Processing Standards Publication (FIPS PUBS) 180. That version of Secure Hash Algo-

rithm is commonly referred to as SHA-0. After being improved, FIPS 180-1 was published

in 1995 containing a specification of the hash function known as SHA-1. SHA-1 was the most widely used hash function algorithm in the following decade, even though the SHA-2 standard, published in FIPS 180-2 in 2001, has better security properties than SHA-1.

SHA-2 includes a significant number of changes from its predecessor SHA-1. After a series

of attacks on SHA-1 by Wang et al. [WYY05, WY05] together with results that raised a

question about the security of Merkle-Damgard construction [Dea99, Jou04, KS05, KK06]

NIST recommended the replacement of SHA-1 by the SHA-2 hash function family. On

November 2, 2007, NIST announced a call for the design of a new SHA-3 hashing algo-

rithm [NIS07], similarly to the development process for the Advanced Encryption Standard

(AES). The main goal of this public competition is to develop a new, secure cryptographic

hash algorithm, as a standard that can be used in generating digital signatures, message

authentication codes, and many other hash function applications. The selected algorithm

is intended to be available royalty-free worldwide. NIST defines three categories of eval-

uation criteria that will be used to compare candidate algorithms throughout the SHA-3

competition: 1) security, 2) cost and performance, and 3) algorithm and implementation

characteristics. The new hash algorithm will be referred to as “SHA-3”.

Sixty-four candidates mostly from Europe and North America were submitted for hash

function competition by October 31, 2008. The preliminary cryptanalysis showed that fifty-

one candidate algorithms meet the minimum of submission requirements. These candidates

were selected for the first round in the end of 2008. Later, on July 24, 2009, after public

feedback and internal reviews of the first-round candidates, NIST selected fourteen second-

round candidates using previously defined evaluation criteria. At the end of 2010, after one

year of public review, NIST announced five SHA-3 finalists: BLAKE, Grøstl, JH, Keccak,

and Skein. In order to improve their hash functions, submitters of the finalist algorithms

were allowed to make minor modifications to their algorithms and submit the final packages

to NIST by January 16, 2011. Similarly to the previous rounds, one-year public comment

period is planned for the finalists. NIST plans to choose a winner of the SHA-3 competition

in 2012.

3.2 SHA-3 Security Requirements and Evaluation Criteria

NIST specifies security as the most important evaluation criterion of the competition [NIS07].

Moreover, they define security requirements which are expected to be fulfilled by the future

SHA-3 hash algorithm. The minimum security requirements that NIST expects from the

SHA-3 hash function of hash value size n are:

1. collision resistance of approximately n/2 bits,

2. preimage resistance of approximately n bits,

3. second preimage resistance of approximately $n - L$ bits, where the length of the first

preimage is at most $2^L$ blocks,

4. resistance to length-extension attacks,

5. any m-bit hash function specified by taking a fixed subset of the candidate function's

output bits should meet the above requirements with m replacing n.
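The first three requirements in the list above can be turned into a small calculator. This is only an illustrative sketch: the class `SecurityTargets` and the function `nist_minimum_targets` are hypothetical names introduced here, and the values are the approximate bit levels quoted in the list, nothing more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityTargets:
    """Approximate minimum security levels, expressed in bits."""
    collision_bits: float
    preimage_bits: int
    second_preimage_bits: int


def nist_minimum_targets(n: int, L: int) -> SecurityTargets:
    """Targets for an n-bit digest when the first preimage has at most 2**L blocks."""
    if n <= 0 or not 0 <= L <= n:
        raise ValueError("expected n > 0 and 0 <= L <= n")
    return SecurityTargets(
        collision_bits=n / 2,        # requirement 1: about n/2 bits
        preimage_bits=n,             # requirement 2: about n bits
        second_preimage_bits=n - L,  # requirement 3: about n - L bits
    )


if __name__ == "__main__":
    # Requirement 5: a digest truncated to m bits is judged with m replacing n.
    for digest_size in (224, 256, 384, 512):
        print(digest_size, nist_minimum_targets(digest_size, L=20))
```

The gap between `preimage_bits` and `second_preimage_bits` is exactly the loss of L bits that a proof of optimal second preimage resistance would remove.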

As explained in Section 2.3.2.2 and Section 2.3.3, a standard hash function is expected to

satisfy these specified requirements. Certainly, an increase of second preimage resistance


(from approximately n − L bits up to resistance of approximately n bits) and resistance

against other attacks, such as multi-collision attacks, is seen as an advantage by NIST. Any

result that shows that the candidate hash function does not meet the specified requirements

is considered to be a serious attack. Therefore, special attention has to be directed towards

newly developed attacks. This is of great importance, especially if the level of security of

the hash function is lower than it is claimed by the submitter.

A good place to start security analysis is by checking the soundness of the mathematical

basis. This analysis can provide a good indication of the hash function design quality. To

select the best candidate, each submitted hash function is compared with other candidates

(of the same hash length) based on provided security results, regarding (second) preimage

resistance, collision resistance, and resistance to generic attacks. One additional security

property raised by the public during the evaluation process is the extent to which the

algorithm output is indifferentiable from a random oracle (see Section 2.3.6.2). In a sum-

mary, those candidates whose preliminary security analysis raised concerns were discarded

from the competition. Similarly, designs that have not received much feedback from the

cryptographic community were also considered as doubtful and they were discarded, too.

3.3 The Competition Finalists

In this section we present the five finalists. Beside their main characteristics, we provide

security properties and performance results of each finalist based on earlier works by An-

dreeva et al. [AMP10c, ABM+12, AMPS12] and Turan et al. [TPB+11].

3.3.1 BLAKE

The BLAKE hash function [AHMP10] uses HAIFA as iteration mode. BLAKE’s compression

function (see Figure 3.1) maintains a large inner state initialized with the internal state hi−1,

the salt S, and the counter Ci. Then the compression function iterates series of message-

dependent rounds. After these rounds, the new internal state is obtained by compressing

the inner state together with the old internal state and the salt. This internal design is so-

called local wide-pipe which is inspired by Lucks’ wide-pipe design [Luc05]. The compression

algorithm used in BLAKE is a modified version of Bernstein’s stream cipher ChaCha [Ber08].


Figure 3.1. BLAKE's compression function.

Security of BLAKE

As noted before, the security results of HAIFA (see Section 2.3.5.2) carry over to the BLAKE hash function under an idealness assumption on the compression function. Nevertheless, Andreeva et al. [ALM11] and Chang et al. [CNY11] independently showed that BLAKE's compression function is differentiable from a random compression function after about $2^{n/4}$ queries. This implies that BLAKE's compression function exhibits non-random behavior, and as a consequence the HAIFA security results in the ideal compression function model do not apply to the BLAKE hash function (see Section 2.3.6.3). In order to restore BLAKE's security guarantees, Andreeva et al. [ALM11] refined the level of modularity in the security analysis and revalidated the security results in the ideal cipher model. Firstly, they proved optimal security bounds on the compression function, $\mathrm{Adv}^{\mathrm{Col}}_f = \Theta(q^2/2^n)$ and $\mathrm{Adv}^{\mathrm{ePre}}_f = \Theta(q/2^n)$. Since the HAIFA design preserves collision and everywhere preimage resistance, these security results carry over from BLAKE's compression function to the BLAKE hash function. The everywhere second preimage property of BLAKE (see footnote 1) was analyzed directly in the ideal cipher model, and as a result BLAKE was proved optimally second preimage resistant, $\mathrm{Adv}^{\mathrm{eSec}}_H = \Theta(q/2^n)$. Finally, the BLAKE hash function is proved indifferentiable from a random oracle in the ideal cipher model [ALM11, CNY11].

Performance of BLAKE

BLAKE hash function as classified by NIST [TPB+11] is one of the top performers in soft-

ware across most platforms, while in hardware its performance is labeled as average. In

constrained environments, BLAKE is described as one of the top performers in speed with rel-

atively modest memory requirements. Moreover, BLAKE has a structure that allows flexible

designs.

1 The everywhere second preimage property is not preserved by the HAIFA design, as shown in [ANPS07].


3.3.2 Grøstl

The Grøstl hash function [GKM+11] uses a wide-pipe Merkle-Damgard construction with

a final transformation employed before chopping. Its compression function is based on two

AES-like, fixed and distinct permutations. All nonlinearity in the design is derived from

the AES S-box. Since the security of compression function is not optimal, Grøstl designers

employed a final transformation which is believed to be one-way and collision resistant, but

does not compress before the chopping. The reader is referred to Section 4.1 for a detailed

description.

Figure 3.2. The Grøstl hash function.

Security of Grøstl

At the center of the Grøstl security analysis is its permutation-based compression function. In relation to this, Fouque et al. [FSZ09] introduced a specific 2-permutation based construction and analyzed its collision and preimage resistance. Grøstl's compression function is based on this particular construction. Their results allow us to claim tight security bounds on the compression function for collision resistance, $\mathrm{Adv}^{\mathrm{Col}}_f = \Theta(q^4/2^l)$, and preimage resistance, $\mathrm{Adv}^{\mathrm{ePre}}_f = \Theta(q^2/2^l)$. Following the same arguments as in the security analysis of BLAKE, optimal bounds are obtained on collision and everywhere preimage resistance for the Grøstl hash function. Furthermore, the Grøstl hash function is proven indifferentiable from a random oracle if the underlying permutations are ideal [AMP10a]. The bound on second preimage resistance of Grøstl was previously unknown. In Chapter 4 we analyze the everywhere second preimage resistance of Grøstl in the ideal permutation model and we obtain the bound $\mathrm{Adv}^{\mathrm{eSec}}_H = \Theta(q/2^{n-L})$.


Performance of Grøstl

In [TPB+11], Grøstl is marked as an average performer in software across most platforms

while in hardware Grøstl’s performance is seen as above-average. In constrained environ-

ments, Grøstl has poor performance with modest memory requirements. It can be also

noted that Grøstl has a flexible structure that allows various area trade-offs.

3.3.3 JH

The JH hash function [Wu11] is a novel design that to some extent resembles a sponge construction. It can be viewed as a sponge-like construction, as it employs a compression function based on a fixed permutation and uses a wide-pipe Merkle-Damgard construction with final chopping as iteration mode, where the message size is m, the hash value size is n, and the internal state size l satisfies $l = 2m \geq 2n$. The permutation P is based on the AES design. All hash value sizes of JH use the same function, and each member of the JH family is selected by using its corresponding IV.

Figure 3.3. JH’s compression function.

Security of JH

As a consequence of the results of Black et al. [BCS05], the JH compression function is

insecure in the ideal permutation model. As a confirmation of this claim, collisions and

preimages can be found for JH compression function in one query to the permutation. In

their paper, Lee and Hong [LH11] proved that the JH hash function is optimally collision resistant, $\mathrm{Adv}^{\mathrm{Col}}_H = \Theta(q^2/2^n)$. Andreeva et al. [AMPS12] proved optimal bounds for preimage and second preimage resistance of JH for the n = 256 variant, while the bounds on preimage and second preimage resistance for the n = 512 variant are improved but still not optimal. Furthermore, the JH hash function is proven indifferentiable from a random oracle if the underlying permutation is assumed to be ideal [BMN10]. Later, Moody et al. [MPST12]

improved the indifferentiability bound on JH and confirmed (second) preimage results ob-

tained in [AMPS12].

Performance of JH

In [TPB+11] JH is described as an average to above-average performer in software and

hardware, while in constrained environments JH is regarded as average in performance.

Also, JH has modest memory requirements.

3.3.4 Keccak

The Keccak hash function [BDPA11] follows the sponge construction [BDPA07], but can

also be considered as a Merkle-Damgard construction with final chopping. It uses a single

large fixed permutation. The permutation can be seen as a combination of a linear mixing

operation and a very simple nonlinear mixing operation. What is interesting regarding this

hash design is that it uses a single design for variable hash output sizes.

Security of Keccak

Similarly to JH, the compression function of Keccak is based on one permutation and

the same results apply. Collisions and preimages can be found for Keccak’s compression

function in one query to the permutation. The sponge construction is proven indifferentiable

from a random oracle if the underlying permutation is assumed to be ideal [BDPA08] and

this result applies to Keccak. As noted in Section 2.3.6.2, an indifferentiability bound implies bounds on the other security properties. Following this approach, an optimal bound is obtained on collision resistance, $\mathrm{Adv}^{\mathrm{Col}}_H = \Theta(q^2/2^n)$, as well as on preimage and second preimage resistance, $\Theta(q/2^n)$, for Keccak in the ideal permutation model [AMP10c].

Performance of Keccak

The Keccak hash function is described by NIST [TPB+11] as an average performer in

software, while hardware performance of Keccak is regarded as excellent. In constrained

environments, Keccak is below-average in performance with modest memory requirements.

Keccak is highly parallelizable due to the design.


3.3.5 Skein

The Skein hash function [BKL+10] builds on the Unique Block Iteration (UBI). UBI mode

hashes an arbitrary-length string by iterating a compression function, which takes as input

an internal state, a message block, and a tweak. The compression function is based on the

Threefish tweakable block cipher in Matyas-Meyer-Oseas mode, as can be seen in Figure 3.4. The tweak encodes the number of bytes processed up to this point, the type of UBI mode and

special flags for the first and the last block. Skein supports variable output size. If a single

output block is not enough, Skein runs the output transformation several times. The most

innovative parts of Skein are the Threefish block cipher and the mode of operation. The

reader is referred to Section 4.2 for a detailed description.
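The Matyas-Meyer-Oseas idea behind Skein's compression function, $f(h, T, m) = E(h, T, m) \oplus m$ with the chaining value used as the cipher key, can be sketched as follows. This is an illustration under loud assumptions: `toy_encrypt` is a hypothetical 4-round Feistel cipher built from SHA-256, standing in for Threefish, and the tweak layout in `ubi_like_hash` (byte counter plus first/last flags) is a simplified stand-in for the real tweak encoding.

```python
import hashlib
from typing import List

BLOCK_BYTES = 16          # assumed toy block size (Threefish uses 32, 64 or 128 bytes)
HALF = BLOCK_BYTES // 2
ROUNDS = 4


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, tweak: bytes, i: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher (hypothetical, not Threefish)."""
    return hashlib.sha256(key + tweak + bytes([i]) + half).digest()[:HALF]


def toy_encrypt(key: bytes, tweak: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, tweak, i, right))
    return left + right


def toy_decrypt(key: bytes, tweak: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, tweak, i, left)), left
    return left + right


def mmo_compress(h: bytes, tweak: bytes, m: bytes) -> bytes:
    """Matyas-Meyer-Oseas: encrypt the block under the chaining value, then XOR the block."""
    return _xor(toy_encrypt(h, tweak, m), m)


def ubi_like_hash(iv: bytes, blocks: List[bytes]) -> bytes:
    """Iterate the compression function with a per-block tweak (simplified encoding)."""
    h, processed = iv, 0
    for idx, m in enumerate(blocks):
        processed += len(m)
        flags = bytes([int(idx == 0), int(idx == len(blocks) - 1)])
        h = mmo_compress(h, processed.to_bytes(8, "big") + flags, m)
    return h
```

Because the tweak changes with every block, each call to the compression function is made under a distinct cipher instance, which is the property the name Unique Block Iteration refers to.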

Figure 3.4. Hashing a three-block message using UBI mode.

Security of Skein

Due to the optimal security bounds on the compression function claimed by the submitters [BKL+09] and the property that Skein's mode of operation preserves collision resistance and everywhere preimage resistance, optimal bounds for these two properties are obtained. Furthermore, the Skein hash function is proven indifferentiable from a random oracle if the underlying tweakable block cipher is assumed to be ideal [BKL+09]. As derived in [AMP10c], this indifferentiability result implies a bound of $O\left(\frac{q}{2^n} + \frac{q^2}{2^l}\right)$ on the second preimage resistance. This second preimage bound for Skein is optimal for the n = 256 variant, while for the n = 512 variant it is not. In Chapter 4 we improve the bound on second preimage resistance to $\mathrm{Adv}^{\mathrm{eSec}}_H = \Theta(q/2^n)$ in the ideal cipher model.


Performance of Skein

NIST [TPB+11] rated Skein’s performance in software as above-average across most plat-

forms, particularly in 64-bit mode. In hardware, Skein’s throughput-to-area ratio is av-

erage to a little below-average. Results in constrained environments show that Skein has

above-average performance. Skein has modest memory requirements and benefits from the

pipelining used in modern processors.

3.4 A Summary of the Existing Results

In this section we provide a summary of the previously mentioned results. First, Section

3.4.1 presents the main advantages and drawbacks of the finalists recognized by NIST

[TPB+11], and then provides a schematic summary of security and performance results.

3.4.1 Factors of Favorability

BLAKE was promoted to the final round of NIST’s SHA-3 hash function competition due

to its high security margin, good performance in software, and its simple and clear

design.

Grøstl was chosen as a finalist because of its well-understood design and solid performance,

especially in hardware. Although the security properties of Grøstl are not ideal, the

amount of cryptanalysis that has been published on Grøstl and its building blocks

provides a degree of security in this design.

JH was selected as a finalist because of its solid security properties, good all-around performance, and innovative design. As drawbacks of the JH design, NIST emphasizes the not yet well-understood compression function construction together with the lack of analysis provided for this construction.

Keccak was selected by NIST for the final of competition, mainly due to its good security

properties, its high throughput and throughput-to-area ratio and the simplicity of its

design.

Skein advanced to the final, mainly due to its high security margin and speed in software.

3.4.2 A Summary of the Security and Performance Results

In Table 3.1 we briefly summarize the performance results presented in [TPB+11], in order to provide an insight into this important evaluation aspect. Let us emphasize that the description of the performance level (high, average and low) does not imply drastically different performance, considering that all these performance results are within the satisfactory range expected by NIST.

Table 3.1. A schematic summary of hardware and software results. The first column indicates the name of the hash function selected for the final of the competition, while the next three columns describe performance results in software, hardware and in constrained environments, respectively.

          Software   Hardware   Constrained settings

BLAKE     High       Average    High

Grøstl    Average    High       Low

JH        Average    Average    Average

Keccak    Average    High       Low

Skein     High       Low        High

As for the provable security results, the summary presented in our work is based on the

classification conducted by Andreeva et al. [AMP10c, AMPS12]. The first of these two

mentioned papers deals with provable security results of all 14 second round SHA-3 candi-

dates, while in the second paper as well as in this thesis the emphasis is placed on the five

competition finalists. Concretely, in Table 3.2 we presented all security reduction results

(for n = 256 and n = 512 variants of the SHA-3 hash function finalists) known to us. We

updated second preimage results of Grøstl and Skein obtained in Chapter 4 which are

illustrated in the table with a green box. A yellow box in the table is used to indicate prob-

lems which are still open, one of which is the lack of an optimal (second) preimage bound

for 512 bits variant of JH. Essentially, all the results are provided in the ideal permutation

or cipher model, which means that the strength of assumptions is weakened in comparison

to the ideal compression function assumption. If we take a look at the security bounds

on compression functions presented in this table, we can see that collisions and (second)

preimages can be found for the JH and Keccak compression function in one query to the

permutation and as a consequence these compression functions are regarded as insecure.

However, this does not invalidate security of the JH and Keccak hash functions.


Table 3.2. A schematic summary of security reduction results of the five finalists. The parameters n, l, m and 2^L denote the hash function output size, the internal value size, the message input size and the length of the first preimage in message blocks, respectively. The first column indicates the name of the hash function selected for the final of the competition, while the second column describes the underlying assumptions. The next three columns show the security bounds on compression functions, while the last three columns summarize the security reduction results on complete hash functions. A yellow box indicates the existence of a non-trivial upper bound which is not yet optimal for both the 256 and 512 bits variant. A green box indicates the security reduction results that are proven in this thesis, while the other results presented in this table are based on previous works [AMP10c, AMPS12].

         Model                     Adv^Coll_f   Adv^Pre_f    Adv^Sec_f    Adv^Coll_H   Adv^Pre_H                Adv^Sec_H

BLAKE    Ideal cipher E            Θ(q^2/2^n)   Θ(q/2^n)     Θ(q/2^n)     Θ(q^2/2^n)   Θ(q/2^n)                 Θ(q/2^n)

Grøstl   Ideal permutations P, Q   Θ(q^4/2^l)   Θ(q^2/2^l)   Θ(q^2/2^l)   Θ(q^2/2^n)   Θ(q/2^n)                 Θ(q/2^(n−L))

JH       Ideal permutation P       Θ(1)         Θ(1)         Θ(1)         Θ(q^2/2^n)   O(q/2^n + q^2/2^(l−m))   O(q/2^n + q^2/2^(l−m))

Keccak   Ideal permutation P       Θ(1)         Θ(1)         Θ(1)         Θ(q^2/2^n)   Θ(q/2^n)                 Θ(q/2^n)

Skein    Ideal block cipher E      Θ(q^2/2^l)   Θ(q/2^l)     Θ(q/2^l)     Θ(q^2/2^n)   Θ(q/2^n)                 Θ(q/2^n)

Chapter 4

Second Preimage Resistance of

Grøstl and Skein

This thesis is concerned with the second preimage resistance of SHA-3 candidates, namely

Grøstl and Skein. As explained in Chapter 3, an important evaluation criterion in the

competition for SHA-3 hash function is security (e.g. the possible reductions of the hash

function security to the security of its underlying building blocks). In this chapter we

provide a lower bound on second preimage resistance of Grøstl and Skein within the

concrete-security provable-security framework. The reader is referred to Section 2.3.6 and

Section 2.3.7 where the proof techniques and the security model used in this chapter are

discussed.

4.1 Security Analysis of Grøstl

As briefly presented in Section 3.3.2 Grøstl combines characteristics of the wide-pipe and

Merkle-Damgard constructions and uses two distinct permutations P and Q. Let us closely

observe Grøstl to see how the hash value is obtained. First, the padding function $\mathrm{pad}_G$ takes a message M of N bits and returns the padded message split into l-bit message blocks, $\mathrm{pad}_G(M) = m_1 \| m_2 \| \ldots \| m_k$, whose total length is a multiple of the message block size l. Padding is achieved by appending to the original message a single '1' bit followed by as many '0' bits as needed to complete an l-bit block after embedding the 64-bit representation of the number of message blocks in the padded message. Then, Grøstl iterates the permutation-based compression function $f : \{0,1\}^l \times \{0,1\}^l \to \{0,1\}^l$. Finally, the output of the last compression call is processed by the output transformation $g(h) = P(h) \oplus h$, after which the output is shortened from l to n bits with the function $\mathrm{short}_n$.

Figure 4.1. Grøstl’s compression function.

4.1.1 Assessing Second Preimage Resistance of Grøstl

A possible way to obtain a bound on the second preimage resistance of Grøstl is by using

indifferentiability results. Grøstl is proven indifferentiable from a random oracle if the

underlying permutations are ideal [AMP10a]. Briefly, a proved bound shows that Grøstl

behaves like a random oracle up to the birthday bound which is not enough for achieving

optimal second preimage resistance.

As indicated in [GKM+11], the underlying compression function of Grøstl exhibits non-ideal behavior (i.e. fixed points for the compression function can be found easily (see footnote 1), and the generalised birthday collision attack is applicable to the l-bit compression function of Grøstl with a complexity of $2^{l/3}$), which makes the result of Bouillaguet and Fouque [BF09]

in the ideal compression function model inapplicable. Therefore, in order to reconfirm the

second preimage resistance of Grøstl we explore further. More precisely, we assume ideality

of the underlying building blocks of compression function which in the case of Grøstl are

two permutations P and Q.
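The fixed-point observation can be checked on a toy instance. The sketch below assumes the Grøstl compression function $f(h, m) = P(h \oplus m) \oplus Q(m) \oplus h$ and replaces the real permutations with hypothetical random permutations on a 16-bit state (tables `P`, `Q` generated from fixed seeds), so it illustrates the structure only and says nothing about the real AES-like permutations.

```python
import random
from typing import List, Tuple

L_BITS = 16  # assumed toy state size; Grøstl itself uses l = 512 or l = 1024


def random_permutation(seed: int) -> Tuple[List[int], List[int]]:
    """Return a lookup table for a random permutation and for its inverse."""
    rng = random.Random(seed)
    table = list(range(1 << L_BITS))
    rng.shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return table, inverse


P, P_INV = random_permutation(1)  # hypothetical stand-in for Grøstl's P
Q, Q_INV = random_permutation(2)  # hypothetical stand-in for Grøstl's Q


def compress(h: int, m: int) -> int:
    """Grøstl-style compression function f(h, m) = P(h xor m) xor Q(m) xor h."""
    return P[h ^ m] ^ Q[m] ^ h


def fixed_point(m: int) -> int:
    """For an arbitrary block m, h = P^{-1}(Q(m)) xor m satisfies f(h, m) = h."""
    return P_INV[Q[m]] ^ m


if __name__ == "__main__":
    block = 0x1234
    state = fixed_point(block)
    print(hex(state), compress(state, block) == state)
```

The check works for every block m because $P(h \oplus m) = Q(m)$ cancels the two permutation outputs, leaving only the feed-forward value h.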

4.1.2 Proof of Security

Under the assumption that P and Q are random l-bit permutations, where l is the iterated state size and n is the output size, we will prove that the advantage of the second preimage adversary is upper bounded by $O\left(\frac{(k+1)q^2}{2^l} + \frac{2q}{2^n}\right)$, where the second preimage adversary makes at most q queries and the length of the target message is at most k blocks. In this ideal model, an adversary is allowed to make both forward and inverse queries to the random permutations P and Q. All these queries are stored in the query histories $L_P$ and $L_Q$ as indexed elements, and their numbers are $q_2$ and $q_1$, respectively, with $q_1 + q_2 \leq q$.

1 In order to find a fixed point, we select a message m arbitrarily and then compute $h = P^{-1}(Q(m)) \oplus m$. This gives a fixed point of Grøstl's compression function, $f(h, m) = h$.

Theorem 4.1. Let P, Q be two random l-bit permutations and let A be a computationally unbounded adversary which makes at most $q < 2^{l-1}$ queries to its oracles. Its advantage in breaking the second preimage resistance of H is upper bounded by:

$$\mathrm{Adv}^{\mathrm{eSec}[\lambda]}_H(q) \leq \frac{(k+1)q^2}{2^l} + \frac{2q}{2^n}.$$

Proof. We prove the theorem by using a graph based approach. To complete this proof, we

will introduce the graph construction setting, which is based on the definitions provided in

Section 2.1.2.

The Graph Construction. We introduce two initially empty lists $L_P$, $L_Q$. Let us denote by $L_Q = \{(\alpha_i, \beta_i)\}_{1 \leq i \leq q_1}$ a list such that $Q(\alpha_i) = \beta_i$ and by $L_P = \{(\alpha'_j, \beta'_j)\}_{1 \leq j \leq q_2}$ a list such that $P(\alpha'_j) = \beta'_j$, where each tuple $(\alpha, \beta) \in \{0,1\}^l \times \{0,1\}^l$. We introduce a directed graph (V, E), initially $(\{IV\}, \emptyset)$. Any $(\alpha_i, \beta_i) \in L_Q$ and $(\alpha'_j, \beta'_j) \in L_P$ define an edge e between two vertices in (V, E), which we denote by $\alpha_i \oplus \alpha'_j \xrightarrow{e_i} \alpha_i \oplus \alpha'_j \oplus \beta_i \oplus \beta'_j$. We define a path in the graph as a sequence of edges $p = (e_1, \ldots, e_{k+1})$ such that for each of its edges $e_i$, where $1 \leq i \leq k$, the output vertex is equal to the input vertex of $e_{i+1}$. We say that two distinct paths collide if they both start with the IV vertex and both end with the same output vertex.

Grøstl in the Graph Setting. Intuitively, an edge in (V, E) corresponds to an evaluation of the Grøstl compression function, and the number of edges is exactly $q_1 \cdot q_2$. For convenience, edges $e_i \in E$ are labeled by messages $m_i \in \{0,1\}^l$, where $m_i = \alpha_i$ and $1 \leq i \leq k$. The path in the graph (V, E) obtained while hashing the target message M is called the challenge path and is denoted by $IV \xrightarrow{m_1} h_1 \xrightarrow{m_2} h_2 \cdots \xrightarrow{m_k} h_k \xrightarrow{e_{k+1}} h_{k+1}$. It is necessary to emphasize that the first k internal states are l bits long, while $h_{k+1}$ (the n-bit hash value) is obtained by applying the output transformation together with the function $\mathrm{short}_n$ to the internal state $h_k$. We can conclude that a vertex in (V, E) corresponds to an internal state of the Grøstl hash function.
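The graph construction can be made concrete with a short sketch. It assumes toy 4-bit permutations (the affine maps `P` and `Q` below are hypothetical stand-ins, not random permutations) and builds one edge for every pair of a Q-query and a P-query, which gives the $q_1 \cdot q_2$ edges mentioned above. The function name `build_edges` is illustrative only.

```python
from itertools import product
from typing import List, Tuple

Query = Tuple[int, int]      # (input, output) of one oracle query
Edge = Tuple[int, int, int]  # (source vertex, target vertex, message label)

# Hypothetical 4-bit permutations: affine maps with odd multipliers modulo 16.
P = [(5 * x + 3) % 16 for x in range(16)]
Q = [(7 * x + 1) % 16 for x in range(16)]


def build_edges(LQ: List[Query], LP: List[Query]) -> List[Edge]:
    """Each pair (alpha, beta) in LQ and (alpha', beta') in LP defines the edge
    alpha ^ alpha'  ->  alpha ^ alpha' ^ beta ^ beta', labelled by m = alpha."""
    edges = []
    for (a, b), (a2, b2) in product(LQ, LP):
        source = a ^ a2
        edges.append((source, source ^ b ^ b2, a))
    return edges


if __name__ == "__main__":
    LQ = [(a, Q[a]) for a in (1, 2, 3)]  # q1 = 3 forward queries to Q
    LP = [(a, P[a]) for a in (0, 4)]     # q2 = 2 forward queries to P
    for source, target, label in build_edges(LQ, LP):
        print(f"{source:2d} --m={label}--> {target:2d}")
```

Each edge is one evaluation of the compression function: with $h = \alpha \oplus \alpha'$ and $m = \alpha$, the target vertex equals $P(h \oplus m) \oplus Q(m) \oplus h$.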

Let SP be the event that, as a result of adversary’s queries, a path which collides with and

differs from the challenge path is formed in the graph (V,E).

Claim 1. $\mathrm{Adv}^{\mathrm{eSec}[\lambda]}_H(q) \leq \Pr[\mathrm{SP}]$.

Proof. Suppose that the second preimage adversary A receives a randomly generated target message M, where $\mathrm{pad}_G(M) = m_1 \| m_2 \| \ldots \| m_k$, and outputs a message $M' \neq M$, where $\mathrm{pad}_G(M') = m'_1 \| m'_2 \| \ldots \| m'_s$, such that $H^{P,Q}(M) = H^{P,Q}(M')$ for the queried oracles P and Q. The adversary A makes all of the queries necessary to compute $H(M)$ and $H(M')$. We denote by $p = (m_1, m_2, \ldots, m_k, e_{k+1})$ the challenge path and by $p' = (m'_1, m'_2, \ldots, m'_s, e'_{s+1})$ the path obtained while hashing the message $M'$. We claim that the paths p and p' are colliding paths.

1. If $|M| \neq |M'|$, then due to the padding function of Grøstl the inputs of the last invocation of the compression function differ, $m_k \neq m'_s$, so clearly the paths p and p' induced by the messages M and M' are distinct.

2. Otherwise, $|M| = |M'|$. Since $h_{k+1} = h'_{s+1}$, either there is a second preimage for the output transformation or $h_k = h'_s$. In the latter case, either there is a second preimage on the compression function, or $(h_{k-1}, m_k) = (h'_{k-1}, m'_k)$. This argument repeats for each earlier call of the compression function. Since $|M| = |M'|$ and IV is fixed for both evaluations, either there is a second preimage at some point, or $m_i = m'_i$ for $1 \leq i \leq k$. In the latter case $M = M'$, which is impossible. Therefore, there exists at least one pair $(h'_{i-1}, m'_i) \neq (h_{i-1}, m_i)$, which implies that the paths p and p' are distinct.

Because M and M' collide, we have $h_{k+1} = h'_{s+1}$ and hence the paths p and p' end with

the same output vertex which means that they collide. Therefore, finding a message that

collides with the target message is equivalent to finding a path that collides with the

challenge path. This completes the proof of the Claim 1.

Claim 2. $\Pr[\mathrm{SP}] \leq \frac{(k+1)q^2}{2^l} + \frac{2q}{2^n}$.

Proof. Suppose that A wins. The SP event occurs when A succeeds in connecting a path

(different from the challenge path) in the graph (V,E) from IV to the challenge path. That

connection can happen in two ways:

Let C be the event in which a connection occurs on an internal state of the challenge

path before the output transformation is applied and let us name CO the event in which

connection occurs after the output transformation is applied.

Simulation. We simulate the execution of A and record in the lists $L_P$ and $L_Q$ the queries sent to the oracles P and Q, respectively. Every time A submits a new query to an oracle, it receives a uniformly distributed random value. Let $IV \xrightarrow{m_1} h_1 \xrightarrow{m_2} h_2 \cdots \xrightarrow{m_k} h_k \xrightarrow{e_{k+1}} h_{k+1}$ be the sequence of vertices crossed by the challenge path.

Case 1: If the event C occurs after the q-th query to the P and/or Q oracle, there exists a path p' in the graph, $IV \xrightarrow{m'_1} h'_1 \xrightarrow{m'_2} h'_2 \cdots \xrightarrow{m'_s} h'_s$, where $h'_s$ is equal to one of the internal states $h_i$ of the challenge path for $0 \leq i \leq k$ (with $h_0 = IV$). This means that the adversary has found a collision on the compression function f. More precisely, this collision is a second preimage of one of the k + 1 internal states for f.


Start the Simulation. Let us assume that the event C occurs after the adversary has sent a query to Q or $Q^{-1}$. Without loss of generality, we consider forward queries only. A tuple $(\alpha, \beta)$ is generated, where $\beta$ is a random value from a set of size at least $2^l - q_1$. The second preimage is found if the list $L_P$ contains a pair $(\alpha'_j, \beta'_j)$ such that $h_i = \alpha \oplus \alpha'_j \oplus \beta \oplus \beta'_j$, where $0 \leq i \leq k$. Since $1 \leq j \leq q_2$, each query to Q or $Q^{-1}$ generates $q_2$ new edges. Therefore, each query has a probability of at most $\frac{q_2 (k+1)}{2^l - q_1}$ to give a second preimage of one of the k + 1 internal states of the challenge path. Consequently, the probability that the event C occurs after the adversary asks at most $q_1$ queries to Q or $Q^{-1}$ is upper bounded by:

$$\Pr[C]_Q \leq \frac{(k+1) q_1 q_2}{2^l - q_1}.$$

Alternatively, the event C is realized after the adversary has sent a query to P or $P^{-1}$. An upper bound for this case is obtained in a similar way:

$$\Pr[C]_P \leq \frac{(k+1) q_1 q_2}{2^l - q_2}.$$

By the union bound, we obtain an upper bound on the probability that the event C occurs:

$$\Pr[C] \leq \Pr[C]_Q + \Pr[C]_P \leq \frac{(k+1) q_1 q_2}{2^l - q_1} + \frac{(k+1) q_1 q_2}{2^l - q_2}.$$

Case 2: A hash value $h_{k+1} = \mathrm{short}_n(P(h_k) \oplus h_k)$ is generated by applying the output transformation together with the function $\mathrm{short}_n$ (see footnote 2). The output transformation is built on top of the permutation P. Therefore, the event CO can be realized only after the adversary has sent a query to P or $P^{-1}$. Notice that each query generates precisely one output transformation edge. If the event CO occurs, there exists a path p' in the graph (V, E), $IV \xrightarrow{m'_1} h'_1 \xrightarrow{m'_2} h'_2 \cdots \xrightarrow{m'_s} h'_s \xrightarrow{e'_{s+1}} h'_{s+1}$, where $h'_{s+1} = h_{k+1}$ and $h'_s \neq h_k$. This implies that the adversary has found a second preimage on the output transformation for the n-bit value $h_{k+1}$.

Start the Simulation. The event $CO$ is realized after the adversary has sent a query
to $P$ or $P^{-1}$. Without loss of generality, only a forward tuple $(\alpha, \beta)$ is generated, and $\beta$
is a random value from a set of size at least $2^l - q_2$. The second preimage is found if
$h_{k+1} = \mathrm{short}_n(\alpha \oplus \beta)$. Therefore, each query to $P$ or $P^{-1}$ has probability at most
$\frac{2^{l-n}}{2^l - q_2}$ of giving a second preimage on the output transformation. Consequently, the probability
that event $CO$ occurs after the adversary asks at most $q_2$ queries to $P$ or $P^{-1}$ is upper
bounded by:

\[ \Pr[CO] = \Pr[CO]_P \le \frac{q_2 \cdot 2^{l-n}}{2^l - q_2}. \]

² The function $\mathrm{short}_n$ truncates the output by returning only the last $n$ bits.


Combining all cases, we give an upper bound on the probability that event $SP$ occurs:

\begin{align*}
\Pr[SP] &\le \Pr[C] + \Pr[CO] \\
&\le \frac{(k+1)\, q_1 q_2}{2^l - q_1} + \frac{(k+1)\, q_1 q_2}{2^l - q_2} + \frac{q_2 \cdot 2^{l-n}}{2^l - q_2} \\
&\le \frac{(k+1)\, q^2}{2^l} + \frac{q}{2^{n-1}}.
\end{align*}

In Appendix A we provide a detailed derivation of this inequality. This completes
the proof of Claim 2.

The result for second preimage resistance of Grøstl now follows from the combination of

the two claims which completes the proof of Theorem 4.1.
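To get a feel for how the simplification in Claim 2 behaves, the following Python sketch evaluates the intermediate expression and the simplified bound with exact rational arithmetic. The function names and the toy parameters are illustrative assumptions and not part of the original analysis.

```python
from fractions import Fraction


def grostl_detailed_bound(l, n, k, q1, q2):
    """Pr[C]_Q + Pr[C]_P + Pr[CO] before any simplification."""
    two_l = 2 ** l
    pr_c_q = Fraction((k + 1) * q1 * q2, two_l - q1)
    pr_c_p = Fraction((k + 1) * q1 * q2, two_l - q2)
    pr_co = Fraction(q2 * 2 ** (l - n), two_l - q2)
    return pr_c_q + pr_c_p + pr_co


def grostl_simplified_bound(l, n, k, q):
    """(k+1) q^2 / 2^l + q / 2^(n-1), intended for q = q1 + q2 < 2^(l-1)."""
    return Fraction((k + 1) * q * q, 2 ** l) + Fraction(q, 2 ** (n - 1))


if __name__ == "__main__":
    # Toy sizes (assumed for illustration only): l = 16, n = 8, k + 1 = 4 states.
    l, n, k = 16, 8, 3
    for q1, q2 in [(1, 1), (10, 30), (100, 100)]:
        detailed = grostl_detailed_bound(l, n, k, q1, q2)
        simplified = grostl_simplified_bound(l, n, k, q1 + q2)
        print(q1, q2, float(detailed), float(simplified))
```

The same functions accept realistic sizes, for example `l = 512` and `n = 256`, because Python integers and `Fraction` values have arbitrary precision.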

4.2 Security Analysis of Skein

As briefly presented in Section 3.3.5 the mode of operation employed in Skein called Unique

Block Iteration (UBI) takes as input an internal state, a message block, and a tweak. The

compression function is based on the Threefish tweakable block cipher used in Matyas-

Meyer-Oseas mode. The tweak encodes the number of bytes processed so far, the type of

UBI mode and special flags for the first and the last block. In normal hashing mode there

are three UBI invocations: the one for a configuration block used to generate IV , a message

hashing block and a block which represents the output transformation.

Figure 4.2. Skein in normal hashing mode.

The padding function $\mathrm{pad}_S$ takes a message $M$ of length $N$ bits and returns the padded
message split into message blocks, $\mathrm{pad}_S(M) = m_1 \| m_2 \| \cdots \| m_r$, whose total length
is a multiple of the message block size $l$. If $N$ is a multiple of 8, padding is achieved by

appending to the original message as many ’0’ bits as needed to complete an l-bit block.

Otherwise, padding is achieved by appending to the original message a single ’1’ bit followed


by as many ’0’ bits as needed to complete an l-bit block. Interestingly, Skein uses a block

counter included in the tweak rather than the usual strengthening. The designers claim that

the counter provides the same security as the typical padding where the message length is

appended in the end of message. Furthermore, the counter ensures that each message block

is hashed in the unique way. To obtain the hash value, the output of the last compression

call is processed by the output transformation after which the output size is optionally

shortened from l to n bits with the function shortn.
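The padding rule described above can be sketched in a few lines of Python. This is a minimal illustration that follows the wording of this section only: the name `pad_s`, the representation of messages as strings of '0' and '1' characters, and the treatment of the empty message (padded to one all-zero block) are assumptions made for the example.

```python
def pad_s(message_bits: str, l: int) -> list:
    """Pad a bit string as described in the text and split it into l-bit blocks."""
    if l <= 0:
        raise ValueError("block size l must be positive")
    if any(b not in "01" for b in message_bits):
        raise ValueError("message_bits must contain only '0' and '1'")

    padded = message_bits
    if len(padded) % 8 != 0:
        # Length is not a multiple of 8: append a single '1' bit first.
        padded += "1"

    # Round the length up to a multiple of l.
    # Assumption: the empty message still occupies one (all-zero) block.
    target = max(l, -(-len(padded) // l) * l)
    padded += "0" * (target - len(padded))

    return [padded[i:i + l] for i in range(0, len(padded), l)]
```

Note that no message length is appended here, which matches the remark that Skein relies on the counter in the tweak instead of the usual strengthening.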

4.2.1 Assessing Second Preimage Resistance of Skein

A possible way to obtain a bound on the second preimage resistance of Skein is by us-

ing indifferentiability results. The Skein hash function is proven indifferentiable from a

random oracle if the underlying tweakable block cipher is assumed to be ideal [BKL+09].

Additionally, an upper bound $O\!\left(\frac{q}{2^n} + \frac{q^2}{2^l}\right)$ on the second preimage resistance is derived via
the indifferentiability [AMP10c]. NIST requires SHA-3 to support output sizes $n = 224$, 256,
384, and 512 bits. The existing second preimage bound gives optimal second preimage resistance as
long as $2n \le l$.
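The condition $2n \le l$ can be read off directly from the indifferentiability-based bound $O(q/2^n + q^2/2^l)$. As a short worked check, ignoring the constants hidden by the O-notation, the quadratic term stays below the linear term for every query count up to the generic $2^n$ exactly when $2n \le l$:

```latex
\[
  \frac{q^{2}}{2^{l}} \le \frac{q}{2^{n}}
  \iff q \le 2^{\,l-n},
\]
\[
  2^{n} \le 2^{\,l-n} \iff 2n \le l,
  \qquad\text{and at } q = 2^{n}:\quad
  \frac{q}{2^{n}} + \frac{q^{2}}{2^{l}} = 1 + 2^{\,2n-l}.
\]
```

For $2n > l$ the term $q^2/2^l$ already reaches 1 at $q = 2^{l/2} < 2^n$, which is why a direct analysis is needed for the narrow-pipe versions.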

In order to prove an optimal bound on the second preimage resistance of narrow-pipe

versions of Skein, we will directly analyze second preimage resistance of Skein in the ideal

cipher model. Our proof follows the techniques used by Bouillaguet and Fouque [BF09] for
the HAIFA construction.

4.2.2 Proof of Security

Under the assumption that $E$ is an ideal tweakable block cipher, where $l$ is the iterated
state size and $n$ is the output size, we will prove that the advantage of the second preimage
adversary is upper bounded by $O\!\left(\frac{2q}{2^l} + \frac{2q}{2^n}\right)$, where the second preimage adversary makes
at most $q$ queries and the length of the target message is at most $r$ blocks. In this ideal model,
an adversary is allowed to make both forward and inverse queries to the oracle $E$. All
these queries are stored in a query history $L_E$ as indexed elements.

Theorem 4.2. Let $E$ be an ideal tweakable block cipher and let $A$ be a computationally
unbounded adversary which makes at most $q < 2^{l-1}$ queries. Its advantage in breaking the second preimage resistance of $H$ is upper bounded by:

\[ \mathbf{Adv}^{\mathrm{eSec}[\lambda]}_{H}(q) \le \frac{2q}{2^l} + \frac{2q}{2^n}. \]

Proof. The proof follows an approach used in the proof of Theorem 4.1.


The Graph Construction. Let $L_E = (k_i, x_i, t_i, y_i)_{1 \le i \le q}$ be an initially empty list such
that $y = E_k(t, x)$, where the tuple $(k, x, t, y) \in \{0,1\}^l \times \{0,1\}^l \times \{0,1\}^s \times \{0,1\}^l$. We introduce
an initially empty directed graph $(V, E)$. When the adversary $A$ sends a forward query
$(k, x, t)$ to the oracle $E$ it receives a value $y$, and when $A$ sends an inverse query $(k, t, y)$
to the oracle it receives a value $x$. An edge $e \in E$ is formed between two vertices in $V$,
$(k, x, t, y) \xrightarrow{e} (k', x', t', y')$, if $k' = y \oplus x$. We define a path in the graph $(V, E)$ as a sequence
of vertices, which we denote by $p = (k_1, x_1, t_1, y_1) \xrightarrow{e_1} \cdots \xrightarrow{e_r} (k_{r+1}, x_{r+1}, t_{r+1}, y_{r+1})$. We say
that two vertices $(k, x, t, y)$ and $(k', x', t', y')$ collide if $y \oplus x = y' \oplus x'$. Further, two distinct
paths collide if they both start with the same vertex and they both end with colliding
vertices.
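The definitions of vertices, edges, paths and collisions translate directly into code. The sketch below is only an illustration of these definitions: the names `Vertex`, `has_edge`, `vertices_collide`, `is_path` and `paths_collide` are hypothetical, values are modelled as Python integers, and paths are assumed to be non-empty lists.

```python
from typing import List, NamedTuple


class Vertex(NamedTuple):
    """One entry (k, x, t, y) of the query history, with y = E_k(t, x)."""
    k: int
    x: int
    t: int
    y: int

    def feed_forward(self) -> int:
        # The value y XOR x, which becomes the key of the next vertex.
        return self.y ^ self.x


def has_edge(u: Vertex, v: Vertex) -> bool:
    """There is an edge u -> v if k' = y XOR x."""
    return v.k == u.feed_forward()


def vertices_collide(u: Vertex, v: Vertex) -> bool:
    """Two vertices collide if y XOR x = y' XOR x'."""
    return u.feed_forward() == v.feed_forward()


def is_path(path: List[Vertex]) -> bool:
    """Every consecutive pair of vertices must be joined by an edge."""
    return all(has_edge(a, b) for a, b in zip(path, path[1:]))


def paths_collide(p: List[Vertex], p2: List[Vertex]) -> bool:
    """Distinct paths that start with the same vertex and end with colliding vertices."""
    return (p != p2 and is_path(p) and is_path(p2)
            and p[0] == p2[0] and vertices_collide(p[-1], p2[-1]))
```

For example, the vertices `Vertex(1, 2, 0, 7)`, `Vertex(5, 3, 1, 9)` and `Vertex(5, 6, 1, 12)` give two distinct two-vertex paths that share the first vertex and end in colliding vertices.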

Skein in the Graph Setting. Intuitively, an edge corresponds to precisely one evaluation
of Skein’s compression function. For each $i$, $1 \le i \le r$, the following holds: $m_i = x_i$, $h_{i-1} = k_i$, $t_i$³ is a
tweak value of the message type, and $h_i = y_i \oplus x_i$. The hash value $h_{r+1} = \mathrm{short}_n(y_{r+1} \oplus x_{r+1})$
is obtained by applying the output transformation with the final chopping on the internal
state $h_r = k_{r+1}$. In the output transformation, the tweak has the output type and the
64-bit counter is used instead of the message block input $x_{r+1}$. Without loss of generality, we
can replace the first UBI invocation for the configuration block with $IV = k_1$ and fix it as
a constant. If a path in $(V, E)$ is obtained while hashing the target message $M$, we refer to
this sequence as the challenge path. We denote by $(IV, h_1, \ldots, h_r)$ the sequence of internal
states crossed by the challenge path to obtain the hash value $h_{r+1}$.

Let SP be the event that, as a result of adversary’s queries, a path which collides with and

differs from the challenge path is formed in the graph (V,E), where the overlapping tweaks

coincide with each other.

Claim 1. $\mathbf{Adv}^{\mathrm{eSec}[\lambda]}_{H}(q) \le \Pr[SP]$.

Proof. Suppose that the second preimage adversary $A$ receives a randomly generated target
message $M$, where $\mathrm{pad}_S(M) = m_1 \| m_2 \| \cdots \| m_r$, and it outputs a message $M' \ne M$,
where $\mathrm{pad}_S(M') = m'_1 \| m'_2 \| \cdots \| m'_p$, such that $H^E(M) = H^E(M')$ for the queried oracle $E$.
Adversary $A$ makes all of the queries necessary to compute $H(M)$ and $H(M')$. Let us
denote by $p = (IV, x_1, t_1, y_1) \xrightarrow{e_1} \cdots \xrightarrow{e_r} (k_{r+1}, x_{r+1}, t_{r+1}, y_{r+1})$ the challenge path induced
by message $M$, and let us denote by $p' = (IV, x'_1, t'_1, y'_1) \xrightarrow{e'_1} \cdots \xrightarrow{e'_p} (k'_{p+1}, x'_{p+1}, t'_{p+1}, y'_{p+1})$
the path induced by message $M'$. We claim that paths $p$ and $p'$ are colliding paths.

³ We assume that tweaks $t$ and $t'$ in the definition of an edge correspond to one another in terms of bits processed so far, the type of UBI mode and special flags.


1. If $|M| \ne |M'|$, then the values of the tweak entering the output transformation are
different, $t_{r+1} \ne t'_{p+1}$, and so $(k_{r+1}, x_{r+1}, t_{r+1}, y_{r+1}) \ne (k'_{p+1}, x'_{p+1}, t'_{p+1}, y'_{p+1})$⁴.

2. Otherwise, $|M| = |M'|$. Since $h_{r+1} = h'_{r+1}$, either there is a second preimage on
the output transformation or $(h_r, t_{r+1}) = (h'_r, t'_{r+1})$. If the second statement is true,
either there is a second preimage on the compression function, where the tweak values must
be the same, $t_r = t'_r$, or $(h_{r-1}, m_r, t_r) = (h'_{r-1}, m'_r, t'_r)$. This argument repeats for the
compression function. Since $|M| = |M'|$ and $IV$ is fixed for both evaluations, either
there is a second preimage on the compression function at some point (for the same value
of the tweak), or $m_i = m'_i$ for $1 \le i \le r$. In the latter case, $M = M'$, which is impossible.
Therefore, there is some $i$, $1 \le i \le r$, such that $m_i \ne m'_i$, and so $(k_i, x_i, y_i) \ne (k'_i, x'_i, y'_i)$
for $t_i = t'_i$.

Since $M$ and $M'$ collide, $h_{r+1} = h'_{p+1}$, and hence $y_{r+1} \oplus x_{r+1} = y'_{p+1} \oplus x'_{p+1}$. Therefore, the
paths $p$ and $p'$ collide. This completes the proof of Claim 1.

Claim 2. $\Pr[SP] \le \frac{2q}{2^l} + \frac{2q}{2^n}$.

Proof. Suppose that A wins. As noted before, the SP event occurs when A succeeds

in connecting a path (different from challenge path) in the graph (V,E) from IV to the

challenge path, where the tweaks need to coincide. Similarly as in the case of Grøstl the

connection can happen in two ways:

Let C be the event in which a connection occurs on an internal state of the challenge

path before the output transformation is applied and let us name CO the event in which

connection occurs after the output transformation is applied.

Simulation. We simulate the execution of $A$ and record in the list $L_E$ the queries sent
to the oracle $E$. Every time $A$ submits a new query to the oracle, it receives a uniformly
distributed random value. We denote the challenge path induced by the target message $M$
by $p = (IV, x_1, t_1, y_1) \xrightarrow{e_1} \cdots \xrightarrow{e_r} (k_{r+1}, x_{r+1}, t_{r+1}, y_{r+1})$.

Case 1: If the event $C$ occurs after the adversary $A$ asks at most $q$ queries to the oracle
$E$, the graph $(V, E)$ contains a path $p' = (IV, x'_1, t'_1, y'_1) \xrightarrow{e'_1} \cdots \xrightarrow{e'_p} (k'_p, x'_p, t'_p, y'_p)$,
where the vertex $(k'_p, x'_p, t'_p, y'_p)$ collides with a vertex $(k_i, x_i, t_i, y_i)_{1 \le i \le r}$ from the challenge
path, such that $t'_p = t_i$. This means that the adversary has found a collision for the tweakable
compression function $f$. More precisely, this collision is actually a second preimage of one
of the $h_i$ from the challenge path, for $1 \le i \le r$.

⁴ As noted above, in the output transformation the block cipher is used in counter mode, and therefore
$x_{r+1} = x'_{p+1}$, but this does not affect our proof.


Start the Simulation. Without loss of generality, let us assume that event $C$ occurs after
the adversary has sent the $j$-th query. The tuple $(k'_j, x'_j, t'_j, y'_j)$ is generated, where $y'_j$ is a
random value from a set of size at least $2^l - j$. The only place where the path $p'$ can connect
to the challenge path is the vertex where $t'_j = t_i$, for $1 \le i \le r$. A second preimage on the
tweakable compression function is found if $y_i \oplus x_i = y'_j \oplus x'_j$. Therefore, the $j$-th query has
probability at most $1/(2^l - j)$ of giving this second preimage. Consequently, the probability
that event $C$ occurs after the adversary asks at most $q$ queries to $E$ is upper bounded by:

\[ \Pr[C] \le \sum_{j=1}^{q} \frac{1}{2^l - j} \le \frac{q}{2^l - q}. \]

Case 2: As noted before, the hash value $h_{r+1}$ is obtained by applying the output transformation
with the final chopping on the internal state $h_r = k_{r+1}$ of the challenge path.
If the event $CO$ occurs after the adversary $A$ asks at most $q$ queries to the oracle $E$,
the graph $(V, E)$ contains a path $p' = (IV, x'_1, t'_1, y'_1) \xrightarrow{e'_1} \cdots \xrightarrow{e'_p} (k'_{p+1}, x'_{p+1}, t'_{p+1}, y'_{p+1})$, where
$h_{r+1} = \mathrm{short}_n(y'_{p+1} \oplus x'_{p+1})$ after the final chopping of the $l - n$ leftmost bits. This means that the
adversary has found a second preimage on the output transformation.

Start the Simulation. Let us assume that event $CO$ occurs after the adversary has sent the $i$-th
query. The tuple $(k'_i, x'_i, t'_i, y'_i)$ is generated, where $y'_i$ is a random value from a set of
size at least $2^l - i$. A second preimage on the output transformation is found if and only if
$h_{r+1} = \mathrm{short}_n(y'_i \oplus x'_i)$. Therefore, the $i$-th query has probability at most $\frac{2^{l-n}}{2^l - i}$ of giving
this second preimage. Consequently, the probability that event $CO$ occurs after adversary $A$ asks $q$ queries to $E$ is upper bounded by:

\[ \Pr[CO] \le \sum_{i=1}^{q} \frac{2^{l-n}}{2^l - i} \le \frac{q \cdot 2^{l-n}}{2^l - q}. \]

Combining both cases, we give an upper bound on the probability that event $SP$ occurs:

\begin{align*}
\Pr[SP] &\le \Pr[C] + \Pr[CO] \\
&\le \frac{q}{2^l - q} + \frac{q \cdot 2^{l-n}}{2^l - q} \\
&\le \frac{2q}{2^l} + \frac{2q}{2^n}.
\end{align*}

We obtain this result in the same way as in the proof for Grøstl: for $q < 2^{l-1}$ we have $\frac{1}{2^l - q} \le \frac{2}{2^l}$. If
the final chopping is not needed, i.e. $n = l$, the result remains valid. This completes the proof
of Claim 2.


The result for the second preimage resistance of Skein now follows from the combination

of the two claims which completes the proof of Theorem 4.2.
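The chain of inequalities behind Theorem 4.2 can be checked numerically for small parameters. The following Python sketch uses exact rational arithmetic; the function names and the toy sizes are assumptions chosen only so that the sums are cheap to evaluate.

```python
from fractions import Fraction


def skein_exact_sum(l, n, q):
    """Sum over j = 1..q of 1/(2^l - j) + 2^(l-n)/(2^l - j)."""
    per_query = 1 + 2 ** (l - n)
    return sum((Fraction(per_query, 2 ** l - j) for j in range(1, q + 1)),
               Fraction(0))


def skein_intermediate_bound(l, n, q):
    """q / (2^l - q) + q * 2^(l-n) / (2^l - q)."""
    return Fraction(q * (1 + 2 ** (l - n)), 2 ** l - q)


def skein_final_bound(l, n, q):
    """2q / 2^l + 2q / 2^n, as stated in Theorem 4.2."""
    return Fraction(2 * q, 2 ** l) + Fraction(2 * q, 2 ** n)


if __name__ == "__main__":
    l, n = 10, 8  # toy sizes, not actual Skein parameters
    for q in (1, 10, 100, 511):  # all satisfy q < 2^(l-1) = 512
        print(q,
              float(skein_exact_sum(l, n, q)),
              float(skein_intermediate_bound(l, n, q)),
              float(skein_final_bound(l, n, q)))
```

Each printed row should be non-decreasing from left to right after the query count, mirroring the three lines of the derivation.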

Chapter 5

Conclusions and Remarks

In this chapter, we offer a brief summary of the work done in the thesis and then discuss
its implications for future study.

5.1 Conclusions

In this thesis we considered the final round candidates in the competition for a new SHA-3

hashing algorithm within the provable security framework. To be able to carry out the

analysis, we became familiar with the provable security approach together with the state

of the art of hash functions, and more closely with the competition finalists. As shown

in Chapter 4, we provided a lower bound on second preimage resistance of Grøstl and

Skein in the ideal model. The obtained results for Grøstl in the ideal permutation model

confirm the claim that the Merkle-Damgård iteration loses a factor linear in the message
length (in blocks) of the second preimage security in the ideal compression function model
[KS05]. Secondly, Skein’s bound shows that the addition of a tweak which entails a unique

compression function call results in an increase of the second preimage resistance (up to

approximately n bits). In Table 3.2 we presented the existing security reduction results and

updated those obtained in our work. One needs to be aware of shortcomings of provable

security approach in the ideal model while looking at these security reduction results. Some
classes of attacks may still be possible, such as timing attacks, differential fault analysis,
and differential power analysis. Sometimes the applied proof techniques or human factors (e.g.
flaws in the proof, or a proof given in the wrong model or for the wrong problem) may affect
the accuracy of a security reduction. However, security reduction results are of great

importance since they give us a very good indication that the higher level structure has

no flaws in the design. More concretely, they show that no attack on the hash function is

possible without exploiting a weakness of the underlying idealized primitive.


5.2 Summary of Contributions

Bearing in mind the importance of valid security guarantees, we see our results as a valuable

contribution to the SHA-3 competition. The main contributions of this thesis are:

• The analysis of the second preimage resistance of hash function competition finalists

Grøstl and Skein. Within the concrete-security provable-security framework, we

gave a lower bound on the second preimage resistance of Grøstl in the ideal permu-

tation model and Skein in the ideal cipher model and proved them both optimally

second preimage resistant.

• While seeking solutions, we investigated the existing proof techniques concerning

security notions with an emphasis on the second preimage resistance.

• In addition, we gave a concise survey of the five finalists together with their security

reductions and performance results.

5.3 Future Research

In recent years, the NIST SHA-3 competition has focused the attention of the cryptographic
community and initiated broad research on the design principles and analysis of hash

functions. As a result many new ideas emerged regarding construction designs, cryptanal-

ysis, proof techniques, etc. Also, new directions for further research related to this topic

were identified. We now list some open problems:

• Firstly, as can be seen in Table 3.2 the provided bounds on the preimage and second

preimage resistance of JH are not optimal.

• Once we provide a reduction of the security (Col, ePre, eSec) of the hash function to

the security of some underlying atomic primitive (under the assumption that particu-

lar underlying primitive is ideal), a more detailed analysis of that particular primitive

can be conducted with the goal to investigate its resistance to existing and new at-

tacks.

• All security reduction results presented in this work were carried out in the ideal

model. Supporting second preimage resistance with a proof in the standard model

still remains a substantial challenge. A possible direction would be an attempt
to design a construction that is efficient in practice and has the second preimage preservation
property.


• More fundamentally, definitions and a classification of the main security properties

are still not completely understood, while new practical applications emerge with the

demand for subtle security requirements.

• There is a need for developing new methods to assess security and to develop new

attacks and design ideas.

• Finally, a broad range of use and the number of existing security requirements as

well as performance requirements make the hash function design more complex. One

solution would be to effectively parse these requirements into certain related entities

and to design different hash functions which would deal with each of these entities.

Bibliography

[ABF+08] Elena Andreeva, Charles Bouillaguet, Pierre-Alain Fouque, Jonathan J. Hoch,

John Kelsey, Adi Shamir, and Sebastien Zimmer. Second Preimage Attacks on

Dithered Hash Functions. In Nigel P. Smart, editor, EUROCRYPT, volume

4965 of Lecture Notes in Computer Science, pages 270–288. Springer, 2008.

[ABM+12] Elena Andreeva, Andrey Bogdanov, Bart Mennink, Bart Preneel, and Christian

Rechberger. On Security Arguments of the Second Round SHA-3 Candidates.

International Journal of Information Security, 11(2):103–120, 2012.

[AHMP10] Jean-Philippe Aumasson, Luca Henzen, Willi Meier, and Raphael C.-W. Phan.

SHA-3 proposal BLAKE. Submission to NIST (Round 3), 2010.

[ALM11] Elena Andreeva, Atul Luykx, and Bart Mennink. Provable Security of BLAKE

with Non-Ideal Compression Function. IACR Cryptology ePrint Archive, Re-

port 2011/620, 2011.

[AMP10a] Elena Andreeva, Bart Mennink, and Bart Preneel. On the Indifferentiability

of the Grøstl Hash Function. In Juan A. Garay and Roberto De Prisco, edi-

tors, SCN, volume 6280 of Lecture Notes in Computer Science, pages 88–105.

Springer, 2010.

[AMP10b] Elena Andreeva, Bart Mennink, and Bart Preneel. Security Properties of Do-

main Extenders for Cryptographic Hash Functions. JIPS, 6(4):453–480, 2010.

[AMP10c] Elena Andreeva, Bart Mennink, and Bart Preneel. Security Reductions of the

Second Round SHA-3 Candidates. In Mike Burmester, Gene Tsudik, Spyros S.

Magliveras, and Ivana Ilic, editors, ISC, volume 6531 of Lecture Notes in Com-

puter Science, pages 39–53. Springer, 2010.

[AMPS12] Elena Andreeva, Bart Mennink, Bart Preneel, and Marjan Skrobot. Security

Analysis and Comparison of the SHA-3 Finalists BLAKE, Grøstl, JH, Keccak,

and Skein. In Aikaterini Mitrokotsa and Serge Vaudenay, editors, Progress

in Cryptology - AFRICACRYPT, volume 7374 of Lecture Notes in Computer

Science, pages 287–305. Springer, Heidelberg, 2012.


[And10] Elena Andreeva. Domain Extenders for Cryptographic Hash Functions. PhD

thesis, Katholieke Universiteit Leuven, 2010.

[ANPS07] Elena Andreeva, Gregory Neven, Bart Preneel, and Thomas Shrimpton. Seven-

Property-Preserving Iterated Hashing: ROX. In Kaoru Kurosawa, editor, ASI-

ACRYPT, volume 4833 of Lecture Notes in Computer Science, pages 130–146.

Springer, 2007.

[AP09] Elena Andreeva and Bart Preneel. A Three-Property-Secure Hash Function.

In Roberto Maria Avanzi, Liam Keliher, and Francesco Sica, editors, Selected

Areas in Cryptography, volume 5381 of Lecture Notes in Computer Science,

pages 228–244. Springer, 2009.

[AS11] Elena Andreeva and Martijn Stam. The Symbiosis between Collision and Preim-

age Resistance. In Liqun Chen, editor, IMA Int. Conf., volume 7089 of Lecture

Notes in Computer Science, pages 152–171. Springer, 2011.

[BCC+08] Emmanuel Bresson, Anne Canteaut, Benoît Chevallier-Mames, Christophe
Clavier, Thomas Fuhr, Aline Gouget, Thomas Icart, Jean-François Misarsky,
María Naya-Plasencia, Pascal Paillier, Thomas Pornin, Jean-René Reinhard,

Celine Thuillet, and Marion Videau. Shabal, a Submission to NIST’s Crypto-

graphic Hash Algorithm Competition. Submission to NIST, 2008.

[BCS05] John Black, Martin Cochran, and Thomas Shrimpton. On the Impossibility of

Highly-Efficient Blockcipher-Based Hash Functions. In Ronald Cramer, editor,

EUROCRYPT, volume 3494 of Lecture Notes in Computer Science, pages 526–

541. Springer, 2005.

[BD07] Eli Biham and Orr Dunkelman. A Framework for Iterative Hash Functions -

HAIFA. IACR Cryptology ePrint Archive, Report 2007/278, 2007.

[BDPA07] Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. Sponge

functions. ECRYPT Hash Workshop, 2007.

[BDPA08] Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. On

the Indifferentiability of the Sponge Construction. In Nigel P. Smart, editor,

EUROCRYPT, volume 4965 of Lecture Notes in Computer Science, pages 181–


[BDPA11] Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. The

Keccak SHA-3 submission. Submission to NIST (Round 3), 2011.

[Ber08] Daniel J. Bernstein. ChaCha, a variant of Salsa20, 2008.
http://cr.yp.to/chacha/chacha-20080128.pdf.


[BF09] Charles Bouillaguet and Pierre-Alain Fouque. Practical Hash Functions Con-

structions Resistant to Generic Second Preimage Attacks Beyond the Birthday

Bound, 2009.

[BKL+09] Mihir Bellare, Tadayoshi Kohno, Stefan Lucks, Niels Ferguson, Bruce Schneier,

Doug Whiting, Jon Callas, and Jesse Walker. Provable Security Support for

The Skein Hash Family, 2009.

[BKL+10] Mihir Bellare, Tadayoshi Kohno, Stefan Lucks, Niels Ferguson, Bruce Schneier,

Doug Whiting, Jon Callas, and Jesse Walker. The Skein Hash Function Family.

Submission to NIST (Round 3), 2010.

[BMN10] Rishiraj Bhattacharyya, Avradip Mandal, and Mridul Nandi. Security Anal-

ysis of the Mode of JH Hash Function. In Seokhie Hong and Tetsu Iwata,

editors, FSE, volume 6147 of Lecture Notes in Computer Science, pages 168–


[Bou11] Charles Bouillaguet. Études d’hypothèses algorithmiques et attaques de primitives
cryptographiques. PhD thesis, Université Paris Diderot, 2011.

[BR93] Mihir Bellare and Phillip Rogaway. Random Oracles are Practical: A Paradigm

for Designing Efficient Protocols. In Dorothy E. Denning, Raymond Pyle, Ravi

Ganesan, Ravi S. Sandhu, and Victoria Ashby, editors, ACM Conference on

Computer and Communications Security, pages 62–73. ACM, 1993.

[BRS02] John Black, Phillip Rogaway, and Thomas Shrimpton. Black-Box Analysis

of the Block-Cipher-Based Hash-Function Constructions from PGV. In Moti

Yung, editor, CRYPTO, volume 2442 of Lecture Notes in Computer Science,


[CDMP05] Jean-Sebastien Coron, Yevgeniy Dodis, Cecile Malinaud, and Prashant Puniya.

Merkle-Damgard Revisited: How to Construct a Hash Function. In Victor

Shoup, editor, CRYPTO, volume 3621 of Lecture Notes in Computer Science,


[CNY11] Donghoon Chang, Mridul Nandi, and Moti Yung. Indifferentiability of the Hash

Algorithm BLAKE. IACR Cryptology ePrint Archive, Report 2011/623, 2011.

[Dam90] Ivan Damgard. A Design Principle for Hash Functions. In Gilles Brassard, ed-

itor, Advances in Cryptology - CRYPTO, 9th Annual International Cryptology

Conference, Santa Barbara, California, USA, August 20-24, 1989, Proceedings,

volume 435 of Lecture Notes in Computer Science, pages 416–427. Springer,

1990.


[Dea99] Richard Dean. Formal Aspects of Mobile Code Security. PhD thesis, Princeton

University, 1999.

[DH76] Whitfield Diffie and Martin E. Hellman. New Directions in Cryptography.

IEEE Transactions on Information Theory, IT-22(6):644–654, 1976.

[Die10] Reinhard Diestel. Graph Theory (Graduate Texts in Mathematics). Springer-

Verlag, 2010.

[FS86] Amos Fiat and Adi Shamir. How to Prove Yourself: Practical Solutions to Iden-

tification and Signature Problems. In Andrew M. Odlyzko, editor, CRYPTO,


1986.

[FSZ09] Pierre-Alain Fouque, Jacques Stern, and Sebastien Zimmer. Cryptanalysis of

Tweaked Versions of SMASH and Reparation. In Roberto Maria Avanzi, Liam

Keliher, and Francesco Sica, editors, Selected Areas in Cryptography, volume


[GKM+11] Praveen Gauravaram, Lars R. Knudsen, Krystian Matusiewicz, Florian Mendel,

Christian Rechberger, Martin Schlaffer, and Søren S. Thomsen. Grøstl – a SHA-

3 candidate. Submission to NIST (Round 3), 2011.

[GM84] Shafi Goldwasser and Silvio Micali. Probabilistic Encryption. Journal of Computer
and System Sciences, 28(2):270–299, 1984.

[Jou04] Antoine Joux. Multicollisions in Iterated Hash Functions. Application to

Cascaded Constructions. In Matt Franklin, editor, Advances in Cryptology

CRYPTO, volume 3152 of Lecture Notes in Computer Science, chapter 19,

pages 99–213. Springer, Berlin, Heidelberg, 2004.

[KK06] John Kelsey and Tadayoshi Kohno. Herding Hash Functions and the Nos-

tradamus Attack. In Serge Vaudenay, editor, EUROCRYPT, volume 4004 of

Lecture Notes in Computer Science, pages 183–200. Springer, 2006.

[KS05] John Kelsey and Bruce Schneier. Second preimages on n-bit hash functions

for much less than 2n work. In Ronald Cramer, editor, EUROCRYPT, volume


[LH11] Jooyoung Lee and Deukjo Hong. Collision Resistance of the JH Hash Function.

IACR Cryptology ePrint Archive, Report 2011/19, 2011.

[LM92] Xuejia Lai and James L. Massey. Hash Function Based on Block Ciphers.

In Rainer A. Rueppel, editor, EUROCRYPT, volume 658 of Lecture Notes in

Computer Science, pages 55–70. Springer, 1992.


[Luc05] Stefan Lucks. A Failure-Friendly Design Principle for Hash Functions. In

Bimal K. Roy, editor, ASIACRYPT, volume 3788 of Lecture Notes in Computer

Science, pages 474–494. Springer, 2005.

[Mer79] Ralph Merkle. Secrecy, Authentication, and Public Key Systems. PhD thesis,

UMI Research Press, 1979.

[Mer90] Ralph C. Merkle. One Way Hash Functions and DES. In Gilles Brassard, edi-

tor, Advances in Cryptology - CRYPTO, 9th Annual International Cryptology

Conference, Santa Barbara, California, USA, August 20-24, 1989, Proceedings,


1990.

[MPST12] Dustin Moody, Souradyuti Paul, and Daniel Smith-Tone. Improved Indiffer-

entiability Security Bound for the JH Mode. In NIST’s 3rd SHA-3 Candidate

Conference 2012, 2012.

[MRH04] Ueli M. Maurer, Renato Renner, and Clemens Holenstein. Indifferentiability,

Impossibility Results on Reductions, and Applications to the Random Oracle

Methodology. In Moni Naor, editor, TCC, volume 2951 of Lecture Notes in


[MvOV97] Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone. Handbook of

Applied Cryptography. CRC Press, 1997.

[NIS07] NIST. Announcing Request for Candidate Algorithm Nominations for a New

Cryptographic Hash Algorithm. Technical report, NIST, 2007.

[PGV93] Bart Preneel, Rene Govaerts, and Joos Vandewalle. Hash functions based on

block ciphers: A synthetic approach. In Advances in Cryptology - CRYPTO,

Lecture Notes in Computer Science, pages 368–378. Springer-Verlag, 1993.

[Rab78] Michael O. Rabin. Digitalized signatures. In Foundations of Secure Computa-

tion, pages 155–166. Academic Press, 1978.

[Rog06] Phillip Rogaway. Formalizing Human Ignorance. In Phong Q. Nguyen, editor,

VIETCRYPT, volume 4341 of Lecture Notes in Computer Science, pages 211–


[RS04] Phillip Rogaway and Thomas Shrimpton. Cryptographic Hash-Function Basics:

Definitions, Implications, and Separations for Preimage Resistance, Second-

Preimage Resistance, and Collision Resistance. In Bimal K. Roy and Willi

Meier, editors, FSE, volume 3017 of Lecture Notes in Computer Science, pages

371–388. Springer, 2004.


[RS08a] Phillip Rogaway and John P. Steinberger. Constructing Cryptographic Hash

Functions from Fixed-Key Blockciphers. In David Wagner, editor, CRYPTO,


2008.

[RS08b] Phillip Rogaway and John P. Steinberger. Security/Efficiency Tradeoffs for

Permutation-Based Hashing. In Nigel P. Smart, editor, EUROCRYPT, volume


[Sta08] Martijn Stam. Beyond Uniformity: Better Security/Efficiency Tradeoffs for

Compression Functions. In David Wagner, editor, CRYPTO, volume 5157 of

Lecture Notes in Computer Science, pages 397–412. Springer, 2008.

[Sta09] Martijn Stam. Blockcipher-Based Hashing Revisited. In Orr Dunkelman, editor,

FSE, volume 5665 of Lecture Notes in Computer Science, pages 67–83. Springer,

2009.

[TPB+11] Meltem Sonmez Turan, Ray Perlner, Lawrence E. Bassham, William Burr,

Donghoon Chang, Shu jen Chang, Morris J. Dworkin, John M. Kelsey,

Souradyuti Paul, and Rene Peralta. Status Report on the Second Round of the

SHA-3 Cryptographic Hash Algorithm Competition. Technical report, NIST,

2011.

[Wag02] David Wagner. A Generalized Birthday Problem. In Moti Yung, editor,

CRYPTO, volume 2442 of Lecture Notes in Computer Science, pages 288–303.

Springer, 2002.

[Wu11] Hongjun Wu. The Hash Function JH. Submission to NIST (round 3), 2011.

[WY05] Xiaoyun Wang and Hongbo Yu. How to Break MD5 and Other Hash Functions.

In Ronald Cramer, editor, EUROCRYPT, volume 3494 of Lecture Notes in


[WYY05] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu. Finding Collisions in the Full

SHA-1. In Victor Shoup, editor, CRYPTO, volume 3621 of Lecture Notes in


Appendix A

Mathematical Derivations

A.1 Security Bound on Second Preimage of Grøstl

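\begin{align}
\Pr[SP] &\le \Pr[C] + \Pr[CO] \tag{A.1} \\
&\le \frac{(k+1)\, q_1 q_2}{2^l - q_1} + \frac{(k+1)\, q_1 q_2}{2^l - q_2} + \frac{q_2 \cdot 2^{l-n}}{2^l - q_2} \tag{A.2} \\
&\le \frac{(k+1)\, q_1 q_2}{2^l - q} + \frac{(k+1)\, q_1 q_2}{2^l - q} + \frac{q_2 \cdot 2^{l-n}}{2^l - q} \tag{A.3} \\
&\le \frac{2 \cdot 2 (k+1)\, q_1 q_2}{2^l} + \frac{2 q_2 \cdot 2^{l-n}}{2^l} \tag{A.4} \\
&\le \frac{2 \cdot (k+1)\, q^2}{2 \cdot 2^l} + \frac{q}{2^{n-1}} \tag{A.5} \\
&\le \frac{(k+1)\, q^2}{2^l} + \frac{q}{2^{n-1}} \tag{A.6}
\end{align}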

We start from the bounds (A.2) obtained in the proof of the second preimage resistance of
Grøstl. Since $q = q_1 + q_2$, we can replace $q_1$ and $q_2$ with $q$ in the denominators, so inequality
(A.3) holds. Since for $q < 2^{l-1}$ we have $\frac{1}{2^l - q} \le \frac{2}{2^l}$, we obtain (A.4). Furthermore, we wish
to determine the maximum value of $2 q_1 q_2$. We set $x = q_2$, $q_1 = q - x$
and define the function $f_q(x) = 2(q - x)x = 2qx - 2x^2$. To find the maximum of this function,
we set its first derivative $f'_q(x) = 2q - 4x$ to zero, i.e. $2q - 4x = 0$. We have that
$x = q/2 \Rightarrow f_q^{\max} = f_q(q/2) = 2(q - q/2)\,q/2 = q^2/2$. Using this result together with $q_2 \le q$, we obtain
(A.5). Finally, we obtain the bound on the second preimage resistance of Grøstl (A.6).
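The maximisation step used to pass from (A.4) to (A.5) can also be confirmed by exhaustive search over integer splits of the query budget. The helper name `max_two_q1_q2` is an assumption made for this small sketch.

```python
def max_two_q1_q2(q):
    """Largest value of 2 * q1 * q2 over non-negative integers with q1 + q2 = q."""
    return max(2 * q1 * (q - q1) for q1 in range(q + 1))


if __name__ == "__main__":
    for q in (1, 2, 7, 100):
        # The derivation predicts max 2*q1*q2 <= q^2 / 2, with equality for even q.
        print(q, max_two_q1_q2(q), q * q / 2)
```

For odd $q$ the integer maximum is slightly below $q^2/2$, so the bound used in (A.5) holds in either case.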
