
Supervisor: Tom Høholdt

Technical University of Denmark

Department of Mathematics

January, 1996

Correlation Attacks on Block Ciphers

Thomas Jakobsen

Master’s Thesis

✵

Abstract

This report presents a new statistical attack on iterative block ciphers called the correlation attack, which is a natural generalization of linear cryptanalysis. The attack is based on finding complex-valued functions on the input and the output of a cipher which have a high correlation. Their mutual relation is then exploited to yield information about the final round key.

Introducing the notions of imbalance, I/O product, and correlation matrix, it is shown how to measure a cipher's security against the attack, and the mini-cipher IDEA(8) is found to be provably secure (assuming independence of subkeys).

Links to other kinds of statistical attacks are explored. In particular, it is shown that the correlation matrix of a cipher and the matrix of differential transition probabilities used with differential cryptanalysis are connected by the 2-dimensional Fourier transform. This implies that correlation cryptanalysis and differential cryptanalysis are essentially of the same strength.

Key words: Correlation, Boolean complexity, linear cryptanalysis, partitioning cryptanalysis, differential cryptanalysis, statistical attack, block cipher, IDEA, SAFER.

Resumé (Danish summary)

This report deals with a new statistical attack on iterated block ciphers called a correlation attack. The attack, which is a natural generalization of linear cryptanalysis, builds on the existence of highly correlated complex functions operating on plaintext and ciphertext. Their mutual relation is exploited to obtain information about the key of the last round.

The notions of imbalance, I/O product, and correlation matrix are introduced, and it is shown how a cipher's security against the attack can be measured. It is demonstrated that the mini-cipher IDEA(8) is provably secure (assuming that subkeys are independent).

Connections to other forms of statistical attacks are investigated. In particular, it is shown that a cipher's correlation matrix and the matrix of differential transition probabilities used in differential cryptanalysis are connected via the 2-dimensional Fourier transform. This means that correlation cryptanalysis and differential cryptanalysis essentially have the same strength.

Key words: Correlation, Boolean complexity, linear cryptanalysis, partitioning cryptanalysis, differential cryptanalysis, statistical attack, block cipher, IDEA, SAFER.

Here Legrand, having re-heated the parchment, submitted it to my inspection. The following characters were rudely traced, in a red tint, between the death's-head and the goat:

53++!305))6*;4826)4+.)4+);806*;48!8`60))85;]8*:+*8!83(88)5*!;46(;88*96*?;8)*+(;485);5*!2:*+(;4956*2(5*-4)8`8*; 4069285);)6!8)4++;1(+9;48081;8:8+1;48!85;4)485!528806*81(+9;48;(88;4(+?34;48)4+;161;:188;+?;

"But," said I, returning him the slip, "I am as much in the dark as ever. Were all the jewels of Golconda awaiting me on my solution of this enigma, I am quite sure that I should be unable to earn them."

"And yet," said Legrand, "the solution is by no means so difficult as you might be led to imagine from the first hasty inspection of the characters. These characters, as any one might readily guess, form a cipher - that is to say, they convey a meaning; but then, from what is known of Kidd, I could not suppose him capable of constructing any of the more abstruse cryptographs. I made up my mind, at once, that this was of a simple species - such, however, as would appear, to the crude intellect of the sailor, absolutely insoluble without the key."

"And you really solved it?"

"Readily; I have solved others of an abstruseness ten thousand times greater. Circumstances, and a certain bias of mind, have led me to take interest in such riddles, and it may well be doubted whether human ingenuity can construct an enigma of the kind which human ingenuity may not, by proper application, resolve. In fact, having once established connected and legible characters, I scarcely gave a thought to the mere difficulty of developing their import."

Sir Edgar Allan Poe
The Gold Bug, 1843.

Preface

Block ciphers are often used to secure communication since they are fast and easy to implement in both software and hardware. Unlike public key systems, the theory of block ciphers has not been very firmly based in a mathematical sense, at least not until recently. Then came the differential attack of Biham and Shamir [4], and later the linear attack of Matsui [30]. These provided a basis for work dealing with provable security against statistical attacks.

The new attack in this report, which is called correlation cryptanalysis, represents a natural generalization of linear cryptanalysis since it is dual to differential cryptanalysis in a "time/frequency" sense. I hope it will contribute to a better understanding of the nature of both attacks and to a more thorough knowledge of how to construct good block ciphers. To my knowledge, most of the results presented in this work are original.

This report expands results found while I spent a semester at the Signal and Information Processing Laboratory at the Swiss Federal Institute of Technology, Zurich. The work presented here has come into existence at the Department of Mathematics at the Technical University of Denmark. It has been very nice working here, and I look eagerly forward to doing my Ph.D. at this very same place.

It is a pleasure to thank my supervisor Tom Høholdt, Carlo Harpes, James Massey, and Kim Lüders-Jensen for useful comments and discussions. It has been a pleasure to work on the subject of cryptanalysis, which I find extremely interesting since it brings together such a wide variety of mathematical topics.

Thanks also to Karin and my family for support and for putting up with the many hours that I spent over a piece of paper or in front of the screen.

Lyngby, January 31, 1996
Department of Mathematics
Technical University of Denmark
Thomas Jakobsen

Contents

1 Introduction
2 Preliminaries
  2.1 Cipher Model
  2.2 Miscellaneous
  2.3 The Fourier Transform
  2.4 Elements from Linear Algebra
3 The Statistical Attack
4 The Correlation Attack
  4.1 I/O Products and Correlation Matrices
  4.2 The Distribution of Imbalance Estimates
    4.2.1 Empirical Evidence for the Distribution Function
5 Attack Algorithms
  5.1 The Simple Attack
  5.2 The Advanced Attack
  5.3 Simplifying the Advanced Attack
  5.4 Success Probability
    5.4.1 The Simple Attack
    5.4.2 The Advanced Attack
6 Bounds for Multiple Rounds
  6.1 The Simple Attack
  6.2 The Advanced Attack
  6.3 Bounds Related to Composite Matrices
  6.4 Schur Stochastic Decomposition
  6.5 Construction of Secure Round Functions
7 Links to Other Statistical Attacks
  7.1 Differential Cryptanalysis
  7.2 Generalized Linear Cryptanalysis
  7.3 Partitioning Cryptanalysis
  7.4 Relations Between the Attacks
8 A Couple of Ideas
  8.1 Approximation of Boolean Functions
    8.1.1 Repeated Substitution, Expansion, and Truncation
    8.1.2 An Approach Using Resultants
    8.1.3 An Approach Using Buchberger's Algorithm
  8.2 An Authentication Scheme using Gröbner Bases
9 Conclusion
  9.1 Suggestions for Further Work
Bibliography
A Symbols
B Abbreviations

Chapter 1
Introduction

Since the adoption of DES [36] as a standard in 1977, people have tried zealously to break the cipher, but apparently without any great success. Biham and Shamir [5] mention some of the attacks on DES that have been attempted over the years. Among these are attacks utilizing the complementation property of DES, exhaustive search and related time/memory tradeoffs, the method of formal coding, and the meet-in-the-middle attack. None of these attacks has succeeded in bringing the complexity down to less than half that of exhaustive key search.

Two approaches, however, have proven to be more successful than the rest, and in fact many a block cipher has been brought to its knees by one of these attacks. The attacks are the differential cryptanalysis (DC) of Biham and Shamir [4], introduced in an attempt to cryptanalyze DES, and the linear cryptanalysis (LC) of Matsui, which was also used to attack DES.

Harpes, Kramer, and Massey generalized LC in [12] by introducing the notion of binary I/O sums, and the new attack (GLC) was shown to be in some cases more successful than ordinary LC. A similar attack, m-ary GLC, which went beyond the binary case, was explored in [17]. Harpes made further generalizations in [13] when he introduced the notion of partitioning cryptanalysis (PC) and showed that this is a still more powerful attack.

In this work, we develop a new attack, namely correlation cryptanalysis (CC), which can be seen as the natural generalization of LC since it has several nice properties which LC, PC, and most notably GLC lack. These include generalizations of Matsui's piling-up lemma and better applicability to ciphers which are not of the xor-kind.
Furthermore, the attack is the "time/frequency" dual of DC, so to speak. The attack is also closely related to the correlation attack on stream ciphers, which was first pointed out by Blaser and Heinzman [6] and fully developed by Siegenthaler [45].

CC works by exploiting highly correlated functions acting on the cipher input and output, respectively. This correlation is then used to yield information about the final round key. When looking at CC, there are three main questions which naturally come to mind:

• How does one determine a cipher's security against CC?
• Given a cipher, what is the best possible attack via CC?
• How is it possible to construct ciphers which are secure against CC?

Each one of these questions will be addressed before we reach the final page. More precisely, the report is organized as follows.

Chapter 2 introduces definitions, models, and notions used in the rest of the text. A number of basic theorems, some of which the reader is probably already familiar with, are also presented to provide reference. The Fourier transform over Abelian groups is also introduced since it plays an important role in the theory of CC.

The notion of a statistical attack is defined in Chapter 3. The statistical attack is a common framework into which it is possible to put LC, GLC, PC, DC, and CC. The cryptanalyses are all known-plaintext attacks on block ciphers, and common to them all is the statistical analysis of plaintext/ciphertext pairs (P/C-pairs), which in the successful case leads to knowledge of the key of the last (or first) round.

The basic notions of the correlation attack are explained in Chapter 4. Among these are the so-called correlation matrix and the imbalance operator, which are both central tools. It is shown that the correlation of function pairs over a cipher is closely linked with the correlation matrix. It is also shown how to obtain the correlation matrix of a whole cipher given the correlation matrices of the individual rounds.

In Chapter 5, the mechanics of the correlation attack are developed and three different attack algorithms of varying complexity and strength are presented. The simple attack considers the correlation of only one function pair to yield information about the key. Inspired by the approach with "multiple approximations" of [20], the advanced attack considers several pairs of functions. The advanced attack has the drawback of being difficult to analyze, and we therefore also present a simplification of this attack.
Based on analysis of the simpler attack, the chapter also presents a result which states how many P/C-pairs are required to carry out a successful attack. Finally, based on Weil's bound [28], it is shown how the Boolean complexity of an xor-cipher relates to its correlation immunity.

On the basis of the multiplicativity of correlation matrices, we derive in Chapter 6 lower and upper multiround bounds for the number of P/C-pairs required to carry out a successful analysis of a given cipher. The Frobenius norm of the cipher's correlation matrix plays an important role in this context. The results answer one of the questions which have arisen through the study of LC, namely how to establish by proof whether a given cipher is secure or not.

The relationships between CC and other types of statistical attacks are examined in Chapter 7. It is shown that DC and CC are dual notions in the sense that, informally speaking, they represent the same attack in, respectively, the time domain and the frequency domain. More precisely, the matrix of differential transition probabilities used with DC is the 2-dimensional Fourier transform of the correlation matrix (and vice versa). As a consequence, the strength of a full correlation attack exploiting every possible linear connection between input and output depends on exactly the same measure of weakness as the strength of a full differential attack

exploiting every possible differential characteristic. Links to m-ary GLC and PC are also examined, and it is shown that these two approaches are not more successful than CC.

Chapter 8 contains a collection of ideas for new attacks which perhaps would be worth some investigation. Among these is an approach using Buchberger's algorithm to obtain probabilistic information about the key. An idea for an authentication scheme using Gröbner bases is also presented.

Finally, in Chapter 9 we summarize our results with a conclusion and some suggestions for further work.


Chapter 2
Preliminaries

In this chapter we present the notation and the definitions used throughout the report. We also present some basic theorems regarding the Fourier transform and elements from linear algebra to provide reference.

2.1 Cipher Model

We consider an iterative block cipher with a round function consisting of a keyed Abelian group operation $+$ at the entry followed by a keyed permutation (see Figure 2.1). Letting $(G,+)$ denote the employed Abelian group of order $n$ with neutral element $0$, we denote by $-a$ the additive inverse of $a$, and we write $a - b$ instead of $a + (-b)$.

Figure 2.1: The considered round function. (One round: the input $X$ is combined with the subkey $K_R$ by the Abelian group operation, and the result is passed through the keyed permutation $\varphi_{K_L}$ to give the output $Y$.)

More formally, we are looking at an iterative block cipher of $r$ rounds with $n$ possible input values and a round function given by $R_k(x) = \varphi_{k_L}(x + k_R)$, where $x, k_R \in G$ and $\varphi_{k_L} : G \to G$ is a permutation indexed by a key $k_L$ (by $k_L$ and $k_R$ we denote the "left hand" and the "right hand" part of the key $k$).
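The round function $R_k(x) = \varphi_{k_L}(x + k_R)$ can be made concrete with a small sketch. The following is only an illustration under stated assumptions: the group is taken to be the cyclic group $\mathbb{Z}_{16}$, and the keyed permutation is a hypothetical stand-in (a shuffle of the group elements seeded by $k_L$), not a construction taken from this report.

```python
import random

def make_round(n, k_left, k_right):
    """Return R_k(x) = phi_{k_L}((x + k_R) mod n) over the cyclic group Z_n."""
    # Hypothetical keyed permutation phi_{k_L}: a shuffle of Z_n seeded by k_left.
    table = list(range(n))
    random.Random(k_left).shuffle(table)

    def round_fn(x):
        # Keyed group operation at the entry, then the keyed permutation.
        return table[(x + k_right) % n]

    return round_fn

def encrypt(x, round_keys, n):
    """Apply one round per (k_left, k_right) subkey pair."""
    for k_left, k_right in round_keys:
        x = make_round(n, k_left, k_right)(x)
    return x

n = 16
round_keys = [(11, 3), (7, 9), (2, 14)]  # r = 3 rounds, illustrative subkeys
ciphertexts = [encrypt(x, round_keys, n) for x in range(n)]
```

Since every round is a bijection on $G$, the list `ciphertexts` is a permutation of the group elements.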

As in [12], the capital letters $K$, $X$, and $Y$ denote random variables describing respectively the key, the input, and the output of the cipher. The corresponding lower-case letters denote instances of these random variables. With superscripts on letters, e.g. $Y^{(j)}$ and $k^{(j)}$, we indicate that expressions belong to a certain round ($j$ in this case); thus $Y^{(j)}$ is the output from the $j$-th round (and the input to round $j+1$). We will assume that all subkeys are independent and uniformly distributed. Although subkeys are often generated by some key schedule, this assumption can usually be made without loss of generality.

The reduced cipher corresponding to an $r$-round cipher is defined to be the cipher consisting of the first $(r-1)$ rounds of the original cipher. The expanded cipher is defined to be the cipher consisting of the original cipher followed by the inverse of the original last round (with a new subkey $\tilde{K}$). Figure 2.2 shows the original cipher and the corresponding expanded and reduced ciphers.

Figure 2.2: A cipher and the corresponding expanded and reduced cipher.

2.2 Miscellaneous

Let the decomposition of $(G,+)$ into additive, cyclic groups be given by $G = \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_g}$ (where $n = n_1 \cdot n_2 \cdots n_g$ and $g$ is the number of subgroups). By subscripts on elements from $G$ we denote the corresponding elements of the cyclic subgroups, e.g., $x = (x_1, x_2, \ldots, x_g)$, where $x \in G$ and $x_j \in \mathbb{Z}_{n_j}$.

By $i$ we denote the imaginary unit and by $\overline{c}$ the complex conjugate of the complex number $c$. Furthermore, by $\chi_w : G \to \mathbb{C}$ we denote the function given by $\chi_w(x) = \omega^{\langle w,x \rangle}$, where $\omega = e^{2\pi i/n}$ denotes an $n$-th primitive root of unity in $\mathbb{C}$, and $\langle w,x \rangle = \sum_{j=1}^{g} w_j x_j T_j \bmod n$, for $w, x \in G$ and $T_j = n/n_j$.

By $\hat{G}$ we denote the elements of the character group corresponding to $(G,+)$. For the purpose of this report, simply let the elements of the character group be the set of functions given by $\hat{G} = \{\chi_w : w \in G\}$. Notice that $\chi_a(x) \cdot \chi_a(y) = \chi_a(x+y)$, $\chi_a(-x) = \overline{\chi_a(x)} = \chi_{-a}(x)$, and $\chi_a(x) = \chi_x(a)$ for all $\chi_a \in \hat{G}$ and $a, x, y \in G$. In addition, the values of a character always lie on the unit circle in $\mathbb{C}$.

2.3 The Fourier Transform

Definition 2.3.1 Fourier transform. Let $G$ be the elements of a finite Abelian group of order $n$ with character group $\hat{G} = \{\chi_a : a \in G\}$. Given a complex-valued function $\psi : G \to \mathbb{C}$, the Fourier transform of $\psi$ is the function $\mathcal{F}\{\psi\} : G \to \mathbb{C}$ defined by

$$\mathcal{F}\{\psi\}(a) = \frac{1}{n} \sum_{x \in G} \psi(x) \cdot \overline{\chi_a(x)}$$

for all $a \in G$.

For a cyclic group, the above definition coincides with the usual definition of the discrete Fourier transform, and for $G = \mathbb{Z}_2^w$ (the "xor-group") it is simply the Walsh-Hadamard transform. We will often use the well-known Parseval's identity, so we include it here (without proof).

Theorem 2.3.2 Parseval's identity. For any function $\psi : G \to \mathbb{C}$,

$$\frac{1}{n} \sum_{x \in G} |\psi(x)|^2 = \sum_{w \in G} |\mathcal{F}\{\psi\}(w)|^2 .$$

Another useful property of the Fourier transform is the following.

Theorem 2.3.3 Inverse Fourier Transform. Any function $\psi : G \to \mathbb{C}$ can be expressed in a unique way as a weighted sum of elements from the character group $\hat{G}$ by using the Fourier transform, as follows:

$$\psi(x) = \sum_{w \in G} \mathcal{F}\{\psi\}(w) \cdot \chi_w(x).$$

For $G$ cyclic this is just another way of stating the well-known property that a periodic function can be thought of as a sum of (co)sines.
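As a numerical illustration of Definition 2.3.1 and Theorems 2.3.2 and 2.3.3, the following sketch computes the Fourier transform over a small product of cyclic groups and checks Parseval's identity and the inversion formula. The group $\mathbb{Z}_2 \times \mathbb{Z}_3$ and the test function `psi` are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import cmath
from itertools import product

def group_elements(orders):
    """All elements of Z_{n_1} x ... x Z_{n_g}, as tuples."""
    return list(product(*[range(m) for m in orders]))

def character(w, x, orders):
    """chi_w(x) = omega^<w,x> with omega = exp(2*pi*i/n), T_j = n / n_j."""
    n = 1
    for m in orders:
        n *= m
    exponent = sum(wj * xj * (n // m) for wj, xj, m in zip(w, x, orders)) % n
    return cmath.exp(2j * cmath.pi * exponent / n)

def fourier(psi, orders):
    """F{psi}(a) = (1/n) * sum_x psi(x) * conj(chi_a(x))."""
    elems = group_elements(orders)
    n = len(elems)
    return {a: sum(psi[x] * character(a, x, orders).conjugate() for x in elems) / n
            for a in elems}

def inverse_fourier(psi_hat, orders):
    """psi(x) = sum_w F{psi}(w) * chi_w(x)."""
    elems = group_elements(orders)
    return {x: sum(psi_hat[w] * character(w, x, orders) for w in elems)
            for x in elems}

orders = (2, 3)                      # G = Z_2 x Z_3, so n = 6
elems = group_elements(orders)
psi = {x: complex(i * i - 4, i) for i, x in enumerate(elems)}  # arbitrary test function

psi_hat = fourier(psi, orders)
parseval_lhs = sum(abs(v) ** 2 for v in psi.values()) / len(elems)
parseval_rhs = sum(abs(v) ** 2 for v in psi_hat.values())
recovered = inverse_fourier(psi_hat, orders)
```

Up to floating-point rounding, `parseval_lhs` and `parseval_rhs` agree, and `recovered` reproduces `psi`.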

By $\delta_{ab}$ or $\delta_a(b)$ we denote Kronecker's delta function, i.e.,

$$\delta_{ab} = \delta_a(b) = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{otherwise.} \end{cases}$$

Proposition 2.3.4 The Fourier transform of a character is a delta function. More precisely,

$$\mathcal{F}\{\chi_a\}(x) = \delta_a(x).$$

2.4 Elements from Linear Algebra

By $M_{rs}$ we denote the set of all complex $r \times s$ matrices, and by $M_r$ the set of all complex $r \times r$ (square) matrices. We use column vectors, and by $M^T$ we denote the transposed version of the matrix $M$ and by $M^* = \overline{M}^T$ the Hermitian adjoint.

To provide reference, we now briefly explain the notions of eigenvalues and singular values and some of their properties. The propositions are given without proof (for these see a textbook on matrix analysis, e.g., [15] or [16]).

Definition 2.4.1 An eigenvalue $\lambda$ of the square matrix $M \in M_m$ is a complex number that fulfils the equation

$$Mv = \lambda v$$

for some nonzero vector $v$ called an eigenvector.

An $m \times m$ matrix $M$ has $m$ (algebraic) eigenvalues including multiplicities. In decreasing order of magnitude, these will be denoted by $\lambda_1(M), \lambda_2(M), \ldots, \lambda_m(M)$ (i.e., $|\lambda_1(M)| \ge |\lambda_2(M)| \ge \ldots \ge |\lambda_m(M)|$).

Proposition 2.4.2 Every symmetric matrix $M \in M_m$ can be factored into three matrices

$$M = V^{-1} \Lambda V$$

where $\Lambda$ is a diagonal matrix consisting of the eigenvalues of $M$ and the rows of $V$ are the corresponding eigenvectors.

Two matrices $A$ and $B$ are said to be similar if $A = M^{-1} B M$ for some matrix $M$. Similar matrices have the same eigenvalues.

Definition 2.4.3 The singular values of the square matrix $M \in M_m$ are defined to be the square roots of the eigenvalues of $M^* M$.

Singular values are always real numbers since the eigenvalues of $M^* M$ are nonnegative. The $m$ singular values (including multiplicities) of an $m \times m$ matrix $M$ will be denoted by $\sigma_1(M), \sigma_2(M), \ldots, \sigma_m(M)$ such that $\sigma_1(M) \ge \sigma_2(M) \ge \ldots \ge \sigma_m(M)$. When the elements of $M$ are real numbers, $M^* = M^T$, and in that case the singular values of $M$ are of course the square roots of the eigenvalues of $M^T M$.

At some point, we will need to measure the "magnitude" of a matrix. Here the following definitions are useful.

Definition 2.4.4 Let $M = [m_{ab}] \in M_n$ be a square matrix. Define the following matrix operators:

$l_1$ norm: $\|M\|_1 = \sum_{a,b} |m_{ab}|$
Frobenius norm: $\|M\|_2 = \left( \sum_{a,b} |m_{ab}|^2 \right)^{1/2}$
$l_\infty$ norm: $\|M\|_\infty = \max_{a,b} |m_{ab}|$
Maximum column sum norm: $|||M|||_1 = \max_b \sum_a |m_{ab}|$
Spectral norm: $|||M|||_2 = \sigma_1(M)$
Maximum row sum norm: $|||M|||_\infty = \max_a \sum_b |m_{ab}|$
Spectral radius: $\rho(M) = |\lambda_1(M)|$

Every norm is bounded by every other norm when suitably scaled. The following proposition presents the tightest bounds possible for all of the norms introduced above.

Proposition 2.4.5 Let $M \in M_r$ be a square matrix. Then $\|M\|_1$, $\|M\|_2$, $r\|M\|_\infty$, $|||M|||_1$, $|||M|||_2$, and $|||M|||_\infty$ are all matrix norms on $M$, i.e.,

$$|||M^n||| \le |||M|||^n \quad \text{for all } n \in \mathbb{N},$$

where $|||\cdot|||$ is one of the mentioned operators. Furthermore, the following table gives the best constants $k$ such that $|||M|||_\alpha \le k \, |||M|||_\beta$ for all $M \in M_r$.

$$\begin{array}{c|cccccc}
\alpha \backslash \beta & |||\cdot|||_1 & |||\cdot|||_2 & |||\cdot|||_\infty & \|\cdot\|_1 & \|\cdot\|_2 & r\|\cdot\|_\infty \\
\hline
|||\cdot|||_1 & 1 & \sqrt{r} & r & 1 & \sqrt{r} & 1 \\
|||\cdot|||_2 & \sqrt{r} & 1 & \sqrt{r} & 1 & 1 & 1 \\
|||\cdot|||_\infty & r & \sqrt{r} & 1 & 1 & \sqrt{r} & 1 \\
\|\cdot\|_1 & r & r^{3/2} & r & 1 & r & r \\
\|\cdot\|_2 & \sqrt{r} & \sqrt{r} & \sqrt{r} & 1 & 1 & 1 \\
r\|\cdot\|_\infty & r & r & r & r & r & 1
\end{array}$$

The spectral radius $\rho(\cdot)$ (which is not a matrix norm) is bounded from above by every matrix norm.

The table is from [43].

Proposition 2.4.6 Let $C$ be a doubly stochastic matrix. Then the spectral radius $\rho(C)$ equals 1. Moreover, $e = (1, 1, \ldots, 1)$ is an eigenvector with eigenvalue 1.

It is easily checked that $Ce = e$ since $C$ is doubly stochastic.

Definition 2.4.7 The Hadamard product or Schur product $M \circ N$ of the two matrices $M = [m_{ab}]$ and $N = [n_{ab}]$ with the same dimensions is defined to be the matrix which is the elementwise product of $M$ and $N$. More formally,

$$M \circ N = [m_{ab} \cdot n_{ab}].$$

For more details on the properties of the Hadamard product, see [16].
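A few of the entrywise and row/column norms of Definition 2.4.4 can be computed directly, and some constants from the table of Proposition 2.4.5 spot-checked on an example. The sketch below is only illustrative: the matrix `M` is arbitrary, the function names are hypothetical, and the spectral norm and spectral radius are left out because they would require an eigenvalue routine.

```python
import math

def l1_norm(M):
    """Sum of the absolute values of all entries."""
    return sum(abs(m) for row in M for m in row)

def frobenius_norm(M):
    """Square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(m) ** 2 for row in M for m in row))

def linf_norm(M):
    """Largest absolute value of an entry."""
    return max(abs(m) for row in M for m in row)

def max_column_sum(M):
    """max over columns b of sum_a |m_ab|."""
    return max(sum(abs(row[b]) for row in M) for b in range(len(M[0])))

def max_row_sum(M):
    """max over rows a of sum_b |m_ab|."""
    return max(sum(abs(m) for m in row) for row in M)

M = [[1, -2, 0],
     [3, 1j, -1],
     [0, 2, 2]]          # arbitrary complex 3 x 3 example
r = len(M)

# Spot checks of constants from the table (row alpha, column beta):
checks = [
    max_column_sum(M) <= l1_norm(M),             # constant 1
    frobenius_norm(M) <= l1_norm(M),             # constant 1
    l1_norm(M) <= r * frobenius_norm(M),         # constant r
    max_row_sum(M) <= r * max_column_sum(M),     # constant r
    r * linf_norm(M) <= r * max_column_sum(M),   # constant r
]
```

A single example cannot show that the constants are the best possible; it only confirms that the inequalities are not violated for this matrix.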

Proposition 2.4.8 The Hadamard product is submultiplicative with respect to the spectral norm in the following sense. For any $A, B \in M_m$ we have

$$\sum_{j=1}^{k} \sigma_j(A \circ B) \le \sum_{j=1}^{k} \sigma_j(A) \cdot \sigma_j(B)$$

for $k = 1, 2, \ldots, m$. In particular,

$$|||A \circ B|||_2 \le |||A|||_2 \cdot |||B|||_2 .$$

Definition 2.4.9 A square matrix $C$ is said to be Schur stochastic, unitary stochastic, or orthostochastic if it has the form

$$C = U \circ \overline{U}$$

for some unitary matrix $U$.

We mention without proof that every orthostochastic matrix is doubly stochastic.

Definition 2.4.10 The Kronecker product of $V = [v_{ab}] \in M_{rs}$ and $W = [w_{ab}] \in M_{tu}$ is denoted by $V \otimes W$ and is defined to be the block matrix

$$V \otimes W = \begin{pmatrix}
v_{1,1} W & v_{1,2} W & \cdots & v_{1,s} W \\
v_{2,1} W & v_{2,2} W & \cdots & v_{2,s} W \\
\vdots & \vdots & \ddots & \vdots \\
v_{r,1} W & v_{r,2} W & \cdots & v_{r,s} W
\end{pmatrix}.$$

Proposition 2.4.11 Some basic properties of the Kronecker product are

$$(A \otimes B)^T = A^T \otimes B^T,$$
$$(A \otimes B)^* = A^* \otimes B^*,$$

and

$$(A \otimes B)(C \otimes D) = (AC) \otimes (BD).$$
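The Hadamard and Kronecker products are easy to experiment with on small matrices. The following sketch, using plain nested lists and hypothetical helper names, checks the transpose and mixed-product properties of Proposition 2.4.11 on arbitrary integer matrices, and builds a small orthostochastic matrix from a real rotation matrix (for a real $U$, $\overline{U} = U$).

```python
def matmul(A, B):
    """Ordinary matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def hadamard(A, B):
    """Elementwise (Schur) product."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def kronecker(V, W):
    """Block matrix whose (a, b) block is v_ab * W."""
    return [[v * w for v in v_row for w in w_row]
            for v_row in V for w_row in W]

# Arbitrary integer matrices, so the comparisons below are exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
D = [[1, 1], [0, 2]]

mixed_lhs = matmul(kronecker(A, B), kronecker(C, D))
mixed_rhs = kronecker(matmul(A, C), matmul(B, D))

# A real unitary (rotation) matrix; U o conj(U) = U o U is orthostochastic.
U = [[0.6, -0.8], [0.8, 0.6]]
stochastic = hadamard(U, U)
row_sums = [sum(row) for row in stochastic]
col_sums = [sum(col) for col in zip(*stochastic)]
```

Here `row_sums` and `col_sums` are all equal to 1 up to rounding, in line with the remark that every orthostochastic matrix is doubly stochastic.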

Chapter 3
The Statistical Attack

In [35], Murphy, Piper, Walker and Wild present a general setting which is useful for characterizing a number of cryptanalytical attacks based on statistical testing. Their work applies to both block ciphers and stream ciphers. In this chapter and in Chapter 7, we put LC, GLC, PC, DC, and CC into this setting. However, the chapter can be read independently of [35] since we develop our own notation and we do not follow the conventions of Murphy et al.

A common element to LC, GLC, PC, DC, and CC is the statistical analysis of P/C-pairs, which in the successful case leads to determination of the final round key. More specifically, all five attacks depend on the existence of a random variable $S$ which depends on the cipher input $X$ and the reduced cipher output $Y^{(r-1)}$ in such a way that information is leaked. For this to be the case, $S$ must be uniformly distributed when $X$ and $Y^{(r-1)}$ are randomly chosen and independent of each other, and as non-uniformly distributed as possible (with respect to some measure) when $X$ and $Y^{(r-1)}$ describe an actual P/C-pair of the reduced cipher.

In the following, let $V$ be some set and let $\mathcal{T}$ be a family of balanced functions of the form $s : G^2 \to V$ (i.e., $s(X,Y)$ is uniformly distributed over $V$ when $(X,Y)$ is uniformly distributed over $G^2$). Furthermore, let a likelihood evaluator $L$ be a certain measure of non-uniformity; more precisely, let $L$ be a function which maps a random variable $S$ with values in $V$ into the nonnegative real numbers, such that $L(S) = 0$ if $S$ is uniformly distributed over $V$.

Define by an attack descriptor the pair $(\mathcal{T}, L)$, where $\mathcal{T}$ is a family of functions onto $V$ as described above and $L$ is a likelihood evaluator used to measure the non-uniformity of $V$-valued random variables. We will use the above model of a statistical attack here and in the following chapters to describe all of the above mentioned attacks.

Example 3.0.1 Linear cryptanalysis as introduced by Matsui [30, 31] has an attack descriptor given by

$$\mathcal{T} = \{ s \mid s : \mathbb{Z}_2^w \times \mathbb{Z}_2^w \to \mathbb{Z}_2,\ a, b \in \mathbb{Z}_2^w \setminus \{0\},\ s(x,y) = a \cdot x \oplus b \cdot y \text{ for all } x, y \}$$

and

$$L(S) = \left| P[S = 0] - \tfrac{1}{2} \right| \quad \text{for all } S,$$

where $\oplus$ denotes addition modulo 2, $w$ is the bit-width of the cipher, and $\cdot$ denotes the inner product over $\mathbb{Z}_2^w$, i.e., $\mathcal{T}$ is the set of all binary linear relations between input and output, and the value of the likelihood evaluator $L$ is defined to be the bias of its argument.

The following is an explanation of the basic attack with descriptor $(\mathcal{T}, L)$ as it applies to our cipher model (for actual ciphers like DES there might be more efficient approaches). As usual, $X$ and $Y$ denote the cipher input and output, respectively. First, we guess a last-round key $\tilde{k}$. Then with this guess we obtain the one-round decryption of the ciphertext, $\tilde{Y} = (R^{(r)}_{\tilde{k}})^{-1}(Y)$. There are now two possibilities:

• We have guessed the correct last-round key.
• We have guessed a wrong last-round key.

For each of these possibilities there is a corresponding distribution of $\tilde{Y}$. If we have correctly guessed the key (case 1), then $\tilde{Y}$ equals $Y^{(r-1)}$ (see Figure 3.1) and therefore the two pairs $(X, \tilde{Y})$ and $(X, Y^{(r-1)})$ are identically distributed (i.e., $\tilde{Y}$ is the output from the reduced cipher). If we have guessed a wrong key (case 2), then $\tilde{Y}$ is generally not equal to $Y^{(r-1)}$. Instead, $(X, \tilde{Y}) = (X, Y^{(r+1)})$ follows the distribution of a P/C-pair from an expanded cipher.

Figure 3.1: Cipher output given a correct and a wrong last-round key guess.

To distinguish between cases 1 and 2, we do a statistical test of the hypothesis

H1: $(X, \tilde{Y})$ is distributed like $(X, Y^{(r-1)})$ (success)

against the alternative

H2: $(X, \tilde{Y})$ is distributed like $(X, Y^{(r+1)})$ (failure).

If the test points at H1 being correct, we declare our key guess to be the correct last-round key. If the test points at the alternative H2, we discard our key guess and repeat the whole procedure with another last-round key guess. See Figure 3.2 for a schematic of the attack (the figure is from [14]).

Figure 3.2: Schematic of the statistical attack. (Components shown: plaintext source, first $r-1$ rounds, $r$-th round, $r$-th round inverse under the key guess $\tilde{k}$, evaluation of the guess $L[\tilde{k}]$, and the ML-estimator $\{\tilde{k} : L[\tilde{k}] = \max L[\tilde{k}]\}$.)

To carry out the test in practice, we consider a function $s$ from $\mathcal{T}$ and the corresponding random variables $S = s(X, \tilde{Y})$, $S^{(1..r-1)} = s(X, Y^{(r-1)})$, and $S^{(1..r+1)} = s(X, Y^{(r+1)})$. We then test the simple hypothesis

H1: $L(S) = L(S^{(1..r-1)})$

against the simple alternative

H2: $L(S) = L(S^{(1..r+1)})$.

To do this, we first have to compute an estimate $\tilde{L}(S)$ of $L(S)$ based on a sufficient number of P/C-pairs. There are several ways the test can proceed. Neyman-Pearson's Lemma [27] is useful for distinguishing between two simple hypotheses.

This requires knowledge of the probability density function of $\tilde{L}(S)$ (more about that and the relations to CC later).

Some simplifications are possible if we assume that the random variable $S^{(1..r+1)}$ is uniformly distributed. This is a reasonable assumption since $Y^{(r+1)}$ has passed through two more rounds than $Y^{(r-1)}$ and therefore, loosely speaking, $S^{(1..r+1)} = s(X, Y^{(r+1)})$ is "more uniformly distributed" than $S^{(1..r-1)} = s(X, Y^{(r-1)})$. This phenomenon is formulated in [12] for GLC as the "Hypothesis of Wrong Key Randomization". The simplification results in the new test

H1: $L(S) = L(S^{(1..r-1)})$ against H2: $L(S) \ne L(S^{(1..r-1)})$

or, alternatively,

H1: $L(S) = L(S^{(1..r-1)})$ against H2: $L(S) < L(S^{(1..r-1)})$.

An advantage of this test is that it does not require prior knowledge of $L(S^{(1..r+1)})$, and if $L$ is appropriately chosen, then knowledge of the probability density function of $\tilde{L}$ is not necessary either. As candidate for the last-round key we simply choose the key which has the highest value of $L$. The following summarizes in a more algorithmic form the simplest possible attack ($\mathcal{K}$ denotes the key space of the last round).

    choose a suitable s ∈ T
    for k̃ ∈ K do begin
        Ỹ := (R^(r)_k̃)^(-1)(Y)
        compute the estimate L̃(s(X, Ỹ)) based on the available P/C-pairs
        L[k̃] := L̃(s(X, Ỹ))
    end
    output all keys k̃ that maximize L[k̃]

It is possible to improve upon this attack by considering several functions from $\mathcal{T}$ instead of only one (namely $s$). How to do this optimally for the correlation attack will be revealed later on.

Next, after having recovered the last-round key, we can proceed by yet another statistical attack to recover the key of the round next to last, and so on. Or, if tractable, we can do an exhaustive key search on the remaining part of the key (taking advantage of relations between bits in the subkeys caused by a key schedule, if any).
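The simplest possible attack summarized above can be sketched on a toy cipher. Everything concrete below is an assumption made for illustration: the group is $(\mathbb{Z}_2^4, \oplus)$, the keyed permutation is taken to be $\varphi_{k_L}(u) = \mathrm{SBOX}[u] \oplus k_L$ for an arbitrary fixed 4-bit table `SBOX`, the cipher has two rounds, the masks `a` and `b` are picked arbitrarily, and the estimate $\tilde{L}$ is the squared mean of $(-1)^{a \cdot x \oplus b \cdot \tilde{y}}$. Only the left part $k_L$ of the last-round key is guessed, because undoing the entry addition of $k_R$ merely flips the sign of the mean and therefore leaves the squared value unchanged.

```python
# Arbitrary 4-bit permutation used as a hypothetical building block.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(y) for y in range(16)]

def parity(v):
    """Parity of the bits of v, i.e. the inner product bit after masking."""
    return bin(v).count("1") & 1

def round_fn(x, k_left, k_right):
    """R_k(x) = phi_{k_L}(x + k_R) with phi_{k_L}(u) = SBOX[u] xor k_L."""
    return SBOX[x ^ k_right] ^ k_left

def encrypt(x, round_keys):
    for k_left, k_right in round_keys:
        x = round_fn(x, k_left, k_right)
    return x

def likelihood_estimate(pairs, a, b):
    """Squared mean of (-1)^(a.x xor b.y) over the given pairs."""
    total = sum((-1) ** (parity(a & x) ^ parity(b & y)) for x, y in pairs)
    return (total / len(pairs)) ** 2

def simple_attack(pc_pairs, a, b):
    """Score every guess of the last-round k_L; return scores and maximizers."""
    scores = {}
    for guess in range(16):
        # Undo the keyed permutation of the last round under this guess.
        peeled = [(x, INV_SBOX[y ^ guess]) for x, y in pc_pairs]
        scores[guess] = likelihood_estimate(peeled, a, b)
    best = max(scores.values())
    return scores, [k for k, v in scores.items() if v == best]

round_keys = [(0x3, 0xA), (0x6, 0x5)]   # illustrative subkeys, r = 2
pc_pairs = [(x, encrypt(x, round_keys)) for x in range(16)]
a, b = 0xB, 0x4                         # arbitrary input/output masks
scores, candidates = simple_attack(pc_pairs, a, b)

# Value the correct guess must receive: the one-round value of (a, b).
one_round_value = likelihood_estimate([(x, SBOX[x]) for x in range(16)], a, b)
```

With the correct guess, the peeled output is the one-round output up to a constant xor, so its score equals `one_round_value`. Whether the correct key is also among `candidates` depends on the chosen table, masks, and keys; on such a tiny cipher a wrong guess may well score as high or higher, so this sketch shows the mechanics of the attack rather than a guaranteed key recovery.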

There exist short-cuts that decrease the number of keys that we need to test. For instance, it is possible to speed up an attack if one can get just some information about $Y^{(r-1)}$ by guessing correctly only some of the bits of the last-round key. More formally, we say that the two keys $\tilde{k}_1$ and $\tilde{k}_2$ belong to the same key class if the random variables $s(X, \tilde{Y}_1)$ and $s(X, \tilde{Y}_2)$ with $\tilde{Y}_1 = (R^{(r)}_{\tilde{k}_1})^{-1}(Y)$ and $\tilde{Y}_2 = (R^{(r)}_{\tilde{k}_2})^{-1}(Y)$ are identically distributed for any $s \in \mathcal{T}$. This relation induces a set of equivalence classes on the key space. The test can only distinguish between keys which are not equivalent, i.e., we can only infer to which key class the correct last-round key belongs; in return, we need not test as many keys as before. We will not deal further with the subject of key classes; instead we refer to [14]. The attack can also be extended by simultaneously trying to guess the first-round and the last-round key. This approach will not be considered either.

To prevent short-cuts like the above from undermining our complexity estimates, we will sometimes tacitly assume that the adversary is computationally unbounded. Thus, if he cannot break the cipher by a statistical attack, then nobody can (when even a computationally unbounded adversary can fail a statistical test, the reason lies in the limited number of P/C-pairs).


Chapter 4
The Correlation Attack

In this chapter we introduce the basic theory and notions related to the correlation attack. We then proceed to deduce the probability density function for certain so-called imbalance estimates.

4.1 I/O Products and Correlation Matrices

Basic notions of CC are the I/O product, the imbalance operator, and the correlation matrix. These notions are all useful tools for obtaining knowledge about function pairs that have a high correlation. They will be explained in this section. First, however, we need to define certain pairs of functions called I/O pairs.

Definition 4.1.1 I/O pair. An I/O pair $(f,g)$ over $G$ is a pair of functions, $f : G \to \mathbb{C}$ and $g : G \to \mathbb{C}$, such that $E[f] = E[g] = 0$ and $E[|f|^2] = E[|g|^2] = 1$. The function $f$ is called the input function and the function $g$ is called the output function. By $P_G$ we denote the set of all possible I/O pairs over $G$.

The following is the CC-equivalent of the linear relations used with LC.

Definition 4.1.2 I/O product. Given an I/O pair $(f,g)$ and two random variables $X$ and $Y$, define by the I/O product (input/output product) corresponding to $(f,g)$, $X$, and $Y$ the following expression:

$$S = f(X) \cdot g(Y).$$

We sometimes use $S_{fg}$ to denote the I/O product corresponding to $(f,g)$.

Similar to the notion of characteristics used with DC, we define the following.

Definition 4.1.3 The linear I/O pair with characteristic $(a,b)$, where $a, b \in G \setminus \{0\}$, is defined to be the I/O pair $(\chi_a, \chi_b)$, where $\chi_a$ and $\chi_b$ denote the characters corresponding to $a$ and $b$. The corresponding linear I/O product $L_{ab}(X,Y)$ is defined to be

$$L_{ab}(X,Y) = \chi_a(X) \cdot \overline{\chi_b(Y)}.$$

We omit $X$ and $Y$ and use $L_{ab}$ if it is implicitly given to which random variables we refer.

It is easily seen that $(\chi_a, \chi_b)$ is indeed an I/O pair since $E[\chi_a] = E[\chi_b] = 0$ and $E[|\chi_a|^2] = E[|\chi_b|^2] = 1$. To measure the non-uniformity of the distribution of an I/O product, we introduce the following notion, which is analogous to the bias used by Matsui [30] to measure the non-uniformity of linear approximations.

Definition 4.1.4 Imbalance. Given an I/O product $S$, define by the imbalance of $S$, denoted by $I(S)$, the squared absolute expected value of $S$, i.e.,
\[ I(S) = |E[S]|^2. \]
If the random variable $S$ depends on the random variable $K$, we let $I(S \mid K = k)$ or just $I(S \mid k)$ denote the imbalance of $S$ given that $K = k$. The average-key imbalance $\bar I(S)$ of the I/O product $S$ is defined to be the expected value of the imbalance of $S$ over all possible keys $K$, i.e.,
\[ \bar I(S) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} I(S \mid K = k), \]
where $\mathcal{K}$ denotes the key space. An I/O product with average-key imbalance 1 is said to be guaranteed.

In other words, the imbalance of the I/O product $S = f(X) \cdot g(Y)$ is simply the squared correlation between $f(X)$ and $g(Y)$.

Remark 4.1.5 For xor-type ciphers, that is, ciphers acting on the group $(\mathbb{Z}_2^w, \oplus)$ for some $w$, where $\oplus$ represents componentwise addition modulo 2, the I/O product and the imbalance defined above coincide with the linear relation and 4 times the square of the bias used by Matsui. Thus, results derived for CC can be adapted for use with LC as well.

The imbalance operator will be used as likelihood estimator in the attack. To summarize, the attack descriptor of a correlation attack is given by
\[ T = \{ s \mid s : G^2 \to \mathbb{C},\ s(x, y) = f(x) \cdot g(y) \text{ for all } x, y;\ (f, g) \in P_G \}, \]
\[ L(S) = |E[S]|^2 \text{ for all } S. \]

The following, which is called the hypothesis of fixed key equivalence, was introduced in [12] for use with GLC.

Hypothesis 4.1.6 Fixed key equivalence. For every I/O product $S$ and almost every key $k$,
\[ I(S \mid k) \approx \bar I(S). \]

Roughly, it states that the correlational properties of a cipher are independent of the actual choice of key, meaning that we can use the average-key imbalance of an I/O product as a good approximation of the actual fixed-key imbalance. As indicated by the context, we will sometimes assume that the hypothesis holds.

In [25], Lai, Massey, and Murphy define a matrix of differential transition probabilities for use with DC. Similarly, here we put the imbalances of all linear I/O products into a matrix called the correlation matrix.
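The tabulation of average-key imbalances described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group is taken to be the cyclic group $\mathbb{Z}_n$ with characters $\chi_a(x) = e^{2\pi i a x / n}$, and the names `chi`, `linear_imbalance`, `correlation_matrix`, `SBOX`, and `toy_encrypt` are hypothetical helpers, not part of the thesis.

```python
import cmath


def chi(a, x, n):
    """Character of Z_n (assumed form): chi_a(x) = exp(2*pi*i*a*x/n)."""
    return cmath.exp(2j * cmath.pi * a * x / n)


def linear_imbalance(a, b, encrypt, key, n):
    """Fixed-key imbalance |E[chi_a(X) * conj(chi_b(e_k(X)))]|^2 for X uniform on Z_n."""
    mean = sum(chi(a, x, n) * chi(b, encrypt(key, x), n).conjugate()
               for x in range(n)) / n
    return abs(mean) ** 2


def correlation_matrix(encrypt, keys, n):
    """(n-1) x (n-1) table of average-key imbalances, indexed by a, b in 1..n-1."""
    keys = list(keys)
    return [[sum(linear_imbalance(a, b, encrypt, k, n) for k in keys) / len(keys)
             for b in range(1, n)]
            for a in range(1, n)]


# Hypothetical toy cipher on Z_8: add the key, then apply a fixed S-box.
N_ELEMENTS = 8
SBOX = [0, 1, 3, 6, 7, 4, 5, 2]


def toy_encrypt(key, x):
    return SBOX[(x + key) % N_ELEMENTS]


if __name__ == "__main__":
    C = correlation_matrix(toy_encrypt, range(N_ELEMENTS), N_ELEMENTS)
    for row in C:
        print(" ".join(f"{value:.3f}" for value in row))
```

For a keyed permutation, every row and every column of the resulting table should sum to 1 up to rounding, in line with Proposition 4.1.9. The direct double loop costs on the order of $n^3 \cdot |\mathcal{K}|$ character evaluations, so it is only practical for very small groups.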

Definition 4.1.7 Correlation matrix. Given a keyed function $e_k : G \to G$ with input $X$ and output $Y = e_K(X)$, define by the correlation matrix of that function the $(n-1) \times (n-1)$ matrix $C = [c_{ab}]$ with
\[ c_{ab} = \bar I(L_{ab}(X, Y)) \]
for $a, b \in G \setminus \{0\}$.

Notice that the above definition of a correlation matrix is not identical to the definition which is normally used in probability theory since, e.g., a correlation matrix in the above sense is not always positive semi-definite. A more correct but also more extensive choice of words might be the cross-correlation matrix (in probability theory, the correlation matrix is more like a kind of autocorrelation matrix). The definition, however, complies with the notation of Daemen, Govaerts, and Vandewalle, who also use the words correlation matrix in conjunction with (binary) LC.

Example 4.1.8 The correlation matrix $C = [c_{ab}]$ of the function $e_k(x) = x + k$ with $x, k \in G$ is the identity matrix $I$, since
\[
\begin{aligned}
c_{ab} &= \bar I(\chi_a(X) \cdot \bar\chi_b(X + K)) \\
&= \frac{1}{n} \sum_{k \in G} I(S \mid K = k) \\
&= \frac{1}{n} \sum_{k \in G} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(x + k) \Bigr|^2 \\
&= \frac{1}{n} \sum_{k \in G} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(x) \cdot \bar\chi_b(k) \Bigr|^2 \\
&= \Bigl| \frac{1}{n} \sum_{x \in G} \chi_{a-b}(x) \Bigr|^2 \\
&= \delta_{ab}.
\end{aligned}
\]

The next observation is crucial to most of the following theory.

Proposition 4.1.9 All correlation matrices are doubly stochastic.

Proof Let $C$ be the correlation matrix of the function $e_k : G \to G$ and let $\mathcal{K}$ denote the key space. That $C$ is doubly stochastic follows easily from Parseval's identity. The sum of the elements in the $b$-th row of $C$ is
\[
\sum_{a \in G \setminus \{0\}} c_{ab}
= \sum_{a \in G \setminus \{0\}} \bar I(L_{ab})
= \sum_{a \in G \setminus \{0\}} \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(e_k(x)) \Bigr|^2.
\]

Since $\chi_0 = 1$ and $E[\bar\chi_b] = 0$ we may include $a = 0$ in the sum without altering the result. This makes it possible to apply Parseval's identity and obtain
\[
\frac{1}{|\mathcal{K}|} \sum_{a \in G} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(e_k(x)) \Bigr|^2
= \frac{1}{n \cdot |\mathcal{K}|} \sum_{a \in G} \sum_{k \in \mathcal{K}} \bigl| \bar\chi_b(e_k(a)) \bigr|^2
= 1.
\]
In a similar way the column sums can be found to equal 1 as well. $\Box$

Proposition 4.1.10 It is possible to express the elements of a correlation matrix by using the Fourier transform. More precisely, let $e_k : G \to G$ be a keyed permutation and let $C = [c_{ab}]$ be the corresponding correlation matrix. Then
\[ c_{ab} = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \bigl| \mathcal{F}\{\chi_b \circ e_k\}(a) \bigr|^2. \]
In this way it is possible to speed up computation of correlation matrices by a factor of $n / \log n$ by using the Fast Fourier Transform (FFT). For information about the FFT algorithm and how to implement it see [37].

Proof Follows directly from the definitions of a correlation matrix and the Fourier transform:
\[
\begin{aligned}
c_{ab} &= \bar I(L_{ab}) \\
&= \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(e_k(x)) \Bigr|^2 \\
&= \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_b(e_k(x)) \cdot \bar\chi_a(x) \Bigr|^2 \\
&= \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \bigl| \mathcal{F}\{\chi_b \circ e_k\}(a) \bigr|^2. \qquad \Box
\end{aligned}
\]

As decryption is carried out by applying the inverse encryption function, the following is useful to know.

Proposition 4.1.11 Let $C = [c_{ab}]$ be the correlation matrix corresponding to the permutation $e_k : G \to G$ and let $D = [d_{ab}]$ be the correlation matrix corresponding to the inverse function $e_k^{-1} : G \to G$. Then
\[ C = D^T. \tag{4.1} \]

Proof The property is proved simply by inserting the definitions in question:
\[
\begin{aligned}
c_{ab} &= \bar I(\chi_a(X) \cdot \bar\chi_b(e_K(X))) \\
&= \bar I(\chi_b(Y) \cdot \bar\chi_a(e_K^{-1}(Y))) \\
&= d_{ba}. \qquad \Box
\end{aligned}
\]

As a direct result of this, we have the following.

Corollary 4.1.12 The correlation matrix of an involution is symmetric.

Proof Let the function $e : G \to G$ be an involution, i.e., $e^{-1} = e$, and let $C$ be the corresponding correlation matrix. According to (4.1), $C = C^T$. $\Box$

Since a large percentage of ciphers use involutory round functions to facilitate decryption, the correlation matrix of a cipher is often symmetric.

Remark 4.1.13 PES (Proposed Encryption System) is a block cipher constructed by Lai and Massey [24]. It was later replaced in [25] by the slightly modified IPES (Improved PES), which is now known as IDEA (International Data Encryption Algorithm). PES uses an involution as round permutation. As mentioned, this results in a symmetric correlation matrix $C$. The symmetry also appears in the matrix of differential transition probabilities $D$ which is used with differential cryptanalysis. The purpose of modifying PES was exactly to get rid of this symmetry. The modification caused a permutation of the columns of both $C$ and $D$ which made the cipher stronger. Generally, symmetrical matrices cause a higher maximum multiround imbalance than non-symmetrical ones.

In an attack, we exploit I/O products with a high imbalance. The question is how to find these given some cipher. Clearly, trying every possible I/O product and computing the average-key imbalance by running through all possible keys and inputs is an immense task, which is, to say the least, impractical. However, by considering the correlation matrix it is possible to compute the average-key imbalance of any I/O product in an easy way. This is formulated in the following proposition.

Proposition 4.1.14 Let there be given a cipher with a group operation at the entry and at the exit, i.e., let the encryption function $e_{jkl} : G \to G$ be given by $e_{jkl}(x) = \tilde e_k(x + j) + l$, where $j, l \in G$ are the subkeys used in the initial and final group operations, and $k \in \mathcal{K}$ is the rest of the key (cf. Figure 4.1). In addition, let $(f, g)$ be an I/O pair over $G$, let $C$ denote the correlation matrix of the cipher, and let $S$ be the I/O product corresponding to $(f, g)$. Then
\[ \bar I(S) = \mathbf{f}^T C \mathbf{g}, \tag{4.2} \]
where $\mathbf{f}$ and $\mathbf{g}$ are the Fourier power spectrum vectors of $f$ and $g$ respectively, i.e., $\mathbf{f}_a = |\mathcal{F}\{f\}(a)|^2$ and $\mathbf{g}_b = |\mathcal{F}\{g\}(b)|^2$ for $a, b \in G \setminus \{0\}$.

[Figure 4.1 shows the input $X$, the entry subkey $J$, the keyed function $\tilde e_K$, the exit subkey $L$, and the output $Y = \tilde e_K(X + J) + L$.]

Figure 4.1: A cipher with a group operation at the entry and at the exit.

Proof According to the definition of average-key imbalance, we have
\[
\begin{aligned}
\bar I(S) &= \frac{1}{|\mathcal{K}| \cdot n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} I(S \mid j, k, l) \\
&= \frac{1}{|\mathcal{K}| \cdot n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} I\bigl(f(X) \cdot g(Y \mid j, k, l)\bigr) \\
&= \frac{1}{|\mathcal{K}| \cdot n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} f(x) \cdot g(y) \Bigr|^2,
\end{aligned}
\tag{4.3}
\]
where $y = \tilde e_k(x + j) + l$. To simplify notation, let $F_a = \mathcal{F}\{f\}(a)$ and $G_b = \mathcal{F}\{g\}(b)$. Now we express $f(x)$ and $g(y)$ by the inverse Fourier transform:
\[ f(x) = \sum_{a \in G} F_a \cdot \chi_a(x). \tag{4.4} \]
Similarly for $g(y)$ we obtain
\[
g(y) = \sum_{b \in G} G_b \cdot \chi_b(y) = \sum_{b \in G} G_b \cdot \chi_b(\tilde e_k(x + j) + l). \tag{4.5}
\]
Substitution of (4.4) and (4.5) into (4.3) yields
\[
\bar I(S) = \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \Bigl[ \sum_{a \in G} F_a \cdot \chi_a(x) \Bigr] \cdot \Bigl[ \sum_{b \in G} G_b \cdot \chi_b(\tilde e_k(x + j) + l) \Bigr] \Bigr|^2.
\]
Recall that $\bar\chi(x) = \chi(-x)$, so
\[
\bar I(S) = \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \sum_{a,b \in G} F_a G_b \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(\tilde e_k(x + j) + l) \Bigr) \Bigr|^2.
\]
Recall also that $\chi(x + y) = \chi(x) \cdot \chi(y)$. Thus
\[
\begin{aligned}
\bar I(S) &= \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \sum_{a,b \in G} F_a G_b \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(\tilde e_k(x + j)) \cdot \bar\chi_b(l) \Bigr) \Bigr|^2 \\
&= \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \sum_{a,b \in G} F_a G_b \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x - j) \cdot \bar\chi_b(\tilde e_k(x)) \cdot \bar\chi_b(l) \Bigr) \Bigr|^2 \\
&= \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \sum_{a,b \in G} F_a G_b \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \chi_a(-j) \cdot \bar\chi_b(\tilde e_k(x)) \cdot \bar\chi_b(l) \Bigr) \Bigr|^2.
\end{aligned}
\]

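Since $\bar\chi_b(l) = \chi_b(-l)$ we get
\[
\bar I(S) = \frac{1}{|\mathcal{K}| n^2} \sum_{j,l \in G} \sum_{k \in \mathcal{K}} \Bigl| \sum_{a,b \in G} F_a G_b \, \chi_a(-j) \cdot \chi_b(-l) \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(\tilde e_k(x)) \Bigr) \Bigr|^2.
\]
Application of Parseval's identity twice (once with respect to $j$ and once with respect to $l$) removes the factors $\chi_a(-j)$ and $\chi_b(-l)$ together with the sums over $j$ and $l$, and yields
\[
\bar I(S) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \sum_{a,b \in G} \Bigl| F_a G_b \cdot \Bigl( \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(\tilde e_k(x)) \Bigr) \Bigr|^2.
\]
Since $E[f] = 0$ and $E[g] = 0$ by the definition of an I/O pair, we have $\mathbf{f}_0 = \mathbf{g}_0 = 0$, and finally we obtain (4.2):
\[
\begin{aligned}
\bar I(S) &= \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \sum_{a,b \in G \setminus \{0\}} \bigl| F_a G_b \cdot \mathcal{F}\{\chi_b \circ \tilde e_k\}(a) \bigr|^2 \\
&= \sum_{a,b \in G \setminus \{0\}} |F_a|^2 \cdot |G_b|^2 \cdot \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \bar\chi_b(\tilde e_k(x)) \Bigr|^2 \\
&= \sum_{a,b \in G \setminus \{0\}} \mathbf{f}_a \mathbf{g}_b \, \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} I(L_{ab}(X, Y) \mid k) \\
&= \sum_{a,b \in G \setminus \{0\}} \mathbf{f}_a \mathbf{g}_b \, \bar I(L_{ab}(X, Y)) \\
&= \mathbf{f}^T C \mathbf{g}. \qquad \Box
\end{aligned}
\]

Proposition 4.1.14 still leaves us with the problem of determining the correlation matrix of a whole cipher. This would be an immense task, too, if it weren't for the following proposition, which lets us determine the correlation matrix of an iterated block cipher given the correlation matrices of the individual rounds.

Proposition 4.1.15 The correlation matrix of a multiround cipher consisting of keyed permutations separated by keyed group operations is the product of the correlation matrices of the individual permutations. In other words,
\[ C^{(1..r)} = \prod_{s=1}^{r} C^{(s)}, \]
where $r$ is the number of rounds, $C^{(1..r)} = [c^{(1..r)}_{ab}]$ is the correlation matrix of the entire cipher, and $C^{(s)}$ is the correlation matrix of the permutation of round $s$.

Proof First, the result is shown for two rounds. We denote the first round key by $j$, the intermediate subkey used with the group operation by $k$, and the second round key by $l$. The first round function is denoted by $Q_j$, and the second by $R_l$, i.e., the encryption function is given by $e_{jkl}(x) = R_l(Q_j(x) + k)$, where $j \in \mathcal{J}$, $k \in G$, and $l \in \mathcal{L}$. Finally, the random variables $W$, $X$, $Y$, and $Z$ denote the input and the output of round one, and the input and output of round two, respectively (cf. Figure 4.2).

[Figure 4.2 shows the input $W$, the first round $Q_J$ with output $X$, the group operation with subkey $K$ giving $Y$, and the second round $R_L$ with output $Z = R_L(Q_J(W) + K)$.]

Figure 4.2: Two rounds separated by a group operation.

Then we have
\[
\begin{aligned}
\sum_{t \in G} c^{(1)}_{at} c^{(2)}_{tb}
&= \sum_{t \in G} \bar I(\chi_a(W) \cdot \bar\chi_t(X)) \cdot \bar I(\chi_t(Y) \cdot \bar\chi_b(Z)) \\
&= \sum_{t \in G} \Bigl[ \frac{1}{|\mathcal{J}|} \sum_{j \in \mathcal{J}} I\bigl(\chi_a(Q_j^{-1}(X)) \cdot \bar\chi_t(X) \mid J = j\bigr) \Bigr] \cdot \Bigl[ \frac{1}{|\mathcal{L}|} \sum_{l \in \mathcal{L}} I\bigl(\chi_t(Y) \cdot \bar\chi_b(R_l(Y)) \mid L = l\bigr) \Bigr] \\
&= \sum_{t \in G} \Bigl[ \frac{1}{|\mathcal{J}|} \sum_{j \in \mathcal{J}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(Q_j^{-1}(x)) \cdot \bar\chi_t(x) \Bigr|^2 \Bigr] \cdot \Bigl[ \frac{1}{|\mathcal{L}|} \sum_{l \in \mathcal{L}} \Bigl| \frac{1}{n} \sum_{y \in G} \chi_t(y) \cdot \bar\chi_b(R_l(y)) \Bigr|^2 \Bigr] \\
&= \frac{1}{|\mathcal{J}| \cdot |\mathcal{L}| \cdot n^2} \sum_{t \in G} \sum_{j \in \mathcal{J},\, l \in \mathcal{L}} \Bigl| \frac{1}{n} \sum_{x,y \in G} \chi_a(Q_j^{-1}(x)) \cdot \bar\chi_b(R_l(y)) \cdot \chi_t(y - x) \Bigr|^2.
\end{aligned}
\]
By applying Parseval's identity with respect to $t$ this becomes
\[
\frac{1}{|\mathcal{J}| \cdot |\mathcal{L}| \cdot n} \sum_{t \in G} \sum_{j \in \mathcal{J},\, l \in \mathcal{L}} \Bigl| \frac{1}{n} \sum_{x,y \in G} \chi_a(Q_j^{-1}(x)) \cdot \bar\chi_b(R_l(y)) \cdot \delta_t(y - x) \Bigr|^2,
\]
where $\delta_t(y - x)$ equals 1 if $y - x = t$ and 0 otherwise. Whenever $y - x \neq t$, the terms in the sum disappear, leaving the terms where $y = x + t$. The expression simplifies to
\[
\frac{1}{|\mathcal{J}| \cdot |\mathcal{L}| \cdot n} \sum_{t \in G} \sum_{j \in \mathcal{J},\, l \in \mathcal{L}} \Bigl| \frac{1}{n} \sum_{x \in G} \chi_a(Q_j^{-1}(x)) \cdot \bar\chi_b(R_l(x + t)) \Bigr|^2.
\]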

Substituting $k = t$ and $w = Q_j^{-1}(x)$ yields
\[
\begin{aligned}
&\frac{1}{|\mathcal{J}| \cdot |\mathcal{L}| \cdot n} \sum_{j \in \mathcal{J},\, k \in G,\, l \in \mathcal{L}} \Bigl| \frac{1}{n} \sum_{w \in G} \chi_a(w) \cdot \bar\chi_b(R_l(Q_j(w) + k)) \Bigr|^2 \\
&\quad = \frac{1}{|\mathcal{J}| \cdot |\mathcal{L}| \cdot n} \sum_{j \in \mathcal{J},\, k \in G,\, l \in \mathcal{L}} I(\chi_a(W) \cdot \bar\chi_b(Z) \mid j, k, l) \\
&\quad = \bar I(\chi_a(W) \cdot \bar\chi_b(Z)) \\
&\quad = c^{(1..2)}_{ab}.
\end{aligned}
\]
Thus
\[ c^{(1..2)}_{ab} = \sum_{t} c^{(1)}_{at} c^{(2)}_{tb} \]
for all $a, b \in G \setminus \{0\}$, i.e.,
\[ C = C^{(1)} \cdot C^{(2)}. \]
By induction this property generalizes to more than two rounds. $\Box$

4.2 The Distribution of Imbalance Estimates

In this section we turn to the examination of the probability density function of imbalance estimates. As indicated by the attack descriptor, the likelihood estimator of a correlation attack is the imbalance operator $I(\cdot)$. Thus, in the statistical attack we must construct an estimate $\tilde I(S)$ of $I(S)$ given some I/O product $S = f(X) \cdot g(\tilde Y)$ and a number $N$ of P/C-pairs. Since $I(S) = |E[S]|^2$, the natural choice of estimate is
\[ \tilde I(S) = \Bigl| \frac{1}{N} \sum_{(x,y) \in \mathcal{Z}} f(x) \cdot g(\tilde y) \Bigr|^2. \]
We mention without proof that this estimator is noncentral but consistent.

To do hypothesis testing via this value, we need the distribution of $\tilde I(S)$. The following definition presents the probability density function of $\tilde I(S)$. We later prove that this is indeed the correct density function.

Definition 4.2.1 Imbalance distribution. Let the imbalance distribution with imbalance parameter $J$ and sample size parameter $N$ be the probability distribution with density $\xi(\cdot\,; J, N) : \mathbb{R}^+ \cup \{0\} \to \mathbb{R}^+ \cup \{0\}$ defined by
\[ \xi(x; J, N) = \frac{2N}{1 - J} \cdot h\Bigl( \frac{2N}{1 - J}\, x;\ \frac{2N}{1 - J}\, J \Bigr), \tag{4.6} \]
where $h(\cdot\,; s)$ represents the probability density function of the non-central $\chi^2$-distribution with 2 degrees of freedom and skewness parameter $s$.

The following gives a more explicit expression.

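Proposition 4.2.2 The density function $\xi$ of the imbalance distribution with parameters $J$ and $N$ is given by
\[ \xi(x; J, N) = \frac{N}{(1 - J)\sqrt{\pi}}\, e^{-\frac{N(x + J)}{1 - J}} \sum_{r=0}^{\infty} \beta_r \tag{4.7} \]
with
\[ \beta_r = \frac{1}{(2r)!} \Bigl( \Bigl( \frac{2N}{1 - J} \Bigr)^2 J x \Bigr)^r \frac{\Gamma(r + \frac{1}{2})}{\Gamma(r + 1)}, \tag{4.8} \]
where $\Gamma$ represents the gamma function.

Proof The non-central $\chi^2$-distribution $h : \mathbb{R}^+ \cup \{0\} \to \mathbb{R}^+ \cup \{0\}$ with $d$ degrees of freedom and skewness parameter $s$ has density given by (see [34])
\[ h_d(x; s) = 2^{-\frac{d}{2}}\, \Gamma\bigl(\tfrac{1}{2}\bigr)^{-1} e^{-\frac{1}{2}(x + s)}\, x^{\frac{1}{2}(d - 2)} \sum_{r=0}^{\infty} \alpha_r, \]
where
\[ \alpha_r = \frac{1}{(2r)!} (s x)^r \frac{\Gamma(r + \frac{1}{2})}{\Gamma((d + 2r)/2)}. \]
For $d = 2$ this simplifies to
\[ h_2(x; s) = \frac{1}{2\sqrt{\pi}}\, e^{-\frac{1}{2}(x + s)} \sum_{r=0}^{\infty} \alpha_r \]
with
\[ \alpha_r = \frac{1}{(2r)!} (s x)^r \frac{\Gamma(r + \frac{1}{2})}{\Gamma(r + 1)}. \tag{4.9} \]
Expanding $\frac{2N}{1 - J} \cdot h_2\bigl( \frac{2N}{1 - J} x;\ \frac{2N}{1 - J} J \bigr)$ and simplifying the result yields (4.7) and (4.8). $\Box$

As one can see, the probability density function is rather complex. However, for small values of $J$, it simplifies greatly.

Proposition 4.2.3 If $J \ll (\frac{1}{2N})^2$, then the imbalance distribution with parameters $J$ and $N$ closely resembles an exponential distribution. More formally,
\[ \xi(x; J, N) \approx h(x; J, N), \]
where
\[ h(x; J, N) = \frac{N}{1 - J}\, e^{-\frac{N(x + J)}{1 - J}}, \]
with accumulated error
\[ \epsilon = \int_0^{\infty} |h(x; J, N) - \xi(x; J, N)|\, dx = 1 - e^{-\frac{JN}{1 - J}}. \]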

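Proof Recall from Proposition 4.2.2 that the probability density is given by (4.7) and (4.8). For $r > 0$ all three factors of $\beta_r$ are decreasing in $r$ (recall that the gamma-function is increasing for positive arguments). Thus for $r > 0$ and $x \le 1$
\[
\beta_r \le \frac{1}{2!} \Bigl( \Bigl( \frac{2N}{1 - J} \Bigr)^2 J x \Bigr)^r \frac{\Gamma(\frac{3}{2})}{\Gamma(2)}
\le \Bigl( \Bigl( \frac{2N}{1 - J} \Bigr)^2 J \Bigr)^r \beta_0.
\]
For $J \ll (\frac{1}{2N})^2$ this implies
\[
\sum_{r=1}^{\infty} \beta_r \le \beta_0 \sum_{r=1}^{\infty} \Bigl( \Bigl( \frac{2N}{1 - J} \Bigr)^2 J \Bigr)^r
= \frac{\bigl( \frac{2N}{1 - J} \bigr)^2 J}{1 - \bigl( \frac{2N}{1 - J} \bigr)^2 J}\, \beta_0
\ll \beta_0.
\]
Since $\beta_0 = \sqrt{\pi}$, this means altogether that $\sum_{r=0}^{\infty} \beta_r \approx \sqrt{\pi}$ and thus by (4.7) we have
\[ \xi(x; J, N) \approx \frac{N}{1 - J}\, e^{-\frac{N(x + J)}{1 - J}}. \]
Since all values of $\beta_r$ are positive, we have $h(x; J, N) < \xi(x; J, N)$, and thus the accumulated error is given by
\[
\begin{aligned}
\epsilon &= \int_0^{\infty} \bigl( \xi(x; J, N) - h(x; J, N) \bigr)\, dx \\
&= 1 - \int_0^{\infty} h(x; J, N)\, dx \\
&= 1 - e^{-\frac{JN}{1 - J}}.
\end{aligned}
\]
Note that $h(x)$ does not define a probability density function since $\int_0^{\infty} h(x)\, dx \neq 1$. However, for small $J$ the value is close to 1. $\Box$

Unfortunately it is not often the case that $J \ll (\frac{1}{2N})^2$, since we usually have a large value of $N$. In that case, Algorithm XI is useful for evaluating $\xi(x; J, N)$.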

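Algorithm for evaluating the probability density of the imbalance distribution.

procedure XI($x$, $J$, $N$)
// input:  $x$, the I/O product imbalance estimate.
//         $J$, the imbalance parameter.
//         $N$, the sample size parameter.
// output: The value of $\xi(x; J, N)$.
begin
    $\epsilon := 0.00001$
    $r := 0$
    $\rho_0 := 1$
    $s_0 := \rho_0$
    $\tilde x := x \cdot 2N/(1 - J)$
    $a := J \cdot 2N/(1 - J)$
    while $\rho_r > \epsilon$ do begin
        $r := r + 1$
        $\rho_r := \rho_{r-1} \cdot a\tilde x/(2r)^2$
        $s_r := s_{r-1} + \rho_r$
    end
    $r_{\mathrm{last}} := r$
    return $N/(1 - J) \cdot e^{-(\tilde x + a)/2} \cdot s_{r_{\mathrm{last}}}$
end

The value of $\epsilon$ can be changed to suit the actual need for precision.

Proposition 4.2.4 Algorithm XI computes the density of the imbalance distribution with the required precision.

Proof The variable $\rho_r$ is assigned the value $\rho_{r-1} \cdot a\tilde x/(2r)^2$ and thus by induction we have
\[ \rho_r = \Bigl( \frac{1}{2 \cdot 4 \cdot 6 \cdots (2r)} \Bigr)^2 (a\tilde x)^r \tag{4.10} \]
since $\rho_0 = 1$. Also by induction we see that $s_r = \sum_{j=0}^{r} \rho_j$. Recall the recurrence relation $\Gamma(x + 1) = x\Gamma(x)$ and that $\Gamma(1) = 1$ and $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. This implies $\Gamma(n + 1) = n!$ and $\Gamma(n + \frac{1}{2}) = \frac{1}{2} \cdot \frac{3}{2} \cdot \frac{5}{2} \cdots \frac{2n - 1}{2} \sqrt{\pi}$ for $n \in \mathbb{N}$. Using this with (4.10) yields
\[
\begin{aligned}
\rho_r &= \frac{1}{1 \cdot 2 \cdots (2r)} \cdot \frac{1 \cdot 3 \cdot 5 \cdots (2r - 1)}{2 \cdot 4 \cdot 6 \cdots (2r)}\, (a\tilde x)^r \\
&= \frac{1}{(2r)!} \cdot \frac{\frac{1 \cdot 3 \cdot 5 \cdots (2r - 1)}{2^r} \sqrt{\pi}}{r!} \cdot \frac{1}{\sqrt{\pi}}\, (a\tilde x)^r \\
&= \frac{1}{\sqrt{\pi}\,(2r)!}\, (a\tilde x)^r\, \frac{\Gamma(r + \frac{1}{2})}{\Gamma(r + 1)}.
\end{aligned}
\]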

Substituting $\tilde x = x \cdot 2N/(1 - J)$ and $a = J \cdot 2N/(1 - J)$ yields $\rho_r = \beta_r/\sqrt{\pi}$ with $\beta_r$ as defined in (4.8). The return value is thus
\[
\begin{aligned}
\frac{N}{1 - J}\, e^{-(\tilde x + a)/2} \sum_{r=0}^{r_{\mathrm{last}}} \rho_r
&= \frac{N}{(1 - J)\sqrt{\pi}}\, e^{-\frac{N(x + J)}{1 - J}} \sum_{r=0}^{r_{\mathrm{last}}} \beta_r \\
&\approx \frac{N}{(1 - J)\sqrt{\pi}}\, e^{-\frac{N(x + J)}{1 - J}} \sum_{r=0}^{\infty} \beta_r \\
&= \xi(x; J, N).
\end{aligned}
\]
The while-statement secures the required precision. The algorithm terminates since $r!$ grows faster than $k^r$ for positive $k$ and $r \in \mathbb{N}$. $\Box$

We now present the proof that $\xi$ is indeed the density function of imbalance estimates.

Proposition 4.2.5 Let $S$ be an I/O product, and let $S_1, S_2, \ldots, S_N$ denote $N$ independent random variables (called samples) each with the same distribution as $S$. Furthermore, let the random variable $\tilde I(S)$ be an estimate of $I(S)$ based on the $N$ samples of $S$, i.e., let
\[ \tilde I(S) = \Bigl| \frac{1}{N} \sum_{j=1}^{N} S_j \Bigr|^2. \tag{4.11} \]
Then, assuming that $N$ is large, the distribution of $\tilde I(S)$ is close to the imbalance distribution $\xi(\cdot\,; I(S), N)$.

Proof Let $X$ and $Y$ denote cipher input and output, as usual. Furthermore, let $K$ denote the key used in the final group operation, and let $Y'$ denote the input to the last group operation, i.e., $Y = Y' + K$. Assume that the I/O product is given by $S = f(X) \cdot g(Y)$. We wish to determine the distribution of (4.11). In the following, let $A = \mathrm{Re}(S)$, $B = \mathrm{Im}(S)$, $A_j = \mathrm{Re}(S_j)$, and $B_j = \mathrm{Im}(S_j)$ for all $j = 1, 2, \ldots, N$. Then
\[ I(S) = |E[S]|^2 = E[A]^2 + E[B]^2. \]
The sum of the variances of $A$ and $B$ is
\[
\begin{aligned}
V[A] + V[B] &= E[A^2] - (E[A])^2 + E[B^2] - (E[B])^2 \\
&= E[A^2 + B^2] - I(S) \\
&= E[|S|^2] - I(S) \\
&= E[|f(X) \cdot g(Y)|^2] - I(S) \\
&= E[|f(X)|^2 \cdot |g(Y' + K)|^2] - I(S).
\end{aligned}
\]
Computing the expected value over all possible $k, x \in G$ and letting $y = y' + k$ be the ciphertext corresponding to the plaintext $x$ yields
\[
\begin{aligned}
&-I(S) + \frac{1}{n^2} \sum_{k \in G} \sum_{x \in G} |f(x) \cdot g(y' + k)|^2 \\
&\quad = -I(S) + \frac{1}{n} \sum_{x \in G} |f(x)|^2 \cdot \frac{1}{n} \sum_{k \in G} |g(y' + k)|^2 \\
&\quad = -I(S) + \Bigl( \frac{1}{n} \sum_{x \in G} |f(x)|^2 \Bigr) \cdot \Bigl( \frac{1}{n} \sum_{k \in G} |g(k)|^2 \Bigr) \\
&\quad = E[|f(X)|^2] \cdot E[|g(K)|^2] - I(S).
\end{aligned}
\]
Since $(f, g)$ is an I/O pair, we have $E[|f(X)|^2] = E[|g(K)|^2] = 1$, resulting in the simplified relation
\[ V[A] + V[B] = 1 - I(S). \]
We have
\[ \tilde I(S) = \tilde A^2 + \tilde B^2, \tag{4.12} \]
where
\[ \tilde A = \frac{1}{N} \sum_{i=1}^{N} A_i \quad \text{and} \quad \tilde B = \frac{1}{N} \sum_{i=1}^{N} B_i. \]
As $N$ grows, the central limit theorem tells us that the distribution of $\tilde A$ approaches a normal distribution with mean value $E[A]$ and variance $V[A]/N$. Similarly, the distribution of $\tilde B$ approaches a normal distribution with mean value $E[B]$ and variance $V[B]/N$. To proceed, we now assume that $V[A]$ and $V[B]$ are equal, i.e., that
\[ V[A] = V[B] = \frac{1 - I(S)}{2}. \]
If $V[A] \neq V[B]$, we simply rotate $S = A + iB$ in the complex plane until $V[A] = V[B]$ by multiplying $S$ by a suitable complex number $w$ of magnitude 1. This causes no loss of generality since $\tilde I(wS) = \tilde I(S)$ when $|w| = 1$. Dividing both sides of (4.12) by $V[A]/N$, we now obtain
\[ \frac{2N}{1 - I(S)}\, \tilde I(S) = \tilde C^2 + \tilde D^2, \]
where $\tilde C$ and $\tilde D$ are random variables following normal distributions with variance 1 and mean values
\[ E[\tilde C] = E[A] \sqrt{\frac{2N}{1 - I(S)}} \quad \text{and} \quad E[\tilde D] = E[B] \sqrt{\frac{2N}{1 - I(S)}}, \]
respectively. According to [34], the sum of $d$ squared normally distributed random variables with variance 1 follows the non-central $\chi^2$-distribution with $d$ degrees of freedom and skewness parameter $s$ given by the sum of the squared mean values. Consequently, $\frac{2N}{1 - I(S)} \tilde I(S)$ follows the non-central $\chi^2$-distribution with two degrees of freedom and skewness parameter
\[
\begin{aligned}
s &= E[\tilde C]^2 + E[\tilde D]^2 \\
&= \Bigl( E[A] \sqrt{\frac{2N}{1 - I(S)}} \Bigr)^2 + \Bigl( E[B] \sqrt{\frac{2N}{1 - I(S)}} \Bigr)^2 \\
&= I(S)\, \frac{2N}{1 - I(S)}.
\end{aligned}
\]

Let $h(\cdot\,; s)$ denote the corresponding probability density function of $\frac{2N}{1 - I(S)} \tilde I(S)$. Then the density function $\xi$ of $\tilde I(S)$ is
\[ \xi(x; I(S), N) = \Bigl( \frac{2N}{1 - I(S)} \Bigr) \cdot h\Bigl( \Bigl( \frac{2N}{1 - I(S)} \Bigr) \cdot x;\ s \Bigr). \]
Substituting $s = I(S) \frac{2N}{1 - I(S)}$, $I(S) = J$ and simplifying yields (4.6). $\Box$

4.2.1 Empirical Evidence for the Distribution Function

Computer simulations support Proposition 4.2.5. Figures 4.3 and 4.4 show both the theoretical distribution and the empirical distribution of $\tilde I(S)$ for various values of $N$ and $I(S)$.

The empirical distribution is indicated by the histogram and the theoretical distribution by the graph. Note how the variance decreases and how the top moves closer to $x = I(S)$ as $N$ grows. The empirical distribution was calculated on the basis of randomly chosen permutations over $\mathbb{Z}_{256}$. Each histogram represents 20000 estimates of $\tilde I(S)$.
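An experiment of the kind described above can be sketched as follows. This is not the author's simulation code, only a minimal illustration under assumptions: the group is $\mathbb{Z}_n$ with characters $\chi_a(x) = e^{2\pi i a x / n}$, the I/O product is the linear one $L_{ab}$ over a single random permutation, samples are drawn independently with replacement, and the names `character_table` and `simulate_estimates` are hypothetical.

```python
import cmath
import random


def character_table(a, n):
    """Values chi_a(x) = exp(2*pi*i*a*x/n) for x in Z_n (assumed form of the characters)."""
    return [cmath.exp(2j * cmath.pi * a * x / n) for x in range(n)]


def simulate_estimates(perm, a, b, sample_size, trials, rng):
    """Return (exact imbalance, list of estimates) for L_ab over a fixed permutation."""
    n = len(perm)
    chi_a, chi_b = character_table(a, n), character_table(b, n)
    # Value of S = chi_a(x) * conj(chi_b(perm[x])) for every possible input x.
    products = [chi_a[x] * chi_b[perm[x]].conjugate() for x in range(n)]
    exact = abs(sum(products) / n) ** 2
    estimates = []
    for _ in range(trials):
        # Draw N independent inputs and form the estimate |1/N * sum S_j|^2.
        total = sum(products[rng.randrange(n)] for _ in range(sample_size))
        estimates.append(abs(total / sample_size) ** 2)
    return exact, estimates


if __name__ == "__main__":
    rng = random.Random(2024)
    perm = list(range(256))
    rng.shuffle(perm)
    exact, estimates = simulate_estimates(perm, 1, 1, 100, 2000, rng)
    print("I(S) =", exact)
    print("mean estimate =", sum(estimates) / len(estimates))
    print("I(S) + (1 - I(S))/N =", exact + (1 - exact) / 100)
```

A histogram of `estimates` can then be compared with $\xi(x; I(S), N)$. As a quick sanity check, the sample mean should lie close to $I(S) + (1 - I(S))/N$, the mean value derived in the proof of Proposition 5.3.2.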

[Figure 4.3: three panels plotting the density $f(x)$ against $x$, each for $I(S) = 0.004302$, with $N = 100$, $N = 1000$, and $N = 10000$ respectively.]

Figure 4.3: Some theoretical and empirical distributions of $\tilde I(S)$.

[Figure 4.4: three panels plotting the density $f(x)$ against $x$, each for $I(S) = 0.008562$, with $N = 100$, $N = 1000$, and $N = 10000$ respectively.]

Figure 4.4: Some theoretical and empirical distributions of $\tilde I(S)$.
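The theoretical curves in Figures 4.3 and 4.4 are values of $\xi(x; J, N)$. A direct Python transcription of Algorithm XI might look as follows; it is a sketch rather than the author's implementation. The partial sum is started at the $r = 0$ term, as the proof of Proposition 4.2.4 assumes, and the loop stops once the latest term $\rho_r$ drops below `eps` (an assumption about the intended stopping rule). For very large $N$ the terms can overflow a float, so a log-domain variant would be preferable there.

```python
import math


def xi(x, J, N, eps=1e-5):
    """Density of the imbalance distribution at x, following Algorithm XI.

    J is the imbalance parameter (0 <= J < 1) and N the sample size parameter.
    """
    x_t = x * 2 * N / (1 - J)   # rescaled argument
    a = J * 2 * N / (1 - J)     # skewness parameter of the non-central chi-square
    rho = 1.0                   # rho_0
    s = 1.0                     # running sum, starting with rho_0
    r = 0
    while rho > eps:
        r += 1
        rho *= a * x_t / (2 * r) ** 2
        s += rho
    return N / (1 - J) * math.exp(-(x_t + a) / 2) * s


if __name__ == "__main__":
    for x in (0.0, 0.005, 0.01, 0.02):
        print(x, xi(x, 0.004302, 100))
```

The parameters in the usage lines match the first panel of Figure 4.3. Integrating `xi` numerically over $x \ge 0$ should give a value close to 1, which is a convenient check on any reimplementation.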


Chapter 5

Attack Algorithms

Based on the previous derivations of the probability distribution of I/O-products, we now present three attack algorithms of different complexity and strength. In this chapter, $\mathcal{K}$ denotes the set of possible last-round keys.

5.1 The Simple Attack

By the term "simple" we refer to the usage of only one I/O pair $(f, g)$. The simple attack is carried out by hypothesis testing via the Neyman-Pearson Lemma [27] (cf. Chapter 3). In the following $X$ and $Y$ denote the cipher input and output as usual, whereas $\tilde k$ denotes a last-round key guess, and $\tilde Y = R^{-1}_{\tilde k}(Y)$ the corresponding input to the last round given that the last-round key is $\tilde k$ (cf. Figure 3.2).

Let $(f, g)$ be an I/O pair and let $S = f(X) \cdot g(\tilde Y)$ be the corresponding I/O product. Similarly, let
\[ S^{(1..r-1)} = f(X) \cdot g(Y^{(r-1)}) \quad \text{and} \quad S^{(1..r+1)} = f(X) \cdot g(Y^{(r+1)}) \]
be the I/O products over the reduced and expanded cipher, respectively. Furthermore, let there be given $N$ P/C-pair samples, and let $\tilde I(S)$ denote the corresponding imbalance estimate of $I(S)$ based upon these samples.

For each guess of last-round key there are two possibilities: 1) the guess is the correct last-round key; or 2) the guess is a wrong last-round key. If possibility 1) is correct, then we have $\tilde Y = Y^{(r-1)}$ and thus the distribution of the imbalance estimate $\tilde I(S)$ will be $\xi(x; \bar I(S^{(1..r-1)}), N)$ (assuming that the hypothesis of fixed key equivalence holds). If possibility 2) is correct, $\tilde I(S)$ will have the distribution $\xi(x; \bar I(S^{(1..r+1)}), N)$.

To determine the correct possibility, we must distinguish, for each guess of last-round key, between two $\xi$-distributions, one with parameter $\bar I(S^{(1..r-1)})$ and one with parameter $\bar I(S^{(1..r+1)})$. According to the Neyman-Pearson Lemma, the likelihood ratio
\[ \frac{\xi(\tilde I(S); \bar I(S^{(1..r-1)}), N)}{\xi(\tilde I(S); \bar I(S^{(1..r+1)}), N)} \]
provides the best possible test value to accomplish this. The higher this value is, the more probable it is that $I(S) = \bar I(S^{(1..r-1)})$, meaning that $\tilde Y = Y^{(r-1)}$ (i.e., we have chosen the correct key). Thus, to find the maximum likelihood key we must search for the last-round key that maximizes the likelihood ratio. This is done in the following algorithm.

Algorithm for attacking a block cipher.

procedure ATTACK1($(f, g)$, $\mathcal{Z}$)
// input:  $(f, g)$, an I/O pair
//         $\mathcal{Z}$, the set of collected P/C-pairs
// output: The set of maximum likelihood last-round keys
begin
    for $\tilde k \in \mathcal{K}$ do begin
        $\tilde I := 0$
        for $(x, y) \in \mathcal{Z}$ do begin
            $\tilde y := R^{-1}_{\tilde k}(y)$
            $\tilde I := \tilde I + f(x) \cdot g(\tilde y)$
        end
        $\tilde I := |\tilde I / N|^2$
        $L[\tilde k] := \xi(\tilde I; \bar I(S^{(1..r-1)}_{fg}), N) / \xi(\tilde I; \bar I(S^{(1..r+1)}_{fg}), N)$
    end
    return $\{\tilde k : L[\tilde k] = \max_k L[k]\}$
end

The complexity of the above algorithm is $O(N \cdot |\mathcal{K}|)$ if we approximate the complexity of evaluating $\xi$ by $O(1)$.

5.2 The Advanced Attack

In the above approach we use only one I/O pair to distinguish the correct key from the wrong keys. However, it is possible to use several I/O pairs in an improved attack equivalent to the approach with "multiple approximations" considered in [20] for use with linear cryptanalysis. This is done in the following, where we consider a set of $m$ I/O pairs $\mathcal{A} = \{(f_1, g_1), (f_2, g_2), \ldots, (f_m, g_m)\}$ and the corresponding I/O products $\mathcal{S} = \{S_1, S_2, \ldots, S_m\}$ where $S_t = f_t(X) \cdot g_t(\tilde Y)$ for $t = 1, 2, \ldots, m$. We assume independence between the estimated imbalances $\tilde I(S_1), \tilde I(S_2), \ldots, \tilde I(S_m)$, implying that their joint distribution is given by the probability density function $h(j_1, j_2, \ldots, j_m) = \prod_{t=1}^{m} \xi(j_t; \bar I(S_t), N)$. The likelihood ratio of the Neyman-Pearson Lemma is then
\[ \prod_{t=1}^{m} \frac{\xi(\tilde I(S_t); \bar I(S_t^{(1..r-1)}), N)}{\xi(\tilde I(S_t); \bar I(S_t^{(1..r+1)}), N)}, \]
resulting in the algorithm below. In the actual implementation, we evaluate the log-likelihood ratio
\[ \sum_{t=1}^{m} \Bigl[ \log \xi(\tilde I(S_t); \bar I(S_t^{(1..r-1)}), N) - \log \xi(\tilde I(S_t); \bar I(S_t^{(1..r+1)}), N) \Bigr] \]
so as not to cause overflow or underflow. Since log is an increasing function, this ranks the keys in the same order as the likelihood ratio itself.

Algorithm for attacking a block cipher.

procedure ATTACK2($\mathcal{A}$, $\mathcal{Z}$)
// input:  $\mathcal{A} = \{(f_1, g_1), (f_2, g_2), \ldots, (f_m, g_m)\}$, the set of I/O pairs
//         $\mathcal{Z}$, the set of collected P/C-pairs
// output: The set of maximum likelihood last-round keys
begin
    for $\tilde k \in \mathcal{K}$ do begin
        $L[\tilde k] := 0$
        for $t := 1, 2, \ldots, m$ do begin
            $(f, g) := (f_t, g_t)$
            $\tilde I := 0$
            for $(x, y) \in \mathcal{Z}$ do begin
                $\tilde y := R^{-1}_{\tilde k}(y)$
                $\tilde I := \tilde I + f(x) \cdot g(\tilde y)$
            end
            $\tilde I := |\tilde I / N|^2$
            $L[\tilde k] := L[\tilde k] + \log \xi(\tilde I; \bar I(S_t^{(1..r-1)}), N) - \log \xi(\tilde I; \bar I(S_t^{(1..r+1)}), N)$
        end
    end
    return $\{\tilde k : L[\tilde k] = \max_k L[k]\}$
end

The time complexity of the above algorithm is $O(m \cdot N \cdot |\mathcal{K}|)$, again assuming that evaluation of $\xi$ is possible in $O(1)$ time. The memory complexities of both Algorithm ATTACK1 and ATTACK2 are not very high if at each step of the algorithm we store only the key guesses which have maximum $L[\cdot]$.

5.3 Simplifying the Advanced Attack

The two algorithms presented above have one major drawback in common: while they might be optimal in a maximum-likelihood sense, they are difficult to analyze since they give rise to some very complex probability distributions. In this section we present a simplification of the advanced attack to facilitate analysis.
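Before turning to the simplification, the log-likelihood scoring of ATTACK1 and ATTACK2 can be sketched in Python. This is an illustration under assumptions, not the thesis implementation: `log_xi` evaluates $\log \xi$ with the series of Algorithm XI and a relative stopping rule, the caller supplies the average-key imbalances over the reduced and expanded cipher, and `undo_last_round(key_guess, y)` is a hypothetical callback standing in for $R^{-1}_{\tilde k}$. With a single I/O pair the function ranks keys as ATTACK1 does.

```python
import math


def log_xi(x, J, N, eps=1e-12):
    """Logarithm of the imbalance density xi(x; J, N), via the series of Algorithm XI."""
    c = 2 * N / (1 - J)
    x_t, a = c * x, c * J
    rho, total, r = 1.0, 1.0, 0
    while rho > eps * total:        # relative stopping rule (an assumption)
        r += 1
        rho *= a * x_t / (2 * r) ** 2
        total += rho
    return math.log(c / 2) - (x_t + a) / 2 + math.log(total)


def imbalance_estimate(f, g, pairs, undo_last_round, key_guess):
    """|1/N * sum f(x) * g(R^{-1}_k(y))|^2 over the collected P/C-pairs."""
    acc = sum(f(x) * g(undo_last_round(key_guess, y)) for x, y in pairs)
    return abs(acc / len(pairs)) ** 2


def attack2(io_pairs, reduced, expanded, pairs, keys, undo_last_round):
    """Return the set of last-round key guesses with maximal log-likelihood ratio.

    reduced[t] and expanded[t] are the imbalances of I/O pair t over the
    reduced and the expanded cipher, respectively (both must be below 1).
    """
    n_samples = len(pairs)
    score = {}
    for k in keys:
        score[k] = 0.0
        for (f, g), j1, j2 in zip(io_pairs, reduced, expanded):
            est = imbalance_estimate(f, g, pairs, undo_last_round, k)
            score[k] += log_xi(est, j1, n_samples) - log_xi(est, j2, n_samples)
    best = max(score.values())
    return {k for k, v in score.items() if v == best}
```

Working with `log_xi` directly, instead of taking the logarithm of a computed density, avoids underflow of the exponential factor when the estimate lies far from the hypothesized imbalance.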

In Algorithm ATTACK2 several $\xi$-distributed random variables are considered, namely the imbalance estimates for each of the I/O products $S_1, S_2, \ldots, S_m$. In the following approach, a new random variable $\tilde I_\alpha$ is introduced which is a weighted sum of the various imbalance estimates. This sum almost follows a normal distribution, which makes it a lot easier to do the hypothesis checking (we do not have to consider the high complexity $\xi$-function).

To simplify further we assume that $\tilde I(S_j)$ is distributed the same way for all $S_j$ whenever a wrong key is used, more precisely that the imbalance over the expanded cipher is $I(S_t^{(1..r+1)}) = \frac{1}{n-1}$ for all $t$ (corresponding to identical entries in the correlation matrix of the expanded cipher). As mentioned earlier, this is called the hypothesis of wrong key randomization and was formulated in [12] for use with binary GLC. The hypothesis was demonstrated to hold for a variety of actual ciphers in [14]. Intuitively, it is reasonable that $I(S^{(1..r+1)})$ is close to $\frac{1}{n-1}$, at least when compared to $I(S^{(1..r-1)})$, since the expanded cipher has two more rounds than the reduced cipher (resulting in a "more random" output). The following defines the weighted sum mentioned above.

Definition 5.3.1 Let $\mathcal{S} = \{S_1, S_2, \ldots, S_m\}$ be a set of $m$ I/O products. Furthermore, let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ with $\|\alpha\|_2 = 1$ be a vector consisting of $m$ real numbers. The random variable $\tilde I_\alpha(\mathcal{S})$ is then defined by the following weighted sum of imbalance estimates
\[ \tilde I_\alpha(\mathcal{S}) = \sum_{j=1}^{m} \alpha_j \cdot \tilde I(S_j). \]

One of the advantages of using this sum is its nice distribution, which is presented in the following proposition.

Proposition 5.3.2 Let $\mathcal{S} = \{S_1, S_2, \ldots, S_m\}$ be a set of $m$ I/O products. If $I(S_1), I(S_2), \ldots, I(S_m)$ are independent and small, then the random variable $\tilde I_\alpha(\mathcal{S})$ based upon $N$ samples follows a normal distribution with mean value
\[ E[\tilde I_\alpha(\mathcal{S})] = \sum_{j} \alpha_j \cdot \Bigl( \bar I(S_j) + \frac{1}{N} \Bigr) \]
and variance
\[ V[\tilde I_\alpha(\mathcal{S})] = N^{-2}. \]

Proof According to the proof of Proposition 4.2.5, the random variable
\[ \tilde J = \frac{2N}{1 - I(S)}\, \tilde I(S) \]
follows a non-central $\chi^2$-distribution with 2 degrees of freedom and skewness parameter $s = 2N I(S)/(1 - I(S))$. This means that $\tilde J$ has mean value $E[\tilde J] = 2 + s$ and variance $V[\tilde J] = 4 + 4s$ (see [19]). Consequently, the random variable
\[ \tilde I(S) = \frac{1 - I(S)}{2N}\, \tilde J \]
has mean value
\[
\begin{aligned}
E[\tilde I(S)] &= (2 + s) \cdot \frac{1 - I(S)}{2N} \\
&= \frac{1 - I(S)}{N} + I(S) \\
&\approx I(S) + \frac{1}{N}
\end{aligned}
\]
and variance
\[
\begin{aligned}
V[\tilde I(S)] &= (4 + 4s) \Bigl( \frac{1 - I(S)}{2N} \Bigr)^2 \\
&= \frac{1 - 2I(S) + I(S)^2}{N^2} + \frac{4I(S) - 4I(S)^2}{2N} \\
&\approx \frac{1}{N^2}.
\end{aligned}
\]
Then by the central limit theorem, $\tilde I_\alpha(\mathcal{S})$ follows a normal distribution with mean value
\[
E[\tilde I_\alpha(\mathcal{S})] = \sum_{t=1}^{m} \alpha_t \cdot E[\tilde I(S_t)] = \sum_{t=1}^{m} \alpha_t \cdot \Bigl( \bar I(S_t) + \frac{1}{N} \Bigr)
\]
and variance
\[
V[\tilde I_\alpha(\mathcal{S})] = \sum_{t=1}^{m} \alpha_t^2 \cdot V[\tilde I(S_t)] = \sum_{t=1}^{m} \alpha_t^2 / N^2 = N^{-2},
\]
since $\|\alpha\|_2 = 1$. $\Box$

Thus, if our guess $\tilde k$ equals the correct key, then the random variable $\tilde I_\alpha(\mathcal{S})$ follows a normal distribution with mean value
\[
\mu_1 = E[\tilde I_\alpha(\mathcal{S}^{(1..r-1)})] = \sum_{t=1}^{m} \alpha_t \cdot \Bigl( I(S_t^{(1..r-1)}) + \frac{1}{N} \Bigr)
\]
according to Proposition 5.3.2. If we have guessed a wrong key, then by the hypothesis of wrong key randomization, $\tilde I_\alpha(\mathcal{S})$ has mean value
\[
\begin{aligned}
\mu_2 &= E[\tilde I_\alpha(\mathcal{S}^{(1..r+1)})] \\
&= \sum_{t=1}^{m} \alpha_t \cdot \Bigl( I(S_t^{(1..r+1)}) + \frac{1}{N} \Bigr) \\
&\approx \sum_{t} \alpha_t \cdot \Bigl( \frac{1}{n - 1} + \frac{1}{N} \Bigr).
\end{aligned}
\]

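In either case, the variance is $N^{-2}$.

With this knowledge of $\mu_1$ and $\mu_2$ it is possible to find the exact choice of the vector $\alpha$ which results in the best attack. We want to distinguish between the distributions $\mathcal{N}(\mu_1, N^{-2})$ and $\mathcal{N}(\mu_2, N^{-2})$ (here $\mathcal{N}(\mu, V)$ denotes the normal distribution with mean value $\mu$ and variance $V$). Since the variances are identical and constant with respect to $\alpha$ in the two cases, to get the most powerful statistical test we should choose $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ with $\|\alpha\|_2 = 1$ such that the two mean values $\mu_1$ and $\mu_2$ are as far apart as possible. Assuming that the hypothesis of fixed key equivalence holds, this happens when the value
\[
\mu_1 - \mu_2 = \sum_{t} \alpha_t \cdot \Bigl( I(S_t^{(1..r-1)}) - \frac{1}{n - 1} \Bigr) \approx \sum_{t} \alpha_t \cdot \Bigl( \bar I(S_t^{(1..r-1)}) - \frac{1}{n - 1} \Bigr)
\]
is as large as possible. By Cauchy's inequality this happens for the unique choice of $\alpha$ where
\[
\frac{\alpha_1}{\bar I(S_1^{(1..r-1)}) - \frac{1}{n-1}} = \frac{\alpha_2}{\bar I(S_2^{(1..r-1)}) - \frac{1}{n-1}} = \cdots = \frac{\alpha_m}{\bar I(S_m^{(1..r-1)}) - \frac{1}{n-1}},
\]
i.e.,
\[ \alpha_t = \frac{\bar I(S_t^{(1..r-1)}) - \frac{1}{n-1}}{l} \tag{5.1} \]
for all $t = 1, 2, \ldots, m$ with
\[ l = \Bigl[ \sum_{t=1}^{m} \Bigl( \bar I(S_t^{(1..r-1)}) - \frac{1}{n - 1} \Bigr)^2 \Bigr]^{\frac{1}{2}} \]
since $\|\alpha\|_2 = 1$. This corresponds to the mean values
\[
\begin{aligned}
\mu_1 &= \sum_{t=1}^{m} \alpha_t \cdot \Bigl( \bar I(S_t^{(1..r-1)}) + \frac{1}{N} \Bigr) \\
&= \sum_{t=1}^{m} \frac{\bar I(S_t^{(1..r-1)}) - \frac{1}{n-1}}{l} \cdot \Bigl( \bar I(S_t^{(1..r-1)}) + \frac{1}{N} \Bigr)
\end{aligned}
\tag{5.2}
\]
and
\[
\begin{aligned}
\mu_2 &= \sum_{t=1}^{m} \alpha_t \cdot \Bigl( \frac{1}{n - 1} + \frac{1}{N} \Bigr) \\
&= \sum_{t=1}^{m} \frac{\bar I(S_t^{(1..r-1)}) - \frac{1}{n-1}}{l} \cdot \Bigl( \frac{1}{n - 1} + \frac{1}{N} \Bigr)
\end{aligned}
\tag{5.3}
\]
and a difference of
\[
\begin{aligned}
\mu_1 - \mu_2 &= \sum_{t=1}^{m} \alpha_t \cdot \Bigl( \bar I(S_t^{(1..r-1)}) + \frac{1}{N} - \frac{1}{n - 1} - \frac{1}{N} \Bigr) \\
&= \sum_{t=1}^{m} \frac{\bar I(S_t^{(1..r-1)}) - \frac{1}{n-1}}{l} \Bigl( \bar I(S_t^{(1..r-1)}) + \frac{1}{N} - \frac{1}{n - 1} - \frac{1}{N} \Bigr) \\
&= \sum_{t=1}^{m} \frac{\bigl( \bar I(S_t^{(1..r-1)}) - \frac{1}{n-1} \bigr)^2}{l} \\
&= \Bigl[ \sum_{t=1}^{m} \Bigl( \bar I(S_t^{(1..r-1)}) - \frac{1}{n - 1} \Bigr)^2 \Bigr]^{\frac{1}{2}}.
\end{aligned}
\tag{5.4}
\]
We can now present the algorithm of the simplified attack.

Algorithm for attacking a block cipher.

procedure ATTACK3($\mathcal{A}$, $\mathcal{Z}$)
// input:  $\mathcal{A} = \{(f_1, g_1), (f_2, g_2), \ldots, (f_m, g_m)\}$, the set of I/O pairs
//         $\mathcal{Z}$, the set of collected P/C-pairs
// output: The set of maximum likelihood last-round keys
begin
    for $\tilde k \in \mathcal{K}$ do begin
        $L[\tilde k] := 0$
        $\tilde I_\alpha := 0$
        for $t := 1, 2, \ldots, m$ do begin
            $(f, g) := (f_t, g_t)$
            $\tilde I := 0$
            for $(x, y) \in \mathcal{Z}$ do begin
                $\tilde y := R^{-1}_{\tilde k}(y)$
                $\tilde I := \tilde I + f(x) \cdot g(\tilde y)$
            end
            $\tilde I := |\tilde I / N|^2$
            $\tilde I_\alpha := \tilde I_\alpha + \tilde I \cdot (\bar I(S_t^{(1..r-1)}) - \frac{1}{n-1})$
        end
        $L[\tilde k] := \tilde I_\alpha$
    end
    return $\{\tilde k : L[\tilde k] = \max_k L[k]\}$
end

In the algorithm, the constant value of $l$ from (5.1) has been multiplied out to simplify (this is allowed since multiplication by a positive constant does not change where the maximum value of $L[\cdot]$ occurs).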
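Algorithm ATTACK3 translates almost line by line into Python. The sketch below is illustrative only. The toy cipher over $\mathbb{Z}_8$ is hypothetical: its reduced cipher is the key addition $x \mapsto x + j$, whose correlation matrix is the identity by Example 4.1.8, so the linear I/O products $L_{aa}$ have imbalance 1, and its last round applies a fixed S-box followed by addition of the last-round key. The helper names (`attack3`, `chi`, `chi_conj`, `toy_encrypt`, `undo_last_round`) are assumptions.

```python
import cmath


def attack3(io_pairs, reduced_imbalances, pairs, keys, undo_last_round, n):
    """Rank last-round key guesses by the weighted sum of imbalance estimates."""
    n_samples = len(pairs)
    score = {}
    for k in keys:
        weighted = 0.0
        for (f, g), j in zip(io_pairs, reduced_imbalances):
            acc = sum(f(x) * g(undo_last_round(k, y)) for x, y in pairs)
            estimate = abs(acc / n_samples) ** 2
            # Weight from (5.1), with the constant l multiplied out.
            weighted += estimate * (j - 1 / (n - 1))
        score[k] = weighted
    best = max(score.values())
    return {k for k, v in score.items() if v == best}


# Hypothetical toy cipher over Z_8, for illustration only.
N_ELEMENTS = 8
SBOX = [0, 1, 3, 6, 7, 4, 5, 2]
INV_SBOX = [SBOX.index(v) for v in range(N_ELEMENTS)]


def chi(a):
    """Character chi_a of Z_8 (assumed form exp(2*pi*i*a*x/8))."""
    return lambda x: cmath.exp(2j * cmath.pi * a * x / N_ELEMENTS)


def chi_conj(a):
    """Complex conjugate of chi_a."""
    return lambda x: cmath.exp(-2j * cmath.pi * a * x / N_ELEMENTS)


def toy_encrypt(x, first_key, last_key):
    # Reduced cipher: x + first_key.  Last round: S-box, then add last_key.
    return (SBOX[(x + first_key) % N_ELEMENTS] + last_key) % N_ELEMENTS


def undo_last_round(key_guess, y):
    return INV_SBOX[(y - key_guess) % N_ELEMENTS]


if __name__ == "__main__":
    pairs = [(x, toy_encrypt(x, 5, 3)) for x in range(N_ELEMENTS)]
    io_pairs = [(chi(a), chi_conj(a)) for a in (1, 2, 3)]
    print(attack3(io_pairs, [1.0, 1.0, 1.0], pairs,
                  range(N_ELEMENTS), undo_last_round, N_ELEMENTS))
```

Because the toy example uses the complete codebook and guaranteed I/O products, the estimate for the correct key equals 1 for every pair, while a wrong key lowers at least one estimate. Realistic settings use far fewer samples than the codebook size and imbalances well below 1, so the ranking is then only statistically reliable.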

5.4 Success Probability

In this section we consider how to estimate the success probability of algorithms ATTACK1 and ATTACK3. Clearly, ATTACK1 is most successful when the considered I/O product has a large imbalance over the reduced cipher. Similarly, ATTACK3 performs best when one considers as many I/O products as possible.

5.4.1 The Simple Attack

As mentioned, the potential power of ATTACK1, when applied to a given cipher, can be measured by the maximum imbalance over the corresponding reduced cipher. The next theorem will show how to obtain this value and the corresponding I/O product(s) that attain this imbalance. As a side result, we will see that linear I/O products are in some sense the best. However, before formulating this, we need a lemma.

Lemma 5.4.1 Given two vectors $v$ and $w$ each with nonnegative components and $\sum_j v_j = 1$, the following inequality holds
\[ v \cdot w \le w_{\max}, \]
where $w_{\max} = \max_j w_j$. Equality holds if and only if $v_a = 0$ whenever $w_a \ne w_{\max}$.

Proof Expanding the inner product yields
\[ v \cdot w = \sum_j v_j \cdot w_j \le \sum_j v_j \cdot w_{\max} = w_{\max}. \]
It is easily seen that we have equality if and only if $v_j$ is nonzero only where $w_j = w_{\max}$. □

We are now ready to formulate a proposition with which one can find maximum imbalances over certain ciphers.

Proposition 5.4.2 Maximum imbalance. Let there be given a cipher with a group operation at the entry and at the exit, and let $C = [c_{ab}]$ denote the correlation matrix. Furthermore, let $X$ and $Y$ denote the cipher input and output, respectively. Then the maximum average-key imbalance over all possible I/O products
\[ \bar{I}_{\max} = \max_{(f,g) \in \mathcal{P}_G} \bar{I}(f(X) \cdot g(Y)) \]
is given by the maximum element of the correlation matrix $C$. More formally,
\[ \bar{I}_{\max} = \max_{a,b} c_{ab}. \]
Moreover, the linear I/O sums in the set
\[ \{ L_{ab} : c_{ab} = \bar{I}_{\max} \} \]
attain this maximum as their imbalance.
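Proposition 5.4.2 can be checked numerically on a very small example. The sketch below is an illustration only: it assumes a toy one-component cipher over $\mathbb{Z}_4$ with round function $x \mapsto \pi((x + k) \bmod 4)$, where the permutation `PERM` is hypothetical and not taken from any real cipher. It computes the correlation matrix from the characters $\chi_a(x) = i^{ax}$ and compares the bilinear form $\mathbf{f}^T C \mathbf{g}$ for random normalised power spectra with the largest matrix element.

```python
import cmath
import random

def character(a, x, m):
    """Character chi_a(x) = exp(2*pi*i*a*x/m) of the cyclic group Z_m."""
    return cmath.exp(2j * cmath.pi * a * x / m)

def correlation_matrix(perm, m):
    """Matrix [c_ab] for a, b in 1..m-1 of the toy round
    x -> perm[(x + k) mod m], averaged over all m keys k."""
    C = [[0.0] * (m - 1) for _ in range(m - 1)]
    for a in range(1, m):
        for b in range(1, m):
            total = 0.0
            for k in range(m):
                s = sum(character(a, x, m)
                        * character(b, perm[(x + k) % m], m).conjugate()
                        for x in range(m))
                total += abs(s / m) ** 2
            C[a - 1][b - 1] = total / m
    return C

def random_spectrum(size, rng):
    """Random nonnegative vector summing to 1 (a normalised power spectrum)."""
    raw = [rng.random() for _ in range(size)]
    total = sum(raw)
    return [r / total for r in raw]

def bilinear(f, C, g):
    """The value f^T C g from the proof of Proposition 5.4.2."""
    return sum(f[a] * C[a][b] * g[b]
               for a in range(len(f)) for b in range(len(g)))

M = 4
PERM = [0, 1, 3, 2]          # hypothetical permutation on Z_4
C = correlation_matrix(PERM, M)
c_max = max(max(row) for row in C)

rng = random.Random(1)
worst = max(bilinear(random_spectrum(M - 1, rng), C, random_spectrum(M - 1, rng))
            for _ in range(1000))
```

In this sketch `worst` never exceeds `c_max`, in line with the proposition, and every row of `C` sums to 1 as expected of a doubly stochastic matrix.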

This result tells us that, in some sense, the linear I/O products are the best.

Proof Let $(f, g)$ be an I/O pair, and let $\mathbf{f}$ and $\mathbf{g}$ denote the vectors of the corresponding Fourier power spectra like in Proposition 4.1.14. Furthermore, let $S = f(X) \cdot g(Y)$. Then according to (4.2),
\[ \bar{I}(S) = \mathbf{f}^T C \mathbf{g}. \tag{5.5} \]
Recall that by Parseval's identity and the definition of a correlation pair we have $\sum_a \mathbf{f}_a = \sum_b \mathbf{g}_b = 1$. We first show that $\bar{I}(S)$ is maximized by choosing $\mathbf{f} = e_a$ and $\mathbf{g} = e_b$ for some values of $a$ and $b$ ($e_j$ being the $j$-th basis vector, i.e., the all-zero vector except for a 1 at position $j$).

Assume that we already know a $\mathbf{g}$ which maximizes (5.5) (together with some $\mathbf{f}$). Then $w = C\mathbf{g}$ is a vector with nonnegative components. According to Lemma 5.4.1, choosing $\mathbf{f} = e_a$ such that $w_a = \max_j w_j$ will maximize the value of $\mathbf{f}^T w = \mathbf{f}^T C \mathbf{g}$. In a similar manner one can show that choosing $\mathbf{g} = e_b$ for some $b$ will maximize (5.5). Thus, the maximum is given by $e_a^T C e_b = c_{ab}$ for some values of $a$ and $b$.

Application of the inverse Fourier transform yields that the corresponding I/O product is
\[ F^{-1}\{\delta_a\}(X) \cdot F^{-1}\{\delta_b\}(Y) = \chi_a(X) \cdot \bar{\chi}_b(Y) = L_{ab}(X, Y). \]
□

Proposition 5.4.2 proves the hypothesis of Harpes that the maximum imbalance of the so-called I/O sums used with GLC is attained by a homomorphic [footnote 1] I/O sum (at least for ciphers working over the xor-group, $(\mathbb{Z}_2^w, +)$, since in this case CC is equivalent to GLC). In the following example we apply Proposition 5.4.2 to one round of the cipher IDEA(8) and to the full cipher.

Example 5.4.3 The block cipher IDEA [24, 25] is an example of a cipher with a group operation at the entry and at the exit. To be more exact, the generalized version IDEA($w$) with bitwidth $w$, $v = w/4$, $m = 2^v$, and $m + 1$ prime is given by the computational flowchart of Figure 5.1. Here $\oplus$ represents addition over $\mathbb{Z}_2^v$ (xor), $+$ represents addition modulo $m$, and $\odot$ represents multiplication modulo $m + 1$ with $0 \equiv m$ (!).

Accordingly, the group operation at the entry of the cipher is given by $a \otimes b = (a_1 \odot b_1, a_2 + b_2, a_3 + b_3, a_4 \odot b_4)$. The corresponding group $(G, \otimes)$ is decomposable into four cyclic groups, viz., $G = \mathbb{Z}^*_{m+1} \times \mathbb{Z}_m \times \mathbb{Z}_m \times \mathbb{Z}^*_{m+1}$.

To transform the cipher into a cipher which uses only additive cyclic groups, we have to incorporate the isomorphism from $(\mathbb{Z}^*_{2^n+1}, \odot)$ to $(\mathbb{Z}_{2^n}, +)$ and back into the keyed round permutation. More precisely, let $\varphi_{k^{(5,6)}} : G \to G$ denote the permutation in the round function following the group operation $\otimes$. Define the new round function $\tilde{\varphi}_{k^{(5,6)}} : \tilde{G} \to \tilde{G}$ by
\[ \tilde{\varphi}_{k^{(5,6)}}(x_1, x_2, x_3, x_4) = \log_{1,4}(\varphi_{k^{(5,6)}}(\exp x_1, x_2, x_3, \exp x_4)), \]

[Footnote 1: The GLC-notion of a homomorphic I/O sum is equivalent to the notion of a linear I/O product.]

Figure 5.1: The block cipher IDEA. [Flowchart not reproduced. Labels: inputs X1 to X4; subkeys Z1(1) to Z6(1); group operation; round permutation φ; seven more rounds; output permutation; group operation; subkeys Z1(9) to Z4(9); outputs Y1 to Y4.]

where the function $\log_{1,4} : G \to \tilde{G}$ is defined by
\[ \log_{1,4}(x_1, x_2, x_3, x_4) = (\log x_1, x_2, x_3, \log x_4). \]
Here log and exp are respectively the discrete logarithm function and the exponential function applied to the same basis (an arbitrary generator of $\mathbb{Z}^*_{m+1}$). The new equivalent flowchart is given by Figure 5.2. The function $\tilde{\varphi}_{k^{(5,6)}}$ can be thought of as the bijection of a cipher equivalent to IDEA which uses the group operation $a \mathbin{\tilde{\otimes}} b = (a_1 + b_1, a_2 + b_2, a_3 + b_3, a_4 + b_4)$ instead of the usual $\otimes$.

Figure 5.2: A block cipher which is equivalent to IDEA. [Flowchart not reproduced. Labels: inputs X1 to X4; subkeys Z1(1) to Z6(1); group operation; EXP; round permutation φ; LOG; seven more rounds; output permutation; group operation; subkeys Z1(9) to Z4(9); outputs Y1 to Y4.]

The mini-version of IDEA called IDEA(8) has $w = 8$ and thus $m = 4$. For this cipher we have found the correlation matrix $C = [c_{ab}]$ corresponding to the altered round function $\tilde{\varphi}$ by using a computer program. The matrix $C$ has a maximum value of 0.25 at $c_{ab}$ with, e.g., $a = (0, 0, 0, 1)$ and $b = (0, 0, 2, 0)$. The character corresponding to $a = (a_1, a_2, a_3, a_4) \in \mathbb{Z}_4^4$ is $\chi_a(x) = i^{a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4}$. This means that the highest average-key imbalance over one round is 0.25 and that this imbalance

is attained by the I/O product $S_{\max} = \chi_{(0,0,0,1)}(X) \cdot \bar{\chi}_{(0,0,2,0)}(Y) = i^{X_4} \cdot (-1)^{Y_3}$ (among others).

We now use Propositions 4.1.15 and 5.4.2 to find the maximum imbalance over the reduced cipher. IDEA(8) has 8 rounds, and thus to obtain the maximum possible imbalance for an I/O product over the reduced cipher we compute the 7th power of the correlation matrix $C$. We get $\bar{I}_{\max} = \max_{a,b} (C^7)_{ab} = 0.004994$ for $a = (0, 2, 2, 0)$ and $b = (1, 0, 2, 0)$. This imbalance is attained by the I/O product $S^{(1..7)} = i^{2X_2 + 2X_3 - Y_1 - 2Y_3} = (-i)^{Y_1} \cdot (-1)^{X_2 + X_3 + Y_3}$ (among others).

For xor-ciphers, a bound on the diagonal elements of the correlation matrix can be obtained by considering the degree of the cipher expressed as a polynomial from $\mathbb{Z}_2^w[x]$.

Proposition 5.4.4 Let there be given an xor-cipher $e_k : \mathbb{Z}_2^w \to \mathbb{Z}_2^w$ with correlation matrix $C = [c_{ab}]$. Assume that $c_{aa} \ne 1$ for all $a$, i.e., none of the I/O products corresponding to diagonal elements in the correlation matrix are guaranteed. In addition, let $d$ denote the maximum degree over all $k$ of $e_k$ expressed as a polynomial from $\mathbb{Z}_2^w[x]$, i.e., let
\[ d = \max_k \deg(e_k(x)). \]
Then
\[ c_{aa} \le \frac{(d - 1)^2}{n}, \]
for all $a$, where $n = 2^w$.

In fact, the proposition holds for all ciphers over groups whose cyclic decomposition consists only of groups of a prime order, e.g., $G = \mathbb{Z}_3 \times \mathbb{Z}_5^4$ or $G = \mathbb{Z}_{257}^9$. However, most ciphers work over a group whose order is a power of 2, which is why we have covered the above case only.

Proof Let $(f, g) = (\chi_a, \chi_a)$ be a linear I/O pair corresponding to the diagonal element $c_{aa}$ and let $s_k(x) = \chi_a(x) \cdot \bar{\chi}_a(e_k(x))$. Furthermore, let $S = s_K(X)$ be the I/O sum corresponding to $(f, g)$. The diagonal element $c_{aa}$ is expressible in the following way
\[ c_{aa} = \bar{I}(S) = \bar{I}(\chi_a(X) \cdot \bar{\chi}_a(Y)) = \bar{I}(\chi_a(X - Y)). \]
Weil's theorem ([28], Theorem 5.38) tells us that for any non-trivial character $\chi$, we have
\[ \Bigl| \sum_{x \in \mathbb{Z}_2^w} \chi(h(x)) \Bigr| \le (\deg(h(x)) - 1)\, n^{1/2} \tag{5.6} \]
if $h \in \mathbb{Z}_2^w[x]$ is not on the form
\[ h = u^2 - u + v \tag{5.7} \]

for some $u \in \mathbb{Z}_2^w[x]$ and $v \in \mathbb{Z}_2^w$. The inequality (5.6) implies
\[ \Bigl| \frac{1}{n} \sum_{x \in \mathbb{Z}_2^w} s_k(x) \Bigr|^2 \le \frac{(\deg(x - e_k(x)) - 1)^2}{n}. \]
The left hand side of the above expression equals $I(S \mid K = k)$. It remains to prove that $h(x) = x - e_k(x)$ is not on the form (5.7). However, if $x - e_k(x)$ were on the indicated form, then $I(S \mid K = k)$ would equal 1 since, in that case, the left hand side of (5.6) would equal $n$ (for an explanation of this see [28]). But we have assumed that there are no guaranteed I/O products, and we have a contradiction. We finally obtain
\[ \bar{I}(S) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} I(S \mid K = k) \le \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \frac{(\deg(x - e_k(x)) - 1)^2}{n} \le \frac{(d - 1)^2}{n}. \]
□

Proposition 5.4.4 states that if the ciphertext is expressible as a polynomial of sufficiently low degree with respect to the plaintext, then the maximum imbalance of certain I/O products over the cipher is low. The proposition provides a more analytical approach for examining some ciphers since it is easy to obtain an upper bound on the degree of polynomials over a whole cipher when the degrees of polynomials over the individual rounds are known. Low Boolean complexity, however, is not a good design criterion, since ciphers of low polynomial degrees are susceptible to other attacks than statistical. At least, care should be taken if using Proposition 5.4.4 to design a cipher.

Siegenthaler [46] (see also [47]) has shown a somewhat related result for stream ciphers based on combining Boolean linear feedback shift registers. More precisely, he has shown that there is a linear tradeoff between the correlation immunity and the Boolean complexity of the combiner if the combiner is memoryless.

5.4.2 The Advanced Attack

Clearly, the advanced attack is more powerful than the simple attack since we receive information from several I/O products. In this section, we seek to determine how many P/C-pairs ATTACK3 requires for a successful cryptanalysis.
This number is also a lower bound for the number of P/C-pairs required by ATTACK1 (since ATTACK1 is the weaker attack and cannot succeed with fewer pairs).

The strongest attack with Algorithm ATTACK3 of course uses all the I/O pairs at disposal, e.g., all the linear I/O products $\mathcal{S} = \{L_{ab} : a, b \in G\}$ with imbalance as indicated by the correlation matrix (additional I/O products would be redundant since the linear I/O products completely describe the correlational properties (cf. Proposition 4.1.14)). We call this a full attack. However, since all the imbalance

estimates corresponding to these linear I/O products are not always independent, it is not possible directly to use Proposition 5.3.2 to acquire information about the distribution of $\tilde{I}_\alpha(\mathcal{S})$ since this proposition requires the random variables to be independent.

In the following, however, we assume that the imbalance estimates of the I/O products from $\mathcal{S}$ are independent. Since independent estimates leak more information than correlated estimates, this hypothetical situation is a much stronger attack than otherwise possible. Consequently, if this hypothetical full attack with independent imbalance estimates requires $N$ P/C-pairs to succeed, then the "real" attack requires more than $N$ pairs. Therefore, the following theorems which determine $N$ for the hypothetical attack can be thought of as lower bounds for $N$ for real attacks.

Next, after choosing the optimal $\alpha$ given by (5.1), we proceed to determine the distribution of $\tilde{I}_\alpha(\mathcal{S})$ given a correct and a wrong key guess, respectively. According to Proposition 5.3.2 the two distributions are the normal distributions $N(\mu_1, N^{-2})$ and $N(\mu_2, N^{-2})$ with $\mu_1$ and $\mu_2$ given by (5.2) and (5.3). Thus, to distinguish between correct and wrong keys, we wish to test which one of the two hypotheses

    $H_0$: $\tilde{I}_\alpha(\mathcal{S}) \in N(\mu_1, N^{-2})$, and
    $H_1$: $\tilde{I}_\alpha(\mathcal{S}) \in N(\mu_2, N^{-2})$

is true. This is equivalent to testing between the hypotheses

    $H_0$: $\tilde{I}_\alpha - \mu_2 \in N(\mu_1 - \mu_2, N^{-2})$, and
    $H_1$: $\tilde{I}_\alpha - \mu_2 \in N(0, N^{-2})$.

According to (5.4)
\[ \mu_1 - \mu_2 = \left[ \sum_{t=1}^{m} \left( \bar{I}(S_t^{(1..r-1)}) - \frac{1}{n-1} \right)^2 \right]^{1/2} \]
which in this case equals
\[ \left[ \sum_{a,b \in G \setminus \{0\}} \left( \bar{I}(L_{ab}^{(1..r-1)}) - \frac{1}{n-1} \right)^2 \right]^{1/2} = \left[ \sum_{a,b \in G \setminus \{0\}} \left( c_{ab} - \frac{1}{n-1} \right)^2 \right]^{1/2} = \|C - E\|_2. \]
Based on this, we now derive a formula for determining how many P/C-pairs one has to consider to mount a successful attack against a given cipher for which the value of $\|C - E\|_2$ is known.
In the following, $|\mathcal{K}|$ will denote the number of last-round keys.

Theorem 5.4.5 Given a block cipher, let $C$ be the correlation matrix of the corresponding reduced cipher and let $|\mathcal{K}|$ denote the number of possible last-round keys. Then the minimum number $N$ of P/C-pairs required for a full correlation attack with independent imbalance estimates to succeed with probability $p$ is
\[ N = \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_2}, \tag{5.8} \]

where $\Phi(x) = P[N(0, 1) \le x]$ is the accumulated normal distribution function and $E = [e_{ab}]$ is given by $e_{ab} = \frac{1}{n-1}$.

Proof Consider algorithm ATTACK3. Instead of outputting the last-round key $\tilde{k}$ which has the highest value of $\tilde{I}_\alpha$, assume that there is a constant $\gamma$ above which we declare $\tilde{I}_\alpha - \mu_2$ to belong to $N(\|C - E\|_2, N^{-2})$ and below which we declare $\tilde{I}_\alpha - \mu_2$ to belong to $N(0, N^{-2})$ (corresponding to guessing a correct and a wrong key, respectively).

We do not want any false positives (i.e., last-round key guesses which are declared correct when they are not), and thus for $Z_1 \in N(0, N^{-2})$ we want
\[ P[Z_1 > \gamma] \le \frac{1}{|\mathcal{K}|}. \tag{5.9} \]
Similarly, we do not want a false negative, i.e., we want the attack algorithm to succeed with a reasonable probability $p$. In other words, for $Z_0 \in N(\|C - E\|_2, N^{-2})$ we want
\[ P[Z_0 > \gamma] \ge p. \tag{5.10} \]
Both of these inequalities have to hold for the attack to succeed. Thus the smallest number $N$ for which they hold for some value of $\gamma$ is the value we are looking for.

Equation (5.9) is equivalent to
\[ P[\tilde{Z}_1 > N\gamma] \le \frac{1}{|\mathcal{K}|} \]
where $\tilde{Z}_1 = N \cdot Z_1 \in N(0, 1)$. This inequality has the solution
\[ \gamma \ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|})}{N}. \tag{5.11} \]
Equation (5.10) is equivalent to
\[ P[\tilde{Z}_0 > N \cdot (\gamma - \|C - E\|_2)] \ge p \]
where $\tilde{Z}_0 = N \cdot (Z_0 - \|C - E\|_2) \in N(0, 1)$. This inequality has the solution
\[ \gamma \le \frac{\Phi^{-1}(1 - p)}{N} + \|C - E\|_2. \tag{5.12} \]
The critical point where the right hand sides of (5.11) and (5.12) are equal is now obtained by solving
\[ \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|})}{N} = \frac{\Phi^{-1}(1 - p)}{N} + \|C - E\|_2 \]
with respect to $N$. The solution is
\[ N = \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) - \Phi^{-1}(1 - p)}{\|C - E\|_2} \]

which simplifies to (5.8), since $\Phi^{-1}(1 - p) = -\Phi^{-1}(p)$. □

The following table shows the value of $\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)$ for various values of $p$ and $|\mathcal{K}| = 2^b$ where $b$ is the number of bits in the last-round key guess.

     b \ p    50%    90%    95%   97.5%    99%   99.99%  99.9999%
       1     0      1.28   1.64   1.96    2.33    3.72     4.75
       2     0.67   1.96   2.32   2.63    3.00    4.39     5.43
       3     1.15   2.43   2.80   3.11    3.47    4.87     5.90
       4     1.53   2.82   3.18   3.49    3.86    5.25     6.29
       5     1.86   3.14   3.51   3.82    4.19    5.58     6.62
       6     2.15   3.44   3.80   4.11    4.48    5.87     6.91
       7     2.42   3.70   4.06   4.38    4.74    6.14     7.17
       8     2.66   3.94   4.30   4.62    4.99    6.38     7.41
       9     2.89   4.17   4.53   4.85    5.21    6.60     7.64
      10     3.10   4.38   4.74   5.06    5.42    6.82     7.85
      11     3.30   4.58   4.94   5.26    5.62    7.02     8.05
      12     3.49   4.77   5.13   5.45    5.81    7.21     8.24
      13     3.67   4.95   5.31   5.63    5.99    7.39     8.42
      14     3.84   5.12   5.49   5.80    6.17    7.56     8.60
      15     4.01   5.29   5.65   5.97    6.34    7.73     8.76
      16     4.17   5.45   5.81   6.13    6.50    7.89     8.92

Theorem 5.4.5 sometimes makes it possible to prove that a given cipher is secure against CC. This happens if, e.g., $N$ exceeds the number of possible P/C-pairs.

Example 5.4.6 In this example we show that the cipher IDEA(8) is immune to correlation cryptanalysis. Let $C^7$ denote the correlation matrix of the reduced cipher as found in Example 5.4.3. The corresponding value of the Frobenius norm is $\|C^7 - E\|_2 = 0.007600$.

Then according to (5.8) and the above table, the minimum number of P/C-pairs required to correctly guess the 12-bit last-round key with a probability of more than $p = 50\%$ is
\[ N = \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C^7 - E\|_2} \approx \frac{3.49}{0.007600} \approx 459 \]
which is more than the available number of P/C-pairs (namely $2^8 = 256$). Consequently, IDEA(8) is secure against CC (assuming independent round keys). As we shall see later, this implies that IDEA(8) is secure against LC, GLC, PC, and DC, too.
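Formula (5.8) and the table are easy to re-evaluate by machine. The following Python sketch is an illustration only; the function names `table_value` and `required_pairs` are hypothetical, and the inverse normal distribution function is taken from the standard library class `statistics.NormalDist` (available from Python 3.8).

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # N(0, 1)

def table_value(key_bits, p):
    """Phi^{-1}(1 - 1/|K|) + Phi^{-1}(p) with |K| = 2**key_bits.

    Requires 0 < p < 1 and key_bits >= 1.
    """
    return (_STD_NORMAL.inv_cdf(1.0 - 2.0 ** (-key_bits))
            + _STD_NORMAL.inv_cdf(p))

def required_pairs(key_bits, p, frobenius_norm):
    """Number N of P/C-pairs from (5.8) for a given value of ||C - E||_2."""
    return table_value(key_bits, p) / frobenius_norm

# Example 5.4.6: 12-bit last-round key, p = 50%, ||C^7 - E||_2 = 0.007600.
n_idea8 = required_pairs(12, 0.5, 0.007600)
```

With the figures of Example 5.4.6 this gives a value of `n_idea8` close to 459, well above the 256 pairs that exist for an 8-bit block.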

Chapter 6

Bounds for Multiple Rounds

When examining the security of a cipher against CC, it is not always desirable nor practical to have to consider the correlation matrix of the whole cipher, since it is generally hard to obtain. In the following, we will provide various upper bounds for multiple rounds which depend on only one value derived from the correlation matrix of the round function. With these bounds, one can easily determine how many rounds a cipher should have to make it secure against CC.

We will assume that the round functions of the considered cipher are identical, implying identical correlation matrices for each round. The results, however, are adaptable for use with ciphers using several different round functions. First, we will derive bounds related to ATTACK1, i.e., bounds that tell us something about the maximum value of the correlation matrix of the whole cipher; then we present bounds related to ATTACK3 telling us how many P/C-pairs are required for a successful attack.

6.1 The Simple Attack

The success of the simple attack depends on finding an I/O product with a high imbalance. In this subsection we develop upper bounds for the imbalance over a given cipher. First, however, we need a lemma.

Lemma 6.1.1 Let $C$ be a correlation matrix, and let $E = [e_{ab}]$ be the $(n-1) \times (n-1)$ matrix with $e_{ab} = \frac{1}{n-1}$ for $a, b \in G \setminus \{0\}$. Then
\[ C^r - E = (C - E)^r \tag{6.1} \]
for all $r \in \mathbb{N}$. Furthermore,
\[ \sigma_2(C) = \sigma_1(C - E), \tag{6.2} \]
i.e., the second highest singular value of $C$ is the highest singular value of $C - E$.

Proof To prove (6.1), we use induction on $r$. Clearly, (6.1) holds for $r = 1$. Then for $r = s + 1$ we get
\[ (C - E)^{s+1} = (C - E)^s (C - E). \]

Using the induction hypothesis $C^s - E = (C - E)^s$ we get
\[ (C^s - E)(C - E) = C^{s+1} - EC + EE - C^s E. \]
Since $C$ and $C^s$ are doubly stochastic, the above simplifies to
\[ C^{s+1} - E + E - E = C^{s+1} - E \]
ending the proof of (6.1). Next, write $D = C^T C$ on the form
\[ D = V^{-1} \Lambda V \]
where $\Lambda$ is the diagonal matrix with the eigenvalues of $D$ (including multiplicities) in descending order of magnitude and the rows of $V$ contain the corresponding eigenvectors. Since $C$ and thereby $D$ are doubly stochastic, 1 is the highest eigenvalue of $D$ and $(1/(n-1), \ldots, 1/(n-1))$ is a corresponding eigenvector. In other words,
\[ \Lambda = \mathrm{diag}(1, \sigma_2(C)^2, \sigma_3(C)^2, \ldots, \sigma_n(C)^2) \]
(recall that the eigenvalues of $D$ are the squared singular values of $C$), and
\[ V = \begin{pmatrix} 1/(n-1) & 1/(n-1) & \cdots & 1/(n-1) \\ v_{2,1} & v_{2,2} & \cdots & v_{2,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n-1,1} & v_{n-1,2} & \cdots & v_{n-1,n-1} \end{pmatrix}. \]
Furthermore, since $D$ is Hermitian, $V$ is unitary, i.e., $V^{-1} = V^*$. Another way to write $E$ is the following
\[ E = (1/(n-1), 1/(n-1), \ldots, 1/(n-1)) \cdot (1/(n-1), 1/(n-1), \ldots, 1/(n-1))^T = V^* \cdot \mathrm{diag}(1, 0, \ldots, 0) \cdot V, \]
and thus
\begin{align*}
\sigma_1(C - E)^2 &= \lambda_1((C - E)^T (C - E)) \\
&= \lambda_1(C^T C - C^T E + E^T E - E^T C) \\
&= \lambda_1(D - E + E - E) \\
&= \lambda_1(V^* \cdot \mathrm{diag}(1, \lambda_2(D), \ldots, \lambda_n(D)) \cdot V - V^* \cdot \mathrm{diag}(1, 0, \ldots, 0) \cdot V) \\
&= \lambda_1(V^* \cdot \mathrm{diag}(0, \lambda_2(D), \ldots, \lambda_n(D)) \cdot V) \\
&= \lambda_2(D) \\
&= \sigma_2(C)^2
\end{align*}
proving (6.2). □

In fact, the lemma holds for all doubly stochastic matrices $C$, which is easily checked.

The following theorem gives an upper bound for the imbalance of an I/O product over a whole cipher when the correlation matrix is known for the round function.

Theorem 6.1.2 Let there be given a cipher with $r$ rounds each with correlation matrix $C$. Then the average-key imbalance of any I/O product $S$ over that cipher is bounded by
\[ \bar{I}(S) \le \frac{1}{n-1} + \|C - E\|^r \]
with $\|\cdot\|$ replaced by any one of the following matrix norms: $\|\cdot\|_1$, $\|\cdot\|_2$, $|||\cdot|||_1$, $|||\cdot|||_2$, or $|||\cdot|||_\infty$.

Proof According to Proposition 2.4.5, the maximum norm of $C = [c_{ab}]$, i.e., $\|C\|_\infty = \max_{a,b} |c_{ab}|$, is upper-bounded by any of the listed matrix norms. Using first Propositions 5.4.2 and 4.1.15, then the triangle inequality and (6.1) of Lemma 6.1.1, we obtain
\begin{align*}
\bar{I}(S) &\le \max_{a,b} (C^r)_{ab} \\
&\le \frac{1}{n-1} + \|C^r - E\|_\infty \\
&\le \frac{1}{n-1} + \|C^r - E\| \\
&= \frac{1}{n-1} + \|(C - E)^r\| \\
&\le \frac{1}{n-1} + \|C - E\|^r,
\end{align*}
where $\|\cdot\|$ is any one of the listed norms. □

Corollary 6.1.3 Let there be given a cipher with $r$ rounds each with correlation matrix $C$. Then the average-key imbalance of any I/O product $S$ over that cipher is bounded by an expression involving a singular value of $C$, namely
\[ \bar{I}(S) \le \frac{1}{n-1} + \sigma_2(C)^r. \]

Proof Follows from (6.2) and application of Theorem 6.1.2 with $\|\cdot\| = |||\cdot|||_2$, the spectral norm. □

Thus for ciphers with $\sigma_2 < 1$ we can force the maximum imbalance as close to $\frac{1}{n-1}$ as we wish by using an appropriate number of rounds. Tighter upper bounds can be derived by considering two or more rounds at a time, i.e., for $r$ divisible by $m$ compute $\frac{1}{n-1} + [\sigma_2(C^m)]^{r/m}$. Using $m = r$ of course gives us the correct maximum imbalance right away (via Proposition 5.4.2) since we compute $C^r$.

Corollary 6.1.3 leaves us with the problem of finding the second highest singular value of a given correlation matrix. With the following proposition, we can find an upper bound for the second highest singular value.
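Both the identity (6.1) and the bound of Corollary 6.1.3 are simple to evaluate numerically. The sketch below is illustrative only: the 3 x 3 circulant matrix `C` is a hypothetical doubly stochastic matrix, not the correlation matrix of any real cipher, and the helper names are assumptions. The value 0.6792 used in the comment at the end is the approximation of $\sigma_2$ reported for IDEA(8) in Example 6.1.7.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def matpow(A, r):
    """A raised to the integer power r >= 1 by repeated multiplication."""
    result = A
    for _ in range(r - 1):
        result = matmul(result, A)
    return result

def subtract(A, B):
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def max_abs_difference(A, B):
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def imbalance_bound(sigma2, rounds, n):
    """Right hand side of Corollary 6.1.3: 1/(n-1) + sigma_2(C)^r."""
    return 1.0 / (n - 1) + sigma2 ** rounds

# Hypothetical doubly stochastic matrix; E has the same size as C.
C = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]
E = [[1.0 / 3] * 3 for _ in range(3)]

# Largest deviation between C^r - E and (C - E)^r for r = 1, ..., 5.
deviation = max(
    max_abs_difference(subtract(matpow(C, r), E), matpow(subtract(C, E), r))
    for r in range(1, 6))

# Bound for 8 rounds with sigma_2 = 0.6792 and n = 256 (cf. Example 6.1.7).
bound_8_rounds = imbalance_bound(0.6792, 8, 256)
```

Here `deviation` is zero up to rounding error, and `bound_8_rounds` evaluates to roughly 0.049.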

Proposition 6.1.4 The second highest singular value of the correlation matrix $C = [c_{ab}]$ is bounded from above by
\[ \sigma_2(C) \le \min\left\{ \Bigl( 1 - \sum_b \min_a (C^T C)_{ab} \Bigr)^{1/2},\ \Bigl( 1 - \sum_a \min_b (C^T C)_{ab} \Bigr)^{1/2} \right\}. \]

Proof The second highest eigenvalue of a doubly stochastic matrix $M = [m_{ab}]$ obeys the following bound
\[ \lambda_2(M) \le \min\Bigl( 1 - \sum_b \min_a m_{ab},\ 1 - \sum_a \min_b m_{ab} \Bigr). \]
For a proof of this consult [2], Theorem 5.10, where a similar bound is proved for stochastic matrices. This finishes the proof since the singular values of $C$ are just the square roots of the eigenvalues of the doubly stochastic matrix $C^T C$. □

Example 6.1.5 In this example, we use the correlation matrix from Example 5.4.3 to obtain an upper bound for the imbalance of a multiround I/O product over IDEA(8). Let $S^{(1..r)}$ denote an I/O product over $r$ rounds. By Proposition 6.1.4, we have $\sigma_2(C) \le 0.9813$, and since $\bar{I}(S^{(1..r)}) \le \sigma_2(C)^r + \frac{1}{255}$ according to Corollary 6.1.3, the following table is obtained

    r                              1       2       3       4       5       6       7       8
    bound on $\bar{I}(S^{(1..r)})$   0.9853  0.9669  0.9489  0.9312  0.9139  0.8969  0.8802  0.8638

As we have seen already in Example 5.4.3, the true maximum imbalance for 7 rounds is approximately 0.004994. Thus, using Proposition 6.1.4 provides us with a bound, but it is very loose and should be used only for a preliminary cipher evaluation.

As the above example has shown us, the bound given in Proposition 6.1.4 is not very tight, and unfortunately as the size of the correlation matrix grows, in most cases the bound gets even worse. This is due to the increased probability of getting very small minima in the rows and columns of the matrix $C^T C$, resulting in a bound closer to 1. Furthermore, the bound requires the calculation of $C^T C$, which is time-consuming.

To establish a tighter multiround bound we need a way to compute a better approximation of the second highest singular value than provided by Proposition 6.1.4. In the following, we present an algorithm for finding the second highest singular value of a doubly stochastic matrix. The algorithm is a variant of the well-known iterative algorithm for computing a numerical approximation of the spectral radius of a given matrix.
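The two ways of estimating $\sigma_2$ discussed here, the bound of Proposition 6.1.4 and a power iteration on $D^T D$ with $D = C - E$, can be compared on small matrices. The following Python sketch is one possible rendering under stated assumptions: the function names are hypothetical, the matrix is taken to be $m \times m$ with $E_{ab} = 1/m$, the normalising component is chosen by largest absolute value (a small robustness tweak), and an iteration cap is added. The test matrices are illustrative, not cipher data.

```python
import math

def second_singular_value(C, eps=1e-12, max_iter=10000):
    """Approximate sigma_2 of a doubly stochastic matrix C by power
    iteration on D^T D, where D = C - E removes the singular value 1."""
    m = len(C)
    D = [[C[a][b] - 1.0 / m for b in range(m)] for a in range(m)]
    x = [1.0] + [0.0] * (m - 1)
    lam = 1.0
    for _ in range(max_iter):
        Dx = [sum(D[a][b] * x[b] for b in range(m)) for a in range(m)]
        z = [sum(D[a][b] * Dx[a] for a in range(m)) for b in range(m)]  # D^T (D x)
        previous = lam
        lam = max(z, key=abs)      # component of largest magnitude
        if lam == 0.0:             # x has no component outside the null space
            return 0.0
        x = [zj / lam for zj in z]  # rescale to avoid overflow
        if abs(lam - previous) < eps:
            break
    return math.sqrt(max(lam, 0.0))

def sigma2_upper_bound(C):
    """Upper bound on sigma_2(C) in the form of Proposition 6.1.4."""
    m = len(C)
    G = [[sum(C[k][a] * C[k][b] for k in range(m)) for b in range(m)]
         for a in range(m)]        # G = C^T C
    column_minima = sum(min(G[a][b] for a in range(m)) for b in range(m))
    row_minima = sum(min(G[a][b] for b in range(m)) for a in range(m))
    return min(math.sqrt(max(0.0, 1.0 - column_minima)),
               math.sqrt(max(0.0, 1.0 - row_minima)))

# Hypothetical doubly stochastic test matrices with known sigma_2.
C3 = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]]  # sigma_2 = sqrt(0.07)
C2 = [[0.9, 0.1], [0.1, 0.9]]                             # sigma_2 = 0.8

sigma_c3 = second_singular_value(C3)
sigma_c2 = second_singular_value(C2)
bound_c3 = sigma2_upper_bound(C3)
```

For these tiny matrices the bound happens to coincide with the true value; for larger correlation matrices it is typically much looser, as Example 6.1.5 shows.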

Algorithm for finding the second highest singular value of a correlation matrix.

procedure SINGULARVALUE($C$, $\epsilon$)
// input:  $C$, a doubly stochastic matrix
//         $\epsilon$, the precision
// output: $\sigma$, the second highest singular value of $C$
begin
    $x := (1, 0, \ldots, 0)$
    $D := C - E$, where $E_{ab} = 1/(n-1)$ for all $a, b$
    $\lambda := 1$
    repeat
        $z := D^T (D x)$
        $\tilde{\lambda} := \lambda$
        $\lambda := \max_j z_j$
        $x := z / \lambda$
    until $|\lambda - \tilde{\lambda}| < \epsilon$
    return($\sqrt{\lambda}$)
end

The algorithm is based upon the following proposition.

Proposition 6.1.6 Let $M$ be an $m \times m$ matrix, and let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the algebraic eigenvalues of $M$ (including multiplicities), such that $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_m|$. Furthermore, let $v_j$ denote the eigenvector corresponding to $\lambda_j$, and let $w_0 \in \mathbb{C}^m$ be an arbitrary vector with a non-zero component in the $v_1$ direction. Finally, let $w_r = M w_{r-1}$ for $r = 1, 2, \ldots$. Then the spectral radius $\rho(M)$ of $M$ is given by
\[ \rho(M) = \lim_{r \to \infty} \frac{y_r^T w_{r+1}}{y_r^T w_r}, \]
where $y_0, y_1, \ldots$ are any non-zero vectors.

Proof The eigenvectors form a base, and thus we may write
\[ w_0 = c_1 v_1 + c_2 v_2 + \cdots + c_m v_m, \]
where $c_1, c_2, \ldots, c_m \in \mathbb{C}$ and $c_1 \ne 0$ ($w_0$ has a non-zero component in the $v_1$ direction). Then
\[ w_r = c_1 \lambda_1^r v_1 + c_2 \lambda_2^r v_2 + \cdots + c_m \lambda_m^r v_m = \lambda_1^r \left( c_1 v_1 + c_2 \Bigl( \frac{\lambda_2}{\lambda_1} \Bigr)^r v_2 + \cdots + c_m \Bigl( \frac{\lambda_m}{\lambda_1} \Bigr)^r v_m \right). \]
For $r$ sufficiently large, this sum will be dominated by the term $c_1 \lambda_1^r v_1$, and therefore
\[ \frac{y_r^T w_{r+1}}{y_r^T w_r} \approx \frac{y_r^T v_1 c_1 \lambda_1^{r+1}}{y_r^T v_1 c_1 \lambda_1^r} = \lambda_1. \]

Since the spectral radius of a matrix is just the maximum absolute eigenvalue, the proof is finished. □

What follows is a brief explanation of the algorithm. First, the algorithm uses Lemma 6.1.1 to get rid of the highest singular value of $C$ by subtracting $E$ from $C$. It then proceeds by computing the new highest singular value (i.e., the original second highest singular value); this is done by finding the highest eigenvalue of $D^T D$ and returning its square root. Letting $\lambda := \max_j z_j$ and computing $x := z/\lambda$ is equivalent to choosing $y_r$ of Proposition 6.1.6 to be the all-zero vector except for a 1 at the same position where $z$ has its maximum (this gives fast convergence). Finally, $z$ is scaled down to prevent overflow.

On some (very rare) occasions, the initial value of $x = (1, 0, \ldots, 0)$ has no component in the direction corresponding to the highest eigenvalue of $D$. In this case, the algorithm does not produce the correct result. To solve this problem, simply try another value of $x$.

Algorithm SINGULARVALUE has complexity $O(j \cdot n^2)$ where $j$ is the number of iterations required. As one can see from the above proof, $j$ depends upon the need for precision and upon the ratio between the third and the second highest singular values of $C$, of which we know nothing a priori. However, the number of iterations should not grow very high even if the ratio is only slightly below 1. The factor $n^2$ is due to the two matrix/vector multiplications done in the step $z := D^T (D x)$. By computing $Dx$ first instead of $D^T D$ we avoid the time-consuming calculation of a matrix product. Hansen treats other methods for computing singular values in [11].

Example 6.1.7 After 56 iterations, Algorithm SINGULARVALUE finds the 4 decimals approximation of the second highest singular value of the correlation matrix of the round function of IDEA(8) to be 0.6792. Again, let $S^{(1..r)}$ denote an I/O product over $r$ rounds.
Using Corollary 6.1.3, we obtain the following table

    r                              1       2       3       4       5       6       7       8
    bound on $\bar{I}(S^{(1..r)})$   0.6832  0.4653  0.3173  0.2168  0.1485  0.1021  0.0706  0.0492

These values are closer to the true maximum value of $\bar{I}(S^{(1..r)})$ than those presented in Example 6.1.5.

6.2 The Advanced Attack

Recall the formula (5.8) from the previous chapter for obtaining the number $N$ of P/C-pairs required for a successful attack. The number depends on the Frobenius norm $\|\cdot\|_2$ of a certain matrix. Since $\|\cdot\|_2$ is a matrix norm and since the correlation matrix is multiplicative, it is easy to obtain lower bounds for $N$ also for multiround ciphers. This is done in this section in a manner similar to the proofs of, e.g., Theorem 6.1.2 and Corollary 6.1.3 where upper bounds on the imbalance were obtained by considering various matrix norms.

Proposition 6.2.1 Let $C$ be the correlation matrix of the round function of an $r$-round cipher. Then the number of samples $N$ necessary for a successful correlation attack obeys the following bounds
\[ N \ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_2^{\,r-1}} \ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_1^{\,r-1}}, \]
where $p$ is the desired success probability and $|\mathcal{K}|$ is the number of last round key guesses. Some other bounds are
\[ N \ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\sqrt{n} \cdot \|C - E\|^{r-1}} \]
with $\|\cdot\|$ replaced by either $|||\cdot|||_2$ (the spectral norm), $|||\cdot|||_1$ (the maximum absolute column sum norm), or $|||\cdot|||_\infty$ (the maximum absolute row sum norm).

The last 3 bounds are particularly useful if $\|C - E\|_2 \ge 1$ (in this case the first 2 bounds are worthless).

Proof Let $C^{(1..r-1)}$ denote the correlation matrix of the reduced cipher. Using Theorem 5.4.5 and Proposition 2.4.5 yields
\begin{align*}
N &= \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C^{(1..r-1)} - E\|_2} \\
&= \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C^{r-1} - E\|_2} \\
&= \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|(C - E)^{r-1}\|_2} \\
&\ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_2^{\,r-1}} \\
&\ge \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_1^{\,r-1}}.
\end{align*}
The other bounds follow analogously (since $\|D^r\|_2 \le \sqrt{n}\,|||D^r|||$ according to the table of Proposition 2.4.5). □

Assume that we are designing a cipher, and that we want it to be immune against CC. In other words, we want the CC attack to be at least as computationally expensive as a brute-force attack. The expected number of encryptions by a brute-force attack is $\frac{1}{2}|\mathcal{K}^{(1..r)}| = \frac{1}{2}|\mathcal{K}|^r$ where $|\mathcal{K}^{(1..r)}|$ and $|\mathcal{K}|$ are the number of possible keys for the whole cipher and for one round, respectively (since one must expect on average to run through half of the keys before finding the correct one).

The expected number of operations gone through by a simple CC attack (i.e., using one I/O pair only) is $\frac{1}{2}|\mathcal{K}| N$. Recalling (5.8), the condition for security can

be expressed by
\[ \frac{1}{2} |\mathcal{K}|^r \le \frac{1}{2} |\mathcal{K}| \cdot \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_2^{\,r-1}}. \]
Solving this with respect to integer-valued $r$ produces
\[ r = \left\lceil \frac{\log(\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p))}{\log(|\mathcal{K}| \cdot \|C - E\|_2)} \right\rceil + 1, \]
which is the required number of rounds. CC becomes impossible for even a computationally unbounded adversary when the required number of P/C-pairs $N$ exceeds the number of possible plaintexts $n$ (note, however, that DES is an example of a cipher where the number of possible keys is less than the number of possible plaintexts). In this case the required number of rounds $r$ can be found by solving
\[ n \le \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|C - E\|_2^{\,r-1}}. \]
The integer solution to this is
\[ r = \left\lceil \frac{\log(\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)) - \log n}{\log \|C - E\|_2} \right\rceil + 1. \]
Incidentally, notice that if an adversary has $N = n$ available P/C-pair samples then he also has the entire mapping $x \mapsto y$ of plaintexts $x$ to ciphertexts $y$, leaving no need to find the key.

Until now, we have derived only lower multiround bounds on the number $N$ of P/C-pairs necessary for a successful attack. The number of pairs actually needed may be higher. The following proposition gives us an upper bound on the sufficient number of P/C-pairs to break a cipher. This bound is useful for an attacker only, since it tells us the maximum number of pairs needed. The actual number might be smaller. This makes it possible to determine if a cipher is breakable by CC, whereas the previous bounds can only determine if a cipher is secure. The bound is based on the fact that all matrix norms are lower bounded by the spectral radius.

Proposition 6.2.2 Let $C$ be the correlation matrix of the round function of an $r$-round cipher. Then the number of samples $N$ required for a successful correlation attack is upper-bounded by
\[ N \le \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\rho(C - E)^{r-1}}. \]
With this proposition it is possible to determine a lower bound on the number of rounds for which the cipher can be broken.

Proof For any matrix norm $\|\cdot\|$ and any matrix $M$, one has $\|M\| \ge \rho(M)$.
Equation (5.8) yields
\begin{align*}
N &= \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\|(C - E)^{r-1}\|_2} \\
&\le \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\rho((C - E)^{r-1})} \\
&= \frac{\Phi^{-1}(1 - \frac{1}{|\mathcal{K}|}) + \Phi^{-1}(p)}{\rho(C - E)^{r-1}},
\end{align*}
since $\rho(M^r) = \lambda_1(M^r) = \lambda_1(M)^r = \rho(M)^r$ for any matrix $M$. □

Comments

The matrix norm technique for obtaining upper bounds for multiple rounds is useful for DC and PC as well (since powers of doubly stochastic matrices are also considered in these two cases).

As mentioned in the beginning of the chapter, the two sections above have considered only ciphers with round functions which are identical from round to round. This made it possible to obtain the correlation matrix of the reduced cipher by computing $C^{r-1}$, where $C$ is the correlation matrix of the round function and $r$ is the number of rounds. This, in turn, resulted in bounds in which the expression $\|C - E\|^{r-1}$ occurred (where $\|\cdot\|$ is some matrix norm). Similar bounds for ciphers with round functions which are not identical are easily obtained simply by replacing the expression $\|C - E\|^{r-1}$ with the analogous expression $\|C^{(1)} - E\| \cdot \|C^{(2)} - E\| \cdots \|C^{(r-1)} - E\|$ where $C^{(1)}, C^{(2)}, \ldots, C^{(r)}$ are the correlation matrices of the various rounds. This fact follows by more or less the same arguments as used in the proofs above.

6.3 Bounds Related to Composite Matrices

Even after having reduced the problem of estimating the security of a cipher to that of evaluating various matrix norms of the correlation matrix of the round function, the method is still impractical when it comes to direct computation, since when looking at real world ciphers, one often has $n = 2^{64}$. However, if the cipher has a certain structure, then it is possible to derive certain norms in a more analytical way. In this subsection $\otimes$ denotes the Kronecker product. The main tool is the following lemma.

Lemma 6.3.1 Consider the matrix function $\Psi : M_m \times M_n \to M_{m \cdot n}$ given by
\[ \Psi(M, N) = \sum_{a,b=0}^{p} g_{ab} \cdot M^a \otimes N^b \]
where $g_{ab} \in \mathbb{C}$. Let $\psi(x, y)$ be the complex bivariate polynomial
\[ \psi(x, y) = \sum_{a,b=0}^{p} g_{ab} \cdot x^a y^b. \]
Then the eigenvalues of the matrix $\Psi(M, N)$ are the $mn$ numbers $\psi(\lambda_r(M), \lambda_s(N))$ where $r = 1, 2, \ldots, m$ and $s = 1, 2, \ldots, n$.

Proof For a proof see [26], Section 8.3. □

Corollary 6.3.2 Let $C = A \otimes B$. Then the singular values of $C$ are the products of the singular values of $A$ and $B$.

Proof First notice that due to Lemma 6.3.1, the corollary clearly holds if we replace the term "singular values" by "eigenvalues". Now recall that the singular values of the matrix $C = A \otimes B$ are the square roots of the eigenvalues of the matrix $C^T C$. Due to properties of the Kronecker product, we have

$$C^T C = (A \otimes B)^T (A \otimes B) = (A^T \otimes B^T)(A \otimes B) = (A^T A) \otimes (B^T B).$$

By Lemma 6.3.1, the singular values of $C$ are thus given by

$$\left\{ \sqrt{\lambda_r(A^T A) \cdot \lambda_s(B^T B)} : r = 1, 2, \ldots, m, \text{ and } s = 1, 2, \ldots, n \right\}.$$

However, since $\sqrt{\lambda_r(A^T A)} = \sigma_r(A)$ and $\sqrt{\lambda_s(B^T B)} = \sigma_s(B)$, the elements in this set are in fact the products of the singular values of $A$ and $B$. □

Corollary 6.3.3 Let $M$ be the composite matrix $M_1 \otimes M_2 \otimes \cdots \otimes M_s$, where $M_j$ is doubly stochastic for all $j = 1, 2, \ldots, s$. Then the second highest singular value of $M$ is $\max\{\sigma_2(M_1), \sigma_2(M_2), \ldots, \sigma_2(M_s)\}$.

Proof For $s = 2$ the result follows easily from Corollary 6.3.2 and the fact that the highest singular value of a doubly stochastic matrix is 1. The property generalizes to $s > 2$ by induction. □

Before presenting the next proposition, we need a definition.

Definition 6.3.4 The extended correlation matrix $\tilde{C}$ corresponding to the correlation matrix $C$ is the matrix

$$\tilde{C} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & C & \\ 0 & & & \end{pmatrix},$$

i.e., the cases $a = 0$ and $b = 0$ are included in the matrix, so to speak (cf. Definition 4.1.7).

Note that the extended correlation matrix is also doubly stochastic.
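The Kronecker identity used in the proof of Corollary 6.3.2, and the border construction of Definition 6.3.4, can be checked on small matrices. The following is a minimal sketch in plain Python; the helper names (`kron`, `extend`, and so on) and the example matrices are illustrative assumptions, not part of the thesis.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def kron(a, b):
    """Kronecker product: entry ((i,k),(j,l)) equals a[i][j] * b[k][l]."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def extend(c):
    """Extended matrix in the sense of Definition 6.3.4: a leading 1, zeros, then c."""
    top = [1] + [0] * len(c[0])
    return [top] + [[0] + list(row) for row in c]

def is_doubly_stochastic(m):
    return (all(sum(row) == 1 for row in m)
            and all(sum(col) == 1 for col in zip(*m))
            and all(v >= 0 for row in m for v in row))

# Arbitrary integer matrices (hypothetical example data).
A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [1, 1, 0]]

# (A (x) B)^T (A (x) B) versus (A^T A) (x) (B^T B)
lhs = matmul(transpose(kron(A, B)), kron(A, B))
rhs = kron(matmul(transpose(A), A), matmul(transpose(B), B))

# A small doubly stochastic matrix and its extension.
half = Fraction(1, 2)
C_small = [[half, half], [half, half]]
C_ext = extend(C_small)
```

Because integer and `Fraction` arithmetic is exact, `lhs == rhs` can be compared directly; the extension and the Kronecker product of doubly stochastic matrices stay doubly stochastic, which is the property Corollary 6.3.3 relies on.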

[Figure 6.1: A composite cipher with a linear transform. Labels: inputs $X_1, X_2, \ldots, X_8$; key addition $J$ over $(G, +)$; keyed permutations $Q^{[1]}, Q^{[2]}, \ldots, Q^{[8]}$ with key $K$; linear transform $T$; outputs $Y_1, Y_2, \ldots, Y_8$.]

Theorem 6.3.5 Consider a cipher with input $x = (x_1, x_2, \ldots, x_w)$, where each component $x_j$ belongs to some common set $G_*$, and a round function described by the following procedure: first a key is added to each component (over some Abelian group $(G_*, +)$), then a keyed permutation is applied to each component of the result, and finally all resulting components are combined by an invertible linear transform (cf. Figure 6.1). More formally, let the round function be given by

$$R_{j,k}(x) = T \cdot Q_k(x_1 + j_1, x_2 + j_2, \ldots, x_w + j_w),$$

where the function $Q_k : G_*^w \to G_*^w$ is defined for $k = (k_1, k_2, \ldots, k_w)$ by

$$Q_k(x) = \left(Q^{[1]}_{k_1}(x_1), Q^{[2]}_{k_2}(x_2), \ldots, Q^{[w]}_{k_w}(x_w)\right),$$

where $Q^{[s]}_{k_s} : G_* \to G_*$ is a permutation and $T$ is a $w \times w$ matrix which is invertible over $G_*$. Let $\tilde{C}$ be the extended correlation matrix of $R$, and let $\tilde{C}^{[s]} = [\tilde{c}^{[s]}_{ab}]$ be the extended correlation matrix of $Q^{[s]}$. Then

(i) $\tilde{C} = (\tilde{C}^{[1]} \otimes \tilde{C}^{[2]} \otimes \cdots \otimes \tilde{C}^{[w]}) P$, where $P = [p_{ab}]$ with $p_{ab} = \delta_{a, b \cdot T}$, and

(ii) $\sigma_2(\tilde{C}) = \max\{\sigma_2(\tilde{C}^{[1]}), \sigma_2(\tilde{C}^{[2]}), \ldots, \sigma_2(\tilde{C}^{[w]})\}$.

Proof Let $G = G_*^w$, let $\chi_a$ denote the character over $(G, +)$ corresponding to $a \in G$, and let $\chi^{a_s}_*$ denote a character over $(G_*, +)$ corresponding to $a_s \in G_*$. Then by the definition of a correlation matrix

$$\tilde{c}_{ab} = \frac{1}{|\mathcal{J}| \cdot |\mathcal{K}|} \sum_{j \in \mathcal{J},\, k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \overline{\chi_b}\big(T \cdot Q_k(x + j)\big) \right|^2,$$

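where $\mathcal{J} = G$ and $\mathcal{K}$ are the respective key spaces. Substitution of $x + j$ by $x$ yields

$$\frac{1}{|\mathcal{J}| \cdot |\mathcal{K}|} \sum_{j \in \mathcal{J},\, k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x \in G} \chi_a(-j) \cdot \chi_a(x) \cdot \overline{\chi_b}\big(T \cdot Q_k(x)\big) \right|^2 = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \overline{\chi_{b \cdot T}}\big(Q_k(x)\big) \right|^2.$$

Now define the matrix $\tilde{D} = [\tilde{d}_{ab}]$ by

$$\tilde{d}_{ab} = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x \in G} \chi_a(x) \cdot \overline{\chi_b}\big(Q_k(x)\big) \right|^2 \qquad (6.3)$$

for all $a, b \in G$ and note that

$$\tilde{c}_{ab} = \tilde{d}_{a,\, b \cdot T}.$$

Since $T$ is non-singular, the function $f(b) = b \cdot T$ is a permutation, and consequently $\tilde{D}$ is simply a column permutation of $\tilde{C}$. Thus we may write

$$\tilde{C} = \tilde{D} P \qquad (6.4)$$

where $P = [p_{ab}]$ is the permutation matrix defined by $p_{ab} = \delta_{a,\, b \cdot T}$. By (6.3) and the definition of a character, we have

$$\tilde{d}_{ab} = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x \in G} \left( \prod_{s=1}^{w} \chi^{a_s}_*(x_s) \right) \left( \prod_{s=1}^{w} \overline{\chi^{b_s}_*}\big(Q^{[s]}_{k_s}(x_s)\big) \right) \right|^2 = \prod_{s=1}^{w} \left( \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left| \frac{1}{n} \sum_{x_s \in G_*} \chi^{a_s}_*(x_s) \cdot \overline{\chi^{b_s}_*}\big(Q^{[s]}_{k_s}(x_s)\big) \right|^2 \right),$$

which equals

$$\prod_{s=1}^{w} \bar{I}\Big(L_{a_s, b_s}\big(X_s, Q^{[s]}_{K_s}(X_s)\big)\Big),$$

where $L$ denotes the indicated linear I/O sum over $G_*$. This, in turn, equals $\prod_{s=1}^{w} c^{[s]}_{a_s, b_s}$ and hence, due to the definition of the Kronecker product,

$$\tilde{D} = \tilde{C}^{[1]} \otimes \tilde{C}^{[2]} \otimes \cdots \otimes \tilde{C}^{[w]},$$

where $\tilde{C}^{[s]}$ is the extended correlation matrix of the function $Q^{[s]}$. By (6.4),

$$\tilde{C} = (\tilde{C}^{[1]} \otimes \tilde{C}^{[2]} \otimes \cdots \otimes \tilde{C}^{[w]}) P,$$

proving (i).

To prove (ii), notice that the matrix $\tilde{C}^T \tilde{C} = P^T \tilde{D}^T \tilde{D} P$ is similar to $\tilde{D}^T \tilde{D}$, since permutation matrices are unitary and real-valued. Thus $\tilde{C}$ and $\tilde{D}$ have the same singular values, because similar matrices have the same eigenvalues. By Corollary 6.3.3 the second highest singular value of $\tilde{D}$ is $\max\{\sigma_2(\tilde{C}^{[1]}), \sigma_2(\tilde{C}^{[2]}), \ldots, \sigma_2(\tilde{C}^{[w]})\}$. This concludes the proof. □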
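Part (i) of Theorem 6.3.5 can be checked numerically on a toy instance. The sketch below makes several assumptions that are not in the thesis: the component group is $\mathbb{Z}_5$ with $w = 2$, the permutations `Q1`, `Q2` and the matrix `T` are arbitrary illustrative choices, the keys are fixed (so no averaging over keys is needed), and characters are taken as $\chi_a(x) = e^{2\pi i\, a x / n}$. It compares the directly computed extended correlation matrix of $x \mapsto T \cdot Q(x)$ with the column-permuted Kronecker product.

```python
import cmath

N = 5                      # component group Z_5 (assumption for the example)
Q1 = [2, 0, 3, 1, 4]       # hypothetical unkeyed component permutations
Q2 = [1, 4, 0, 3, 2]
T = [[1, 1], [1, 2]]       # invertible over Z_5 (determinant 1)

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def corr_1d(perm):
    """Extended correlation matrix of a permutation of Z_n."""
    n = len(perm)
    w = cmath.exp(2j * cmath.pi / n)
    return [[abs(sum(w ** (a * x - b * perm[x]) for x in range(n)) / n) ** 2
             for b in range(n)] for a in range(n)]

def corr_2d(func, n):
    """Extended correlation matrix of a map on Z_n x Z_n, indexed by n*a1 + a2."""
    pts = [(u, v) for u in range(n) for v in range(n)]
    w = cmath.exp(2j * cmath.pi / n)
    pairs = [(x, func(x)) for x in pts]
    return [[abs(sum(w ** (a[0] * x[0] + a[1] * x[1] - b[0] * y[0] - b[1] * y[1])
                     for x, y in pairs) / n ** 2) ** 2
             for b in pts] for a in pts]

def round_function(x):
    """Apply the component permutations, then the linear transform T."""
    y = (Q1[x[0]], Q2[x[1]])
    return ((T[0][0] * y[0] + T[0][1] * y[1]) % N,
            (T[1][0] * y[0] + T[1][1] * y[1]) % N)

def times_T(b):
    """Row vector b multiplied by T, i.e. the index map b -> b*T."""
    return ((b[0] * T[0][0] + b[1] * T[1][0]) % N,
            (b[0] * T[0][1] + b[1] * T[1][1]) % N)

D = kron(corr_1d(Q1), corr_1d(Q2))     # Kronecker product of the component matrices
C = corr_2d(round_function, N)         # direct computation for the full round

points = [(u, v) for u in range(N) for v in range(N)]
max_err = max(
    abs(C[N * a[0] + a[1]][N * b[0] + b[1]]
        - D[N * a[0] + a[1]][N * times_T(b)[0] + times_T(b)[1]])
    for a in points for b in points)
```

Here `max_err` measures the largest deviation between $\tilde{c}_{ab}$ and $\tilde{d}_{a,\,b \cdot T}$; it should be zero up to floating-point rounding.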

Unfortunately, (ii) of Theorem 6.3.5 is trivial, since the value of the right-hand side is 1. This is because extended correlation matrices have at least two singular values which are 1. This fact is proved by noting that the matrix $\tilde{C}^T \tilde{C}$ has the form of an extended correlation matrix, and therefore both $(1, 0, \ldots, 0)$ and $(1, 1, \ldots, 1)$ are eigenvectors corresponding to an eigenvalue of 1, implying that $\tilde{C}$ has two singular values with the value 1. In spite of (ii) not being useful as such, we have included it anyway, since the idea behind it might prove, at a later time, to be useful for obtaining knowledge about powers of the round correlation matrix of a structured cipher.

In the following example, it is shown how to apply (i) of Theorem 6.3.5 to a modification of a real cipher.

Example 6.3.6 We consider a slight modification of the cipher SAFER K-64 designed by Massey [29] for Cylink. See Figure 6.2 for a schematic of one round of the cipher.

[Figure 6.2: One round of the block cipher SAFER. Labels: 64-bit input split into 8-bit components $X_1, X_2, \ldots, X_8$; key layers $K$ (XOR/ADD and ADD/XOR); nonlinear layer NL consisting of exp and log boxes; PHT layer built from 2-PHT boxes; outputs $Y_1, Y_2, \ldots, Y_8$.]

The EXP and LOG boxes are nonlinear permutations over $\mathbb{Z}_{256}$, and the PHT box represents a linear (over $(\mathbb{Z}_{256}, +)$), invertible transform called the Pseudo-Hadamard Transform; $\oplus$ denotes addition modulo 2 over $\mathbb{Z}_2^8$ (XOR), and $+$ denotes addition modulo 256 over $\mathbb{Z}_{256}$ (ADD). To bring SAFER K-64 into a form where Theorem 6.3.5 applies, we introduce some minor changes in the first and third layer by

exchanging some of the ADD and XOR operations, such that the first layer in our version consists of ADD operations only and the third layer consists of XOR operations only. The cipher is now of the form of Figure 6.1, since the first layer represents 8 parallel, identical group operations over $(\mathbb{Z}_{256}, +)$, the function $Q^{[j]}_{k_j}$ is given by either $Q^{[j]}_{k_j}(x_j) = \mathrm{EXP}(x_j) \oplus k_j$ or $Q^{[j]}_{k_j}(x_j) = \mathrm{LOG}(x_j) \oplus k_j$ in accordance with the value of $j$, and the linear transform $T$ is represented by the PHT-layers. Since it is feasible to find the correlation matrices of the functions $Q^{[j]}$, $j = 1, 2, \ldots, 8$, by direct computation, it is possible with Theorem 6.3.5 to obtain a representation of the correlation matrix of SAFER in a more structured form.

Recently, certain weaknesses in SAFER have been pointed out by Knudsen [21, 22].

6.4 Schur Stochastic Decomposition

While investigating the properties of the correlation matrix, we stumbled upon some structure which has connections to Schur stochastic matrices. While the results in this section are not immediately useful, we have included them since they might provide a foundation for further research into this area.

Consider the correlation matrix $C_R$ of an unkeyed permutation $R : G \to G$. Such matrices appear when studying ciphers in which the key is introduced by group operations only, e.g., when the round function is given by $R_k(x) = R(x + k)$ (since the corresponding correlation matrix $C_{R_k}$ equals $C_R$). As we shall see, all correlation matrices are expressible as sums of such "unkeyed" correlation matrices.

Proposition 6.4.1 The correlation matrix $C$ of an unkeyed permutation $\varphi : G \to G$ may be written as

$$C = (F^* P F) \circ \overline{(F^* P F)},$$

where $\circ$ denotes the Hadamard product, $F = [f_{ab}]$ is the $(n-1) \times n$ truncated Fourier matrix defined by

$$f_{ab} = \frac{1}{\sqrt{n}}\, \overline{\chi_a}(b)$$

for $a \in G \setminus \{0\}$, $b \in G$, and $P = [p_{ab}]$ is the permutation matrix defined by $p_{ab} = \delta_{a, \varphi(b)}$. Since the matrix $F^* P F$ is unitary, one may also write

$$C = (F^* P F) \circ \big((F^* P F)^{-1}\big)^T.$$

Proof Let $D = [d_{ab}] = (F^* P F) \circ \overline{(F^* P F)}$. Then by inserting the definition of $P$ and $F$, the following is obtained:

$$d_{ab} = \big|(F^* P F)_{ab}\big|^2 = \Big|\sum_x F^*_{ax} (PF)_{xb}\Big|^2 = \Big|\sum_x F^*_{a,x} F_{\varphi(x), b}\Big|^2 = \Big|\sum_x \chi_a(x) \cdot \overline{\chi_b}(\varphi(x))\Big|^2 = \bar{I}(L_{ab}(X, Y)),$$

where $Y = \varphi(X)$. This equals $c_{ab}$ by definition of a correlation matrix. That $F^* P F$ is unitary follows from the fact

$$(F^* P F)^* (F^* P F) = F^* P^T F F^* P F = I. \qquad \Box$$

Note that $F$, $F^* P$ and $PF$ are Vandermonde matrices. Matrices of the form $A \circ (A^{-1})^T$ are widely used in a certain approach for designing chemical engineering plants. Here $A$ is called the gain matrix and $A \circ (A^{-1})^T$ is called the relative gain array (see [32]). For a mathematical treatment, see [16] and [18].

Corollary 6.4.2 Let $C$ be the correlation matrix of an unkeyed function over some Abelian group. Then $C$ is Schur stochastic, i.e., $C = U \circ \overline{U}$ where $U$ is unitary.

Proof Follows directly from Proposition 6.4.1 and the definition of a Schur stochastic matrix. □

The following proposition presents a decomposition of any correlation matrix into Schur stochastic matrices.

Proposition 6.4.3 The correlation matrix $C$ of a keyed permutation $e_k : G \to G$ with $k \in \mathcal{K}$ is expressible as a sum of $|\mathcal{K}|$ Schur stochastic matrices. More formally,

$$C = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} C^{[k]},$$

where $C^{[k]} = U^{[k]} \circ \overline{U^{[k]}}$ with $U^{[k]} = [u^{[k]}_{ab}] = F P^{[k]} F^*$ unitary for all $k \in \mathcal{K}$ and $P^{[k]} = [\delta_{a, e_k(b)}]$.

Proof Let $C = [c_{ab}]$ be the correlation matrix of $e_k$. If we fix $k$, the function $e_k$ can be thought of as an unkeyed permutation, and thus by definition of the correlation matrix we get

$$c_{ab} = \bar{I}\big(L_{ab}(X, e_K(X))\big) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} I\big(L_{ab}(X, e_K(X)) \mid K = k\big) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} c^{[k]}_{ab},$$

where $C^{[k]} = [c^{[k]}_{ab}]$ is the correlation matrix of the "unkeyed" permutation $e_k$ (given the fixed $k$). According to Proposition 6.4.1, $C^{[k]}$ is Schur stochastic and furthermore $C^{[k]} = (F P^{[k]} F^*) \circ \overline{(F P^{[k]} F^*)}$. □

In the following, the sum in Proposition 6.4.3 will be called the Schur decomposition of $C$. Since ordinary matrix addition is subadditive with respect to every matrix norm, we have

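Proposition 6.4.4 Given a permutation $e_k : G \to G$, let $C$ be the corresponding correlation matrix. Then the second highest singular value of $C$ is bounded from above by

$$\sigma_2(C) \le -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \sigma_1(U^{[k]})^2 \qquad (6.5)$$

$$\le -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \left[\sigma_1(F)^2 \cdot \sigma_1(P^{[k]})\right]^2 \qquad (6.6)$$

with $U^{[k]}$, $P^{[k]}$, and $F$ defined as in Proposition 6.4.3.

Proof Let

$$C = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} C^{[k]}$$

be the Schur decomposition of Proposition 6.4.3. As mentioned in Chapter 2, the Hadamard product $\circ$ is submultiplicative with respect to the spectral norm. Furthermore, ordinary matrix addition is subadditive with respect to every matrix norm. Thus,

$$\sigma_2(C) = \sigma_2\!\left(\frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} C^{[k]}\right) \le \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \sigma_2(C^{[k]}) = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left(\sigma_1(C^{[k]}) + \sigma_2(C^{[k]}) - 1\right) = -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left(\sigma_1(U^{[k]} \circ \overline{U^{[k]}}) + \sigma_2(U^{[k]} \circ \overline{U^{[k]}})\right).$$

By the submultiplicativity of the Hadamard product, this expression is bounded from above by

$$-1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left(\sigma_1(U^{[k]}) \cdot \sigma_1(U^{[k]}) + \sigma_2(U^{[k]}) \cdot \sigma_2(U^{[k]})\right) = -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \left(\sigma_1(U^{[k]})^2 + \sigma_2(U^{[k]})^2\right) \le -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \sigma_1(U^{[k]})^2,$$

proving (6.5). Further deductions yield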

$$-1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \sigma_1(U^{[k]})^2 = -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \left[\sigma_1(F P^{[k]} F^*)\right]^2 \le -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \left[\sigma_1(F)\, \sigma_1(P^{[k]})\, \sigma_1(F^*)\right]^2 = -1 + \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} 2 \cdot \left[\sigma_1(F)^2\, \sigma_1(P^{[k]})\right]^2. \qquad \Box$$

Unfortunately, the bound is trivial since the right-hand side of (6.5) equals 1. This is due to the fact that all singular values of a unitary matrix $U$ equal 1, since $U^* U = I$ (and $V$ and $F$ are both unitary). The proposition is included, however, since the techniques used in the proof might be possible to expand upon.

6.5 Construction of Secure Round Functions

Until now we have dealt exclusively with evaluating the security of a given cipher. Another interesting problem is how to construct ciphers which are secure against CC. One approach would be to use a sufficient number of rounds; as the preceding results in this chapter have demonstrated, this results in ciphers that are secure against the statistical attack.

Another approach, however, is choosing a round function with a good correlation matrix $C$ (in the proper sense). Round functions with correlation matrices whose elements are all close to $\frac{1}{n-1}$ have high correlation immunity, since the corresponding value of the Frobenius norm $\|C - E\|_2$ is close to 0. This is where the so-called bent functions come in handy. A bent function [9, 23, 41, 42] is a function which has a perfectly flat Fourier power spectrum. Unfortunately, bent functions which are at the same time permutations do not exist, but "almost" bent permutations exist which have an "almost" flat Fourier power spectrum. The links between (binary-valued) bent functions and cryptography have already been thoroughly studied [33, 38, 39] due to these functions' nonlinear properties. It has been shown that using almost bent functions as round functions gives immunity against LC and DC [40]. The same holds true for CC, and this fact is easily proven by considering Proposition 4.1.10 and the definition of a bent function.
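To make the role of $\|C - E\|_2$ concrete, the following sketch computes the correlation matrix of two unkeyed permutations of $\mathbb{Z}_7$ and their Frobenius distance from the flat matrix $E$. The choice of group, the two permutations, and the function names are assumptions made for illustration; characters are taken as $\chi_a(x) = e^{2\pi i\, a x / n}$. A linear map gives a permutation matrix (the worst case), while the nonlinear map $x \mapsto x^5 \bmod 7$ (inversion, with $0 \mapsto 0$) is closer to flat.

```python
import cmath
import math

def correlation_matrix(perm):
    """(n-1) x (n-1) correlation matrix of a permutation of Z_n (indices a, b != 0)."""
    n = len(perm)
    w = cmath.exp(2j * cmath.pi / n)
    return [[abs(sum(w ** (a * x - b * perm[x]) for x in range(n)) / n) ** 2
             for b in range(1, n)] for a in range(1, n)]

def distance_from_flat(c):
    """Frobenius norm of C - E, where every entry of E is 1/(n-1)."""
    e = 1.0 / len(c)
    return math.sqrt(sum((v - e) ** 2 for row in c for v in row))

n = 7
linear = [(3 * x) % n for x in range(n)]        # x -> 3x, a linear permutation
inversion = [pow(x, 5, n) for x in range(n)]    # x -> x^5 = x^(-1) mod 7, 0 -> 0

dist_linear = distance_from_flat(correlation_matrix(linear))
dist_inversion = distance_from_flat(correlation_matrix(inversion))
```

For the linear map the squared distance is $n - 2 = 5$, the largest value a doubly stochastic $(n-1) \times (n-1)$ matrix can attain; the nonlinear map yields a strictly smaller value, which in the terms of this section means more P/C-pairs are needed per round.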


Chapter 7

Links to Other Statistical Attacks

In this chapter we will explore the connection between CC and the statistical attacks DC, GLC, and PC. There are strong links "downwards" to all three attacks in the sense that CC is the most general approach.

7.1 Differential Cryptanalysis

Differential cryptanalysis [4, 5] can be thought of as a statistical attack with descriptor given by

$$T = \{ s \mid s : G^2 \to G;\ \Delta_1, \Delta_2 \in G \setminus \{0\};\ s(x, y) = e(x + \Delta_1) - y - \Delta_2 \text{ for all } x, y \}$$

and

$$L(S) = |P[S = 0]| \quad \text{for all } S,$$

where $\Delta_1$ and $\Delta_2$ denote the input and output difference, respectively, and $e : G \to G$ is the encryption function. If $S = s(X, Y)$ where $s \in T$ and $X$ and $Y$ are cipher input and output respectively, then the value of the likelihood estimator $L(S)$ is simply a measure of how often the output difference is $\Delta_2$ given that the input difference is $\Delta_1$.

The following is the definition of the DC counterpart to the correlation matrix.

Definition 7.1.1 Matrix of Differential Transition Probabilities. Given a cipher $e_k : G \to G$, the matrix of differential transition probabilities $D = [d_{\Delta_1 \Delta_2}]$ is defined to be the $(n-1) \times (n-1)$ matrix given by

$$d_{\Delta_1 \Delta_2} = P[e_K(X + \Delta_1) - y - \Delta_2 = 0]$$

for $\Delta_1, \Delta_2 \in G \setminus \{0\}$.

In other words, the elements are simply the probabilities of transition from a certain input difference to a certain output difference. It is easily shown that this matrix is doubly stochastic, too. There is a more intimate connection between the correlation matrix $C$ of a cipher and the matrix of differential transition probabilities $D$. Before it is revealed, we need a definition.

The definition of the extended matrix $\tilde{D}$ is similar to the definition of the extended correlation matrix (cf. Definition 6.3.4).

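Definition 7.1.2 The extended matrix $\tilde{D}$ corresponding to the matrix of differential transition probabilities $D$ is the matrix

$$\tilde{D} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & D & \\ 0 & & & \end{pmatrix}.$$

Theorem 7.1.3 Given a cipher, let $\tilde{C}$ denote the extended correlation matrix and let $\tilde{D}$ denote the extended matrix of differential transition probabilities. Then $\tilde{C}$ is the two-dimensional Fourier transform of $\tilde{D}$ and vice versa. More precisely,

$$\tilde{C} = F \tilde{D} F^* \quad \text{and} \quad \tilde{D} = F \tilde{C} F^*,$$

where $F = [f_{ab}]$ is defined by $f_{ab} = \frac{1}{\sqrt{n}}\, \overline{\chi_a}(b)$ for $a, b \in G$.

Proof Let $\mathcal{F}_{\Delta_1}\{h(\Delta_1, \Delta_2)\}$ denote the Fourier transform of the function $h : G^2 \to \mathbb{C}$ with respect to the variable $\Delta_1$. Similarly, let $\mathcal{F}^{-1}_{\Delta_2}\{h(\Delta_1, \Delta_2)\}$ denote the inverse Fourier transform of the function $h : G^2 \to \mathbb{C}$ with respect to the variable $\Delta_2$. We have to show that

$$c_{ab} = \mathcal{F}_{\Delta_1}\Big\{ \mathcal{F}^{-1}_{\Delta_2}\{ d_{\Delta_1 \Delta_2} \}(b) \Big\}(a).$$

Starting with the right-hand side and letting $\mathcal{K}$ denote the key space, we obtain

$$\mathcal{F}_{\Delta_1}\Big\{ \mathcal{F}^{-1}_{\Delta_2}\big\{ P[e_K(X + \Delta_1) - y - \Delta_2 = 0] \big\}(b) \Big\}(a)$$

$$= \mathcal{F}_{\Delta_1}\left\{ \mathcal{F}^{-1}_{\Delta_2}\left\{ \frac{1}{n \cdot |\mathcal{K}|} \big|\{(x, k) : e_k(x) - e_k(x - \Delta_1) = \Delta_2\}\big| \right\}(b) \right\}(a)$$

$$= \mathcal{F}_{\Delta_1}\left\{ \mathcal{F}^{-1}_{\Delta_2}\left\{ \frac{1}{n \cdot |\mathcal{K}|} \sum_{k \in \mathcal{K}} \sum_{x \in G} \delta_{\Delta_2}\big(e_k(x) - e_k(x - \Delta_1)\big) \right\}(b) \right\}(a).$$

Applying the inverse Fourier transform yields

$$\mathcal{F}_{\Delta_1}\left\{ \frac{1}{n \cdot |\mathcal{K}|} \sum_{k \in \mathcal{K}} \sum_{x \in G} \chi_b\big(e_k(x) - e_k(x - \Delta_1)\big) \right\}(a)$$

$$= \mathcal{F}_{\Delta_1}\left\{ \frac{1}{n \cdot |\mathcal{K}|} \sum_{k \in \mathcal{K}} \sum_{x \in G} \chi_b(e_k(x)) \cdot \overline{\chi_b}(e_k(x - \Delta_1)) \right\}(a)$$

$$= \mathcal{F}_{\Delta_1}\left\{ \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} (\chi_b \circ e_k) * \overline{(\chi_b \circ e_k)}\,(\Delta_1) \right\}(a),$$

where $*$ denotes convolution. Applying the Fourier transform yields

$$\frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \mathcal{F}\{\chi_b \circ e_k\}(a) \cdot \overline{\mathcal{F}\{\chi_b \circ e_k\}(a)} = \frac{1}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \big| \mathcal{F}\{\chi_b \circ e_k\}(a) \big|^2.$$

According to Proposition 4.1.10 the latter expression equals $c_{ab}$. □

This shows us that CC and DC are dual attacks in some sense. DC takes place in the time domain and CC in the frequency domain, so to speak. Since the Frobenius norm is unitarily invariant, and since the matrix $F$ of the theorem above is unitary, we have the following immediate result.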
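The duality of Theorem 7.1.3, and the norm equality that follows from it, can be checked numerically for a small unkeyed permutation. The sketch below is an illustration under stated assumptions: the group is $\mathbb{Z}_8$, the permutation `PERM` is an arbitrary example, the output difference is taken as $e(x + \Delta_1) - e(x)$, characters are $\chi_a(x) = e^{2\pi i\, a x / n}$, and the sign convention in `fourier_2d` is chosen to match these definitions rather than copied from the thesis.

```python
import cmath

PERM = [3, 6, 0, 5, 7, 1, 4, 2]   # hypothetical unkeyed permutation of Z_8

def extended_correlation(perm):
    """c[a][b] = |(1/n) sum_x chi_a(x) * conj(chi_b(perm[x]))|^2 for all a, b in Z_n."""
    n = len(perm)
    w = cmath.exp(2j * cmath.pi / n)
    return [[abs(sum(w ** (a * x - b * perm[x]) for x in range(n)) / n) ** 2
             for b in range(n)] for a in range(n)]

def extended_differential(perm):
    """d[d1][d2] = P[perm(X + d1) - perm(X) = d2] for uniform X, all d1, d2 in Z_n."""
    n = len(perm)
    d = [[0.0] * n for _ in range(n)]
    for d1 in range(n):
        for x in range(n):
            d2 = (perm[(x + d1) % n] - perm[x]) % n
            d[d1][d2] += 1.0 / n
    return d

def fourier_2d(d):
    """Two-dimensional transform (1/n) * sum_{i,j} d[i][j] * w^(a*i - b*j)."""
    n = len(d)
    w = cmath.exp(2j * cmath.pi / n)
    return [[sum(d[i][j] * w ** (a * i - b * j)
                 for i in range(n) for j in range(n)) / n
             for b in range(n)] for a in range(n)]

def frobenius_sq(m):
    return sum(abs(v) ** 2 for row in m for v in row)

C = extended_correlation(PERM)
D = extended_differential(PERM)
C_from_D = fourier_2d(D)

n = len(PERM)
max_err = max(abs(C[a][b] - C_from_D[a][b]) for a in range(n) for b in range(n))
norm_gap = abs(frobenius_sq(C) - frobenius_sq(D))
```

Both `max_err` and `norm_gap` should vanish up to rounding: the first reflects the transform relation, the second the unitary invariance of the Frobenius norm.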

Corollary 7.1.4 The Frobenius norm of a correlation matrix equals that of the corresponding matrix of differential transition probabilities. More formally,

$$\|C\|_2 = \|D\|_2$$

and thereby also

$$\|C - E\|_2 = \|D - E\|_2,$$

where $E = [\frac{1}{n-1}]$.

Recall from Chapters 5 and 6 that the number of P/C-pairs required by a full correlation attack is inversely proportional to $\|C - E\|_2$. Since DC is a statistical attack in the same respect as CC, it can be shown by similar arguments that $\|D - E\|_2$ is also inversely proportional to the number of P/C-pairs required to mount an optimal differential attack considering every possible difference pair. Consequently, due to Corollary 7.1.4, the two attacks are essentially of the same strength, at least when considering the full advanced attacks, i.e., using every possible connection between input and output, be it correlational or differential. CC and DC usually differ in strength when considering only one I/O pair and one differential pair, respectively (i.e., when using the simple attack).

It is conceivable that there exist ciphers which are dual in the sense that the correlation matrix of the first cipher is the matrix of differential transition probabilities of the second cipher and vice versa. It might be rewarding to examine such ciphers.

CC has some drawbacks when compared to DC. Among these is the fact that one has to do calculations over the complex numbers. On the other hand, it is not necessary to order the P/C-pairs pairwise when carrying out the CC attack such that they have a given input or output difference; this is necessary when doing a differential attack.

7.2 Generalized Linear Cryptanalysis

In [12], the LC of Matsui was generalized so that the previous expressions $a \cdot X$ and $b \cdot Y$ (cf. Example 3.0.1) are replaced by $f(X)$ and $g(Y)$, where $f$ and $g$ are carefully chosen balanced, binary-valued functions. Thus, the attack descriptor is given by

$$T = \{ s \mid s(x, y) = f(x) \oplus g(y);\ f : G \to \mathbb{Z}_2;\ g : G \to \mathbb{Z}_2;\ f \text{ and } g \text{ balanced} \}$$

and

$$L(S) = 2 \cdot \left| P[S = 0] - \tfrac{1}{2} \right| \quad \text{for all } S,$$

where $\oplus$ denotes addition modulo 2. The functions in $T$ are called I/O sums.

In [17], the binary restriction on the functions was dropped, and a new, more appropriate likelihood estimator was introduced. The descriptor of this attack, which is called m-ary GLC, is given by

$$T = \{ s \mid s(x, y) = f(x) \ominus_m g(y);\ f : G \to \mathbb{Z}_m;\ g : G \to \mathbb{Z}_m;\ f \text{ and } g \text{ balanced} \}$$

and

$$L(S) = I_{GLC}(S) \quad \text{for all } S,$$

where $\ominus_m$ denotes subtraction modulo some positive integer $m$ and $I_{GLC}$ is the GLC-imbalance operator with

$$I_{GLC}(S) = m \cdot V\big[P[S = 0], P[S = 1], \ldots, P[S = m-1]\big] = \frac{m}{m-1} \sum_{j=0}^{m-1} \left( P[S = j] - \frac{1}{m} \right)^2.$$

Here the expression $V[\cdot]$ denotes the unbiased estimator [27] of the variance of its arguments. The functions in $T$ are called I/O differences. Given the observations $P[S = 0], P[S = 1], \ldots, P[S = m-1]$, and assuming that these are normally distributed, using a sum of squares as likelihood estimator is actually the best possible choice, since the corresponding attack yields the actual maximum likelihood last-round key.

Before we show the connection between CC and GLC, we present a lemma which gives a convenient way to express the imbalance of an I/O difference. The lemma is from [17].

Lemma 7.2.1 Given a function $s : G \to \mathbb{Z}_m$, let $X$ be a uniformly distributed random variable with values in $G$. The GLC-imbalance $I_{GLC}(S)$ of $S = s(X)$ is then given by

$$I_{GLC}(S) = \frac{1}{(m-1) n^2} \sum_{l=1}^{m-1} \Big| \sum_{x \in G} \omega^{l s(x)} \Big|^2, \qquad (7.1)$$

where $\omega$ is an $m$-th root of unity in $\mathbb{C}$.

Proof Let the function $p$ be given by $p(x) = P[s(X) = x]$, and let $\mathcal{F}_m\{p\}$ denote the ordinary cyclic Fourier transform of $p$ with respect to $\mathbb{Z}_m$, i.e., $\mathcal{F}_m\{p\}(w) = \frac{1}{m} \sum_{x=0}^{m-1} p(x) \cdot \omega^{-wx}$, where $\omega$ is an $m$-th root of unity in $\mathbb{C}$. Then (7.1) follows from applying Parseval's identity:

$$I(S) = \frac{m}{m-1} \sum_{j=0}^{m-1} \left( p(j) - \frac{1}{m} \right)^2$$

$$= \frac{m}{m-1} \left[ \sum_{j=0}^{m-1} p(j)^2 - \frac{1}{m} \right]$$

$$= \frac{m^2}{m-1} \left[ \sum_{l=0}^{m-1} \big| \mathcal{F}_m\{p\}(l) \big|^2 \right] - \frac{1}{m-1}$$

$$= \frac{1}{(m-1) n^2} \left[ \sum_{l=0}^{m-1} \Big| \sum_{j=0}^{m-1} n \cdot p(j) \cdot \omega^{jl} \Big|^2 \right] - \frac{1}{m-1}$$

$$= \frac{1}{(m-1) n^2} \sum_{l=1}^{m-1} \Big| \sum_{x \in G} \omega^{l s(x)} \Big|^2. \qquad \Box$$

The two attacks are related by the following.
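Lemma 7.2.1 lends itself to a quick numerical check: the GLC-imbalance computed from the definition should agree with the character-sum expression (7.1). The sketch below assumes $\omega = e^{2\pi i / m}$ (a primitive $m$-th root of unity) and represents the function $s$ simply by the list of its values over $G$; the function names and the sample data are illustrative only.

```python
import cmath
from collections import Counter

def glc_imbalance_direct(values, m):
    """I_GLC from the definition: m/(m-1) * sum_j (P[S=j] - 1/m)^2."""
    n = len(values)
    counts = Counter(values)
    return m / (m - 1) * sum((counts.get(j, 0) / n - 1 / m) ** 2 for j in range(m))

def glc_imbalance_fourier(values, m):
    """I_GLC from equation (7.1), using omega = exp(2*pi*i/m)."""
    n = len(values)
    w = cmath.exp(2j * cmath.pi / m)
    total = sum(abs(sum(w ** (l * v) for v in values)) ** 2 for l in range(1, m))
    return total / ((m - 1) * n * n)

# Values s(x) for x ranging over a 12-element group G, with m = 4 (example data).
sample = [0, 0, 1, 2, 2, 2, 0, 1, 3, 3, 0, 0]
direct = glc_imbalance_direct(sample, 4)
fourier = glc_imbalance_fourier(sample, 4)
```

The two extremes behave as expected: a balanced $s$ gives imbalance 0, and a constant $s$ gives imbalance 1.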

Proposition 7.2.2 Let $T = f(X) \ominus_m g(Y)$ be an $m$-ary I/O difference and let $S = \omega^{f(X)} \omega^{-g(Y)}$ be the corresponding I/O product, where $\omega$ is an $m$-th root of unity in $\mathbb{C}$. Then the imbalances of $S$ and $T$ are related by the following sum:

$$I_{GLC}(T) = \frac{1}{m-1} \sum_{j=1}^{m-1} I(S^j). \qquad (7.2)$$

In other words, GLC is a special case of the advanced CC attack where certain $m - 1$ I/O products are considered. This implies that GLC is not stronger than CC.

Proof First, we have to show that $S$ is an I/O product. This is indeed the case, since $E[\omega^{f(X)}] = E[\omega^{-g(Y)}] = 0$ and $E[|\omega^{f(X)}|^2] = E[|\omega^{-g(Y)}|^2] = 1$ (cf. Definition 4.1.1). Now (7.2) follows immediately by applying Lemma 7.2.1:

$$I(T) = \frac{1}{(m-1) n^2} \sum_{j=1}^{m-1} \Big| \sum_{x \in G} \omega^{j (f(x) - g(y))} \Big|^2 = \frac{1}{m-1} \sum_{j=1}^{m-1} \Big| \frac{1}{n} \sum_{x \in G} \big( \omega^{f(x)} \omega^{-g(y)} \big)^j \Big|^2 = \frac{1}{m-1} \sum_{j=1}^{m-1} |E(S^j)|^2 = \frac{1}{m-1} \sum_{j=1}^{m-1} I(S^j). \qquad \Box$$

7.3 Partitioning Cryptanalysis

In the basic form of PC (see [13, 14]), we study the imbalance of the random variable $U = (f(X), g(Y))$, where $f : G \to \mathbb{Z}_m$ and $g : G \to \mathbb{Z}_m$ are balanced functions that describe some partitions $\mathcal{A} = \{A_0, A_1, \ldots, A_{m-1}\}$ and $\mathcal{B} = \{B_0, B_1, \ldots, B_{m-1}\}$ of respectively the input and the output space into equally large (disjoint) subsets, i.e., $f(x) = a$ if and only if $x \in A_a$, and analogously $g(y) = b$ if and only if $y \in B_b$. More formally, we are looking at the function family

$$T = \{ s \mid s(x, y) = (f(x), g(y));\ f : G \to \mathbb{Z}_m;\ g : G \to \mathbb{Z}_m;\ f \text{ and } g \text{ balanced} \}.$$

Various likelihood estimators have been proposed, among these the normalized Euclidean norm (corresponding to the GLC approach)

$$L(U) = I_{PC}(U) = \frac{m^2}{m-1} \sum_{a, b \in \mathbb{Z}_m} \left( P[U = (a, b)] - \frac{1}{m^2} \right)^2 \quad \text{for all } U.$$

The link from PC to CC goes through GLC.

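Proposition 7.3.1 Let $U = (f(X), g(Y))$ be a random variable for use with partitioning cryptanalysis. Then the PC-imbalance $I_{PC}(U)$ is given by the sum of the imbalances of the corresponding $m$-ary I/O difference $T = f(X) \ominus_m g(Y)$ over all possible isomorphisms of the group $(\mathbb{Z}_m, \ominus_m)$. In other words,

$$I_{PC}(U) = \frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} I_{GLC}(T_\psi), \qquad (7.3)$$

where $T_\psi = \psi(f(X)) \ominus_m g(Y)$ and $\Psi = \{ \psi \mid \psi : \mathbb{Z}_m \to \mathbb{Z}_m \text{ is a permutation} \}$ is the set of all permutations over $\mathbb{Z}_m$.

Proof To prove (7.3), we use the shorthand notation $p_{a,b} = P[(f(X), g(Y)) = (a, b)]$. Note that $|\Psi| = m!$. Starting with the right-hand side of (7.3), we have

$$\frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} I_{GLC}(T_\psi) = \frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} I_{GLC}\big(\psi(f(X)) \ominus_m g(Y)\big)$$

$$= \frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} \left( \frac{m}{m-1} \sum_{b \in \mathbb{Z}_m} P[\psi(f(X)) \ominus_m g(Y) = b]^2 - \frac{1}{m-1} \right)$$

$$= \frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} \left( \frac{m}{m-1} \sum_{b \in \mathbb{Z}_m} \Big( \sum_{a \in \mathbb{Z}_m} p_{a,\, \psi(a) \ominus_m b} \Big)^2 - \frac{1}{m-1} \right).$$

For each of the $m$ possible values of $b$, the expression $\psi(a) \ominus_m b$ is a permutation with regard to $a$, so summing over $b$ contributes a factor $m$ and the above simplifies to

$$\frac{m-1}{m!} \cdot \sum_{\psi \in \Psi} \left( \frac{m^2}{m-1} \Big( \sum_{a \in \mathbb{Z}_m} p_{a,\, \psi(a)} \Big)^2 - \frac{1}{m-1} \right) = -1 + \frac{m}{(m-1)!} \left( \sum_{\psi \in \Psi} \Big( \sum_{a \in \mathbb{Z}_m} p_{a,\, \psi(a)} \Big)^2 \right)$$

(loosely speaking, we have included "$\ominus_m b$" in the permutation $\psi$). In the following, we expand the square by counting how many times each term appears:

$$-1 + \frac{m}{(m-1)!} \left( \frac{m!}{m} \sum_{a,b} p_{a,b}^2 + \frac{m!}{m(m-1)} \sum_{a,b} p_{a,b} \Big( \sum_{\tilde{a} \ne a,\, \tilde{b} \ne b} p_{\tilde{a}, \tilde{b}} \Big) \right)$$

$$= -1 + \frac{m}{(m-1)!} \left( (m-1)! \sum_{a, b \in \mathbb{Z}_m} p_{a,b}^2 + (m-2)! \sum_{a, b \in \mathbb{Z}_m} p_{a,b} \Big( \sum_{\tilde{a}, \tilde{b} \in \mathbb{Z}_m} p_{\tilde{a}, \tilde{b}} - \sum_{\tilde{b}} p_{a, \tilde{b}} - \sum_{\tilde{a}} p_{\tilde{a}, b} + p_{a,b} \Big) \right)$$

$$= -1 + m \sum_{a, b \in \mathbb{Z}_m} p_{a,b}^2 + \frac{m}{m-1} \sum_{a, b \in \mathbb{Z}_m} p_{a,b} \cdot \Big( P[(f(X), g(Y)) \in \mathbb{Z}_m^2] - P[f(X) = a] - P[g(Y) = b] + p_{a,b} \Big).$$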

Since $f(X) = a$ with probability $m^{-1}$, $g(Y) = b$ with probability $m^{-1}$, and $(f(X), g(Y))$ is always in $\mathbb{Z}_m^2$, this simplifies to

$$-1 + m \sum_{a, b \in \mathbb{Z}_m} p_{a,b}^2 + \frac{m}{m-1} \sum_{a, b \in \mathbb{Z}_m} p_{a,b} \left( 1 - \frac{1}{m} - \frac{1}{m} + p_{a,b} \right) = \frac{m-2}{m-1} \cdot P[(f(X), g(Y)) \in \mathbb{Z}_m^2] - 1 + \left( m + \frac{m}{m-1} \right) \sum_{a, b \in \mathbb{Z}_m} p_{a,b}^2.$$

Again, $(f(X), g(Y))$ is always in $\mathbb{Z}_m^2$, and the expression simplifies to

$$\frac{m^2}{m-1} \sum_{a, b \in \mathbb{Z}_m} p_{a,b}^2 - \frac{1}{m-1} = I_{PC}(U). \qquad \Box$$

7.4 Relations Between the Attacks

Figure 7.1 shows how the different attacks are related in terms of generality. The lower the position, the less general is the attack. The left-hand side and the right-hand side of the figure represent attacks in the frequency domain and in the time domain, respectively. PC and GLC are rated as being equally general since, loosely speaking, the GLC attack can emulate a PC attack according to Proposition 7.3.1. The connection the other way is easily proved, since it is possible to derive the value of $T$ directly from $U$.

[Figure 7.1: Links between various statistical attacks. Nodes: Correlation Cryptanalysis, Partitioning Cryptanalysis, m-ary Generalized Linear Cryptanalysis, (Binary) Generalized Linear Cryptanalysis, Linear Cryptanalysis, Differential Cryptanalysis.]
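The identity (7.3) of Proposition 7.3.1 can be verified exactly for small $m$ by enumerating all permutations of $\mathbb{Z}_m$. The following sketch uses exact rational arithmetic; the joint distribution `P3` is a made-up example whose row and column sums all equal $1/m$ (as required for balanced $f$ and $g$), and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def pc_imbalance(p):
    """I_PC(U) = m^2/(m-1) * sum_{a,b} (p[a][b] - 1/m^2)^2."""
    m = len(p)
    return Fraction(m * m, m - 1) * sum(
        (p[a][b] - Fraction(1, m * m)) ** 2 for a in range(m) for b in range(m))

def glc_imbalance(p, psi):
    """I_GLC of T = psi(f(X)) - g(Y) mod m, given the joint distribution p of (f(X), g(Y))."""
    m = len(p)
    dist = [Fraction(0)] * m
    for a in range(m):
        for b in range(m):
            dist[(psi[a] - b) % m] += p[a][b]
    return Fraction(m, m - 1) * sum((q - Fraction(1, m)) ** 2 for q in dist)

def averaged_glc(p):
    """Right-hand side of (7.3): (m-1)/m! times the sum over all permutations psi."""
    m = len(p)
    total = sum(glc_imbalance(p, psi) for psi in permutations(range(m)))
    return Fraction(m - 1, factorial(m)) * total

# Hypothetical joint distribution for m = 3 with uniform marginals.
P3 = [[Fraction(k, 9) for k in row] for row in ([2, 1, 0], [0, 2, 1], [1, 0, 2])]

lhs = pc_imbalance(P3)
rhs = averaged_glc(P3)
```

For this example both sides evaluate to $1/3$: the three cyclic shifts each contribute a GLC-imbalance of $1/3$, while the three reflections see a perfectly balanced I/O difference.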


Chapter 8

A Couple of Ideas

Some of the ideas in this chapter represent a whole study in themselves, and therefore most statements are very loose and should be considered as nothing more than starting points for further research. The ideas are included anyway, since they are all related to the correlation attack in some respect.

8.1 Approximation of Boolean Functions

In this section, we consider algorithmic approaches for finding low-degree polynomials which approximate some high-complexity, keyed, Boolean function $f_k : \mathbb{Z}_2^w \to \mathbb{Z}_2$. Approximation is defined in the following sense.

Definition 8.1.1 A binary function $p$ is said to approximate another binary function $f$ if

$$P[p(X) = f(X)] > 0.5$$

for a uniformly distributed argument $X$. The degree of approximation is defined to be $2 \cdot |1 - P[p(X) = f(X)]|$.

The idea for approximation is to express $f$ as a polynomial in some canonical form and then leave out the terms with degree above a certain limit. In the following, we will assume that all polynomials are represented in algebraic normal form (ANF).

Definition 8.1.2 Let $p$ be a polynomial of degree $d$ represented in ANF. The truncated polynomial $t_m(p)$ of order $m \le d$ is then defined to be the polynomial which is the sum of those terms of $p$ that have degree less than or equal to $m$.

Assuming that the coefficients belonging to high degrees are sufficiently sparse, we have the following.

Proposition 8.1.3 Let $p \in \mathbb{Z}_2[x_1, x_2, \ldots, x_w]$ be a binary, multivariate polynomial of degree $d$ represented in ANF and let $m \le d$. If the coefficients belonging to terms in $p$ of degree higher than $m$ are sufficiently sparse (in some sense), then the truncated polynomial $t_m(p)$ approximates $p$.

We will not prove the proposition or elaborate on the meaning of sparse. Intuitively, however, it is clear that terms of high degree "rarely" equal 1, since this requires all the variables of the term to equal 1. Our hope is that leaving out the terms of high degree "rarely" matters (preliminary empirical tests of this hypothesis are positive).

Example 8.1.4 Let $f : \mathbb{Z}_2^4 \to \mathbb{Z}_2$ be defined by the following polynomial in ANF:

$$f(x_1, x_2, x_3, x_4) = x_1 x_2 x_3 x_4 + x_1 x_2 x_3 + x_2 x_3 x_4 + x_1 x_3 x_4 + x_2 x_4 + x_1 x_4 + x_2 + x_3.$$

Then the truncated polynomial

$$t_2(f(x_1, x_2, x_3, x_4)) = x_2 x_4 + x_1 x_4 + x_2 + x_3$$

approximates $f$, since the values of $f$ and $t_2(f)$ agree for 13 of the 16 possible arguments (they differ only at $(0,1,1,1)$, $(1,0,1,1)$ and $(1,1,1,0)$).

We will assume in the following that $f_k$ is the composition of several low-complexity functions $f^{(1)}_{k^{(1)}}, f^{(2)}_{k^{(2)}}, \ldots, f^{(r)}_{k^{(r)}} : \mathbb{Z}_2^w \to \mathbb{Z}_2^w$, i.e.,

$$f_k = f^{(r)}_{k^{(r)}} \circ f^{(r-1)}_{k^{(r-1)}} \circ \cdots \circ f^{(1)}_{k^{(1)}}.$$

This agrees with the structure of an iterative block cipher based on successively applying a cryptographically weak function several times. The function $f^{(j)}_{k^{(j)}}$ might be thought of as the $j$-th round function of a block cipher. Thus $f_k$ is the polynomial which describes the whole cipher.

One way to exploit Proposition 8.1.3 would be by the following method of repeated substitution, expansion, and truncation.

8.1.1 Repeated Substitution, Expansion, and Truncation

Let $y = f_k(x_1, x_2, \ldots, x_w)$, and let $g_b : \mathbb{Z}_2^w \to \mathbb{Z}_2$ be a function which picks out the $b$-th bit of its argument. We wish to find an approximation of $y$. This is done simply by inserting polynomials into each other and ignoring (truncating) at each step the terms with too high degree. More formally, we fix some $m$ and start by expanding

$$y^{(1)}_j = g_j\big(f^{(1)}_{k^{(1)}}(x_1, x_2, \ldots, x_w)\big)$$

into ANF for $j = 1, 2, \ldots, w$. Here the input bits of $x$ and the key bits of $k$ should be treated symbolically as indeterminate variables of the polynomial. We then truncate by evaluating

$$\tilde{y}^{(1)}_j = t_m\big(y^{(1)}_j\big)$$

for all $j$. Each $\tilde{y}^{(1)}_j$ represents an approximation of one bit of the output of the first round. We then proceed to compute

$$\tilde{y}^{(s)}_j = t_m\Big(g_j\big(f^{(s)}_{k^{(s)}}(\tilde{y}^{(s-1)}_1, \tilde{y}^{(s-1)}_2, \ldots, \tilde{y}^{(s-1)}_w)\big)\Big)$$

by expanding into ANF and truncating for $j = 1, 2, \ldots, w$ and $s = 2$. Continuing successively with $s = 3, 4, \ldots, r$, one finally obtains the truncated cipher output

$$\tilde{y} = \big(\tilde{y}^{(r)}_1, \tilde{y}^{(r)}_2, \ldots, \tilde{y}^{(r)}_w\big).$$
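The truncation operator $t_m$ and one substitute-expand-truncate step can be sketched with ANF polynomials stored as sets of monomials. This is a minimal illustration, not the author's implementation: the representation (a monomial is a `frozenset` of variable indices, the empty set being the constant 1) and all function names are assumptions. For the polynomial of Example 8.1.4 as printed, exhaustive enumeration gives 13 agreeing arguments out of 16.

```python
from itertools import product

def poly(*monomials):
    """Build an ANF polynomial over Z_2; each monomial is a tuple of variable indices."""
    p = set()
    for mon in monomials:
        p ^= {frozenset(mon)}          # coefficients are reduced mod 2
    return p

def multiply(p, q):
    """Product in ANF: x_i * x_i = x_i, so monomials multiply by set union."""
    out = set()
    for s in p:
        for t in q:
            out ^= {s | t}
    return out

def truncate(p, m):
    """t_m(p): keep only the terms of degree at most m."""
    return {t for t in p if len(t) <= m}

def evaluate(p, x):
    """Evaluate p at the assignment x (a dict: variable index -> bit)."""
    return sum(all(x[i] for i in t) for t in p) % 2

def substitute(p, inner):
    """Insert the polynomial inner[i] for each variable i of p and expand."""
    result = set()
    for term in p:
        prod = {frozenset()}           # the constant polynomial 1
        for i in term:
            prod = multiply(prod, inner[i])
        result ^= prod
    return result

# Example 8.1.4: f and its truncation t_2(f).
f = poly((1, 2, 3, 4), (1, 2, 3), (2, 3, 4), (1, 3, 4), (2, 4), (1, 4), (2,), (3,))
t2 = truncate(f, 2)
agreements = sum(
    evaluate(f, dict(zip((1, 2, 3, 4), bits))) == evaluate(t2, dict(zip((1, 2, 3, 4), bits)))
    for bits in product((0, 1), repeat=4))

# One substitution step with m = 2 (hypothetical round polynomials).
outer = poly((1, 2), (3,))                       # y1*y2 + y3
inner = {1: poly((1, 2, 3), (1,)),               # y1 = x1x2x3 + x1
         2: poly((2,), ()),                      # y2 = x2 + 1
         3: poly((1, 2, 3, 4), (4,))}            # y3 = x1x2x3x4 + x4
exact_then_truncate = truncate(substitute(outer, inner), 2)
truncate_each_step = truncate(
    substitute(outer, {i: truncate(q, 2) for i, q in inner.items()}), 2)
```

In this example truncating the inner polynomials before substitution yields the same low-degree part as truncating only at the end, namely $x_1 x_2 + x_1 + x_4$, because a product of monomials never has lower degree than any of its factors.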

In this way we can find approximations of each output bit as a function of the key bits and the input bits. Since terms only grow in degree or disappear completely when inserting one (binary) polynomial into another, we may throw away terms of high degree at each step without losing any information with respect to the terms of lower degree.

What we want ideally, however, is explicit approximations of each key bit as a function of the plaintext and the ciphertext. This would be a direct source of probabilistic information about the key. The above approach gives us the key bits only implicitly. In the next two subsections we will see how it might be possible to isolate key bits of vulnerable ciphers.

8.1.2 An Approach Using Resultants

The theory of resultants is useful for solving systems of polynomial equations. It is possible, at least theoretically, to solve the system of equations belonging to some cipher with respect to the unknown key (for an algorithm see for example [10]). In practice, however, the solution algorithm will usually require exponential time and space, since the exact polynomial which describes a certain key bit will normally be very complex.

However, if it is possible to devise an algorithm which computes the truncated versions of the key bits as functions of the input and output bits, analogously to the approach in the previous subsection, it might be possible to avoid intermediate expression swell. Unfortunately, the obvious approach of truncating the results after each step of the algorithm is not successful when it comes to using resultants, since too much information is lost at each step. The following approach, on the other hand, appears to be worth some investigation.

8.1.3 An Approach Using Buchberger's Algorithm

Given a set $\mathcal{P}$ of multivariate polynomials over some field, Buchberger's algorithm is a method for constructing a basis for $\mathcal{P}$ which has several nice properties (see [1, 7]). Among the properties of the so-called Gröbner basis is the possibility of back-solving the system of equations related to $\mathcal{P}$. More specifically, the Gröbner basis obtained by using lex-ordering gives a set of polynomial equations which are triangularized with respect to the polynomial coefficients.

As with the theory of resultants, intermediate expression swell makes it intractable to work with the standard version of Buchberger's algorithm when solving the system of equations belonging to some cipher. However, by a slight change to the algorithm we hope to produce a Gröbner basis which is an approximation of the original Gröbner basis. The only change is that new syzygy polynomials are truncated before they are added to the basis; this prevents expression swell. Since truncation of the syzygy polynomials found at each step corresponds to adding polynomials with no terms of low degree to the basis, this does (hopefully) not represent a major change. Since Hilbert's basis theorem does not allow us to add infinitely many elements to the basis, the algorithm will finish sooner or later with a Gröbner

80 CHAPTER 8. A COUPLE OF IDEASbasis. The open question is whether this basis is a good approximation of the originalGr�obner basis (in the proper sense).8.2 An Authentication Scheme using Gr�obner BasesAnother property of the Gr�obner basis is its applicability to the ideal membershipproblem. Consequently, another idea for using Gr�obner bases in cryptography wouldbe in an authentication scheme in which the prover has knowledge of a (secret,randomly chosen) Gr�obner basis. The public key would be an equivalent basiswhich is not a Gr�obner basis { this is easily constructed from the Gr�obner basis.As challenge the veri�er sends either a member of the ideal generated by the publicbasis or a random polynomial. The veri�er does not tell which choice he made. Theprover then has to tell whether the received polynomial belongs to the ideal or not.Conceivably, this is only possible with probability larger than 0.5 if the prover hasaccess to an appropriate Gr�obner basis.With a suitable protocol and some changes, it might even be possible to turnthe above procedure into a zero-knowledge proof. However, before it is possibleto state anything about the security of the above method, the time complexity ofBuchberger's algorithm has to be analyzed in greater detail than it has been untilnow. Care must also be taken when choosing the Gr�obner basis since the idealmembership problem is not hard in all instances.
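The truncation argument of Section 8.1 (terms only grow in degree or vanish when one binary polynomial is inserted into another) can be checked on a toy example. The following is a minimal sketch under stated assumptions: the two-bit keyed round function ROUND, the variable names x0, x1, k0, k1, and the helper P are hypothetical illustrations, and polynomials are stored as sets of monomials over GF(2) with x*x = x.

```python
from itertools import product

# A Boolean polynomial (ANF) is a frozenset of monomials; a monomial is a
# frozenset of variable names, and the empty monomial is the constant 1.

def P(*monomials):
    """Hypothetical helper: P("x0 k0", "x1", "") means x0*k0 + x1 + 1."""
    return frozenset(frozenset(m.split()) for m in monomials)

def mul(p, q):
    """Product over GF(2) with x*x = x: monomials multiply by set union."""
    result = set()
    for m1, m2 in product(p, q):
        result ^= {m1 | m2}          # equal monomials cancel in pairs
    return frozenset(result)

def truncate(p, d):
    """Drop every monomial of degree larger than d."""
    return frozenset(m for m in p if len(m) <= d)

def substitute(outer, inner, d=None):
    """Insert inner[v] for each variable v of outer; truncate at degree d if given."""
    total = frozenset()
    for monomial in outer:
        term = P("")                 # start from the constant 1
        for v in monomial:
            term = mul(term, inner[v])
            if d is not None:
                term = truncate(term, d)
        total = total ^ term         # addition over GF(2)
    return total

# Toy keyed round on two state bits u0, u1 (an arbitrary illustration).
ROUND = {"u0": P("u0 u1", "k0"), "u1": P("u0", "u1 k1", "")}

def encrypt_symbolic(rounds, d=None):
    """Express the state bits after `rounds` rounds in x0, x1, k0, k1."""
    state = {"u0": P("x0"), "u1": P("x1"), "k0": P("k0"), "k1": P("k1")}
    for _ in range(rounds):
        new = {v: substitute(ROUND[v], state, d) for v in ROUND}
        state.update(new)
    return state

exact = encrypt_symbolic(3)          # full polynomials
approx = encrypt_symbolic(3, d=2)    # truncated after every step
for v in ("u0", "u1"):
    # Truncating at each step loses nothing about the terms of degree <= 2.
    assert truncate(exact[v], 2) == approx[v]
```

Truncation is applied only to the intermediate polynomials in the base variables, never to the round function itself; this is the setting in which the degree argument holds.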

Chapter 9

Conclusion

The main contribution of this work has been the development of a new statistical attack on iterative block ciphers which we have called correlation cryptanalysis. Being a natural generalization of linear cryptanalysis, it also applies to ciphers which are not of the xor-variety. Informally speaking, the attack actually performs a kind of multi-dimensional linear regression of the ciphertext as a function of the plaintext, and uses the information leaked from this to obtain the last-round key. Since the attack has several properties in common with other standard attacks on block ciphers such as LC and DC, the notion of a statistical attack has been introduced as a common setting into which it is possible to put the various attacks. The model of a statistical attack might also be useful for describing new attacks in the future.

In summary, the correlation attack is based upon the use of so-called I/O pairs, which are certain pairs of complex-valued functions with high correlation when applied to the plaintext and the ciphertext, respectively. As a measure of the degree of correlation, the notion of imbalance has been introduced, and the so-called correlation matrix consisting of the imbalances of all linear I/O pairs has been defined. Since the correlation matrix completely describes the correlational properties of a cipher, it is useful for obtaining the imbalance of any I/O pair. The correlation matrix has other nice properties, e.g., multiplicativity in the sense that a multi-round correlation matrix is given by the product of the correlation matrices of the individual rounds.

With the tools developed in this report, it is possible to evaluate the security of a given cipher against CC, and to prove whether the cipher is immune to correlation attacks. More precisely, it has been shown that the minimum number of P/C-pairs required for a successful cryptanalysis is inversely proportional to the Frobenius norm of the correlation matrix of the reduced cipher.
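The multiplicativity property can be illustrated in the simplest special case. The sketch below is an assumption-laden illustration rather than the report's general construction: it uses the xor group on 3-bit values, two arbitrary unkeyed permutations F and G chosen only for the example, and signed correlations computed as unnormalized Walsh sums, so that the table of the composition equals the matrix product of the individual tables up to the factor n.

```python
def parity(v):
    """Parity of the set bits of v."""
    return bin(v).count("1") & 1

def walsh_table(f):
    """W[a][b] = sum over x of (-1)^(a.x xor b.f(x)); the correlation is W[a][b] / n."""
    n = len(f)
    return [[sum(1 - 2 * parity((a & x) ^ (b & f[x])) for x in range(n))
             for b in range(n)] for a in range(n)]

def matmul(A, B):
    """Plain integer matrix product of two square tables."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two arbitrary 3-bit permutations standing in for unkeyed rounds (hypothetical).
F = [3, 6, 0, 5, 7, 1, 4, 2]
G = [1, 7, 2, 4, 0, 6, 3, 5]
GF = [G[F[x]] for x in range(8)]     # first F, then G

n = 8
WF, WG, WGF = walsh_table(F), walsh_table(G), walsh_table(GF)

# With C = W / n the identity reads C(G after F) = C(F) * C(G);
# in unnormalized integers: W(F) * W(G) = n * W(G after F).
prod = matmul(WF, WG)
assert all(prod[a][c] == n * WGF[a][c] for a in range(n) for c in range(n))
```

The integer form avoids rounding, so the check is exact. Keyed rounds, other groups, and the averaging over keys treated in the report are outside the scope of this sketch.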
Due to the multiplicativity of the correlation matrix and the fact that the Frobenius norm is a matrix norm, multiround bounds based upon the correlation matrix of the round function have been developed, too.

All methods are easily applied to small ciphers like IDEA(8) by direct computation, and when considering larger ciphers there are various helpful analytical tools. For instance, Proposition 4.1.10 makes it possible to compute correlation matrices by the Fast Fourier Transform, reducing the time complexity significantly. Theorem 6.3.5 is also useful for obtaining knowledge about the structure of large composite ciphers like SAFER, and Proposition 5.4.4 helps to obtain upper bounds on the maximum imbalance over a given cipher when the polynomial degree of the cipher is known.

It has been shown how all the standard statistical attacks in the cryptanalyst's tool box (LC, GLC, PC, CC, and DC) are strongly linked together, and a remarkable fact has been demonstrated, namely that CC and DC are complements in terms of time/frequency domain. Furthermore, it has been shown that CC is a more general attack than PC and GLC (and thus LC). Consequently, if a cipher is secure against CC, it is also secure against the other above-mentioned attacks.

Proposition 4.1.10 shows in a direct way why the well-known almost bent functions provide good immunity against CC if used in the round function. Thus, we have found a robust design criterion which yields block ciphers that are provably secure against CC. This result is not surprising since the use of bent functions has already been demonstrated elsewhere to thwart other statistical attacks like DC and LC (e.g., in [40]).

The ultimate use of "multiple approximations" [20], namely the use of all available I/O products, has been considered. When evaluating the security of a cipher against the simple attack which considers one I/O pair only, it suffices to find the maximum imbalance over the reduced cipher, i.e., the maximum element of the correlation matrix. As mentioned, however, the security against the advanced attack which considers several I/O pairs is related to the Frobenius norm of the correlation matrix.

9.1 Suggestions for Further Work

First, of course, the ideas given in Chapter 8 ought to be investigated to see if they are fruitful. Still, there are many other interesting open questions. For instance, we have considered only linear relationships between input and output to yield information about the final round key.
It might prove interesting to explore the possibility of obtaining nonlinear relationships, e.g., I/O "products" of the form S = h(X, Y) where the function h depends upon its arguments in an arbitrary way.

More research on how to evaluate the security of large ciphers without resorting to direct computation should also be carried out. Since most ciphers are composed of simple functions which are put together in some clever way, a good idea would be to look for a more general version of Theorem 6.3.5 that allows arbitrary composition of functions. In this way the security of large ciphers could be evaluated in a more analytical manner.

Recall from Subsection 5.4.2 that the imbalance estimates of all linear I/O products are not independent. However, we have assumed this to be the case in Theorem 5.4.5. Consequently, the result regarding the number of P/C-pairs required for a successful attack is pessimistic in the sense that the actual number is higher than the suggested number. If the dependence between the various imbalance estimates could be worked out, it might be possible to lower this number.

Referring to Proposition 5.4.4, Weil's bound and its variations might be useful for constructing round functions with good correlation properties, since using certain low degree polynomials means low correlations between input and output (maybe it is even possible to construct almost bent functions in this way). However, too low Boolean complexity means that the cipher is vulnerable to other attacks. Therefore a proper balance between correlation immunity and Boolean complexity should be maintained. This could be achieved by alternating between two different round functions, one round function which is a low degree polynomial with good correlation properties and another round function with high Boolean complexity. This would also show in an explicit way how Shannon's notions of diffusion and confusion [44] relate to the theory of block ciphers: diffusion is easily achievable by a linear transform, i.e., a low degree polynomial (cf. the PHT-transform of SAFER), and confusion is introduced by using functions of high Boolean complexity.

The attack algorithms can be sped up if a sequential likelihood ratio test (SLRT) [27, 35] is used instead of the nonsequential tests of ATTACK1-3. The SLRT does not use a constant number of P/C-pairs for each key guess. Rather, it stops processing P/C-pairs when a sufficient number have been analyzed (as indicated by a constantly updated likelihood ratio). Also, instead of analyzing one key guess at a time, a faster attack would consider all keys at the same time by updating the presently most promising key (as indicated by its constantly updated likelihood ratio). Furthermore, we have assumed subkeys to be independent. Normally, however, subkeys are generated from some key schedule. It is possible to devise related key attacks [3] using CC which exploit this dependence.

Finally, with a proper notation it might be possible to clean up various expressions and put most propositions and proofs in this report in matrix form, making them shorter and easier to comprehend.
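The sequential test suggested above can be sketched in its simplest textbook form, Wald's sequential probability ratio test on a stream of bits. This is only an illustration under assumptions: ATTACK1-3 are not reproduced, each sample is taken to be a 0/1 indicator derived from one P/C-pair under a fixed key guess, and the probabilities p0 (wrong key) and p1 (right key) as well as the error levels alpha and beta are hypothetical parameters.

```python
import math

def sprt(samples, p0, p1, alpha=0.01, beta=0.01):
    """Wald's sequential test of H0: P(bit=1)=p0 against H1: P(bit=1)=p1.

    Returns the decision and the number of samples actually consumed.
    """
    upper = math.log((1 - beta) / alpha)      # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))      # accept H0 at or below this
    step_one = math.log(p1 / p0)              # log-likelihood ratio of a 1
    step_zero = math.log((1 - p1) / (1 - p0)) # log-likelihood ratio of a 0
    llr = 0.0
    used = 0
    for bit in samples:
        used += 1
        llr += step_one if bit else step_zero
        if llr >= upper:
            return "H1", used
        if llr <= lower:
            return "H0", used
    return "undecided", used

# Deterministic usage: a strongly biased stream is accepted after few samples,
# and an all-zero stream is rejected even earlier.
print(sprt([1] * 100, p0=0.5, p1=0.75))   # ('H1', 12)
print(sprt([0] * 100, p0=0.5, p1=0.75))   # ('H0', 7)
```

The number of samples consumed is not fixed in advance, which is the source of the speed-up mentioned above; the statistics used in the report are complex-valued imbalance estimates, so a faithful version would replace the Bernoulli likelihoods with the appropriate densities.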


Bibliography

[1] Thomas Becker and Volker Weispfenning, Gröbner Bases - A Computational Approach to Commutative Algebra, Springer-Verlag, New York, 1993.
[2] Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Classics in Applied Mathematics vol. 9, Society for Industrial and Applied Mathematics, Philadelphia, 1994.
[3] E. Biham, New Types of Cryptanalytic Attacks Using Related Keys, Technical Report #753, Computer Science Department, Technion - Israel Institute of Technology, Sep. 1992.
[4] E. Biham and A. Shamir, "Differential Cryptanalysis of DES-like Cryptosystems", Proceedings of Crypto'90, 1990.
[5] E. Biham and A. Shamir, Differential Cryptanalysis of the Data Encryption Standard, Berlin: Springer-Verlag, 1993.
[6] W. Blaser, P. Heinzmann, "New Cryptographic Device with High Security Using Public Key Distribution", Proceedings of IEEE Student Paper Contests 1979-1980, pp. 145-153, 1982.
[7] David Cox, John Little, and Donal O'Shea, Ideals, Varieties, and Algorithms, Springer-Verlag, New York, 1992.
[8] Joan Daemen, Rene Govaerts, and Joos Vandewalle, "Correlation Matrices", preprint.
[9] J. F. Dillon, "Elementary Hadamard Difference Sets", Proceedings of the Sixth Southeastern Conference on Combinatorics, Graph Theory, and Computing, Boca Raton, Florida, 1975.
[10] Keith O. Geddes, Stephen R. Czapor, and George Labahn, Algorithms for Computer Algebra, Kluwer Academic Publishers, 1992.
[11] Per Christian Hansen, Computing the singular value decomposition, technical report, Numerical Institute, Technical University of Denmark, 1982.
[12] C. Harpes, G. G. Kramer and J. L. Massey, "A Generalization of Linear Cryptanalysis and the Applicability of Matsui's Piling-up Lemma", Proceedings of Eurocrypt'95.
[13] C. Harpes, "Partitioning Cryptanalysis", Post-diplom thesis, Swiss Federal Institute of Technology Zurich, Signal and Information Processing Laboratory.
[14] C. Harpes, preliminary draft of Ph.D.-thesis, Swiss Federal Institute of Technology Zurich, Signal and Information Processing Laboratory.
[15] Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[16] Roger A. Horn and Charles R. Johnson, Topics in Matrix Analysis, Cambridge University Press, 1991.
[17] Thomas Jakobsen, Security Against Generalized Linear Cryptanalysis and Partitioning Cryptanalysis, Semester project at Signal and Information Processing Laboratory, Swiss Federal Institute of Technology Zurich, Zürich 1995.
[18] C. R. Johnson and H. M. Shapiro, "Mathematical Aspects of the Relative Gain Array $A \circ (A^{-1})^T$", SIAM J. Algebraic and Discrete Methods 7 (1986), 627-644.
[19] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, vol. 2, second edition, John Wiley and Sons.
[20] B. S. Kaliski, M. J. Robshaw, "Linear Cryptanalysis Using Multiple Approximations", Proceedings of Crypto'94, LNCS, 1994.
[21] Lars Knudsen, "A Key-Schedule Weakness in SAFER K-64", presented at Crypto '95, Santa Barbara.
[22] Lars Knudsen and Thomas Berson, "Truncated differentials of SAFER", to be presented at Fast Software Encryption 1996, Cambridge.
[23] P. V. Kumar, R. A. Scholtz, and L. R. Welch, "Generalized Bent Functions and Their Properties", J. Combinatorial Theory, Ser. A 40, 1985, 90-107.
[24] Xuejia Lai and James L. Massey, "A Proposal for a New Block Encryption Standard", Advances in Cryptology - Eurocrypt '90 Proceedings, Berlin: Springer-Verlag, 1991, pp. 389-404.
[25] Xuejia Lai, James L. Massey, and Sean Murphy, "Markov Ciphers and Differential Cryptanalysis", Advances in Cryptology - Eurocrypt '91 Proceedings, Berlin: Springer-Verlag, 1991, pp. 17-38.
[26] Peter Lancaster, Theory of Matrices, Academic Press, New York, 1969.
[27] H. J. Larson, Introduction to Probability Theory and Statistical Inference, John Wiley & Sons, 1969.
[28] Rudolf Lidl and Harald Niederreiter, Finite Fields, Encyclopedia of Mathematics and its Applications, vol. 20, Reading, 1983, Addison-Wesley.
[29] James L. Massey, "SAFER K-64: A Byte-Oriented Block-Ciphering Algorithm", Fast Software Encryption, editor R. Anderson, Lecture Notes in Computer Science No. 809, New York, Springer, 1994, pp. 1-17.
[30] M. Matsui, "Linear Cryptanalysis Method for DES Ciphers", abstracts of Eurocrypt '93, Lofthus, Norway, 1993.
[31] M. Matsui, "The First Experimental Cryptanalysis of the Data Encryption Standard", extended abstract submitted to Crypto '94.
[32] T. J. McAvoy, Interaction Analysis, Instrument Society of America, Research Triangle Park, 1983.
[33] W. Meier and O. Staffelbach, "Nonlinearity Criteria for Cryptographic Functions", Advances in Cryptology, Proceedings of Eurocrypt'89, Springer-Verlag, 1989.
[34] P. A. P. Moran, An Introduction to Probability Theory, Clarendon Press, Oxford, 1968.
[35] Sean Murphy, Fred Piper, Michael Walker, and Peter Wild, "Likelihood Estimation for Block Cipher Keys", submitted to Journal of Cryptology, June 1994.
[36] National Bureau of Standards, "Data Encryption Standard", Federal Information Processing Standards Publications No. 46, 1977.
[37] Henri J. Nussbaumer, Fast Fourier Transform and Convolution Algorithms, Springer Verlag, Berlin Heidelberg, 1981.
[38] K. Nyberg, "Constructions of Bent Functions and Difference Sets", Advances in Cryptology, Proceedings of Eurocrypt'90, Springer-Verlag, 1990.
[39] K. Nyberg, "New Bent Mappings Suitable for Fast Implementation", Fast Software Encryption, LNCS 809, Springer-Verlag, 1993.
[40] K. Nyberg and L. R. Knudsen, "Provable Security Against a Differential Attack", Journal of Cryptology, vol. 8, number 1, 1995.
[41] J. D. Olsen, R. A. Scholtz and L. R. Welch, "Bent Function Sequences", IEEE Trans. Inform. Theory, IT-28, 1982, 858-864.
[42] O. S. Rothaus, "On 'Bent' Functions", J. Combinatorial Theory, Ser. A 20 (1976), 300-305.
[43] H. Schneider and W. G. Strang, "Comparison Theorems for Supremum Norms", Numerische Math. 4 (1962), 15-20.
[44] C. E. Shannon, "Communication Theory of Secrecy Systems", Bell Syst. Tech. J., vol. 28, Oct. 1949.
[45] T. Siegenthaler, "Decrypting a Class of Stream Ciphers Using Ciphertext only", IEEE Trans. on Computers, vol. C-33, 1984.
[46] T. Siegenthaler, "Correlation-Immunity of Nonlinear Combining Functions for Cryptographic Applications", IEEE Trans. on Info. Theory, vol. IT-31.
[47] Xiao Guo-Zhen, James L. Massey, "A Spectral Characterization of Correlation-Immune Combining Functions", IEEE Trans. on Info. Theory.

Appendix A

Symbols

$C$  Correlation matrix.
$\tilde{C}$  The extended correlation matrix.
$\chi_a$  The character from $\tilde{G}$ corresponding to $a \in G$.
$\delta_{ab}$, $\delta_a(b)$  Kronecker's delta function.
$E$  The $(n-1) \times (n-1)$ matrix with $e_{ab} = 1/(n-1)$.
$E[\cdot]$  Expected value/mean value operator.
$F\{\cdot\}$  The Fourier transform.
$G$  The elements of the group used in the cipher.
$\tilde{G}$  The elements of the character group corresponding to $G$.
$\Gamma(\cdot)$  The gamma function.
$i$  The imaginary unit.
$I(S)$  The imbalance of $S$.
$I(S \mid K = k)$  The imbalance of $S$ given that $K = k$.
$I_{GLC}(\cdot)$  GLC-imbalance.
$I_{PC}(\cdot)$  PC-imbalance.
$\bar{I}(S)$  The average-key imbalance of $S$.
$\tilde{I}(S)$  The estimated imbalance of $S$.
$\tilde{I}$_�$(S)$  Weighted sum of imbalance estimates.
$K$  Key.
$K^{(j)}$  $j$-th round key.
$\mathcal{K}$  Key space.
$L_{ab}(X, Y)$  The linear I/O sum $\chi_a(X) \cdot \bar{\chi}_b(Y)$.
$\mathcal{L}$  Likelihood estimator.
$\lambda_j(\cdot)$  The $j$-th largest eigenvalue of the argument.
$n$  The number of possible plaintexts; $n = |G|$.
$N$  The number of samples.
$N(\mu, V)$  Normal distribution with mean value $\mu$ and variance $V$.
$P_G$  All possible I/O pairs over the group $G$.
$\Phi(\cdot)$  The accumulated normal distribution.
$r$  The number of rounds.
$\rho(\cdot)$  Spectral radius.
$S$  I/O product.
$S_{fg}$  The I/O product $f(X) \cdot g(Y)$.
$S^{(1..r-1)}$  I/O product over the reduced cipher.
$S^{(1..r+1)}$  I/O product over the expanded cipher.
$\mathcal{S}$  Set of I/O products.
$\sigma_j(\cdot)$  The $j$-th largest singular value of the argument.
$T$  Family of balanced functions.
$M^T$  The transposed matrix.
$V[\cdot]$  Variance operator.
$X$  Input.
$X^{(j)}$  Input to the $j$-th round.
�$(\cdot; J, N)$  Imbalance density function with parameters $J$ and $N$.
$Y$  Output.
$Y^{(j)}$  Output from the $j$-th round.
$\mathbb{Z}_m$  The set $\{0, 1, \ldots, m-1\}$.
�$_m$  Addition modulo $m$.
�$_m$  Subtraction modulo $m$.
$\otimes$  The Kronecker product; group operation used in IDEA.
�  Multiplication modulo $m+1$ with 0 � $m$ and $m+1$ prime.
$+$  Addition modulo $m$.
$\oplus$  Addition modulo 2; xor.
$\circ$  The Hadamard product.
$*$  Convolution.
$\| \cdot \|_1$  $l_1$ norm / absolute sum norm.
$\| \cdot \|_2$  Frobenius norm.
$\| \cdot \|_\infty$  $l_\infty$ norm / maximum norm.
$||| \cdot |||_1$  Maximum column sum norm.
$||| \cdot |||_2$  Spectral norm.
$||| \cdot |||_\infty$  Maximum row sum norm.
$M^*$  The Hermitian adjoint of $M$.
$\bar{c}$  The complex conjugate of $c$.

Appendix B

Abbreviations

ANF  Algebraic Normal Form
CC  Correlation Cryptanalysis
DC  Differential Cryptanalysis
DES  Data Encryption Standard
FFT  Fast Fourier Transform
GLC  Generalized Linear Cryptanalysis
IDEA  International Data Encryption Algorithm
I/O  Input/Output
IPES  Improved Proposed Encryption Standard
LC  Linear Cryptanalysis
P/C  Plaintext/Ciphertext
PC  Partitioning Cryptanalysis
PES  Proposed Encryption Standard
PHT  Pseudo Hadamard Transform
SAFER  Secure And Fast Encryption Routine
XOR  Exclusive OR
