
COSADE 2010 - First International Workshop on Constructive Side-Channel Analysis and Secure Design

77

Biasing power traces to improve correlation in power analysis attacks

Yongdae Kim†, Takeshi Sugawara†, Naofumi Homma†, Takafumi Aoki†, Akashi Satoh‡

†Graduate School of Information Sciences, Tohoku University, 6-6-05 Aramaki Aza Aoba Aoba-ku Sendai-shi 980-8579 Japan

Email: {kimyd, sugawara, homma}@aoki.ecei.tohoku.ac.jp, [email protected]

‡National Institute of Advanced Industrial Science and Technology, 1-18-13 Sotokanda Chiyoda-ku Tokyo 101-0021 Japan

Email: [email protected]

Abstract—In this paper, we present a method of selecting power traces to improve the efficiency of power analysis attacks. The proposed method improves the correlation factor by biasing the distribution of power traces. The biasing consists of selecting a subset from a larger set of traces. We demonstrate our method through correlation power analysis (CPA) experiments using two different devices. The results clearly show that the selection of power traces has a significant impact on the results of CPAs. Based on the selection method, an evaluation method to detect such biasing in power traces is also proposed. The method can be used to achieve a fair comparison of statistical distinguishers for power analysis attacks.

I. INTRODUCTION

Power analysis attacks [1] are a serious threat to cryptographic devices. The attacks usually exploit a set of power traces measured from a target device, which are correlated with internal data and operations. The assumption here is that the measured power traces have different statistical distributions depending on the operands and operations. An adversary can recover secret keys when the distributions are distinguishable.

Since the difference-of-means method was presented as a distinguisher in the original article on Differential Power Analysis (DPA) [1], much effort has been devoted to finding effective distinguishers [2]–[7]. We note that the correlation coefficient in Correlation Power Analysis (CPA) [4] is one of the most common distinguishers.

In contrast, we take the approach of changing the statistical distribution of power traces rather than working on distinguishers. Previous work implicitly assumes that the statistical distributions are fixed and mainly determined by the physical characteristics of the target device. However, they can be biased when a particular subset of traces is selected from a larger set. As a result, the efficiency of conventional power analysis attacks improves when the biased traces are used. The idea of selecting power traces was first proposed in [8]. In that article, a masking countermeasure is compromised by selecting traces so as to bias the statistical distribution of a random mask.

On the other hand, we introduce a selection method that can be used to reduce the number of power traces needed to reveal keys. Our method is a pre-processing technique that improves the performance of the conventional analysis that follows it. In addition, we present a method to evaluate such biasing in a set of power traces, since the selected sets can cause an unfair comparison between distinguishers.

Our preliminary result was posted to the DPA Contest 2008/2009 [9]. This paper focuses on the selection method tailored to CPA, yet the same strategy can be extended to other distinguishers. The advantage of the proposed method is examined through actual experiments on two different cryptographic devices: (i) an FPGA implementation of AES (Advanced Encryption Standard) and (ii) an ASIC implementation of DES (Data Encryption Standard). The results show that the selection of power traces achieves a significant reduction in the measurement to disclosure (MTD).

II. FORMALIZATION OF POWER TRACES

This section describes our power consumption model and derives the equation of the correlation coefficient for CPAs. The basic characteristics of power traces are discussed from the viewpoint of the correlation used in CPA. Then, we describe how to improve the correlation factor.

A. Power Consumption

Let $P_{\mathrm{total}}$ be the power consumption observed from a cryptographic device. $P_{\mathrm{total}}$ is represented as the sum of two components, $P_{\mathrm{data}}$ and $P_{\mathrm{noise}}$, that is

$$ P_{\mathrm{total}} = P_{\mathrm{data}} + P_{\mathrm{noise}}, \tag{1} $$

where $P_{\mathrm{data}}$ is the power consumption depending on the processed data, and the remaining $P_{\mathrm{noise}}$ is the noise component (i.e. electronic noise in the measurement setup). The relationship between $P_{\mathrm{data}}$ and the processed data is determined by a power model (e.g. the Hamming weight model or the Hamming distance model). On the other hand, $P_{\mathrm{noise}}$ is approximated by a normal distribution in most cases.
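As a minimal sketch of Eq. (1), the following Python snippet simulates one sample point of a trace under a Hamming weight model with Gaussian noise. The function names, the 8-bit values, and the noise level are illustrative assumptions and are not taken from the paper.

```python
import random


def hamming_weight(x: int) -> int:
    """Number of set bits in x (the assumed power model for P_data)."""
    return bin(x).count("1")


def simulate_power(values, sigma_noise, rng):
    """Return P_total = P_data + P_noise for each processed value.

    P_data is the Hamming weight of the value; P_noise is drawn from a
    normal distribution with mean 0 and standard deviation sigma_noise.
    Both choices are assumptions made for illustration only.
    """
    return [hamming_weight(v) + rng.gauss(0.0, sigma_noise) for v in values]


rng = random.Random(0)
values = [rng.randrange(256) for _ in range(5)]  # hypothetical processed bytes
points = simulate_power(values, sigma_noise=0.5, rng=rng)
for v, p in zip(values, points):
    print(f"value={v:3d}  HW={hamming_weight(v)}  P_total={p:.3f}")
```

With `sigma_noise=0.0` the output reduces to the pure data-dependent component, which is a convenient sanity check for the model.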


B. Correlation Coefficient

One of the improved power analysis attacks is Correlation Power Analysis (CPA), which employs Pearson's correlation coefficient $\rho$ to evaluate the linear relationship between $P_{\mathrm{total}}$ and the processed data.

Assume that we have $N$ power consumptions corresponding to $N$ different inputs and that $N$ is large enough. Let $g_i$ ($1 \le i \le N$) denote the power consumption at the $i$-th input, which is given by

$$ g_i = s_i + w_i, \tag{2} $$

where $s_i$ is the data-dependent component corresponding to $P_{\mathrm{data}}$, and $w_i$ is the data-independent component corresponding to $P_{\mathrm{noise}}$. The assumption considered here is that $s_i$ is perfectly correlated with the processed data and that $w_i$ follows a normal distribution with a mean of 0 and a standard deviation of $\sigma_{\mathrm{noise}}$, described as

$$ w_i \sim \mathcal{N}(0, \sigma_{\mathrm{noise}}). \tag{3} $$

Now, we assume a parallel implementation of a cryptographic algorithm in which $M$ S-boxes are processed simultaneously. When the contribution of the $j$-th S-box at the $i$-th input is represented as $h_{ij}$ ($1 \le i \le N$, $1 \le j \le M$), $s_i$ is given as

$$ s_i = \sum_{j=1}^{M} h_{ij}. \tag{4} $$

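The mean $E(h_{ij})$ and the variance $\mathrm{var}(h_{ij})$ are represented as

$$ E(h_{ij}) = \frac{1}{N} \sum_{i=1}^{N} h_{ij} \equiv \mu, \tag{5} $$

$$ \mathrm{var}(h_{ij}) = \frac{1}{N} \sum_{i=1}^{N} (h_{ij} - \mu)^2 \equiv \sigma^2, \tag{6} $$

respectively, and take the same values $\mu$ and $\sigma^2$ for every S-box. This is because the $N$ inputs are truly random and are processed independently in the $M$ S-boxes. The mean and the variance of $s_i$ are given by the sums of those of $h_{ij}$ as

$$ E\left(\sum_{j=1}^{M} h_{ij}\right) = \sum_{j=1}^{M} E(h_{ij}) = M\mu \tag{7} $$

$$ \mathrm{var}\left(\sum_{j=1}^{M} h_{ij}\right) = \sum_{j=1}^{M} \mathrm{var}(h_{ij}) = M\sigma^2 \tag{8} $$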

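respectively. Pearson's correlation coefficient $\rho_j$ between $g_i$ and $h_{ij}$ is defined as

$$ \rho_j = \frac{\mathrm{cov}(g_i, h_{ij})}{\sqrt{\mathrm{var}(g_i)\,\mathrm{var}(h_{ij})}}, \tag{9} $$

where $\mathrm{cov}(X, Y)$ denotes the covariance of two variables $X$ and $Y$. Here, $\mathrm{cov}(g_i, h_{ij})$ can be simplified using Eqs. (7) and (8) as follows:

\begin{align}
\mathrm{cov}(g_i, h_{ij}) &= E\big((g_i - E(g_i))(h_{ij} - E(h_{ij}))\big) \tag{10} \\
&= E\big((s_i + w_i - E(s_i) - E(w_i))(h_{ij} - \mu)\big) \tag{11} \\
&= E\big((s_i + w_i - M\mu)(h_{ij} - \mu)\big) \tag{12} \\
&= E\big((s_i - M\mu)(h_{ij} - \mu)\big) + E\big(w_i (h_{ij} - \mu)\big) \tag{13} \\
&= E\left(\left(\sum_{k=1}^{M} h_{ik} - M\mu\right)(h_{ij} - \mu)\right) + E(w_i)\,E(h_{ij} - \mu) \tag{14} \\
&= E\left(\left(\sum_{k=1}^{M} (h_{ik} - \mu)\right)(h_{ij} - \mu)\right) \tag{15} \\
&= E\big((h_{ij} - \mu)^2\big) \quad \text{(only if } k = j\text{, otherwise 0)} \tag{16} \\
&= \sigma^2 \tag{17}
\end{align}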

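Also, $\mathrm{var}(g_i)$ can be simplified using Eq. (8) as

\begin{align}
\mathrm{var}(g_i) &= \mathrm{var}\left(\sum_{k=1}^{M} h_{ik} + w_i\right) \tag{18} \\
&= \sum_{k=1}^{M} \mathrm{var}(h_{ik}) + \mathrm{var}(w_i) \tag{19} \\
&= M\sigma^2 + \sigma_{\mathrm{noise}}^2. \tag{20}
\end{align}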

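Using Eqs. (17) and (20), therefore, $\rho_j$ is represented as

\begin{align}
\rho_j &= \frac{\mathrm{cov}(g_i, h_{ij})}{\sqrt{\mathrm{var}(g_i)\,\mathrm{var}(h_{ij})}} \tag{21} \\
&= \frac{\sigma^2}{\sqrt{(M\sigma^2 + \sigma_{\mathrm{noise}}^2)\,\sigma^2}} \tag{22} \\
&= \frac{1}{\sqrt{M + \dfrac{\sigma_{\mathrm{noise}}^2}{\sigma^2}}} \tag{23} \\
&= \frac{1}{\sqrt{M + \dfrac{1}{\mathrm{SNR}}}}, \tag{24}
\end{align}

where $\mathrm{SNR} = \sigma^2 / \sigma_{\mathrm{noise}}^2$ represents the signal-to-noise ratio given by the variances of $h_{ij}$ and $w_i$.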
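Eq. (24) can be checked numerically. The sketch below simulates $M$ independent S-box contributions plus Gaussian noise and compares the empirical Pearson correlation with the predicted value. The concrete choices (Hamming weight of a random 4-bit value as $h_{ij}$, so that $\sigma^2 = 1$; $M = 8$; $\sigma_{\mathrm{noise}} = 2$) are assumptions for illustration, not parameters from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(vx * vy)


def predicted_rho(m, sigma2, sigma2_noise):
    """Eq. (24): rho = 1 / sqrt(M + 1/SNR) with SNR = sigma2 / sigma2_noise."""
    return 1.0 / math.sqrt(m + sigma2_noise / sigma2)


rng = random.Random(1)
M, N, SIGMA_NOISE = 8, 20000, 2.0

# h[i][j]: Hamming weight of a random 4-bit value, so var(h_ij) = 4 * 0.25 = 1.
h = [[bin(rng.randrange(16)).count("1") for _ in range(M)] for _ in range(N)]
# g_i = s_i + w_i, Eqs. (2) and (4).
g = [sum(row) + rng.gauss(0.0, SIGMA_NOISE) for row in h]

rho_emp = pearson(g, [row[0] for row in h])          # correlation with S-box j = 1
rho_pred = predicted_rho(M, 1.0, SIGMA_NOISE ** 2)   # 1 / sqrt(8 + 4)
print(f"empirical rho = {rho_emp:.3f}, predicted rho = {rho_pred:.3f}")
```

The two values should agree up to sampling error, which shrinks as $N$ grows.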

III. BIASING POWER TRACES

This section presents a method of selecting a biased subset of power traces from a larger set of power traces. An evaluation method to detect the bias in power traces is also presented.

A. Method of choosing biased power traces

[Fig. 1. Distributions of two sets of power traces at a fixed time index (horizontal axis: power consumption; vertical axis: frequency of occurrence): (a) the distribution of $N_2$ power traces and (b) the distribution of $N_1$ power traces]

To improve the correlation coefficient, many papers have focused only on finding efficient distinguishers [5], [6]. In this paper, however, we focus on a pre-processing technique that chooses biased power traces to improve the correlation coefficient. As shown in Eq. (24), the SNR should be increased to achieve a higher correlation coefficient $\rho$. However, the noise component $\sigma_{\mathrm{noise}}^2$ is uncontrollable and determined by the target device and the measurement. On the other hand, the variance $\sigma^2$ can be changed to obtain a higher correlation coefficient. This can be achieved by intentionally choosing a subset of measured traces so that $\sigma^2$ is maximized. According to Eq. (20), the maximization of $\sigma^2$ is equivalent to that of $\mathrm{var}(g_i)$. Therefore, the subset can be chosen by the following method.

First, let $N_1$ and $N_2$ denote the number of chosen power traces and the total number of power traces, respectively.

1) Determine the time index $t_{ct}$ that is most relevant to the processed data, using the $N_2$ power traces.

2) Calculate the mean $\mu_p$ and the variance $\sigma_p^2$ of the power consumption at $t_{ct}$ using the $N_2$ power traces.

3) Describe the normal distribution of the power consumption (see Fig. 1(a)) by the probability density function with parameters $\mu_p$ and $\sigma_p^2$, given as

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma_p^2}} \exp\left(-\frac{(x - \mu_p)^2}{2\sigma_p^2}\right). \tag{25} $$

4) Sort the $N_2$ power traces by the value of the probability density function at $t_{ct}$.

5) Extract $N_1$ power traces from the $N_2$ power traces in order of increasing probability density, that is, starting from the traces farthest from the mean (see Fig. 1(b)).

This method assumes that the power consumption $g_i$ is approximated by a normal distribution. The assumption is based on the fact that each bit in the processed data changes independently with a probability of 0.5, so that the number of bit transitions follows a binomial distribution. Thus, the sum of transitions over all the bits can be approximated by a normal distribution. This approximation is reasonable since the bit-length $L$ is large enough in practice (e.g. $L = 64$ in DES).
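Steps 2) to 5) above can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the authors' implementation: the trace layout (a list of sample lists), the function names, and the synthetic data are assumptions, and the time index from step 1) is taken as given.

```python
import math
import random
import statistics


def normal_pdf(x, mu, var):
    """Eq. (25): normal probability density with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def select_biased(traces, t_ct, n1):
    """Return the n1 traces whose sample at t_ct has the lowest density.

    traces: list of traces, each a list of samples (hypothetical layout).
    t_ct:   index of the most relevant sample, assumed known from step 1).
    """
    points = [trace[t_ct] for trace in traces]
    mu_p = statistics.fmean(points)                # step 2: mean
    var_p = statistics.pvariance(points, mu_p)     # step 2: variance
    ranked = sorted(traces, key=lambda tr: normal_pdf(tr[t_ct], mu_p, var_p))  # steps 3-4
    return ranked[:n1]                             # step 5: lowest density first


rng = random.Random(3)
# Synthetic stand-in for N2 = 2000 measured traces with 3 samples each.
all_traces = [[rng.gauss(0, 1), rng.gauss(50, 3), rng.gauss(0, 1)] for _ in range(2000)]
T_CT = 1
biased = select_biased(all_traces, T_CT, n1=300)

var_all = statistics.pvariance([tr[T_CT] for tr in all_traces])
var_biased = statistics.pvariance([tr[T_CT] for tr in biased])
print(f"variance of all traces at t_ct:    {var_all:.2f}")
print(f"variance of biased traces at t_ct: {var_biased:.2f}")
```

Because the density of a normal distribution decreases monotonically with the distance from the mean, sorting by density is equivalent to sorting by $|x - \mu_p|$ in descending order; the density form is kept here to mirror the steps as written.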

Figure 1(b) shows that the extracted power traces lie away from the mean, so it is intuitive that the variance of these traces is high. Such a distribution is well investigated in statistics and is referred to as a truncated distribution [10]. The variance of this truncated distribution, which retains only the tails, is higher than that of the original distribution.

B. Evaluation of power traces

The above biased power traces may be used intentionally or unintentionally, resulting in an unreliable evaluation of cryptographic modules. In other words, a set of power traces that follows a normal distribution would be preferable for evaluation. Thus, we can evaluate a set of power traces by its distribution. Such an evaluation is achieved by testing the normality of the traces. For the quantification of normality, one of the well-known test methods is a modification of the Kolmogorov-Smirnov test, which is generally referred to as the Lilliefors test [11]. If the Lilliefors test statistic is small enough, it can be said that the set of power traces follows a normal distribution.

[Fig. 2. Measured power traces (horizontal axis: time in ns; vertical axis: voltage in mV): (a) DES in ASIC and (b) AES in FPGA]
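The normality check described in Section III-B can be sketched as follows. The function computes the Lilliefors statistic $D$, i.e. the Kolmogorov-Smirnov distance between the empirical distribution and a normal distribution whose mean and standard deviation are estimated from the same samples. The synthetic data and the tail-selection shortcut are assumptions for illustration; critical values for a formal test are not included.

```python
import random
import statistics


def lilliefors_statistic(samples):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    n = len(samples)
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples, mu)  # sample standard deviation
    fitted = statistics.NormalDist(mu, sigma)
    d = 0.0
    for i, x in enumerate(sorted(samples), start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


rng = random.Random(4)
random_set = [rng.gauss(50.0, 3.0) for _ in range(2000)]  # hypothetical samples at t_ct
center = statistics.fmean(random_set)
# Keep the 300 samples farthest from the mean, mimicking a biased selection.
biased_set = sorted(random_set, key=lambda x: abs(x - center), reverse=True)[:300]

d_random = lilliefors_statistic(random_set)
d_biased = lilliefors_statistic(biased_set)
print(f"D (random set) = {d_random:.3f}")
print(f"D (biased set) = {d_biased:.3f}")
```

A tail-only subset is bimodal, so its empirical distribution departs strongly from any fitted normal curve and $D$ becomes large, while an unselected Gaussian sample yields a small $D$.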

IV. EXPERIMENTAL RESULTS

This section demonstrates the effectiveness of biasing power traces through CPA experiments. In addition, we show a method of evaluating whether the power traces were selected to improve the results of power analysis attacks.

Our experiments employed power consumption traces from (i) DES and (ii) AES circuits. Experiment (i) was conducted on the DES crypto-processor implemented on an ASIC called SecmatV3 SoC in the 2008/2009 DPA Contest [9]. Experiment (ii) was conducted on the AES circuit implemented in a Xilinx FPGA on the Side-channel Attack Standard Evaluation Board (SASEBO) [12]. The total numbers of measured traces are (i) 80,000 and (ii) 150,000, respectively. Figures 2(a) and (b) show the measured power traces of DES and AES, respectively.

In order to choose the biased set of power traces, we determined a point ($t_{ct}$) by computing the correlation coefficients between the power consumption and the processed data of DES and AES. The determined points are 724 ns and 593 ns for DES and AES, respectively.¹

Using the values at the point, we selected the biased set of power traces according to the method described in the previous section. We also selected the same number of power traces at random for comparison. In this procedure, 300 and 3,000 traces were selected for experiments (i) and (ii), respectively.

Figures 3 and 4 show the transitions of the correlation coefficient values (MTD: measurement to disclosure) in CPAs on DES and AES, respectively. In each figure, the black line indicates the transition of the correct key and the gray lines indicate those of wrong keys. Figures 3(a) and 4(a) are the results for power traces selected at random, and Figs. 3(b) and 4(b) are those for biased power traces selected by our method. The correlation results clearly improved with the biased power traces. More precisely, the MTDs of the biased power traces in Figs. 3 and 4 are 2.5 and 8.0 times smaller than those of the randomly selected power traces, respectively.

[Fig. 3. Transitions of correlation coefficient values in CPA on DES (horizontal axis: number of traces, 0 to 300; vertical axis: maximum of correlation coefficient, 0 to 1; curves: correct key and wrong keys): (a) Random traces and (b) Biased traces]

[Fig. 4. Transitions of correlation coefficient values in CPA on AES (horizontal axis: number of traces, 0 to 3000; vertical axis: maximum of correlation coefficient, 0 to 0.5; curves: correct key and wrong keys): (a) Random traces and (b) Biased traces]

¹ The point can also be determined by SPA. For example, we can determine the point of AES in the experiment since the unique pattern of the 10th round (i.e. the target round) is observed by SPA. Another possible method is to utilize a reference device. In this case, such a point would be found by the selection method for Template Attacks. However, we do not discuss an efficient method of selecting the point in this paper.
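The correct-key and wrong-key curves in Figs. 3 and 4 come from a standard CPA, which can be sketched on toy data as follows. Everything concrete here is an assumption for illustration: the S-box is a random 8-bit permutation rather than the DES or AES S-box, the leakage is a Hamming weight plus Gaussian noise, and a single sample per trace is used.

```python
import math
import random


def hw(x):
    """Hamming weight of x."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if a variance is zero."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cpa_correlations(plaintexts, leakages, sbox):
    """Correlation between the leakage and the HW prediction, per key guess."""
    return [
        pearson(leakages, [hw(sbox[p ^ guess]) for p in plaintexts])
        for guess in range(256)
    ]


rng = random.Random(2)
TOY_SBOX = list(range(256))  # hypothetical S-box: a random permutation
rng.shuffle(TOY_SBOX)
SECRET_KEY = 0x3C            # hypothetical key byte

plaintexts = [rng.randrange(256) for _ in range(500)]
leakages = [hw(TOY_SBOX[p ^ SECRET_KEY]) + rng.gauss(0.0, 2.0) for p in plaintexts]

rho = cpa_correlations(plaintexts, leakages, TOY_SBOX)
best_guess = max(range(256), key=lambda k: abs(rho[k]))
print(f"best guess = {best_guess:#04x}, |rho| = {abs(rho[best_guess]):.3f}")
```

Repeating the computation on growing prefixes of the trace set gives one curve per key guess, which is the kind of plot shown in Figs. 3 and 4.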

Figure 5 shows the classification rates of CPAs using the two types of power traces. The vertical axis indicates the classification rate, which represents the number of S-boxes where we could distinguish a correct key from all possible key candidates. If all correct keys were obtained, the classification rate is 100%. The horizontal axis is the number of traces used in the CPA. We can confirm from Fig. 5 that the characteristics of power-trace sets have a significant impact on the results of power analysis attacks independently of cryptographic algorithms and implementation platforms.
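The two figures of merit used in this section can be computed as sketched below. The paper does not spell out a formal definition of MTD, so the "stable" variant used here (the smallest number of traces from which the correct key stays ranked first) is an assumption, as are the function names and the sample inputs.

```python
def measurements_to_disclosure(correct_key_ranks):
    """Smallest trace count from which the correct key stays ranked first.

    correct_key_ranks[i] is the rank of the correct key (1 = best) when the
    first i + 1 traces are used. Returns None if the key is never stably first.
    """
    mtd = None
    for n_traces, rank in enumerate(correct_key_ranks, start=1):
        if rank == 1:
            if mtd is None:
                mtd = n_traces
        else:
            mtd = None  # the correct key lost first place; start over
    return mtd


def classification_rate(recovered_keys, true_keys):
    """Percentage of S-boxes whose sub-key was recovered correctly."""
    hits = sum(r == t for r, t in zip(recovered_keys, true_keys))
    return 100.0 * hits / len(true_keys)


# Hypothetical rank history of the correct key over five trace counts.
print(measurements_to_disclosure([3, 1, 2, 1, 1]))
# Hypothetical result for four S-boxes, three of which were recovered.
print(classification_rate([1, 2, 3, 4], [1, 2, 0, 4]))
```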

Table I shows the variance $\sigma^2$ and the Lilliefors test statistic $D$ of the power-trace sets used in the above experiments. We present the product of $\sigma^2$ and $D$ as an evaluation metric for power-trace sets. The biased power traces have the specific property of increasing both components. As the value of $\sigma^2 D$ increases, the probability that the power traces were biased for the improvement of power analysis attacks increases.
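The metric is simply the product of the two statistics. The short sketch below recomputes it from the variance and $D$ values reported in Table I; the dictionary layout and the function name are illustrative assumptions.

```python
# (variance in mV^2, Lilliefors statistic D) as reported in Table I.
TABLE_I = {
    ("DES ASIC", "random"): (11.97, 0.036),
    ("DES ASIC", "biased"): (80.33, 0.346),
    ("AES FPGA", "random"): (9.37, 0.080),
    ("AES FPGA", "biased"): (67.37, 0.318),
}


def bias_score(variance, d):
    """Evaluation metric: product of the variance and the Lilliefors statistic."""
    return variance * d


for (device, kind), (variance, d) in TABLE_I.items():
    print(f"{device} {kind:>6}: sigma^2 * D = {bias_score(variance, d):.2f}")
```

The printed products match the last column of Table I after rounding to two decimals.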

V. CONCLUSION

In this paper, we presented a method of selecting a biased set of power traces for the improvement of power analysis attacks. The selection method can also be used to examine power traces for the fair evaluation of attacks. We confirmed that significant MTD improvements in CPAs were achieved by the selection method for two different algorithms and platforms. The results suggest that the characteristics of power traces should be fairly considered when evaluating power analysis attacks.

[Fig. 5. Classification rates of CPAs (horizontal axis: number of traces; vertical axis: classification rate (%); curves: random traces and biased traces): (a) DES in ASIC and (b) AES in FPGA]

TABLE I
STATISTICS OF POWER TRACES

Device     Traces          $\sigma^2$ (mV$^2$)   $D$     $\sigma^2 D$
DES ASIC   Random traces   11.97                 0.036    0.43
DES ASIC   Biased traces   80.33                 0.346   27.79
AES FPGA   Random traces    9.37                 0.080    0.75
AES FPGA   Biased traces   67.37                 0.318   21.42

REFERENCES

[1] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” Lecture Notes in Computer Science, vol. 1666, pp. 388–397, Aug. 1999.

[2] F.-X. Standaert, B. Gierlichs, and I. Verbauwhede, “Partition vs. comparison side-channel distinguishers,” Lecture Notes in Computer Science, vol. 5461, pp. 253–267, December 2008.

[3] S. Chari, J. R. Rao, and P. Rohatgi, “Template attacks,” Lecture Notes in Computer Science, vol. 2523, pp. 12–28, 2002.

[4] E. Brier, C. Clavier, and F. Olivier, “Correlation power analysis with a leakage model,” Lecture Notes in Computer Science, vol. 3156, pp. 135–152, 2004.

[5] T.-H. Le, J. Clediere, C. Canovas, B. Robisson, C. Serviere, and J.-L. Lacoume, “A proposition for correlation power analysis enhancement,” Lecture Notes in Computer Science, vol. 4249, pp. 174–186, 2006.

[6] L. Batina, B. Gierlichs, and K. Lemke-Rust, “Comparative evaluation of rank correlation based DPA on an AES prototype chip,” Lecture Notes in Computer Science, vol. 5222, pp. 341–354, 2008.

[7] W. Schindler, K. Lemke, and C. Paar, “A stochastic model for differential side channel cryptanalysis,” Lecture Notes in Computer Science, vol. 3659, pp. 30–46, 2005.

[8] K. Tiri and P. Schaumont, “Changing the odds against masked logic,” Selected Areas in Cryptography, Lecture Notes in Computer Science, vol. 4356, pp. 134–146, 2007.

[9] “DPA Contest,” 2008/2009, http://www.dpacontest.org.

[10] A. C. Cohen, Truncated and Censored Samples: Theory and Applications. Marcel Dekker Inc., 1991.

[11] H. Lilliefors, “On the Kolmogorov-Smirnov test for normality with mean and variance unknown,” Journal of the American Statistical Association, Jun. 1967.

[12] Research Center for Information Security, “Side-channel Attack Standard Evaluation BOard (SASEBO),” http://www.rcis.aist.go.jp/special/SASEBO.

