IPSJ Transactions on System LSI Design Methodology Vol.11 16–28 (Feb. 2018)

[DOI: 10.2197/ipsjtsldm.11.16]

Regular Paper

Scan-based Side-channel Attack against HMAC-SHA-256 Circuits Based on Isolating Bit-transition Groups Using Scan Signatures

Daisuke Oku 1,a)  Masao Yanagisawa 2  Nozomu Togawa 1,b)

Received: June 1, 2017, Revised: September 1, 2017, Accepted: October 20, 2017

Abstract: A scan chain is used in scan-path test, one of the design-for-test techniques, and can control and observe internal registers in an LSI chip. On the other hand, scan-based side-channel attacks have drawn attention because they can restore secret information by exploiting the scan data obtained from a scan chain inside a crypto chip during cryptographic processing. In this paper, we propose a scan-based attack method against a hash generator circuit called HMAC-SHA-256. Our proposed method is composed of three steps: Firstly, we isolate 64 bit-transition groups from the scan data using scan signatures based on the property of the HMAC-SHA-256 algorithm. Secondly, we classify these 64 bit-transition groups into 32 pairs. Lastly, we find out the correspondence between the scan data and the internal registers in the target HMAC-SHA-256 circuit. Our proposed method restores the secret information by the three steps above, even if the scan chain includes registers other than those of the target hash generator circuit and hence becomes very long. Experimental results show that our proposed method successfully restores two secret keys of the HMAC-SHA-256 circuit using up to 425 input messages in 7.5 hours.

Keywords: HMAC, SHA-256, side-channel attack, scan chain, scan-based side-channel attack

1. Introduction

Design-for-test techniques have become much more important in LSI design as the size and functionality of LSI chips grow more and more complicated. Scan-path test using a scan chain embedded in an LSI chip is one of the design-for-test techniques and can increase the efficiency of fault detection and operation test. A scan chain can control and observe internal registers easily and directly by connecting registers serially in an LSI chip.

JTAG and IEEE 1500 are often used as the test interfaces [1]. The scan chain is accessed and scan data is obtained through these test interfaces. In some cases, secure test interfaces are implemented in a chip where the access to the scan chain is limited somehow [2], [3], but it is pointed out that there still exist LSI chips on which these test interfaces can be accessed non-securely [1], [4], [5]. Scan data itself can be secured or compressed, and several secure scan architectures are proposed [5]. However, the secure scan architectures may have drawbacks from the viewpoints of area overhead and testability, and hence not all LSI chips have these secure scan architectures. In some cases, a test interface is invalidated by blowing the fuses connected to the test pin after manufacturing [6]. Even in these cases, there exists a technique by which the test interface is re-validated using a focused ion beam [6]. Overall, current LSI chips still carry a potential risk that a malicious user can access their scan chains and obtain their scan data.

1 The author is with the Department of Computer Science and Communications Engineering, Waseda University, Shinjuku, Tokyo 169–8555 Japan.

2 The author is with the Department of Electronic and Physical Systems, Waseda University, Shinjuku, Tokyo 169–8555 Japan.

a) [email protected]
b) [email protected]

Recently, side-channel attacks, which can restore the secret information inside an LSI chip, have drawn attention. By observing and analyzing physical information emitted from a cipher chip, an attacker can obtain a secret key inside the cipher circuit. A timing attack [7], a fault analysis attack [8], [9], a cache attack [10], an electromagnetic side-channel attack [11], and a differential power attack [12], [13], [14], [15] are reported as possible side-channel attacks.

A scan-based side-channel attack, or scan-based attack, is also one of the possible side-channel attacks, which can restore the secret information using the scan data obtained from the scan chain embedded in the target cipher chip. So far, several scan-based attacks against symmetric-key cryptosystems, DES [16] and AES [17], [18], [19], [20], public-key cryptosystems, RSA [19], [21] and ECC [19], [22], and stream ciphers [23], [24], [25] are reported.

We focus on a hash generator circuit, one of the most frequently used circuits in authentication. Differential power analysis based side-channel attacks and electromagnetic side-channel attacks against hash generator circuits are proposed in Refs. [11], [12], [14], [15]. However, as far as we know, scan-based attacks against hash generator circuits have not been proposed yet. Still, a practical scan-based attack against hash generator circuits may exist.

© 2018 Information Processing Society of Japan

In this paper, we propose a scan-based attack method against a hash generator circuit called HMAC-SHA-256. HMAC-SHA-256 is one of the message authentication algorithms using the SHA-256 hash function, which is very widely used as a hash generator. Our method analyzes the scan data obtained from the embedded scan chain and restores the secret information by finding out the correspondence between the scan data and the internal registers in the target HMAC-SHA-256 circuit. Our method is composed of three steps: Firstly, we isolate 64 bit-transition groups from the scan data using scan signatures based on the property of the HMAC-SHA-256 algorithm. Secondly, we classify these 64 bit-transition groups into 32 pairs. Lastly, we find out the correspondence between the scan data and the internal registers in the target HMAC-SHA-256 circuit.

Experimental results show that our method successfully restores two secret keys from the scan data in the HMAC-SHA-256 circuit, even if the scan chain includes registers other than those of the HMAC-SHA-256 circuit.

The contributions of the paper are summarized as:
( 1 ) We propose a world-first scan-based side-channel attack method against an HMAC-SHA-256 hash generator circuit, which is based on effectively isolating bit-transition groups in the obtained scan data.
( 2 ) Our method successfully restores secret keys inside an HMAC-SHA-256 circuit from its scan data within a practical amount of time, even if the scan data includes not only internal registers in an HMAC-SHA-256 circuit but also many other registers outside.

The rest of this paper is organized as follows: Section 2 introduces the HMAC-SHA-256 algorithm and an HMAC-SHA-256 hash generator circuit; Section 3 gives several assumptions for a scan-based attack on an HMAC-SHA-256 circuit; Section 4 proposes our scan-based attack method against an HMAC-SHA-256 circuit; Section 5 demonstrates the experimental results; Section 6 discusses the extensibility and limitations of the proposed method when applied to other implementations of HMAC-SHA-256 circuits; Section 7 gives our conclusions.

2. HMAC-SHA-256

In this section, we introduce the algorithms of HMAC and SHA-256 and the target HMAC-SHA-256 circuit. The operators used in this paper are summarized in Table 1.

Table 1 The notations in this paper.

Notation          Meaning
$\oplus$          Bitwise XOR
$\wedge$          Bitwise AND
$\bar{x}$         Bitwise NOT of $x$
$\leftarrow$      Substitution
$\boxplus$        (mod $2^{32}$)-addition
$\|$              Concatenation operator
$S^n$             Right shift by $n$ bits
$R^n$             Right rotation by $n$ bits

2.1 HMAC [26]

HMAC (Hash-based Message Authentication Code) is one of the widely used message authentication codes, which can be used with any iterative cryptographic hash function. In HMAC, given an input message $M$ and an original secret key $K$, an authentication code $\mathrm{HMAC}(M, K)$ is generated according to the following equation with hash function $h$ (see the diagram shown in Fig. 1).

$$\mathrm{HMAC}(M, K) = h\bigl( (K_0 \oplus opad) \,\|\, h( (K_0 \oplus ipad) \,\|\, M ) \bigr)$$

In Fig. 1, $f$ shows a compression function, $ipad$ and $opad$ are $B$-bit constants, and $IV$ is an $H$-bit initial value ($H < B$). The input message $M$ is partitioned into $M^{(1)}, M^{(2)}, \ldots, M^{(N)}$, each of which has a $B$-bit length. A $B$-bit value $K_0$ is generated from the original secret key $K$ as follows:

$$K_0 = \begin{cases} K & (L(K) = B) \\ K \,\|\, \underbrace{0 \ldots 0}_{B - L(K)} & (L(K) < B) \\ h(K) \,\|\, \underbrace{0 \ldots 0}_{B - L(h(K))} & (B < L(K)) \end{cases} \tag{1}$$

where $L(K)$ shows the bit length of $K$ and $h(K)$ shows the hash code when $K$ is input. Note that $L(h(K)) < B$ always holds in HMAC, since $h(K)$ is $H$ bits long and $H < B$. $K_{in}$, $K_{out}$ and $H^{(i)}$ ($1 \le i \le N$) in Fig. 1 are the intermediate hash values generated by the compression function $f$, each of which has the size of $H$ bits. In the last $f$ function, the $H$-bit $H^{(N)}$ value is extended into a $B$-bit message block and input into the $f$ function.

Note that, in HMAC-SHA-256 below, $B$ is set to be 512 bits and hence we consider a 512-bit $K_0$ value and 512-bit $opad$ and $ipad$ values. $H$ is set to be 256 bits and hence we have 256-bit $K_{in}$, $K_{out}$, and $H^{(i)}$ values.
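The HMAC equation and Eq. (1) can be illustrated with a minimal Python sketch for the HMAC-SHA-256 case ($B$ = 512 bits, i.e., 64 bytes). The function names `derive_k0` and `hmac_sha256` are hypothetical, and the byte values 0x36 and 0x5C for $ipad$ and $opad$ are taken from the HMAC standard rather than from the text above.

```python
import hashlib
import hmac

B = 64  # block size in bytes (512 bits) for SHA-256


def derive_k0(key: bytes) -> bytes:
    """Derive the B-byte K0 from the original key K, following Eq. (1)."""
    if len(key) > B:
        key = hashlib.sha256(key).digest()  # third case: use h(K), 32 bytes
    return key.ljust(B, b"\x00")            # pad with zeros up to B bytes


def hmac_sha256(message: bytes, key: bytes) -> bytes:
    """Compute h((K0 xor opad) || h((K0 xor ipad) || M)) with h = SHA-256."""
    k0 = derive_k0(key)
    k0_ipad = bytes(b ^ 0x36 for b in k0)   # ipad: 0x36 repeated (HMAC standard)
    k0_opad = bytes(b ^ 0x5C for b in k0)   # opad: 0x5C repeated (HMAC standard)
    inner = hashlib.sha256(k0_ipad + message).digest()
    return hashlib.sha256(k0_opad + inner).digest()


# Cross-check against the standard library implementation.
reference = hmac.new(b"key", b"message", hashlib.sha256).digest()
print(hmac_sha256(b"message", b"key") == reference)
```

The cross-check against Python's `hmac` module is only a sanity test of the sketch; it says nothing about the hardware circuit discussed in this paper.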

2.2 SHA-256 [27], [28]

SHA-256 is an iterative cryptographic hash function standardized by NIST which outputs a 256-bit hash value. In SHA-256, an input message $M$ is partitioned into 512-bit message blocks $M^{(1)}, M^{(2)}, \ldots, M^{(N)}$, where $N$ shows the number of message blocks *1.

The algorithm of SHA-256 is shown in Algorithm 1. The algorithm is applied to every message block $M^{(i)}$ ($1 \le i \le N$) repeatedly, and the main loop for each message block $M^{(i)}$ is composed of three phases. In PHASE 1, the internal registers are initialized. Let $H^{(i-1)}$ be a 256-bit intermediate hash value generated for the previous message block $M^{(i-1)}$. When SHA-256 is applied to HMAC, we assume $H^{(0)} = IV$, $H^{(0)} = K_{in}$, or $H^{(0)} = K_{out}$ accordingly if we do not have a previous message block (see Fig. 1). Then $H^{(i-1)}$ is partitioned into eight 32-bit sub-hash values, $H_1^{(i-1)}, \ldots, H_8^{(i-1)}$. Let $a, b, c, d, e, f, g, h$ be the 32-bit internal registers in the SHA-256 circuit. These registers are initialized using $H^{(i-1)}$.

*1 The message partitioning process in HMAC-SHA-256 is performed as follows: Firstly, we concatenate $(K_0 \oplus ipad)$ and the input message $M$ and consider a long message $(K_0 \oplus ipad) \,\|\, M$. Then we partition it into a set of 512-bit message blocks. The first message block must be $(K_0 \oplus ipad)$, whose size is exactly 512 bits. The next message block $M^{(1)}$ is the first 512 bits of the input message $M$. The size of the last message block, $M^{(N)}$, may be smaller than 512 bits. In HMAC-SHA-256, ‘1’, a number of ‘0’s, and a 64-bit integer indicating the size of the entire message $(K_0 \oplus ipad) \,\|\, M$ are appended to the end of the last message block. If the last message block is too long to accommodate all the above padding bits, the last two message blocks are used. See Refs. [27], [28] for details.

Fig. 1 The block diagram of HMAC.

Fig. 2 The block diagram of the compression function in PHASE 2 of Algorithm 1.

Algorithm 1 SHA-256
for $i = 1$ to $N$ do
    {PHASE 1: Initialize internal registers}
    $a \leftarrow H_1^{(i-1)}$; $b \leftarrow H_2^{(i-1)}$; $c \leftarrow H_3^{(i-1)}$; $d \leftarrow H_4^{(i-1)}$;
    $e \leftarrow H_5^{(i-1)}$; $f \leftarrow H_6^{(i-1)}$; $g \leftarrow H_7^{(i-1)}$; $h \leftarrow H_8^{(i-1)}$;
    {PHASE 2: Apply compression function}
    for $j = 0$ to $63$ do
        $T_1 \leftarrow h \boxplus \mathrm{Rot}_2(e) \boxplus \mathrm{Ch}(e, f, g) \boxplus K_j \boxplus W_j$;
        $T_2 \leftarrow \mathrm{Rot}_1(a) \boxplus \mathrm{Maj}(a, b, c)$;
        $h \leftarrow g$; $g \leftarrow f$; $f \leftarrow e$;
        $e \leftarrow d \boxplus T_1$;
        $d \leftarrow c$; $c \leftarrow b$; $b \leftarrow a$;
        $a \leftarrow T_1 \boxplus T_2$;
    end for
    {PHASE 3: Calculate $i$-th intermediate hash value $H^{(i)}$}
    $H_1^{(i)} \leftarrow a \boxplus H_1^{(i-1)}$; $H_2^{(i)} \leftarrow b \boxplus H_2^{(i-1)}$;
    $H_3^{(i)} \leftarrow c \boxplus H_3^{(i-1)}$; $H_4^{(i)} \leftarrow d \boxplus H_4^{(i-1)}$;
    $H_5^{(i)} \leftarrow e \boxplus H_5^{(i-1)}$; $H_6^{(i)} \leftarrow f \boxplus H_6^{(i-1)}$;
    $H_7^{(i)} \leftarrow g \boxplus H_7^{(i-1)}$; $H_8^{(i)} \leftarrow h \boxplus H_8^{(i-1)}$;
end for
$H^{(N)} = \bigl( H_1^{(N)} \,\|\, H_2^{(N)} \,\|\, \cdots \,\|\, H_8^{(N)} \bigr)$;

In PHASE 2, the compression function is applied to update the internal registers as in Algorithm 1, where $\mathrm{Ch}(e, f, g)$, $\mathrm{Maj}(a, b, c)$, $\mathrm{Rot}_1$ and $\mathrm{Rot}_2$ are calculated by:

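$$\mathrm{Ch}(e, f, g) = (e \wedge f) \oplus (\bar{e} \wedge g) \tag{2}$$

$$\mathrm{Maj}(a, b, c) = (a \wedge b) \oplus (a \wedge c) \oplus (b \wedge c) \tag{3}$$

$$\mathrm{Rot}_1(a) = R^{2}(a) \oplus R^{13}(a) \oplus R^{22}(a) \tag{4}$$

$$\mathrm{Rot}_2(e) = R^{6}(e) \oplus R^{11}(e) \oplus R^{25}(e) \tag{5}$$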

In PHASE 2, $K_j$ ($j = 0, \ldots, 63$) shows the 32-bit constant defined by the specification of SHA-256. The 512-bit message block $M^{(i)}$ is further partitioned into a set of 32-bit values $M_0^{(i)}, \ldots, M_{15}^{(i)}$. Then $W_j$ in PHASE 2 is given by:

$$W_j = \begin{cases} M_j^{(i)} & (j = 0, 1, \ldots, 15) \\ X_j & (j = 16, 17, \ldots, 63) \end{cases} \tag{6}$$

where $X_j$ is defined by:

$$X_j = \sigma_1(W_{j-2}) \boxplus W_{j-7} \boxplus \sigma_0(W_{j-15}) \boxplus W_{j-16}. \tag{7}$$

In the above equation, $\sigma_0(x) = R^{7}(x) \oplus R^{18}(x) \oplus S^{3}(x)$ and $\sigma_1(x) = R^{17}(x) \oplus R^{19}(x) \oplus S^{10}(x)$.

In PHASE 3, the $i$-th intermediate hash value $H^{(i)}$ for $M^{(i)}$ is finally calculated as in Algorithm 1.

The block diagram of the compression function and message schedule used in SHA-256 is shown in Fig. 2.
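Algorithm 1 and Eqs. (2)–(7) can be traced in software with the following minimal Python sketch. It is an illustration only, not the circuit used in this paper. The helper names (`first_primes`, `icbrt`, `compress`, `sha256`) are hypothetical. The constants $K_j$ and the initial value $IV$ are not listed in the text; the sketch derives them from the cube roots and square roots of the first primes, as the SHA-256 standard defines them, and checks the result against Python's `hashlib`.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF  # keeps every addition mod 2^32


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def icbrt(x):
    """Exact integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# K_j: first 32 fractional bits of the cube root of the j-th prime.
K = [icbrt(p << 96) & MASK for p in PRIMES]
# IV: first 32 fractional bits of the square roots of the first 8 primes.
IV = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]


def rotr(x, n):
    """R^n: right rotation of a 32-bit word by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """PHASE 1 to PHASE 3 of Algorithm 1 for one 512-bit block."""
    w = list(struct.unpack(">16I", block))                 # W_0..W_15, Eq. (6)
    for j in range(16, 64):                                # X_j, Eq. (7)
        s0 = rotr(w[j - 15], 7) ^ rotr(w[j - 15], 18) ^ (w[j - 15] >> 3)
        s1 = rotr(w[j - 2], 17) ^ rotr(w[j - 2], 19) ^ (w[j - 2] >> 10)
        w.append((s1 + w[j - 7] + s0 + w[j - 16]) & MASK)
    a, b, c, d, e, f, g, h = state                         # PHASE 1
    for j in range(64):                                    # PHASE 2
        ch = (e & f) ^ (~e & g)                            # Eq. (2)
        maj = (a & b) ^ (a & c) ^ (b & c)                  # Eq. (3)
        rot1 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)      # Eq. (4)
        rot2 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)      # Eq. (5)
        t1 = (h + rot2 + ch + K[j] + w[j]) & MASK
        t2 = (rot1 + maj) & MASK
        h, g, f, e = g, f, e, (d + t1) & MASK              # e..h shift by one register
        d, c, b, a = c, b, a, (t1 + t2) & MASK             # a..d shift by one register
    # PHASE 3: add the previous intermediate hash value.
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def sha256(message: bytes) -> bytes:
    """Pad the message, then apply compress() to every 512-bit block."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    state = IV
    for off in range(0, len(padded), 64):
        state = compress(state, padded[off:off + 64])
    return struct.pack(">8I", *state)


print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
```

The two tuple assignments in the PHASE 2 loop make the register shifts $a \to b \to c \to d$ and $e \to f \to g \to h$ explicit; these shifts are the property that the attack in Section 4 relies on.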

2.3 HMAC-SHA-256 [29]

In HMAC-SHA-256, the compression function given by SHA-256 is used in each of the $f$ functions in HMAC in Fig. 1 *2. As in Refs. [30], [31], [32], [33], [34], [35], HMAC-SHA-256 is used in IPsec and SSL/TLS.
( 1 ) HMAC-SHA-256 in IPsec: HMAC-SHA-256 is used in several protocols of IPsec such as Authentication Header (AH) and Encapsulating Security Payload (ESP) [30], [31], [32].
( 2 ) HMAC-SHA-256 in SSL/TLS: HMAC-SHA-256 is used in several protocols of SSL/TLS such as the Handshake Protocol and the Record Protocol [33], [34], [35].

Based on [29], we employ the HMAC-SHA-256 circuit as a target hash generator circuit as follows:
(A) The target HMAC-SHA-256 circuit has a SHA-256 circuit block inside, which is repeatedly used to calculate the compression function.
(B) The SHA-256 circuit performs PHASE 1 in one clock cycle, PHASE 2 in 64 clock cycles and PHASE 3 in one clock cycle in Algorithm 1, i.e., the SHA-256 circuit performs Algorithm 1 in 66 clock cycles.

Then the target HMAC-SHA-256 circuit completes its process in a total of $(N + 3) \times 66$ clock cycles, since the SHA-256 process is repeated $(N + 3)$ times as shown in Fig. 1.

*2 Note that, in the last compression function ($f$ function) of HMAC-SHA-256, a 256-bit intermediate hash value $H^{(N)}$ is extended to a 512-bit message block by padding ‘1’, ‘0’s, and the message size of $(K_0 \oplus opad) \,\|\, H^{(N)}$, in the same way as in the message partitioning process. Then it is input into the last compression function.

3. Assumptions

In Refs. [16], [17], it is assumed that only the registers in the target crypto circuit are included in the scan chain. However, an LSI chip usually contains a microcontroller, embedded memories and their controllers, and peripheral interfaces in addition to the target crypto circuit, and a scan chain can include all of these registers. Hence, in this paper, we assume a full scan design where all the registers in an LSI chip, including those of the target hash circuit as well as those of other circuit blocks, are connected to the scan chain.

In HMAC-SHA-256, $K_{in}$ and $K_{out}$ in Fig. 1 are considered to be the secret keys as in Refs. [12], [14], [15]. Once $K_{in}$ and $K_{out}$ are successfully restored, an attacker can replicate the HMAC-SHA-256 circuit so that it generates a correct hash value from any input message, and he/she can establish a secure communication using this copied HMAC-SHA-256 circuit. The purpose of this paper is to restore $K_{in}$ and $K_{out}$ using the scan data.

As in Refs. [23], [25], we make the following assumptions.
( 1 ) Attackers know the timing when the target circuit generates a hash value.
( 2 ) The scan chain is not secured and hence attackers can input any message to the target HMAC-SHA-256 circuit and obtain the scan data from the scan chain at any time.
( 3 ) Attackers know neither the connection order of the registers in the scan chain of the target HMAC-SHA-256 circuit nor the number and type of registers in the scan chain. Thus the scan data as it is carries no meaning for attackers.

In summary, attackers can input any message to the target circuit and obtain scan data from the scan chain while the hash function is running. However, attackers do not know which bit in the scan data corresponds to which internal register bit in the target circuit.

4. Scan-based Attack against HMAC-SHA-256

In this section, we propose a scan-based attack against an HMAC-SHA-256 circuit using the scan data obtained from its scan chain. Our algorithm is composed of the three steps below:

4.1 STEP 1: Isolate 64 Bit-transition Groups from Scan Data

When we look at the SHA-256 compression function of Algorithm 1, the value of the 32-bit register $a$ moves to the registers $b$, $c$ and $d$ in this order as in Fig. 2. The value of the 32-bit register $e$ also moves to the registers $f$, $g$ and $h$ in this order. Let $x_i$ denote the $i$-th bit of register $x$. Then $a_i$ moves to $b_i$, $c_i$ and $d_i$ ($0 \le i \le 31$) in this order. Similarly, $e_i$ moves to $f_i$, $g_i$ and $h_i$ in this order. A bit-transition group refers to the bit positions of $a_i$, $b_i$, $c_i$ and $d_i$ or those of $e_i$, $f_i$, $g_i$ and $h_i$ in the scan data obtained from the target HMAC-SHA-256 circuit. Note that these bit positions do not change from one clock cycle to another, and thus every bit-transition group is composed of four bit positions as in G1 and G2 in Fig. 3 (a) and Fig. 3 (b). We have 64 bit-transition groups in total, since every internal register is 32 bits long and a bit-transition group starts either from the register $a$ or from the register $e$.

Now we try to find out these 64 bit-transition groups in the obtained scan data. We first vertically arrange the scan data obtained from a scan chain at different clock cycles and focus on a particular vertical bit sequence as in Fig. 3 (c). In Fig. 3 (c), the scan data at Cycle $T$ to Cycle $(T + 7)$ are shown. Every vertical bit sequence shows the trace of bit changes in a particular register bit over different clock cycles, which is called a scan signature.

Assume that the scan signature of $a_i$ is given by the leftmost bold rectangle of Fig. 3 (c). Since $a_i$ is moved to $b_i$ at the next clock cycle, the $a_i$ value at Cycle $T$ must become the $b_i$ value at Cycle $(T + 1)$. In the same way, the $a_i$ value at Cycle $(T + 1)$ must become the $b_i$ value at Cycle $(T + 2)$. This means that the scan signature of $a_i$ from Cycle $T$ to Cycle $(T + \ell)$ must reappear as the scan signature of $b_i$ from Cycle $(T + 1)$ to Cycle $(T + 1 + \ell)$. In the same way, the scan signature of $b_i$ from Cycle $(T + 1)$ to Cycle $(T + 1 + \ell)$ must reappear as the scan signature of $c_i$ from Cycle $(T + 2)$ to Cycle $(T + 2 + \ell)$. If $\ell$ is large enough, or if we input many messages and obtain scan data, we can find the identical scan signatures with the length of $(\ell + 1)$ bits in the scan data whose starting clock cycles differ by one from each other.

Example 1 Figure 3 (c) shows the vertically arranged scan data obtained from the target LSI chip. These scan data must include the intermediate values in all the registers connected to the scan chain, but we do not know which bit in the scan data corresponds to which bit in the internal registers.

In Fig. 3, we can find two groups of identical 5-bit scan signatures: one is shown by the bold rectangles (bit-transition group G1) and the other is shown by the dotted rectangles (bit-transition group G2). The bit-transition group G1 includes the 1st bit, 51st bit, 512th bit, and 200th bit. The bit-transition group G2 includes the 1,021st bit, 198th bit, 356th bit, and 815th bit.

Note that we cannot know at this time which bit of registers $a$–$h$ corresponds to the bit-transition group G1 or G2.

The detailed algorithm of STEP 1 is described as follows:

STEP 1.1
    First of all, the scan data obtained from a scan chain in the HMAC-SHA-256 circuit at different clock cycles are vertically arranged as in Fig. 3 (c). Let Cycle $T$ be the clock cycle when the SHA-256 circuit starts Algorithm 1 (as in assumption (1) in Section 3, attackers know Cycle $T$ *3). Let $n$ be the bit length of the scan data ($n$ also shows the number of scan flip-flops (SFFs) in the scan chain). Let $i = 1$.

STEP 1.2
    Let $SS_i$ be a vertical $(\ell + 1)$-bit sequence at the $i$-th column in the scan data starting from Cycle $T$, where $(\ell + 1)$ shows the scan signature length.

STEP 1.3
    Find the bit-transition groups from the scan data as follows.
    1.3.1 Find the same vertical bit sequence as $SS_i$ starting from Cycle $T + 1$ in the scan data. If such a vertical bit sequence is found only in the $i_1$-th column, go to 1.3.2. Otherwise, go to STEP 1.5.
    1.3.2 Find the same vertical bit sequence as $SS_i$ starting from Cycle $T + 2$ in the scan data. If such a vertical bit sequence is found only in the $i_2$-th column, go to 1.3.3. Otherwise, go to STEP 1.5.
    1.3.3 Find the same vertical bit sequence as $SS_i$ starting from Cycle $T + 3$ in the scan data. If such a vertical bit sequence is found only in the $i_3$-th column, go to STEP 1.4. Otherwise, go to STEP 1.5.

STEP 1.4
    Let ($i$-th, $i_1$-th, $i_2$-th, $i_3$-th) be a bit-transition group. Delete the $i$-th column, $i_1$-th column, $i_2$-th column and $i_3$-th column from the scan data.

STEP 1.5
    $i = i + 1$. If $n < i$, go to STEP 1.6. Otherwise, if the $i$-th column is deleted from the scan data, go back to STEP 1.5. If the $i$-th column exists in the scan data, go to STEP 1.6.

STEP 1.6
    In case of $i \le n$, go to STEP 1.2. In case of $n < i$, if exactly 64 bit-transition groups have been obtained, STEP 1 is successfully finished. Otherwise, STEP 1 fails.

Fig. 3 Isolate 64 bit-transition groups from scan data.

*3 Even if attackers do not know the accurate timing of Cycle $T$, we expect that they can also isolate the bit-transition groups similarly using the rough timing estimation as in Ref. [36]. How to isolate them targeting an HMAC-SHA-256 circuit in a practical situation is our future work.

Now we estimate the amount of calculation in STEP 1. STEP 1.3 performs at most $3 \times (\ell + 1) \times n$ bit comparisons, and the loop from STEP 1.2 to STEP 1.6 is performed at most $n$ times. Thus, STEP 1.2–STEP 1.6 require at most $3 \times (\ell + 1) \times n \times n$ bit comparisons. Since STEP 1.1 requires just $O(1)$ time, STEP 1 requires approximately $O(n^2 \ell)$ time in total.

In STEP 1, we try to find $(\ell + 1)$-bit scan signatures in the scan data and isolate bit-transition groups. Then we can finally isolate the 64 bit-transition groups in the scan data, even though we cannot know the correspondence between bit-transition groups and internal register bits at this time. In other words, we do not identify $a, b, c, d$ and $e, f, g, h$ in STEP 1.
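The procedure of STEP 1.1–STEP 1.6 can be sketched in Python on synthetic data as follows. This is a minimal illustration under stated assumptions, not the experimental setup of this paper: the scan data is simulated with random bits rather than taken from a chip, the head bit of every chain ($a_i$ or $e_i$) is modeled as a fresh random bit per cycle, and the names `simulate_scan_data` and `isolate_groups` as well as the sizes (512 columns, 48-bit signatures) are hypothetical.

```python
import random


def simulate_scan_data(n_cols, n_chains, n_cycles, rng):
    """Return (rows, hidden_groups); rows[t][col] is the scan bit at Cycle T+t.

    Each hidden group holds four column indices whose bits form a
    one-cycle shift chain (like a_i -> b_i -> c_i -> d_i).
    """
    cols = list(range(n_cols))
    rng.shuffle(cols)  # the attacker does not know the register order
    groups = [tuple(cols[4 * k:4 * k + 4]) for k in range(n_chains)]
    rows = [[rng.getrandbits(1) for _ in range(n_cols)] for _ in range(n_cycles)]
    for t in range(1, n_cycles):
        for g in groups:
            for stage in (1, 2, 3):
                # the value moves one register down the chain every cycle
                rows[t][g[stage]] = rows[t - 1][g[stage - 1]]
    return rows, groups


def isolate_groups(rows, sig_len):
    """STEP 1: find columns whose scan signature reappears at shifts 1, 2, 3."""
    columns = list(zip(*rows))            # columns[c][t]: bit of column c at T+t
    alive = set(range(len(columns)))      # columns not yet deleted (STEP 1.4)
    groups = []
    for i in range(len(columns)):         # STEP 1.5 / 1.6: next surviving column
        if i not in alive:
            continue
        ss = columns[i][0:sig_len]        # STEP 1.2: signature SS_i from Cycle T
        chain = [i]
        for shift in (1, 2, 3):           # STEP 1.3.1 to 1.3.3
            hits = [c for c in sorted(alive)
                    if columns[c][shift:shift + sig_len] == ss]
            if len(hits) != 1:            # not found, or not unique
                break
            chain.append(hits[0])
        if len(chain) == 4:               # STEP 1.4: record and delete the group
            groups.append(tuple(chain))
            alive.difference_update(chain)
    return groups


SIG_LEN = 48                              # scan signature length (l + 1)
rng = random.Random(2018)
rows, hidden = simulate_scan_data(n_cols=512, n_chains=64,
                                  n_cycles=SIG_LEN + 3, rng=rng)
found = isolate_groups(rows, SIG_LEN)
print(len(found), sorted(found) == sorted(hidden))
```

With a signature this long, an accidental match between unrelated columns is very unlikely, which mirrors the remark above that $\ell$ must be large enough. In a real chip, registers that rarely toggle would produce non-unique signatures and be skipped by the uniqueness check.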

4.2 STEP 2: Classify 64 Bit-transition Groups into 32 Pairs

In STEP 1, the 64 bit-transition groups are isolated from the scan data. After that, in STEP 2, these 64 bit-transition groups are classified into 32 pairs $P_0, \ldots, P_{31}$ of bit-transition groups, i.e., each pair $P_i$ ($0 \le i \le 31$) has two bit-transition groups and each bit-transition group in $P_i$ corresponds to the $i$-th bit in the internal register $a$ or $e$.

Let us consider a message $M$ and assume that the first 512-bit message block from $M$ is input to the target circuit. The registers $a$ to $d$ and $e$ to $h$ are initialized at PHASE 1 of Algorithm 1 at Cycle $T$. Then the registers $a$ and $e$ are updated at Cycle $(T + 1)$ according to PHASE 2 of Algorithm 1 using the input message $M$. According to PHASE 2 of Algorithm 1, the register $a$ is updated at Cycle $(T + 1)$ by:

$$a = T_1 \boxplus T_2 \tag{8}$$

$$\phantom{a} = \bigl( h \boxplus \mathrm{Rot}_2(e) \boxplus \mathrm{Ch}(e, f, g) \boxplus K_0 \boxplus W_0 \bigr) \boxplus \bigl( \mathrm{Rot}_1(a) \boxplus \mathrm{Maj}(a, b, c) \bigr) \tag{9}$$

$$\phantom{a} = A \boxplus W_0 \tag{10}$$

where $A = h \boxplus \mathrm{Rot}_2(e) \boxplus \mathrm{Ch}(e, f, g) \boxplus K_0 \boxplus \mathrm{Rot}_1(a) \boxplus \mathrm{Maj}(a, b, c)$. $A$ is calculated from the values of $a$ to $h$ initialized at Cycle $T$ and the pre-determined constant $K_0$. $A$ is independent of the input message $M$. $W_0$ is the first 32 bits of the input message $M$ as in Eq. (6). The register $e$ is updated at Cycle $(T + 1)$ in the same way.

Now we prepare two input messages $M^1$ and $M^2$ and consider that the first 512-bit message block from $M^1$ is input to the target circuit. Assume that the registers $a$ to $d$ and $e$ to $h$ are initialized at PHASE 1 of Algorithm 1 at Cycle $T$. Then the registers $a$ and $e$ are updated at Cycle $(T + 1)$. Based on Eq. (10), this update at Cycle $(T + 1)$ can be described by:

$$a^1 = A \boxplus W_0^1 \tag{11}$$

$$e^1 = E \boxplus W_0^1 \tag{12}$$

where $a^1$ and $e^1$ are the updated values for $a$ and $e$ using the message $M^1$, respectively, and $A$ and $E$ are the constants calculated using the initial values of the registers $a$ to $d$ and $e$ to $h$. $W_0^1$ is the first 32 bits of the input message $M^1$ as in Eq. (6).

In the same way, when the first message block from the message $M^2$ is input to the target circuit, the updated values $a^2$ and $e^2$ of the registers $a$ and $e$ at Cycle $(T + 1)$ are described by:

$$a^2 = A \boxplus W_0^2 \tag{13}$$

$$e^2 = E \boxplus W_0^2 \tag{14}$$

where $W_0^2$ is the first 32 bits of the input message $M^2$. Since the initial values of the registers $a$ to $h$ here are completely the same as those of Eqs. (11) and (12), we can use the same $A$ and $E$ values in Eqs. (13) and (14).

Here, we prepare a pair of messages, $M^1$ and $M^2$, so that $W_0^2 = W_0^1 \oplus \alpha$, where the MSB *4 of $\alpha$ is one and the other bits are zero. In that case, the MSBs of $a^2$ and $e^2$ must be inverted from those of $a^1$ and $e^1$, and the other bits are completely the same. Then, by observing the bit difference between the bit-transition groups obtained by inputting $M^1$ and $M^2$, we can find the two bit-transition groups which correspond to the MSBs of the internal registers $a$ and $e$.

*4 MSB stands for Most Significant Bit and shows the leftmost bit. LSB stands for Least Significant Bit and shows the rightmost bit.

However, according to the specification of HMAC-SHA-256, its input message must be represented by a sequence of 8-bit ASCII codes, and we cannot have two ASCII codes whose difference is the MSB only, i.e., we cannot prepare $W_0^1$ and $W_0^2$ so that only their MSB is different. Table 2 shows example pairs of ASCII symbols and their bit differences. From now on, the first MSB refers to the MSB itself and the second MSB refers to the second bit from the MSB in each register.

Table 2 Examples of two ASCII symbols whose Hamming distance is one.

A pair of ASCII symbols    Bit difference
N/A                        10000000
(q,1)                      01000000
(q,Q)                      00100000
(q,a)                      00010000
(q,y)                      00001000
(q,u)                      00000100
(q,s)                      00000010
(q,p)                      00000001
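The pairs in Table 2 can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `one_bit_partners` is hypothetical, and the restriction to printable 7-bit ASCII symbols is an assumption made here for illustration.

```python
def one_bit_partners(symbol: str) -> dict:
    """Map each 8-bit difference mask to the ASCII symbol at Hamming distance one.

    The value is None when flipping that bit leaves the 7-bit ASCII range
    (or gives a non-printable code), as for the MSB in Table 2.
    """
    code = ord(symbol)
    partners = {}
    for bit in range(7, -1, -1):            # from the MSB down to the LSB
        other = code ^ (1 << bit)           # flip exactly one bit
        mask = format(1 << bit, "08b")
        is_ascii = other < 128 and chr(other).isprintable()
        partners[mask] = chr(other) if is_ascii else None
    return partners


for mask, partner in one_bit_partners("q").items():
    pair = f"(q,{partner})" if partner else "N/A"
    print(f"{pair:8s} {mask}")
```

Flipping the MSB of any 7-bit ASCII code always yields a value of 128 or more, which is why the first row of Table 2 has no valid pair.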

Now, let us consider the two ASCII symbol sequences “qqqq”and “1qqq.” Since the ASCII codes of ‘q’ and ‘1’ are 01110001and 00110001, we have

“qqqq” = 01110001011100010111000101110001,

“1qqq” = 00110001011100010111000101110001.

In order to determine the two bit-transition groups corresponding to the first MSBs of register a and register e, we give “qqqq” and “1qqq” to $W^1_0$ and $W^2_0$, for example. Then we set

$W^1_0$ = 01110001011100010111000101110001,

$W^2_0$ = 00110001011100010111000101110001.

As a result, according to Eqs. (11)–(14), the second MSBs must be inverted in $a^2$ and $e^2$ compared to $a^1$ and $e^1$, and the first MSBs may also be inverted in $a^2$ and $e^2$ if there exist carry bits from the second MSBs to the first MSBs in calculating Eqs. (11)–(14). In other words, at most the first and second MSBs are all inverted (4 bits in total) in $a^2$ and $e^2$, and at least only the two second MSBs are inverted (2 bits in total).

Example 2 Let us assume that A and $W^1_0$ are given as follows:

A = 10001110010101000011001000010000

$W^1_0$ = “qqqq” = 01110001011100010111000101110001

Then $W^2_0$ is given by:

$W^2_0 = W^1_0 \oplus$ 01000000000000000000000000000000 = 00110001011100010111000101110001 = “1qqq”

According to Eqs. (11) and (13), we have:

$a^1 = A \boxplus W^1_0$ = 11111111110001011010001110000001 (15)


$a^2 = A \boxplus W^2_0$ = 10111111110001011010001110000001 (16)

In this case, only the second MSB differs between Eqs. (15) and (16).

However, assume that we consider another pair of messages and have:

$W^{1\prime}_0$ = “rrrr” = 01110010011100100111001001110010

$W^{2\prime}_0$ = 00110010011100100111001001110010 = “2rrr”

The bit difference between $W^{1\prime}_0$ and $W^{2\prime}_0$ is the second MSB, but according to Eqs. (11) and (13) we have:

$a^{1\prime} = A \boxplus W^{1\prime}_0$ = 00000000110001101010010010000010 (17)

$a^{2\prime} = A \boxplus W^{2\prime}_0$ = 11000000110001101010010010000010 (18)

This is because we have a bit carry from the second MSB to the first MSB. In this case, both the first MSB and the second MSB differ between Eqs. (17) and (18).

As in this example, only the second MSB is different in $a^1$ and $a^2$ in some cases, and both the first MSB and the second MSB are different in $a^1$ and $a^2$ in other cases.

We also have the same situation in the internal register e as in Eqs. (12) and (14), and in total 2–4 bits are inverted in the 64 bit-transition groups when we give several pairs of messages. □
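The arithmetic of Example 2 can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that the register update is plain addition modulo 2^32 of A and the first message word, as in Eqs. (11) and (13); the helper names `add32`, `word` and `inverted_positions` are hypothetical. The words are built directly from the ASCII codes of the symbols (‘r’ is 01110010 and ‘2’ is 00110010).

```python
MASK32 = 0xFFFFFFFF


def add32(x: int, y: int) -> int:
    """Addition modulo 2**32."""
    return (x + y) & MASK32


def word(text: str) -> int:
    """Pack four ASCII symbols into a 32-bit word, first symbol at the MSB side."""
    return int.from_bytes(text.encode("ascii"), "big")


def inverted_positions(x: int, y: int) -> list:
    """Bit positions where x and y differ, counted from the MSB (0 = first MSB)."""
    diff = x ^ y
    return [i for i in range(32) if (diff >> (31 - i)) & 1]


A = 0b10001110010101000011001000010000  # value of A assumed in Example 2

a1, a2 = add32(A, word("qqqq")), add32(A, word("1qqq"))
a1p, a2p = add32(A, word("rrrr")), add32(A, word("2rrr"))

print(format(a1, "032b"), format(a2, "032b"), inverted_positions(a1, a2))
print(format(a1p, "032b"), format(a2p, "032b"), inverted_positions(a1p, a2p))
```

For the first pair only position 1 (the second MSB) is reported, while for the second pair positions 0 and 1 are both reported, because the carry out of the second MSB reaches the first MSB.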

When we prepare many such pairs of input messages and count the inverted bits in each case, we can know which bits are the first MSBs and which bits are the second MSBs of the registers a and e, since the second MSBs of the registers a and e are always inverted while the first MSBs of the registers a and e are inverted only in some cases.
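The counting argument can be illustrated by a small simulation. The sketch below is only an illustration under stated assumptions: it models the register update as addition modulo 2^32 of a fixed value A and the first message word, it uses a hypothetical A, and it draws message words from lowercase ASCII letters so that flipping the second MSB still yields a printable ASCII symbol. It is not the authors' implementation.

```python
import random
import string

MASK32 = 0xFFFFFFFF
SECOND_MSB = 0x40000000


def count_inversions(A: int, n_pairs: int, seed: int = 0) -> list:
    """Count, per bit position (0 = first MSB), how often a1 and a2 differ."""
    rng = random.Random(seed)
    counts = [0] * 32
    for _ in range(n_pairs):
        # Lowercase letters have the second MSB set; clearing it still gives
        # a printable ASCII symbol, so both words are valid ASCII messages.
        text = "".join(rng.choice(string.ascii_lowercase) for _ in range(4))
        w1 = int.from_bytes(text.encode("ascii"), "big")
        w2 = w1 ^ SECOND_MSB
        diff = ((A + w1) & MASK32) ^ ((A + w2) & MASK32)
        for i in range(32):
            counts[i] += (diff >> (31 - i)) & 1
    return counts


A = 0x8E543210  # hypothetical register value, unknown to the attacker
n_pairs = 200
counts = count_inversions(A, n_pairs)
always = [i for i, c in enumerate(counts) if c == n_pairs]
sometimes = [i for i, c in enumerate(counts) if 0 < c < n_pairs]
print("always inverted:", always, "sometimes inverted:", sometimes)
```

The position that is inverted for every pair plays the role of the second MSB, and the position that is inverted only for some pairs plays the role of the first MSB. In the real attack the same counts are taken over the bit-transition groups rather than over known register bits.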

Once we find out the two bit-transition groups which correspond to the first MSBs of the internal registers a and e as well as the two bit-transition groups which correspond to their second MSBs, we can find out the other bit correspondences from MSB to LSB in the same way. We finally have 32 pairs, $P_0, \ldots, P_{31}$, of bit-transition groups.

Note that, at this time, we cannot distinguish in each pair $P_i$ which one corresponds to the register a and which one corresponds to the register e.

4.3 STEP 3: Determine Whether Each Bit-transition Group in a Pair Corresponds to Register a or Register e

Finally, in STEP 3, we determine whether each bit-transition group in $P_i$ corresponds to the register a or the register e.

According to PHASE 2 of Algorithm 1, the register values at Cycle T and those at Cycle (T + 1) must satisfy:

$a(T+1) \boxplus d(T) = e(T+1) \boxplus \mathrm{Rot1}(a(T)) \boxplus \mathrm{Maj}(a(T), b(T), c(T))$ (19)

where x(T) shows the value of the register x at Cycle T. Since the above equation can be calculated in a bit-wise manner, we determine the bit positions of each register in the 64 bit-transition groups from LSB to MSB so that they satisfy Eq. (19). Overall, we can know the bit correspondence between the scan data and the bit positions of the internal registers.
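The relation in Eq. (19) can be sanity-checked against a software model of one SHA-256 round. The sketch below rests on two assumptions that are not confirmed by this section: that one iteration of PHASE 2 of Algorithm 1 is the standard round of FIPS 180-4 [28], and that Rot1 corresponds to the function Σ0 of that standard. All function names are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(a: int) -> int:
    # Assumed to be what the text calls Rot1.
    return rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)


def big_sigma1(e: int) -> int:
    return rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)


def ch(e: int, f: int, g: int) -> int:
    return (e & f) ^ (~e & g)


def maj(a: int, b: int, c: int) -> int:
    return (a & b) ^ (a & c) ^ (b & c)


def sha256_round(state, k: int, w: int):
    """One standard SHA-256 round on the registers (a, ..., h)."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + big_sigma1(e) + ch(e, f, g) + k + w) & MASK32
    t2 = (big_sigma0(a) + maj(a, b, c)) & MASK32
    return ((t1 + t2) & MASK32, a, b, c, (d + t1) & MASK32, e, f, g)


def eq19_holds(state, k: int, w: int) -> bool:
    """Check a(T+1) + d(T) == e(T+1) + Rot1(a(T)) + Maj(a(T), b(T), c(T)) mod 2**32."""
    a, b, c, d = state[:4]
    nxt = sha256_round(state, k, w)
    lhs = (nxt[0] + d) & MASK32
    rhs = (nxt[4] + big_sigma0(a) + maj(a, b, c)) & MASK32
    return lhs == rhs


rng = random.Random(1)
results = []
for _ in range(100):
    state = tuple(rng.getrandbits(32) for _ in range(8))
    results.append(eq19_holds(state, rng.getrandbits(32), rng.getrandbits(32)))
print(all(results))
```

Under these assumptions the identity holds for any round constant and message word, because the term shared by the updates of a and e cancels out. This is what allows candidate assignments of bit-transition groups to a or e to be tested without knowing the message schedule.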

Since the secret key $K_{in}$ is used in the first process of the SHA-256 circuit and the secret key $K_{out}$ is used in the last process of the SHA-256 circuit as shown in Fig. 1, we can easily know their input timing and thus we can restore the $K_{in}$ and $K_{out}$ values using the scan data.

5. Experimental Evaluations

In this section, we present the experimental results of restoring the secret keys $K_{in}$ and $K_{out}$ in an HMAC-SHA-256 circuit by using our proposed method. We use the HMAC-SHA-256 implementation described in Section 2.3, which is based on Ref. [29]. Our proposed method is implemented in Python on a computer with an Intel Core i5 (2.6 GHz) and 8 GB of main memory.

5.1 When the Scan Chain Includes Random Data

In this experiment, we assume that the scan flip-flop values of the circuits other than HMAC-SHA-256 are given by random values.

5.1.1 Setup

In this experiment, we confirm whether the secret keys $K_{in}$ and $K_{out}$ can be restored when the length of the scan chain is changed. The scan chain length of the internal registers a to h is 32 × 8 = 256 bits. When we change its length, we add random bits to it. These extra bits inserted randomly into the 256-bit original scan chain can be considered to be the scan data for the circuit blocks other than the target HMAC-SHA-256 circuit. In this way, we have scan chain lengths from 256 bits to 2,048 bits.
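The setup above can be pictured with the following minimal Python sketch. It is an illustration only: the function names, the seeds, and the choice to keep the 256 register bits in their original relative order are assumptions, and the register bits themselves are random placeholders rather than real HMAC-SHA-256 register values.

```python
import random


def make_layout(chain_length: int, target_bits: int = 256, seed: int = 0) -> list:
    """Choose which scan-chain positions hold the target register bits."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(chain_length), target_bits))


def scan_out(register_bits: list, layout: list, chain_length: int, rng) -> list:
    """Build one scan vector: target bits at their positions, random bits elsewhere."""
    scan = [rng.randint(0, 1) for _ in range(chain_length)]
    for position, bit in zip(layout, register_bits):
        scan[position] = bit
    return scan


rng = random.Random(1)
registers = [rng.randint(0, 1) for _ in range(32 * 8)]  # placeholder for registers a to h
layout = make_layout(chain_length=2048)
scan = scan_out(registers, layout, 2048, rng)
print(len(scan), [scan[p] for p in layout] == registers)
```

The layout is fixed once per circuit, while the random filler bits change at every scan-out, which mimics unrelated circuit blocks sharing the scan chain.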

The 128-bit original secret key K in the target HMAC-SHA-256 circuit is given as a random value. We have prepared random input messages whose length ranges from 32 bits to 512 bits. For every scan-chain length, we have performed 50 trials and measured the CPU times and the number of input messages required to successfully restore the secret keys $K_{in}$ and $K_{out}$. Note that we have made a single 512-bit message block (i.e., N = 1) from every input message and thus obtained the scan data through (N + 3) × 66 = 4 × 66 = 264 clock cycles.

5.1.2 Results

In all the trials, our proposed method successfully restores the secret keys $K_{in}$ and $K_{out}$, and the CPU times are summarized in Fig. 4. In Fig. 4, the maximum, average, and minimum CPU times over 50 trials required to restore the secret keys are plotted for every scan-chain length. The number of messages required is summarized in Fig. 5. In our experiment, we have prepared 41 to 425 messages as in Fig. 5, and the scan signature length in STEP 1 was fixed to 61, which is the maximum value since we have 64 iterations in PHASE 2 of Algorithm 1. As Fig. 4 demonstrates, it takes around 7.5 hours to restore the secret keys in an HMAC-SHA-256 circuit, even if the scan-chain length becomes 2,048 bits.

The experimental results here demonstrate that a scan-based attack against HMAC-SHA-256 is feasible and is therefore a real threat.

Fig. 4 The CPU time to restore the secret keys $K_{in}$ and $K_{out}$.

Fig. 5 The number of required messages to restore the secret keys $K_{in}$ and $K_{out}$.

5.2 Scan-based Attack against a Practical Circuit

As an experiment against a practical circuit, we pick up an IPsec circuit including an HMAC-SHA-256 circuit block and an AES encryption circuit block as in Ref. [37]. *5 The block diagram of the IPsec circuit is shown in Fig. 6. The HMAC-SHA-256 circuit block performs the HMAC-SHA-256 process in (N + 3) × 66 = (1 + 3) × 66 = 264 clock cycles when the number N of message blocks is one. We use the 128-bit AES encryption circuit block proposed in Ref. [39]. It performs the AES encryption process in 19 clock cycles. Note that the plaintext is input and the ciphertext is output 32 bits at a time.

5.2.1 Setup

Our HMAC-SHA-256 circuit block includes the following registers:
(1) 8 × 32-bit registers for storing intermediate data a–h as in Fig. 2,
(2) 16 × 32-bit registers for $M^i_j$ (j = 0, 1, . . . , 15) in Eq. (6),
(3) 8 × 32-bit registers for calculating intermediate hash values,
(4) 256-bit register for buffering a message,
(5) 256-bit register for storing the secret key $K_{in}$,
(6) 512-bit register for storing the secret key $K_{out}$ and
(7) 19-bit register for iteration counters.
We have 2,067 bits of registers in total in this circuit block, but (5) the 256-bit register for storing the secret key $K_{in}$ and (6) the 512-bit register for storing the secret key $K_{out}$ are excluded from the scan chain since they explicitly hold the target secret keys.

*5 In Ref. [37], the IPsec circuit using an HMAC circuit block and an AES encryption circuit block is proposed, where HMAC-SHA-1 is used as HMAC. However, according to Ref. [38], SHA-1 has a vulnerability and SHA-2 should be used instead. Thus, in our additional experiment, we use HMAC-SHA-256 as one of the practical implementations of IPsec.

Fig. 6 The block diagram of IPsec circuit [37].


Fig. 7 IPsec circuit timing chart.

The AES encryption circuit block includes the following registers:
(1) 128-bit register for key expansion,
(2) 128-bit register for storing intermediate data,
(3) 2 × 32-bit registers for buffering the input plaintext and output ciphertext,
(4) 128-bit register for storing a resultant ciphertext and
(5) 4 × 4-bit registers for a load counter.
We have 464 bits of registers in total in this circuit block. Overall, the scan chain of our target IPsec circuit has a length of {2067 − (256 + 512)} + 464 = 1763 bits.
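The register counts above can be tallied with a short Python sketch. The dictionary keys are descriptive labels chosen here for illustration, and the helper `hmac_cycles` simply evaluates the (N + 3) × 66 cycle count stated for the HMAC-SHA-256 circuit block.

```python
# Register widths (in bits) of the HMAC-SHA-256 circuit block, items (1)-(7).
hmac_registers = {
    "intermediate data a-h": 8 * 32,
    "message words": 16 * 32,
    "intermediate hash values": 8 * 32,
    "message buffer": 256,
    "secret key K_in": 256,
    "secret key K_out": 512,
    "iteration counters": 19,
}

# Register widths (in bits) of the AES encryption circuit block, items (1)-(5).
aes_registers = {
    "key expansion": 128,
    "intermediate data": 128,
    "plaintext/ciphertext buffers": 2 * 32,
    "resultant ciphertext": 128,
    "load counter": 4 * 4,
}

# The two key registers are excluded from the scan chain.
excluded = hmac_registers["secret key K_in"] + hmac_registers["secret key K_out"]
hmac_total = sum(hmac_registers.values())
aes_total = sum(aes_registers.values())
chain_length = hmac_total - excluded + aes_total


def hmac_cycles(n_blocks: int) -> int:
    """Clock cycles of the HMAC-SHA-256 process for n_blocks message blocks."""
    return (n_blocks + 3) * 66


print(hmac_total, aes_total, chain_length, hmac_cycles(1))
```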

The HMAC-SHA-256 process and the AES encryption process in the IPsec circuit run in a pipelined manner as depicted in Fig. 7 when the ESP process of IPsec is running. We have obtained the scan data from the IPsec circuit during the 264 shaded clock cycles in Fig. 7, giving various messages to the HMAC-SHA-256 circuit block according to our proposed method. Note that this circuit can also be applied to the AH process of IPsec. In that case, only the HMAC-SHA-256 circuit block is used and messages are directly input into it, and thus we believe that giving various messages to the HMAC-SHA-256 circuit block is a reasonable assumption.

Similar to Section 5.1.1, the 128-bit original secret key K in the target HMAC-SHA-256 circuit is given as a random value. We have prepared random messages whose length ranges from 32 bits to 416 bits and input them into the HMAC-SHA-256 circuit block. We have also prepared 128-bit random plaintexts and a 128-bit random AES key and input them into the AES encryption circuit block. We have performed 10 trials and measured the CPU times to successfully restore the secret keys $K_{in}$ and $K_{out}$.

5.2.2 Results

In all the trials, our proposed method successfully restores the secret keys $K_{in}$ and $K_{out}$. The average CPU time over 10 trials required to restore the secret keys $K_{in}$ and $K_{out}$ is 5,190 seconds. In our experiment, we have prepared 77 messages and the scan signature length in STEP 1 was also fixed to 61. Table 3 summarizes the CPU time in 10 trials.

The results show that our proposed method can be successfully applied to a practical circuit including an HMAC-SHA-256 circuit.

Table 3 The CPU time in 10 trials.

| Trials | CPU time [s] |
|---|---|
| 1 | 5,604 |
| 2 | 5,131 |
| 3 | 5,163 |
| 4 | 5,128 |
| 5 | 5,158 |
| 6 | 5,161 |
| 7 | 5,140 |
| 8 | 5,130 |
| 9 | 5,138 |
| 10 | 5,143 |
| Average | 5,190 |

5.3 Applying Our Method to SHA-2 Family

SHA-256 is one of the hash functions defined by the SHA-2 family, which has six hash functions in total: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224 and SHA-512/256 [27], [28]. Now we consider applying our method to the members of the SHA-2 family other than SHA-256.

Firstly, SHA-512 is the hash function based on 64-bit word computation. The initial value, $K_j$, and the number of rounds in SHA-512 are different from those of SHA-256, but their algorithm structure is the same. Therefore, we may also restore the secret key from the HMAC-SHA-512 circuit in a similar way using our proposed scan-based attack. SHA-224 and SHA-384 are the simple truncated versions of SHA-256 and SHA-512, respectively, with only the initial values being different. Both SHA-512/224 and SHA-512/256 are also simple truncated versions of SHA-512. We may also apply our proposed scan-based attack to these circuits and successfully restore their secret keys.

Overall, we expect that our proposed scan-based attack can be applied to all the hash functions in the SHA-2 family.

6. Discussions

In this section, we discuss the extensibility and limitations of the proposed method when applied to other implementations of HMAC-SHA-256 circuits.

6.1 Scan-based Attack against Other SHA-256/HMAC-SHA-256 Circuit Implementation

The HMAC-SHA-256 circuit targeted by the proposed scan-based attack method satisfies the conditions (A) and (B) in Section 2.3, and it is one example of HMAC-SHA-256 implementations. This implementation is based on Ref. [29], which is the simplest and has a small area for the HMAC-SHA-256 circuit. This architecture is also targeted in Ref. [11], which proposes an electromagnetic side-channel attack method against the HMAC-SHA-256 circuit.

We focus on the simplest implementation based on Ref. [29] as the first step to propose a scan-based side-channel attack method against HMAC-SHA-256 circuits. However, there are several other implementations of the HMAC-SHA-256 circuit. Below, we separately discuss other implementations of the SHA-256 circuit and other implementations of the HMAC-SHA-256 circuit.

6.1.1 Scan-based Attack against Other SHA-256 Circuit Implementations

As far as we know, the basic iterative SHA-256 circuit [29], [40], two-unrolled SHA-256 circuit [41], [42], pipelining SHA-256 circuit [43], [44], two-unrolled pipelining SHA-256 circuit [45], four-unrolled pipelining SHA-256 circuit [45], and compact SHA-256 circuit [46] have been proposed as SHA-256 circuit implementations.

We pick up each of the implementations above and discuss the possibility of applying the proposed scan-based side-channel attack against each implementation.

Basic iterative SHA-256 circuit [29], [40]
The basic iterative implementation of the SHA-256 circuit is the one described in Section 2.2 and thus it satisfies the condition (B) in Section 2.3. Clearly, the proposed scan-based attack against HMAC-SHA-256 can be applied to this circuit implementation.

Two-unrolled SHA-256 circuit [41], [42]
The two-unrolled implementation of the SHA-256 circuit processes PHASE 2 of Algorithm 1 in 32 clock cycles, not in 64 clock cycles. In this implementation, the data transition occurs in one clock cycle from the register a to register c, the register b to register d, the register e to register g, and the register f to register h in Fig. 2. Therefore, by finding such bit transitions using the scan data obtained from the HMAC-SHA-256 circuit including this SHA-256 circuit implementation, we expect that we can find bit-transition groups similar to those obtained in STEP 1 of the proposed method. When we can insert arbitrary messages into this SHA-256 circuit, we expect that STEP 2 and STEP 3 in the proposed method can be successfully applied to this circuit implementation.

Pipelining SHA-256 circuit [43], [44]
The pipelining implementation of the SHA-256 circuit processes PHASE 2 of Algorithm 1 in a pipelined manner. In this implementation, the processes on the left side (a partial circuit composed of the registers a, b, c and d) and the right side (a partial circuit composed of the registers e, f, g and h) in Fig. 2 constitute the pipeline stages. In this case, we expect that the bit-transition groups can also be found in STEP 1 of the proposed method, since the 32-bit register a moves to register b, the register b moves to register c, the register c moves to register d, the register e moves to register f, the register f moves to register g, and the register g moves to register h in one clock cycle as in the basic iterative SHA-256 circuit. When we can insert arbitrary messages into this SHA-256 circuit, we expect that STEP 2 and STEP 3 in the proposed method can be successfully applied to this circuit implementation.

Two-unrolled pipelining SHA-256 circuit [45]
The two-unrolled pipelining implementation of the SHA-256 circuit runs two-unrolled SHA-256 circuits in a pipelined manner. By combining the above discussions of the two-unrolled SHA-256 circuit and pipelining SHA-256 circuit implementations, we expect that our scan-based attack can also be applied to this circuit implementation.

Four-unrolled pipelining SHA-256 circuit [45]
The four-unrolled pipelining implementation of the SHA-256 circuit processes PHASE 2 of Algorithm 1 in 16 clock cycles, not in 64 clock cycles, and then runs it in a two-stage pipelined manner. In this case, the value of one register does not move directly to another register, unlike the basic iterative implementation and the two-unrolled implementation. Therefore, we cannot find the bit-transition groups using STEP 1 of the proposed method. We consider that we cannot apply our scan-based attack directly to this circuit implementation.

Compact SHA-256 circuit [46]
The compact implementation of the SHA-256 circuit runs one iteration of PHASE 2 of Algorithm 1 in several clock cycles, not in one clock cycle. In this case, if the timing at which one register moves to another register during one iteration can be specified, we expect that the bit-transition groups can also be found using the scan data obtained from the HMAC-SHA-256 circuit including this SHA-256 circuit implementation. When we can insert arbitrary messages into this SHA-256 circuit, we expect that STEP 2 and STEP 3 in the proposed method can be successfully applied to this circuit implementation.
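The register movement described for the two-unrolled implementation can be illustrated with a software model. The sketch below assumes that one clock cycle of a two-unrolled circuit is equivalent to two consecutive standard SHA-256 rounds as defined in FIPS 180-4 [28]; this is a modeling assumption, not a description of the circuits in Refs. [41], [42], and all names are hypothetical. The round constants and message words are random placeholders, since the register movement does not depend on them.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def sha256_round(state, k: int, w: int):
    """One standard SHA-256 round on the registers (a, ..., h)."""
    a, b, c, d, e, f, g, h = state
    s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    t1 = (h + s1 + ((e & f) ^ (~e & g)) + k + w) & MASK32
    t2 = (s0 + ((a & b) ^ (a & c) ^ (b & c))) & MASK32
    return ((t1 + t2) & MASK32, a, b, c, (d + t1) & MASK32, e, f, g)


def two_unrolled_clock(state, k0: int, w0: int, k1: int, w1: int):
    """Model one clock cycle of a two-unrolled circuit as two rounds."""
    return sha256_round(sha256_round(state, k0, w0), k1, w1)


rng = random.Random(2)
state = tuple(rng.getrandbits(32) for _ in range(8))
k0, w0, k1, w1 = (rng.getrandbits(32) for _ in range(4))
nxt = two_unrolled_clock(state, k0, w0, k1, w1)

a, b, c, d, e, f, g, h = state
# a -> c, b -> d, e -> g and f -> h within one modeled clock cycle.
moved = (nxt[2] == a, nxt[3] == b, nxt[6] == e, nxt[7] == f)
print(moved)
```

In this model, four of the eight registers still receive an unmodified copy of another register every clock cycle, which is the kind of regularity that STEP 1 looks for in the scan signatures.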

As discussed above, the proposed method depends on the SHA-256 circuit implementation, but it is expected that the proposed scan-based attack method can be efficiently applied to many of the other SHA-256 implementations proposed so far. However, it is necessary to carry out simulation and implementation experiments to see how much CPU time and scan data are actually needed. These are important future works.

6.1.2 Scan-based Attack against Other HMAC-SHA-256 Circuit Implementations

The HMAC-SHA-256 circuit implementation targeted by our scan-based attack method is based on Ref. [29]. This implementation is the simplest and has a small area compared to Ref. [42]. However, there are other HMAC-SHA-256 circuit implementations that utilize two SHA-256 circuit blocks inside [42]. Now, let $SHA_a$ and $SHA_b$ be the two SHA-256 circuit blocks in the HMAC-SHA-256 circuit.

When the HMAC-SHA-256 circuit starts its process, both the circuit blocks $SHA_a$ and $SHA_b$ will run. When we arrange vertically the scan data obtained from the HMAC-SHA-256 circuit, we may find out 64 or more bit-transition groups using STEP 1, since we have the two SHA-256 circuit blocks in this circuit.

If messages can be inserted independently into the circuit blocks $SHA_a$ and $SHA_b$, the bit-transition groups of the circuit $SHA_a$ and the bit-transition groups of the circuit $SHA_b$ can be dealt with independently. In this sense, we expect that we can apply STEP 2 and STEP 3 to this circuit implementation.

However, we can consider the case where messages cannot be inserted independently into $SHA_a$ and $SHA_b$. For example, Ref. [42] proposes an HMAC-SHA-256 circuit implementation of this kind. In this circuit implementation, when a message M is inserted into $SHA_a$, only the message M′ which is dependent on M is inserted into $SHA_b$. Although the bit-transition groups may be found from the scan data obtained from this HMAC-SHA-256 circuit implementation by using STEP 1, we cannot classify them into 32 pairs of bit-transition groups even by using STEP 2, since we cannot insert independent messages $M_a$ and $M_b$ into $SHA_a$ and $SHA_b$, respectively. We consider that we cannot apply our scan-based attack directly to this circuit implementation.

Thus, the proposed scan-based attack depends on the implementation of the HMAC-SHA-256 circuit. Developing the scan-based attack against a wider range of HMAC-SHA-256 circuits is one of the important future works.


Fig. 8 Scan chain with no inverters (a) and with inverters (b).

6.2 Inverters in the Scan Chain

Let $C_x$ be the circuit including the target HMAC-SHA-256 circuit of the proposed scan-based attack. As discussed in Section 3, we assume that $C_x$ has a single scan chain connecting all the registers in $C_x$, and hence all the SFFs in $C_x$, without any inverters. All the SFFs connected to the scan chain in $C_x$ compose a shift register and output scan data.

Let n be the scan chain length. Let $\mathrm{SFF}_i$ (1 ≤ i ≤ n) be the i-th SFF of the scan chain. Now, we insert an odd number of inverters between $\mathrm{SFF}_i$ and $\mathrm{SFF}_{i+1}$ where 1 ≤ i ≤ n − 1. Let $C_y$ be such a circuit. Let $\mathit{scan}_x$ and $\mathit{scan}_y$ be the n-bit scan data obtained from the circuit $C_x$ and the circuit $C_y$, respectively, at the same clock cycle, when the same message is inserted into $C_x$ and $C_y$ and they run the same process.

As shown below, the proposed method can also be applied to the circuit $C_y$ including inverters in the scan chain.

Comparing $\mathit{scan}_x$ and $\mathit{scan}_y$, the (i + 1)-th bit to the n-th bit of $\mathit{scan}_x$ and $\mathit{scan}_y$ are exactly the same, but the 1st bit to the i-th bit are always inverted, as in Fig. 8, since the position of the inverters connected between $\mathrm{SFF}_i$ and $\mathrm{SFF}_{i+1}$ does not change. This always holds regardless of the clock cycle at which we obtain the scan data from $C_x$ and $C_y$.

As in Fig. 3 (c), we arrange vertically the scan data obtained from $C_x$ at different clock cycles. We also arrange vertically the scan data obtained from $C_y$ at different clock cycles. Every vertical bit sequence shows a bit change trace in a particular SFF at different clock cycles, which is called a scan signature. Let $SS_{x,j}$ and $SS_{y,j}$ be the j-th bit scan signatures obtained from $C_x$ and $C_y$, respectively, when we focus on the j-th bit in the scan data. If i < j, $SS_{y,j}$ must be the same as $SS_{x,j}$, i.e., $SS_{y,j} = SS_{x,j}$. If j ≤ i, $SS_{y,j}$ must be the inversion of $SS_{x,j}$, i.e., $SS_{y,j} = \overline{SS_{x,j}}$.

In STEP 1, by just considering both the original scan signature and its inverted scan signature above, we can successfully find out the bit-transition groups, even if a target circuit includes inverters in its scan chain.
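The effect of the inverters on scan signatures can be modeled in a few lines. The following Python sketch uses random placeholder scan data and hypothetical helper names; it only illustrates the relation between the signatures of the two circuits and the idea of comparing signatures up to inversion, not the full STEP 1 procedure.

```python
import random


def insert_inverters(scan_rows: list, i: int) -> list:
    """Model C_y: the 1st to i-th bits of every scan vector are observed inverted."""
    return [[1 - bit if pos < i else bit for pos, bit in enumerate(row)]
            for row in scan_rows]


def scan_signature(scan_rows: list, j: int) -> tuple:
    """Vertical bit sequence of the j-th scan bit (1-indexed) over all clock cycles."""
    return tuple(row[j - 1] for row in scan_rows)


def invert(signature: tuple) -> tuple:
    return tuple(1 - bit for bit in signature)


def matches_up_to_inversion(s: tuple, t: tuple) -> bool:
    """True if the two signatures are identical or exact bit-wise inverses."""
    return s == t or s == invert(t)


rng = random.Random(3)
n, cycles, i = 16, 20, 5  # placeholder chain length, cycle count, inverter position
scan_x = [[rng.randint(0, 1) for _ in range(n)] for _ in range(cycles)]
scan_y = insert_inverters(scan_x, i)

for j in range(1, n + 1):
    ss_x, ss_y = scan_signature(scan_x, j), scan_signature(scan_y, j)
    relation = "inverted" if ss_y == invert(ss_x) else "same"
    print(j, relation, matches_up_to_inversion(ss_x, ss_y))
```

Because a signature and its inversion are treated as equivalent, the comparison gives the same answer for both circuits, at the cost of checking each candidate signature twice.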

6.3 The Limitation of the Proposed Method from the Viewpoint of Registers in Circuits Other than HMAC-SHA-256

Now we consider a circuit which includes two HMAC-SHA-256 circuit blocks (Section 6.3.1) and a circuit including shift registers other than the HMAC-SHA-256 circuit block (Section 6.3.2). Then, we try to apply our proposed method to each of them.

6.3.1 Scan-based Attack against the Circuit with Two HMAC-SHA-256 Circuit Blocks in the Scan Chain

Let $hmac_a$ and $hmac_b$ be the two HMAC-SHA-256 circuit blocks in the target circuit. $hmac_a$ and $hmac_b$ are connected in the same scan chain. Let $G_a$ be the set of 64 bit-transition groups from $hmac_a$ and $G_b$ be the set of 64 bit-transition groups from $hmac_b$.

In STEP 1, by giving enough messages, we can find out 128 bit-transition groups from the scan data in the target circuit, each of which belongs to either $G_a$ or $G_b$.

In STEP 2, the bit-transition groups are classified into 32 pairs for $hmac_a$ and 32 pairs for $hmac_b$ by inserting messages into the target circuit. We assume that messages can be inserted independently into $hmac_a$ and $hmac_b$. In this case, the message inserted into $hmac_b$ is fixed and various messages are inserted into $hmac_a$. That is, STEP 2 can be applied only to $hmac_a$. In the same way, STEP 2 can be applied only to $hmac_b$. The 128 bit-transition groups obtained in STEP 1 can thus be classified into 32 pairs for $hmac_a$ and 32 pairs for $hmac_b$.

In STEP 3, we determine whether each bit-transition group corresponds to the register a or the register e. Overall, we can know the bit correspondence between the scan data and the bit positions of the internal registers.

In this case, we consider that the proposed scan-based attack can be applied if messages can be inserted independently into the two HMAC-SHA-256 circuit blocks $hmac_a$ and $hmac_b$.

6.3.2 Scan-based Attack against the Circuit Including Shift Registers Other than the HMAC-SHA-256 Circuit Block

In STEP 1, we may find out 64 or more bit-transition groups in this case, since the bit transitions depend on both the internal registers of the SHA-256 circuit block and the shift registers in the circuit.

In STEP 2, we try to find out 32 pairs of bit-transition groups from the bit-transition groups obtained in STEP 1. We also assume that messages can be inserted into the HMAC-SHA-256 circuit block independently of the shift registers. In this case, we can find out the inverted bit positions from the scan data according to the various input messages. STEP 2 can be applied to the circuit with shift registers in this case, and then our proposed method can be successfully applied to it.

As discussed above, if each HMAC-SHA-256 circuit block can be controlled independently of other circuit blocks, we believe that our proposed method can be applied to an entire circuit. However, if the HMAC-SHA-256 circuit block cannot be controlled independently, it can be very difficult to apply our proposed method to an entire circuit. How to solve this problem is another future work.

6.4 When Optimizations to Implement the HMAC-SHA-256 Algorithm in Hardware Are Done in HMAC-SHA-256 Circuit

First of all, to restore the secret keys of a certain circuit including the HMAC-SHA-256 circuit, STEP 1, STEP 2 and STEP 3 of the proposed method need to be applied to it.

In STEP 1, 64 bit-transition groups are found from the scan data obtained from the HMAC-SHA-256 circuit. In Algorithm 1, one 32-bit internal register moves to another cycle by cycle. Based on this property, we can obtain 64 bit-transition groups by using a scan signature in the scan data.


When we can identify the register transitions from the scan data using the register transition pattern, as in the case of the two-unrolled SHA-256 circuit (Section 6.1.1), we believe that the bit-transition groups can be isolated, even if some optimizations are implemented in the HMAC-SHA-256 circuit.

As discussed in Section 6.1.1, the key point here is that we can know the register transition pattern even if several hardware optimizations are applied.

Once the bit-transition groups in the target HMAC-SHA-256 circuit are found, we can similarly apply STEP 2 and STEP 3 to it and thus restore the secret keys of the target circuit. Note that we always assume that any messages can be inserted into the HMAC-SHA-256 circuit as described in Section 3.

7. Conclusions

In this paper, we have proposed a scan-based attack method against an HMAC-SHA-256 circuit using scan signatures. Our proposed method can restore the secret information by finding out the correspondence between the scan data obtained from a scan chain and the internal registers in the target HMAC-SHA-256 circuit, even if the scan chain includes registers other than the internal registers of the target circuit. Experimental results show that our proposed method successfully restores two secret keys of the HMAC-SHA-256 circuit using up to 425 input messages in 7.5 hours.

In our proposed method, we assume that all the internal registers in the HMAC-SHA-256 circuit as well as other circuits are connected in a scan chain and attackers can obtain scan data directly from the scan chain. In the case that a partial scan chain is employed in the target HMAC-SHA-256 circuit, where not all the registers are connected in a scan chain, our method may fail to restore the secret information. This is because we cannot restore the internal register values from the partial scan data. How to compensate for the missing values in internal registers is one of the major challenges in our future works.

Acknowledgments This research and development work was supported in part by the MIC/SCOPE #171503005.

References

[1] Das, A., Rolt, D.J., Ghosh, S., Seys, S., Dupuis, S., Natale, D.G.,Flottes, M., Rouzeyre, B. and Verbauwhede, I.: Secure JTAG Im-plementation Using Schnorr Protocol, Journal of Electronic Testing,Vol.29, No.2, pp.193–209 (2013).

[2] Novak, F. and Biasizzo, A.: Security extension for IEEE Std 1149.1,Journal of Electronic Testing., Vol.22, No.3, pp.301–303 (2006).

[3] Chiu, G.-M. and Li, J.C.-M.: A secure test wrapper design againstinternal and boundary scan attacks for embedded cores, IEEE Trans.on Very Large Scale Integr. (VLSI) Syst., Vol.20, No.1, pp.126–134(2012).

[4] DeBusschere, E. and McCambridge, M.: Modern Game Console Ex-ploitation, tech. rep., Department of Computer Science, University ofArizona (2012).

[5] Rolt, D.J., Das, A., Natale, D.G, Flottes, M., Rouzeyre, B. andVerbauwhede, I.: Test Versus Security: Past and Present, IEEE Trans.Emerging Topics in Computing, Vol.2, No.1, pp.50–62 (2014).

[6] Ebrard, E., Allard, B., Candelier, P. and Waltz, P.: Review of fuse andantifuse solutions for advanced standard CMOS technologies, Micro-electronics Journal, Vol.40, No.12, pp.1755–1765 (2009).

[7] Kocher, C.P.: Timing attacks on implementations of Diffie-Hellman,RSA, DSS, and other systems, Lecture Notes in Computer Science,Vol.1109, pp.104–113 (1996).

[8] Biham, E. and Shamir, A.: Differential fault analysis of secret keycryptosystems, Lecture Notes in Computer Science, Vol.1294, pp.513–525 (1997).

[9] Boneh, D., DeMillo, A.R. and Lipton, J.R.: On the importance ofchecking cryptographic protocols for faults, Lecture Notes in Com-puter Science, Vol.1233, pp.37–51 (1997).

[10] Tsunoo, Y., Tsujihara, E., Minematsu, K. and Miyauchi, H.: Crypt-analysis of block ciphers implemented on computers with cache, Lec-ture Notes in Computer Science, Vol.2779, pp.62–76 (2003).

[11] Gebotys, C.H., White, B.A. and Mateos, E.: Preaveraging and carrypropagate approaches to side-channel analysis of HMAC-SHA256,ACM Trans. Embedded Computing Systems, Vol.15, No.1, pp.4:1–4:19 (2016).

[12] Belaıd, S., Bettale, L., Dottax, E., Genelle, L. and Rondepierre, F.:Differential power analysis of HMAC SHA-2 in the hamming weightmodel, Proc. International Conference on Security and Cryptography,pp.230–241 (2013).

[13] Kocher, P., Jaffe, J. and Jun, B.: Differential power analysis, Proc.CRYPTO ‘99, pp.388–397, Springer-Verlag (1999).

[14] McEvoy, R., Tunstall, M., Murphy, C.C. and Marnane, W.P.: Differen-tial power analysis of HMAC based on SHA-2, and countermeasures,Lecture Notes in Computer Science, Vol.4867, pp.317–332 (2007).

[15] Okeya, K.: Side channel attacks against HMACs based on blockcipher based hash functions, Lecture Notes in Computer Science,Vol.4058, pp.432–443 (2006).

[16] Yang, B., Wu, K. and Karri, R.: Scan based side channel attack on ded-icated hardware implementations of Data Encryption Standard, Proc.International Test Conference, pp.339–344 (2004).

[17] Yang, B., Wu, K. and Karri, R.: Secure scan: A design-for-test archi-tecture for crypto chips, IEEE Trans. Computer-Aided Design of Inte-grated Circuits and Systems, Vol.25, No.10, pp.2287–2293 (2006).

[18] Nara, R., Togawa, N., Yanagisawa, M. and Ohtsuki, T.: A scan-based attack based on discriminators for AES cryptosystems, IEICETrans. Fundamentals of Electronics, Communications and ComputerSciences, Vol.E92-A. No.12, pp.3229–3237 (2009).

[19] Rolt, D.J., Natale, D.G., Flottes, M. and Rouzeyre, B.: A novel dif-ferential scan attack on advanced DFT structures, ACM Trans. De-sign Automation of Electronic Systems, Vol.18, No.4, pp.58:1–58:22(2013).

[20] Ali, S.S., Saeed, S., Sinanoglu, O. and Karri, R.: Novel test-mode-only scan attack and countermeasure for compression-based scan ar-chitebures, IEEE Trans. Computer-Aided Design of Integrated Cir-cuits and Systems, Vol.34, No.5, pp.808–821 (2015).

[21] Nara, R., Satoh, K., Yanagisawa, M., Ohtsuki, T. and Togawa, N.: Scan-based side-channel attack against RSA cryptosystems using scan signatures, IEICE Trans. Fundamentals of Electronics, Communications and Computer Sciences, Vol.E93-A, No.12, pp.2481–2489 (2010).

[22] Nara, R., Yanagisawa, M., Ohtsuki, T. and Togawa, N.: Scan vulnerability in elliptic curve cryptosystems, IPSJ Trans. System LSI Design Methodology, Vol.4, pp.47–59 (2011).

[23] Agrawal, M., Karmakar, S., Saha, D. and Mukhopadhyay, D.: Scan based side channel attacks on stream ciphers and their countermeasures, Lecture Notes in Computer Science, Vol.5365, pp.226–238 (2008).

[24] Liu, Y., Wu, K. and Karri, R.: Scan-based attacks on linear feedback shift register based stream ciphers, ACM Trans. on Design Automation of Electronic Systems, Vol.16, No.2, pp.20:1–20:15 (2011).

[25] Fujishiro, M., Yanagisawa, M. and Togawa, N.: Scan-based attack against Trivium stream cipher using scan signatures, IEICE Trans. Fundamentals of Electronics, Communications and Computer Sciences, Vol.E97-A, No.7, pp.1444–1451 (2014).

[26] FIPS 198-1: The Keyed-Hash Message Authentication Code (online), available from 〈http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.198-1.pdf〉 (accessed 2017-08-15).

[27] Descriptions of SHA-256, SHA-384, and SHA-512 (online), available from 〈http://www.iwar.org.uk/comsec/resources/cipher/sha256-384-512.pdf〉 (accessed 2017-06-01).

[28] FIPS 180-4: Secure Hash Standard (online), available from 〈http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf〉 (accessed 2017-06-01).

[29] Juliato, M. and Gebotys, C.: FPGA implementation of an HMAC processor based on the SHA-2 family of hash functions, University of Waterloo, Tech. Rep. (2011).

[30] Dierks, T. and Rescorla, E.: Request for Comments 5246: The Transport Layer Security (TLS) protocol version 1.2 (online), available from 〈https://tools.ietf.org/html/rfc5246〉 (accessed 2017-06-01).

[31] Uskov, A. and Hayk, A.: The efficiency of block ciphers in Galois/counter mode in IPsec-based virtual private networks, Proc. IEEE International Conference on Electro/Information Technology, pp.173–178 (2014).

[32] Niedermayer, H., Klenk, A. and Carle, G.: The networking perspective of security performance - a measurement study, Proc. 13th GI/ITG Conference Measuring, Modelling and Evaluation of Computer and Communication Systems, pp.1–17 (2006).

[33] Kelly, S. and Frankel, S.: Request for Comments 4868: Using HMAC-SHA-256, HMAC-SHA-384, and HMAC-SHA-512 with IPsec (online), available from 〈https://tools.ietf.org/html/rfc4868〉 (accessed 2017-06-01).

[34] Bhargavan, K. and Leurent, G.: Transcript collision attacks: Breaking authentication in TLS, IKE, and SSH, Proc. Network and Distributed System Security Symposium (2016).

[35] Al Fardan, N.J. and Paterson, K.G.: Lucky thirteen: Breaking the TLS and DTLS record protocols, Proc. IEEE Symposium on Security and Privacy, pp.526–540 (2013).

[36] Oku, D., Yanagisawa, M. and Togawa, N.: Scan-based side-channel attack against HMAC-SHA-256 circuits based on identifying initial positions (in Japanese), IPSJ SIG Technical Reports, Vol.2017-ARC-225, No.22, pp.1–6 (2017).

[37] McLoone, M. and McCanny, J.V.: A single-chip IPSec cryptographic processor, Proc. IEEE Workshop on Signal Processing Systems, pp.133–138 (2002).

[38] Wang, X., Yin, Y.L. and Yu, H.: Finding collisions in the full SHA-1, Lecture Notes in Computer Science, Vol.3621, pp.27–36 (2005).

[39] McLoone, M. and McCanny, J.V.: Generic architecture and semiconductor intellectual property cores for advanced encryption standard cryptography, IEE Proceedings-Computers and Digital Techniques, Vol.150, No.4, pp.239–244 (2003).

[40] Algredo-Badillo, I., Feregrino-Uribe, C., Cumplido, R. and Morales-Sandoval, M.: Novel hardware architecture for implementing the inner loop of the SHA-2 algorithms, Proc. 14th Euromicro Conference on Digital System Design, pp.543–549 (2011).

[41] Zeghid, M., Bouallegue, B., Machhout, M., Baganne, A. and Tourki, R.: Architectural design features of a programmable high throughput reconfigurable SHA-2 processor, Journal of Information Assurance and Security, Vol.2, pp.147–158 (2008).

[42] Michail, H.E., Athanasiou, G.S., Kelefouras, V., Theodoridis, G. and Goutis, C.E.: On the exploitation of a high-throughput SHA-256 FPGA design for HMAC, ACM Trans. Reconfigurable Technology and Systems, Vol.5, No.1, pp.2:1–2:28 (2012).

[43] Dadda, L., Macchetti, M. and Owen, J.: The design of a high speed ASIC unit for the hash function SHA-256 (384, 512), Proc. Design, Automation and Test in Europe Conference and Exhibition, Vol.3, pp.70–75 (2004).

[44] Rote, M.D., Vijendran, N. and Selvakumar, D.: High performance SHA-2 core using the round pipelined technique, Proc. IEEE International Conference on Electronics, Computing and Communication Technologies, pp.1–6 (2015).

[45] McEvoy, R.P., Crowe, F.M., Murphy, C.C. and Marnane, W.P.: Optimisation of the SHA-2 family of hash functions on FPGAs, Proc. IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures, pp.317–322 (2006).

[46] Kim, M., Ryou, J. and Jun, S.: Efficient hardware architecture of SHA-256 algorithm for trusted mobile computing, Lecture Notes in Computer Science, Vol.5487, pp.240–252 (2008).

Daisuke Oku received the B.Eng. and M.Eng. degrees from Waseda University in 2016 and 2017, respectively, both in Computer Science and Communications Engineering. He is presently working towards the D.Eng. degree there. His research interests are LSI design and cryptography architecture.

Masao Yanagisawa received his B.Eng., M.Eng., and Dr.Eng. degrees from Waseda University in 1981, 1983, and 1986, respectively, all in electrical engineering. He was with the University of California, Berkeley from 1986 through 1987. In 1987, he joined Takushoku University. In 1991, he left Takushoku University and joined Waseda University, where he is presently a Professor in the Department of Computer Science and Communications Engineering. His research interests are combinatorics and graph theory, computational geometry, VLSI design and verification, and network analysis and design. He is a fellow of IEICE and a member of IEEE and ACM.

Nozomu Togawa received his B.Eng., M.Eng., and Dr.Eng. degrees from Waseda University in 1992, 1994, and 1997, respectively, all in electrical engineering. He is presently a Professor in the Department of Computer Science and Communications Engineering, Waseda University. His research interests are VLSI design, graph theory, and computational geometry. He is a member of IEEE, ACM and IEICE.

(Recommended by Associate Editor: Takeshi Matsumoto)
