COMP3221 lec13-decision-I.1 Saeid Nooshabadi
COMP 3221
Microprocessors and Embedded Systems
Lectures 4/5: Making Decisions in C/Assembly Language - I
http://www.cse.unsw.edu.au/~cs3221
March 2004
Modified from notes by Saeid Nooshabadi

C Decisions (Control Flow): if statements
° 2 kinds of if statements in C
  • if (condition) statement
  • if (condition) statement1 else statement2
° Following code is the same as the 2nd if:
      if (condition) goto L1;
      statement2;
      goto L2;
  L1: statement1;
  L2:
  • Not as elegant as if-else, but same meaning
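The equivalence above can be checked with a small C sketch. The function names max_if_else and max_goto are made up for this illustration; both compute the larger of two integers, one with if-else and one with the goto form from the slide.

```c
/* Structured form: if (condition) statement1 else statement2 */
int max_if_else(int a, int b)
{
    int r;
    if (a > b)
        r = a;          /* statement1 */
    else
        r = b;          /* statement2 */
    return r;
}

/* Same decision written with goto, in the order used on the slide:
   the else part comes first, the then part sits at label L1. */
int max_goto(int a, int b)
{
    int r;
    if (a > b) goto L1;
    r = b;              /* statement2 */
    goto L2;
L1: r = a;              /* statement1 */
L2: return r;
}
```

The goto version has the same shape as the ARM code on the following slides: a conditional branch to the true part, the false part inline, and an unconditional branch over the true part.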

ARM decision instructions (control flow) (#1/2)
° Decision instruction in ARM:
      cmp register1, register2   ; compare register1 with register2
      beq L1                     ; branch to L1 if equal
  • beq is “Branch if (registers are) equal”
  • Same meaning as C: if (register1 == register2) goto L1
° Complementary ARM decision instruction:
      cmp register1, register2
      bne L1
  • bne is “Branch if (registers are) not equal”
  • Same meaning as C: if (register1 != register2) goto L1
° Called conditional branches

ARM decision instructions (control flow) (#2/2)
° Decision instruction in ARM:
      cmp register1, #immediate   ; compare register1 with immediate number
      beq L1                      ; branch to L1 if equal
  • beq is “Branch if (register and immediate are) equal”
  • Same meaning as C: if (register1 == immediate) goto L1
° Complementary ARM decision instruction:
      cmp register1, #immediate
      bne L1
  • bne is “Branch if (register and immediate are) not equal”
  • Same meaning as C: if (register1 != immediate) goto L1
° Called conditional branches

Compiling C if into ARM Assembly
° Compile by hand:
      if (i == j) f=g+h; else f=g-h;
  Mapping f: v1, g: v2, h: v3, i: v4, j: v5
° Start with branch:
      cmp v4, v5
      beq L-true        ; branch to L-true if i==j
° Follow with false part:
      sub v1,v2,v3      ; f=g-h
[Flowchart: test "i == j?"; the true path (i == j) does f=g+h, the false path (i != j) does f=g-h; both paths meet at exit.]

Compiling C if into ARM Assembly
° Need an instruction that always transfers control, to skip over the true part
  • ARM has branch: b label   ; goto “label”
      sub v1,v2,v3      ; f=g-h
      b L-exit
° Next is the true part:
  L-true: add v1,v2,v3  ; f=g+h
° Followed by the exit branch label:
  L-exit:

Compiling C if into ARM: Summary
° Compile by hand:
      if (i == j) f=g+h; else f=g-h;
  Mapping f: v1, g: v2, h: v3, i: v4, j: v5
          cmp v4,v5
          beq L-true        ; branch if i==j
          sub v1,v2,v3      ; (false) f=g-h
          b L-exit          ; go to exit
  L-true: add v1,v2,v3      ; (true) f=g+h
  L-exit:
° Note: the compiler supplies labels for branches not found in HLL code; often it flips the condition so that it branches to the false part
[Flowchart, C next to ARM: test "i == j?"; true (i == j) does f=g+h, false (i != j) does f=g-h.]

Motoring with Microprocessors
° Thanks to the magic of microprocessors and embedded systems, our cars are becoming safer, more efficient, and entertaining.
° The average middle-class household includes over 40 embedded processors. About half are in the garage. Cars make a great vehicle for deploying embedded processors in huge numbers. Cars provide a ready source of power, ventilation, and mounting space and sell in terrific quantities.
° How many embedded processors does your car have?
° If you've got a late-model luxury sedan, two or three processors might be obvious in the GPS navigation system or the automatic distance control. Yet you'd still be off by a factor of 25 or 50. The current 7-Series BMW and S-class Mercedes boast about 100 processors apiece. A relatively low-profile Volvo still has 50 to 60 baby processors on board. Even a boring low-cost econobox has a few dozen different microprocessors in it.
° Your transportation appliance probably has more chips than your Internet appliance.
° New cars now frequently carry 200 pounds of electronics and more than a mile of wiring.
http://www.embedded.com/

COMP3221 Reading Materials (Week #5)
° Week #5: Steve Furber: ARM System-on-Chip Architecture; 2nd Ed, Addison-Wesley, 2000, ISBN: 0-201-67519-6. We use chapters 3 and 5
° ARM Architecture Reference Manual – on CD-ROM
° A copy of the article by Cohen, D. “On holy wars and a plea for peace (data transmission).” Computer, vol.14, (no.10), Oct. 1981. p.48-54, is placed on the class website.

Loops in C/Assembly
° Simple loop in C:
  Loop: g = g + A[i];
        i = i + j;
        if (i != h) goto Loop;
° (g,h,i,j: v2,v3,v4,v5; base of A[]: v6)
° 1st fetch A[i]:
  Loop: ldr a1,[v6, v4, lsl #2]   ; (v6+v4*4) = addr of A[i]
                                  ; a1 = A[i]

Simple Loop (cont)
° Add value of A[i] to g and then j to i:
      g = g + a1;
      i = i + j;
° (g,h,i,j: v2,v3,v4,v5):
      add v2,v2,a1      ; g = g+A[i]
      add v4,v4,v5      ; i = i + j
° The final instructions branch back to Loop if i != h:
      cmp v4,v3
      bne Loop          ; goto Loop if i!=h

Loops in C/Assembly: Summary
° C:
  Loop: g = g + A[i];
        i = i + j;
        if (i != h) goto Loop;
° ARM (g,h,i,j: v2,v3,v4,v5; base of A[]: v6):
  Loop: ldr a1,[v6, v4, lsl #2]   ; (v6+v4*4) = addr of A[i]
                                  ; a1 = A[i]
        add v2,v2,a1              ; g = g+A[i]
        add v4,v4,v5              ; i = i + j
        cmp v4,v3
        bne Loop                  ; goto Loop if i!=h
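As a rough check of the loop above, here is the same loop as compilable C, once with if/goto and once as a do-while. The function names and the idea of passing g, h, i, j as parameters are assumptions made for this sketch. Like the ARM code, the body runs at least once and the test comes at the bottom.

```c
/* The slide's loop, kept in if/goto form. */
int sum_goto(const int A[], int g, int h, int i, int j)
{
Loop:
    g = g + A[i];
    i = i + j;
    if (i != h) goto Loop;
    return g;
}

/* The same loop in structured form: test at the bottom. */
int sum_do_while(const int A[], int g, int h, int i, int j)
{
    do {
        g = g + A[i];
        i = i + j;
    } while (i != h);
    return g;
}
```

Note that the exit test is i != h, so the caller must choose i, j and h such that i eventually equals h exactly; otherwise the loop runs past the end of the array.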

while in C/Assembly:
° Although legal C, loops are almost never written with if and goto: use while or for loops
° Syntax: while (condition) statement
      while (save[i] == k)
          i = i + j;
° 1st load save[i] into a scratch register (i,j,k: v4,v5,v6; base of save[]: v7):
  Loop: ldr a1,[v7,v4,lsl #2]   ; v7+v4*4 = addr of save[i]
                                ; a1 = save[i]

While in C/Assembly (cont)
° Loop test: exit if save[i] != k (i,j,k: v4,v5,v6; base of save[]: v7)
      cmp a1,v6
      bne Exit          ; goto Exit if save[i]!=k
° The next instruction adds j to i:
      add v4,v4,v5      ; i = i + j
° End of loop branches back to the while test at top of loop. Add the Exit label after:
        b Loop          ; goto Loop
  Exit:

While in C/Assembly: Summary
° C:
      while (save[i]==k)
          i = i + j;
° ARM (i,j,k: v4,v5,v6; base of save[]: v7):
  Loop: ldr a1,[v7,v4,lsl #2]   ; v7+v4*4 = addr of save[i]
                                ; a1 = save[i]
        cmp a1,v6
        bne Exit                ; goto Exit if save[i]!=k
        add v4,v4,v5            ; i = i + j
        b Loop                  ; goto Loop
  Exit:

Beyond equality tests in ARM Assembly (#1/2)
° So far ==, !=. What about < or >?
      cmp register1, register2
      blt L1
  • blt is “Branch if (register1 < register2)”
  • Same meaning as C: if (register1 < register2) goto L1
° Complementary ARM decision instruction:
      cmp register1, register2
      bge L1
  • bge is “Branch if (register1 >= register2)”
  • Same meaning as C: if (register1 >= register2) goto L1

Beyond equality tests in ARM Assembly (#2/2)
° Also:
      cmp register1, #immediate
      blt L1
  • blt is “Branch if (register1 < immediate)”
  • Same meaning as C: if (register1 < immediate) goto L1
° Complementary ARM decision instruction:
      cmp register1, #immediate
      bge L1
  • bge is “Branch if (register1 >= immediate)”
  • Same meaning as C: if (register1 >= immediate) goto L1

If less_than in C/Assembly
° C:
      if (g < h) { ... }
° ARM (g: v1, h: v2):
          cmp v1,v2         ; compare v1 with v2 (g with h)
          blt Less          ; if (g < h)
          b noLess          ; if (g >= h)
  Less:   ...
  noLess:
° Alternative code:
          cmp v1,v2         ; compare v1 with v2 (g with h)
          bge noLess        ; if (g >= h) skip the body
          ...               ; if (g < h)
  noLess:

Some Branch Conditions
° b    Unconditional
° bal  Branch Always
° beq  Branch Equal
° bne  Branch Not Equal
° blt  Branch Less Than
° ble  Branch Less Than or Equal
° bgt  Branch Greater Than
° bge  Branch Greater Than or Equal
° Full table: page 64 of Steve Furber: ARM System-on-Chip Architecture; 2nd Ed, Addison-Wesley, 2000, ISBN: 0-201-67519-6.

What about unsigned numbers?
° Conditional branch instructions blt, ble, bgt, etc. assume signed operands (defined as int in C). The equivalent instructions for unsigned operands (defined as unsigned in C) are:
° blo  Branch Lower (unsigned)
° bls  Branch Lower or Same (unsigned)
° bhi  Branch Higher (unsigned)
° bhs  Branch Higher or Same (unsigned)
° v1 = FFFF FFFAhex, v2 = 0000 FFFAhex
° What is the result of each of these?
      cmp v1, v2        cmp v1, v2
      bgt L1            bhi L1

Signed vs Unsigned Comparison
° v1 = FFFF FFFAhex, v2 = 0000 FFFAhex
° v1 < v2 (signed interpretation: v1 = -6, v2 = 65530)
° v1 > v2 (unsigned interpretation: v1 = 4294967290, v2 = 65530)
° What is the result of each of these?
      cmp v1, v2        cmp v1, v2
      bgt L1            bhi L1
      …                 …
  L1:               L1:
  Branch NOT taken  Branch taken
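The same experiment can be tried in C, where the choice between bgt and bhi corresponds to the choice between int and unsigned operands. The helper names below are hypothetical, and the cast to int32_t assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* Each helper returns 1 if the named branch would be taken
   after "cmp a, b". */

/* bgt: compares the operands as signed 32-bit values. */
int bgt_taken(uint32_t a, uint32_t b)
{
    return (int32_t)a > (int32_t)b;
}

/* bhi: compares the same bit patterns as unsigned values. */
int bhi_taken(uint32_t a, uint32_t b)
{
    return a > b;
}
```

With a = 0xFFFFFFFA and b = 0x0000FFFA, bgt_taken returns 0 and bhi_taken returns 1, matching the two columns on the slide.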

Branches: PC-relative addressing
° Recall register r15 in the machine, also called PC
° It points to the currently executing instruction
° Most instructions add 4 to it (pc increments by 4 after execution of most instructions)
° A branch changes it to a specific value
° A branch adds to it:
  • a 24-bit signed value (contained in the instruction)
  • shifted left by 2 bits
  • giving a range of –32MB to +32MB
° Labels => addresses
[Figure: memory from address 0 to FFF..., registers r0 to r14 and r15 = pc; "beq address" and "b address" each carry a 24-bit offset that reaches –32MB to +32MB from pc.]
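A sketch in C of how a branch target follows from the 24-bit field. It assumes the classic ARM-state rule that the pc value a branch adds to is the branch's own address plus 8, a pipeline detail the slide simplifies away. The function name branch_target is made up for this illustration.

```c
#include <stdint.h>

/* Compute the target of a branch located at branch_addr whose
   instruction word carries the 24-bit offset field imm24. */
uint32_t branch_target(uint32_t branch_addr, uint32_t imm24)
{
    uint32_t off = imm24 & 0x00FFFFFFu;   /* keep the 24-bit field */
    if (off & 0x00800000u)                /* bit 23 set: negative offset */
        off |= 0xFF000000u;               /* sign-extend to 32 bits */
    /* Shift left by 2 (word offset to byte offset) and add to pc,
       assumed here to read as branch_addr + 8. Wraps modulo 2^32. */
    return branch_addr + 8u + (off << 2);
}
```

For example, a branch at 0x8000 with field 0xFFFFFE (that is, -2 words) targets 0x8000 itself, and the largest positive field 0x7FFFFF reaches just under 32MB ahead.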

C case/switch statement
° Choose among four alternatives depending on whether k has the value 0, 1, 2, or 3
      switch (k) {
          case 0: f=i+j; break; /* k=0 */
          case 1: f=g+h; break; /* k=1 */
          case 2: f=g-h; break; /* k=2 */
          case 3: f=i-j; break; /* k=3 */
      }

Case/switch via chained if-else, C
° Could be done like a chain of if-else:
      if (k==0) f=i+j;
      else if (k==1) f=g+h;
      else if (k==2) f=g-h;
      else if (k==3) f=i-j;

Case/switch via chained if-else, C/Asm.
° Could be done like a chain of if-else:
      if (k==0) f=i+j;
      else if (k==1) f=g+h;
      else if (k==2) f=g-h;
      else if (k==3) f=i-j;
° ARM (f,i,j,g,h,k: v1,v2,v3,v4,v5,v6):
        cmp v6,#0
        bne L1            ; branch k!=0
        add v1,v2,v3      ; k=0 so f=i+j
        b Exit            ; end of case
  L1:   cmp v6,#1
        bne L2            ; branch k!=1
        add v1,v4,v5      ; k=1 so f=g+h
        b Exit            ; end of case
  L2:   cmp v6,#2
        bne L3            ; branch k!=2
        sub v1,v4,v5      ; k=2 so f=g-h
        b Exit            ; end of case
  L3:   cmp v6,#3
        bne Exit          ; branch k!=3
        sub v1,v2,v3      ; k=3 so f=i-j
  Exit:

Case/Switch via Jump Address Table
° Notice that the last case must wait for n-1 tests before executing, making it slow
° Alternative tries to go to all cases equally fast: jump address table
  • Idea: encode alternatives as a table of addresses of the cases
    - Table is an array of words with addresses corresponding to case labels
  • Program indexes into table and jumps
° ARM instruction “ldr pc, [ ]” loads a word from memory into pc, so it unconditionally branches to the loaded address (for example, it changes PC to the address of L1)

Idea for Case using Jump Table
  • check within range
  • get address of target clause from target array
  • jump to target address
[Figure: a jump address table whose entries point to the code for each clause; each clause ends with a jump to end_of_case; a default clause is also shown.]

Case/Switch via Jump Address Table (#1/3)
° Use k to index a jump address table, and then jump via the value loaded
° 1st test that k matches one of the cases (0<=k<=3); if not, the code exits
  (k: v6; base address of JumpTable: v7)
      cmp v6, #0        ; test if k < 0
      blt Exit          ; if k<0, goto Exit
      cmp v6, #3        ; test if k > 3
      bgt Exit          ; if k>3, goto Exit

Case/Switch via Jump Address Table (#2/3)
° Assume 4 sequential words (4 bytes each) in memory, with base address in v7, hold the addresses corresponding to labels L0, L1, L2, L3.
° Now use 4*k (k: v6) to index the table of words and load the clause address from the table (address of label L0, L1, L2 or L3) into register pc.
      ldr pc, [v7, v6, lsl #2]   ; pc = JumpTable[k], loaded from address v7 + (v6*4)
  (Register Indexed Addressing)
  • PC will contain the address of the clause and execution starts from there.
[Figure: the jump table in memory holding the addresses of L0, L1, L2 and L3; v7 points to the base of the table and v6*4 is the byte offset of the selected entry.]

Case/Switch via Jump Address Table (#3/3)
° Cases jumped to by ldr pc, [v7, v6, lsl #2]:
  L0:   add v1,v2,v3      ; k=0 so f=i+j
        b Exit            ; end of case
  L1:   add v1,v4,v5      ; k=1 so f=g+h
        b Exit            ; end of case
  L2:   sub v1,v4,v5      ; k=2 so f=g-h
        b Exit            ; end of case
  L3:   sub v1,v2,v3      ; k=3 so f=i-j
  Exit:

Jump Address Table: Summary
(k,f,i,j,g,h: v6,v1,v2,v3,v4,v5; base address of JumpTable: v7)
        cmp v6, #0                 ; test if k < 0
        blt Exit                   ; if k<0, goto Exit
        cmp v6, #3                 ; test if k > 3
        bgt Exit                   ; if k>3, goto Exit
        ldr pc, [v7, v6, lsl #2]   ; pc = JumpTable[k], loaded from v7 + (v6*4)
  L0:   add v1,v2,v3               ; k=0 so f=i+j
        b Exit                     ; end of case
  L1:   add v1,v4,v5               ; k=1 so f=g+h
        b Exit                     ; end of case
  L2:   sub v1,v4,v5               ; k=2 so f=g-h
        b Exit                     ; end of case
  L3:   sub v1,v2,v3               ; k=3 so f=i-j
  Exit:
More examples of jump tables are on the CD-ROM
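The jump table idea can also be sketched in portable C. Standard C cannot store label addresses in an array, so this sketch swaps in a table of function pointers, one function per clause. All names here (clause_fn, case0 to case3, jump_table, switch_via_table) are invented for the illustration.

```c
/* One function per clause; each returns the new value of f. */
typedef int (*clause_fn)(int i, int j, int g, int h);

static int case0(int i, int j, int g, int h) { (void)g; (void)h; return i + j; }
static int case1(int i, int j, int g, int h) { (void)i; (void)j; return g + h; }
static int case2(int i, int j, int g, int h) { (void)i; (void)j; return g - h; }
static int case3(int i, int j, int g, int h) { (void)g; (void)h; return i - j; }

/* The table: an array of addresses indexed by k. */
static const clause_fn jump_table[4] = { case0, case1, case2, case3 };

int switch_via_table(int k, int f, int i, int j, int g, int h)
{
    if (k < 0 || k > 3)     /* range check, like the cmp/blt and cmp/bgt pairs */
        return f;           /* no case matches: f is left unchanged */
    /* One indexed load and one indirect jump, whatever the value of k. */
    return jump_table[k](i, j, g, h);
}
```

Every case now costs the same two range tests plus one table lookup, instead of up to four compare-and-branch pairs in the chained if-else version.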

If there is time, do it yourself:
° Compile this C code into ARM:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3

(If time allows) Do it yourself:
° C:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3
° ARM:
        mov v1, #0
        mov v2, #0
  Loop: ldr a1,[v3,v2,lsl #2]      ; a1 = A[i]
        add v1, v1, a1             ; sum = sum+A[i]
        add v2, v2, #1             ; increment i
        cmp v2, #10                ; compare i with 10
        bne Loop                   ; goto Loop while i != 10

Cmp Instructions and CPSR Flags (#1/3)
      cmp r1, r2        ; update flags after r1 – r2
° CPSR: flags N, Z, C, V are bits 31, 30, 29, 28; the patterns below are written in the order NZCV
  x1xx = Z set (equal) (eq)
  x0xx = Z clear (not equal) (ne)
  xx1x = C set (unsigned higher or same) (hs/cs)
  xx0x = C clear (unsigned lower) (lo/cc)
  x01x = C set and Z clear (unsigned higher) (hi)
  xx0x OR x1xx = C clear or Z set (unsigned lower or same) (ls)
  1xxx = N set (negative) (mi)
  0xxx = N clear (positive or zero) (pl)

Cmp Instructions and CPSR Flags (#2/3)
      cmp r1, r2        ; update flags after r1 – r2
° CPSR flag patterns in the order NZCV:
  xxx1 = V set (signed overflow) (vs)
  xxx0 = V clear (no signed overflow) (vc)
  1xx1 = N set and V set (signed greater or equal) (ge)
  0xx0 = N clear and V clear (signed greater or equal) (ge)
  1xx0 = N set and V clear (signed less than) (lt)
  0xx1 = N clear and V set (signed less than) (lt)
  10x1 = Z clear, and N set and V set (signed greater) (gt)
  00x0 = Z clear, and N clear and V clear (signed greater) (gt)
° In short:
  • N = V: signed greater or equal (ge)
  • N ≠ V: signed less than (lt)
  • Z clear AND (N = V): signed greater than (gt)

cmp Instructions and CPSR Flags (#3/3)
      cmp r1, r2        ; update flags after r1 – r2
° CPSR flag patterns in the order NZCV:
  x1xx = Z set (equal) (eq)
  1xx0 = N set and V clear (signed less) (le)
  0xx1 = N clear and V set (signed less) (le)
° In short:
  • Z set OR (N ≠ V): signed less or equal (le)

Example
° Assuming r1 = 0x7fffffff and r2 = 0xffffffff, what would the CPSR flags be after the instruction
      cmp r1, r2
° Answer:
  r1 = 2147483647
  r2 = 4294967295 (unsigned) or -1 (signed)
  r1 - r2 = 0x7fff ffff - 0xffff ffff
          = 0x7fff ffff + (- 0xffff ffff)
          = 0x7fff ffff + 0x0000 0001     (2’s complement of 0xffff ffff)
          = 0x8000 0000                   (carry into msb ≠ carry out of msb; carry out = 0)
  N = 1, Z = 0, C = 0, V = 1
  • C clear (unsigned lower) (lo/cc): 2147483647 < 4294967295
  • Z clear AND (N = V) (signed greater than) (gt): 2147483647 > -1
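The worked example can be reproduced with a small C model of cmp. This is only a sketch: the struct nzcv and the function cmp_flags are names invented here, and the model covers the flag rules for subtraction only.

```c
#include <stdint.h>

struct nzcv { int n, z, c, v; };

/* Flags produced by "cmp a, b", i.e. after computing a - b. */
struct nzcv cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;                 /* wraps modulo 2^32 */
    struct nzcv f;
    f.n = (int)(r >> 31);               /* N: bit 31 of the result */
    f.z = (r == 0);                     /* Z: result is zero */
    f.c = (a >= b);                     /* C: set when there is no borrow */
    /* V: operands have different signs and the result's sign
       differs from the sign of a. */
    f.v = (int)(((a ^ b) & (a ^ r)) >> 31);
    return f;
}
```

Calling cmp_flags(0x7fffffff, 0xffffffff) gives N=1, Z=0, C=0, V=1, the same flags as derived by hand above.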

Branch Instruction Conditional Execution Field
      cmp r1, r2
      B{<cond>} xyz
° Branch instruction format: bits 31-28 = COND, bits 27-25 = 101, bit 24 = L, bits 23-0 = 24-bit signed offset
  0000 = EQ - Z set (equal)
  0001 = NE - Z clear (not equal)
  0010 = HS / CS - C set (unsigned higher or same)
  0011 = LO / CC - C clear (unsigned lower)
  0100 = MI - N set (negative)
  0101 = PL - N clear (positive or zero)
  0110 = VS - V set (overflow)
  0111 = VC - V clear (no overflow)
  1000 = HI - C set and Z clear (unsigned higher)
  1001 = LS - C clear or Z set (unsigned lower or same)
  1010 = GE - N set and V set, or N clear and V clear (signed > or =)
  1011 = LT - N set and V clear, or N clear and V set (signed <)
  1100 = GT - Z clear, and either N set and V set, or N clear and V clear (signed >)
  1101 = LE - Z set, or N set and V clear, or N clear and V set (signed < or =)
  1110 = AL - always
  1111 = NV - reserved
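The table of condition codes translates directly into a C switch. The function cond_passed below is a hypothetical model, not part of any ARM toolchain; it assumes each flag argument is 0 or 1.

```c
/* Returns 1 if an instruction with the given 4-bit condition field
   would execute under flags n, z, c, v (each 0 or 1). */
int cond_passed(unsigned cond, int n, int z, int c, int v)
{
    switch (cond & 0xFu) {
    case 0x0: return z;                 /* EQ */
    case 0x1: return !z;                /* NE */
    case 0x2: return c;                 /* HS / CS */
    case 0x3: return !c;                /* LO / CC */
    case 0x4: return n;                 /* MI */
    case 0x5: return !n;                /* PL */
    case 0x6: return v;                 /* VS */
    case 0x7: return !v;                /* VC */
    case 0x8: return c && !z;           /* HI */
    case 0x9: return !c || z;           /* LS */
    case 0xA: return n == v;            /* GE */
    case 0xB: return n != v;            /* LT */
    case 0xC: return !z && n == v;      /* GT */
    case 0xD: return z || n != v;       /* LE */
    case 0xE: return 1;                 /* AL */
    default:  return 0;                 /* NV: reserved, modelled as never */
    }
}
```

With the flags from the earlier cmp example (N=1, Z=0, C=0, V=1), GT (1100) and LO (0011) pass while HI (1000) and LT (1011) fail.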

Flag Setting Instructions
° Compare
      cmp r1, r2        ; update flags after r1 – r2
° Compare Negated
      cmn r1, r2        ; update flags after r1 + r2
° Test
      tst r1, r2        ; update flags after r1 AND r2
° Test Equivalence
      teq r1, r2        ; update flags after r1 EOR r2
° These instructions DO NOT save results; they only UPDATE the CPSR flags

Flag Setting Instructions Example
° Assuming r1 = 0x7fffffff and r2 = 0xffffffff, what would the CPSR flags be after each of these instructions?
      cmp r1, r2, lsl #2
      cmn r1, r2, lsl #2
      tst r1, r2, lsr #1
      teq r1, r2, lsr #1

Flag Setting Instructions Example Solution (#1/4)
      cmp r1, r2, lsl #2
° Answer:
  r1 = 0x7fffffff = 2147483647
  r2 lsl #2 = 0xfffffffc = 4294967292 (unsigned) = -4 (signed)
  r1 - (r2 lsl #2) = 0x7fff ffff + 0x0000 0004     (2’s complement of 0xffff fffc)
                   = 0x8000 0003                   (carry into msb ≠ carry out of msb; carry out = 0)
  N = 1, Z = 0, C = 0, V = 1
  • C clear (unsigned lower) (lo/cc): 2147483647 < 4294967292
  • Z clear AND (N = V) (signed greater than) (gt): 2147483647 > -4

Flag Setting Instructions Example Solution (#2/4)
      cmn r1, r2, lsl #2
° Answer:
  r1 = 0x7fffffff = 2147483647
  r2 lsl #2 = 0xfffffffc = 4294967292 (unsigned) = -4 (signed)
  r1 + (r2 lsl #2) = r1 – (–(r2 lsl #2))   (compares r1 with –(r2 lsl #2))
                   = 0x7fff ffff + 0xffff fffc
                   = 0x7fff fffb          (carry into msb = carry out of msb; carry out = 1)
  N = 0, Z = 0, C = 1, V = 0
  • C set (unsigned higher or same) (hs/cs)
  • Z clear AND (N = V) (signed greater than) (gt): 2147483647 > -(-4) = 4
  • C set here really means there was an unsigned addition overflow

Flag Setting Instructions Example Solution (#3/4)
      tst r1, r2, lsr #1
° Answer:
  r1 = 0x7fffffff
  r2 lsr #1 = 0x7fffffff   (lsr shifts 0 into bit 31; the last bit shifted out, here 1, goes to C)
  r1 AND (r2 lsr #1) = 0x7fff ffff AND 0x7fff ffff = 0x7fff ffff
  N = 0, Z = 0, C = 1, V = 0 (tst leaves V unchanged; it is taken to be clear beforehand)
  • N clear (positive or zero) (pl)
  • Z clear (not equal) (ne). It really means that ANDing r1 with r2 lsr #1 does not make all bits 0, i.e. r1 AND (r2 lsr #1) ≠ 0

Flag Setting Instructions Example Solution (#4/4)
      teq r1, r2, lsr #1
° Answer:
  r1 = 0x7fffffff
  r2 lsr #1 = 0x7fffffff   (lsr shifts 0 into bit 31; the last bit shifted out, here 1, goes to C)
  r1 EOR (r2 lsr #1) = 0x7fff ffff EOR 0x7fff ffff = 0x0000 0000
  N = 0, Z = 1, C = 1, V = 0 (teq leaves V unchanged; it is taken to be clear beforehand)
  • N clear (positive or zero) (pl)
  • Z set (equal). It really means r1 = r2 lsr #1
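The tst and teq answers can be checked with a short C model. It is a sketch under two assumptions: the shift amount is between 1 and 31, and C is taken from the last bit shifted out by lsr, as in the solutions above. The names struct nzc, tst_lsr_flags and teq_lsr_flags are invented for this illustration; V is left out because these instructions do not change it.

```c
#include <stdint.h>

struct nzc { int n, z, c; };

/* Flags after "tst rn, rm, lsr #shift", for 1 <= shift <= 31. */
struct nzc tst_lsr_flags(uint32_t rn, uint32_t rm, unsigned shift)
{
    uint32_t op2 = rm >> shift;             /* logical shift right */
    uint32_t r = rn & op2;                  /* result is discarded, flags kept */
    struct nzc f;
    f.n = (int)(r >> 31);
    f.z = (r == 0);
    f.c = (int)((rm >> (shift - 1)) & 1u);  /* last bit shifted out */
    return f;
}

/* Flags after "teq rn, rm, lsr #shift", for 1 <= shift <= 31. */
struct nzc teq_lsr_flags(uint32_t rn, uint32_t rm, unsigned shift)
{
    uint32_t op2 = rm >> shift;
    uint32_t r = rn ^ op2;                  /* zero exactly when rn == op2 */
    struct nzc f;
    f.n = (int)(r >> 31);
    f.z = (r == 0);
    f.c = (int)((rm >> (shift - 1)) & 1u);
    return f;
}
```

For r1 = 0x7fffffff, r2 = 0xffffffff and a shift of 1, the first function gives N=0, Z=0, C=1 and the second gives N=0, Z=1, C=1.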

Updating CPSR Flags with Data Processing Instructions
° By default, data processing operations do not affect the condition flags
      add r0,r1,r2      ; r0 = r1 + r2
                        ; ... and DO NOT set flags
° To cause the condition flags to be updated, the “S” bit of the instruction needs to be set by postfixing the instruction with an “S”.
  • For example, to add two numbers and set the condition flags:
      adds r0,r1,r2     ; r0 = r1 + r2
                        ; ... and DO set flags

Updating Flags Example
° Compile this C code into ARM:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3

Updating Flags Example Solution Ver 1
° C:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3
° ARM:
        mov v1, #0
        mov v2, #0
  Loop: ldr a1,[v3,v2,lsl #2]      ; a1 = A[i]
        add v1, v1, a1             ; sum = sum+A[i]
        add v2, v2, #1             ; increment i
        cmp v2, #10                ; compare i with 10
        bne Loop                   ; goto Loop while i != 10

Updating Flags Example Solution Ver 2
° C:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3
° ARM (counting down from 9 to 0):
        mov v1, #0
        mov v2, #9                 ; start with i = 9
  Loop: ldr a1,[v3,v2,lsl #2]      ; a1 = A[i]
        add v1, v1, a1             ; sum = sum+A[i]
        sub v2, v2, #1             ; decrement i
        cmp v2, #0                 ; check (i >= 0)
        bge Loop                   ; goto Loop

Updating Flags Example Solution Ver 3
° C:
      sum = 0;
      for (i=0;i<10;i=i+1)
          sum = sum + A[i];
  • sum: v1, i: v2, base address of A: v3
° ARM (subs replaces the sub/cmp pair):
        mov v1, #0
        mov v2, #9                 ; start with i = 9
  Loop: ldr a1,[v3,v2,lsl #2]      ; a1 = A[i]
        add v1, v1, a1             ; sum = sum+A[i]
        subs v2, v2, #1            ; decrement i and update flags
        bge Loop                   ; goto Loop
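Versions 2 and 3 change the direction of the loop, which is safe here because addition does not depend on order. A quick C sketch of the two directions (function names are made up for this example):

```c
/* Version 1 direction: i counts up from 0 to 9. */
int sum_up(const int A[10])
{
    int sum = 0;
    int i;
    for (i = 0; i < 10; i = i + 1)
        sum = sum + A[i];
    return sum;
}

/* Versions 2 and 3 direction: i counts down from 9 to 0,
   with the test (i >= 0) at the bottom, as with subs/bge. */
int sum_down(const int A[10])
{
    int sum = 0;
    int i = 9;
    do {
        sum = sum + A[i];
        i = i - 1;
    } while (i >= 0);
    return sum;
}
```

Counting down to zero lets the decrement itself produce the flags that the branch needs, which is why Ver 3 can drop the cmp.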

Conditional Execution
° Recall conditional branch instructions: beq, bne, bgt, bge, blt, ble, bhi, bhs, blo, bls, etc.
  • Almost all processors only allow branch instructions to be executed conditionally.
° However, by reusing the condition evaluation hardware, ARM effectively increases the number of instructions.
  • All instructions contain a condition field which determines whether the CPU will execute them.
° This removes the need for many branches and allows very dense in-line code.
° Example:
      subs v2, v2, #1     ; update flags
      addeq v3, v3, #2    ; add only if EQ (Z = 1)

Conditional Code Example
[Flowchart: Start; test "r0 = r1?": Yes goes to Stop; No goes to test "r0 > r1?": Yes does r0 = r0 - r1, No does r1 = r1 - r0; both return to the first test.]
° Convert the GCD algorithm given in this flowchart into
  1) “Normal” assembler, where only branches can be conditional.
  2) ARM assembler, where all instructions are conditional, thus improving code density.
° The only instructions you need are cmp, b and sub.

Conditional Code Example (Solution)
° “Normal” Assembler
  gcd   cmp r0, r1        ; reached the end?
        beq stop
        blt less          ; r0 < r1: go to less
        sub r0, r0, r1    ; r0 > r1: subtract r1 from r0
        bal gcd
  less  sub r1, r1, r0    ; subtract r0 from r1
        bal gcd
  stop
° ARM Conditional Assembler
  gcd   cmp r0, r1        ; if r0 > r1
        subgt r0, r0, r1  ; subtract r1 from r0
        sublt r1, r1, r0  ; else subtract r0 from r1
        bne gcd           ; reached the end?
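For reference, the same subtraction-based GCD in C. This is a sketch that assumes both inputs are positive; with a zero or negative input the loop would not terminate. The function name gcd_sub is made up here.

```c
/* GCD by repeated subtraction, mirroring the conditional ARM version.
   Precondition: r0 > 0 and r1 > 0. */
int gcd_sub(int r0, int r1)
{
    while (r0 != r1) {          /* bne gcd */
        if (r0 > r1)
            r0 = r0 - r1;       /* subgt r0, r0, r1 */
        else
            r1 = r1 - r0;       /* sublt r1, r1, r0 */
    }
    return r0;                  /* r0 == r1 == gcd */
}
```

In the ARM version the if/else costs no branches at all: exactly one of subgt and sublt executes on each pass, and neither executes once r0 equals r1.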
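
Long Integer Addition Example
° long int = 64 bits
      long int l, m, n;
      n = l + m;
      /* Won’t work in C. Still treats long int as int */
[Figure: 64-bit operands held in register pairs: l in r1:r0, m in r3:r2, n in r5:r4; bits 63-32 in the high register and bits 31-0 in the low register; the carry C passes from the low-word addition into the high-word addition.]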

Long Integer Addition Example (Analysis)
° We need to do the addition in two steps
      struct int64 {
          int lo;
          int hi;
      } l, m, n;
      n.lo = m.lo + l.lo;
      n.hi = m.hi + l.hi + C;
° How to check on carry in C?
[Figure: l in r1:r0, m in r3:r2, n in r5:r4, each split into hi (bits 63-32) and lo (bits 31-0) halves; the carry C from the lo addition feeds the hi addition.]

Long Integer Carry Generation
° Statement n.lo = l.lo + m.lo; would generate a carry C if:
  • (bit 31 of l.lo = 1) and (bit 31 of m.lo = 1), or
  • (bit 31 of l.lo = 1) and (bit 31 of n.lo ≠ 1), or
  • (bit 31 of m.lo = 1) and (bit 31 of n.lo ≠ 1)
[Figure: l in r1:r0, m in r3:r2, n in r5:r4, with three bit-level examples of the top bits of l.lo, m.lo and n.lo, one for each condition above, each giving C=1.]

Long Integer Addition Example in C
      struct int64 {
          int lo;
          int hi;
      };
      struct int64 ad_64(struct int64 l, struct int64 m) {
          struct int64 n;
          unsigned hibitl = l.lo >> 31, hibitm = m.lo >> 31, hibitn;
          n.lo = l.lo + m.lo;
          hibitn = n.lo >> 31;
          n.hi = l.hi + m.hi +
                 ((hibitl & hibitm) || (hibitl & ~hibitn) || (hibitm & ~hibitn));
          return n;
      }
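A shorter way to detect the carry in C, shown here as an alternative sketch and not as the lecture's method: with unsigned 32-bit halves the low-word sum wraps modulo 2^32, and it wrapped exactly when the sum is smaller than one of the operands. The type u64_pair and the function add_64 are names chosen for this example. Unsigned types are used because overflow of signed int is undefined in C.

```c
#include <stdint.h>

struct u64_pair {
    uint32_t lo;    /* bits 31-0 */
    uint32_t hi;    /* bits 63-32 */
};

/* 64-bit addition built from two 32-bit additions plus a carry. */
struct u64_pair add_64(struct u64_pair l, struct u64_pair m)
{
    struct u64_pair n;
    uint32_t carry;

    n.lo = l.lo + m.lo;             /* wraps modulo 2^32 */
    carry = (n.lo < l.lo);          /* 1 exactly when the low add wrapped */
    n.hi = l.hi + m.hi + carry;     /* what "add with carry" does in one step */
    return n;
}
```

The test n.lo < l.lo plays the role of the C flag: it recovers in software the one bit that the hardware adder already produced.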

Long Integer Addition Example Compiled
  ad_64:  /* l.lo, l.hi, m.lo and m.hi are passed in r0-r3 */
          str lr, [sp, #-4]!   ; store ret addr.
          mov ip, r0, asr #31  ; hibitl (l.lo>>31)
          mov lr, r2, asr #31  ; hibitm (m.lo>>31)
          add r0, r0, r2       ; n.lo = l.lo + m.lo
          add r1, r3, r1       ; n.hi = l.hi + m.hi
          mov r2, r0, asr #31  ; hibitn (n.lo>>31)
          tst lr, ip           ; hibitl AND hibitm ?
          bne .L3
          mvn r2, r2           ; ~(hibitn)
          tst ip, r2           ; hibitl AND ~hibitn ?
          bne .L3
          tst lr, r2           ; hibitm AND ~hibitn ?
          moveq r2, #0         ; none of the three holds: C = 0
          beq .L2
  .L3:    mov r2, #1           ; C = 1
  .L2:    add r1, r1, r2       ; n.hi = l.hi + m.hi + C
          ldr lr, [sp], #4     ; get ret addr.
          mov pc, lr           ; n.lo, n.hi returned in r0-r1

Long Integer Addition Example ARM
  ad_64:  /* l.lo, l.hi, m.lo and m.hi are passed in r0-r3 */
          str lr, [sp, #-4]!   ; store ret addr.
          adds r0, r0, r2      ; n.lo = l.lo + m.lo, and set C
          adc r1, r3, r1       ; n.hi = l.hi + m.hi + C
          ldr lr, [sp], #4     ; get ret addr.
          mov pc, lr           ; n.lo, n.hi returned in r0-r1
° adc is add with carry

Long Integer Addition Example in C (GCC)
° long long type is an extension in gcc
      long long ad_64(long long l, long long m) {
          long long n;
          n = m + l;
          return n;
      }
° ARM compiled code
  ad_64:  /* l.lo, l.hi, m.lo and m.hi are passed in r0-r3 */
          adds r0, r0, r2      ; n.lo = l.lo + m.lo, and set C
          adc r1, r3, r1       ; n.hi = l.hi + m.hi + C
          mov pc, lr
° long long type is 64 bits

“And in Conclusion …”
• Flag setting instructions: cmp, cmn, tst, teq in ARM
• Data processing instructions with flag setting feature: adds, subs, ands, in ARM
• Conditional instructions: addeq, ldreq, etc. in ARM