CS2504, Spring'2007©Dimitris Nikolopoulos
Large constants
MIPS has only 16 bits in the instruction's immediate field. For large constants:
- lui loads the upper 16 bits of a register with its operand (the lower 16 bits are cleared)
- Used to compose large 32-bit numbers
Example:
    # goal       = 0000 0000 0011 1101 0000 1001 0000 0000
    # upper half = 0000 0000 0011 1101, or 61 decimal
    lui $s0, 61
    # lower half = 0000 1001 0000 0000, or 2304 decimal
    ori $s0, $s0, 2304
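The same composition can be checked in C. This is only a sketch that models the two instructions as plain functions; the names lui, ori and load_large_constant are illustrative helpers, not part of any MIPS toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Models lui: put the 16-bit immediate in the upper half, clear the lower half. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* Models ori: OR the register with the zero-extended 16-bit immediate. */
uint32_t ori(uint32_t reg, uint16_t imm) {
    return reg | (uint32_t)imm;
}

/* Composes the constant from the example: upper half 61, lower half 2304. */
uint32_t load_large_constant(void) {
    uint32_t s0 = lui(61);     /* lui $s0, 61        */
    s0 = ori(s0, 2304);        /* ori $s0, $s0, 2304 */
    assert(s0 == 0x003D0900u); /* the goal bit pattern */
    return s0;
}
```

The goal pattern 0000 0000 0011 1101 0000 1001 0000 0000 is 003D0900hex, which is 4,000,000 in decimal.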
Large constants
Can it be done otherwise?
- If we used addi instead, addi would copy the most significant bit of the 16-bit immediate (the sign bit) into all upper 16 bits of the 32-bit operand.
  - This is called sign extension
  - A negative operand would propagate 1's to the upper bits
  - Watch for automatic sign extensions in arithmetic operations in MIPS
- ori zero-extends its immediate (the upper 16 bits of the operand are treated as zeros), therefore it does the job.
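The difference only shows when bit 15 of the lower half is set. The sketch below models both extensions in C; the function names are made up for illustration, and the addi model ignores the overflow trap of the real instruction.

```c
#include <stdint.h>

/* Sign-extends a 16-bit immediate to 32 bits, as addi does. */
uint32_t sign_extend16(uint16_t imm) {
    return (imm & 0x8000u) ? (0xFFFF0000u | imm) : (uint32_t)imm;
}

/* Zero-extends a 16-bit immediate to 32 bits, as ori does. */
uint32_t zero_extend16(uint16_t imm) {
    return (uint32_t)imm;
}

/* lui followed by ori: the lower half lands in the register unchanged. */
uint32_t compose_with_ori(uint16_t upper, uint16_t lower) {
    return ((uint32_t)upper << 16) | zero_extend16(lower);
}

/* lui followed by addi: a lower half with bit 15 set is added as a
   negative number, which disturbs the upper half (wraps modulo 2^32). */
uint32_t compose_with_addi(uint16_t upper, uint16_t lower) {
    return ((uint32_t)upper << 16) + sign_extend16(lower);
}
```

With the lower half 2304 (bit 15 clear) both versions agree; with a lower half such as 8900hex the addi version ends up one too small in the upper half.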
Memory addressing
- j instruction has a single constant operand and no variants:
  - 26-bit operand to specify the memory location
  - 6 bits still needed for the opcode
- Branch instructions have two register operands (10 bits)
  - 16-bit memory operand, can only address 2^16 bytes
  - Range -32,768 to +32,767 bytes (signed offset)
  - Insufficient for 32-bit architectures
- Need a solution to address more memory
- Key idea: base + offset
  - A register holds a base; the offset indicates the distance (positive or negative) from the base
PC-relative addressing
- Base register is the program counter
- MIPS-specific:
  - MIPS program counter actually points to the next instruction (PC+4), for efficiency purposes to be clarified later
  - Offset encoded in 16 bits is actually a number of words, not bytes (effectively extending the range to ±2^17 bytes)
  - Offset is added to PC+4
- Direct jump instruction (j) can address 2^28 bytes.
- jr uses the full 32-bit address stored in a register
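The two address calculations can be written out in C. This is a sketch of the arithmetic only; branch_target and jump_target are illustrative names, and pc stands for the address of the branch or jump instruction itself.

```c
#include <stdint.h>

/* Branch target: the 16-bit signed offset counts words and is added to PC+4. */
uint32_t branch_target(uint32_t pc, int16_t offset_words) {
    return pc + 4u + (uint32_t)((int32_t)offset_words * 4);
}

/* Jump target: the 26-bit field is shifted left by 2 to give 28 bits;
   the 4 leftmost bits are taken unchanged from PC+4. */
uint32_t jump_target(uint32_t pc, uint32_t addr26) {
    return ((pc + 4u) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

For example, a branch at 8000Chex with offset 2 lands on 80018hex, and a negative offset moves backwards from PC+4.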
PC-relative addressing
Long jumps
- j instruction has a 26-bit argument, representing 28-bit addresses
- Missing 4 bits for a complete address:
  - The 4 leftmost bits are left untouched by branches and jumps
  - Program loader and linker are aware of this while placing programs in memory
  - If a jump has to cross the boundaries set by the loader, the program must use jr with a register operand
Long-range branches?
        bne $t0,$t1,L1    # L1 is far far away...
is replaced by the inverted branch around an unconditional jump:
        beq $t0,$t1,L2    # L2 is nearby
        j   L1
    L2:
MIPS Pop Quiz
What are the values of the offset fields of the bne and the j instructions in this loop, if the loop starts at 80000hex?
    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit:
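One way to work the quiz is to compute the fields from the instruction addresses. The sketch assumes each instruction occupies 4 bytes, so bne sits at 8000Chex and Exit at 80018hex; the helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Offset field of a branch: word distance from PC+4 to the target. */
int16_t branch_offset_field(uint32_t branch_addr, uint32_t target_addr) {
    int64_t delta = (int64_t)target_addr - ((int64_t)branch_addr + 4);
    assert(delta % 4 == 0);   /* targets are word aligned */
    assert(delta / 4 >= INT16_MIN && delta / 4 <= INT16_MAX);
    return (int16_t)(delta / 4);
}

/* Address field of j: the target's byte address divided by 4, low 26 bits. */
uint32_t jump_address_field(uint32_t target_addr) {
    return (target_addr >> 2) & 0x03FFFFFFu;
}
```

Under these assumptions the bne offset field is 2 (two words past PC+4 = 80010hex) and the j field is 80000hex / 4 = 20000hex.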
Recap: MIPS Instruction formats
R-format:  op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
I-format:  op (6) | rs (5) | rt (5) | immediate (16)
J-format:  op (6) | address (26)
(field widths in bits)
Homework
- Learn how to decode MIPS instructions
- Use the table on page 103
Decoding instruction example
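00af8030hex    # as in SPIM
    opcode = 000000 (R-format, bits 31-26)
    rs     = 00101  (bits 25-21), register 5  = $a1
    rt     = 01111  (bits 20-16), register 15 = $t7
    rd     = 10000  (bits 15-11), register 16 = $s0
    shamt  = 00000  (bits 10-6)
    funct  = 110000 (bits 5-0)
Note: funct 110000 (30hex) is not the encoding of mult. mult has funct 011000 (18hex) and names only rs and rt, leaving its result in HI/LO. With funct 100000 (the word 00af8020hex) the same fields decode to add $s0,$a1,$t7.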
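The field extraction itself is mechanical and can be sketched with shifts and masks in C. RFields and decode_r_format are illustrative names; the sketch only splits the word into R-format fields and does not look up mnemonics.

```c
#include <stdint.h>

/* The six R-format fields, from most to least significant bits. */
typedef struct {
    unsigned op, rs, rt, rd, shamt, funct;
} RFields;

/* Splits a 32-bit instruction word into R-format fields. */
RFields decode_r_format(uint32_t word) {
    RFields f;
    f.op    = (word >> 26) & 0x3Fu;  /* bits 31-26 */
    f.rs    = (word >> 21) & 0x1Fu;  /* bits 25-21 */
    f.rt    = (word >> 16) & 0x1Fu;  /* bits 20-16 */
    f.rd    = (word >> 11) & 0x1Fu;  /* bits 15-11 */
    f.shamt = (word >> 6)  & 0x1Fu;  /* bits 10-6  */
    f.funct = word & 0x3Fu;          /* bits 5-0   */
    return f;
}
```

Applied to 00af8030hex this gives op 0, rs 5, rt 15, rd 16, shamt 0 and funct 30hex.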
Assemble to simplify your life
Pseudoinstructions
- Instructions composed of other MIPS assembly instructions
  - move $t0, $t1 = add $t0,$zero,$t1
  - blt, bge, ble all use beq, bne, slt
Assembler
- Translate assembly to binary code
- Binary code augmented with meta-information
  - object file header, size of pieces of object file
  - text segment
  - static data segment
  - relocation information
  - symbol table (labels defined in the program)
  - debugging information (associates assembly instructions with high-level language instructions)
Linker
- Primary motivation: libraries and reusable code
- Stitches together code and data modules symbolically
  - Still no absolute addresses
- Linker figures out new addresses of labels
  - Relocation information is used to figure out positions of labels in libraries
  - Linker patches (does not recompile) the binary
- Resolves all internal and external references, complains otherwise
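Patching can be pictured as rewriting one field of an already assembled instruction word. The sketch below is a deliberately simplified, hypothetical relocation for a j or jal word; real object file formats record relocations in more detail, and patch_jump is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* Keeps the 6-bit opcode of a j/jal word and overwrites its 26-bit
   address field with the label's final address, once the linker knows it. */
uint32_t patch_jump(uint32_t instr, uint32_t final_addr) {
    assert((final_addr & 3u) == 0);  /* instructions are word aligned */
    return (instr & 0xFC000000u) | ((final_addr >> 2) & 0x03FFFFFFu);
}
```

For instance, a j word assembled with a zero placeholder (08000000hex) and a label resolved to 00400020hex becomes 08100008hex, with no recompilation involved.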
Loader
- Read executable, find size of text and data segments
- Create address space large enough to hold text and data
- Copy text and data into memory
- Copy program parameters to stack
- Initialize machine registers, stack pointer
- Jump to start-up routine, which copies parameters to registers and calls main
Dynamic Linking
- Library part of executable
  - Not good if library changes
  - May produce a large executable although only a small fraction of the library is used
- Dynamic linking
  - Attempt to link the library code at runtime, when we need it.
  - Furthermore, attempt to link only the code we need, no less, no more
- Concept of DLLs
Lazy linking explained
- Program keeps a dummy routine, and a pointer to the dummy routine in the data segment
- Load pointer, jump to dummy
- Dummy jumps to dynamic linker/loader code
- Dynamic linker/loader locates the target library code, remaps it, and replaces the jump pointer in memory with the new address
- Voila!
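The mechanism can be imitated in C with a function pointer. This is only a toy model under loud assumptions: library_square stands in for the library routine, dynamic_link_square for the dynamic linker/loader, and square_ptr for the pointer kept in the data segment; none of these names come from a real loader.

```c
#include <stddef.h>

typedef int (*lib_fn)(int);

static int resolve_count = 0;          /* how often the "linker" ran */

/* Stand-in for the routine that lives in the library. */
int library_square(int x) { return x * x; }

int stub_square(int x);                /* the dummy routine, defined below */

/* Pointer in the data segment; it initially points to the dummy. */
lib_fn square_ptr = stub_square;

/* Stand-in for the dynamic linker/loader: locates the real routine. */
lib_fn dynamic_link_square(void) {
    resolve_count++;
    return library_square;
}

/* Dummy routine: asks the linker for the real address, overwrites the
   pointer, then forwards the call. */
int stub_square(int x) {
    square_ptr = dynamic_link_square();
    return square_ptr(x);
}

/* Call site: always loads the pointer and jumps through it. */
int call_square(int x) { return square_ptr(x); }

int times_resolved(void) { return resolve_count; }
```

Only the first call goes through the dummy; later calls jump straight to the library routine because the pointer has been rewritten.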
Java-specific
- Java uses an interpreter (JVM)
  - JVM is equivalent to a hardware simulator
- Interpreter helps portability
  - Java runs everywhere
- Interpreter harms performance
  - Java uses JIT compilers to remedy this
Compiler Primer
- Compilers translate and optimize programs
- Program representation changes
  - High-level language (the source, possibly with some additional information sprinkled in, machine-independent)
  - One or more intermediate compiler representations (IR)
    - High-level IR, close to source, mostly machine-independent, good for source-to-source transformations
    - Low-level IR, close to assembly, mostly machine-dependent, good for architecture-specific optimizations
Some transformations
- Procedure inlining
  - Reduces procedure call and argument passing overhead
  - Increases code size
- Loop unrolling
  - Reduces loop branching overhead
  - Increases code size
  - Enables pipelining and other optimizations
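Loop unrolling can be illustrated by hand in C. The sketch below sums an array twice, once with a plain loop and once unrolled by a factor of 4; the factor and the function names are arbitrary choices for illustration, and a compiler may well perform this transformation on its own.

```c
#include <stddef.h>

/* Plain loop: one loop test and branch per element. */
long sum_rolled(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4: one loop test and branch per four elements,
   at the cost of a larger loop body plus a cleanup loop. */
long sum_unrolled4(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        s += a[i];
    return s;
}
```

Both functions return the same sum; the unrolled one trades code size for fewer branches, which is the trade-off listed above.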
Optimization example x[i] = x[i] + 4
Address of x[i] is used twice
Naive interpretation (using virtual registers):
    li  R100,x
    lw  R101,i
    sll R102,R101,2      # i offset from x base
    add R103,R100,R102   # address of x[i]
    lw  R104,0(R103)     # x[i] in R104
    add R105,R104,4      # result in R105
    li  R106,x
    lw  R107,i
    sll R108,R107,2      # i offset from x base, again
    add R109,R106,R108   # address of x[i], again
    sw  R105,0(R109)
Optimization example x[i] = x[i] + 4
Address of x[i] is used twice
Common subexpression elimination:
    li  R100,x
    lw  R101,i
    sll R102,R101,2      # i offset from x base
    add R103,R100,R102   # address of x[i]
    lw  R104,0(R103)     # x[i] in R104
    add R105,R104,4      # result in R105
    sw  R105,0(R103)     # reuse the address already in R103
Other optimizations
- Constant propagation
- Copy propagation
- Dead code elimination
- Dead store elimination
Compiler primer
- Compilers are conservative
  - It can be extremely hard to verify the correctness of an optimization. If the compiler can't verify it, it won't apply it
  - What is easy for humans may be difficult for the compiler in some cases
  - The opposite is true too (try staring at optimized assembly code and figuring it out)
- Pointers, dynamic allocation, and other high-level language features make optimization difficult
  - Although they do increase programmer productivity
Putting it all together
void swap (int v[], int k) {
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
Putting it all together
swap:                        # a0,a1 hold pointer to v[] and k
    sll $t1,$a1,2            # multiply k*4 to find offset
    add $t1,$a0,$t1          # find address of v[k]
    lw  $t0,0($t1)           # load v[k], t0 is our temp
    lw  $t2,4($t1)           # load v[k+1]
    sw  $t2,0($t1)           # store old v[k+1] in v[k]
    sw  $t0,4($t1)           # store old v[k] (temp) in v[k+1]
    jr  $ra
Putting it all together
void sort (int v[], int n) {
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = i-1; j >= 0 && v[j] > v[j+1]; j--) {
            swap(v,j);
        }
    }
}
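For experimenting before translating to assembly, swap and sort can be put into one compilable C file. The sketch assumes the inner loop reads j = i-1 and j-- (the minus signs are easy to lose in slide text); sort_demo is an illustrative helper that is not part of the lecture code.

```c
#include <stddef.h>

/* Exchanges v[k] and v[k+1]. */
void swap(int v[], int k) {
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Sorts v[0..n-1] in ascending order by sinking each element
   backwards past larger neighbours. */
void sort(int v[], int n) {
    for (int i = 0; i < n; i++) {
        for (int j = i - 1; j >= 0 && v[j] > v[j + 1]; j--) {
            swap(v, j);
        }
    }
}

/* Sorts a small sample array and returns 1 if it ends up ascending. */
int sort_demo(void) {
    int v[] = {5, 2, 9, 1, 3};
    int n = (int)(sizeof v / sizeof v[0]);
    sort(v, n);
    for (int i = 0; i + 1 < n; i++)
        if (v[i] > v[i + 1])
            return 0;
    return v[0] == 1 && v[n - 1] == 9;
}
```

Note that sort calls swap, so sort is both a caller and a callee; that is what forces the register saving discussed on the following slides.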
Putting it all together
Deconstruct procedure: for (i=0;i<n;i++)
              move $s0,$zero          # i = 0
    for1tst:  slt  $t0,$s0,$a1        # t0 = 1 if i < n
              beq  $t0,$zero,exit1    # exit if i >= n
              # loop body...
              addi $s0,$s0,1          # i = i + 1
              j    for1tst
Putting it all together
Deconstruct procedure
              # for (j=i-1; j>=0 && v[j] > v[j+1]; j--)
              addi $s1,$s0,-1         # initialize j = i - 1
    for2tst:  slti $t0,$s1,0          # t0 = 1 if j < 0
              bne  $t0,$zero,exit2    # exit if j < 0
              sll  $t1,$s1,2          # find j offset in v
              add  $t2,$t1,$a0        # find address of v[j]
              lw   $t3,0($t2)         # load v[j]
              lw   $t4,4($t2)         # load v[j+1]
              slt  $t0,$t4,$t3        # t0 = 1 if v[j+1] < v[j]
              beq  $t0,$zero,exit2    # exit if v[j+1] >= v[j]
              # ...(body of second loop), call swap
              addi $s1,$s1,-1         # j = j - 1
              j    for2tst
    exit2:    ...
Putting it all together
Deconstruct procedure
    # to call swap(v,j) we need to save the argument registers
    or  $s2,$zero,$a0        # move pseudo...
    or  $s3,$zero,$a1        # move pseudo...
    # $s0,$s1 already occupied
    # now load arguments for the callee (swap)
    or  $a0,$zero,$s2        # first parameter to swap is v, saved in $s2 (why necessary?)
    or  $a1,$zero,$s1        # second parameter to swap is j, in $s1
    jal swap
Putting it all together
Remaining chores
    # save registers in sort, since sort will be a callee;
    # procedure swap will be called in sort, so we also need to save $ra
    addi $sp,$sp,-20         # make room for 5 registers
    sw   $ra,16($sp)
    sw   $s3,12($sp)
    sw   $s2,8($sp)
    sw   $s1,4($sp)
    sw   $s0,0($sp)
Putting it all together - Sort
Save registers
    sort:     addi $sp,$sp,-20        # make room for 5 registers
              sw   $ra,16($sp)
              sw   $s3,12($sp)
              sw   $s2,8($sp)
              sw   $s1,4($sp)
              sw   $s0,0($sp)
Putting it all together - Sort
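Save parameters to call swap
              or   $s2,$zero,$a0      # $s2 = v
              or   $s3,$zero,$a1      # $s3 = n
Outer loop:
              move $s0,$zero          # add $s0,$zero,$zero
    for1tst:  slt  $t0,$s0,$s3        # compare i with n in $s3 ($a1 is overwritten with j before each call to swap)
              beq  $t0,$zero,exit1    # i >= n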
Putting it all together - Sort
Inner loop
              addi $s1,$s0,-1         # initialize j = i - 1
    for2tst:  slti $t0,$s1,0          # t0 = 1 if j < 0
              bne  $t0,$zero,exit2    # exit if j < 0
              sll  $t1,$s1,2          # find j offset in v
              add  $t2,$t1,$s2        # find address of v[j], using the saved copy of v in $s2
              lw   $t3,0($t2)         # load v[j]
              lw   $t4,4($t2)         # load v[j+1]
              slt  $t0,$t4,$t3        # t0 = 1 if v[j+1] < v[j]
              beq  $t0,$zero,exit2    # exit if v[j+1] >= v[j]
Putting it all together - Sort
              or   $a0,$zero,$s2      # first parameter to swap is v, saved in $s2
              or   $a1,$zero,$s1      # second parameter to swap is j, in $s1
              jal  swap
              addi $s1,$s1,-1         # j = j - 1
              j    for2tst
    exit2:    addi $s0,$s0,1          # i = i + 1
              j    for1tst
Putting it all together - Sort
    exit1:    lw   $s0,0($sp)         # restore saved registers
              lw   $s1,4($sp)
              lw   $s2,8($sp)
              lw   $s3,12($sp)
              lw   $ra,16($sp)
              addi $sp,$sp,20         # pop the stack frame
              jr   $ra
Does it matter?
void clear1 (int array[], int size) {
    int i;
    for (i = 0; i < size; i++)
        array[i] = 0;
}
void clear2 (int *array, int size) {
    int *p;
    for (p = &array[0]; p < &array[size]; p++)
        *p = 0;
}
Does it matter? Arrays vs. pointers
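Array version (clear1):
              move $t0,$zero          # i = 0
    loop1:    sll  $t1,$t0,2          # $t1 = i * 4
              add  $t2,$a0,$t1        # $t2 = address of array[i]
              sw   $zero,0($t2)       # array[i] = 0
              addi $t0,$t0,1          # i = i + 1
              slt  $t3,$t0,$a1        # $t3 = 1 if i < size
              bne  $t3,$zero,loop1
Pointer version (clear2):
              move $t0,$a0            # p = &array[0]
              sll  $t1,$a1,2          # $t1 = size * 4
              add  $t2,$a0,$t1        # $t2 = &array[size]
    loop2:    sw   $zero,0($t0)       # *p = 0
              addi $t0,$t0,4          # p = p + 4
              slt  $t3,$t0,$t2        # $t3 = 1 if p < &array[size]
              bne  $t3,$zero,loop2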
Does it matter?
Arrays vs. pointers
- Array version calculates the address of array[i] inside the loop, using a multiplication (shift left logical)
- Pointer version pre-calculates the address of array[size] and advances to the next element inside the loop using an addition (also called a pointer bump)
- Pointer version reduces the number of instructions per iteration
- In the old days, using pointers was good advice. Nowadays, compilers can probably figure out such cases and generate efficient code for them. In this example, the compiler could do strength reduction (replace the multiply with an add)
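Both versions can be compiled and compared side by side. The sketch below assumes the pointer loop bound is &array[size], the address one past the last element; clear_demo is an illustrative helper that is not part of the lecture code.

```c
#include <stddef.h>

/* Array version: the address of array[i] is recomputed from i each iteration. */
void clear1(int array[], int size) {
    for (int i = 0; i < size; i++)
        array[i] = 0;
}

/* Pointer version: the end address is computed once and p is bumped
   by one element per iteration. */
void clear2(int *array, int size) {
    for (int *p = &array[0]; p < &array[size]; p++)
        *p = 0;
}

/* Clears two sample arrays, one with each version, and returns 1 if
   every element of both is zero afterwards. */
int clear_demo(void) {
    int a[5] = {1, 2, 3, 4, 5};
    int b[5] = {6, 7, 8, 9, 10};
    clear1(a, 5);
    clear2(b, 5);
    for (int i = 0; i < 5; i++)
        if (a[i] != 0 || b[i] != 0)
            return 0;
    return 1;
}
```

Comparing the assembly a compiler emits for clear1 and clear2 with optimization enabled is a quick way to see whether it applied strength reduction to the array version.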
Recap
- You just learned most of MIPS assembly!
- Can translate fairly realistic C programs
  - Implement integer arithmetic/logic
  - Branch
  - Call functions
  - Call recursive functions
- New concepts
  - Register manipulation, caller-save, callee-save
  - Stack (from the machine's point of view)
  - Layout of programs in memory (address space)
  - Memory addressing in various ways
  - Minimalistic instruction sets (RISC)