Date posted: 27-Jun-2015
Category: Technology
Uploaded by: alveena-saleem
Instruction Set Architecture
• What is an instruction set?
  - Portion of the machine visible to the programmer or compiler writer
  - Each instruction is directly executed by hardware
• Examples
  - DEC VAX
  - Intel IA-32
  - Hennessy and Patterson DLX
  - MIPS, PowerPC, SPARC
  - ARM
Instruction Set Principles
• Chapter 2 in both 2nd and 3rd edition
Instruction Set Architecture
• How are they represented?
  - By bits
  - Typically 16, 32, 64
• Variable or Fixed
  - Fixed – each instruction is the same size
What you see: Add A,B
What the machine sees: op-code | operand | operand
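As a rough illustration of "what the machine sees", the sketch below packs an instruction into a fixed 16-bit word. The field widths (4-bit op-code, two 6-bit operands) and the op-code values are assumptions made up for this example; they do not correspond to any real architecture.

```python
# Hypothetical 16-bit fixed format: 4-bit op-code, then two 6-bit operand fields.
OPCODES = {"Add": 0b0001, "Sub": 0b0010}  # made-up op-code values


def encode(op, a, b):
    """Pack an op-code and two operand numbers into one 16-bit word."""
    if not (0 <= a < 64 and 0 <= b < 64):
        raise ValueError("operand does not fit in 6 bits")
    return (OPCODES[op] << 12) | (a << 6) | b


def decode(word):
    """Split a 16-bit word back into (op-code, operand, operand)."""
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F


word = encode("Add", 10, 11)
print(f"{word:016b}")  # 0001001010001011
print(decode(word))    # (1, 10, 11)
```

Because every instruction has the same size and layout, decoding is a fixed sequence of shifts and masks.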
Application Areas
• Desktop computing
  - Code size not important
  - Integer and floating-point performance is important
• Servers
  - Floating-point not important
  - Integer performance is important
• Embedded applications
  - Cost and power efficiency are important
  - Code size is important
• Multimedia and DSP applications
  - Real-time constraints
  - Power efficiency
Classifying Instruction Sets
• Type of internal CPU storage
  - Stack – operands are implicit
  - Accumulator – one operand is implicit
  - General purpose registers – explicit operands
Classifying Instruction Sets
• Where are the operands?
  - Accumulator, stack, registers, memory
  - Register-Register (Load-Store), Memory-Register, Memory-Memory
• Size and type of operands
  - 8 bits, 16 bits, unsigned, signed, floating point
• Addressing modes
• Types of operations
Classifying Instruction Sets
• Number of operands
• Two operand format
  - Add R1,R2
  - The result is placed in the first operand
  - Source and result are the same
• Three operand format
  - Add R3,R1,R2
  - The result is placed in the first operand
Classifying Instruction Sets
• Type of internal CPU storage
  - Stack – both operands are implicit
  - Accumulator – one operand is implicit
  - General purpose registers – explicit operands
Pre-1980’s
Code sequences for C = A + B:
Stack     Accumulator   Register
Push A    Load A        Load R1,A
Push B    Add B         Load R2,B
Add       Store C       Add R3,R1,R2
Pop C                   Store C,R3
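The three code sequences for C = A + B can be mimicked in a few lines each. This is only a sketch: memory is modelled as a Python dict keyed by variable name, and the function names are invented for illustration.

```python
def run_stack(mem):
    """Stack machine: Add takes both operands implicitly from the stack."""
    stack = []
    stack.append(mem["A"])                   # Push A
    stack.append(mem["B"])                   # Push B
    stack.append(stack.pop() + stack.pop())  # Add
    mem["C"] = stack.pop()                   # Pop C
    return mem["C"]


def run_accumulator(mem):
    """Accumulator machine: one operand is always the accumulator."""
    acc = mem["A"]                           # Load A
    acc += mem["B"]                          # Add B
    mem["C"] = acc                           # Store C
    return mem["C"]


def run_register(mem):
    """Load-store register machine: all operands are named explicitly."""
    reg = {}
    reg["R1"] = mem["A"]                     # Load R1,A
    reg["R2"] = mem["B"]                     # Load R2,B
    reg["R3"] = reg["R1"] + reg["R2"]        # Add R3,R1,R2
    mem["C"] = reg["R3"]                     # Store C,R3
    return mem["C"]


for machine in (run_stack, run_accumulator, run_register):
    print(machine.__name__, machine({"A": 2, "B": 3}))
```

All three produce the same result; they differ in how many instructions are needed and in how much each instruction has to say about its operands.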
Number of Operands
• Changes the instruction length
• Variable number of operands → variable length
• Variable length increases the complexity of the architecture
Classifying GPR Machines
• The number of memory operands?
  - Notation: Rx is a register and A is a memory location
• Load-Store Machines (Register-Register) – 0 memory operands
  - Load R1,A
  - Load R2,B
  - Add R1,R2
• Register-Memory Machines – 1 memory operand
  - Load R1,A
  - Add R1,B
• Memory-Memory Machines – 2 or 3 memory operands
  - Add R1,A,B
  - Add C,A,B
Register-Register Machines (0,3)
• Example
  - Add R1, R2, R3
  - ARM, MIPS, PowerPC, SPARC
• Advantages
  - Simple fixed-length instruction encoding
  - Decoding is simplified
  - CPI is uniform
  - Code generation is simplified
• Disadvantages
  - Instruction count is high
  - Short instructions waste bits in the fixed-length encoding (low density)
  - Leads to large programs
Types of General Purpose Register Machines
• Notation (m,n): m memory operands, n total operands
Need for memory address may limit the number of registers
Add R1, C
op-code | Register operand | Memory address operand
Memory operands take up many bits, leaving fewer bits for the register operand and thus allowing fewer registers.
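The bit-budget argument can be made concrete with a little arithmetic. The sketch below assumes a single register field and uses invented field widths; it is not taken from any particular machine.

```python
def max_registers(instr_bits, opcode_bits, address_bits):
    """Registers nameable with the bits left over after the op-code and
    the memory address field (assumes exactly one register field)."""
    reg_bits = instr_bits - opcode_bits - address_bits
    if reg_bits < 0:
        raise ValueError("fields do not fit in the instruction")
    return 2 ** reg_bits


print(max_registers(32, 8, 16))  # 8 bits left  -> 256
print(max_registers(32, 8, 20))  # 4 bits left  -> 16
```

Widening the address field by four bits in this made-up format cuts the number of addressable registers by a factor of sixteen.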
Register-Memory (1,2)
• Example
  - Add R1, C
  - Add R1, R2
  - Intel 80x86, Motorola 68000
• Advantages
  - Data can be accessed without a separate load instruction
  - Instruction format is simple
  - Instruction density is higher than in the (0,3) model
  - Note: higher instruction density means better use of bits
• Disadvantages
  - A source operand may be destroyed (overwritten by the result)
  - Need for memory address may limit the number of registers
  - CPI will vary depending on the type of operands
Memory Addressing
• How is the memory address interpreted?
• Byte addressed
• Byte order
  - Big Endian vs. Little Endian
• Alignment
  - An object of size s bytes at byte address A is aligned if A mod s = 0
• Addressing modes
Memory-Memory (3,3)
• Advantages
  - Best instruction density
  - Doesn’t waste registers for temporary results
• Disadvantages
  - Large variation in instruction size (3 operand instructions)
  - Large variation in CPI
  - Can worsen memory bottleneck
• Most complex model – currently extinct
  - VAX
Interpreting Addresses
• Memory is just a bunch of bits.
• How big can the address be?
  - 32-bit addressing
• How do we address it?
  - Byte addressing
  - Byte ordering
• What is the length of the thing we are addressing?
  - Typical lengths in bits: byte 8, half-word 16, word 32, double word 64
  - Word addressing
[Figure: memory shown as a column of addresses next to their contents]
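Byte ordering only matters once an object is longer than one byte. As a small illustration, Python's built-in `int.to_bytes` shows how the same 32-bit word would be laid out in byte-addressed memory under each convention; the value itself is arbitrary.

```python
value = 0x0A0B0C0D  # an arbitrary 32-bit word

big = value.to_bytes(4, byteorder="big")        # most significant byte at the lowest address
little = value.to_bytes(4, byteorder="little")  # least significant byte at the lowest address

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# Reading the bytes back with the matching convention recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```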
Alignment
• For a byte-addressed machine
  - all byte accesses are aligned
  - word accesses are aligned if the address is a multiple of 4
  - 32-bit integer accesses are aligned if the address is a multiple of 4
  - 64-bit floating point accesses are aligned if the address is a multiple of 8
• An object of size s bytes at byte address A is aligned if A mod s = 0
[Figure: Byte 0 | Byte 1 | Byte 2 | Byte 3 | Byte 4, with a word that does not start at a multiple of 4. Accessing this word is a misaligned access.]
• Misalignment may cause slow performance
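The alignment rule A mod s = 0 translates directly into code. The helper name below is made up for this sketch.

```python
def is_aligned(address, size):
    """True if an object of `size` bytes at byte address `address` is aligned."""
    return address % size == 0


print(is_aligned(8, 4))   # True:  word at a multiple of 4
print(is_aligned(6, 4))   # False: word straddles a 4-byte boundary
print(is_aligned(6, 1))   # True:  byte accesses are always aligned
print(is_aligned(12, 8))  # False: 64-bit value needs a multiple of 8
```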
Addressing Modes – Data
Mode                          Example          Meaning                  When used
Register                      Add R4,R3        R[4]=R[4]+R[3]           When a value is in a register
Immediate                     Add R4,#3        R[4]=R[4]+3              For constants
Displacement                  Add R4,100(R1)   R[4]=R[4]+M[100+R[1]]    Accessing local variables
Register deferred (indirect)  Add R4,(R1)      R[4]=R[4]+M[R[1]]        Accessing via a pointer
Indexed                       Add R3,(R1+R2)   R[3]=R[3]+M[R[1]+R[2]]   Array addressing
Direct                        Add R1,(1001)    R[1]=R[1]+M[1001]        Accessing static data
Memory indirect               Add R1,@(R3)     R[1]=R[1]+M[M[R[3]]]     Dereferencing a pointer
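The "Meaning" column of the table above can be read as a recipe for fetching the source operand. The sketch below follows that column mode by mode; the register file `R` and memory `M` are plain dicts, and the function name, mode names and all values are assumptions chosen for illustration.

```python
def fetch_operand(mode, R, M, reg=None, const=None, index=None):
    """Return the source operand selected by an addressing mode."""
    if mode == "register":
        return R[reg]                  # R[reg]
    if mode == "immediate":
        return const                   # the constant itself
    if mode == "displacement":
        return M[const + R[reg]]       # M[const + R[reg]]
    if mode == "register_deferred":
        return M[R[reg]]               # M[R[reg]]
    if mode == "indexed":
        return M[R[reg] + R[index]]    # M[R[reg] + R[index]]
    if mode == "direct":
        return M[const]                # M[const]
    if mode == "memory_indirect":
        return M[M[R[reg]]]            # M[M[R[reg]]]
    raise ValueError(f"unknown mode: {mode}")


R = {1: 16, 2: 3, 3: 400}                         # hypothetical register contents
M = {16: 9, 19: 7, 116: 5, 400: 1001, 1001: 42}   # hypothetical memory contents

print(fetch_operand("displacement", R, M, reg=1, const=100))  # M[116]    -> 5
print(fetch_operand("indexed", R, M, reg=1, index=2))         # M[19]     -> 7
print(fetch_operand("memory_indirect", R, M, reg=3))          # M[M[400]] -> 42
```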
Addressing Modes
• GPR machines can address constants, registers, and memory
• An addressing mode specifies how the address of an operand is computed
Addressing Modes – Data
Mode     Example          Meaning                  When used
Indexed  Add R3,(R1+R2)   R[3]=R[3]+M[R[1]+R[2]]   Array addressing
The address is computed by adding the contents of two registers.
Example: R1 = 16, R2 = 3
Load Byte R3,(R1+R2)
Loads the byte at address 19 into register 3. Same as loading the 4th byte of the array.
[Figure: memory addresses 16 through 23 holding the array, which starts at address 16]
Addressing Modes – Data
Mode          Example          Meaning                 When used
Displacement  Add R4,100(R1)   R[4]=R[4]+M[100+R[1]]   Accessing local variables
The address is computed by adding a constant to the number in a register.
Examples
• R3 is a register containing the number 400
• Load Word R2,0(R3)
  - Load the word at memory address 400 into register 2
• Load Word R4,4(R3)
  - Load the word at memory address 404 into register 4
Addressing Modes: Comments
• Programs typically use
  - Displacement
  - Immediate
  - Register deferred
• Displacement addressing modes
  - How large a displacement?
  - Affects instruction length
• Immediate
  - Comparisons for branching
  - Most are small, e.g. if (a == 0) then …
Addressing Modes: Comments
• Change the instruction count
  - Complex address modes reduce IC
• Change the organization of the machine
  - Complex address modes increase the complexity
  - May increase CPI
Operator Types
• Arithmetic and logical – Add, Sub, AND, OR
• Data transfer – Load, Store, Move
• Control – Branch, Jump, Calls, Returns
• System – Operating system calls
• Floating point
• Decimal – Decimal arithmetic, character conversion
• String – Move, Compares, Searches
• Graphics – Pixel operations (MMX)
How Large a Displacement?
Add R4,100(R1)
op-code | Register operand | Register operand | Displacement
The bigger the displacement, the more bits must be used. This leads to a larger instruction size.
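To put numbers on the displacement-size question, the sketch below assumes the displacement is stored as a two's complement signed field (an assumption; the slides do not state the encoding) and computes the reachable range and the smallest field that can hold a given value.

```python
def displacement_range(bits):
    """Smallest and largest displacement an n-bit two's complement field can hold."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def bits_needed(displacement):
    """Smallest two's complement field width that can hold `displacement`."""
    bits = 1
    while True:
        low, high = displacement_range(bits)
        if low <= displacement <= high:
            return bits
        bits += 1


print(displacement_range(16))  # (-32768, 32767)
print(bits_needed(100))        # 8, as in Add R4,100(R1)
```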
Operator Types
• Data transfer – Load, Store, Move
  - They transfer data.
  - Load transfers data from memory to a register
  - Store transfers data from a register to memory
  - Move transfers data between registers
Operator Types
• Arithmetic and logical – Add, Sub, AND, OR
  - They do the obvious thing
  - Use the ALU (arithmetic logic unit)
Instructions for Control Flow
• Many names – transfer, branch, jump
• Our terminology
  - Jump – unconditional change
  - Branch – conditional change
• Four types
  - Conditional branches
  - Jumps
  - Procedure calls
  - Procedure returns
PC-Relative
• Most architectures use PC-relative addressing
• Uses fewer bits for the destination
• Position independence – easier to link code
[Figure: PC-relative jump]
Address  Instruction
83       Load …
84       Load …
85       Add …
86       jump 10    (PC = 86 before jump)
…        …
96       …          (PC = 96 after jump)
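The jump in the figure can be checked with one line of arithmetic. The sketch below follows the slide's convention that the offset is added to the address of the jump itself (many real machines add it to the address of the following instruction instead), and also shows why the encoding stays valid when the code is moved; the base addresses are arbitrary.

```python
def pc_relative_target(pc, offset):
    """New PC after a PC-relative jump, using the slide's convention."""
    return pc + offset


print(pc_relative_target(86, 10))  # 96

# Position independence: wherever the code is loaded, the encoded offset is the same.
for base in (0, 1000, 50000):
    jump_address = base + 86
    destination = base + 96
    assert destination - jump_address == 10
    assert pc_relative_target(jump_address, 10) == destination
```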
Instructions for Control Flow
• Change the program counter (PC)
  - PC-relative addressing
• The operand for a control flow instruction is the destination
• Control flow instructions also have addressing modes
Addressing Modes
• Direct (immediate) or indirect
• Direct: the destination is known at compile time
• Indirect: the destination is known only at runtime
  - Case, switch statements
  - Usually the destination is put in a register
Conditional Branch
• if (a == b) { … } else { … }
• Implementation issue: How is the condition set?
Jump
• Unconditional change in the order of execution of instructions
• Can be used for looping
  for (i = 1 to 100)
  { … }
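One possible way a compiler might lower the loop above is with a conditional branch to leave the loop and an unconditional jump to repeat it. The sketch below simulates that with an explicit program counter; the five-instruction layout is invented for illustration.

```python
def run_loop(limit=100):
    """Simulate: 0: i = 1
                 1: branch to 5 if i > limit   (conditional branch)
                 2: loop body
                 3: i = i + 1
                 4: jump 1                     (unconditional jump)
                 5: done"""
    pc, i, iterations = 0, 0, 0
    while pc != 5:
        if pc == 0:
            i = 1
            pc = 1
        elif pc == 1:
            pc = 5 if i > limit else 2
        elif pc == 2:
            iterations += 1  # stands in for the loop body
            pc = 3
        elif pc == 3:
            i += 1
            pc = 4
        elif pc == 4:
            pc = 1
    return iterations


print(run_loop())  # 100
```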
Instructions: Summary
• Type of operation
  - ALU, data transfer, floating-point, control
• Are operands explicit or implicit?
  - Explicit – registers and memory
  - Implicit – stack and accumulator
• How many operands are in memory?
  - Load-store, register-memory, memory-memory
• How is the address determined (mode)?
  - Immediate, indirect etc …
Commonly Executed Instructions
80x86 processor using SPEC92
Rank  Instruction         Percentage
1     Load                22
2     Conditional branch  20
3     Compare             16
4     Store               12
5     Add                 8
6     And                 6
7     Sub                 5
8     Move                4
9     Call                1
10    Return              1
      Total               96%
Three Basic Variations
Variable approach (e.g. VAX)
Operation & no. of operands | Address specifier 1 | Address field 1 | … | Address specifier n | Address field n
• The operation field determines which operation and how many operands
• The address specifier bits determine the address mode (explicit)
Encoding the ISA
• What is the binary representation of the Instruction Set Architecture?
• How are the operations encoded?
  - The op-code
• How are the operands encoded?
  - Variable or fixed
• How is the address mode encoded?
  - Explicit or implicit in the op-code
Three Basic Variations
Variable approach (e.g. VAX)
Operation & no. of operands | Address specifier 1 | Address field 1 | … | Address specifier n | Address field n
Fixed approach (e.g. DLX, MIPS, PowerPC, SPARC)
Operation | Address field 1 | Address field 2 | Address field 3
Hybrid approach (e.g. IBM 360/70, Intel 80x86)
Operation | Address specifier | Address field
Operation | Address specifier | Address field 1 | Address field 2
Multiple formats. The op-code determines the length.
In the fixed approach, the address mode is implicit in the op-code.
Compiler
• In the past, decisions were made to make assembly language programming easier
• Today compilers do the work
• Compiler and ISA are not independent
Trade-offs
Format    Decoding        Instruction count
Fixed     Easy to decode  Many instructions
Variable  Hard to decode  Few instructions
Structure of Compilers
Phase                     Dependency                                           Function
Front-end                 Language dependent                                   Transform to common form
High-level optimizations  Somewhat language dependent; machine independent     Loop transformations, procedure in-lining
Global optimizer          Small language dependence; some machine dependence   Register allocation
                          (e.g. register count and types)
Code generator            Highly machine dependent; highly ISA dependent;      Detailed instruction selection,
                          language independent                                 machine dependent optimizations
The Register Allocation Problem
• Accessing registers is faster than memory
• The compiler should first ensure correctness, then minimize accesses to memory
• Problem: How to assign variables to registers
[Figure: variables 1 … n mapped onto registers 1 … m]
Many more variables than registers
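One common way to frame the register allocation problem is as graph colouring: variables that are live at the same time "interfere" and must not share a register. The slides do not name a specific algorithm, so the greedy colouring below is only an illustrative sketch; the interference graph and variable names are invented.

```python
def allocate(interference, num_registers):
    """Greedy colouring of an interference graph.

    interference maps each variable to the set of variables it must not
    share a register with. Returns variable -> register number, or None
    when no register is free and the variable is left in memory (spilled).
    """
    assignment = {}
    # Handle the most constrained variables first.
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[var] = free[0] if free else None
    return assignment


graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}
print(allocate(graph, 2))  # {'a': 0, 'b': 1, 'c': None, 'd': 0}
print(allocate(graph, 3))  # {'a': 0, 'b': 1, 'c': 2, 'd': 0}
```

With only two registers, three mutually interfering variables cannot all be kept in registers, so one of them stays in memory. This is the "many more variables than registers" situation in miniature.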
The Effect on Register Allocation?
• Stack
  - Register allocation generally effective
• Global data area
  - Register allocation is difficult
• Heap
  - Register allocation is near impossible
  - Too many pointers
  - Too big
Where are the variables?
• Stack
  - Used for local variables and activation records
  - Scalars (single variables as opposed to arrays)
  - Register allocation generally effective
• Global data area
  - Statically declared objects
  - Global variables and constants
  - Register allocation is difficult
• Heap
  - Dynamic objects
  - Generally accessed via pointers
  - Not scalars
  - Register allocation is near impossible
Instruction set properties that help compiler writers
• Orthogonal
  - Operations, data types, and addressing modes should be independent
• Provide primitives, not solutions
  - What works in one language may be bad for another
  - Avoid high-level instructions