CS 61C: Great Ideas in Computer Architecture
MIPS Datapath
Instructors: Nicholas Weaver & Vladimir Stojanovic
http://inst.eecs.Berkeley.edu/~cs61c/sp16
Arithmetic and Logic Unit
• Most processors contain a special logic block called the “Arithmetic and Logic Unit” (ALU)
• We’ll show you an easy one that does ADD, SUB, bitwise AND, bitwise OR
Our simple ALU
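A minimal behavioral sketch of such an ALU, assuming 32-bit operands. The operation-select encoding (`ALU_ADD` and friends) is hypothetical and chosen only for illustration; it is not taken from the slides.

```python
MASK32 = 0xFFFFFFFF  # keep every result to 32 bits

# Hypothetical operation-select encoding, chosen only for this sketch.
ALU_ADD, ALU_SUB, ALU_AND, ALU_OR = 0, 1, 2, 3

def alu(a, b, op):
    """Return the 32-bit result of applying op to a and b."""
    if op == ALU_ADD:
        return (a + b) & MASK32
    if op == ALU_SUB:
        return (a - b) & MASK32  # wraps around like two's complement hardware
    if op == ALU_AND:
        return a & b
    if op == ALU_OR:
        return a | b
    raise ValueError(f"unknown ALU op: {op}")
```

For example, `alu(3, 5, ALU_SUB)` yields `0xFFFFFFFE`, the 32-bit two's complement pattern for -2.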
How to design Adder/Subtractor?
• Truth-table, then determine canonical form, then minimize and implement as we’ve seen before
• Look at breaking the problem down into smaller pieces that we can cascade or hierarchically layer
Adder/Subtractor – One-bit adder LSB… (Half Adder)
Adder/Subtractor – One-bit full-adder (1/2)…
Adder/Subtractor – One-bit adder (2/2)
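The circuit diagrams for these slides did not survive extraction, so here is a small sketch of the standard one-bit building blocks. It assumes the usual textbook equations (sum = XOR of the inputs, carry-out = majority of the inputs); the function names are illustrative.

```python
def half_adder(a, b):
    """Add two bits (the LSB case, no carry-in); return (sum, carry_out)."""
    return a ^ b, a & b

def full_adder(a, b, c_in):
    """Add two bits plus a carry-in; return (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (a & c_in) | (b & c_in)  # majority of the three inputs
    return s, c_out
```

With `c_in = 0` the full adder behaves exactly like the half adder.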
N 1-bit adders ⇒ 1 N-bit adder
What about overflow? Overflow = c_n?
[Figure: 1-bit adders (+) cascaded into an N-bit adder, starting at bit 0 (b0).]
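A sketch of cascading N one-bit adders, which also lets us check the overflow question. It relies on the standard result that the carry out of the top bit (c_n) signals overflow for unsigned numbers, while for two's complement numbers overflow is c_n XOR c_(n-1). Helper names and the 4-bit default width are assumptions of this example.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: return (sum, carry_out)."""
    return a ^ b ^ c_in, (a & b) | (a & c_in) | (b & c_in)

def ripple_carry_add(x, y, n=4):
    """Add two n-bit values by chaining n full adders.

    Returns (sum, carries) where carries[i] is the carry into bit i
    and carries[n] is the carry out of the top bit (c_n).
    """
    carries = [0]
    total = 0
    for i in range(n):
        s, c = full_adder((x >> i) & 1, (y >> i) & 1, carries[-1])
        total |= s << i
        carries.append(c)
    return total, carries

def add_with_overflow(x, y, n=4):
    """Return (sum, unsigned_overflow, signed_overflow) for n-bit addition."""
    total, c = ripple_carry_add(x, y, n)
    unsigned_overflow = c[n]           # carry out of the top bit
    signed_overflow = c[n] ^ c[n - 1]  # carry into and out of top bit differ
    return total, unsigned_overflow, signed_overflow
```

With 4 bits, `add_with_overflow(7, 1)` gives `(8, 0, 1)`: no carry out, but 0111 + 0001 = 1000 is a signed overflow. `add_with_overflow(15, 1)` gives `(0, 1, 0)`: a carry out, yet as signed values -1 + 1 = 0 is correct.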
Extremely Clever Adder/Subtractor
x y XOR(x,y)
0 0 0
0 1 1
1 0 1
1 1 0
XOR serves as conditional inverter!
Aka "Subtract is Invert and add 1"
[Figure: adder/subtractor built from cascaded 1-bit adders (+).]
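A sketch of the idea in code: a single control bit (called `sub` here, a name assumed for this example) is XORed with every bit of b and also fed in as the carry-in of bit 0. When `sub` is 1 this inverts b and adds 1, which is two's complement negation.

```python
def add_sub(a, b, sub, n=4):
    """Return (a + b) or (a - b) modulo 2**n, built from 1-bit full adders.

    sub is the 1-bit control: 0 = add, 1 = subtract.
    """
    carry = sub  # the "+1" of "invert and add 1" enters as the carry-in
    result = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ sub  # XOR as conditional inverter
        s = a_i ^ b_i ^ carry
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry)
        result |= s << i
    return result
```

For 4-bit operands, `add_sub(5, 3, 1)` returns 2 and `add_sub(3, 5, 1)` returns 14, the 4-bit pattern for -2.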
Clicker Question
Convert the truth table to a boolean expression (no need to simplify):
x y F(x,y)
0 0 0
0 1 1
1 0 0
1 1 1
A: F = xy + x(~y)
B: F = xy + (~x)y + (~x)(~y)
C: F = (~x)y + x(~y)
D: F = xy + (~x)y
E: F = (x+y)(~x+~y)
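The general recipe for this kind of conversion is to write one product term per row whose output is 1 and OR the terms together. Below is a small sketch of that recipe, demonstrated on the XOR truth table from the adder/subtractor slide rather than on the clicker table. The function name and string format are assumptions of this example.

```python
def sum_of_products(table, names=("x", "y")):
    """Build an unsimplified sum-of-products string from a truth table.

    table maps each input tuple to its output bit; one product term
    (minterm) is emitted per row whose output is 1.
    """
    terms = []
    for inputs in sorted(table):
        if table[inputs] == 1:
            literals = [n if bit else f"(~{n})" for n, bit in zip(names, inputs)]
            terms.append("".join(literals))
    return " + ".join(terms) if terms else "0"

# The XOR truth table from the adder/subtractor slide.
xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(sum_of_products(xor_table))  # (~x)y + x(~y)
```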
Administrivia
• Project 2-2 is out!
Components of a Computer
[Figure: Processor (Control; Datapath with PC, Registers, Arithmetic & Logic Unit (ALU)) connected to Memory (Program, Data; Bytes) through the Processor-Memory Interface (Enable?, Read/Write, Address, Write Data, Read Data); Input and Output attach through the I/O-Memory Interfaces.]
The CPU
• Processor (CPU): the active part of the computer that does all the work (data manipulation and decision-making)
• Datapath: portion of the processor that contains hardware necessary to perform operations required by the processor (the brawn)
• Control: portion of the processor (also in hardware) that tells the datapath what needs to be done (the brain)
Datapath and Control
• Datapath designed to support data transfers required by instructions
• Controller causes correct transfers to happen
[Figure: datapath with PC, +4 adder, instruction memory, registers (rs, rt, rd), imm, ALU, and data memory; the Controller takes opcode and funct from the instruction.]
Five Stages of Instruction Execution
• Stage 1: Instruction Fetch
• Stage 2: Instruction Decode
• Stage 3: ALU (Arithmetic-Logic Unit)
• Stage 4: Memory Access
• Stage 5: Register Write
Stages of Execution on Datapath
[Figure: the datapath (PC, +4, instruction memory, registers with rs/rt/rd, imm, ALU, data memory) divided into five stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Register Write.]
Stages of Execution (1/5)
• There is a wide variety of MIPS instructions: so what general steps do they have in common?
• Stage 1: Instruction Fetch
  – no matter what the instruction, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy)
  – also, this is where we Increment PC (that is, PC = PC + 4, to point to the next instruction: byte addressing so + 4)
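A minimal sketch of the fetch step, assuming a toy instruction memory held in a Python `bytes` object and big-endian byte order (an assumption; MIPS hardware can be configured either way). The two sample words encode `add $3,$1,$2` and `lw $t0,40($t1)`.

```python
def fetch(memory, pc):
    """Stage 1: read the 32-bit word at byte address pc, then bump the PC.

    memory is a bytes-like object; big-endian byte order is an
    assumption of this sketch.
    """
    if pc % 4 != 0:
        raise ValueError("instruction addresses must be word-aligned")
    word = int.from_bytes(memory[pc:pc + 4], "big")
    return word, pc + 4  # byte addressing: the next instruction is 4 bytes on

# Two instruction words placed back to back in a toy instruction memory.
imem = bytes.fromhex("00221820" "8d280028")
inst, pc = fetch(imem, 0)
```

After the call, `inst` holds `0x00221820` and `pc` is 4, ready for the next fetch.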
Stages of Execution (2/5)
• Stage 2: Instruction Decode
  – upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data)
  – first, read the opcode to determine instruction type and field lengths
  – second, read in data from all necessary registers
    • for add, read two registers
    • for addi, read one register
    • for jal, no reads necessary
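The decode step can be sketched as follows for the three examples above. The opcode values are the standard MIPS encodings (R-type 0x00, jal 0x03, addi 0x08); the function name and the list-of-values return shape are assumptions of this example, and only these three cases are covered.

```python
OP_RTYPE, OP_JAL, OP_ADDI = 0x00, 0x03, 0x08  # standard MIPS opcodes

def decode_and_read(inst, regs):
    """Stage 2: split out the fields, then read only the registers needed."""
    op = (inst >> 26) & 0x3F
    rs = (inst >> 21) & 0x1F
    rt = (inst >> 16) & 0x1F
    if op == OP_RTYPE:   # e.g. add: two register reads
        return op, [regs[rs], regs[rt]]
    if op == OP_ADDI:    # addi: one register read
        return op, [regs[rs]]
    if op == OP_JAL:     # jal: no register reads
        return op, []
    raise NotImplementedError(f"opcode {op:#04x} not covered by this sketch")
```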
Stages of Execution (3/5)
• Stage 3: ALU (Arithmetic-Logic Unit)
  – the real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |), comparisons (slt)
  – what about loads and stores?
    • lw $t0,40($t1)
    • the address we are accessing in memory = the value in $t1 PLUS the value 40
    • so we do this addition in this stage
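A small sketch of that address calculation. The 16-bit offset field is sign-extended before the add, so negative offsets work too. Function names are illustrative.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a two's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def effective_address(base, imm16):
    """Stage 3 for lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF
```

If `$t1` holds `0x1000`, then `lw $t0,40($t1)` accesses `effective_address(0x1000, 40)`, which is `0x1028`.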
Stages of Execution (4/5)
• Stage 4: Memory Access
  – actually only the load and store instructions do anything during this stage; the others remain idle during this stage or skip it altogether
  – since these instructions have a unique step, we need this extra stage to account for them
  – as a result of the cache system, this stage is expected to be fast
Stages of Execution (5/5)
• Stage 5: Register Write
  – most instructions write the result of some computation into a register
  – examples: arithmetic, logical, shifts, loads, slt
  – what about stores, branches, jumps?
    • don’t write anything into a register at the end
    • these remain idle during this fifth stage or skip it altogether
Datapath Walkthroughs (1/3)
• add $r3,$r1,$r2 # r3 = r1+r2
  – Stage 1: fetch this instruction, increment PC
  – Stage 2: decode to determine it is an add, then read registers $r1 and $r2
  – Stage 3: add the two values retrieved in Stage 2
  – Stage 4: idle (nothing to write to memory)
  – Stage 5: write result of Stage 3 into register $r3
Example: add Instruction
[Figure: datapath executing add r3,r1,r2. Register specifiers: rs = 1, rt = 2, rd = 3. The register file outputs reg[1] and reg[2]; the ALU computes reg[1] + reg[2].]
Datapath Walkthroughs (2/3)
• slti $r3,$r1,17 # if (r1 < 17) r3 = 1 else r3 = 0
  – Stage 1: fetch this instruction, increment PC
  – Stage 2: decode to determine it is an slti, then read register $r1
  – Stage 3: compare value retrieved in Stage 2 with the integer 17
  – Stage 4: idle
  – Stage 5: write the result of Stage 3 (1 if reg source was less than signed immediate, 0 otherwise) into register $r3
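The Stage 3 comparison is a signed one: both the register value and the 16-bit immediate are treated as two's complement numbers. A small sketch, with illustrative function names:

```python
def to_signed32(x):
    """Interpret a 32-bit register pattern as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def slti(rs_value, imm16):
    """Return 1 if rs_value < sign-extended imm16 (signed compare), else 0."""
    imm = imm16 & 0xFFFF
    imm = imm - 0x10000 if imm & 0x8000 else imm
    return 1 if to_signed32(rs_value) < imm else 0
```

So `slti(5, 17)` is 1, and `slti(0xFFFFFFFF, 17)` is also 1 because the register pattern represents -1.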
Example: slti Instruction
[Figure: datapath executing slti r3,r1,17. Register specifiers: rs = 1, rt = 3, rd unused (x). The register file outputs reg[1]; imm = 17; the ALU evaluates reg[1] < 17?]
Datapath Walkthroughs (3/3)
• sw $r3,16($r1) # Mem[r1+16] = r3
  – Stage 1: fetch this instruction, increment PC
  – Stage 2: decode to determine it is a sw, then read registers $r1 and $r3
  – Stage 3: add 16 to value in register $r1 (retrieved in Stage 2) to compute address
  – Stage 4: write value in register $r3 (retrieved in Stage 2) into memory address computed in Stage 3
  – Stage 5: idle (nothing to write into a register)
Example: sw Instruction
[Figure: datapath executing sw r3,16(r1). Register specifiers: rs = 1, rt = 3, rd unused (x). The register file outputs reg[1] and reg[3]; imm = 16; the ALU computes reg[1] + 16; data memory performs MEM[r1+16] = r3.]
Why Five Stages? (1/2)
• Could we have a different number of stages?
  – Yes, other ISAs have different natural number of stages
    • And these days, pipelining can be much more aggressive than the "natural" 5 stages MIPS uses
• Why does MIPS have five if instructions tend to idle for at least one stage?
  – Five stages are the union of all the operations needed by all the instructions.
  – One instruction uses all five stages: the load
Why Five Stages? (2/2)
• lw $r3,16($r1) # r3 = Mem[r1+16]
  – Stage 1: fetch this instruction, increment PC
  – Stage 2: decode to determine it is a lw, then read register $r1
  – Stage 3: add 16 to value in register $r1 (retrieved in Stage 2)
  – Stage 4: read value from memory address computed in Stage 3
  – Stage 5: write value read in Stage 4 into register $r3
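The five steps of the load can be traced in a few lines of code. This is a sketch under simplifying assumptions: instruction and data memories are Python dicts mapping byte addresses to 32-bit words, and only lw (opcode 0x23) is handled. The function name `run_lw` is hypothetical.

```python
def run_lw(regs, imem, dmem, pc):
    """Trace lw through all five stages; returns the new PC.

    regs: list of 32 ints; imem/dmem: dicts from byte address to word.
    """
    # Stage 1: fetch the instruction, increment PC
    inst = imem[pc]
    pc += 4
    # Stage 2: decode the fields, read the base register
    op = (inst >> 26) & 0x3F
    rs = (inst >> 21) & 0x1F
    rt = (inst >> 16) & 0x1F
    imm = inst & 0xFFFF
    assert op == 0x23, "this sketch only handles lw (opcode 0x23)"
    base = regs[rs]
    # Stage 3: ALU adds the sign-extended offset to the base
    offset = imm - 0x10000 if imm & 0x8000 else imm
    addr = (base + offset) & 0xFFFFFFFF
    # Stage 4: read data memory at the computed address
    value = dmem[addr]
    # Stage 5: write the loaded value into the destination register
    regs[rt] = value
    return pc
```

For `lw $r3,16($r1)` (word `0x8C230010`) with register 1 holding `0x100` and `dmem[0x110] = 42`, register 3 ends up holding 42.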
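Example: lw Instruction
[Figure: datapath executing lw r3,16(r1). Register specifiers: rs = 1, rt = 3, rd unused (x). The register file outputs reg[1]; imm = 16; the ALU computes reg[1] + 16; data memory returns MEM[r1+16], which is written into r3.]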
Clickers/Peer Instruction
• Which type of MIPS instruction is active in the fewest stages?
A: LW
B: BEQ
C: J
D: JAL
E: ADDU
Processor Design: 5 steps
Step 1: Analyze instruction set to determine datapath requirements
  – Meaning of each instruction is given by register transfers
  – Datapath must include storage element for ISA registers
  – Datapath must support each register transfer
Step 2: Select set of datapath components & establish clock methodology
Step 3: Assemble datapath components that meet the requirements
Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
Step 5: Assemble the control logic
The MIPS Instruction Formats
• All MIPS instructions are 32 bits long. 3 formats:
  – R-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
  – I-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | address/immediate [15:0], 16 bits
  – J-type: op [31:26], 6 bits | target address [25:0], 26 bits
• The different fields are:
  – op: operation (“opcode”) of the instruction
  – rs, rt, rd: the source and destination register specifiers
  – shamt: shift amount
  – funct: selects the variant of the operation in the “op” field
  – address/immediate: address offset or immediate value
  – target address: target address of jump instruction
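The field layouts translate directly into shifts and masks. The sketch below packs fields into 32-bit words for each of the three formats and slices a word back apart at the bit boundaries 31, 26, 21, 16, 11, 6 and 0. The helper names are illustrative, not part of any MIPS toolchain.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-type word: op | rs | rt | rd | shamt | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm16):
    """Pack an I-type word: op | rs | rt | 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)

def encode_j(op, target):
    """Pack a J-type word: op | 26-bit target address."""
    return (op << 26) | (target & 0x3FFFFFF)

def fields(inst):
    """Slice a 32-bit word at every field boundary of the three formats."""
    return {
        "op": (inst >> 26) & 0x3F,
        "rs": (inst >> 21) & 0x1F,
        "rt": (inst >> 16) & 0x1F,
        "rd": (inst >> 11) & 0x1F,
        "shamt": (inst >> 6) & 0x1F,
        "funct": inst & 0x3F,
        "imm16": inst & 0xFFFF,
        "target": inst & 0x3FFFFFF,
    }
```

For example, `encode_r(0, 1, 2, 3, 0, 0x21)` gives `0x00221821`, the word for `addu $3,$1,$2`. Which of the sliced fields are meaningful depends on the format selected by `op`.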
The MIPS-lite Subset
• ADDU and SUBU
  – addu rd,rs,rt
  – subu rd,rs,rt
  – format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
• OR Immediate:
  – ori rt,rs,imm16
  – format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
• LOAD and STORE Word
  – lw rt,rs,imm16
  – sw rt,rs,imm16
  – format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
• BRANCH:
  – beq rs,rt,imm16
  – format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
Register Transfer Level (RTL)
• Colloquially called “Register Transfer Language”
• RTL gives the meaning of the instructions
• All start by fetching the instruction itself
  {op, rs, rt, rd, shamt, funct} ← MEM[ PC ]
  {op, rs, rt, Imm16} ← MEM[ PC ]

Inst    Register Transfers
ADDU    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU    R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI     R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD    R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
STORE   MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
BEQ     if ( R[rs] == R[rt] ) PC ← PC + 4 + {sign_ext(Imm16), 2’b00}
        else PC ← PC + 4
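The register transfers can be read almost line for line as an interpreter. The sketch below is one way to write that, assuming a single dict `MEM` mapping byte addresses to 32-bit words for both instructions and data, and using the standard MIPS opcode and funct values. It does not model register 0 being hardwired to zero, and the function name `step` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16):
    """Sign-extend a 16-bit field to a Python int."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(R, MEM, PC):
    """Execute the instruction at MEM[PC]; return the next PC.

    R is a list of 32 register values; MEM maps byte addresses to words.
    """
    inst = MEM[PC]
    op, funct = inst >> 26, inst & 0x3F
    rs, rt, rd = (inst >> 21) & 0x1F, (inst >> 16) & 0x1F, (inst >> 11) & 0x1F
    imm16 = inst & 0xFFFF
    if op == 0x00 and funct == 0x21:    # ADDU
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif op == 0x00 and funct == 0x23:  # SUBU
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif op == 0x0D:                    # ORI
        R[rt] = R[rs] | imm16           # zero_ext: upper 16 bits are 0
    elif op == 0x23:                    # LOAD (lw)
        R[rt] = MEM[(R[rs] + sign_ext(imm16)) & MASK32]
    elif op == 0x2B:                    # STORE (sw)
        MEM[(R[rs] + sign_ext(imm16)) & MASK32] = R[rt]
    elif op == 0x04:                    # BEQ
        if R[rs] == R[rt]:
            # {sign_ext(Imm16), 2'b00} is the offset shifted left by 2
            return (PC + 4 + (sign_ext(imm16) << 2)) & MASK32
    else:
        raise NotImplementedError("outside the MIPS-lite subset")
    return (PC + 4) & MASK32
```

Calling `step` repeatedly, feeding each returned PC back in, runs a program one instruction per call, which mirrors the single long clock cycle per instruction of this datapath.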
In Conclusion
• “Divide and Conquer” to build complex logic blocks from smaller simpler pieces (adder)
• Five stages of MIPS instruction execution
• Mapping instructions to datapath components
• Single long clock cycle per instruction