
Lecture 03 Instruction Set Principles · 2019-01-10 · Hence, register architecture classification...

Lecture 03 Instruction Set Principles
CSCE 513 Computer Architecture
Department of Computer Science and Engineering
Yonghong Yan
http://cse.sc.edu/~yanyh
Page 1: Lecture 03 Instruction Set Principles · 2019-01-10 · Hence, register architecture classification (# mem, # operands) Number of memory addresses Maximum number of operands allowed


Contents

1. Introduction
2. Classifying Instruction Set Architectures
3. Memory Addressing
4. Type and Size of Operands
5. Operations in the Instruction Set
6. Instructions for Control Flow
7. Encoding an Instruction Set
8. Crosscutting Issues: The Role of Compilers
9. RISC-V ISA

• Supplement (not covered)
  – RISC vs CISC
  – Comparison of ISA
• Appendix K


1 Introduction

Instruction Set Architecture: the portion of the machine visible to the assembly-level programmer or to the compiler writer
– To use the hardware of a computer, we must speak its language
– The words of a computer language are called instructions, and its vocabulary is called an instruction set

[Figure: the instruction set as the interface between software and hardware]

Instr. #   Operation + Operands
i          movl -4(%ebp), %eax
(i+1)      addl %eax, (%edx)
(i+2)      cmpl 8(%ebp), %eax
(i+3)      jl L5
           :
           L5:


sum.s for X86

• http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
• https://en.wikibooks.org/wiki/X86_Assembly/SSE

[Figure annotations: 2 operands; -8(%eax): memory address]


sum.s for RISC-V

https://riscv.org/

[Figure annotations: 2 or 3 operands; -20(s0): memory address]


ISA In Real

• A PDF document that defines the model/architecture/interface of the machine
  – X86 and Intel SDM: https://software.intel.com/en-us/articles/intel-sdm
    • Several thousand pages
  – RISC-V ISA Spec: https://riscv.org/specifications/
    • Latest version 2.2, 145 pages
• A specification that provides the ISA details
• Review Chapter 2 of the COD book


2 Classifying Instruction Set Architectures

Operand storage in CPU:                     Where are they other than memory?
# explicit operands named per instruction:  How many? Min, max, average
Addressing mode:                            How is the effective address for an operand calculated? Can all use any mode?
Operations:                                 What are the options for the opcode?
Type & size of operands:                    How is typing done? How is the size specified?

These choices critically affect number of instructions, CPI, and CPU cycle time


ISA Classification

• Most basic differentiation: internal storage in a processor
  – Operands may be named explicitly or implicitly
• Major choices:
  1. In an accumulator architecture, one operand is implicitly the accumulator => similar to a calculator
  2. The operands in a stack architecture are implicitly on the top of the stack
  3. The general-purpose register architectures have only explicit operands: either registers or memory locations


Four ISA Classes

• Register-memory: X86 (CISC)
• Register-register: RISC (e.g. ARM, MIPS, RISC-V, Power)


Register Machines

• How many registers are sufficient?
• General-purpose registers vs. special-purpose registers
  – compiler flexibility and hand-optimization
• Two major concerns for arithmetic and logical instructions (ALU)
  1. Two or three operands
     • X + Y => X
     • X + Y => Z
  2. How many of the operands may be memory addresses (0 to 3)

Hence, register architecture classification (# mem, # operands)

# memory addresses   Max # operands allowed   Type of architecture   Examples
0                    3                        Load-store             Alpha, ARM, MIPS, PowerPC, SPARC, SuperH, TM32
1                    2                        Register-memory        IBM 360/370, Intel 80x86, Motorola 68000, TI TMS320C54x
2                    2                        Memory-memory          VAX (also has 3-operand formats)
3                    3                        Memory-memory          VAX (also has 2-operand formats)
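
The (# mem, # operands) classification can be expressed as a small lookup. The sketch below is only an illustration: the names `ARCH_CLASS` and `classify` are hypothetical, and the table entries are copied from the four rows above.

```python
# Hypothetical lookup built from the (# memory addresses, max # operands) table.
ARCH_CLASS = {
    (0, 3): "load-store (register-register)",
    (1, 2): "register-memory",
    (2, 2): "memory-memory",
    (3, 3): "memory-memory",
}


def classify(mem_addresses: int, max_operands: int) -> str:
    """Return the architecture class for an ALU instruction format."""
    if mem_addresses > max_operands:
        # An instruction cannot name more memory operands than operands in total.
        raise ValueError("more memory addresses than operands")
    return ARCH_CLASS.get((mem_addresses, max_operands), "not listed in the table")


print(classify(0, 3))  # the RISC-style row
print(classify(1, 2))  # the x86-style row
```

Formats outside the four listed rows are reported as unlisted rather than guessed.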


(0, 3): Register-Register (RISC)

• ALU is register to register, also known as a pure Reduced Instruction Set Computer (RISC)
• Advantages
  – Simple fixed-length instruction encoding
  – Decode is simple since the number of instruction types is small
  – Simple code generation model
  – Instruction CPI tends to be very uniform
    • Except for memory instructions, of course, but there are only 2 of them: load and store
• Disadvantages
  – Instruction count tends to be higher
  – Some instructions are short, wasting instruction word bits


(1, 2): Register-Memory (CISC, X86)

• Evolved RISC and also old CISC
  – new RISC machines capable of doing speculative loads
  – predicated and/or deferred loads are also possible
• Advantages
  – data access to ALU immediate without loading first
  – instruction format is relatively simple to encode
  – code density is improved over the register (0, 3) model
• Disadvantages
  – operands are not equivalent: a source operand may be destroyed
  – need for memory address field may limit # of registers
  – CPI will vary
    • if memory target is in L0 cache then not so bad
    • if not, life gets miserable


(2, 2) or (3, 3): Memory-Memory

Not used today

• True and most complex CISC model
  – currently extinct and likely to remain so
  – more complex memory actions are likely to appear but not directly linked to the ALU
• Advantages
  – most compact code
  – doesn't waste registers for temporary values
    • good idea for use-once data, e.g. streaming media
• Disadvantages
  – large variation in instruction size: may need a shoe-horn
  – large variation in CPI, i.e. work per instruction
  – exacerbates the infamous memory bottleneck
    • register file reduces memory accesses if reused


Summary: Tradeoffs for the ISA Classes

Register-register (0, 3)
  Advantages: Simple, fixed-length instruction encoding. Simple code generation model. Instructions take similar numbers of clocks to execute.
  Disadvantages: Higher instruction count than architectures with memory references in the instructions. More instructions and lower instruction density lead to larger programs.

Register-memory (1, 2)
  Advantages: Data can be accessed without a separate load instruction first. Instruction format tends to be easy to encode and yields good density.
  Disadvantages: Operands are not equivalent since a source operand is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction vary by operand location.

Memory-memory (2, 2) or (3, 3)
  Advantages: Most compact. Does not waste registers for temporaries.
  Disadvantages: Large variation in instruction size, especially for three-operand instructions. In addition, large variation in work per instruction. Memory accesses create a memory bottleneck. (Not used today)


3 Memory Addressing

• Objects have byte addresses: the number of bytes counted from the beginning of memory
• Object length:
  – bytes (8 bits), half words (16 bits),
  – words (32 bits), and double words (64 bits).
  – The type is implied in the opcode, e.g.,
    • LDB: load byte
    • LDW: load word, etc.
• Byte ordering
  – Little Endian: puts the byte whose address is xx00 at the least significant position in the word. (7,6,5,4,3,2,1,0)
  – Big Endian: puts the byte whose address is xx00 at the most significant position in the word. (0,1,2,3,4,5,6,7)
    • Problem occurs when exchanging data among machines with different orderings
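
The two byte orderings can be observed directly with Python's built-in `int.to_bytes` and `int.from_bytes`. This is a minimal sketch; the sample word `0x0A0B0C0D` and the helper name `swap32` are illustrative choices, not from the slides.

```python
value = 0x0A0B0C0D  # an arbitrary 32-bit word used for illustration

# Index 0 of each bytes object corresponds to the lowest byte address.
little = value.to_bytes(4, byteorder="little")  # least significant byte first
big = value.to_bytes(4, byteorder="big")        # most significant byte first


def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word (needed when exchanging data)."""
    return int.from_bytes(word.to_bytes(4, "little"), "big")


print(little.hex(), big.hex(), hex(swap32(value)))
```

A machine that reads the little-endian bytes as if they were big-endian sees the swapped value, which is the data-exchange problem the slide mentions.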


Interpreting Memory Addresses

• Alignment issues
  – Accesses to objects larger than a byte must be aligned.
    • An access to an object of size s bytes at byte address A is aligned if A mod s = 0.
  – Misalignment causes hardware complications
    • since memory is typically aligned on a word or a double-word boundary
    • Misalignment typically results in an alignment fault that must be handled by the OS
• Hence
  – byte address is anything: never misaligned
  – half word: even addresses, low-order address bit = 0 (XXXXXXX0), else trap
  – word: low-order 2 address bits = 0 (XXXXXX00), else trap
  – double word: low-order 3 address bits = 0 (XXXXX000), else trap
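
The alignment rule A mod s = 0 and the equivalent low-order-bit test can be sketched as follows. The function names are hypothetical, and the mask form assumes the object size is a power of two.

```python
def is_aligned(address: int, size: int) -> bool:
    """An access of `size` bytes at `address` is aligned if address mod size == 0."""
    return address % size == 0


def is_aligned_mask(address: int, size: int) -> bool:
    """Same test via the low-order address bits; assumes size is a power of two."""
    return address & (size - 1) == 0


for addr in (0x1000, 0x1002, 0x1004):
    print(hex(addr), [is_aligned(addr, s) for s in (1, 2, 4, 8)])
```

For a word (4 bytes) the mask is 0b11, which is the "low-order 2 address bits = 0" condition above.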


Memory Alignment


Aligned/Misaligned Addresses


Addressing Modes

• How does an architecture specify the effective address of an object?
  – Effective address: the actual memory address specified by the addressing mode.
    • E.g. Mem[R[R1]] refers to the contents of the memory location whose address is given by the contents of register 1 (R1).
• Addressing modes:
  – Register
  – Immediate
  – Displacement
  – Register indirect, …

[Figure annotation: -20(s0): memory address]
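
A minimal sketch of how an effective address is formed for two of the modes listed above. The dictionaries standing in for the register file and memory, the register contents, and the function name `effective_address` are all assumptions made for illustration.

```python
def effective_address(mode: str, regs: dict, reg: str, disp: int = 0) -> int:
    """Compute the effective address for a memory operand (illustrative only)."""
    if mode == "register_indirect":   # Mem[R[reg]]
        return regs[reg]
    if mode == "displacement":        # Mem[disp + R[reg]], e.g. -20(s0)
        return regs[reg] + disp
    raise ValueError(f"mode {mode!r} does not form a memory address here")


regs = {"R1": 0x100, "s0": 0x7FFF0040}   # hypothetical register contents
memory = {0x100: 42}                     # hypothetical memory contents

print(memory[effective_address("register_indirect", regs, "R1")])
print(hex(effective_address("displacement", regs, "s0", -20)))
```

Register and immediate operands are left out because they do not reference memory.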


Address Modes


Addressing Mode Impacts

• Instruction counts
• Architecture complexity
• CPI


Summary of Use of Memory Addressing Modes


Displacement Values are Widely Distributed

Impact instruction length


Displacement Addressing Mode

• Benchmarks show
  – 12 bits of displacement would capture about 75% of the full 32-bit displacements
  – 16 bits should capture about 99%
• Remember: optimize for the common case. Hence, the choice is at least 12-16 bits
• For addresses that do fit in displacement size:
    Add R4, 10000(R0)
• For addresses that don't fit in displacement size, the compiler must do the following:
    Load R1, 1000000
    Add R1, R0
    Add R4, 0(R1)
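
Whether a displacement fits in the instruction's field is a simple range check. The sketch below assumes a signed two's complement displacement field; the function name `fits_signed` is hypothetical, and the field widths are the 12-bit and 16-bit choices discussed above.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if `value` is representable in a `bits`-wide two's complement field."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


for disp in (10000, 1000000):
    print(disp, "fits in 16 bits:", fits_signed(disp, 16))
```

Under this assumption 10000 fits in a 16-bit field and can be used directly, while 1000000 does not, so the compiler falls back to the three-instruction sequence.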


Immediate Addressing Mode

• Used where we want to get to a numerical value in an instruction
• Around 25% of the operations have an immediate operand

At high level:

    a = b + 3;
    if (a > 17)
        goto Addr

At assembler level:

    Load R2, #3
    Add R0, R1, R2
    Load R2, #17
    CMPBGT R1, R2
    Load R1, Address
    Jump (R1)


About 25% of data transfer and ALU operations have an immediate operand

Impact instruction length


Number of Bits for Immediate

• 16 bits would capture about 80% and 8 bits about 50%.

Impact instruction length


Summary: Memory Addressing

• A new architecture is expected to support at least: displacement, immediate, and register indirect
  – these represent 75% to 99% of the addressing modes used
• The size of the address for displacement mode should be at least 12-16 bits
  – captures 75% to 99% of the displacements
• The size of the immediate field should be at least 8-16 bits
  – captures 50% to 80% of the immediates

Processors rely on compilers to generate code using those addressing modes


4 Type and Size of Operands

• The type of the operand is usually encoded in the opcode
  – e.g., LDB: load byte; LDW: load word
• Common operand types (which imply their sizes):
  – Character (8 bits or 1 byte)
  – Half word (16 bits or 2 bytes)
  – Word (32 bits or 4 bytes)
  – Double word (64 bits or 8 bytes)
  – Single precision floating point (4 bytes or 1 word)
  – Double precision floating point (8 bytes or 2 words)
  ✓ Characters are almost always in ASCII
  ✓ 16-bit Unicode (used in Java) is gaining popularity
  ✓ Integers are two's complement binary
  ✓ Floating points follow the IEEE standard 754
• Some architectures support packed decimal: 4 bits are used to encode the values 0-9; 2 decimal digits are packed into each byte

How is the type of an operand designated?
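
The packed decimal format mentioned above (two decimal digits per byte, 4 bits each) can be sketched in a few lines. The helper names `pack_bcd` and `unpack_bcd` are hypothetical, and the sketch covers only non-negative integers with no sign nibble.

```python
def pack_bcd(number: int) -> bytes:
    """Pack a non-negative integer, two decimal digits per byte (4 bits each)."""
    if number < 0:
        raise ValueError("this sketch handles non-negative values only")
    digits = str(number)
    if len(digits) % 2:            # pad to an even number of digits
        digits = "0" + digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


def unpack_bcd(data: bytes) -> int:
    """Inverse of pack_bcd: read the high nibble, then the low nibble, of each byte."""
    result = 0
    for byte in data:
        result = result * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return result


print(pack_bcd(1234).hex(), unpack_bcd(pack_bcd(1234)))
```

Each hex digit of the packed bytes reads as the corresponding decimal digit, which is what makes the format convenient for decimal arithmetic and display.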


Distribution of Data Accesses by Size


Summary: Type and Size of Operands

• A 32-bit architecture supports 8-, 16-, and 32-bit integers, and 32-bit and 64-bit IEEE 754 floating-point data.
• A new 64-bit address architecture supports 64-bit integers
• Media processors and DSPs need wider accumulating registers for accuracy.


5 Operations in the Instruction Set

• All computers generally provide a full set of operations for the first three categories
• All computers must have some instruction support for basic system functions
• Graphics instructions typically operate on many smaller data items in parallel


Top 10 Instructions for 80x86


Instruction Encoding

• RISC-V R-format instruction

• RISC-V I-format instruction
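
Since the format diagrams did not survive in the transcript, the sketch below packs the two formats by hand. The field positions follow the RISC-V base ISA specification (R-format: funct7, rs2, rs1, funct3, rd, opcode; I-format: imm[11:0], rs1, funct3, rd, opcode); the function names and the example instructions are illustrative and not taken from the slides.

```python
def encode_r(opcode: int, rd: int, funct3: int, rs1: int, rs2: int, funct7: int) -> int:
    """Pack a 32-bit R-format word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def encode_i(opcode: int, rd: int, funct3: int, rs1: int, imm: int) -> int:
    """Pack a 32-bit I-format word: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return (((imm & 0xFFF) << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# add x3, x1, x2  (opcode 0x33, funct3 0, funct7 0)
print(hex(encode_r(0x33, rd=3, funct3=0, rs1=1, rs2=2, funct7=0)))
# addi x1, x0, 5  (opcode 0x13, funct3 0)
print(hex(encode_i(0x13, rd=1, funct3=0, rs1=0, imm=5)))
```

Both formats keep opcode, rd, funct3, and rs1 in the same bit positions, which is the uniform-decode property discussed later in the lecture.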


6 Instructions for Control Flow

• Control instructions change the flow of control:
  – instead of executing the next instruction, the program branches to the address specified in the branching instruction
• They break the pipeline
  – Difficult to optimize out
  – AND they are frequent
• Four types of control instructions
  – Conditional branches
    • if…else, for/while, switch/case, …
  – Jumps: unconditional transfer
    • goto
  – Procedure calls
    • foo()
  – Procedure returns
    • return


Breakdown of Control Flow Instructions

  – Conditional branches
  – Jumps: unconditional transfer
  – Procedure calls
  – Procedure returns

• Issues:
  – Where is the target address? How to specify it? (label)
  – Caller: Where is the return address kept? How are the arguments passed?
  – Callee: Where is the return address? How are the results passed?


Addressing Modes for Control Flow Instructions

• PC-relative (Program Counter)
  – Supply a displacement added to the PC
    • Known at compile time for jumps, branches, and calls (specified within the instruction)
  – The target is often near the current instruction
    • Requiring fewer bits
    • Independent of where it is loaded (position independence)
• Register indirect addressing: dynamic addressing
  – The target address may not be known at compile time
  – Naming a register that contains the target address
    • Case or switch statements
    • Virtual functions or methods in C++ or Java
    • High-order functions or function pointers in C or C++
    • Dynamically shared libraries
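
PC-relative addressing amounts to adding a sign-extended displacement to the PC. The sketch below is a simplification under stated assumptions: the 12-bit field width is an arbitrary choice, the function names are hypothetical, and real ISAs often scale the offset (for example by the instruction alignment) before adding it.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc: int, offset_field: int, bits: int = 12) -> int:
    """Target = PC + sign-extended displacement (no scaling in this sketch)."""
    return pc + sign_extend(offset_field, bits)


# The same encoded offset reaches the same relative target wherever the code is loaded.
for load_address in (0x1000, 0x80001000):
    print(hex(branch_target(load_address, 0xFFC)))  # 0xFFC encodes -4
```

Because only the displacement is encoded, the instruction bits do not change when the program is loaded at a different address.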


Branch Distances


Conditional Branch Options

Figure 2.21 Major methods for evaluating branch conditions


Comparison Type vs. Frequency

• Most loops go from 0 to n.
• Most backward branches are loops: taken about 90%

Program   % backward branches   % all control instructions that modify PC
gcc       26%                   63%
spice     31%                   63%
TeX       17%                   70%
Average   25%                   65%


Procedure Invocation Options

• Procedure calls and returns
  – control transfer
  – state saving; the return address must be saved
  – Newer architectures require the compiler to generate stores and loads for each register saved and restored
• Two basic conventions in use to save registers
  – caller saving: the calling procedure must save the registers that it wants preserved for access after the call
    • the called procedure need not worry about registers
  – callee saving: the called procedure must save the registers it wants to use
    • leaving the caller unrestrained
  – most real systems today use a combination of both
• The application binary interface (ABI) sets down the basic rules as to which registers should be caller saved and which should be callee saved


7. Encoding an Instruction Set

• Opcode: specifying the operation
• # of operands
  – addressing mode
  – address specifier: tells what addressing mode is used
  – Load-store computer
    • Only one memory operand
    • Only one or two addressing modes
• The architecture must balance several competing forces when encoding the instruction set:
  – # of registers && addressing modes
  – Size of registers && addressing mode fields
  – Average instruction size && average program size.
  – Easy to handle in pipeline implementation.


Example: x86 and Alpha

• x86:

• Alpha:


Three Basic Variations for Instruction Encoding

The length of 80x86 (CISC) instructions varies between 1 and 17 bytes.

The length of most RISC ISA instructions is 4 bytes.

X86 programs are generally smaller than RISC programs.

To reduce RISC code size


Instruction Length Tradeoffs

• Fixed length: length of all instructions the same
  + Easier to decode a single instruction in hardware
  + Easier to decode multiple instructions concurrently
  -- Wasted bits in instructions (Why is this bad?)
  -- Harder-to-extend ISA (how to add new instructions?)
• Variable length: length of instructions different (determined by opcode and sub-opcode)
  + Compact encoding (Why is this good?)
    Intel 432: Huffman encoding (sort of). 6 to 321 bit instructions. How?
  -- More logic to decode a single instruction
  -- Harder to decode multiple instructions concurrently
• Tradeoffs
  – Code size (memory space, bandwidth, latency) vs. hardware complexity
  – ISA extensibility and expressiveness
  – Performance? Smaller code vs. imperfect decode


Uniform vs Non-uniform Decode

• Uniform decode: the same bits in each instruction correspond to the same meaning
  – Opcode is always in the same location
  – immediate values, …
  – Many "RISC" ISAs: Alpha, MIPS, SPARC
  + Easier decode, simpler hardware
  + Enables parallelism: generate target address before knowing the instruction is a branch
  -- Restricts instruction format (fewer instructions?) or wastes space
• Non-uniform decode
  – E.g., opcode can be the 1st-7th byte in x86
  + More compact and powerful instruction format
  -- More complex decode logic


Reduced Code Size in RISCs

• Hybrid encoding: support 16-bit and 32-bit instructions in RISC, e.g., ARM Thumb, MIPS16, and RISC-V
  – Narrow instructions support fewer operations, smaller address and immediate fields, fewer registers, and a two-address format rather than the classic three-address format
  – Claim a code size reduction of up to 40%
• Compression in IBM's CodePack
  – Adds hardware to decompress instructions as they are fetched from memory on an instruction cache miss
  – The instruction cache contains full 32-bit instructions, but compressed code is kept in main memory, ROMs, and the disk
  – Claim code reduction of 35% to 40%
  – PowerPC creates a hash table in memory that maps between compressed and uncompressed addresses. Code size: 35%~40%
• Hitachi's SuperH: fixed 16-bit format
  – 16 rather than 32 registers
  – fewer instructions


Summary of Instruction Encoding

• Three choices
  – Variable, fixed, and hybrid
  – Note the differences between hybrid and variable
• The choice of instruction encoding is a tradeoff:
  – For performance: fixed encoding
  – For code size: variable encoding
• How hybrid encoding is used in RISC to reduce code size
  – 16-bit and 32-bit
• In general, we see:
  – RISC: fixed or hybrid
  – CISC: variable


8 The Role of Compilers

• Almost all programming is done in high-level languages.
  – An ISA is essentially a compiler target.
• See backup slides for the compilation stages used by most compilers, e.g. gcc
• Compiler goals:
  – All correct programs execute correctly
  – Most compiled programs execute fast (optimizations)
  – Fast compilation
  – Debugging support


Typical Modern Compiler Structure

Figure A.19 Compilers typically consist of two to four passes, with more highly optimizing compilers having more passes. This structure maximizes the probability that a program compiled at various levels of optimization will produce the same output when given the same input. The optimizing passes are designed to be optional and may be skipped when faster compilation is the goal and lower-quality code is acceptable. A pass is simply one phase in which the compiler reads and transforms the entire program. (The term phase is often used interchangeably with pass.) Because the optimizing passes are separated, multiple languages can use the same optimizing and code generation passes. Only a new front end is required for a new language.


Optimization Types

• High level: done at or near source code level
  – If a procedure is called only once, put it in-line and save the CALL
  – more general case: if call-count < some threshold, put them in-line
• Local: done within straight-line code
  – common sub-expressions produce the same value: either allocate a register or replace with a single copy
  – constant propagation: replace a constant-valued variable with the constant
  – stack height reduction: re-arrange the expression tree to minimize temporary storage needs
• Global: across a branch
  – copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
  – code motion: remove code from a loop that computes the same value each iteration of the loop and put it before the loop
  – simplify or eliminate array addressing calculations in loops
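
Code motion can be illustrated with a before/after pair written by hand. This is only a source-level sketch of what a compiler would do on its intermediate representation; the function names and the expression `a * b + 1` are made up for the example.

```python
def scale_naive(values, a, b):
    """Before: the loop recomputes a value that never changes between iterations."""
    out = []
    for v in values:
        factor = a * b + 1        # loop-invariant computation
        out.append(v * factor)
    return out


def scale_hoisted(values, a, b):
    """After code motion: the invariant is computed once, before the loop."""
    factor = a * b + 1
    out = []
    for v in values:
        out.append(v * factor)
    return out


print(scale_naive([1, 2, 3], 2, 5), scale_hoisted([1, 2, 3], 2, 5))
```

The transformation is valid only because `a` and `b` are not modified inside the loop, which is what the compiler's analysis has to establish.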


Optimization Types

• Machine-dependent optimizations: based on machine knowledge
  – strength reduction: replace multiply by a constant with shifts and adds
    • would make sense if there was no hardware support for MUL
    • a trickier version: 17 × x = (x arithmetic left shift 4) + x
• Pipeline scheduling: reorder instructions to improve pipeline performance
  – dependency analysis
  – branch offset optimization: reorder code to minimize branch offsets
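
Strength reduction of a multiply by a constant can be sketched as a shift-and-add loop over the constant's set bits. The function names are hypothetical, and the general helper assumes a non-negative constant.

```python
def times17(x: int) -> int:
    """17 * x as one shift and one add: (x << 4) + x."""
    return (x << 4) + x


def mul_by_const(x: int, c: int) -> int:
    """Multiply x by a non-negative constant c using only shifts and adds."""
    if c < 0:
        raise ValueError("this sketch assumes a non-negative constant")
    result = 0
    shift = 0
    while c:
        if c & 1:                 # each set bit of c contributes x << shift
            result += x << shift
        c >>= 1
        shift += 1
    return result


print(times17(5), mul_by_const(7, 10))
```

A compiler would emit the shifts and adds as straight-line code for the known constant rather than run a loop; the loop here only shows which terms appear.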


Major Types of Optimizations


Compiler Optimizations – Change in IC

• L0: unoptimized
• L1: local opts, code scheduling, & local reg. allocation
• L2: global opts and loop transformations, & global reg. allocation
• L3: procedure integration

gcc -O2 hello.c -o hello


Compiler Based Register Optimization

• Compiler assumes a small number of registers (16-32)
  – Optimizing use is up to the compiler
  – HLL programs have no explicit references to registers
• Compiler approach
  – Assign a symbolic or virtual register to each candidate variable
  – Map (unlimited) symbolic registers to real registers
  – Symbolic registers that do not overlap can share real registers
  – If you run out of real registers, some variables must be kept in memory
    • Spilling


Graph Coloring

• Given a graph of nodes and edges
  – Assign a color to each node
    • Adjacent nodes have different colors
    • Use minimum number of colors
• Register allocation
  – Nodes are symbolic registers
  – Two registers that are live in the same program fragment are joined by an edge
  – Try to color the graph with n colors, where n is the number of real registers
  – Nodes that cannot be colored are placed in memory

https://en.wikipedia.org/wiki/Graph_coloring
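
A minimal sketch of register allocation by graph coloring, assuming a simple greedy strategy that colors high-degree nodes first. This heuristic is an assumption of the example and does not guarantee the minimum number of colors; the function name `color_registers` and the sample interference graph are hypothetical.

```python
def color_registers(interference: dict, num_regs: int):
    """Greedily assign real registers (colors) to symbolic registers (nodes).

    `interference` maps each symbolic register to the set of symbolic
    registers that are live at the same time. Returns (assignment, spilled).
    """
    assignment = {}
    spilled = []
    # Visit high-degree nodes first; ties are broken by name for determinism.
    for node in sorted(interference, key=lambda n: (-len(interference[n]), n)):
        used = {assignment[n] for n in interference[node] if n in assignment}
        free = [r for r in range(num_regs) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)   # cannot be colored: place it in memory
    return assignment, spilled


graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(color_registers(graph, 2))
print(color_registers(graph, 3))
```

With two real registers the triangle a, b, c cannot be colored, so one node is spilled; with three registers every symbolic register gets a real one.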


Iron-code Summary

• Section A.2: Use general-purpose registers with a load-store architecture.
• Section A.3: Support these addressing modes: displacement (with an address offset size of 12 to 16 bits), immediate (size 8 to 16 bits), and register indirect.
• Section A.4: Support these data sizes and types: 8-, 16-, 32-, and 64-bit integers and 64-bit IEEE 754 floating-point numbers.
  – Now we see 16-bit FP for deep learning in GPU
    • http://www.nextplatform.com/2016/09/13/nvidia-pushes-deep-learning-inference-new-pascal-gpus/
• Section A.5: Support these simple instructions, since they will dominate the number of instructions executed: load, store, add, subtract, move register-register, and shift.
• Section A.6: Compare equal, compare not equal, compare less, branch (with a PC-relative address at least 8 bits long), jump, call, and return.
• Section A.7: Use fixed instruction encoding if interested in performance, and use variable instruction encoding if interested in code size.
• Section A.8: Provide at least 16 general-purpose registers, be sure all addressing modes apply to all data transfer instructions, and aim for a minimalist IS
  – Often use separate floating-point registers.
  – The justification is to increase the total number of registers without raising problems in the instruction format or in the speed of the general-purpose register file. This compromise, however, is not orthogonal.


Real World ISA


The details in design are all about trade-offs!
