xBGAS: A Bridge Proposal for RV128 and HPC

Post on 05-Jun-2020



John Leidel¹, David Donofrio², Farzad Fatollahi-Fard², Kurt Keville³

¹Tactical Computing Labs, ²Lawrence Berkeley National Lab, ³MIT

Data Center Scale Addressing
• Extended Base Global Address Space (xBGAS)
• Goals:
  • Provide extended addressing capabilities without ruining the base ABI
    • E.g., RV64 apps will still execute without an issue
  • Extended addressing must be flexible enough to support multiple target application spaces/system architectures
    • Traditional data centers, clouds, HPC, etc.
  • Extended addressing must not specifically rely upon any one virtual memory mechanism
    • E.g., provide for object-based memory resolution
• What is xBGAS NOT?
  • …a direct replacement for RV128

Application Domains
• HPA-FLAT
  • High performance analytics flat addressing
  • For extremely large data sets that are too difficult/time consuming to shard
• MMAP-IO
  • Map storage tiers into address space
  • Potential for object-based addressing
  • See DDN WOS
• Cloud-BSP
  • Potential for global object visibility for in-memory cloud infrastructures (Spark)
  • Reduce the time/cost to port Java to a full 128-bit addressing model
• HPC-PGAS
  • High Performance Computing: Partitioned Global Address Space

HPC-PGAS
• Traditional message passing paradigm has a tremendous amount of overhead
  • User library overhead, driver overhead
  • Optimized for large data transfers
  • Management of communication for Exascale-class systems
• We have excellent examples of low-latency PGAS runtimes, but little hardware/uArch support
  • LBNL: GASNet
  • PNNL: Global Arrays/ARMCI
  • Cray: Chapel
  • OpenSHMEM

[Figure: a global address space split into partitions Part0 through Part4, with get and put operations moving data between partitions.]
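The partitioned get/put picture can be illustrated with a small single-process sketch. This is only a toy model of the PGAS idea, not any of the runtimes named on the slide: the class `PartitionedArray` and its methods are hypothetical names, and every partition lives in one Python process, whereas a real system would place each partition on a different node.

```python
class PartitionedArray:
    """Toy model of a partitioned global address space (hypothetical names)."""

    def __init__(self, num_parts, part_size):
        self.part_size = part_size
        # One local buffer per partition (Part0, Part1, ...).
        self.parts = [[0] * part_size for _ in range(num_parts)]

    def _locate(self, global_index):
        # Split a global index into (partition number, local offset).
        part, offset = divmod(global_index, self.part_size)
        if not 0 <= part < len(self.parts):
            raise IndexError("global index outside the address space")
        return part, offset

    def put(self, global_index, value):
        # One-sided write: the caller names the global location directly.
        part, offset = self._locate(global_index)
        self.parts[part][offset] = value

    def get(self, global_index):
        # One-sided read from whichever partition owns the location.
        part, offset = self._locate(global_index)
        return self.parts[part][offset]


ga = PartitionedArray(num_parts=5, part_size=4)
ga.put(9, 123)       # lands in Part2 at local offset 1
value = ga.get(9)
```

The point of the sketch is that the caller never constructs a message or involves the owning partition in a matching receive; the global index alone selects the target.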

Addressing Architecture
• uArch maps extended addressing into RV64
  • We hope to generalize this for RV32 as well
• CSR bits encoded to appear as standard RV64 uArch
  • XLEN maps to RV64
  • TBD whether we need additional interrupts and exceptions
• Addition of extended {eN} registers that map to base general registers
  • Extended registers are manually utilized via extended load/store/move instructions

[Figure: the RV64I ALU and the RV64I register file (x0 … x31) alongside the RV128I extended register file (e0 … e31). Example: eld x31, 0(x21). Effective address = imm + 128-bit base address, where bits [127:64] = e21 and bits [63:0] = x21.]
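The effective-address rule in the figure can be worked through in a few lines. This is a sketch under stated assumptions: the slides do not say whether the immediate can carry into the upper 64 bits, so the code below simply treats {eN, xN} as one 128-bit base and adds the sign-extended offset modulo 2^128. The 12-bit immediate width follows the standard RISC-V I-type format, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def sign_extend_12(imm):
    """Sign-extend a 12-bit immediate (standard RISC-V I-type width)."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def effective_address(e_reg, x_reg, imm):
    """Return the 128-bit effective address of an extended load/store.

    e_reg supplies bits [127:64] and x_reg supplies bits [63:0].
    Assumption: the offset is added to the full 128-bit base and
    wraps modulo 2**128; the slides do not specify carry behavior.
    """
    base = ((e_reg & MASK64) << 64) | (x_reg & MASK64)
    return (base + sign_extend_12(imm)) & MASK128


# eld x31, 0(x21) with e21 = 0x2 and x21 = 0x1000
addr = effective_address(0x2, 0x1000, 0)
upper, lower = addr >> 64, addr & MASK64
```

Here `upper` recovers the contents of e21 and `lower` the contents of x21, matching the bit ranges shown in the figure.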

ISA Extension
• Instructions are split into three blocks:
  • Base integer load/store
  • Raw integer load/store
  • Address management
• Base integer load/store (I-type)
  • Permits loading/storing all base RV64I data types using standard mnemonic
  • Ex: eld rd, imm(rs1)
  • The extended register mapped to the same index as 'rs1' is implied
• Raw integer load/store (R-type)
  • Permits loading/storing using explicit extended registers combined with explicit base registers (no imm)
  • erld rd, rs1, ext2
  • LOAD(ext2[127:64], rs1[63:0])
• Address Management
  • Permits explicit manipulation of the extended register contents
  • eaddie extd, rs1, imm
  • extd = rs1 + imm
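A small behavioral model may help show how the three instruction blocks differ in where the upper address bits come from. It is a sketch, not the specification: `XbgasModel` is a hypothetical name, memory is a plain dictionary keyed by 128-bit address, only the doubleword load forms are modeled, and the handling of carry and of register e0 is assumed rather than taken from the slides.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


class XbgasModel:
    """Toy register/memory model for eld, erld and eaddie (hypothetical)."""

    def __init__(self):
        self.x = [0] * 32   # base registers x0..x31
        self.e = [0] * 32   # extended registers e0..e31
        self.mem = {}       # assumed flat memory: 128-bit address -> value

    def _write_x(self, rd, value):
        if rd != 0:         # x0 is hardwired to zero in RISC-V
            self.x[rd] = value & MASK64

    def eld(self, rd, imm, rs1):
        # Base integer load: the extended register with the same index
        # as rs1 is implied and supplies address bits [127:64].
        base = (self.e[rs1] << 64) | self.x[rs1]
        self._write_x(rd, self.mem.get((base + imm) & MASK128, 0))

    def erld(self, rd, rs1, ext2):
        # Raw integer load: explicit extended register, no immediate.
        addr = (self.e[ext2] << 64) | self.x[rs1]
        self._write_x(rd, self.mem.get(addr, 0))

    def eaddie(self, extd, rs1, imm):
        # Address management: extd = rs1 + imm.
        self.e[extd] = (self.x[rs1] + imm) & MASK64


m = XbgasModel()
m.x[5] = 7
m.eaddie(21, 5, 0)                  # e21 = 7
m.x[21] = 0x1000
m.mem[(7 << 64) | 0x1008] = 42
m.eld(31, 8, 21)                    # implied e21, offset 8
m.x[6] = 0x1008
m.erld(30, 6, 21)                   # explicit e21, address low bits in x6
```

Both loads reach the same location: `eld` pairs x21 with the implied e21, while `erld` pairs x6 with an explicitly named e21.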

ISA Extension Encodings

xBGAS Architecture Extension Specification (xBGAS 0.0.4)

4 xBGAS Instruction Set Listings

4.1 xBGAS Load/Store Instructions

Table 2: Extended RV64 Load Operations

Mnemonic              base       funct3  dest  opcode
eld rd, imm(rs1)      rs1+ext1   011     rd    0111111
elw rd, imm(rs1)      rs1+ext1   010     rd    0111111
elh rd, imm(rs1)      rs1+ext1   001     rd    0111111
elhu rd, imm(rs1)     rs1+ext1   101     rd    0111111
elb rd, imm(rs1)      rs1+ext1   000     rd    0111111
elbu rd, imm(rs1)     rs1+ext1   100     rd    0111111

Table 3: Extended RV64 Store Operations

Mnemonic              src   base       funct3  opcode
esd rs1, imm(rs2)     rs1   rs2+ext2   011     0111110
esw rs1, imm(rs2)     rs1   rs2+ext2   010     0111110
esh rs1, imm(rs2)     rs1   rs2+ext2   001     0111110
esb rs1, imm(rs2)     rs1   rs2+ext2   000     0111110

Table 4: Extended Quad and E-Loads

Mnemonic              base       funct3  dest  opcode
elq rd, imm(rs1)      rs1+ext1   000     rd    1111111
ele extd, imm(rs1)    rs1+ext1   001     rd    1111111

Table 5: Extended Quad and E-Stores

Mnemonic              src    base       funct3  opcode
esq rs1, imm(rs2)     rs1    rs2+ext2   100     1111111
ese ext1, imm(rs2)    ext1   rs2+ext2   101     1111111
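To make the Table 2 rows concrete, the following sketch packs an extended load into a 32-bit word. The funct3 and opcode values come from Table 2; the field positions are an assumption, namely the standard RISC-V I-type layout (imm[11:0], rs1, funct3, rd, opcode), which the slides imply by calling these instructions I-type but do not spell out. `encode_load` is a hypothetical helper, not part of any xBGAS toolchain.

```python
LOAD_OPCODE = 0b0111111  # opcode column of Table 2

# funct3 column of Table 2
LOAD_FUNCT3 = {
    "eld": 0b011,
    "elw": 0b010,
    "elh": 0b001,
    "elhu": 0b101,
    "elb": 0b000,
    "elbu": 0b100,
}


def encode_load(mnemonic, rd, rs1, imm):
    """Encode an extended load, assuming the standard I-type layout:
    imm[11:0] (31:20) | rs1 (19:15) | funct3 (14:12) | rd (11:7) | opcode (6:0)
    """
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register index out of range")
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    funct3 = LOAD_FUNCT3[mnemonic]
    return (
        ((imm & 0xFFF) << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | LOAD_OPCODE
    )


# eld x31, 0(x21)
word = encode_load("eld", rd=31, rs1=21, imm=0)
print(f"{word:#010x}")  # 0x000abfbf
```

The extended register never appears in the encoding: it is implied by the index in the rs1 field, which is why the word has the same shape as an ordinary RV64I load apart from the opcode.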


4.2 Raw Integer Load/Store Instructions

Table 6: Raw Integer Load/Store Instructions

Mnemonic                funct7   rs2   rs1   funct3  rd    opcode
erld rd, rs1, ext2      0000010  ext2  rs1   011     rd    0111111
erlw rd, rs1, ext2      0000010  ext2  rs1   010     rd    0111111
erlh rd, rs1, ext2      0000010  ext2  rs1   001     rd    0111111
erlhu rd, rs1, ext2     0000010  ext2  rs1   101     rd    0111111
erlb rd, rs1, ext2      0000010  ext2  rs1   000     rd    0111111
erlbu rd, rs1, ext2     0000010  ext2  rs1   100     rd    0111111
erle extd, rs1, ext2    0000011  ext2  rs1   100     extd  0111111
ersd rs1, rs2, ext3     0000100  rs2   rs1   011     rs1   0111111
ersw rs1, rs2, ext3     0000100  rs2   rs1   010     rs1   0111111
ersh rs1, rs2, ext3     0000100  rs2   rs1   001     rs1   0111111
ersb rs1, rs2, ext3     0000100  rs2   rs1   000     rs1   0111111
erse ext1, rs2, ext3    0001000  rs2   ext1  011     rs1   0111111
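The raw load rows of Table 6 can be packed the same way. This sketch covers only the six raw loads that share funct7 0000010, and it assumes the standard RISC-V R-type bit positions for the columns Table 6 lists (funct7, rs2, rs1, funct3, rd, opcode). `encode_raw_load` is a hypothetical helper name.

```python
RAW_OPCODE = 0b0111111       # opcode column of Table 6
RAW_LOAD_FUNCT7 = 0b0000010  # funct7 shared by erld .. erlbu

# funct3 column of Table 6 for the raw loads
RAW_LOAD_FUNCT3 = {
    "erld": 0b011,
    "erlw": 0b010,
    "erlh": 0b001,
    "erlhu": 0b101,
    "erlb": 0b000,
    "erlbu": 0b100,
}


def encode_raw_load(mnemonic, rd, rs1, ext2):
    """Encode a raw load, assuming the standard R-type layout:
    funct7 (31:25) | rs2 (24:20) | rs1 (19:15) | funct3 (14:12) | rd (11:7) | opcode (6:0)
    The extended register index ext2 occupies the rs2 field, as in Table 6.
    """
    for index in (rd, rs1, ext2):
        if not 0 <= index < 32:
            raise ValueError("register index out of range")
    funct3 = RAW_LOAD_FUNCT3[mnemonic]
    return (
        (RAW_LOAD_FUNCT7 << 25)
        | (ext2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | RAW_OPCODE
    )


# erld x10, x11, e12
raw_word = encode_raw_load("erld", rd=10, rs1=11, ext2=12)
```

Unlike the base integer loads, the extended register here is named explicitly in the instruction word, at the cost of the immediate offset.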

4.3 xBGAS Address Management Instructions

Table 7: Address Management Instructions

Mnemonic                  base   funct3  dest  opcode
eaddi rd, ext1, imm       ext1   001     rd    1111111
eaddie extd, rs1, imm     rs1    100     extd  1111111
eaddix extd, ext1, imm    extd   101     ext1  1111111

4.4 Assembly Mnemonics

In addition to the aforementioned encodings and core xBGAS instruction extensions, we also define a set of complementary instruction mnemonics that may be supported by the target binary assembler in order to facilitate condensed definition of common operations. The following table describes these instructions and their associated mnemonics.

Table 8: Assembly Mnemonics

Mnemonic             Base Instruction
movebe rd, ext1      eaddi rd, ext1, 0
moveeb extd, rs1     eaddie extd, rs1, 0
moveee extd, ext1    eaddix extd, ext1, 0
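The Table 8 mnemonics amount to a textual rewrite an assembler could perform before encoding. The sketch below shows that rewrite on assembly strings; `expand` is a hypothetical helper and the parsing is deliberately minimal (one space after the mnemonic, comma-separated operands), so it is not how any particular assembler implements it.

```python
# Table 8: pseudo-mnemonic -> base address management instruction
EXPANSIONS = {
    "movebe": "eaddi",
    "moveeb": "eaddie",
    "moveee": "eaddix",
}


def expand(line):
    """Rewrite a Table 8 mnemonic into its base instruction with imm = 0.

    Lines using any other mnemonic are returned unchanged (stripped).
    """
    text = line.strip()
    mnemonic, _, operands = text.partition(" ")
    base = EXPANSIONS.get(mnemonic)
    if base is None:
        return text
    ops = [op.strip() for op in operands.split(",")]
    if len(ops) != 2 or not all(ops):
        raise ValueError(f"{mnemonic} expects two operands")
    return f"{base} {ops[0]}, {ops[1]}, 0"


print(expand("movebe x5, e6"))    # eaddi x5, e6, 0
print(expand("moveeb e7, x8"))    # eaddie e7, x8, 0
print(expand("moveee e1, e2"))    # eaddix e1, e2, 0
```

Each move is therefore just an add of zero, so the three address management encodings in Table 7 are enough to copy values between the base and extended register files in either direction.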


[Slide callouts for the specification excerpts above: Base Integer Load/Store, Raw Integer Load/Store, Address Management, Assembly Mnemonics]

ABI (Calling Convention)
• This is where things get tricky…
• The base RV{32,64} ABI defines:
  • Context save/restore space
  • Call/return register utilization
  • Caller/callee saved state
  • Core data types
• We want to preserve as much as possible while providing extended addressing
• Many outstanding questions
  • How do we link base RV objects with objects containing extended addressing?
  • How do we address the caller/callee saved state with extended registers?
  • Debugging and debugging metadata?

Research & Progress
• Software
  • Data Intensive Scalable Computing Lab at Texas Tech is leading the software research
  • Current xBGAS spec implemented in LLVM compiler
  • Study the various application domains
• Hardware
  • TCL/LBNL/MIT leading hardware effort
  • Exploring pipelined and accelerator-based implementations
• Other Topics
  • Operating system (context save info)
  • Debugging
  • Programming Model

Community Support & Interest
• xBGAS spec available on GitHub
  • https://github.com/tactcomplabs/xbgas-archspec
• RISC-V Tools branch from Priv-1.9 in progress
  • https://github.com/tactcomplabs/riscv-tools
• We welcome comments/collaborators!

Acknowledgements

• Farzad Fatollahi-Fard, David Donofrio, John Shalf: Lawrence Berkeley Lab
• Kurt Keville: MIT
• Xi Wang, Frank Conlon, Yong Chen: Texas Tech University
• Bruce Jacob: University of Maryland
• Steve Wallach: Micron