Princeton University  *NVIDIA
ASPLOS 2017
Caroline Trippel, Yatin A. Manerkar, Daniel Lustig*, Michael Pellauer*, Margaret Martonosi
TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA
http://check.cs.princeton.edu/
Memory Models in the Hardware-Software Stack
[Figure: stack diagram with labels High-level Language (HLL) Memory Model, Compilation, ISA Memory Model, Microarchitecture, Hardware Implementation]
§ What can go wrong?
• Ill-specified HLL memory model
• Incorrect HLL→ISA compilation
• Inadequate ISA specification
• Incorrect hardware implementation
§ Current techniques verify only portions of stack
• Compiler mappings from HLL to ISA
• Validity of hardware implementation
Our Work: Memory Consistency Model Verification
[Figure: stack diagram with labels Software Memory Model, Compiler Mappings, ISA Memory Model, OS, Hardware Memory Model; tools shown: ArMOR [ISCA15], PipeCheck [MICRO47], CCICheck [MICRO48], COATCheck [ASPLOS16], TriCheck]
Why is TriCheck Necessary?
§ Memory model bugs are real and problematic!
• ARM Read-after-Read Hazard [Alglave et al. TOPLAS14]
• RISC-V ISA is currently incompatible with C11
• C11→POWER/ARMv7 "trailing-sync" compiler mapping [Batty et al. POPL '12]
• C11→POWER/ARMv7 "leading-sync" compiler mapping [Lahav et al. PLDI17]
§ ISAs are an important and still-fluid design point!
• Often, ISAs designed in light of desired HW optimizations
• ISA places some constraints on hardware and some on compiler
• Many industry memory models are still evolving: C11, ARMv7 vs. ARMv8
• New ISAs are designed, e.g., RISC-V CPUs, specialized accelerators
§ Correctness requires cooperation of the whole stack
This work
Outline
§ Memory Consistency Model Verification
§ Full-Stack Verification: Motivating Example
§ TriCheck Framework for Full-Stack Memory Model Verification
§ Bugs Found with TriCheck: RISC-V Case Study and Compiler Mappings
§ Ongoing Work & Conclusions
ARM Read-Read Hazard
[Figure: stack diagram with labels Software Memory Model, Compilation, ISA Memory Model, Microarchitecture, Hardware Implementation: ARM Cortex-A9]
C11/C++11    ARMv7
st(rlx)      STR
ld(rlx)      LDR
ld(acq)      LDR; DMB
…            …
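As a minimal sketch of how a mapping table like this can be applied, the Python snippet below encodes only the three rows shown on the slide and expands a list of C11 operations into ARMv7 mnemonics. The names `C11_TO_ARMV7` and `compile_ops` are hypothetical, and the rows elided on the slide are deliberately left out rather than guessed.

```python
# Rows copied from the slide's C11/C++11 -> ARMv7 table.
# The elided rows ("…") are intentionally not filled in.
C11_TO_ARMV7 = {
    ("st", "rlx"): ["STR"],
    ("ld", "rlx"): ["LDR"],
    ("ld", "acq"): ["LDR", "DMB"],
}


def compile_ops(ops):
    """Expand (operation, memory_order) pairs into ARMv7 mnemonics.

    Hypothetical helper for illustration; raises KeyError for any
    operation that the slide's table does not list.
    """
    out = []
    for op in ops:
        if op not in C11_TO_ARMV7:
            raise KeyError(f"no mapping listed for {op}")
        out.extend(C11_TO_ARMV7[op])
    return out


if __name__ == "__main__":
    # A relaxed store followed by an acquire load.
    print(compile_ops([("st", "rlx"), ("ld", "acq")]))
```

Note that relaxed loads map to a plain LDR with no fence, so any ordering the HLL expects between two relaxed loads has to come from the ISA and hardware themselves.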
[Litmus test, threads T0 and T1: st(data,1,rlx); st(data,2,rlx); r1=ld(ptr,rlx); r2=ld(data,rlx)]
Initial conditions: data=0, *ptr=&data
Forbidden by C11: r1=2, r2=1
[Compiled ARMv7 code, cores C0 and C1: ST [data]←1; ST [data]←2; LD [ptr]→r0; LD [r0]→r1; LD [data]→r2]
Two loads of the same address
Forbidden outcome observable on Cortex-A9
TriCheck Key Ideas
§ First tool capable of full-stack memory model verification
• Any layer can introduce real bugs
§ Litmus Tests + Auto-generators
• Comprehensive families of tests across HLL ordering options, compiler mapping variations, ISA options
§ Happens-before, graph-based analysis
• Nodes are memory accesses & ordering primitives
• Edges are event orders discerned via memory model relations
§ Efficient top-to-bottom analysis: Runtime in seconds or minutes
• Fast enough to find real bugs; Interactive design process
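The happens-before analysis described above can be illustrated with a small cycle check. In happens-before analyses of this kind, an execution whose graph contains a cycle cannot occur, because some event would have to precede itself. The sketch below is a generic depth-first search in Python, not TriCheck's implementation; the function name `has_cycle` and the node labels in the usage example are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle.

    Nodes stand for memory accesses or ordering primitives; each edge
    (a, b) means "a happens before b".
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:  # back edge: a cycle exists
                return True
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


if __name__ == "__main__":
    # Hypothetical events a, b, c with a circular ordering requirement.
    print(has_cycle([("a", "b"), ("b", "c"), ("c", "a")]))
    # The same events with a consistent ordering.
    print(has_cycle([("a", "b"), ("b", "c"), ("a", "c")]))
```

A cyclic graph means the candidate outcome is unobservable on the modeled hardware; an acyclic graph means the outcome can be observed.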
TriCheck Methodology
§ User-defined TriCheck inputs
• HLL memory model (Herd [Alglave et al. TOPLAS14])
• HLL→ISA compiler mappings
• Hardware model (μspec DSL)
§ Auto-generated TriCheck inputs
• HLL litmus test suite from templates
§ Each iteration: bugs analyzed to identify cause
• Compiler bug, hardware implementation bug, ISA bug
• Blame may be debated
• Blame != Fix
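[Figure: TriCheck flow. User-defined inputs: HLL memory model, HLL litmus tests, HLL→ISA compiler mappings, Microarchitecture model. These feed TriCheck, followed by the decision points "Bugs?" and "Strict?" with branches labeled Yes, No, Yes, Yes/No, ending at "Done".]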
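The comparison at the heart of this flow can be sketched as follows. For each litmus test outcome, the HLL model says whether the outcome is permitted, and the microarchitecture model says whether it is observable. The sketch assumes the usual reading of the three result categories (Bug, Overly Strict, Equivalent); the function names `classify` and `tally` are hypothetical and the input data in the usage example is made up.

```python
from collections import Counter


def classify(hll_permits, hw_observable):
    """Compare the HLL verdict with the hardware verdict for one outcome."""
    if hw_observable and not hll_permits:
        return "Bug"            # hardware shows what the HLL forbids
    if hll_permits and not hw_observable:
        return "Overly Strict"  # hardware forbids what the HLL allows
    return "Equivalent"         # both layers agree


def tally(results):
    """Count categories over (hll_permits, hw_observable) pairs."""
    return Counter(classify(p, o) for p, o in results)


if __name__ == "__main__":
    # Made-up verdicts for four hypothetical litmus test outcomes.
    sample = [(False, True), (True, False), (True, True), (False, False)]
    print(tally(sample))
```

Only the "Bug" category is a correctness problem; "Overly Strict" results point at ordering the hardware enforces beyond what the HLL needs, which is where legal relaxations may be possible.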
RISC-V Case Study
§ Create μspec models for 7 distinct RISC-V implementation possibilities:
• All abide by current RISC-V spec
• Vary in preserved program order and store atomicity
§ Started with stricter-than-spec microarchitecture: RISC-V Rocket Chip
• TriCheck detects bugs: refine for correctness
• TriCheck detects over-strictness: Performed legal (per RISC-V spec) microarchitectural relaxations
§ Impossible to compile C11 for RISC-V as specified.
§ Out of 1,701 tested C11 programs:
• RISC-V-Base-compliant design allows 144 buggy outcomes
• RISC-V-Base+A-compliant design allows 221 buggy outcomes
RISC-V Base: Lack of Cumulative Fences
[Figure: bar chart of test variations (axis 0 to 250) classified as Bugs, Overly Strict, or Equivalent. Litmus test: wrc. ISA: RISC-V Baseline (Base). Variation: riscv-curr, riscv-ours. μSpec Model: WR, rWR, rWM, rMM, nWR, nMM, A9like.]
C11 acquire/release synchronization is transitive: accesses before a release write in program order, and observed by the releasing core prior to the release write, must be ordered before the release from the viewpoint of an acquire read that reads from the release write
[Figure: cores C0, C1, C2; store buffers (STB); Main Memory]
C0             C1                   C2
ST flag1←1     if (LD flag1==1)     if (LD flag2==1)
               FENCE[LD.ST,ST]      FENCE[LD,LD.ST]
               ST flag2←1           LD flag1→test
flag1=0 flag2=1
Setting flag1 causes setting flag2
Base RISC-V ISA lacks cumulative fences
• Cumulative fence needed to enforce order between different-thread accesses
• Cannot fix bugs by modifying compiler
Our solution: add cumulative fences to the Base RISC-V ISA
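The effect of cumulativity on the three-core test can be illustrated with a toy propagation model. This is a deliberately simplified sketch, not the μspec model used in the case study: it assumes each store becomes visible to each remote core at a separate moment, assumes C2's own fence keeps its two loads in order, and enumerates every order of those visibility events. The function name `c2_flag1_values` and the event labels are hypothetical.

```python
from itertools import permutations


def c2_flag1_values(cumulative_fence):
    """Values C2 can load from flag1 once it has seen flag2 == 1.

    Toy model: each event "X->Cn" is the moment store X becomes
    visible to core Cn.
    """
    events = ("flag1->C1", "flag1->C2", "flag2->C2")
    seen = set()
    for order in permutations(events):
        t = {event: i for i, event in enumerate(order)}
        # C1 stores flag2 only after it has loaded flag1 == 1.
        if t["flag1->C1"] > t["flag2->C2"]:
            continue
        # A cumulative fence on C1 also orders the store C1 observed
        # (flag1) before C1's later store (flag2) for every core.
        if cumulative_fence and t["flag1->C2"] > t["flag2->C2"]:
            continue
        # C2 loads flag1 right after it observes flag2 == 1.
        seen.add(1 if t["flag1->C2"] < t["flag2->C2"] else 0)
    return seen


if __name__ == "__main__":
    print(sorted(c2_flag1_values(cumulative_fence=False)))
    print(sorted(c2_flag1_values(cumulative_fence=True)))
```

Without cumulativity the model lets C2 read flag1=0 after flag2=1. With a cumulative fence on C1 that outcome disappears, which is why the fix has to be in the ISA rather than the compiler mapping.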
More results in the paper:
§ Both Base and Base+A:
• Lack of cumulative lightweight fences
• Lack of cumulative heavyweight fences
• Re-ordering of same-address loads
• No dependency ordering, but Linux port assumes it
§ Base+A only:
• Lack of cumulative releases; no acquire-release synchronization
• No roach-motel movement
Takeaway: Current RISC-V cannot serve as a compiler target for C11
Next Steps: We are members of RISC-V memory model working group, working to formalize a memory model for RISC-V that meets the needs of RISC-V users and supports C11.
Evaluating Compiler Mappings with TriCheck
§ During RISC-V analysis, we discovered two counter-examples while using the "proven-correct" trailing-sync mappings for compiling C11 to POWER/ARMv7
§ Also incorrect: the proof for the C11 to POWER/ARMv7 trailing-sync compiler mappings [Manerkar et al., CoRR '16]
Conclusions
§ Memory model design choices are complicated =>
• Verification calls for automated analysis to comprehensively tackle subtle interplay between many diverse features.
§ TriCheck uncovered flaws in the RISC-V memory model…
• But more generally, TriCheck can be used on any ISA.
§ Languages and Compilers matter too…
• TriCheck uncovered bugs in the trailing-sync compiler mapping from C11 to POWER/ARMv7
[email protected]  http://check.cs.princeton.edu/