
Looking Backwards The Coming Decade of BSD

Date post: 03-Jun-2020
Looking Backwards The Coming Decade of BSD George Neville-Neil
Transcript
Page 1: Looking Backwards The Coming Decade of BSD...The Linux Scheduler: a Decade of Wasted Cores NUMA Awareness • We know the memory topology • Memory must be allocated near the process

Looking Backwards The Coming Decade of BSD

George Neville-Neil

Page 2

Welcome to EuroBSD 2026!

• FreeBSD 15
  • Dropped support for sparc64 and PC98
• NetBSD 11.0
  • Dropped VAX, Amiga, and Atari ST Support
• OpenBSD 9.0
  • First implementation of SMP!

Page 3

Some Notable BSD Achievements

• Scaling to 32K CPU cores
• Single System Serving 10 Terabits/sec
• Always on Petabyte File Server
• Security Isolation Technology in Every Mobile Device
• Most commonly deployed IoT OS
• The most used OS technology in the world

Page 4

2017: BSD Declared Dead (again)

• 64 bit inode work complete
• First exabyte scale UFS3 deployment
• Network stack librarification continues
• Integration of Concurrency Kit primitives
• BSD API Standards Published
• LLVM Compiler Extensions Begin

Page 5

2018: Linux On the Desktop

• Three new schedulers added as libraries
  • Massive Multicore (MMC)
  • Little John (Big/Little written by John Baldwin)
  • SkimpySched (Power aware scheduler for embedded)
• Enhanced NUMA Awareness started in MMC Scheduler

• 1 Terabit NICs support
• VFS system packaged as a library
  • MSDOSFS first FS to be turned into a library
  • Adopted as standard by most embedded systems projects

Page 6

2019: Hinkley Point B Meltdown tracked to use of Linux 2.6 kernel

• All network stack components are now libraries
  • Based on pioneering work with ifLib
  • Network device drivers shrink by 2/3

• Librarification of VM system starts
• First working version of LLVM assisted system configurator
• LLDB and LLVM now default for all BSD systems and CPU architectures
• All calls to printf() replaced by DTrace debugging
• NVDIMM Support Complete
  • Libraries may now use memory that never goes away

Page 7

2021: Google Abandons Go in Favor of Rust

• VM system as a library
• All user level configuration programs now consume and emit machine readable output
• All BSDs now come in flavors which may or may not look like distributions
• pkg system achieves sentience and demands a vacation

Page 8

2022: DragonFly Selected as Default OS on OpenCompute

• GEOM and Storage Layers as a library
  • Storage drivers shrink by 2/3

• bhyve now default virtualization system on all BSDs
• Configurator can now build kernel images between 1M and 512G
• Support for RPi 10
• Support for HAL 9000
  • Which is now 25 years late
  • Which we know is typical

Page 9

2023: OpenBSD Adopted as the primary OS at NSA, GCHQ, FSB, etc.

• PCI as a Fabric Support Added
• Capsicumization of kernel and user space components complete
  • OpenBSD adopts capsicum

• Configurator can remote or localize code
• Adoption of new X12 windowing system
• Java added to the base system of all BSDs

Page 10

2025: Apple Donates to the FreeBSD, NetBSD and OpenBSD Foundations

• Universal Peace
• World Hunger Ends
• Realization of the Human Millennium
• Everyone gets a pony!

Page 11

What do we want to achieve?

• The most used OS technology in the world
• Scaling to many more CPU cores
• Single System Serving many Terabits/sec
• Always on Yottabyte File Server
• Security Isolation Technology in Every Mobile Device
• Most commonly deployed IoT OS
  • Or would you prefer Linux or Windows to run your next automobile?

Page 12

How do we get there?

• APIs
  • Design Guidelines
  • Ease of remoting

• Libraries
  • Shatter the kernel, and glue it back together

• Tooling
  • We now have the most flexible, open source, compiler on the planet
  • But we barely use its advanced features
  • Or create our own extensions
  • That, must, change…

Page 13

Jordan Hubbard is Correct…

Page 14

Page 15

Hardware/Software Co-Evolution

• CPU Extensions effect on UNIX
• NVMe – Faster than SSD
• NVDIMM – Memory that never goes away
• More cores (18/36 available in 2014)
• More caches (128 MB of L4 will be available on Skylake)
• Faster NICs
  • Terabit is not as far away as you think

Page 16

What was UNIX written for?

Page 17

Hot, bedtime, reading

Page 18

A company that cared

Page 19

Behold! The Pentium 4!

Page 20

Current CPU Technology

Page 21

Scheduler Upgrades

• Is already pluggable!
• Many more cores
• NUMA
• I/O Scheduling
• Cache Awareness
• Power
• Avoid the pitfalls

The Linux Scheduler: a Decade of Wasted Cores

Page 22

NUMA Awareness

• We know the memory topology
• Memory must be allocated near the process
• And processes ought to be started where there is memory
• I/O Complicates the problem
• Extend the scheduler to know about the I/O layout
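
The placement rule on this slide can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct numa_domain`, `numa_pick_domain()`, `NDOMAINS` and the per-domain I/O distance table are hypothetical names invented here, not part of any BSD scheduler. The sketch starts a process in a domain that has both memory and an idle CPU, and breaks ties by distance to the I/O device the process will use.

```c
#include <assert.h>
#include <stddef.h>

#define NDOMAINS 4	/* hypothetical number of NUMA domains */

/* Hypothetical per-domain state; not a real kernel structure. */
struct numa_domain {
	size_t	free_pages;	/* pages still available in this domain */
	int	idle_cpus;	/* CPUs in this domain with nothing to run */
};

/*
 * Pick the domain in which to start a process that needs `pages` pages.
 * io_dist[i] is the relative cost of reaching the process's I/O device
 * from domain i (lower is closer). Returns -1 if no domain qualifies.
 */
int
numa_pick_domain(const struct numa_domain dom[NDOMAINS],
    const int io_dist[NDOMAINS], size_t pages)
{
	int best = -1;

	assert(dom != NULL && io_dist != NULL);
	for (int i = 0; i < NDOMAINS; i++) {
		/* Start processes where there is memory and a free CPU. */
		if (dom[i].free_pages < pages || dom[i].idle_cpus == 0)
			continue;
		/* Among those, prefer the domain closest to the I/O. */
		if (best == -1 || io_dist[i] < io_dist[best])
			best = i;
	}
	return best;
}
```

A caller would fill the two arrays from whatever topology information the platform reports and fall back to a global policy when the function returns -1.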

Page 23

Scheduling for Cache

• Instructions are cheap, cache misses are expensive
  • Now the overwhelming source of most bottlenecks
• Teach the scheduler about cache layout and constraints
• Optimize for cache coherency
• Feed hwpmc samples into the scheduling decisions
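
As a rough illustration of "teach the scheduler about cache layout", the sketch below prefers CPUs whose caches are likely still warm for a thread. It covers only the layout bullet, not the hwpmc feedback. The `llc_id` table, `NCPUS` and `cache_pick_cpu()` are assumptions made for this example.

```c
#include <assert.h>

#define NCPUS 8		/* hypothetical CPU count */

/*
 * Choose a CPU for a thread that last ran on `last_cpu`.
 * llc_id[c] names the last-level cache that CPU c sits under, and
 * idle[c] is nonzero when CPU c has nothing to run.
 * Preference order: the same CPU, an idle CPU under the same last-level
 * cache, then any idle CPU. Returns -1 if every CPU is busy.
 */
int
cache_pick_cpu(const int llc_id[NCPUS], const int idle[NCPUS], int last_cpu)
{
	int fallback = -1;

	assert(last_cpu >= 0 && last_cpu < NCPUS);
	if (idle[last_cpu])
		return last_cpu;		/* private caches still warm */
	for (int c = 0; c < NCPUS; c++) {
		if (!idle[c])
			continue;
		if (llc_id[c] == llc_id[last_cpu])
			return c;		/* shared cache still warm */
		if (fallback == -1)
			fallback = c;		/* cold, but at least idle */
	}
	return fallback;
}
```

Sampled cache-miss counts could later replace the fixed preference order with a measured cost per candidate CPU.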

Page 24

Power

• Big/Little Will Become More Common
• Need to understand the compute power of each core
• Do we schedule for…
  • Quickest to complete
  • Earliest deadline
  • Lowest power consumption
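
The choice between goals can be made concrete with a small cost function. The sketch below is an assumption-laden illustration: `struct core`, `enum sched_goal` and `power_pick_core()` are hypothetical, and only two of the three goals are modelled (earliest deadline concerns the order of jobs rather than the choice of core, so it is left out).

```c
#include <assert.h>
#include <stddef.h>

enum sched_goal { GOAL_FASTEST, GOAL_LOWEST_POWER };

/* Hypothetical description of one core in a big/little system. */
struct core {
	double	ops_per_sec;	/* compute capacity */
	double	watts;		/* power drawn while busy */
};

/*
 * Pick a core for a job of `ops` operations.
 * GOAL_FASTEST minimises run time (ops / ops_per_sec);
 * GOAL_LOWEST_POWER minimises energy (run time * watts).
 */
int
power_pick_core(const struct core *cores, int ncores, double ops,
    enum sched_goal goal)
{
	int best = 0;
	double best_cost = 0.0;

	assert(cores != NULL && ncores > 0);
	for (int i = 0; i < ncores; i++) {
		double secs = ops / cores[i].ops_per_sec;
		double cost = (goal == GOAL_FASTEST) ?
		    secs : secs * cores[i].watts;

		if (i == 0 || cost < best_cost) {
			best = i;
			best_cost = cost;
		}
	}
	return best;
}
```

With one fast, power-hungry core and one slow, frugal core, the same job lands on different cores depending on the goal, which is exactly the policy question the slide raises.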

Page 25

Librarification
From monolith to building blocks

• NetBSD’s RUMP kernels
• libuinet
• ifLib
• Must have good API standards
• Documentation standard for APIs

Need to keep going

Page 26

I want one of these!

Page 27

API Design

• Regularity
• Tractability
• Composability
• Assisted by the compiler toolchain
• Withered Drivers
• Easily forwardable APIs
• Better building blocks!

Page 28

API Regularity

• The position of arguments matters
• What is the verb?
• What are the nouns?
• Are we writing English or Hebrew, or Japanese or?

void *memcpy(void *dst, const void *src, size_t len);

void bcopy(const void *src, void *dst, size_t len);
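
The two prototypes above do the same job with opposite argument order. One way to restore regularity is to pick a single order and wrap the odd one out. The sketch below is illustrative only: `legacy_copy()` stands in for a source-first routine such as bcopy(), and `buf_copy()` is a hypothetical destination-first wrapper, neither of which exists in any BSD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical legacy routine with bcopy()-style ordering: source first.
 * Like bcopy(), it tolerates overlapping buffers, hence memmove().
 */
void
legacy_copy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/*
 * Hypothetical regular wrapper: destination first, like memcpy(), so every
 * copy routine in the code base reads "copy into dst from src".
 */
void *
buf_copy(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	legacy_copy(src, dst, len);
	return dst;
}
```

Once every call site uses the wrapper, a reader no longer has to look up which argument is the destination.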

Page 29

API Tractability: Goldilocks and the three APIs

• Too Big
  • Most Windows APIs
  • X11 is classically terrible

• Too Small
  • ioctl() considered harmful
    • What does it mean? I can’t easily tell.
    • Use as a last resort

• Just Right
  • Between 5 and 7 arguments
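
To make the "too small" complaint concrete, the sketch below contrasts an ioctl()-style entry point with a typed alternative. Everything in it is hypothetical: `struct netdev`, the command numbers, `netdev_control()` and `netdev_set_mtu()` are invented for illustration, and the lower MTU bound is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device and command numbers, for illustration only. */
struct netdev {
	unsigned	mtu;
	int		promisc;
};
enum { DEV_SET_MTU = 1, DEV_SET_PROMISC = 2 };

/* "Too small": one entry point, meaning hidden in cmd and an untyped arg. */
int
netdev_control(struct netdev *dev, int cmd, void *arg)
{
	switch (cmd) {
	case DEV_SET_MTU:
		dev->mtu = *(unsigned *)arg;	/* compiler cannot check this */
		return 0;
	case DEV_SET_PROMISC:
		dev->promisc = *(int *)arg;
		return 0;
	default:
		return -1;
	}
}

/* Typed alternative: the name is the verb, the arguments are the nouns. */
int
netdev_set_mtu(struct netdev *dev, unsigned mtu)
{
	assert(dev != NULL);
	if (mtu < 68)		/* assumed lower bound for this sketch */
		return -1;
	dev->mtu = mtu;
	return 0;
}
```

The typed version lets the compiler reject a wrong argument type and gives the function a place to validate its input, neither of which the untyped `void *` path can offer.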

Page 30

API Forwarding

• In 2026 all systems are distributed systems
  • It was true in 2016 but we ignored that truth

• Deep structures are hard to pack
  • What if this API was an RPC?
  • Pointers become more fun to deal with

• Go shallow
  • Split structures into local and remote components
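
A sketch of "go shallow" follows, assuming a made-up request type. The remote half holds only fixed-width scalars and can be packed byte by byte. The local half keeps the pointers and never leaves the machine. `struct request`, its two halves, `REQ_WIRE_SIZE` and `req_pack()` are all hypothetical names, and big-endian wire order is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Remote part: fixed-width scalars only, no pointers, so it can be packed. */
struct req_remote {
	uint32_t	object_id;
	uint16_t	opcode;
	uint16_t	flags;
};

/* Local part: pointers and bookkeeping that never leave this machine. */
struct req_local {
	void		*completion_ctx;
};

struct request {
	struct req_remote	remote;
	struct req_local	local;
};

#define REQ_WIRE_SIZE 8	/* 4 + 2 + 2 bytes */

/* Pack the remote part in big-endian order; returns bytes written or 0. */
size_t
req_pack(const struct request *req, uint8_t *buf, size_t len)
{
	const struct req_remote *r;

	assert(req != NULL && buf != NULL);
	if (len < REQ_WIRE_SIZE)
		return 0;
	r = &req->remote;
	buf[0] = (uint8_t)(r->object_id >> 24);
	buf[1] = (uint8_t)(r->object_id >> 16);
	buf[2] = (uint8_t)(r->object_id >> 8);
	buf[3] = (uint8_t)r->object_id;
	buf[4] = (uint8_t)(r->opcode >> 8);
	buf[5] = (uint8_t)r->opcode;
	buf[6] = (uint8_t)(r->flags >> 8);
	buf[7] = (uint8_t)r->flags;
	return REQ_WIRE_SIZE;
}
```

Because the remote half contains no pointers, turning the call into an RPC needs no graph walk or pointer fix-up, only this flat copy.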

Page 31

API to Resource Relationship

• Passing Pointers
  • Who allocates?
  • Who frees?

• Sharing Locks
  • Who locks?
  • Who unlocks?

• More about Goldilocks
  • Too big?
  • Too Small
  • Just right?
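
One common answer to "who allocates, who frees" is that the caller does both and the API only fills the buffer it is handed. The sketch below shows that convention with a hypothetical `ifname_format()` helper. The function name and its use of a driver name plus unit number are assumptions for illustration. Locking is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Ownership rule for this sketch: the caller allocates `buf` and frees it;
 * this function only writes into it and never keeps the pointer.
 * Returns the length the full name needs (excluding the NUL), as snprintf()
 * does, so the caller can detect truncation and retry with a larger buffer.
 */
int
ifname_format(char *buf, size_t len, const char *driver, unsigned unit)
{
	assert(buf != NULL && driver != NULL);
	return snprintf(buf, len, "%s%u", driver, unit);
}
```

A caller compares the return value with its buffer size: a result greater than or equal to `len` means the name was truncated, and nothing needs to be freed on either path.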

Page 32

A worked example

Page 33

Literally Littered with Libraries

Page 34

Optimist or Pessimist?

Opponents point out that no such program has ever been constructed and that experience would indicate that even if it could be built, it would be rife with untestable and undetectable errors.

Proponents say the software could be assembled in smaller pieces, which could probably be tested adequately or otherwise made “fault-tolerant.”

It has been estimated by experts that the necessary software program would involve ten million (1x10^7) or more lines of code.

Page 35

Pervasive Tracing and Debug

• Death to printf()!!!
• Tracing Features Must Be Pervasive
• Easy to use
• Produce Machine Readable Output

How big is an OS kernel?

                  Files    Lines
C                 5,685    5,140,567
C Header Files    5,356    2,271,425
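
As a small illustration of tracing that produces machine readable output instead of printf() text, the sketch below records fixed-layout events in a ring buffer. It is not DTrace and does not use any DTrace interface. `struct trace_rec`, `struct trace_buf`, `TRACE_SLOTS` and `trace_emit()` are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_SLOTS 4	/* tiny on purpose; a real buffer would be larger */

/* Hypothetical fixed-layout record: easy for a tool to decode. */
struct trace_rec {
	uint64_t	seq;		/* increasing sequence number */
	uint32_t	probe_id;	/* which trace point fired */
	uint32_t	arg;		/* one probe-specific value */
};

struct trace_buf {
	struct trace_rec	slot[TRACE_SLOTS];
	uint64_t		next_seq;
};

/* Record an event, overwriting the oldest slot once the buffer is full. */
void
trace_emit(struct trace_buf *tb, uint32_t probe_id, uint32_t arg)
{
	struct trace_rec *r;

	assert(tb != NULL);
	r = &tb->slot[tb->next_seq % TRACE_SLOTS];
	r->seq = tb->next_seq++;
	r->probe_id = probe_id;
	r->arg = arg;
}
```

A consumer can read the slots, sort by `seq`, and decode each record without parsing free-form text. The sketch ignores concurrency, which a real tracing facility would have to handle.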

Page 36

Unify the Control Plane

• Machine readable output
• Machine controllable input
• Address both humans and programmers
• Increase and improve automation
• If you’re doing it by hand, you’re doing it wrong!

“The study of computer science is the study of what can be automated.” D. Knuth
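
One way to address both humans and programs is to keep the data in a structure and choose the rendering at the last moment. The sketch below assumes a hypothetical `struct if_stats` and `if_stats_render()`. It hand-rolls the JSON with snprintf() and does not escape the name, so it is only suitable for names without quotes or backslashes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum out_fmt { OUT_TEXT, OUT_JSON };

/* Hypothetical interface statistics a status tool might report. */
struct if_stats {
	const char	*name;
	unsigned long	rx_packets;
	unsigned long	tx_packets;
};

/*
 * Render the same data for a human (OUT_TEXT) or a program (OUT_JSON).
 * Returns what snprintf() returns: the length the full output needs.
 * Note: st->name is not JSON-escaped in this sketch.
 */
int
if_stats_render(char *buf, size_t len, const struct if_stats *st,
    enum out_fmt fmt)
{
	assert(buf != NULL && st != NULL && st->name != NULL);
	if (fmt == OUT_JSON)
		return snprintf(buf, len,
		    "{\"name\":\"%s\",\"rx_packets\":%lu,\"tx_packets\":%lu}",
		    st->name, st->rx_packets, st->tx_packets);
	return snprintf(buf, len, "%s: %lu packets in, %lu packets out",
	    st->name, st->rx_packets, st->tx_packets);
}
```

Because both renderings come from the same structure, the human and machine views cannot drift apart, and a script never has to scrape the human text.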

Page 37

Putting the Pieces Together

• Humpty Dumpty Kernel
  • Not a micro-kernel
  • Though it could be

• Need the Configurator
  • Tooling, tooling, tooling
• “In the 80s people got paid to add features to the kernel, and in the 90s they got paid to take the same features out of it.” – H. Massalin

Page 38

Operating Systems Are Like Legos

• The BSDs have always built solid architectures
• Small and Flexible Components
• Well defined APIs
• Built into libraries
• Come in many colors!

Page 39

Comments? Questions?

